
Fast Fault Tolerance Architecture for Programmable Datacenter Networks
draft-fft-architecture-00

Document type: Expired Internet-Draft (individual); expired and archived
Authors: Dan Li, Kaihui Gao, Shuai Wang, Li Chen, Xuesong Geng
Last updated: 2025-04-24 (latest revision 2024-10-21)
RFC stream: (None)
Intended RFC status: (None)
Stream state: (No stream defined)
Consensus boilerplate: Unknown
RFC Editor Note: (None)
IESG state: Expired
Telechat date: (None)
Responsible AD: (None)
Send notices to: (None)

This Internet-Draft is no longer active.

Abstract

This document introduces a fast-rerouting architecture that improves network resilience by detecting failures and rerouting traffic entirely within the programmable data plane, using in-band network telemetry and source routing (SR). Traditional methods rely on the control plane and therefore incur significant rerouting delays. The proposed architecture instead uses a white-box model of the data plane to distinguish and analyze packet losses accurately, which enables immediate identification of link failures, including black-hole and gray failures. By combining real-time telemetry with SR-based rerouting, the proposed solution reduces rerouting time to a few milliseconds, a substantial improvement over existing practice in the fault tolerance of datacenter networks.
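
The abstract does not specify the detection or rerouting algorithms, so the following is only an illustrative sketch of the general idea, not the draft's method. It assumes per-link telemetry counters are available for each measurement window, that congestion drops are accounted for separately (a simplified stand-in for white-box loss attribution), and that candidate source routes are precomputed. All names (LinkCounters, classify_link, pick_source_route, gray_threshold) and the 1% threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkCounters:
    """Hypothetical telemetry sample for one link and one measurement window."""
    enqueued: int          # packets the upstream switch queued for this link
    congestion_drops: int  # drops the upstream switch attributes to full queues
    received: int          # packets the downstream switch saw arrive


def classify_link(c: LinkCounters, gray_threshold: float = 0.01) -> str:
    """Classify a link as 'healthy', 'gray', or 'black-hole'.

    Losses explained by congestion are subtracted first; only the
    unexplained remainder is attributed to the link itself.
    """
    expected = c.enqueued - c.congestion_drops
    if expected <= 0:
        return "healthy"  # nothing was put on the wire, so nothing to infer
    unexplained = max(0, expected - c.received)
    ratio = unexplained / expected
    if ratio >= 1.0:
        return "black-hole"  # every transmitted packet vanished
    if ratio > gray_threshold:
        return "gray"        # partial, silent loss
    return "healthy"


def pick_source_route(paths, failed_links):
    """Return the first candidate path that avoids every failed link, or None."""
    for path in paths:
        links = set(zip(path, path[1:]))  # consecutive hops form directed links
        if not links & failed_links:
            return path
    return None


# Example with assumed topology and counter values.
samples = {
    ("leaf1", "spine1"): LinkCounters(enqueued=1000, congestion_drops=20, received=0),
    ("leaf1", "spine2"): LinkCounters(enqueued=1000, congestion_drops=20, received=980),
}
failed = {link for link, c in samples.items() if classify_link(c) != "healthy"}
route = pick_source_route(
    [("leaf1", "spine1", "leaf2"), ("leaf1", "spine2", "leaf2")], failed
)
```

In this sketch the sender would then encode the chosen route in the packet header, so that rerouting needs no control-plane update. How the draft encodes routes and how quickly counters are sampled are not stated in the abstract.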

