Skip to main content

BGP Flow Specification Filtered by Destination-QP
draft-lll-idr-flowspec-filter-qp-00

Document Type Active Internet-Draft (individual)
Authors Yisong Liu , Changwang Lin , Jinming Li
Last updated 2026-03-01
RFC stream (None)
Intended RFC status (None)
Formats
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state I-D Exists
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-lll-idr-flowspec-filter-qp-00
IDR Working Group                                                Y. Liu
Internet Draft                                             China Mobile
Intended status: Standards Track                                 C. Lin
Expires: September 02, 2026                        New H3C Technologies
                                                                  J. Li
                                                           China Mobile
                                                         March 02, 2026

             BGP Flow Specification Filtered by Destination-QP
                    draft-lll-idr-flowspec-filter-qp-00

Abstract

   BGP Flowspec mechanism (BGP-FS) [RFC8955] [RFC8956] propagates both
   traffic Flow Specifications and Traffic Filtering Actions by making
   use of the BGP NLRI and the BGP Extended Community encoding formats.

   This document specifies a new BGP-FS component type named
   Destination-QP(Destination Queue Pair) to support filtering by
   Destination-QP.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 02 September 2026.

Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with

Liu, et al.            Expires September, 2026                [Page 1]
   Internet-Draft       FlowSpec by Destination-QP       March 02,2026

   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction.....................................................2
   2. Requirements Language............................................3
   3. Terminology......................................................4
   4. FlowSpec Fileter by Destination-QP...............................4
   5. Use case.........................................................4
   6. Security Considerations..........................................7
   7. IANA Considerations..............................................7
   8. References.......................................................7
      8.1. Normative References........................................7
      8.2. Informative References......................................8
   Authors' Addresses..................................................9

1. Introduction

   BGP Flowspec mechanism (BGP-FS) [RFC8955] [RFC8956] propagates both
   traffic Flow Specifications and Traffic Filtering Actions by making
   use of the BGP NLRI and the BGP Extended Community encoding formats.

   In modern AI training clusters, especially those based on RoCEv2 and
   RDMA for high-performance inter-GPU communication, traffic exhibits
   distinct characteristics that differ significantly from traditional
   Internet flows. AI training communication typically consists of
   long-lived, high-throughput, and delay-sensitive and jitter-
   sensitive flows, such as those generated by collective communication
   operations like AllReduce. These flows are often bound to long-term
   Queue Pair (QP) [IB-SPEC] instances, with relatively stable five-
   tuple fields, making them prone to path polarization and uneven link
   utilization when scheduled solely by five-tuple-based hashing. Such
   static mapping fails to adapt to the dynamic communication patterns
   and strict performance requirements of AI workloads, leading to
   localized congestion, degraded training throughput, and reduced
   cluster efficiency.

   For these reasons, five-tuple-only flow scheduling is no longer
   sufficient for AI-oriented lossless networks. Instead, scheduling
   mechanisms based on a combination of five-tuple and QP information
   have become necessary. By introducing QP-level identification into

Liu, et al.            Expires September, 2026                [Page 2]
   Internet-Draft       FlowSpec by Destination-QP       March 02,2026

   the flow distribution logic, networks can achieve finer-grained
   traffic steering, better load balancing across equal-cost multi-path
   (ECMP) groups, and improved isolation between communication streams.

   As shown in Figure 1, the controller uses BGP Flow-Spec to
   distribute QP routes to the Ingress PE according to QP. Based on the
   QP value, traffic is redirected to different forwarding paths.
   Methods for redirecting traffic to different paths can include
   redirection to IP [draft-ietf-idr-flowspec-redirect-ip] or
   redirection to SRv6 [draft-ietf0-idr-srv6-flowspec-path-redirect],
   among others. Specific methods are beyond the scope of this
   document.

   At the forwarding level, when the DC sends packets, it carries the
   corresponding QP for traffic belonging to the same task.

   The Ingress PE looks up the QP routes distributed via BGP Flow-Spec
   based on the QP of the traffic and forwards the packets  according
   to the QP routes.

                     +---------+
                     |Controler|
                     +----+----+
                          |
                          |BGP Flow-Spec
                          |for Destination-QP
                          |
         +----+       +---V----+       +-- Path1--+      +--------+
         |DC  |       |Ingress |       |          |      |Egress  |
         |    +-------+PE      +-------+          +------+PE      |
         +----+       +-+------+       |          |      +--------+
                                       +-- Path2--+
                  Figure 1

   This document specifies a new BGP-FS component type named
   Destination-QP(Destination Queue Pair) to support filtering by
   Destination-QP.

2. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in

Liu, et al.            Expires September, 2026                [Page 3]
   Internet-Draft       FlowSpec by Destination-QP       March 02,2026

   BCP 14 RFC 2119 [RFC2119] RFC 8174 [RFC8174] when, and only when,
   they appear in all capitals, as shown here.

3. Terminology

   FS: Flow Specification

   QP: Queue Pair

4. FlowSpec Filtered by Destination-QP

   [draft-ietf-idr-fsv2-ip-basic] defines the Components in the IP
   Basic TLV.  This document proposes a new Component for Destination-
   QP information.

   The following new component type is defined.

   *  Destination-QP

   Type TBD - Destination-QP

   Length: variable

   Component Value format: [numeric_op, value]+

   Each Destination-QP value is 4 octets.

   Per section 10 of [RFC8955] , If a receiving BGP speaker cannot
   support this new Flow Specification component type, it MUST discard
   the NLRI value field that contains such unknown components.  Since
   the NLRI field encoding (Section 4 of [RFC8955]) is defined in the
   form of a 2-tuple <length, NLRI value>, message decoding can skip
   over the unknown NLRI value and continue with subsequent remaining
   NLRI.

5. Use case

   The BGP agent specifies the traditional 5-tuple and new defined Destination-QP
   as matching criteria.

   As shown in Figure 2, for traffic with a Destination-QP value of 1001, redirect it
   to Path1; for traffic with a Destination-QP value of 1001, redirect it to Path2.

   BGP-FS Route 1:

    FS Filters

Liu, et al.            Expires September, 2026                [Page 4]
   Internet-Draft       FlowSpec by Destination-QP       March 02,2026

       Destination: 203.0.113.0/24

       source address: 198.51.100.0/24

       protocol: UDP

       destination port: 4791 (RoCEv2 protocol)

       source port: 10001

       Destination-QP value: 1001 (the newly defined in this document.)

    FS Action:
       Redirect Flow to Path1 (The specific format is not discussed in this
   document.)

    BGP-FS Route 2:
    FS Filters
       Destination: 203.0.113.0/24

       source address: 198.51.100.0/24

       protocol: UDP

       destination port: 4791 (RoCEv2 protocol)

       source port: 10001

       Destination-QP value: 2001 (the newly defined in this document.)

    FS Action:
       Redirect Flow to Path2 (The specific format is not discussed in this
   document.)

Liu, et al.            Expires September, 2026                [Page 5]
   Internet-Draft       FlowSpec by Destination-QP       March 02,2026

                   +----------+
                   |Controller|
                   +----------+
                        |BGP FS:
                        |NLRI Filter
                        |  Destination Prefix
                        |  Source Prefix
                        |  IP Protocol
                        |  Destination Port
                        |  Source Port
                        |  Destination-QP
                        |
                        |Action:
                        |  Redirect
                        |
                        V
                   +----------+
                   |Ingress PE|
                   +----------+

                  Figure 2

   The Ingress PE receives the Flow-Spec route and installs it into the forwarding
   plane. Upon receiving AI data traffic, it redirects the traffic to the
   corresponding Path for forwarding based on the traffic 5-tuple and Destination-
   QP parameters.

   In this document, the example uses QP, destination address, and source address as
   filtering criteria for flow-spec to redirect traffic to different paths.

   +--------------+-------------------+----------------+-----------+
   | Destination  | Dest Prefix       | Source Preifx  |Redirect   |
   | QP           |                   |                |           |
   +--------------+-------------------+----------------+-----------+
   |  1001        | 203.0.113.0/24 | 198.51.100.0/24|Path1      |
   +--------------+-------------------+----------------+-----------+
   |  2001        | 203.0.113.0/24 | 198.51.100.0/24|Path2      |
   +--------------+-------------------+----------------+-----------+

Liu, et al.            Expires September, 2026                [Page 6]
   Internet-Draft       FlowSpec by Destination-QP       March 02,2026

6. Security Considerations

   No new security issues are introduced to the BGP protocol by this
   specification.

7. IANA Considerations

   [draft-ietf-idr-flowspec-v2-04] defines the Types for IP Filters.

   This document requested to assign a new type code point from "Non-IP
   Types for IP Filters" registry for Destination-QP.

   Non-IP Types for IP Filters SubTLV

       type     Definition

      ======    ============

        64 -    Parts of SID

        65 -    MPLS Match 1: Label in Label stack

        66 -    MPLS Match 2: EXP bits in top Label

        TBD -   Destination-QP   This document

        67-249  unassigned (reserved for now)

        250-    Filter Error handling

        251-255 Reserved

8. References

  8.1. Normative References

   [IB-SPEC]InfiniBand Trade Association. InfiniBand Routing and
             Forwarding. Architecture Supplement, 2023.

   [I-D.ietf-idr-flowspec-v2]Hares, S., Eastlake, D. E., Yadlapalli,
             C., and S. Maduschke, "BGP Flow Specification Version 2",
             Work in Progress, Internet-Draft, draft-ietf-idr-flowspec-
             v2-04, 28 April 2024,
             <https://www.ietf.org/archive/id/draft-ietf-idr-flowspec-
             v2-04.txt>.

Liu, et al.            Expires September, 2026                [Page 7]
   Internet-Draft       FlowSpec by Destination-QP       March 02,2026

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, DOI
             10.17487/RFC2119, March 1997, https://www.rfc-
             editor.org/info/rfc2119>.

   [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
             Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI
             10.17487/RFC4271, January 2006, <https://www.rfc-
             editor.org/info/rfc4271>.

   [RFC8955] Loibl, C., Hares, S., Raszuk, R., McPherson, D., and M.
             Bacher, "Dissemination of Flow Specification Rules",RFC
             8955, DOI 10.17487/RFC8955, December 2020,
             <https://www.rfc-editor.org/info/rfc8955>.

   [RFC8956] Loibl, C., Ed., Raszuk, R., Ed., and S. Hares, Ed.,
             "Dissemination of Flow Specification Rules for IPv6",RFC
             8956, DOI 10.17487/RFC8956, December 2020,
             <https://www.rfc-editor.org/info/rfc8956>.

  8.2. Informative References

Liu, et al.            Expires September, 2026                [Page 8]
   Internet-Draft       FlowSpec by Destination-QP       March 02,2026

Authors' Addresses

   Yisong Liu
   China Mobile
   China
   Email: liuyisong@chinamobile.com

   Changwang Lin
   New H3C Technologies
   China
   Email: linchangwang.04414@h3c.com

   Jinming Li
   China Mobile
   China
   Email: lijinming@chinamobile.com

Liu, et al.            Expires September, 2026                [Page 9]