rfc9524.original | rfc9524.txt | |||
---|---|---|---|---|
Network Working Group D. Voyer, Ed. | Internet Engineering Task Force (IETF) D. Voyer, Ed. | |||
Internet-Draft Bell Canada | Request for Comments: 9524 Bell Canada | |||
Intended status: Standards Track C. Filsfils | Category: Standards Track C. Filsfils | |||
Expires: 29 February 2024 R. Parekh | ISSN: 2070-1721 R. Parekh | |||
Cisco Systems, Inc. | Cisco Systems, Inc. | |||
H. Bidgoli | H. Bidgoli | |||
Nokia | Nokia | |||
Z. Zhang | Z. Zhang | |||
Juniper Networks | Juniper Networks | |||
28 August 2023 | February 2024 | |||
SR Replication segment for Multi-point Service Delivery | Segment Routing Replication for Multipoint Service Delivery | |||
draft-ietf-spring-sr-replication-segment-19 | ||||
Abstract | Abstract | |||
This document describes the Segment Routing Replication segment for | This document describes the Segment Routing Replication segment for | |||
Multi-point service delivery. A Replication segment allows a packet | multipoint service delivery. A Replication segment allows a packet | |||
to be replicated from a Replication node to Downstream nodes. | to be replicated from a replication node to downstream nodes. | |||
Requirements Language | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in BCP | ||||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 29 February 2024. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc9524. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2023 IETF Trust and the persons identified as the | Copyright (c) 2024 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
in the Revised BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
1.1. Terminology | 1.1. Terminology | |||
1.2. Use Cases | 1.2. Use Cases | |||
2. Replication Segment | 2. Replication Segment | |||
2.1. SR-MPLS data plane | 2.1. SR-MPLS Data Plane | |||
2.2. SRv6 data plane | 2.2. SRv6 Data Plane | |||
2.2.1. End.Replicate: Replicate and/or Decapsulate | 2.2.1. End.Replicate: Replicate and/or Decapsulate | |||
2.2.2. OAM Operations | 2.2.2. OAM Operations | |||
2.2.3. ICMPv6 Error Messages | 2.2.3. ICMPv6 Error Messages | |||
3. Implementation Status | 3. IANA Considerations | |||
3.1. Cisco implementation | 4. Security Considerations | |||
3.2. Nokia implementation | 5. References | |||
4. IANA Considerations | 5.1. Normative References | |||
5. Security Considerations | 5.2. Informative References | |||
6. Acknowledgements | Appendix A. Illustration of a Replication Segment | |||
7. Contributors | A.1. SR-MPLS | |||
8. References | A.2. SRv6 | |||
8.1. Normative References | A.2.1. Pinging a Replication-SID | |||
8.2. Informative References | Acknowledgements | |||
Appendix A. Illustration of a Replication Segment | Contributors | |||
A.1. SR-MPLS | Authors' Addresses | |||
A.2. SRv6 | ||||
A.2.1. Pinging Replication SID | ||||
Authors' Addresses | ||||
1. Introduction | 1. Introduction | |||
Replication segment is a new type of segment for Segment Routing (SR) | The Replication segment is a new type of segment for Segment Routing | |||
[RFC8402], which allows a node (henceforth called a Replication node) | (SR) [RFC8402], which allows a node (henceforth called a "replication | |||
to replicate packets to a set of other nodes (called Downstream | node") to replicate packets to a set of other nodes (called | |||
nodes) in a Segment Routing Domain. A Replication segment can | "downstream nodes") in an SR domain. A Replication segment can | |||
replicate packets to directly connected nodes or to downstream nodes | replicate packets to directly connected nodes or to downstream nodes | |||
(without need for state on the transit routers). This document | (without the need for state on the transit routers). This document | |||
focuses on specifying behavior of a Replication segment for both | focuses on specifying the behavior of a Replication segment for both | |||
Segment Routing with Multiprotocol Label Switching (SR-MPLS) | Segment Routing with Multiprotocol Label Switching (SR-MPLS) | |||
[RFC8660] and Segment Routing with IPv6 (SRv6) [RFC8986]. The | [RFC8660] and Segment Routing with IPv6 (SRv6) [RFC8986]. The | |||
examples in the Appendix illustrate the behavior of a Replication | examples in Appendix A illustrate the behavior of a Replication | |||
Segment in SR domain. The use of two or more Replication segments | Segment in an SR domain. The use of two or more Replication segments | |||
stitched together to form a tree using a control plane is left to be | stitched together to form a tree using a control plane is left to be | |||
specified in other documents. The management of IP multicast groups, | specified in other documents. The management of IP multicast groups, | |||
building IP multicast trees, and performing multicast congestion | building IP multicast trees, and performing multicast congestion | |||
control are out of scope of this document. | control are out of scope of this document. | |||
1.1. Terminology | 1.1. Terminology | |||
This section defines terms introduced and used frequently in this | This section defines terms introduced and used frequently in this | |||
document. Refer to Terminology sections of [RFC8402], [RFC8754] and | document. Refer to the Terminology sections of [RFC8402], [RFC8754], | |||
[RFC8986] for other terms used in Segment Routing. | and [RFC8986] for other terms used in SR. | |||
* Replication segment: A segment in SR domain that replicates | Replication segment: A segment in an SR domain that replicates | |||
packets. See Section 2 for details. | packets. See Section 2 for details. | |||
* Replication node: A node in SR domain which replicates packets | Replication node: A node in an SR domain that replicates packets | |||
based on Replication segment. | based on a Replication segment. | |||
* Downstream nodes: A Replication segment replicates packets to a | Downstream nodes: A Replication segment replicates packets to a set | |||
set of nodes. These nodes are Downstream nodes. | of nodes. These nodes are downstream nodes. | |||
* Replication state: State held for a Replication segment at a | Replication state: State held for a Replication segment at a | |||
Replication node. It is conceptually a list of replication | replication node. It is conceptually a list of Replication | |||
branches to Downstream nodes. The list can be empty. | branches to downstream nodes. The list can be empty. | |||
* Replication SID: Data plane identifier of a Replication segment. | Replication-SID: Data plane identifier of a Replication segment. | |||
This is a SR-MPLS label or SRv6 Segment Identifier (SID). | This is an SR-MPLS label or SRv6 Segment Identifier (SID). | |||
* SRH: IPv6 Segment Routing Header [RFC8754]. | SRH: IPv6 Segment Routing Header [RFC8754]. | |||
* Point-to-Multipoint Service: A service that has one ingress node | Point-to-Multipoint (P2MP) Service: A service that has one ingress | |||
and one or more egress nodes. A packet is delivered to all the | node and one or more egress nodes. A packet is delivered to all | |||
egress nodes | the egress nodes. | |||
* Root node: An ingress node of a P2MP service, | Root node: An ingress node of a P2MP service. | |||
* Leaf node: An egress node of a P2MP service. | ||||
* Bud node: A node that is both a Replication node and a Leaf node. | Leaf node: An egress node of a P2MP service. | |||
Bud node: A node that is both a replication node and a leaf node. | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in BCP | ||||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
1.2. Use Cases | 1.2. Use Cases | |||
In the simplest use case, a single Replication segment includes the | In the simplest use case, a single Replication segment includes the | |||
ingress node of a multi-point service and the egress nodes of the | ingress node of a multipoint service and the egress nodes of the | |||
service as all the Downstream nodes. This achieves Ingress | service as all the downstream nodes. This achieves Ingress | |||
Replication [RFC7988] that has been widely used for Multicast VPN | Replication [RFC7988] that has been widely used for Multicast VPN | |||
(MVPN) [RFC6513] and Ethernet VPN (EVPN)[RFC7432] bridging of | (MVPN) [RFC6513] and Ethernet VPN (EVPN) [RFC7432] bridging of | |||
Broadcast, Unknown Unicast, and Multicast (BUM) traffic. This | Broadcast, Unknown Unicast, and Multicast (BUM) traffic. This | |||
Replication segment can be either provisioned locally on ingress and | Replication segment on ingress and egress nodes can either be | |||
egress nodes, or using dynamic auto-discovery procedures for MVPN and | provisioned locally or using dynamic autodiscovery procedures for | |||
EVPN. Note SRv6 [RFC8986] has End.DT2M replication behavior for EVPN | MVPN and EVPN. Note SRv6 [RFC8986] has End.DT2M replication behavior | |||
BUM traffic. | for EVPN BUM traffic. | |||
Replication segments can also be used to form trees by stitching | Replication segments can also be used to form trees by stitching | |||
Replication segments on a Root node, intermediate Replication nodes | Replication segments on a root node, intermediate replication nodes, | |||
and Leaf nodes for efficient delivery of MVPN and EVPN BUM traffic. | and leaf nodes for efficient delivery of MVPN and EVPN BUM traffic. | |||
2. Replication Segment | 2. Replication Segment | |||
In a Segment Routing Domain, a Replication segment is a logical | In an SR domain, a Replication segment is a logical construct that | |||
construct which connects a Replication node to a set of Downstream | connects a replication node to a set of downstream nodes. A | |||
nodes. A Replication segment is a local segment instantiated at a | Replication segment is a local segment instantiated at a Replication | |||
Replication node. It can be either provisioned locally on a node or | node. It can be either provisioned locally on a node or programmed | |||
programmed by a control plane. | by a control plane. | |||
Replication segments can be stitched together to form a tree by | Replication segments can be stitched together to form a tree by | |||
either local provisioning on nodes or using a control plane. The | either local provisioning on nodes or using a control plane. The | |||
procedures for doing this are out of scope of this document. One | procedures for doing this are out of scope of this document. One | |||
such control plane using a PCE with SR P2MP policy is specified in | such control plane using a PCE with the SR P2MP policy is specified | |||
[I-D.ietf-pim-sr-p2mp-policy]. However, if local provisioning is | in [P2MP-POLICY]. However, if local provisioning is used to stitch | |||
used to stitch Replication segments, then a chain of Replication | Replication segments, then a chain of Replication segments SHOULD NOT | |||
segments SHOULD NOT form a loop. If a control plane is used to | form a loop. If a control plane is used to stitch Replication | |||
stitch Replication segments, the control plane specification MUST | segments, the control plane specification MUST prevent loops or | |||
prevent loops, or to detect and mitigate loops in steady state. | detect and mitigate loops in steady state. | |||
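The loop restriction on locally stitched Replication segments can be checked before provisioning. The following is a minimal sketch, not part of the specification: it assumes the stitching is available as a mapping from each segment's <Replication-ID, Node-ID> tuple to the tuples of the segments it replicates to. The function name `has_loop` and the mapping layout are hypothetical.

```python
from typing import Dict, Hashable, List


def has_loop(stitching: Dict[Hashable, List[Hashable]]) -> bool:
    """Return True if the chain of stitched segments contains a cycle.

    `stitching` maps a segment key (for example a
    (Replication-ID, Node-ID) tuple) to the keys of the downstream
    segments it replicates to. This layout is an assumption made for
    illustration only.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour: Dict[Hashable, int] = {}

    def visit(node: Hashable) -> bool:
        colour[node] = GREY
        for nxt in stitching.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # back edge: the chain returns to itself
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(
        colour.get(node, WHITE) == WHITE and visit(node)
        for node in list(stitching)
    )


tree = {
    (1, "R1"): [(1, "R2"), (1, "R3")],
    (1, "R2"): [(1, "R4")],
}
looped = dict(tree)
looped[(1, "R4")] = [(1, "R1")]
```

Here `has_loop(tree)` is false, while `has_loop(looped)` is true because R4 replicates back to R1. A depth-first search is only one possible choice; a control plane may detect or mitigate loops in other ways.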
A Replication segment is identified by the tuple <Replication-ID, | A Replication segment is identified by the tuple <Replication-ID, | |||
Node-ID>, where: | Node-ID>, where: | |||
* Replication-ID: An identifier for a Replication segment that is | Replication-ID: An identifier for a Replication segment that is | |||
unique in context of the Replication node. | unique in context of the replication node. | |||
* Node-ID: The address of the Replication node that the Replication | Node-ID: The address of the replication node for the Replication | |||
segment is for. Note that the Root of a multi-point service is | segment. Note that the root of a multipoint service is also a | |||
also a Replication node. | Replication node. | |||
Replication-ID is a variable length field. In simplest case, it can | Replication-ID is a variable-length field. In the simplest case, it | |||
be a 32-bit number, but it can be extended or modified as required | can be a 32-bit number, but it can be extended or modified as | |||
based on specific use of a Replication segment. This is out of scope | required based on the specific use of a Replication segment. This is | |||
for this document. The length of Replication-ID is specified in the | out of scope for this document. The length of the Replication-ID is | |||
signaling mechanism used for Replication segment. Examples of such | specified in the signaling mechanism used for the Replication | |||
signaling and extensions are described in | segment. Examples of such signaling and extensions are described in | |||
[I-D.ietf-pim-sr-p2mp-policy]. When the PCE signals a Replication | [P2MP-POLICY]. When the PCE signals a Replication segment to its | |||
segment to its node, the <Replication-ID, Node-ID> tuple identifies | node, the <Replication-ID, Node-ID> tuple identifies the segment. | |||
the segment. | ||||
A Replication segment includes the following elements: | A Replication segment includes the following elements: | |||
* Replication SID: The Segment Identifier of a Replication segment. | Replication-SID: The Segment Identifier of a Replication segment. | |||
This is a SR-MPLS label or a SRv6 SID [RFC8402]. | This is an SR-MPLS label or an SRv6 SID [RFC8402]. | |||
* Downstream nodes: Set of nodes in Segment Routing domain to which | Downstream nodes: Set of nodes in an SR domain to which a packet is | |||
a packet is replicated by the Replication segment. | replicated by the Replication segment. | |||
* Replication state: See below. | Replication state: See below. | |||
The Downstream nodes and Replication state of a Replication segment | The downstream nodes and Replication state (RS) of a Replication | |||
can change over time, depending on the network state and Leaf nodes | segment can change over time, depending on the network state and leaf | |||
of a multi-point service that the segment is part of. | nodes of a multipoint service that the segment is part of. | |||
Replication SID identifies the Replication segment in the forwarding | The Replication-SID identifies the Replication segment in the | |||
plane. At a Replication node, the Replication SID operates on | forwarding plane. At a replication node, the Replication-SID | |||
Replication state of the Replication segment. | operates on the RS of the Replication segment. | |||
Replication state is a list of replication branches to the Downstream | RS is a list of Replication branches to the downstream nodes. In | |||
nodes. In this document, each branch is abstracted to a <Downstream | this document, each branch is abstracted to a <downstream node, | |||
node, Downstream Replication SID> tuple. <Downstream node> represents | downstream Replication-SID> tuple. <downstream node> represents the | |||
the reachability from the Replication node to the Downstream node. | reachability from the replication node to the downstream node. In | |||
In its simplest form, this MAY be specified as an interface or next- | its simplest form, this MAY be specified as an interface or next-hop | |||
hop if downstream node is adjacent to the Replication node. The | if the downstream node is adjacent to the replication node. The | |||
reachability may be specified in terms of Flexible Algorithm path | reachability may be specified in terms of a Flexible Algorithm path | |||
(including the default algorithm) [RFC9350], or specified by an SR | (including the default algorithm) [RFC9350] or specified by an SR- | |||
explicit path represented either by a SID-list (of one or more SIDs) | explicit path represented either by a SID list (of one or more SIDs) | |||
or by a Segment Routing Policy [RFC9256]. Downstream Replication SID | or by a Segment Routing Policy [RFC9256]. The downstream | |||
is the Replication SID of the Replication segment at the Downstream | Replication-SID is the Replication-SID of the Replication segment at | |||
node. | the downstream node. | |||
A packet is steered into a Replication segment at a Replication node | A packet is steered into a Replication segment at a replication node | |||
in two ways: | in two ways: | |||
* When the Active Segment [RFC8402] is a locally instantiated | * When the active segment [RFC8402] is a locally instantiated | |||
Replication SID | Replication-SID. | |||
* By the Root of a multi-point service based on local configuration | * By the root of a multipoint service based on local configuration | |||
outside the scope of this document. | that is outside the scope of this document. | |||
In either case, the packet is replicated to each Downstream node in | In either case, the packet is replicated to each downstream node in | |||
the associated Replication state. | the associated RS. | |||
If a Downstream node is an egress (Leaf) of the multi-point service, | If a downstream node is an egress (leaf) of the multipoint service, | |||
no further replication is needed. The Leaf node's Replication | no further replication is needed. The leaf node's Replication | |||
segment has an indicator for Leaf role and it does not have any | segment has an indicator for the leaf role, and it does not have any | |||
Replication state i.e. the list of Replication branches is empty. | RS (i.e., the list of Replication branches is empty). The | |||
The Replication SID at a Leaf node MAY be used to identify the multi- | Replication-SID at a leaf node MAY be used to identify the multipoint | |||
point service. Notice that the segment on the Leaf node is still | service. Notice that the segment on the leaf node is still referred | |||
referred to as a Replication segment for the purpose of | to as a "Replication segment" for the purpose of generalization. | |||
generalization. | ||||
A node can be a Bud node, i.e. it is a Replication node and a Leaf | A node can be a bud node (i.e., it is a replication node and a leaf | |||
node of a multi-point service [I-D.ietf-pim-sr-p2mp-policy]. | node of a multipoint service [P2MP-POLICY]). The Replication segment | |||
Replication segment of a Bud node has a list of Replication Branches | of a bud node has a list of Replication branches as well as a leaf | |||
as well as Leaf role indicator. | role indicator. | |||
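The elements described above can be summarized as a small data model. This is an illustrative sketch only: the class and field names (`Branch`, `ReplicationSegment`, `is_leaf`, `role`) are hypothetical, SIDs are modelled as plain integers, and reachability to a downstream node is reduced to a string plus an optional SID list.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Branch:
    """One Replication branch: <downstream node, downstream Replication-SID>."""
    downstream_node: str              # reachability, e.g. a next hop or path name
    downstream_replication_sid: int   # Replication-SID at the downstream node
    segment_list: Tuple[int, ...] = ()  # optional SIDs toward a non-adjacent node


@dataclass
class ReplicationSegment:
    replication_id: int               # unique in context of the replication node
    node_id: str                      # address of the replication node
    replication_sid: int              # SR-MPLS label or SRv6 SID (as an int here)
    is_leaf: bool = False             # leaf role indicator
    branches: List[Branch] = field(default_factory=list)  # Replication state

    @property
    def key(self) -> Tuple[int, str]:
        """The <Replication-ID, Node-ID> tuple that identifies the segment."""
        return (self.replication_id, self.node_id)

    def role(self) -> str:
        if self.is_leaf and self.branches:
            return "bud"          # replicates and also delivers off the tree
        if self.is_leaf:
            return "leaf"         # empty Replication state
        return "replication"


bud = ReplicationSegment(1, "R2", 16002, is_leaf=True,
                         branches=[Branch("R3", 16003)])
```

In this model, a leaf is a segment with the leaf indicator set and an empty branch list, and a bud node has both the indicator and at least one branch, matching the roles described in the text.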
In principle it is possible for different Replication segments to | In principle, it is possible for different Replication segments to | |||
replicate packets to the same Replication segment on a Downstream | replicate packets to the same Replication segment on a downstream | |||
node. However, such usage is intentionally left out of scope of this | node. However, such usage is intentionally left out of scope of this | |||
document. | document. | |||
2.1. SR-MPLS data plane | 2.1. SR-MPLS Data Plane | |||
When the Active Segment is a Replication SID, the processing results | When the active segment is a Replication-SID, the processing results | |||
in a POP [RFC8402] operation and lookup of the associated Replication | in a POP [RFC8402] operation and the lookup of the associated RS. | |||
state. For each replication in the Replication state, the operation | For each replication in the RS, the operation is a PUSH [RFC8402] of | |||
is a PUSH [RFC8402] of the downstream Replication SID and an optional | the downstream Replication-SID and an optional segment list onto the | |||
segment list on to the packet to steer the packet to the Downstream | packet to steer the packet to the downstream node. | |||
node. | ||||
The operation performed on incoming Replication SID is NEXT [RFC8402] | The operation performed on the incoming Replication-SID is NEXT | |||
at Leaf/Bud nodes where delivery of payload off tree is per local | [RFC8402] at a leaf or bud node where delivery of payload off the | |||
configuration. For some usages, this may involve looking at the next | tree is per local configuration. For some usages, this may involve | |||
SID for example to get the necessary context. | looking at the next SID, for example, to get the necessary context. | |||
When the Root of a multi-point service steers a packet to a | When the root of a multipoint service steers a packet to a | |||
Replication segment, it results in a replication to each Downstream | Replication segment, it results in a replication to each downstream | |||
node in the associated replication state. The operation is a PUSH of | node in the associated RS. The operation is a PUSH of the | |||
the replication SID and an optional segment list on to the packet | Replication-SID and an optional segment list onto the packet, which | |||
which is forwarded to the downstream node. | is forwarded to the downstream node. | |||
The following applies to Replication SID in MPLS encapsulation: | The following applies to a Replication-SID in MPLS encapsulation: | |||
* SIDs MAY be inserted before the downstream SR-MPLS Replication SID | * SIDs MAY be inserted before the downstream SR-MPLS Replication-SID | |||
in order to guide a packet from a non-adjacent SR node to a | in order to guide a packet from a non-adjacent SR node to a | |||
Replication node. | replication node. | |||
* A Replication node MAY replicate a packet to a non-adjacent | * A replication node MAY replicate a packet to a non-adjacent | |||
Downstream node using SIDs it inserts in the copy preceding the | downstream node using SIDs it inserts in the copy preceding the | |||
downstream Replication SID. The Downstream node may be a Leaf | downstream Replication-SID. The downstream node may be a leaf | |||
node of the Replication segment, or another Replication node, or | node of the Replication segment, another replication node, or both | |||
both in case of Bud node. | in the case of a bud node. | |||
* A Replication node MAY use an Anycast SID or Border Gateway | * A replication node MAY use an Anycast-SID or a Border Gateway | |||
Protocol (BGP) PeerSet SID in segment list to send a replicated | Protocol (BGP) PeerSet-SID in the segment list to send a | |||
packet to one downstream Replication node in an Anycast set if and | replicated packet to one downstream replication node in a set of | |||
only if all nodes in the set have an identical Replication SID and | Anycast nodes. This occurs if and only if all nodes in the set | |||
reach the same set of receivers. | have an identical Replication-SID and reach the same set of | |||
receivers. | ||||
* For some use cases, there MAY be SIDs after the Replication SID in | * For some use cases, there MAY be SIDs after the Replication-SID in | |||
the segment list of a packet. These SIDs are used only by the | the segment list of a packet. These SIDs are used only by the | |||
Leaf/Bud nodes to forward a packet off the tree independent of the | leaf and bud nodes to forward a packet off the tree independent of | |||
Replication SID. Coordination regarding the absence or presence | the Replication-SID. Coordination regarding the absence or | |||
and value of context information for Leaf/Bud nodes is outside the | presence and value of context information for leaf and bud nodes | |||
scope of this document. | is outside the scope of this document. | |||
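The POP and PUSH operations at a replication node can be sketched as follows. This is a simplified model under stated assumptions, not an implementation of any forwarding plane: a label stack is a Python list with the top of stack at index 0, and the Replication state is a hypothetical dictionary keyed by the local Replication-SID. Leaf and bud delivery (the NEXT operation) is not modelled.

```python
from typing import Dict, List, Tuple

# Hypothetical Replication state table:
# local Replication-SID -> list of (steering SIDs, downstream Replication-SID).
ReplicationState = Dict[int, List[Tuple[List[int], int]]]


def replicate_mpls(label_stack: List[int],
                   rs: ReplicationState) -> List[List[int]]:
    """Return one outgoing label stack per Replication branch.

    Index 0 of each stack is the top label.
    """
    if not label_stack or label_stack[0] not in rs:
        raise ValueError("active segment is not a local Replication-SID")

    remaining = label_stack[1:]  # POP the incoming Replication-SID
    copies = []
    for steering_sids, downstream_sid in rs[label_stack[0]]:
        # PUSH the downstream Replication-SID, then the optional segment
        # list on top of it to steer the copy to a non-adjacent node.
        copies.append(list(steering_sids) + [downstream_sid] + remaining)
    return copies


rs = {16001: [([], 16002), ([15005], 16003)]}
out = replicate_mpls([16001, 900], rs)
```

With the example state, `out` is `[[16002, 900], [15005, 16003, 900]]`: the first copy goes to an adjacent downstream node, the second is steered through SID 15005. Label 900 stands for a SID that follows the Replication-SID and is left untouched for the leaf or bud node.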
2.2. SRv6 data plane | 2.2. SRv6 Data Plane | |||
For SRv6 [RFC8986], this document specifies “Endpoint with | For SRv6 [RFC8986], this document specifies "Endpoint with | |||
replication” behavior (End.Replicate for short) to replicate a packet | replication and/or decapsulate" behavior (End.Replicate for short) to | |||
and forward the replicas according to a Replication state. | replicate a packet and forward the replicas according to an RS. | |||
When processing a packet destined to a local Replication SID, the | When processing a packet destined to a local Replication-SID, the | |||
packet is replicated according to the associated Replication state to | packet is replicated according to the associated RS to downstream | |||
Downstream nodes and/or locally delivered off tree when this is a | nodes and/or locally delivered off the tree when this is a leaf or | |||
Leaf/Bud node.For replication, the outer header is re-used, and the | bud node. For replication, the outer header is reused, and the | |||
Downstream Replication SID, from Replication state, is written into | downstream Replication-SID, from RS, is written into the outer IPv6 | |||
the outer IPv6 header destination address. If required, an optional | header Destination Address (DA). If required, an optional segment | |||
segment list may be used on some branches using H.Encaps.Red | list may be used on some branches using H.Encaps.Red [RFC8986] (while | |||
[RFC8986] (while some other branches may not need that). Note that | some other branches may not need that). Note that this H.Encaps.Red | |||
this H.Encaps.Red is independent of the replication segment – it is | is independent of the Replication segment: it is just used to steer | |||
just used to steer the replicated packet on a traffic engineered path | the replicated packet on a traffic-engineered path to a downstream | |||
to a Downstream node. The penultimate segment in encapsulating IPv6 | node. The penultimate segment in the encapsulating IPv6 header will | |||
header will execute Ultimate Segment Decapsulation (USD) flavor | execute the Ultimate Segment Decapsulation (USD) flavor [RFC8986] of | |||
[RFC8986] of End/End.X behavior and forward the inner (replicated) | End/End.X behavior and forward the inner (replicated) packet to the | |||
packet to the Downstream node. If H.Encaps.Red is used to steer a | downstream node. If H.Encaps.Red is used to steer a replicated | |||
replicated packet to a Downstream node, the operator must ensure the | packet to a downstream node, the operator must ensure the MTU on path | |||
MTU on path to the Downstream node is sufficient to account for | to the downstream node is sufficient to account for additional SRv6 | |||
additional SRv6 encapsulation. This also applies when the | encapsulation. This also applies when the Replication segment is for | |||
Replication segment is for the Root node, whose upstream node has | the root node, whose upstream node has placed the Replication-SID in | |||
placed the Replication-SID in the header. | the header. | |||
A local application on Root, for e.g. MVPN [RFC6513] or EVPN | A local application on root (e.g., MVPN [RFC6513] or EVPN [RFC7432]) | |||
[RFC7432], may also apply H.Encaps.Red and then steer the resulting | may also apply H.Encaps.Red and then steer the resulting traffic into | |||
traffic into the Replication segment. Again, note that the | the Replication segment. Again, note that H.Encaps.Red is | |||
H.Encaps.Red is independent of the Replication segment – it is the | independent of the Replication segment: it is the action of the | |||
action of the application (e.g. MVPN/EVPN service). If the service | application (e.g. MVPN or EVPN service). If the service is on a | |||
is on a Root node, the two H.Encaps mentioned, one for the service | root node, then the two H.Encaps mentioned, one for the service and | |||
and other in the previous paragraph for replication to Downstream | the other in the previous paragraph for replication to the downstream | |||
node SHOULD be combined for optimization (to avoid extra IPv6 | node, SHOULD be combined for optimization (to avoid extra IPv6 | |||
encapsulation). | encapsulation). | |||
When processing a packet destined to a local Replication SID, IPv6 | When processing a packet destined to a local Replication-SID, the | |||
Hop Limit MUST be decremented and MUST be non-zero to replicate the | IPv6 Hop Limit MUST be decremented and MUST be non-zero to replicate | |||
packet. A Root node that encapsulates a payload can set the IPv6 Hop | the packet. A root node that encapsulates a payload can set the IPv6 | |||
Limit based on a local policy. This local policy SHOULD set the IPv6 | Hop Limit based on a local policy. This local policy SHOULD set the | |||
Hop Limit so that a replicated packet can reach the furthest Leaf | IPv6 Hop Limit so that a replicated packet can reach the furthest | |||
node. A Root node can also have a local policy to set the IPv6 Hop | leaf node. A root node can also have a local policy to set the IPv6 | |||
Limit from the payload. In this case, IPv6 Hop Limit may not be | Hop Limit from the payload. In this case, the IPv6 Hop Limit may not | |||
sufficient to get the replicated packet to all the Leaf nodes; non- | be sufficient to get the replicated packet to all the leaf nodes. | |||
replication nodes i.e. nodes which forward replicated packets based | Non-replication nodes (i.e., nodes that forward replicated packets | |||
on IPv6 locator unicast prefix can decrement IPv6 Hop Limit to zero | based on the IPv6 locator unicast prefix) can decrement the IPv6 Hop | |||
and originate ICMPv6 Error packets to the Root node. This can result | Limit to zero and originate ICMPv6 error packets to the root node. | |||
in a storm of ICMPv6 packets (see Section 2.2.3) to the Root node. | This can result in a storm of ICMPv6 packets (see Section 2.2.3) to | |||
To avoid this, a Replication Segment has an optional IPv6 Hop Limit | the root node. To avoid this, a Replication segment has an optional | |||
threshold. If this threshold is set, a Replication node MUST discard | IPv6 Hop Limit Threshold. If this threshold is set, a replication | |||
an incoming packet with local Replication SID if the IPv6 Hop Limit | node MUST discard an incoming packet with a local Replication-SID if | |||
in the packet is less than the threshold and log this in a rate | the IPv6 Hop Limit in the packet is less than the threshold and log | |||
limited manner. The IPv6 Hop Limit Threshold SHOULD be set so that | this in a rate-limited manner. The IPv6 Hop Limit Threshold SHOULD | |||
incoming packet can be replicated to furthest Leaf node. | be set so that an incoming packet can be replicated to the furthest | |||
leaf node. | ||||
For Leaf/Bud nodes local delivery off the tree is per Replication SID | For leaf and bud nodes, local delivery off the tree is per | |||
or next SID (if present in SRH). For some usages, this may involve | Replication-SID or the next SID (if present in the SRH). For some | |||
getting the necessary context either from the next SID (e.g., MVPN | usages, this may involve getting the necessary context either from | |||
with shared tree) or from the replication SID itself (e.g., MVPN with | the next SID (e.g., MVPN with a shared tree) or from the Replication- | |||
non-shared tree). In both cases, the context association is achieved | SID itself (e.g., MVPN with a non-shared tree). In both cases, the | |||
with signaling and is out of scope of this document. | context association is achieved with signaling and is out of scope of | |||
this document. | ||||
The following applies to Replication SID in SRv6 encapsulation: | The following applies to a Replication-SID in SRv6 encapsulation: | |||
* There MAY be SIDs preceding the SRv6 Replication SID in order to | * There MAY be SIDs preceding the SRv6 Replication-SID in order to | |||
guide a packet from a non-adjacent SR node to a Replication node | guide a packet from a non-adjacent SR node to a replication node | |||
via an explicit path. | via an explicit path. | |||
* A Replication node MAY steer a replicated packet on an explicit | * A replication node MAY steer a replicated packet on an explicit | |||
path to a non-adjacent Downstream node using SIDs it inserts in | path to a non-adjacent downstream node using SIDs it inserts in | |||
the copy preceding the downstream Replication SID. The Downstream | the copy preceding the downstream Replication-SID. The downstream | |||
node may be a Leaf node of the Replication segment, or another | node may be a leaf node of the Replication segment, another | |||
Replication node, or both in case of Bud node. | replication node, or both in the case of a bud node. | |||
* For SRv6, as described in above paragraphs, the insertion of SIDs | * For SRv6, as described in above paragraphs, the insertion of SIDs | |||
prior to Replication SID entails a new IPv6 encapsulation with | prior to the Replication-SID entails a new IPv6 encapsulation with | |||
SRH, but this can be optimized on Root node or for compressed SRv6 | the SRH. However, this can be optimized on the root node or for | |||
SIDs. | compressed SRv6 SIDs. | |||
* The locator of Replication SID is sufficient to guide a packet on | * The locator of the Replication-SID is sufficient to guide a packet | |||
shortest path, for default or Flexible algorithm, between non- | on the shortest path between non-adjacent nodes for default or | |||
adjacent nodes. | Flexible Algorithms. | |||
* A Replication node MAY use an Anycast SID or BGP PeerSet SID in | * A replication node MAY use an Anycast-SID or a BGP PeerSet-SID in | |||
segment list to send a replicated packet to one downstream | the segment list to send a replicated packet to one downstream | |||
Replication node in an Anycast set if and only if all nodes in the | replication node in an Anycast set. This occurs if and only if | |||
set have an identical Replication SID and reach the same set of | all nodes in the set have an identical Replication-SID and reach | |||
receivers. | the same set of receivers. | |||
* There MAY be SIDs after the Replication SID in the SRH of a | * There MAY be SIDs after the Replication-SID in the SRH of a | |||
packet. These SIDs are used to provide additional context for | packet. These SIDs are used to provide additional context for | |||
processing a packet locally at the node where the Replication SID | processing a packet locally at the node where the Replication-SID | |||
is the Active Segment. Coordination regarding the absence or | is the active segment. Coordination regarding the absence or | |||
presence and value of context information for Leaf/Bud nodes is | presence and value of context information for leaf and bud nodes | |||
outside the scope of this document. | is outside the scope of this document. | |||
2.2.1. End.Replicate: Replicate and/or Decapsulate | 2.2.1. End.Replicate: Replicate and/or Decapsulate | |||
The "Endpoint with replication and/or decapsulate behavior | The "Endpoint with replication and/or decapsulate" (End.Replicate for | |||
(End.Replicate for short) is variant of End behavior. The pseudo- | short) is a variant of End behavior. The pseudocode in this section | |||
code in this section follows the convention introduced in RFC 8986 | follows the convention introduced in [RFC8986]. | |||
[RFC8986]. | ||||
A Replication state conceptually contains the following elements: | An RS conceptually contains the following elements: | |||
Replication state: | Replication state: | |||
{ | { | |||
Node-Role: {Head, Transit, Leaf, Bud}; | Node-Role: {Head, Transit, Leaf, Bud}; | |||
IPv6 Hop Limit Threshold; # default is zero | IPv6 Hop Limit Threshold; # default is zero | |||
# On Leaf, replication list is zero length | # On Leaf, replication list is zero length | |||
Replication-List: | Replication-List: | |||
{ | { | |||
Downstream node: <Node-Identifier>; | downstream node: <Node-Identifier>; | |||
Downstream Replication SID: R-SID; | downstream Replication-SID: R-SID; | |||
# Segment-List may be empty | # Segment-List may be empty | |||
Segment-List: [SID-1, .... SID-N]; | Segment-List: [SID-1, .... SID-N]; | |||
} | } | |||
} | } | |||
Below is the Replicate function on a packet for Replication state | Below is the Replicate function on a packet for Replication state | |||
(RS). | (RS). | |||
S01. Replicate(RS, packet) | S01. Replicate(RS, packet) | |||
S02. { | S02. { | |||
S03. For each Replication R in RS.Replication-List { | S03. For each Replication R in RS.Replication-List { | |||
S04. Make a copy of the packet | S04. Make a copy of the packet | |||
S05. Set IPv6 DA = R.R-SID | S05. Set IPv6 DA = R.R-SID | |||
S06. If R.Segment-List is not empty { | S06. If R.Segment-List is not empty { | |||
S07. # Head node may optimize below encapsulation and | S07. # Head node may optimize below encapsulation and | |||
S08. # the encapsulation of packet in a single encapsulation | S08. # the encapsulation of packet in a single encapsulation | |||
S09. Execute H.Encaps or H.Encaps.Red with R.Segment-List | S09. Execute H.Encaps or H.Encaps.Red with R.Segment-List | |||
on packet copy #RFC 8986 Section 5.1, 5.2 | on packet copy #RFC 8986, Sections 5.1 and 5.2 | |||
S10. } | S10. } | |||
S11. Submit the packet to the egress IPv6 FIB lookup and | S11. Submit the packet to the egress IPv6 FIB lookup and | |||
transmission to the new destination | transmission to the new destination | |||
S12. } | S12. } | |||
S13. } | S13. } | |||
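The Replicate function above can be sketched in Python as follows. This is a minimal illustration, not a normative implementation: the class and field names (Replication, Packet, r_sid, segment_list, inner) are hypothetical, and H.Encaps is reduced to a stand-in that wraps the copy in an outer packet addressed to the first SID of the segment list. The sketch reads S05, S06, and S09 as referring to the Replication-SID and Segment-List of the list entry R being iterated.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Replication:
    """One entry of Replication-List (field names are illustrative)."""
    downstream_node: str
    r_sid: str                                   # downstream Replication-SID
    segment_list: List[str] = field(default_factory=list)


@dataclass
class Packet:
    """Toy packet carrying only what this sketch needs."""
    dst: str                                     # IPv6 DA
    payload: bytes = b""
    inner: Optional["Packet"] = None             # set on an outer encapsulation


def replicate(replication_list: List[Replication], packet: Packet) -> List[Packet]:
    """Return one outgoing packet per downstream entry (S03-S12)."""
    out = []
    for r in replication_list:
        # S04-S05: copy the packet; the DA comes from local state, not the SRH.
        copy = Packet(dst=r.r_sid, payload=packet.payload)
        if r.segment_list:
            # S06-S09: simplified stand-in for H.Encaps / H.Encaps.Red.
            # The outer header is steered to the first SID of the list.
            copy = Packet(dst=r.segment_list[0], inner=copy)
        # S11: a real node would now do the egress FIB lookup and transmit.
        out.append(copy)
    return out
```

On a leaf node, the replication list is empty and the function returns no copies, which matches the comment in the Replication state that the list is zero length on a leaf.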
Notes: | Notes: | |||
* The IPv6 destination address in the copy of a packet is set from | * The IPv6 DA in the copy of a packet is set from the local state | |||
local state and not from SRH | and not from the SRH. | |||
When N receives a packet whose IPv6 DA is S and S is a local | When N receives a packet whose IPv6 DA is S and S is a local | |||
End.Replicate SID, N does: | End.Replicate SID, N does: | |||
S01. Lookup FUNCT portion of S to get Replication state RS | S01. Lookup FUNCT portion of S to get Replication state (RS) | |||
S02. If (IPv6 Hop Limit <= 1) { | S02. If (IPv6 Hop Limit <= 1) { | |||
S03. Discard the packet | S03. Discard the packet | |||
S04. # ICMPv6 Time Exceeded is not permitted (ICMPv6 section below) | S04. # ICMPv6 Time Exceeded is not permitted | |||
S05. } | (see Section 2.2.3) | |||
S06. If RS is not found { | S05. } | |||
S07. Discard the packet | S06. If RS is not found { | |||
S08. } | S07. Discard the packet | |||
S09. If (IPv6 Hop Limit < RS.IPv6 Hop Limit Threshold) { | S08. } | |||
S10. Discard the packet | S09. If (IPv6 Hop Limit < RS.IPv6 Hop Limit Threshold) { | |||
S11. # Rate-limited logging | S10. Discard the packet | |||
S12. } | S11. # Rate-limited logging | |||
S13. Decrement IPv6 Hop Limit by 1 | S12. } | |||
S14. If (IPv6 NH == SRH and SRH TLVs present) { | S13. Decrement IPv6 Hop Limit by 1 | |||
S15. Process SRH TLVs if allowed by local configuration | S14. If (IPv6 NH == SRH and SRH TLVs present) { | |||
S16. } | S15. Process SRH TLVs if allowed by local configuration | |||
S17. Call Replicate(RS, packet) | S16. } | |||
S18. If (RS.Node-Role == Leaf OR RS.Node-Role == Bud) { | S17. Call Replicate(RS, packet) | |||
S19. If (IPv6 NH == SRH and Segments Left > 0) { | S18. If (RS.Node-Role == Leaf OR RS.Node-Role == Bud) { | |||
S20. Derive packet processing context(PPC) from Segment List | S19. If (IPv6 NH == SRH and Segments Left > 0) { | |||
S21. If (Segments Left != 0) { | S20. Derive packet processing context (PPC) from Segment List | |||
S22. Discard the packet | S21. If (Segments Left != 0) { | |||
S23. # ICMPv6 Parameter Problem with Code 0 | S22. Discard the packet | |||
S24. # (Erroneous header field encountered) | S23. # ICMPv6 Parameter Problem message with Code 0 | |||
S25. # is not permitted (ICMPv6 section below) | S24. # (Erroneous header field encountered) | |||
S26. } | S25. # is not permitted (Section 2.2.3) | |||
S27. } Else { | S26. } | |||
S28. Derive packet processing context(PPC) | S27. } Else { | |||
from FUNCT of Replication SID | S28. Derive packet processing context (PPC) | |||
S29. } | from FUNCT of Replication-SID | |||
S30. Process the next header | S29. } | |||
S31. } | S30. Process the next header | |||
S31. } | ||||
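The discard checks that precede replication (S02 through S13) can be sketched as a small Python function. This is an illustration under stated assumptions: the function name, the Verdict enumeration, and the boolean rs_found argument are hypothetical and stand in for the FUNCT lookup and logging that a real node would perform.

```python
from enum import Enum
from typing import Tuple


class Verdict(Enum):
    DISCARD = "discard"
    DISCARD_AND_LOG = "discard with rate-limited log"
    REPLICATE = "replicate"


def end_replicate_gate(hop_limit: int, rs_found: bool,
                       threshold: int = 0) -> Tuple[Verdict, int]:
    """Return (verdict, hop limit after processing) for S02-S13."""
    if hop_limit <= 1:
        # S02-S04: discard; no ICMPv6 Time Exceeded is originated.
        return Verdict.DISCARD, hop_limit
    if not rs_found:
        # S06-S07: no Replication state for this SID.
        return Verdict.DISCARD, hop_limit
    if hop_limit < threshold:
        # S09-S11: below the optional IPv6 Hop Limit Threshold.
        return Verdict.DISCARD_AND_LOG, hop_limit
    # S13: decrement, then continue with TLVs and Replicate().
    return Verdict.REPLICATE, hop_limit - 1
```

With the default threshold of zero, the third check never fires, so only the Hop Limit and state lookups can cause a discard.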
The processing of Upper-Layer header of a packet matching | The processing of the Upper-Layer header of a packet matching the | |||
End.Replicate SID at Leaf/Bud node is as follows: | End.Replicate SID at a leaf or bud node is as follows: | |||
S01. If (Upper-Layer header type == 4(IPv4) OR | S01. If (Upper-Layer header type == 4(IPv4) OR | |||
Upper-Layer header type == 41(IPv6) ) { | Upper-Layer header type == 41(IPv6) ) { | |||
S02. Remove the outer IPv6 header with all its extension headers | S02. Remove the outer IPv6 header with all its extension headers | |||
S03. Process the packet in context of PPC | S03. Process the packet in context of PPC | |||
S04. } Else If (Upper-Layer header type == 143(Ethernet) ) { | S04. } Else If (Upper-Layer header type == 143(Ethernet) ) { | |||
S05. Remove the outer IPv6 header with all its extension headers | S05. Remove the outer IPv6 header with all its extension headers | |||
S06. Process the Ethernet Frame in context of PPC | S06. Process the Ethernet Frame in context of PPC | |||
S07. } Else If (Upper-Layer header type is allowed | S07. } Else If (Upper-Layer header type is allowed | |||
by local configuration) { | by local configuration) { | |||
S08. Proceed to process the Upper-Layer header | S08. Proceed to process the Upper-Layer header | |||
S09. } Else { | S09. } Else { | |||
S10. Discard the packet | S10. Discard the packet | |||
S11. # ICMPv6 Parameter Problem with Code 4 | S11. # ICMPv6 Parameter Problem message with Code 4 | |||
S12. # (SR Upper-layer Header Error) | S12. # (SR Upper-Layer header Error) | |||
S13. # is not permitted (ICMPv6 section below) | S13. # is not permitted (Section 2.2.3) | |||
S14. } | S14. } | |||
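The Upper-Layer header dispatch above can be sketched in Python as follows. The function name, the returned strings, and the locally_allowed parameter are hypothetical; the numeric values 4, 41, and 143 are the header types named in the pseudocode.

```python
from typing import FrozenSet

IPV4, IPV6, ETHERNET = 4, 41, 143   # header types named in S01 and S04


def upper_layer_action(header_type: int,
                       locally_allowed: FrozenSet[int] = frozenset()) -> str:
    """Map an Upper-Layer header type to the action taken at a leaf or bud node."""
    if header_type in (IPV4, IPV6):
        # S02-S03: strip the outer IPv6 header and extension headers.
        return "decapsulate, process inner IP packet in PPC"
    if header_type == ETHERNET:
        # S05-S06
        return "decapsulate, process Ethernet frame in PPC"
    if header_type in locally_allowed:
        # S07-S08: allowed by local configuration.
        return "process Upper-Layer header"
    # S10-S13: discard without originating ICMPv6 Parameter Problem Code 4.
    return "discard silently"
```

For example, an ICMPv6 Echo Request (header type 58) is only processed if local configuration allows it, which is the case the ping procedure in the OAM section relies on.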
Notes: | Notes: | |||
* The behavior above MAY result in a packet with partially processed | * The behavior above MAY result in a packet with a partially | |||
segment list in SRH under some circumstances. For example, a head | processed segment list in the SRH under some circumstances. For | |||
node may encode a context SID in an SRH. As per pseudo-code | example, a head node may encode a context-SID in an SRH. As per | |||
above, a Replication node that receives a packet with local | the pseudocode above, a replication node that receives a packet | |||
Replication SID will not process the SRH segment list and just | with a local Replication-SID will not process the SRH segment list | |||
forward a copy with unmodified SRH to Downstream nodes. | and will just forward a copy with an unmodified SRH to downstream | |||
nodes. | ||||
* The packet processing context usually is a FIB table T | * The packet processing context is usually a FIB table "T". | |||
Processing the Replication SID may modify, if configured to process | If configured to process TLVs, processing the Replication-SID may | |||
TLVs, the "variable-length data" of TLV types that change en route. | modify the "variable-length data" of TLV types that change en route. | |||
Therefore, TLVs that change en route are mutable. The remainder of | Therefore, TLVs that change en route are mutable. The remainder of | |||
the SRH (Segments Left, Flags, Tag, Segment List, and TLVs that do | the SRH (Segments Left, Flags, Tag, Segment List, and TLVs that do | |||
not change en route) are immutable while processing this SID. | not change en route) are immutable while processing this SID. | |||
2.2.1.1. Hashed Message Authentication Code (HMAC) SRH TLV | 2.2.1.1. Hashed Message Authentication Code (HMAC) SRH TLV | |||
If a Root node encodes a context SID in SRH with an optional HMAC SRH | If a root node encodes a context-SID in an SRH with an optional HMAC | |||
TLV [RFC8754], it MUST set the 'D' bit as defined in Section 2.1.2 | SRH TLV [RFC8754], it MUST set the 'D' bit as defined in | |||
because the Replication SID is not part of the segment list in SRH. | Section 2.1.2 of [RFC8754] because the Replication-SID is not part of | |||
the segment list in the SRH. | ||||
HMAC generation and verification is as specified in RFC 8754. | HMAC generation and verification is as specified in [RFC8754]. | |||
Verification of HMAC TLV is determined by local configuration. If | Verification of an HMAC TLV is determined by local configuration. If | |||
verification fails, an implementation of Replication SID MUST NOT | verification fails, an implementation of a Replication-SID MUST NOT | |||
originate an ICMPv6 error message (parameter problem, code 0). The | originate an ICMPv6 Parameter Problem message with code 0. The | |||
failure SHOULD be logged (rate limited) and the packet SHOULD be | failure SHOULD be logged (rate-limited) and the packet SHOULD be | |||
discarded. | discarded. | |||
2.2.2. OAM Operations | 2.2.2. OAM Operations | |||
RFC 9259 [RFC9259] specifies procedures for OAM operations like ping | [RFC9259] specifies procedures for Operations, Administration, and | |||
and traceroute on SRv6 SIDs. | Maintenance (OAM) like ping and traceroute on SRv6 SIDs. | |||
It is possible to ping a Replication SID of a Leaf/Bud node, assuming | Assuming the source node knows the Replication-SID a priori, it is | |||
the source node knows the Replication SID a priori, directly by | possible to ping a Replication-SID of a leaf or bud node directly by | |||
putting it in the IPv6 destination address without a SRH or in a SRH | putting it in the IPv6 DA without an SRH or in an SRH as the last | |||
as the last segment. While it is not possible to ping a Replication | segment. While it is not possible to ping a Replication-SID of a | |||
SID of a transit node because transit nodes do not process upper | transit node because transit nodes do not process Upper-Layer | |||
layer headers, it is still possible to ping a Replication SID of | headers, it is still possible to ping a Replication-SID of a leaf or | |||
Leaf/Bud node of a tree via the Replication SID of intermediate | bud node of a tree via the Replication-SID of intermediate transit | |||
transit nodes. The source of ping MUST compute the ICMPv6 Echo | nodes. The source of the ping MUST compute the ICMPv6 Echo Request | |||
Request checksum using the Replication SID of Leaf/Bud as destination | checksum using the Replication-SID of the leaf or bud node as the DA. | |||
address. The source can then send the Echo Request packet to a | The source can then send the Echo Request packet to a transit node's | |||
transit node's Replication SID. The transit nodes replicate the | Replication-SID. The transit node replicates the packet by replacing | |||
packet by replacing the IPv6 destination address till the packet | the IPv6 DA until the packet reaches the leaf or bud node, which | |||
reaches the Leaf/Bud node which responds with an ICMPv6 Echo Reply. | responds with an ICMPv6 Echo Reply. Note that a transit replication | |||
Note that a transit Replication node may replicate Echo Request | node may replicate Echo Request packets to other leaf or bud nodes. | |||
packets to other Leaf/Bud nodes. These nodes will drop the Echo | These nodes will drop the Echo Request due to an incorrect checksum. | |||
Request due to incorrect checksum. Procedures to prevent the mis- | Procedures to prevent the misdelivery of an Echo Request may be | |||
delivery of Echo Request may be addressed in a future document. | addressed in a future document. Appendix A.2.1 illustrates examples | |||
Appendix A.2.1 illustrates examples of ping to a Replication SID. | of a ping to a Replication-SID. | |||
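The checksum rule in the ping procedure can be illustrated with a short Python sketch. The ICMPv6 checksum covers a pseudo-header that includes the IPv6 destination address, so the source computes it over the Replication-SID of the target leaf or bud node even though the packet is first sent to a transit node's Replication-SID. The function name and the addresses used with it are hypothetical; the pseudo-header layout follows the standard IPv6 upper-layer checksum rules.

```python
import ipaddress
import struct

ICMPV6 = 58  # IPv6 Next Header value for ICMPv6


def icmpv6_checksum(src: str, dst: str, icmp_message: bytes) -> int:
    """One's-complement checksum over the IPv6 pseudo-header and the message.

    icmp_message must carry a zero Checksum field when computing a new
    checksum. A result of 0 over a message that already carries its
    checksum means the checksum verifies.
    """
    pseudo_header = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", len(icmp_message), ICMPV6)
    )
    data = pseudo_header + icmp_message
    if len(data) % 2:
        data += b"\x00"                      # pad to a 16-bit boundary
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A node that verifies the Echo Request against any other destination address, such as a different leaf reached by the same replication, computes a non-zero result and drops the packet, which is the misdelivery behavior described above.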
Traceroute to a Leaf/Bud node Replication SID is not possible due to | Traceroute to a leaf or bud node Replication-SID is not possible due | |||
restriction prohibiting origination of ICMPv6 Time Exceeded error | to restrictions prohibiting the origination of the ICMPv6 Time | |||
message for a Replication SID as described in the section below. | Exceeded error message for a Replication-SID as described in | |||
Section 2.2.3. | ||||
2.2.3. ICMPv6 Error Messages | 2.2.3. ICMPv6 Error Messages | |||
ICMPv6 RFC [RFC4443] Section 2.4 states an ICMPv6 error message MUST | Section 2.4 of [RFC4443] states an ICMPv6 error message MUST NOT be | |||
NOT be originated as a result of receiving a packet destined to an | originated as a result of receiving a packet destined to an IPv6 | |||
IPv6 multicast address. This is to prevent a storm of ICMPv6 error | multicast address. This is to prevent a source node from being | |||
messages resulting from replicated IPv6 packets from overwhelming a | overwhelmed by a storm of ICMPv6 error messages resulting from | |||
source node. There are two exceptions (1) the Packet Too Big message | replicated IPv6 packets. There are two exceptions: | |||
for Path MTU discovery, and (2) Parameter Problem Message, Code 2 | ||||
reporting an unrecognized IPv6 option. An implementation of | ||||
Replication segment for SRv6 MUST enforce these same restrictions and | ||||
exceptions. | ||||
3. Implementation Status | ||||
Note to the RFC Editor: Please remove this section and reference to | ||||
RFC 7942 before publication. | ||||
This section records the status of known implementations of the | ||||
protocol defined by this specification at the time of posting of this | ||||
Internet-Draft, and is based on a proposal described in RFC 7942 | ||||
[RFC7942]. The description of implementations in this section is | ||||
intended to assist the IETF in its decision processes in progressing | ||||
drafts to RFCs. Please note that the listing of any individual | ||||
implementation here does not imply endorsement by the IETF. | ||||
Furthermore, no effort has been spent to verify the information | ||||
presented here that was supplied by IETF contributors. This is not | ||||
intended as, and must not be construed to be, a catalog of available | ||||
implementations or their features. Readers are advised to note that | ||||
other implementations may exist. According to RFC 7942 [RFC7942], | ||||
"this will allow reviewers and working groups to assign due | ||||
consideration to documents that have the benefit of running code, | ||||
which may serve as evidence of valuable experimentation and feedback | ||||
that have made the implemented protocols more mature. It is up to | ||||
the individual working groups to use this information as they see | ||||
fit". | ||||
There are two known implementations of this draft by Cisco and Nokia. | ||||
Interoperability reports for the implementations are not applicable | ||||
since this draft does not specify inter-operable elements of | ||||
Replication segments. | ||||
3.1. Cisco implementation | ||||
Cisco Implementation uses Replication segments defined in this draft | 1. The Packet Too Big message for Path MTU discovery, and | |||
as a basis for PCE to compute and establish P2MP trees in SR domain | ||||
to provide multi-point services. The implementation, based on latest | ||||
version of this draft, is in production and supports all MUST and | ||||
SHOULD clauses for SR-MPLS Replication segments. The documentation | ||||
is available at Cisco documentation | ||||
(https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/ | ||||
asr9k-r7-3/segment-routing/configuration/guide/b-segment-routing-cg- | ||||
asr9000-73x/b-segment-routing-cg-asr9000-71x_chapter_01001.html) and | ||||
the point of contact is Rishabh Parekh (riparekh@cisco.com). | ||||
3.2. Nokia implementation | 2. The ICMPv6 Parameter Problem message with Code 2 reporting an | |||
unrecognized IPv6 option. | ||||
Nokia has implemented replication SID as defined in this draft to | An implementation of a Replication segment for SRv6 MUST enforce | |||
establish P2MP tree in segment routing domain. The implementation | these same restrictions and exceptions. | |||
supports SR-MPLS encapsulation and has all the MUST and SHOULD clause | ||||
in this draft. The implementation is at general availability | ||||
maturity and is compliant with the latest version of the draft. The | ||||
documentation for implementation can be found at Nokia help | ||||
(https://infocenter.nokia.com/public/7750SR207R1A/ | ||||
index.jsp?topic=%2Fcom.sr.multicast%2Fhtml%2Ftreesid.html) and the | ||||
point of contact is hooman.bidgoli@nokia.com. | ||||
4. IANA Considerations | 3. IANA Considerations | |||
IANA has assigned the following codepoint for End.Replicate behavior | IANA has assigned the following codepoint for End.Replicate behavior | |||
in the "SRv6 Endpoint Behaviors" registry in the "Segment Routing" | in the "SRv6 Endpoint Behaviors" registry in the "Segment Routing" | |||
registry group. | registry group. | |||
+=======+========+===================+===========+ | +=======+========+===================+===========+============+ | |||
| Value | Hex | Endpoint behavior | Reference | | | Value | Hex | Endpoint Behavior | Reference | Change | | |||
+=======+========+===================+===========+ | | | | | | Controller | | |||
| 75 | 0x004B | End.Replicate | [This.ID] | | +=======+========+===================+===========+============+ | |||
+-------+--------+-------------------+-----------+ | | 75 | 0x004B | End.Replicate | RFC 9524 | IETF | | |||
+-------+--------+-------------------+-----------+------------+ | ||||
Table 1: IETF - SRv6 Endpoint Behaviors | Table 1: SRv6 Endpoint Behavior | |||
5. Security Considerations | 4. Security Considerations | |||
The SID behaviors defined in this document are deployed within an SR | The SID behaviors defined in this document are deployed within an SR | |||
domain [RFC8402]. An SR domain needs protection from outside | domain [RFC8402]. An SR domain needs protection from outside | |||
attackers as described in [RFC8754] and following is a brief reminder | attackers (as described in [RFC8754]). The following is a brief | |||
of the same: | reminder of the same: | |||
* For SR-MPLS deployments: | * For SR-MPLS deployments: | |||
- By disabling MPLS on external interfaces of each edge node or | - Disable MPLS on external interfaces of each edge node or any | |||
any other technique to filter labeled traffic ingress on these | other technique to filter labeled traffic ingress on these | |||
interfaces. | interfaces. | |||
* For SRv6 deployments: | * For SRv6 deployments: | |||
- Allocate all the SIDs from an IPv6 prefix block S/s and | - Allocate all the SIDs from an IPv6 prefix block S/s and | |||
configure each external interface of each edge node of the | configure each external interface of each edge node of the | |||
domain with an inbound infrastructure access list (IACL) that | domain with an inbound Infrastructure Access Control List | |||
drops any incoming packet with a destination address in S/s. | (IACL) that drops any incoming packet with a DA in S/s. | |||
- Additionally, an iACL may be applied to all nodes (k) | - Additionally, an IACL may be applied to all nodes (k) | |||
provisioning SIDs as defined in this specification: | provisioning SIDs as defined in this specification: | |||
o Assign all interface addresses from within IPv6 prefix A/a. | o Assign all interface addresses from within IPv6 prefix A/a. | |||
At node k, all SIDs local to k are assigned from prefix Sk/ | At node k, all SIDs local to k are assigned from prefix Sk/ | |||
sk. Configure each internal interface of each SR node k in | sk. Configure each internal interface of each SR node k in | |||
the SR domain with an inbound IACL that drops any incoming | the SR domain with an inbound IACL that drops any incoming | |||
packet with a destination address in Sk/sk if the source | packet with a DA in Sk/sk if the source address is not in A/ | |||
address is not in A/a. | a. | |||
- Denying traffic with spoofed source addresses by implementing | - Deny traffic with spoofed source addresses by implementing | |||
recommendations in BCP 84 [RFC3704]. | recommendations in BCP 84 [RFC3704]. | |||
- Additionally the block S/s from which SIDs are allocated may be | - Additionally, the block S/s from which SIDs are allocated may | |||
a non-globally-routable address such as ULA or the prefix | be an address that is not globally routable such as a Unique | |||
defined in [I-D.ietf-6man-sids]. | Local Address (ULA) or the prefix defined in [SIDS-SRv6]. | |||
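The two IACL rules for SRv6 deployments listed above can be sketched as a Python predicate. This is an illustration only: the function name is hypothetical, and the prefixes chosen for S/s, Sk/sk, and A/a are taken from the IPv6 documentation range rather than from any real deployment.

```python
import ipaddress

# Hypothetical example prefixes from the IPv6 documentation range.
SID_BLOCK = ipaddress.ip_network("2001:db8:cccc::/48")         # S/s
NODE_K_SIDS = ipaddress.ip_network("2001:db8:cccc:7::/64")     # Sk/sk
INTERFACE_BLOCK = ipaddress.ip_network("2001:db8:aaaa::/48")   # A/a


def iacl_drops(src: str, dst: str, external_interface: bool) -> bool:
    """Return True if the inbound IACL drops the packet."""
    source = ipaddress.ip_address(src)
    destination = ipaddress.ip_address(dst)
    if external_interface:
        # Edge rule: drop anything addressed into the SID block S/s.
        return destination in SID_BLOCK
    # Internal rule at node k: packets to local SIDs in Sk/sk are
    # accepted only from sources inside A/a.
    return destination in NODE_K_SIDS and source not in INTERFACE_BLOCK
```

The edge rule keeps outside traffic from reaching any SID, while the internal rule is an additional check for a packet that has already entered the domain.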
Failure to protect the SR MPLS domain by correctly provisioning MPLS | Failure to protect the SR-MPLS domain by correctly provisioning MPLS | |||
support per interface permits attackers from outside the domain to | support per interface permits attackers from outside the domain to | |||
send packets that use the replication services provisioned within the | send packets that use the replication services provisioned within the | |||
domain. | domain. | |||
Failure to protect the SRv6 domain with IACLs on external interfaces, | Failure to protect the SRv6 domain with IACLs on external interfaces | |||
combined with failure to implement BCP 38 [RFC2827]or apply IACLs on | combined with failure to implement the recommendations of BCP 38 | |||
nodes provisioning SIDs, permits attackers from outside the SR domain | [RFC2827] or apply IACLs on nodes provisioning SIDs permits attackers | |||
to send packets that use the replication services provisioned within | from outside the SR domain to send packets that use the replication | |||
the domain. | services provisioned within the domain. | |||
Given the definition of the Replication segment in this document, an | Given the definition of the Replication segment in this document, an | |||
attacker subverting ingress filter above cannot take advantage of a | attacker subverting the ingress filters above cannot take advantage | |||
stack of replication segments to perform amplification attacks nor | of a stack of Replication segments to perform amplification attacks | |||
link exhaustion attacks. Replication segment trees always terminate | nor link exhaustion attacks. Replication segment trees always | |||
at a Leaf or Bud node resulting in a decapsulation. This however | terminate at a leaf or bud node resulting in a decapsulation. | |||
does allow an attacker to inject traffic to the receivers within a | However, this does allow an attacker to inject traffic to the | |||
P2MP service. | receivers within a P2MP service. | |||
This document introduces a SR segment endpoint behavior that | This document introduces an SR segment endpoint behavior that | |||
replicates and decapsulates an inner payload for both the MPLS and | replicates and decapsulates an inner payload for both the MPLS and | |||
IPv6 data planes. Similar to any MPLS end of stack label, or SRv6 | IPv6 data planes. Similar to any MPLS end-of-stack label, or SRv6 | |||
END.D* behavior, if the protections described above are not | END.D* behavior, if the protections described above are not | |||
implemented an attacker can perform an attack via the decapsulating | implemented, an attacker can perform an attack via the decapsulating | |||
segment (including the one described in this document). | segment (including the one described in this document). | |||
Incorrect provisioning of Replication segments can result in a chain | Incorrect provisioning of Replication segments can result in a chain | |||
of Replication segments forming a loop. This can happen if | of Replication segments forming a loop. This can happen if | |||
Replication segments are provisioned on SR nodes without using a | Replication segments are provisioned on SR nodes without using a | |||
control plane. In this case, replicated packets can create a storm | control plane. In this case, replicated packets can create a storm | |||
till MPLS TTL (for SR-MPLS) or IPv6 Hop Limit (for SRv6) decrements | until MPLS TTL (for SR-MPLS) or IPv6 Hop Limit (for SRv6) decrements | |||
to zero. A control plane, for example PCE, can be used to prevent | to zero. A control plane such as PCE can be used to prevent loops. | |||
loops. The control plane protocols (like PCEP, BGP, etc.) used to | The control plane protocols (like Path Computation Element | |||
instantiate Replication segments can leverage their own security | Communication Protocol (PCEP), BGP, etc.) used to instantiate | |||
mechanisms such as encryption, authentication filtering etc. | Replication segments can leverage their own security mechanisms such | |||
as encryption, authentication filtering, etc. | ||||
For SRv6, Section 2.2.3 describes an exception for Parameter Problem | ||||
Message, code 2 ICMPv6 Error messages. If an attacker sends a packet | ||||
destined to Replication SID with source address of a node and with an | ||||
extension header using unknown option type marked as mandatory, then | ||||
a large number of ICMPv6 Parameter Problem messages can cause a | ||||
denial-of-service attack on the source node. Although this | ||||
specification does not specify any extension headers, any future | ||||
extension of this document doing so is susceptible to this security | ||||
concern. | ||||
If an attacker can forge an IPv6 packet with source address of a | ||||
node, Replication SID as destination address and an IPv6 Hop Limit | ||||
such that nodes which forward replicated packets on IPv6 locator | ||||
unicast prefix, decrement the Hop Limit to zero, then these nodes can | ||||
cause a storm of ICMPv6 Error packets to overwhelm the source node | ||||
under attack. The IPv6 Hop Limit Threshold check described in | ||||
Section 2.2 can help mitigate such attacks. | ||||
6. Acknowledgements | ||||
The authors would like to acknowledge Siva Sivabalan, Mike Koldychev, | ||||
Vishnu Pavan Beeram, Alexander Vainshtein, Bruno Decraene, Thierry | ||||
Couture, Joel Halpern, Ketan Talaulikar, Darren Dukes and Jingrong | ||||
Xie for their valuable inputs. | ||||
7. Contributors | ||||
Clayton Hassen Bell Canada Vancouver Canada | ||||
Email: clayton.hassen@bell.ca | ||||
Kurtis Gillis Bell Canada Halifax Canada | ||||
Email: kurtis.gillis@bell.ca | ||||
Arvind Venkateswaran Cisco Systems, Inc. San Jose US | ||||
Email: arvvenka@cisco.com | ||||
Zafar Ali Cisco Systems, Inc. US | ||||
Email: zali@cisco.com | ||||
Swadesh Agrawal Cisco Systems, Inc. San Jose US | ||||
Email: swaagraw@cisco.com | ||||
Jayant Kotalwar Nokia Mountain View US | ||||
Email: jayant.kotalwar@nokia.com | ||||
Tanmoy Kundu Nokia Mountain View US | ||||
Email: tanmoy.kundu@nokia.com | ||||
Andrew Stone Nokia Ottawa Canada | ||||
Email: andrew.stone@nokia.com | ||||
Tarek Saad Cisco Systems Inc. Canada | For SRv6, Section 2.2.3 describes an exception for the ICMPv6 | |||
Parameter Problem message with Code 2. If an attacker sends a packet | ||||
destined to a Replication-SID with the source address of a node and | ||||
with an extension header using the unknown option type marked as | ||||
mandatory, then a large number of ICMPv6 Parameter Problem messages | ||||
can cause a denial-of-service attack on the source node. Although | ||||
this document does not specify any extension headers, any future | ||||
extension of this document that does so is susceptible to this | ||||
security concern. | ||||
Email:tsaad@cisco.com | If an attacker can forge an IPv6 packet with: | |||
Kamran Raza Cisco Systems, Inc. Canada | * the source address of a node, | |||
Email:skraza@cisco.com | * a Replication-SID as the DA, and | |||
Jingrong Xie Huawei Technologies Beijing China | * an IPv6 Hop Limit such that nodes that forward replicated packets | |||
on an IPv6 locator unicast prefix, decrement the Hop Limit to | ||||
zero, | ||||
Email:xiejingrong@huawei.com | then these nodes can cause a storm of ICMPv6 error packets to | |||
overwhelm the source node under attack. The IPv6 Hop Limit Threshold | ||||
check described in Section 2.2 can help mitigate such attacks. | ||||
8. References | 5. References | |||
8.1. Normative References | 5.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet | [RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet | |||
Control Message Protocol (ICMPv6) for the Internet | Control Message Protocol (ICMPv6) for the Internet | |||
Protocol Version 6 (IPv6) Specification", STD 89, | Protocol Version 6 (IPv6) Specification", STD 89, | |||
RFC 4443, DOI 10.17487/RFC4443, March 2006, | RFC 4443, DOI 10.17487/RFC4443, March 2006, | |||
skipping to change at page 19, line 22 ¶ | skipping to change at line 724 ¶ | |||
(SRv6) Network Programming", RFC 8986, | (SRv6) Network Programming", RFC 8986, | |||
DOI 10.17487/RFC8986, February 2021, | DOI 10.17487/RFC8986, February 2021, | |||
<https://www.rfc-editor.org/info/rfc8986>. | <https://www.rfc-editor.org/info/rfc8986>. | |||
[RFC9259] Ali, Z., Filsfils, C., Matsushima, S., Voyer, D., and M. | [RFC9259] Ali, Z., Filsfils, C., Matsushima, S., Voyer, D., and M. | |||
Chen, "Operations, Administration, and Maintenance (OAM) | Chen, "Operations, Administration, and Maintenance (OAM) | |||
in Segment Routing over IPv6 (SRv6)", RFC 9259, | in Segment Routing over IPv6 (SRv6)", RFC 9259, | |||
DOI 10.17487/RFC9259, June 2022, | DOI 10.17487/RFC9259, June 2022, | |||
<https://www.rfc-editor.org/info/rfc9259>. | <https://www.rfc-editor.org/info/rfc9259>. | |||
8.2. Informative References | 5.2. Informative References | |||
[I-D.filsfils-spring-srv6-net-pgm-illustration] | [P2MP-POLICY] | |||
Filsfils, C., Camarillo, P., Li, Z., Matsushima, S., | Voyer, D., Ed., Filsfils, C., Parekh, R., Bidgoli, H., and | |||
Z. J. Zhang, "Segment Routing Point-to-Multipoint Policy", | ||||
Work in Progress, Internet-Draft, draft-ietf-pim-sr-p2mp- | ||||
policy-07, 11 October 2023, | ||||
<https://datatracker.ietf.org/doc/html/draft-ietf-pim-sr- | ||||
p2mp-policy-07>. | ||||
[PGM-ILLUSTRATION] | ||||
Filsfils, C., Camarillo, P., Ed., Li, Z., Matsushima, S., | ||||
Decraene, B., Steinberg, D., Lebrun, D., Raszuk, R., and | Decraene, B., Steinberg, D., Lebrun, D., Raszuk, R., and | |||
J. Leddy, "Illustrations for SRv6 Network Programming", | J. Leddy, "Illustrations for SRv6 Network Programming", | |||
Work in Progress, Internet-Draft, draft-filsfils-spring- | Work in Progress, Internet-Draft, draft-filsfils-spring- | |||
srv6-net-pgm-illustration-04, 30 March 2021, | srv6-net-pgm-illustration-04, 30 March 2021, | |||
<https://datatracker.ietf.org/doc/html/draft-filsfils- | <https://datatracker.ietf.org/doc/html/draft-filsfils- | |||
spring-srv6-net-pgm-illustration-04>. | spring-srv6-net-pgm-illustration-04>. | |||
[I-D.ietf-6man-sids] | ||||
Krishnan, S., "Segment Identifiers in SRv6", Work in | ||||
Progress, Internet-Draft, draft-ietf-6man-sids-03, 11 | ||||
April 2023, <https://datatracker.ietf.org/doc/html/draft- | ||||
ietf-6man-sids-03>. | ||||
[I-D.ietf-pim-sr-p2mp-policy] | ||||
Voyer, D., Filsfils, C., Parekh, R., Bidgoli, H., and Z. | ||||
J. Zhang, "Segment Routing Point-to-Multipoint Policy", | ||||
Work in Progress, Internet-Draft, draft-ietf-pim-sr-p2mp- | ||||
policy-06, 13 April 2023, | ||||
<https://datatracker.ietf.org/doc/html/draft-ietf-pim-sr- | ||||
p2mp-policy-06>. | ||||
[RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: | [RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: | |||
Defeating Denial of Service Attacks which employ IP Source | Defeating Denial of Service Attacks which employ IP Source | |||
Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827, | Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827, | |||
May 2000, <https://www.rfc-editor.org/info/rfc2827>. | May 2000, <https://www.rfc-editor.org/info/rfc2827>. | |||
[RFC3704] Baker, F. and P. Savola, "Ingress Filtering for Multihomed | [RFC3704] Baker, F. and P. Savola, "Ingress Filtering for Multihomed | |||
Networks", BCP 84, RFC 3704, DOI 10.17487/RFC3704, March | Networks", BCP 84, RFC 3704, DOI 10.17487/RFC3704, March | |||
2004, <https://www.rfc-editor.org/info/rfc3704>. | 2004, <https://www.rfc-editor.org/info/rfc3704>. | |||
[RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/ | [RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/ | |||
BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February | BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February | |||
2012, <https://www.rfc-editor.org/info/rfc6513>. | 2012, <https://www.rfc-editor.org/info/rfc6513>. | |||
[RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., | [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., | |||
Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based | Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based | |||
Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February | Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February | |||
2015, <https://www.rfc-editor.org/info/rfc7432>. | 2015, <https://www.rfc-editor.org/info/rfc7432>. | |||
[RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running | ||||
Code: The Implementation Status Section", BCP 205, | ||||
RFC 7942, DOI 10.17487/RFC7942, July 2016, | ||||
<https://www.rfc-editor.org/info/rfc7942>. | ||||
[RFC7988] Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress | [RFC7988] Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress | |||
Replication Tunnels in Multicast VPN", RFC 7988, | Replication Tunnels in Multicast VPN", RFC 7988, | |||
DOI 10.17487/RFC7988, October 2016, | DOI 10.17487/RFC7988, October 2016, | |||
<https://www.rfc-editor.org/info/rfc7988>. | <https://www.rfc-editor.org/info/rfc7988>. | |||
[RFC8660] Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., | [RFC8660] Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., | |||
Decraene, B., Litkowski, S., and R. Shakir, "Segment | Decraene, B., Litkowski, S., and R. Shakir, "Segment | |||
Routing with the MPLS Data Plane", RFC 8660, | Routing with the MPLS Data Plane", RFC 8660, | |||
DOI 10.17487/RFC8660, December 2019, | DOI 10.17487/RFC8660, December 2019, | |||
<https://www.rfc-editor.org/info/rfc8660>. | <https://www.rfc-editor.org/info/rfc8660>. | |||
skipping to change at page 20, line 44 ¶ | skipping to change at line 782 ¶ | |||
[RFC9256] Filsfils, C., Talaulikar, K., Ed., Voyer, D., Bogdanov, | [RFC9256] Filsfils, C., Talaulikar, K., Ed., Voyer, D., Bogdanov, | |||
A., and P. Mattes, "Segment Routing Policy Architecture", | A., and P. Mattes, "Segment Routing Policy Architecture", | |||
RFC 9256, DOI 10.17487/RFC9256, July 2022, | RFC 9256, DOI 10.17487/RFC9256, July 2022, | |||
<https://www.rfc-editor.org/info/rfc9256>. | <https://www.rfc-editor.org/info/rfc9256>. | |||
[RFC9350] Psenak, P., Ed., Hegde, S., Filsfils, C., Talaulikar, K., | [RFC9350] Psenak, P., Ed., Hegde, S., Filsfils, C., Talaulikar, K., | |||
and A. Gulko, "IGP Flexible Algorithm", RFC 9350, | and A. Gulko, "IGP Flexible Algorithm", RFC 9350, | |||
DOI 10.17487/RFC9350, February 2023, | DOI 10.17487/RFC9350, February 2023, | |||
<https://www.rfc-editor.org/info/rfc9350>. | <https://www.rfc-editor.org/info/rfc9350>. | |||
[SIDS-SRv6] | ||||
Krishnan, S., "Segment Identifiers in SRv6", Work in | ||||
Progress, Internet-Draft, draft-ietf-6man-sids-05, 8 | ||||
January 2024, <https://datatracker.ietf.org/doc/html/ | ||||
draft-ietf-6man-sids-05>. | ||||
Appendix A. Illustration of a Replication Segment | Appendix A. Illustration of a Replication Segment | |||
This section illustrates an example of a single Replication segment. | This section illustrates an example of a single Replication segment. | |||
Examples showing Replication segment stitched together to form P2MP | Examples showing Replication segments stitched together to form a | |||
tree (based on SR P2MP policy) are in [I-D.ietf-pim-sr-p2mp-policy]. | P2MP tree (based on SR P2MP policy) are in [P2MP-POLICY]. | |||
Consider the following topology: | Consider the following topology: | |||
R3------R6 | R3------R6 | |||
/ \ | / \ | |||
R1----R2----R5-----R7 | R1----R2----R5-----R7 | |||
\ / | \ / | |||
+--R4---+ | +--R4---+ | |||
Figure 1: Topology for illustration of Replication Segment | Figure 1: Topology for Illustration of a Replication Segment | |||
A.1. SR-MPLS | A.1. SR-MPLS | |||
In this example, the Node-SID of a node Rn is N-SIDn and Adjacency- | In this example, the Node-SID of a node Rn is N-SIDn and the Adj-SID | |||
SID from node Rm to node Rn is A-SIDmn. Interface between Rm and Rn | from node Rm to node Rn is A-SIDmn. The interface between Rm and Rn | |||
is Lmn. The state representation uses "R-SID->Lmn" to represent a | is Lmn. The state representation uses "R-SID->Lmn" to represent a | |||
packet replication with outgoing replication SID R-SID sent on | packet replication with outgoing Replication-SID R-SID sent on | |||
interface Lmn. | interface Lmn. | |||
Assume a Replication segment identified with R-ID at Replication node | Assume a Replication segment identified with R-ID at Replication node | |||
R1 and downstream nodes R2, R6 and R7. The Replication SID at node n | R1 and downstream nodes R2, R6, and R7. The Replication-SID at node | |||
is R-SIDn. A packet replicated from R1 to R7 has to traverse R4. | n is R-SIDn. A packet replicated from R1 to R7 has to traverse R4. | |||
The Replication segment state at nodes R1, R2, R6 and R7 is shown | The Replication segments at nodes R1, R2, R6, and R7 are shown below. | |||
below. Note nodes R3, R4 and R5 do not have state for the | Note nodes R3, R4, and R5 do not have a Replication segment. | |||
Replication segment. | ||||
Replication segment at R1: | Replication segment at R1: | |||
Replication segment <R-ID,R1>: | Replication segment <R-ID,R1>: | |||
Replication SID: R-SID1 | Replication-SID: R-SID1 | |||
Replication state: | Replication state: | |||
R2: <R-SID2->L12> | R2: <R-SID2->L12> | |||
R6: <N-SID6, R-SID6> | R6: <N-SID6, R-SID6> | |||
R7: <N-SID4, A-SID47, R-SID7> | R7: <N-SID4, A-SID47, R-SID7> | |||
Replication to R2 steers the packet directly to R2 on interface L12. | Replication to R2 steers the packet directly to R2 on interface L12. | |||
Replication to R6, using N-SID6, steers the packet via shortest path | Replication to R6, using N-SID6, steers the packet via the shortest | |||
to that node. Replication to R7 is steered via R4, using N-SID4 and | path to that node. Replication to R7 is steered via R4, using N-SID4 | |||
then adjacency SID A-SID47 to R7. | and then adjacency SID A-SID47 to R7. | |||
Replication segment at R2: | Replication segment at R2: | |||
Replication segment <R-ID,R2>: | Replication segment <R-ID,R2>: | |||
Replication SID: R-SID2 | Replication-SID: R-SID2 | |||
Replication state: | Replication state: | |||
R2: <Leaf> | R2: <Leaf> | |||
Replication segment at R6: | Replication segment at R6: | |||
Replication segment <R-ID,R6>: | Replication segment <R-ID,R6>: | |||
Replication SID: R-SID6 | Replication-SID: R-SID6 | |||
Replication state: | Replication state: | |||
R6: <Leaf> | R6: <Leaf> | |||
Replication segment at R7: | Replication segment at R7: | |||
Replication segment <R-ID,R7>: | Replication segment <R-ID,R7>: | |||
Replication SID: R-SID7 | Replication-SID: R-SID7 | |||
Replication state: | Replication state: | |||
R7: <Leaf> | R7: <Leaf> | |||
When a packet is steered into the Replication segment at R1: | When a packet is steered into the Replication segment at R1: | |||
* Since R1 is directly connected to R2, R1 performs PUSH operation | * R1 performs the PUSH operation with just the <R-SID2> label for | |||
with just <R-SID2> label for the replicated copy and sends it to | the replicated copy and sends it to R2 on interface L12, since R1 | |||
R2 on interface L12. R2, as Leaf, performs NEXT operation, pops | is directly connected to R2. R2, as leaf, performs the NEXT | |||
R-SID2 label and delivers the payload. | operation, pops the R-SID2 label, and delivers the payload. | |||
* R1 performs PUSH operation with <N-SID6, R-SID6> label stack for | * R1 performs the PUSH operation with the <N-SID6, R-SID6> label | |||
the replicated copy to R6 and sends it to R2, the nexthop on | stack for the replicated copy to R6 and sends it to R2, which is | |||
shortest path to R6. R2 performs CONTINUE operation on N-SID6 and | the nexthop on the shortest path to R6. R2 performs the CONTINUE | |||
forwards it to R3. R3 is the penultimate hop for N-SID6; it | operation on N-SID6 and forwards it to R3. R3 is the penultimate | |||
performs penultimate hop popping, which corresponds to the NEXT | hop for N-SID6; it performs penultimate hop popping, which | |||
operation and the packet is then sent to R6 with <R-SID6> in the | corresponds to the NEXT operation. The packet is then sent to R6 | |||
label stack. R6, as Leaf, performs NEXT operation, pops R-SID6 | with <R-SID6> in the label stack. R6, as leaf, performs the NEXT | |||
label and delivers the payload. | operation, pops the R-SID6 label, and delivers the payload. | |||
* R1 performs PUSH operation with <N-SID4, A-SID47, R-SID7> label | * R1 performs the PUSH operation with the <N-SID4, A-SID47, R-SID7> | |||
stack for the replicated copy to R7 and sends it to R2, the | label stack for the replicated copy to R7 and sends it to R2, | |||
nexthop on shortest path to R4. R2 is the penultimate hop for | which is the nexthop on the shortest path to R4. R2 is the | |||
N-SID4; it performs penultimate hop popping, which corresponds to | penultimate hop for N-SID4; it performs penultimate hop popping, | |||
the NEXT operation and the packet is then sent to R4 with | which corresponds to the NEXT operation. The packet is then sent | |||
<A-SID47, R-SID7> in the label stack. R4 performs NEXT operation, | to R4 with <A-SID47, R-SID7> in the label stack. R4 performs the | |||
pops A-SID47, and delivers packet to R7 with <R-SID7> in the label | NEXT operation, pops A-SID47, and delivers the packet to R7 with | |||
stack. R7, as Leaf, performs NEXT operation, pops R-SID7 label | <R-SID7> in the label stack. R7, as leaf, performs the NEXT | |||
and delivers the payload. | operation, pops the R-SID7 label, and delivers the payload. | |||
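
The three walkthroughs above can be replayed as a small label-stack simulation. This is a minimal sketch, assuming the per-hop operations are taken directly from the text rather than computed from the topology; the names `REPLICATION_STATE_R1`, `HOPS`, and `trace_copy` are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple

# Replication state at R1, copied from the example above.
# Each entry is the label stack R1 pushes for that downstream node.
REPLICATION_STATE_R1 = {
    "R2": ["R-SID2"],
    "R6": ["N-SID6", "R-SID6"],
    "R7": ["N-SID4", "A-SID47", "R-SID7"],
}

# Per-copy hop list taken from the walkthrough: (node, operation).
# NEXT pops the top label; CONTINUE leaves the stack unchanged.
HOPS = {
    "R2": [("R2", "NEXT")],
    "R6": [("R2", "CONTINUE"), ("R3", "NEXT"), ("R6", "NEXT")],
    "R7": [("R2", "NEXT"), ("R4", "NEXT"), ("R7", "NEXT")],
}

def trace_copy(downstream: str) -> List[Tuple[str, List[str]]]:
    """Return (node, label stack after that node's operation) for one copy."""
    stack = list(REPLICATION_STATE_R1[downstream])  # PUSH at R1
    trace = [("R1", list(stack))]
    for node, op in HOPS[downstream]:
        if op == "NEXT":
            stack.pop(0)  # the top of the stack is index 0
        trace.append((node, list(stack)))
    return trace
```

Calling `trace_copy("R6")` or `trace_copy("R7")` lists the stack seen after each node; an empty stack at the leaf means only the payload remains to be delivered.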
A.2. SRv6 | A.2. SRv6 | |||
For SRv6 , we use SID allocation scheme, reproduced below, from | For SRv6, we use the SID allocation scheme, reproduced below, from | |||
Illustrations for SRv6 Network Programming | "Illustrations for SRv6 Network Programming" [PGM-ILLUSTRATION]: | |||
[I-D.filsfils-spring-srv6-net-pgm-illustration] | ||||
* 2001:db8::/32 is an IPv6 block allocated by a Regional Internet | * 2001:db8::/32 is an IPv6 block allocated by a Regional Internet | |||
Registry (RIR) to the operator | Registry (RIR) to the operator. | |||
* 2001:db8:0::/48 is dedicated to the internal address space | * 2001:db8:0::/48 is dedicated to the internal address space. | |||
* 2001:db8:cccc::/48 is dedicated to the internal SRv6 SID space | ||||
* 2001:db8:cccc::/48 is dedicated to the internal SRv6 SID space. | ||||
* We assume a location expressed in 64 bits and a function expressed | * We assume a location expressed in 64 bits and a function expressed | |||
in 16 bits | in 16 bits. | |||
* Node k has a classic IPv6 loopback address 2001:db8::k/128 which | * Node k has a classic IPv6 loopback address 2001:db8::k/128, which | |||
is advertised in the Interior Gateway Protocol (IGP) | is advertised in the Interior Gateway Protocol (IGP). | |||
* Node k has 2001:db8:cccc:k::/64 for its local SID space. Its SIDs | * Node k has 2001:db8:cccc:k::/64 for its local SID space. Its SIDs | |||
will be explicitly assigned from that block | will be explicitly assigned from that block. | |||
* Node k advertises 2001:db8:cccc:k::/64 in its IGP | * Node k advertises 2001:db8:cccc:k::/64 in its IGP. | |||
* Function :1:: (function 1, for short) represents the End function | * Function :1:: (function 1, for short) represents the End function | |||
with Penultimate Segment Pop of SRH (PSP) [RFC8986] and USD | with the Penultimate Segment Pop (PSP) of the SRH [RFC8986] and | |||
support | USD support. | |||
* Function :Cn:: (function Cn, for short) represents the End.X | * Function :Cn:: (function Cn, for short) represents the End.X | |||
function to Node n with PSP and USD support | function to Node n with PSP and USD support. | |||
Each node k has: | Each node k has: | |||
* An explicit SID instantiation 2001:db8:cccc:k:1::/128 bound to an | * An explicit SID instantiation 2001:db8:cccc:k:1::/128 bound to an | |||
End function with additional support for PSP and USD | End function with additional support for PSP and USD. | |||
* An explicit SID instantiation 2001:db8:cccc:k:Cj::/128 bound to an | * An explicit SID instantiation 2001:db8:cccc:k:Cj::/128 bound to an | |||
End.X function to neighbor j with additional support for PSP and | End.X function to neighbor j with additional support for PSP and | |||
USD | USD. | |||
* An explicit SID instantiation 2001:db8:cccc:k:Fk::/128 bound to an | * An explicit SID instantiation 2001:db8:cccc:k:Fk::/128 bound to an | |||
End.Replicate function | End.Replicate function. | |||
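
The SID allocation scheme above is mechanical enough to express in a few lines. The following is a minimal sketch using Python's standard `ipaddress` module, assuming node numbers small enough that "Cj" and "Fk" fit in one 16-bit group (as with R1 through R7 here). The helper names `locator`, `end_sid`, `end_x_sid`, and `replication_sid` are hypothetical.

```python
import ipaddress

SID_BLOCK = "2001:db8:cccc"  # internal SRv6 SID space from the scheme above

def locator(k: int) -> ipaddress.IPv6Network:
    """Locator 2001:db8:cccc:k::/64 advertised by node k."""
    return ipaddress.IPv6Network(f"{SID_BLOCK}:{k:x}::/64")

def end_sid(k: int) -> ipaddress.IPv6Address:
    """End SID (function 1) of node k."""
    return ipaddress.IPv6Address(f"{SID_BLOCK}:{k:x}:1::")

def end_x_sid(k: int, j: int) -> ipaddress.IPv6Address:
    """End.X SID (function Cj) of node k toward neighbor j."""
    return ipaddress.IPv6Address(f"{SID_BLOCK}:{k:x}:c{j:x}::")

def replication_sid(k: int) -> ipaddress.IPv6Address:
    """End.Replicate SID (function Fk) of node k."""
    return ipaddress.IPv6Address(f"{SID_BLOCK}:{k:x}:f{k:x}::")
```

For example, `replication_sid(7) in locator(7)` holds, which is why a packet addressed to the Replication-SID of R7 is forwarded along the shortest path to the locator that R7 advertises in the IGP.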
Assume a Replication segment identified with R-ID at Replication node | Assume a Replication segment identified with R-ID at Replication node | |||
R1 and downstream nodes R2, R6 and R7. The Replication SID at node | R1 and downstream nodes R2, R6, and R7. The Replication-SID at node | |||
k, bound to an End.Replicate function, is 2001:db8:cccc:k:Fk::/128. | k, bound to an End.Replicate function, is 2001:db8:cccc:k:Fk::/128. | |||
A packet replicated from R1 to R7 has to traverse R4. | A packet replicated from R1 to R7 has to traverse R4. | |||
The Replication segment state at nodes R1, R2, R6 and R7 is shown | The Replication segments at nodes R1, R2, R6, and R7 are shown below. | |||
below. Note nodes R3, R4 and R5 do not have state for the | Note nodes R3, R4, and R5 do not have a Replication segment. The | |||
Replication segment. The state representation uses "R-SID->Lmn" to | state representation uses "R-SID->Lmn" to represent a packet | |||
represent a packet replication with outgoing replication SID R-SID | replication with outgoing Replication-SID R-SID sent on interface | |||
sent on interface Lmn. "SL" represents and optional segment list used | Lmn. "SL" represents an optional segment list used to steer a | |||
to steer a replicated packet on a specific path to a Downstream node. | replicated packet on a specific path to a downstream node. | |||
Replication segment at R1: | Replication segment at R1: | |||
Replication segment <R-ID,R1>: | Replication segment <R-ID,R1>: | |||
Replication SID: 2001:db8:cccc:1:F1::0 | Replication-SID: 2001:db8:cccc:1:F1::0 | |||
Replication state: | Replication state: | |||
R2: <2001:db8:cccc:2:F2::0->L12> | R2: <2001:db8:cccc:2:F2::0->L12> | |||
R6: <2001:db8:cccc:6:F6::0> | R6: <2001:db8:cccc:6:F6::0> | |||
R7: <2001:db8:cccc:4:C7::0>, SL: <2001:db8:cccc:7:F7::0> | R7: <2001:db8:cccc:4:C7::0>, SL: <2001:db8:cccc:7:F7::0> | |||
Replication to R2 steers the packet directly to R2 on interface L12. | Replication to R2 steers the packet directly to R2 on interface L12. | |||
Replication to R6, using 2001:db8:cccc:6:F6::0, steers the packet via | Replication to R6, using 2001:db8:cccc:6:F6::0, steers the packet via | |||
shortest path to that node. Replication to R7 is steered via R4, | the shortest path to that node. Replication to R7 is steered via R4, | |||
using H.Encaps.Red with End.X SID 2001:db8:cccc:4:C7::0 at R4 to R7. | using H.Encaps.Red with End.X SID 2001:db8:cccc:4:C7::0 at R4 to R7. | |||
Replication segment at R2: | Replication segment at R2: | |||
Replication segment <R-ID,R2>: | Replication segment <R-ID,R2>: | |||
Replication SID: 2001:db8:cccc:2:F2::0 | Replication-SID: 2001:db8:cccc:2:F2::0 | |||
Replication state: | Replication state: | |||
R2: <Leaf> | R2: <Leaf> | |||
Replication segment at R6: | Replication segment at R6: | |||
Replication segment <R-ID,R6>: | Replication segment <R-ID,R6>: | |||
Replication SID: 2001:db8:cccc:6:F6::0 | Replication-SID: 2001:db8:cccc:6:F6::0 | |||
Replication state: | Replication state: | |||
R6: <Leaf> | R6: <Leaf> | |||
Replication segment at R7: | Replication segment at R7: | |||
Replication segment <R-ID,R7>: | Replication segment <R-ID,R7>: | |||
Replication SID: 2001:db8:cccc:7:F7::0 | Replication-SID: 2001:db8:cccc:7:F7::0 | |||
Replication state: | Replication state: | |||
R7: <Leaf> | R7: <Leaf> | |||
When a packet, (A,B2), is steered into the Replication segment at R1: | When a packet, (A,B2), is steered into the Replication segment at R1: | |||
* Since R1 is directly connected to R2, R1 creates encapsulated | * R1 creates an encapsulated replicated copy (2001:db8::1, | |||
replicated copy (2001:db8::1, 2001:db8:cccc:2:F2::0) (A, B2), and | 2001:db8:cccc:2:F2::0) (A, B2), and sends it to R2 on interface | |||
sends it to R2 on interface L12. R2, as Leaf, removes outer IPv6 | L12, since R1 is directly connected to R2. R2, as leaf, removes | |||
header and delivers the payload. | the outer IPv6 header and delivers the payload. | |||
* R1 creates encapsulated replicated copy (2001:db8::1, | * R1 creates an encapsulated replicated copy (2001:db8::1, | |||
2001:db8:cccc:6:F6::0) (A, B2) then forwards the resulting packet | 2001:db8:cccc:6:F6::0) (A, B2) then forwards the resulting packet | |||
on the shortest path to 2001:db8:cccc:6::/64. R2 and R3 forward | on the shortest path to 2001:db8:cccc:6::/64. R2 and R3 forward | |||
the packet using 2001:db8:cccc:6::/64. R6, as Leaf, removes outer | the packet using 2001:db8:cccc:6::/64. R6, as leaf, removes the | |||
IPv6 header and delivers the payload. | outer IPv6 header and delivers the payload. | |||
* R1 has to steer packet to Downstream node R7 via node R4. It can | * R1 has to steer the packet to downstream node R7 via node R4. It | |||
do this in one of two ways: | can do this in one of two ways: | |||
- R1 creates encapsulated replicated copy (2001:db8::1, | - R1 creates an encapsulated replicated copy (2001:db8::1, | |||
2001:db8:cccc:7:F7::0) (A, B2) and then performs H.Encaps.Red | 2001:db8:cccc:7:F7::0) (A, B2) and then performs H.Encaps.Red | |||
using the SL to create (2001:db8::1, 2001:db8:cccc:4:C7::0) | using the SL to create the (2001:db8::1, 2001:db8:cccc:4:C7::0) | |||
(2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) packet. It sends | (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) packet. It sends | |||
this packet to R2, the nexthop on shortest path to | this packet to R2, which is the nexthop on the shortest path to | |||
2001:db8:cccc:4::/64. R2 forwards packet to R4 using | 2001:db8:cccc:4::/64. R2 forwards the packet to R4 using | |||
2001:db8:cccc:4::/64. R4 executes End.X function on | 2001:db8:cccc:4::/64. R4 executes the End.X function on | |||
2001:db8:cccc:4:C7::0, performs USD action, removes outer IPv6 | 2001:db8:cccc:4:C7::0, performs a USD action, removes the outer | |||
encapsulation and sends resulting packet (2001:db8::1, | IPv6 encapsulation, and sends the resulting packet | |||
2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as Leaf, removes | (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as | |||
outer IPv6 header and delivers the payload. | leaf, removes the outer IPv6 header and delivers the payload. | |||
- R1 is Root of replication segment. Therefore, it can combine | - R1 is the root of the Replication segment. Therefore, it can | |||
the above encapsulations to create encapsulated replicated copy | combine the above encapsulations to create an encapsulated | |||
(2001:db8::1, 2001:db8:cccc:4:C7::0) (2001:db8:cccc:7:F7::0; | replicated copy (2001:db8::1, 2001:db8:cccc:4:C7::0) | |||
SL=1) (A, B2) and sends it to R2, the nexthop on shortest path | (2001:db8:cccc:7:F7::0; SL=1) (A, B2) and sends it to R2, which | |||
to 2001:db8:cccc:4::/64. R2 forwards packet to R4 using | is the nexthop on the shortest path to 2001:db8:cccc:4::/64. | |||
2001:db8:cccc:4::/64. R4 executes End.X function on | R2 forwards the packet to R4 using 2001:db8:cccc:4::/64. R4 | |||
2001:db8:cccc:4:C7::0, performs PSP action, removes SRH and | executes the End.X function on 2001:db8:cccc:4:C7::0, performs | |||
sends resulting packet (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, | a PSP action, removes the SRH, and sends the resulting packet | |||
B2) to R7. R7, as Leaf, removes outer IPv6 header and delivers | (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as | |||
the payload. | leaf, removes the outer IPv6 header and delivers the payload. | |||
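
The two ways of steering the copy to R7 via R4 can be compared with a toy header model. This is a minimal sketch, assuming a simplified header type that tracks only addresses, an optional segment list, and Segments Left; `IPv6Header`, `encap_two_headers`, `encap_with_srh`, and `end_x_at_r4` are hypothetical names, and the inner (A, B2) packet is left out because neither option touches it.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class IPv6Header:
    src: str
    dst: str
    srh: Optional[List[str]] = None  # segment list carried in an SRH, if any
    segments_left: int = 0

R1_LOOPBACK = "2001:db8::1"
R4_END_X_TO_R7 = "2001:db8:cccc:4:C7::0"
R7_REPLICATION_SID = "2001:db8:cccc:7:F7::0"

def encap_two_headers() -> List[IPv6Header]:
    """First option: replication encapsulation, then H.Encaps.Red (no SRH)."""
    return [
        IPv6Header(R1_LOOPBACK, R4_END_X_TO_R7),
        IPv6Header(R1_LOOPBACK, R7_REPLICATION_SID),
    ]

def encap_with_srh() -> List[IPv6Header]:
    """Second option: one outer header carrying an SRH with SL=1."""
    return [IPv6Header(R1_LOOPBACK, R4_END_X_TO_R7, [R7_REPLICATION_SID], 1)]

def end_x_at_r4(headers: List[IPv6Header]) -> List[IPv6Header]:
    """Apply R4's End.X processing with USD or PSP, as in the text."""
    outer, inner = headers[0], headers[1:]
    if outer.srh is None:
        # USD: no SRH and the next header is IPv6, so strip the outer header.
        return inner
    # Decrement Segments Left and copy the new active segment into the DA.
    sl = outer.segments_left - 1
    next_sid = outer.srh[sl]
    if sl == 0:
        # PSP: Segments Left reached zero, so the SRH is removed.
        return [IPv6Header(outer.src, next_sid)] + inner
    return [IPv6Header(outer.src, next_sid, outer.srh, sl)] + inner
```

Both `end_x_at_r4(encap_two_headers())` and `end_x_at_r4(encap_with_srh())` yield a single header with source 2001:db8::1 and destination 2001:db8:cccc:7:F7::0, matching the packet that R7 receives in either case. The second option carries one IPv6 header fewer between R1 and R4.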
A.2.1. Pinging Replication SID | A.2.1. Pinging a Replication-SID | |||
This section illustrates ping of a Replication SID. | This section illustrates the ping of a Replication-SID. | |||
Node R1 pings replication SID of node R6 directly by sending the | Node R1 pings the Replication-SID of node R6 directly by sending the | |||
following packet: | following packet: | |||
1. R1 to R6: (2001:db8::1, 2001:db8:cccc:6:F6::0; NH=ICMPv6) (ICMPv6 | 1. R1 to R6: (2001:db8::1, 2001:db8:cccc:6:F6::0; NH=ICMPv6) (ICMPv6 | |||
Echo Request) | Echo Request). | |||
2. Node R6 as a Leaf processes upper layer ICMPv6 Echo Request and | 2. Node R6 as a leaf processes the upper-layer ICMPv6 Echo Request | |||
responds with ICMPv6 Echo Reply | and responds with an ICMPv6 Echo Reply. | |||
Node R1 pings Replication SID of R7 via R4 by sending the following | Node R1 pings the Replication-SID of R7 via R4 by sending the | |||
packet with SRH: | following packet with the SRH: | |||
1. R1 to R4: (2001:db8::1, 2001:db8:cccc:4:C7::0) | 1. R1 to R4: (2001:db8::1, 2001:db8:cccc:4:C7::0) | |||
(2001:db8:cccc:7:F7::0; SL=1; NH=ICMPv6) (ICMPv6 Echo Request) | (2001:db8:cccc:7:F7::0; SL=1; NH=ICMPv6) (ICMPv6 Echo Request). | |||
2. R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 | 2. R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 | |||
Echo Request) | Echo Request). | |||
3. Node R7 as a Leaf processes upper layer ICMPv6 Echo Request and | 3. Node R7 as a leaf processes the upper-layer ICMPv6 Echo Request | |||
responds with ICMPv6 Echo Reply | and responds with an ICMPv6 Echo Reply. | |||
Assume node R4 is a transit Replication node with Replication SID | Assume node R4 is a transit replication node with Replication-SID | |||
2001:db8:cccc:4:F4::0 replicating to R7. Node R1 pings Replication | 2001:db8:cccc:4:F4::0 replicating to R7. Node R1 pings the | |||
SID of R7 via Replication SID of R4 as follows: | Replication-SID of R7 via the Replication-SID of R4 as follows: | |||
1. R1 to R4: (2001:db8::1, 2001:db8:cccc:4:F4::0; NH=ICMPv6) (ICMPv6 | 1. R1 to R4: (2001:db8::1, 2001:db8:cccc:4:F4::0; NH=ICMPv6) (ICMPv6 | |||
Echo Request) | Echo Request). | |||
2. R4 replicates to R7 by replacing IPv6 destination address with | 2. R4 replicates to R7 by replacing the IPv6 DA with the | |||
Replication SID of R7 from its Replication state | Replication-SID of R7 from its Replication state. | |||
3. R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 | 3. R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 | |||
Echo Request) | Echo Request). | |||
4. Node R7 as a Leaf processes upper layer ICMPv6 Echo Request and | 4. Node R7 as a leaf processes the upper-layer ICMPv6 Echo Request | |||
responds with ICMPv6 Echo Reply | and responds with an ICMPv6 Echo Reply. | |||
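
The last ping example relies on R4 rewriting the destination address from its Replication state. The following is a minimal sketch of that single step, assuming a packet is reduced to a (source, destination, upper-layer) tuple; `R4_REPLICATION_STATE` and `replicate_at_r4` are hypothetical names used only for illustration.

```python
from typing import Dict, List, Tuple

Packet = Tuple[str, str, str]  # (source address, destination address, upper layer)

# Replication state at transit node R4, as assumed in the last example.
R4_REPLICATION_SID = "2001:db8:cccc:4:F4::0"
R4_REPLICATION_STATE: Dict[str, str] = {"R7": "2001:db8:cccc:7:F7::0"}

def replicate_at_r4(packet: Packet) -> List[Tuple[str, Packet]]:
    """Return one (downstream node, packet copy) pair per Replication state entry."""
    src, dst, upper = packet
    if dst != R4_REPLICATION_SID:
        return []  # not addressed to R4's Replication-SID
    # Each copy keeps the source and upper layer; only the DA is replaced.
    return [(node, (src, sid, upper)) for node, sid in R4_REPLICATION_STATE.items()]
```

Because the source address is preserved, the Echo Reply from R7 is addressed to R1 directly rather than to R4.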
Acknowledgements | ||||
The authors would like to acknowledge Siva Sivabalan, Mike Koldychev, | ||||
Vishnu Pavan Beeram, Alexander Vainshtein, Bruno Decraene, Thierry | ||||
Couture, Joel Halpern, Ketan Talaulikar, Darren Dukes and Jingrong | ||||
Xie for their valuable inputs. | ||||
Contributors | ||||
Clayton Hassen | ||||
Bell Canada | ||||
Vancouver | ||||
Canada | ||||
Email: clayton.hassen@bell.ca | ||||
Kurtis Gillis | ||||
Bell Canada | ||||
Halifax | ||||
Canada | ||||
Email: kurtis.gillis@bell.ca | ||||
Arvind Venkateswaran | ||||
Cisco Systems, Inc. | ||||
San Jose, CA | ||||
United States of America | ||||
Email: arvvenka@cisco.com | ||||
Zafar Ali | ||||
Cisco Systems, Inc. | ||||
United States of America | ||||
Email: zali@cisco.com | ||||
Swadesh Agrawal | ||||
Cisco Systems, Inc. | ||||
San Jose, CA | ||||
United States of America | ||||
Email: swaagraw@cisco.com | ||||
Jayant Kotalwar | ||||
Nokia | ||||
Mountain View, CA | ||||
United States of America | ||||
Email: jayant.kotalwar@nokia.com | ||||
Tanmoy Kundu | ||||
Nokia | ||||
Mountain View, CA | ||||
United States of America | ||||
Email: tanmoy.kundu@nokia.com | ||||
Andrew Stone | ||||
Nokia | ||||
Ottawa | ||||
Canada | ||||
Email: andrew.stone@nokia.com | ||||
Tarek Saad | ||||
Cisco Systems, Inc. | ||||
Canada | ||||
Email: tsaad@cisco.com | ||||
Kamran Raza | ||||
Cisco Systems, Inc. | ||||
Canada | ||||
Email: skraza@cisco.com | ||||
Jingrong Xie | ||||
Huawei Technologies | ||||
Beijing | ||||
China | ||||
Email: xiejingrong@huawei.com | ||||
Authors' Addresses | Authors' Addresses | |||
Daniel Voyer (editor) | Daniel Voyer (editor) | |||
Bell Canada | Bell Canada | |||
Montreal | Montreal | |||
Canada | Canada | |||
Email: daniel.voyer@bell.ca | Email: daniel.voyer@bell.ca | |||
Clarence Filsfils | Clarence Filsfils | |||
Cisco Systems, Inc. | Cisco Systems, Inc. | |||
Brussels | Brussels | |||
Belgium | Belgium | |||
Email: cfilsfil@cisco.com | Email: cfilsfil@cisco.com | |||
Rishabh Parekh | Rishabh Parekh | |||
Cisco Systems, Inc. | Cisco Systems, Inc. | |||
San Jose, | San Jose, CA | |||
United States of America | United States of America | |||
Email: riparekh@cisco.com | Email: riparekh@cisco.com | |||
Hooman Bidgoli | Hooman Bidgoli | |||
Nokia | Nokia | |||
Ottawa | Ottawa | |||
Canada | Canada | |||
Email: hooman.bidgoli@nokia.com | Email: hooman.bidgoli@nokia.com | |||
Zhaohui Zhang | Zhaohui Zhang | |||
End of changes. 177 change blocks. | ||||
702 lines changed or deleted | 660 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |