| rfc9625.original | rfc9625.txt | |||
|---|---|---|---|---|
| BESS W. Lin | Internet Engineering Task Force (IETF) W. Lin | |||
| Internet-Draft Z. Zhang | Request for Comments: 9625 Z. Zhang | |||
| Intended status: Standards Track J. Drake | Category: Standards Track J. Drake | |||
| Expires: 5 September 2024 E. Rosen, Ed. | ISSN: 2070-1721 E. Rosen, Ed. | |||
| Juniper Networks, Inc. | Juniper Networks, Inc. | |||
| J. Rabadan | J. Rabadan | |||
| Nokia | Nokia | |||
| A. Sajassi | A. Sajassi | |||
| Cisco Systems | Cisco Systems | |||
| 4 March 2024 | August 2024 | |||
| EVPN Optimized Inter-Subnet Multicast (OISM) Forwarding | EVPN Optimized Inter-Subnet Multicast (OISM) Forwarding | |||
| draft-ietf-bess-evpn-irb-mcast-11 | ||||
| Abstract | Abstract | |||
| Ethernet VPN (EVPN) provides a service that allows a single Local | Ethernet VPN (EVPN) provides a service that allows a single Local | |||
| Area Network (LAN), comprising a single IP subnet, to be divided into | Area Network (LAN), comprising a single IP subnet, to be divided into | |||
| multiple "segments". Each segment may be located at a different | multiple segments. Each segment may be located at a different site, | |||
| site, and the segments are interconnected by an IP or MPLS backbone. | and the segments are interconnected by an IP or MPLS backbone. | |||
| Intra-subnet traffic (either unicast or multicast) always appears to | Intra-subnet traffic (either unicast or multicast) always appears to | |||
| the end users to be bridged, even when it is actually carried over | the end users to be bridged, even when it is actually carried over | |||
| the IP or MPLS backbone. When a single "tenant" owns multiple such | the IP or MPLS backbone. When a single tenant owns multiple such | |||
| LANs, EVPN also allows IP unicast traffic to be routed between those | LANs, EVPN also allows IP unicast traffic to be routed between those | |||
| LANs. This document specifies new procedures that allow inter-subnet | LANs. This document specifies new procedures that allow inter-subnet | |||
| IP multicast traffic to be routed among the LANs of a given tenant, | IP multicast traffic to be routed among the LANs of a given tenant | |||
| while still making intra-subnet IP multicast traffic appear to be | while still making intra-subnet IP multicast traffic appear to be | |||
| bridged. These procedures can provide optimal routing of the inter- | bridged. These procedures can provide optimal routing of the inter- | |||
| subnet multicast traffic, and do not require any such traffic to | subnet multicast traffic and do not require any such traffic to | |||
| egress a given router and then ingress that same router. These | egress a given router and then ingress that same router. These | |||
| procedures also accommodate IP multicast traffic that originates or | procedures also accommodate IP multicast traffic that originates or | |||
| is destined external to the EVPN domain. | is destined to be external to the EVPN domain. | |||
| Requirements Language | ||||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
| "OPTIONAL" in this document are to be interpreted as described in BCP | ||||
| 14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
| capitals, as shown here. | ||||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
| provisions of BCP 78 and BCP 79. | ||||
| Internet-Drafts are working documents of the Internet Engineering | ||||
| Task Force (IETF). Note that other groups may also distribute | ||||
| working documents as Internet-Drafts. The list of current Internet- | ||||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
| Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
| and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
| time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
| material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
| Internet Standards is available in Section 2 of RFC 7841. | ||||
| This Internet-Draft will expire on 5 September 2024. | Information about the current status of this document, any errata, | |||
| and how to provide feedback on it may be obtained at | ||||
| https://www.rfc-editor.org/info/rfc9625. | ||||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2024 IETF Trust and the persons identified as the | Copyright (c) 2024 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
| license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
| and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
| extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
| described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
| provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
| in the Revised BSD License. | ||||
| Table of Contents | Table of Contents | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Terminology | 1.1. Terminology | |||
| 1.2. Background | 1.1.1. Requirements Language | |||
| 1.2.1. Segments, Broadcast Domains, and Tenants | 1.2. Background | |||
| 1.2.2. Inter-BD (Inter-Subnet) IP Traffic | 1.2.1. Segments, Broadcast Domains, and Tenants | |||
| 1.2.3. EVPN and IP Multicast | 1.2.2. Inter-BD (Inter-Subnet) IP Traffic | |||
| 1.2.4. BDs, MAC-VRFS, and EVPN Service Models | 1.2.3. EVPN and IP Multicast | |||
| 1.3. Need for EVPN-aware Multicast Procedures | 1.2.4. BDs, MAC-VRFs, and EVPN Service Models | |||
| 1.4. Additional Requirements That Must be Met by the | 1.3. Need for EVPN-Aware Multicast Procedures | |||
| Solution | 1.4. Additional Requirements That Must Be Met by the Solution | |||
| 1.5. Model of Operation: Overview | 1.5. Model of Operation: Overview | |||
| 1.5.1. Control Plane | 1.5.1. Control Plane | |||
| 1.5.2. Data Plane | 1.5.2. Data Plane | |||
| 2. Detailed Model of Operation | 2. Detailed Model of Operation | |||
| 2.1. Supplementary Broadcast Domain | 2.1. Supplementary Broadcast Domain | |||
| 2.2. Detecting When a Route is For/From a Particular BD | 2.2. Detecting When a Route is for/from a Particular BD | |||
| 2.3. Use of IRB Interfaces at Ingress PE | 2.3. Use of IRB Interfaces at Ingress PE | |||
| 2.4. Use of IRB Interfaces at an Egress PE | 2.4. Use of IRB Interfaces at an Egress PE | |||
| 2.5. Announcing Interest in (S,G) | 2.5. Announcing Interest in (S,G) | |||
| 2.6. Tunneling Frames from Ingress PE to Egress PEs | 2.6. Tunneling Frames from Ingress PEs to Egress PEs | |||
| 2.7. Advanced Scenarios | 2.7. Advanced Scenarios | |||
| 3. EVPN-aware Multicast Solution Control Plane | 3. EVPN-Aware Multicast Solution Control Plane | |||
| 3.1. Supplementary Broadcast Domain (SBD) and Route Targets | 3.1. Supplementary Broadcast Domain (SBD) and Route Targets | |||
| 3.2. Advertising the Tunnels Used for IP Multicast | 3.2. Advertising the Tunnels Used for IP Multicast | |||
| 3.2.1. Constructing Routes for the SBD | 3.2.1. Constructing Routes for the SBD | |||
| 3.2.2. Ingress Replication | 3.2.2. Ingress Replication | |||
| 3.2.3. Assisted Replication | 3.2.3. Assisted Replication | |||
| 3.2.3.1. Automatic SBD Matching | 3.2.3.1. Automatic SBD Matching | |||
| 3.2.4. BIER | 3.2.4. BIER | |||
| 3.2.5. Inclusive P2MP Tunnels | 3.2.5. Inclusive P2MP Tunnels | |||
| 3.2.5.1. Using the BUM Tunnels as IP Multicast Inclusive | 3.2.5.1. Using the BUM Tunnels as IP Multicast Inclusive | |||
| Tunnels | Tunnels | |||
| 3.2.5.2. Using Wildcard S-PMSI A-D Routes to Advertise | 3.2.5.2. Using Wildcard S-PMSI A-D Routes to Advertise | |||
| Inclusive Tunnels Specific to IP Multicast | Inclusive Tunnels Specific to IP Multicast | |||
| 3.2.6. Selective Tunnels | 3.2.6. Selective Tunnels | |||
| 3.3. Advertising SMET Routes | 3.3. Advertising SMET Routes | |||
| 4. Constructing Multicast Forwarding State | 4. Constructing Multicast Forwarding State | |||
| 4.1. Layer 2 Multicast State | 4.1. Layer 2 Multicast State | |||
| 4.1.1. Constructing the OIF List | 4.1.1. Constructing the OIF List | |||
| 4.1.2. Data Plane: Applying the OIF List to an (S,G) | 4.1.2. Data Plane: Applying the OIF List to an (S,G) Frame | |||
| Frame | 4.1.2.1. Eligibility of an AC to Receive a Frame | |||
| 4.1.2.1. Eligibility of an AC to Receive a Frame | 4.1.2.2. Applying the OIF List | |||
| 4.1.2.2. Applying the OIF List | 4.2. Layer 3 Forwarding State | |||
| 4.2. Layer 3 Forwarding State | 5. Interworking with Non-OISM EVPN PEs | |||
| 5. Interworking with non-OISM EVPN-PEs | 5.1. IPMG Designated Forwarder | |||
| 5.1. IPMG Designated Forwarder | 5.2. Ingress Replication | |||
| 5.2. Ingress Replication | 5.2.1. Ingress PE is Non-OISM | |||
| 5.2.1. Ingress PE is non-OISM | 5.2.2. Ingress PE is OISM | |||
| 5.2.2. Ingress PE is OISM | 5.3. P2MP Tunnels | |||
| 5.3. P2MP Tunnels | 6. Traffic to/from Outside the EVPN Tenant Domain | |||
| 6. Traffic to/from Outside the EVPN Tenant Domain | 6.1. Layer 3 Interworking via EVPN OISM PEs | |||
| 6.1. Layer 3 Interworking via EVPN OISM PEs | 6.1.1. General Principles | |||
| 6.1.1. General Principles | 6.1.2. Interworking with MVPN | |||
| 6.1.2. Interworking with MVPN | 6.1.2.1. MVPN Sources with EVPN Receivers | |||
| 6.1.2.1. MVPN Sources with EVPN Receivers | 6.1.2.1.1. Identifying MVPN Sources | |||
| 6.1.2.1.1. Identifying MVPN Sources | 6.1.2.1.2. Joining a Flow from an MVPN Source | |||
| 6.1.2.1.2. Joining a Flow from an MVPN Source | 6.1.2.2. EVPN Sources with MVPN Receivers | |||
| 6.1.2.2. EVPN Sources with MVPN Receivers | 6.1.2.2.1. General Procedures | |||
| 6.1.2.2.1. General procedures | 6.1.2.2.2. Any-Source Multicast (ASM) Groups | |||
| 6.1.2.2.2. Any-Source Multicast (ASM) Groups | 6.1.2.2.3. Source on Multihomed Segment | |||
| 6.1.2.2.3. Source on Multihomed Segment | 6.1.2.3. Obtaining Optimal Routing of Traffic between MVPN | |||
| 6.1.2.3. Obtaining Optimal Routing of Traffic Between MVPN | and EVPN | |||
| and EVPN | 6.1.2.4. Selecting the MEG SBD-DR | |||
| 6.1.2.4. Selecting the MEG SBD-DR | 6.1.3. Interworking with Global Table Multicast | |||
| 6.1.3. Interworking with 'Global Table Multicast' | 6.1.4. Interworking with PIM | |||
| 6.1.4. Interworking with PIM | 6.1.4.1. Source Inside EVPN Domain | |||
| 6.1.4.1. Source Inside EVPN Domain | 6.1.4.2. Source Outside EVPN Domain | |||
| 6.1.4.2. Source Outside EVPN Domain | 6.2. Interworking with PIM via an External PIM Router | |||
| 6.2. Interworking with PIM via an External PIM Router | ||||
| 7. Using an EVPN Tenant Domain as an Intermediate (Transit) | 7. Using an EVPN Tenant Domain as an Intermediate (Transit) | |||
| Network for Multicast traffic | Network for Multicast Traffic | |||
| 8. IANA Considerations | 8. IANA Considerations | |||
| 9. Security Considerations | 9. Security Considerations | |||
| 10. Acknowledgements | 10. References | |||
| 11. References | 10.1. Normative References | |||
| 11.1. Normative References | 10.2. Informative References | |||
| 11.2. Informative References | Appendix A. Integrated Routing and Bridging | |||
| Appendix A. Integrated Routing and Bridging | Acknowledgements | |||
| Authors' Addresses | Authors' Addresses | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Terminology | 1.1. Terminology | |||
| In this document we make frequent use of the following terminology: | In this document, we make frequent use of the following terminology: | |||
| * OISM: Optimized Inter-Subnet Multicast. EVPN-PEs that follow the | OISM: Optimized Inter-Subnet Multicast. EVPN PEs that follow the | |||
| procedures of this document will be known as "OISM" PEs. EVPN-PEs | procedures of this document will be known as "OISM" Provider Edges | |||
| that do not follow the procedures of this document will be known | (PEs). EVPN PEs that do not follow the procedures of this | |||
| as "non-OISM" PEs. | document will be known as "non-OISM" PEs. | |||
| * IP Multicast Packet: An IP packet whose IP Destination Address | IP Multicast Packet: An IP packet whose IP Destination Address field | |||
| field is a multicast address that is not a link-local address. | is a multicast address that is not a link-local address. (Link- | |||
| (Link-local addresses are IPv4 addresses in the 224/24 range and | local addresses are IPv4 addresses in the 224/24 range and IPv6 | |||
| IPv6 address in the FF02/16 range.) | addresses in the FF02/16 range.) | |||
| * IP Multicast Frame: An Ethernet frame whose payload is an IP | IP Multicast Frame: An Ethernet frame whose payload is an IP | |||
| multicast packet (as defined above). | multicast packet (as defined above). | |||
| * (S,G) Multicast Packet: An IP multicast packet whose IP Source | (S,G) Multicast Packet: An IP multicast packet whose Source IP | |||
| Address field contains S and whose IP Destination Address field | Address field contains S and whose IP Destination Address field | |||
| contains G. | contains G. | |||
| * (S,G) Multicast Frame: An IP multicast frame whose payload | (S,G) Multicast Frame: An IP multicast frame whose payload contains | |||
| contains S in its IP Source Address field and G in its IP | S in its Source IP Address field and G in its IP Destination | |||
| Destination Address field. | Address field. | |||
| * EVPN Instance (EVI): An EVPN instance spanning the Provider Edge | EVI: EVPN Instance. An EVPN instance spanning the PE devices | |||
| (PE) devices participating in that EVPN. | participating in that EVPN. | |||
| * Broadcast Domain (BD): an emulated Ethernet, such that two systems | BD: Broadcast Domain. An emulated Ethernet, such that two systems | |||
| on the same BD will receive each other's link-local broadcasts. | on the same BD will receive each other's link-local broadcasts. | |||
| Note that EVPN supports service models in which a single EVPN | Note that EVPN supports service models in which a single EVI | |||
| Instance contains only one BD, and service models in which a | contains only one BD and service models in which a single EVI | |||
| single EVI contains multiple BDs. Both types of service model are | contains multiple BDs. Both types of service models are supported | |||
| supported by this draft. In all models, a given BD belongs to | by this document. In all models, a given BD belongs to only one | |||
| only one EVI. | EVI. | |||
| * Designated Forwarder (DF). As defined in [RFC7432], an Ethernet | DF: Designated Forwarder. As defined in [RFC7432], an Ethernet | |||
| segment may be multi-homed (attached to more than one PE). An | segment may be multihomed (attached to more than one PE). An | |||
| Ethernet segment may also contain multiple BDs, of one or more | Ethernet segment may also contain multiple BDs of one or more | |||
| EVIs. For each such EVI, one of the PEs attached to the segment | EVIs. For each such EVI, one of the PEs attached to the segment | |||
| becomes that EVI's DF for that segment. Since a BD may belong to | becomes that EVI's DF for that segment. Since a BD may belong to | |||
| only one EVI, we can speak unambiguously of the BD's DF for a | only one EVI, we can speak unambiguously of the BD's DF for a | |||
| given segment. | given segment. | |||
| When the text makes it clear that we are speaking in the context | AC: Attachment Circuit. An AC connects the bridging function of an | |||
| of a given BD, we will frequently use the term "a segment's DF" to | EVPN PE to an Ethernet segment of a particular BD. ACs are not | |||
| mean the given BD's DF for that segment. | visible at Layer 3. | |||
| * AC: Attachment Circuit. An AC connects the bridging function of | ||||
| an EVPN-PE to an Ethernet segment of a particular BD. ACs are not | ||||
| visible at the router (L3) layer. | ||||
| If a given Ethernet segment, attached to a given PE, contains n | If a given Ethernet segment, attached to a given PE, contains n | |||
| BDs, we will say that the PE has n ACs to that segment. | BDs, we say that the PE has n ACs to that segment. | |||
| * L3 Gateway: An L3 Gateway is a PE that connects an EVPN tenant | L3 Gateway: An L3 Gateway is a PE that connects an EVPN Tenant | |||
| domain to an external multicast domain by performing both the OISM | Domain to an external multicast domain by performing both the OISM | |||
| procedures and the Layer 3 multicast procedures of the external | procedures and the Layer 3 multicast procedures of the external | |||
| domain. | domain. | |||
| * PEG (PIM/EVPN Gateway): A L3 Gateway that connects an EVPN Tenant | PEG: PIM/EVPN Gateway. An L3 Gateway that connects an EVPN Tenant | |||
| Domain to an external multicast domain whose Layer 3 multicast | Domain to an external multicast domain whose Layer 3 multicast | |||
| procedures are those of PIM [RFC7761]. | procedures are those of PIM [RFC7761]. | |||
| * MEG (MVPN/EVPN Gateway): A L3 Gateway that connects an EVPN Tenant | MEG: MVPN/EVPN Gateway. An L3 Gateway that connects an EVPN Tenant | |||
| Domain to an external multicast domain whose Layer 3 multicast | Domain to an external multicast domain whose Layer 3 multicast | |||
| procedures are those of MVPN ([RFC6513], [RFC6514]). | procedures are those of Multicast VPN (MVPN) [RFC6513] [RFC6514]. | |||
| * IPMG (IP Multicast Gateway): A PE that is used for interworking | IPMG: IP Multicast Gateway. A PE that is used for interworking OISM | |||
| OISM EVPN-PEs with non-OISM EVPN-PEs. | EVPN PEs with non-OISM EVPN PEs. | |||
| * DR (Designated Router): A PE that has special responsibilities for | DR: Designated Router. A PE that has special responsibilities for | |||
| handling multicast on a given BD. | handling multicast on a given BD. | |||
| * FHR (First Hop Router): The FHR is a PIM router [RFC7761] with | FHR: First Hop Router. The FHR is a PIM router [RFC7761] with | |||
| special responsibilities. It is the first multicast router to see | special responsibilities. It is the first multicast router to see | |||
| (S,G) packets from source S, and if G is an "Any Source Multicast | (S,G) packets from source S, and if G is an Any-Source Multicast | |||
| (ASM)" group, the FHR is responsible for sending PIM Register | (ASM) group, the FHR is responsible for sending PIM Register | |||
| messages to the PIM Rendezvous Point for group G. | messages to the PIM Rendezvous Point (RP) for group G. | |||
| * LHR (Last Hop Router): The LHR is a PIM router [RFC7761] with | LHR: Last Hop Router. The LHR is a PIM router [RFC7761] with | |||
| special responsibilities. Generally, it is attached to a LAN, and | special responsibilities. Generally, it is attached to a LAN, and | |||
| it determines whether there are any hosts on the LAN that need to | it determines whether there are any hosts on the LAN that need to | |||
| receive a given multicast flow. If so, it creates and sends the | receive a given multicast flow. If so, it creates and sends the | |||
| PIM Join messages that are necessary to receive the flow. | PIM Join messages that are necessary to receive the flow. | |||
| * EC (Extended Community). A BGP Extended Communities attribute | EC: Extended Community. A BGP Extended Communities attribute | |||
| ([RFC4360], [RFC7153]) is a BGP path attribute that consists of | [RFC4360] [RFC7153] is a BGP path attribute that consists of one | |||
| one or more extended communities. | or more Extended Communities. | |||
| * RT (Route Target): A Route Target is a particular kind of BGP | RT: Route Target. A Route Target is a particular kind of BGP | |||
| Extended Community. A BGP Extended Community consists of a type | Extended Community. A BGP Extended Community consists of a type | |||
| field, a sub-type field, and a value field. Certain type/sub-type | field, a sub-type field, and a value field. Certain type/sub-type | |||
| combinations indicate that a particular Extended Community is an | combinations indicate that a particular Extended Community is an | |||
| RT. RT1 and RT2 are considered to be the same RT if and only if | RT. RT1 and RT2 are considered to be the same RT if and only if | |||
| they have the same type, same sub-type, and same value fields. | they have the same type, sub-type, and value fields. | |||
| * Use of the "C-" prefix. In many documents on VPN multicast, the | C- prefix: In many documents on VPN multicast, the prefix C- appears | |||
| prefix "C-" appears before any address or wildcard that refers to | before any address or wildcard that refers to an address or | |||
| an address or addresses in a tenant's address space, rather than | addresses in a tenant's address space rather than to an address or | |||
| to an address or addresses in the address space of the backbone | addresses in the address space of the backbone network. This | |||
| network. This document omits the "C-" prefix in many cases where | document omits the C- prefix in many cases where it is clear from | |||
| it is clear from the context that the reference is to the tenant's | the context that the reference is to the tenant's address space. | |||
| address space. | ||||
| This document also assumes familiarity with the terminology of | This document also assumes familiarity with the terminology of | |||
| [RFC4364], [RFC6514], [RFC7432], [RFC7761], [RFC9251], [RFC9136] and | [RFC4364], [RFC6514], [RFC7432], [RFC7761], [RFC9136], [RFC9251], and | |||
| [I-D.ietf-bess-evpn-bum-procedure-updates]. | [RFC9572]. | |||
| 1.1.1. Requirements Language | ||||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
| "OPTIONAL" in this document are to be interpreted as described in | ||||
| BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
| capitals, as shown here. | ||||
| 1.2. Background | 1.2. Background | |||
| Ethernet VPN (EVPN) [RFC7432] provides a Layer 2 VPN (L2VPN) | Ethernet VPN (EVPN) [RFC7432] provides a Layer 2 VPN (L2VPN) | |||
| solution, which allows an IP or MPLS backbone provider to offer | solution, which allows an IP or MPLS backbone provider to offer | |||
| Ethernet service to a set of customers, known as "tenants". | Ethernet service to a set of customers, known as "tenants". | |||
| In this section (as well as in [RFC9135]), we provide some essential | In this section (as well as in [RFC9135]), we provide some essential | |||
| background information on EVPN. | background information on EVPN. | |||
| 1.2.1. Segments, Broadcast Domains, and Tenants | 1.2.1. Segments, Broadcast Domains, and Tenants | |||
| One of the key concepts of EVPN is the Broadcast Domain (BD). A BD | One of the key concepts of EVPN is the Broadcast Domain (BD). A BD | |||
| is essentially an emulated Ethernet. Each BD belongs to a single | is essentially an emulated Ethernet. Each BD belongs to a single | |||
| tenant. A BD typically consists of multiple Ethernet "segments", and | tenant. A BD typically consists of multiple Ethernet segments, and | |||
| each segment may be attached to a different EVPN Provider Edge | each segment may be attached to a different EVPN Provider Edge (EVPN | |||
| (EVPN-PE) router. EVPN-PE routers are often referred to as "Network | PE) router. EVPN PE routers are often referred to as "Network | |||
| Virtualization Endpoints" or NVEs. However, this document will use | Virtualization Endpoints (NVEs)". However, this document will use | |||
| the term "EVPN-PE", or, when the context is clear, just "PE". | the term "EVPN PE" or, when the context is clear, just "PE". | |||
| In this document, the term "segment" is used interchangeable with the | In this document, the term "segment" is used interchangeably with | |||
| "Ethernet Segment" or "ES" in [RFC7432]. | "Ethernet Segment" or "ES", as defined in [RFC7432]. | |||
| Attached to each segment are "Tenant Systems" (TSes). A TS may be | Attached to each segment are Tenant Systems (TSs). A TS may be any | |||
| any type of system, physical or virtual, host or router, etc., that | type of system, physical or virtual, host or router, etc., that can | |||
| can attach to an Ethernet. | attach to an Ethernet. | |||
| When two TSes are on the same segment, traffic between them does not | When two TSs are on the same segment, traffic between them does not | |||
| pass through an EVPN-PE. When two TSes are on different segments of | pass through an EVPN PE. When two TSs are on different segments of | |||
| the same BD, traffic between them does pass through an EVPN-PE. | the same BD, traffic between them does pass through an EVPN PE. | |||
| When two TSes, say TS1 and TS2 are on the same BD, then: | When two TSs, say TS1 and TS2, are on the same BD, then the following | |||
| occurs: | ||||
| * If TS1 knows the MAC address of TS2, TS1 can send unicast Ethernet | * If TS1 knows the Media Access Control (MAC) address of TS2, TS1 | |||
| frames to TS2. TS2 will receive the frames unaltered. | can send unicast Ethernet frames to TS2. TS2 will receive the | |||
| frames unaltered. | ||||
| * If TS1 broadcasts an Ethernet frame, TS2 will receive the | * If TS1 broadcasts an Ethernet frame, TS2 will receive the | |||
| unaltered frame. | unaltered frame. | |||
| * If TS1 multicasts an Ethernet frame, TS2 will receive the | * If TS1 multicasts an Ethernet frame, TS2 will receive the | |||
| unaltered frame, as long as TS2 has been provisioned to receive | unaltered frame as long as TS2 has been provisioned to receive the | |||
| the Ethernet multicast destination MAC address. | Ethernet multicast destination MAC address. | |||
| When we say that TS2 receives an unaltered frame from TS1, we mean | When we say that TS2 receives an unaltered frame from TS1, we mean | |||
| that the frame still contains TS1's MAC address, and that no | that the frame still contains TS1's MAC address and that no | |||
| alteration of the frame's payload (and consequently, no alteration of | alteration of the frame's payload (and consequently, no alteration of | |||
| the payload's IP header) has been made. | the payload's IP header) has been made. | |||
| EVPN allows a single segment to be attached to multiple PE routers. | EVPN allows a single segment to be attached to multiple PE routers. | |||
| This is known as "EVPN multi-homing". Suppose a given segment is | This is known as "EVPN multihoming". Suppose a given segment is | |||
| attached to both PE1 and PE2, and suppose PE1 receives a frame from | attached to both PE1 and PE2, and suppose PE1 receives a frame from | |||
| that segment. It may be necessary for PE1 to send the frame over the | that segment. It may be necessary for PE1 to send the frame over the | |||
| backbone to PE2. EVPN has procedures to ensure that such a frame | backbone to PE2. EVPN has procedures to ensure that such a frame | |||
| cannot be sent by PE2 back to its originating segment. This is | cannot be sent back to its originating segment by PE2. This is | |||
| particularly important for multicast, because a frame arriving at PE1 | particularly important for multicast, because a frame arriving at PE1 | |||
| from a given segment will already have been seen by all the systems | from a given segment will already have been seen by all the systems | |||
| on that segment that need to see it. If the frame were sent back to | on that segment that need to see it. If the frame was sent back to | |||
| the originating segment by PE2, receivers on that segment would | the originating segment by PE2, receivers on that segment would | |||
| receive the packet twice. Even worse, the frame might be sent back | receive the packet twice. Even worse, the frame might be sent back | |||
| to PE1, which could cause an infinite loop. | to PE1, which could cause an infinite loop. | |||
| 1.2.2. Inter-BD (Inter-Subnet) IP Traffic | 1.2.2. Inter-BD (Inter-Subnet) IP Traffic | |||
| If a given tenant has multiple BDs, the tenant may wish to allow IP | If a given tenant has multiple BDs, the tenant may wish to allow IP | |||
| communication among these BDs. Such a set of BDs is known as an | communication among these BDs. Such a set of BDs is known as an | |||
| "EVPN Tenant Domain" or just a "Tenant Domain". | "EVPN Tenant Domain" or just a "Tenant Domain". | |||
| If tenant systems TS1 and TS2 are not in the same BD, then they do | If tenant systems TS1 and TS2 are not in the same BD, then they do | |||
| not receive unaltered Ethernet frames from each other. In order for | not receive unaltered Ethernet frames from each other. In order for | |||
| TS1 to send traffic to TS2, TS1 encapsulates an IP datagram inside an | TS1 to send traffic to TS2, TS1 encapsulates an IP datagram inside an | |||
| Ethernet frame, and uses Ethernet to send these frames to an IP | Ethernet frame and uses Ethernet to send these frames to an IP | |||
| router. The router decapsulates the IP datagram, does the IP | router. The router decapsulates the IP datagram, does the IP | |||
| processing and re-encapsulates the datagram for Ethernet. The MAC | processing, and re-encapsulates the datagram for Ethernet. The MAC | |||
| source address field now has the MAC address of the router, not of | Source Address field now has the MAC address of the router, not of | |||
| TS1. The TTL field of the IP datagram should be decremented by | TS1. The TTL field of the IP datagram should be decremented by | |||
| exactly 1, even if the frame needs to be sent from one PE to another. | exactly 1, even if the frame needs to be sent from one PE to another. | |||
| The structure of the provider's backbone is thus hidden from the | The structure of the provider's backbone is thus hidden from the | |||
| tenants. | tenants. | |||
| EVPN accommodates the need for inter-BD communication within a Tenant | EVPN accommodates the need for inter-BD communication within a Tenant | |||
| Domain by providing an integrated L2/L3 service for unicast IP | Domain by providing an integrated L2/L3 service for unicast IP | |||
| traffic. EVPN's Integrated Routing and Bridging (IRB) functionality | traffic. EVPN's Integrated Routing and Bridging (IRB) functionality | |||
| is specified in [RFC9135]. Each BD in a Tenant Domain is assumed to | is specified in [RFC9135]. Each BD in a Tenant Domain is assumed to | |||
| be a single IP subnet, and each IP subnet within a given Tenant | be a single IP subnet, and each IP subnet within a given Tenant | |||
| Domain is assumed to be a single BD. EVPN's IRB functionality allows | Domain is assumed to be a single BD. EVPN's IRB functionality allows | |||
| IP traffic to travel from one BD to another, and ensures that proper | IP traffic to travel from one BD to another and ensures that proper | |||
| IP processing (e.g., TTL decrement) is done. | IP processing (e.g., TTL decrement) is done. | |||
| A brief overview of IRB, including the notion of an "IRB interface", | A brief overview of IRB, including the notion of an IRB interface, | |||
| can be found in Appendix A. As explained there, an IRB interface is | can be found in Appendix A. As explained there, an IRB interface is | |||
| a sort of virtual interface connecting an L3 routing instance to a | a sort of virtual interface connecting an L3 routing instance to a | |||
| BD. A BD may have multiple attachment circuits (ACs) to a given PE, | BD. A BD may have multiple Attachment Circuits (ACs) to a given PE, | |||
| where each AC connects to a different Ethernet segment of the BD. | where each AC connects to a different Ethernet segment of the BD. | |||
| However, these ACs are not visible to the L3 routing function; from | However, these ACs are not visible to the L3 routing function; from | |||
| the perspective of an L3 routing instance, a PE has just one | the perspective of an L3 routing instance, a PE has just one | |||
| interface to each BD, viz., the IRB interface for that BD. | interface to each BD, viz., the IRB interface for that BD. | |||
| In this document, when traffic is routed out of an IRB interface, we | In this document, when traffic is routed out of an IRB interface, we | |||
| say it is sent down the IRB interface to the BD that the IRB is for. | say it is sent down the IRB interface to the BD that the IRB is for. | |||
| In the other direction, traffic is sent up the IRB interface from the | In the other direction, traffic is sent up the IRB interface from the | |||
| BD to the L3 routing instance. | BD to the L3 routing instance. | |||
| The "L3 routing instance" depicted in Appendix A is associated with a | The L3 routing instance depicted in Appendix A is associated with a | |||
| single Tenant Domain, and may be thought of as an IP-VRF for that | single Tenant Domain and may be thought of as an IP Virtual Routing and | |||
| Tenant Domain. | Forwarding (IP-VRF) instance for that Tenant Domain. | |||
| 1.2.3. EVPN and IP Multicast | 1.2.3. EVPN and IP Multicast | |||
| [RFC9135] and [RFC9136] cover inter-subnet (inter-BD) IP unicast | [RFC9135] and [RFC9136] cover inter-subnet (inter-BD) IP unicast | |||
| forwarding, but they do not cover inter-subnet IP multicast | forwarding, but they do not cover inter-subnet IP multicast | |||
| forwarding. | forwarding. | |||
| [RFC7432] covers intra-subnet (intra-BD) Ethernet multicast. The | [RFC7432] covers intra-subnet (intra-BD) Ethernet multicast. The | |||
| intra-subnet Ethernet multicast procedures of [RFC7432] are used for | intra-subnet Ethernet multicast procedures of [RFC7432] are used for | |||
| Ethernet Broadcast traffic, for Ethernet unicast traffic whose MAC | Ethernet broadcast traffic, Ethernet unicast traffic whose | |||
| Destination Address field contains an Unknown address, and for | Destination MAC Address field contains an unknown address, and | |||
| Ethernet traffic whose MAC Destination Address field contains an | Ethernet traffic whose Destination MAC Address field contains an | |||
| Ethernet Multicast MAC address. These three classes of traffic are | Ethernet multicast MAC address. These three classes of traffic are | |||
| known collectively as "BUM traffic" (Broadcast/Unknown-Unicast/ | known collectively as "BUM traffic" (Broadcast, Unknown Unicast, or | |||
| Multicast), and the procedures for handling BUM traffic are known as | Multicast traffic), and the procedures for handling BUM traffic are | |||
| "BUM procedures". | known as "BUM procedures". | |||
| [RFC9251] extends the intra-subnet Ethernet multicast procedures by | [RFC9251] extends the intra-subnet Ethernet multicast procedures by | |||
| adding procedures that are specific to, and optimized for, the use of | adding procedures that are specific to, and optimized for, the use of | |||
| IP multicast within a subnet. However, that document does not cover | IP multicast within a subnet. However, that document does not cover | |||
| inter-subnet IP multicast. | inter-subnet IP multicast. | |||
| The purpose of this document is to specify procedures for EVPN that | The purpose of this document is to specify procedures for EVPN that | |||
| provide optimized IP multicast functionality within an EVPN tenant | provide optimized IP multicast functionality within an EVPN Tenant | |||
| domain. This document also specifies procedures that allow IP | Domain. This document also specifies procedures that allow IP | |||
| multicast packets to be sourced from or destined to systems outside | multicast packets to be sourced from or destined to systems outside | |||
| the Tenant Domain. We refer to the entire set of these procedures as | the Tenant Domain. The entire set of procedures is referred to as | |||
| "OISM" (Optimized Inter-Subnet Multicast) procedures. | "Optimized Inter-Subnet Multicast (OISM)" procedures. | |||
| In order to support the OISM procedures specified in this document, | In order to support the OISM procedures specified in this document, | |||
| an EVPN-PE MUST also support [RFC9135] and [RFC9251]. (However, | an EVPN PE MUST also support [RFC9135] and [RFC9251]. (However, | |||
| certain procedures in [RFC9251] are modified when OISM is supported.) | certain procedures in [RFC9251] are modified when OISM is supported.) | |||
| 1.2.4. BDs, MAC-VRFS, and EVPN Service Models | 1.2.4. BDs, MAC-VRFs, and EVPN Service Models | |||
| [RFC7432] defines the notion of "MAC-VRF". A MAC-VRF contains one or | [RFC7432] defines the notion of MAC-VRF (MAC Virtual Routing and | |||
| more "Bridge Tables" (see section 3 of [RFC7432] for a discussion of | Forwarding). A MAC-VRF contains one or more bridge tables (see | |||
| this terminology), each of which represents a single Broadcast | Section 3 of [RFC7432]), each of which represents a single Broadcast | |||
| Domain. | Domain. | |||
| In the IRB model (outlined in Appendix A), an L3 routing instance has | In the IRB model (outlined in Appendix A), an L3 routing instance has | |||
| one IRB interface per BD, NOT one per MAC-VRF. This document does | one IRB interface per BD, NOT one per MAC-VRF. This document does | |||
| not distinguish between a "Broadcast Domain" and a "Bridge Table", | not distinguish between a Broadcast Domain and a bridge table; | |||
| and will use the terms interchangeably (or will use the acronym "BD" | instead, it uses the terms interchangeably (or uses the acronym | |||
| to refer to either). The way the BDs are grouped into MAC-VRFs is | "BD" to refer to either). The way the BDs are grouped into MAC-VRFs | |||
| not relevant to the procedures specified in this document. | is not relevant to the procedures specified in this document. | |||
| Section 6 of [RFC7432] also defines several different EVPN service | Section 6 of [RFC7432] also defines several different EVPN service | |||
| models: | models: | |||
| * In the "VLAN-based service", each MAC-VRF contains one "bridge | * In the VLAN-based service, each MAC-VRF contains one bridge table, | |||
| table", where the bridge table corresponds to a particular Virtual | where the bridge table corresponds to a particular Virtual LAN | |||
| LAN (VLAN). (See section 3 of [RFC7432] for a discussion of this | (VLAN) (see Section 3 of [RFC7432]). Thus, each VLAN is treated | |||
| terminology.) Thus, each VLAN is treated as a BD. | as a BD. | |||
| * In the "VLAN bundle service", each MAC-VRF contains one bridge | * In the VLAN bundle service, each MAC-VRF contains one bridge | |||
| table, where the bridge table corresponds to a set of VLANs. Thus | table, where the bridge table corresponds to a set of VLANs. | |||
| a set of VLANs are treated as constituting a single BD. | Thus, a set of VLANs are treated as constituting a single BD. | |||
| * In the "VLAN-aware bundle service", each MAC-VRF may contain | * In the VLAN-aware bundle service, each MAC-VRF may contain | |||
| multiple bridge tables, where each bridge table corresponds to one | multiple bridge tables, where each bridge table corresponds to one | |||
| BD. If a MAC-VRF contains several bridge tables, then it | BD. If a MAC-VRF contains several bridge tables, then it | |||
| corresponds to several BDs. | corresponds to several BDs. | |||
| The procedures in this document are intended to work for all these | The procedures in this document are intended to work for all these | |||
| service models. | service models. | |||
| 1.3. Need for EVPN-aware Multicast Procedures | 1.3. Need for EVPN-Aware Multicast Procedures | |||
| Inter-subnet IP multicast among a set of BDs can be achieved, in a | Inter-subnet IP multicast among a set of BDs can be achieved, in a | |||
| non-optimal manner, without any specific EVPN procedures. For | non-optimal manner, without any specific EVPN procedures. For | |||
| instance, if a particular tenant has n BDs among which he wants to | instance, if a particular tenant has n BDs among which it wants to | |||
| send IP multicast traffic, he can simply attach a conventional | send IP multicast traffic, it can simply attach a conventional | |||
| multicast router to all n BDs. Or more generally, as long as each BD | multicast router to all n BDs. Or more generally, as long as each BD | |||
| has at least one IP multicast router, and the IP multicast routers | has at least one IP multicast router, and the IP multicast routers | |||
| communicate multicast control information with each other, | communicate multicast control information with each other, | |||
| conventional IP multicast procedures will work normally, and no | conventional IP multicast procedures will work normally, and no | |||
| special EVPN functionality is needed. | special EVPN functionality is needed. | |||
| However, that technique does not provide optimal routing for | However, that technique does not provide optimal routing for | |||
| multicast. In conventional multicast routing, for a given multicast | multicast. In conventional multicast routing, for a given multicast | |||
| flow, there is only one multicast router on each BD that is permitted | flow, there is only one multicast router on each BD that is permitted | |||
| to send traffic of that flow to the BD. If that BD has receivers for | to send traffic of that flow to the BD. If that BD has receivers for | |||
| a given flow, but the source of the flow is not on that BD, then the | a given flow, but the source of the flow is not on that BD, then the | |||
| flow must pass through that multicast router. This leads to the | flow must pass through that multicast router. This leads to the | |||
| "hair-pinning" problem described (for unicast) in Appendix A. | hairpinning problem described (for unicast) in Appendix A. | |||
| For example, consider an (S,G) flow that is sourced by a TS S and | For example, consider an (S,G) flow that is sourced by a TS S and | |||
| needs to be received by TSes R1 and R2. Suppose S is on a segment of | needs to be received by TSs R1 and R2. Suppose S is on a segment of | |||
| BD1, R1 is on a segment of BD2, but both are attached to PE1. | BD1, R1 is on a segment of BD2, but both are attached to PE1. Also | |||
| Suppose also that the tenant has a multicast router, attached to a | suppose that the tenant has a multicast router attached to a segment | |||
| segment of BD1 and to a segment of BD2. However, the segments to | of BD1 and to a segment of BD2. However, the segments to which that | |||
| which that router is attached are both attached to PE2. Then the | router is attached are both attached to PE2. Then, the flow from S | |||
| flow from S to R1 would have to follow the path: | to R1 would have to follow the path: S-->PE1-->PE2-->tenant multicast | |||
| S-->PE1-->PE2-->Tenant Multicast Router-->PE2-->PE1-->R1. Obviously, | router-->PE2-->PE1-->R1. Obviously, the path S-->PE1-->R1 would be | |||
| the path S-->PE1-->R1 would be preferred. | preferred. | |||
| +---+ +---+ | +---+ +---+ | |||
| |PE1+----------------------+PE2| | |PE1+----------------------+PE2| | |||
| +---+-+ +-+---+ | +---+-+ +-+---+ | |||
| | \ \ / / | | | \ \ / / | | |||
| BD1 BD2 BD3 BD3 BD2 BD1 | BD1 BD2 BD3 BD3 BD2 BD1 | |||
| | | | \ | | | | | | \ | | | |||
| S R1 R2 router | S R1 R2 router | |||
| Now suppose that there is a second receiver, R2. R2 is attached to a | Now suppose that there is a second receiver, R2. R2 is attached to a | |||
| third BD, BD3. However, it is attached to a segment of BD3 that is | third BD, BD3. However, it is attached to a segment of BD3 that is | |||
| attached to PE1. And suppose also that the Tenant Multicast Router | attached to PE1. And suppose that the tenant multicast router is | |||
| is attached to a segment of BD3 that attaches to PE2. In this case, | attached to a segment of BD3 that attaches to PE2. In this case, the | |||
| the Tenant Multicast Router will make two copies of the packet, one | tenant multicast router will make two copies of the packet, one for | |||
| for BD2 and one for BD3. PE2 will send both copies back to PE1. Not | BD2 and one for BD3. PE2 will send both copies back to PE1. Not | |||
| only is the routing sub-optimal, but also PE2 sends multiple copies | only is the routing sub-optimal, but PE2 also sends multiple copies | |||
| of the same packet to PE1. This is a further sub-optimality. | of the same packet to PE1, which is a further sub-optimality. | |||
| This is only an example; many more examples of sub-optimal multicast | This is only an example; many more examples of sub-optimal multicast | |||
| routing can easily be given. To eliminate sub-optimal routing and | routing can easily be given. To eliminate sub-optimal routing and | |||
| extra copies, it is necessary to have a multicast solution that is | extra copies, it is necessary to have a multicast solution that is | |||
| EVPN-aware, and that can use its knowledge of the internal structure | EVPN-aware and that can use its knowledge of the internal structure | |||
| of a Tenant Domain to ensure that multicast traffic gets routed | of a Tenant Domain to ensure that multicast traffic gets routed | |||
| optimally. The procedures in this document allow us to avoid all | optimally. The procedures in this document allow us to avoid all | |||
| such sub-optimalities when routing inter-subnet multicast traffic | such sub-optimalities when routing inter-subnet multicast traffic | |||
| within a Tenant Domain. | within a Tenant Domain. | |||
| 1.4. Additional Requirements That Must be Met by the Solution | 1.4. Additional Requirements That Must Be Met by the Solution | |||
| In addition to providing optimal routing of multicast flows within a | In addition to providing optimal routing of multicast flows within a | |||
| Tenant Domain, the EVPN-aware multicast solution is intended to | Tenant Domain, the EVPN-aware multicast solution is intended to | |||
| satisfy the following requirements: | satisfy the following requirements: | |||
| * The solution must integrate well with the procedures specified in | * The solution must integrate well with the procedures specified in | |||
| [RFC9251]. That is, an integrated set of procedures must handle | [RFC9251]. That is, an integrated set of procedures must handle | |||
| both intra-subnet multicast and inter-subnet multicast. | both intra-subnet multicast and inter-subnet multicast. | |||
| * With regard to intra-subnet multicast, the solution MUST maintain | * With regard to intra-subnet multicast, the solution MUST maintain | |||
| the integrity of multicast Ethernet service. This means: | the integrity of the multicast Ethernet service. This means: | |||
| - If a source and a receiver are on the same subnet, the MAC | - If a source and a receiver are on the same subnet, the MAC | |||
| source address (SA) of the multicast frame sent by the source | Source Address (SA) of the multicast frame sent by the source | |||
| will not get rewritten. | will not get rewritten. | |||
| - If a source and a receiver are on the same subnet, no IP | - If a source and a receiver are on the same subnet, no IP | |||
| processing of the Ethernet payload is done. The IP TTL is not | processing of the Ethernet payload is done. The IP TTL is not | |||
| decremented, the IPv4 header checksum is not changed, no | decremented, the IPv4 header checksum is not changed, no | |||
| fragmentation is done, etc. | fragmentation is done, etc. | |||
| * On the other hand, if a source and a receiver are on different | * On the other hand, if a source and a receiver are on different | |||
| subnets, the frame received by the receiver will not have the MAC | subnets, the frame received by the receiver will not have the MAC | |||
| Source address of the source, as the frame will appear to have | Source Address of the source, as the frame will appear to have | |||
| come from a multicast router. Also, proper processing of the IP | come from a multicast router. Also, proper processing of the IP | |||
| header is done, e.g., TTL decrement by 1, header checksum | header is done, e.g., TTL decrement by 1, header checksum | |||
| modification, possible fragmentation, etc. | modification, possible fragmentation, etc. | |||
| * If a Tenant Domain contains several BDs, it MUST be possible for a | * If a Tenant Domain contains several BDs, it MUST be possible for a | |||
| multicast flow (even when the multicast group address is an "any | multicast flow (even when the multicast group address is an ASM | |||
| source multicast" (ASM) address), to have sources in one of those | address) to have sources in one of those BDs and receivers in one | |||
| BDs and receivers in one or more of the other BDs, without | or more of the other BDs without requiring the presence of any | |||
| requiring the presence of any system performing PIM Rendezvous | system performing PIM RP functions [RFC7761]. | |||
| Point (RP) functions [RFC7761]. | ||||
| * Sometimes a MAC address used by one TS on a particular BD is also | * Sometimes a MAC address used by one TS on a particular BD is also | |||
| used by another TS on a different BD. Inter-subnet routing of | used by another TS on a different BD. Inter-subnet routing of | |||
| multicast traffic MUST NOT make any assumptions about the | multicast traffic MUST NOT make any assumptions about the | |||
| uniqueness of a MAC address across several BDs. | uniqueness of a MAC address across several BDs. | |||
| * If two EVPN-PEs attached to the same Tenant Domain both support | * If two EVPN PEs attached to the same Tenant Domain both support | |||
| the OISM procedures, each may receive inter-subnet multicasts from | the OISM procedures, each may receive inter-subnet multicasts from | |||
| the other, even if the egress PE is not attached to any segment of | the other, even if the egress PE is not attached to any segment of | |||
| the BD from which the multicast packets are being sourced. It | the BD from which the multicast packets are being sourced. It | |||
| MUST NOT be necessary to provision the egress PE with knowledge of | MUST NOT be necessary to provision the egress PE with knowledge of | |||
| the ingress BD. | the ingress BD. | |||
| * There must be a procedure that allows EVPN-PE routers supporting | * There must be a procedure that allows EVPN PE routers supporting | |||
| OISM procedures to send/receive multicast traffic to/from EVPN-PE | OISM procedures to send/receive multicast traffic to/from EVPN PE | |||
| routers that support only [RFC7432], but that do not support the | routers that support only [RFC7432] but that do not support the | |||
| OISM procedures or even the procedures of [RFC9135]. However, | OISM procedures or even the procedures of [RFC9135]. However, | |||
| when interworking with such routers (which we call "non-OISM PE | when interworking with such routers (which we call "non-OISM PE | |||
| routers"), optimal routing may not be achievable. | routers"), optimal routing may not be achievable. | |||
| * It MUST be possible to support scenarios in which multicast flows | * It MUST be possible to support scenarios in which multicast flows | |||
| with sources inside a Tenant Domain have "external" receivers, | with sources inside a Tenant Domain have external receivers, i.e., | |||
| i.e., receivers that are outside the domain. It must also be | receivers that are outside the domain. It must also be possible | |||
| possible to support scenarios where multicast flows with external | to support scenarios where multicast flows with external sources | |||
| sources (sources outside the Tenant Domain) have receivers inside | (sources outside the Tenant Domain) have receivers inside the | |||
| the domain. | domain. | |||
| This presupposes that unicast routes to multicast sources outside | This presupposes that unicast routes to multicast sources outside | |||
| the domain can be distributed to EVPN-PEs attached to the domain, | the domain can be distributed to EVPN PEs attached to the domain | |||
| and that unicast routes to multicast sources within the domain can | and that unicast routes to multicast sources within the domain can | |||
| be distributed outside the domain. | be distributed outside the domain. | |||
| Of particular importance is the scenario in which the external | Of particular importance are the scenarios in which the external | |||
| sources and/or receivers are reachable via L3VPN/MVPN, and the | sources and/or receivers are reachable via L3VPN/MVPN or via IP/ | |||
| scenario in which external sources and/or receivers are reachable | PIM. | |||
| via IP/PIM. | ||||
| The solution for external interworking MUST allow for deployment | The solution for external interworking MUST allow for deployment | |||
| scenarios in which EVPN does not need to export a host route for | scenarios in which EVPN does not need to export a host route for | |||
| every multicast source. | every multicast source. | |||
| * The solution for external interworking must not presuppose that | * The solution for external interworking must not presuppose that | |||
| the same tunneling technology is used within both the EVPN domain | the same tunneling technology is used within both the EVPN domain | |||
| and the external domain. For example, MVPN interworking must be | and the external domain. For example, MVPN interworking must be | |||
| possible when MVPN is using MPLS P2MP tunneling, and EVPN is using | possible when MVPN is using MPLS Point-to-Multipoint (P2MP) | |||
| Ingress Replication or VXLAN tunneling. | tunneling and when EVPN is using Ingress Replication (IR) or | |||
| Virtual eXtensible Local Area Network (VXLAN) tunneling. | ||||
| * The solution must not be overly dependent on the details of a | * The solution must not be overly dependent on the details of a | |||
| small set of use cases, but must be adaptable to new use cases as | small set of use cases but must be adaptable to new use cases as | |||
| they arise. (That is, the solution must be robust.) | they arise. (That is, the solution must be robust.) | |||
| 1.5. Model of Operation: Overview | 1.5. Model of Operation: Overview | |||
| 1.5.1. Control Plane | 1.5.1. Control Plane | |||
| In this section, and in the remainder of this document, we assume the | In this section, and in the remainder of this document, we assume the | |||
| reader is familiar with the procedures of IGMP/MLD (see [RFC3376] and | reader is familiar with the procedures of IGMP / Multicast Listener | |||
| [RFC3810]), by which hosts announce their interest in receiving | Discovery (MLD) (see [RFC3376] and [RFC3810]), by which hosts | |||
| particular multicast flows. | announce their interest in receiving particular multicast flows. | |||
| Consider a Tenant Domain consisting of a set of k BDs: BD1, ..., BDk. | Consider a Tenant Domain consisting of a set of k BDs: BD1, ..., BDk. | |||
| To support the OISM procedures, each Tenant Domain must also be | To support the OISM procedures, each Tenant Domain must also be | |||
| associated with a "Supplementary Broadcast Domain" (SBD). An SBD is | associated with a Supplementary Broadcast Domain (SBD). An SBD is | |||
| treated in the control plane as a real BD, but it does not have any | treated in the control plane as a real BD, but it does not have any | |||
| ACs. The SBD has several uses; these will be described later in this | ACs. The SBD has several uses; these will be described later in this | |||
| document (see Section 2.1 and Section 3). | document (see Sections 2.1 and 3). | |||
| Each PE that attaches to one or more of the BDs in a given tenant | Each PE that attaches to one or more of the BDs in a given Tenant | |||
| domain will be provisioned to recognize that those BDs are part of | Domain will be provisioned to recognize that those BDs are part of | |||
| the same Tenant Domain. Note that a given PE does not need to be | the same Tenant Domain. Note that a given PE does not need to be | |||
| configured with all the BDs of a given Tenant Domain. In general, a | configured with all the BDs of a given Tenant Domain. In general, a | |||
| PE will only be attached to a subset of the BDs in a given Tenant | PE will only be attached to a subset of the BDs in a given Tenant | |||
| Domain, and will be configured only with that subset of BDs. | Domain and will be configured only with that subset of BDs. However, | |||
| However, each PE attached to a given Tenant Domain must be configured | each PE attached to a given Tenant Domain must be configured with the | |||
| with the SBD for that Tenant Domain. | SBD for that Tenant Domain. | |||
| Suppose a particular segment of a particular BD is attached to PE1. | Suppose a particular segment of a particular BD is attached to PE1. | |||
| [RFC7432] specifies that PE1 must originate an Inclusive Multicast | [RFC7432] specifies that PE1 must originate an Inclusive Multicast | |||
| Ethernet Tag (IMET) route for that BD, and that the IMET route must | Ethernet Tag (IMET) route for that BD and that the IMET route must be | |||
| be propagated to all other PEs attached to the same BD. If the given | propagated to all other PEs attached to the same BD. If the given | |||
| segment contains a host that has interest in receiving a particular | segment contains a host that has interest in receiving a particular | |||
| multicast flow, either an (S,G) flow or a (*,G) flow, PE1 will learn | multicast flow, either an (S,G) flow or a (*,G) flow, PE1 will learn | |||
| of that interest by participating in the IGMP/MLD snooping | of that interest by participating in the IGMP/MLD snooping | |||
| procedures, as specified in [RFC4541]. In this case: | procedures, as specified in [RFC4541]. In this case: | |||
| * PE1 is interested in receiving the flow; | * PE1 is interested in receiving the flow; | |||
| * The AC attaching the interested host to PE1 is also said to be | * the AC attaching the interested host to PE1 is also said to be | |||
| interested in the flow; | interested in the flow; and | |||
| * The BD containing an AC that is interested in a particular flow is | * the BD containing an AC that is interested in a particular flow is | |||
| also said to be interested in that flow. | also said to be interested in that flow. | |||
| Once PE1 determines that it has an AC that is interested in receiving | Once PE1 determines that it has an AC that is interested in receiving | |||
| a particular flow or set of flows, it originates one or more | a particular flow or set of flows, it originates one or more | |||
| Selective Multicast Ethernet Tag (SMET) route(s) [RFC9251] to | Selective Multicast Ethernet Tag (SMET) routes [RFC9251] to advertise | |||
| advertise that interest. | that interest. | |||
| Note that each IMET or SMET route is "for" a particular BD. The | Note that each IMET or SMET route is for a particular BD. The notion | |||
| notion of a route being "for" a particular BD is explained in | of a route being for a particular BD is explained in Section 2.2. | |||
| Section 2.2. | ||||
| When OISM is being supported, the procedures of [RFC9251], are | When OISM is being supported, the procedures of [RFC9251] are | |||
| modified as follows: | modified as follows: | |||
| * The IMET route originated by a particular PE for a particular BD | * The IMET route originated by a particular PE for a particular BD | |||
| is distributed to all other PEs attached to the Tenant Domain | is distributed to all other PEs attached to the Tenant Domain | |||
| containing that BD, even to those PEs that are not attached to | containing that BD, even to those PEs that are not attached to | |||
| that particular BD. | that particular BD. | |||
| * The SMET routes originated by a particular PE are originated on a | * The SMET routes originated by a particular PE are originated on a | |||
| per-Tenant-Domain basis, rather than on a per-BD basis. That is, | per-Tenant-Domain basis rather than a per-BD basis. That is, the | |||
| the SMET routes are considered to be for the Tenant Domain's SBD, | SMET routes are considered to be for the Tenant Domain's SBD | |||
| rather than for any of its ordinary BDs. These SMET routes are | rather than any of its ordinary BDs. These SMET routes are | |||
| distributed to all the PEs attached to the Tenant Domain. | distributed to all the PEs attached to the Tenant Domain. | |||
| In this way, each PE attached to a given Tenant Domain learns, | In this way, each PE attached to a given Tenant Domain learns, | |||
| from each other PE attached to the same Tenant Domain, the set of | from the other PEs attached to the same Tenant Domain, the set of | |||
| flows that are of interest to each of those other PEs. | flows that are of interest to each of those other PEs. | |||
| An OISM PE that is provisioned with several BDs in the same Tenant | An OISM PE that is provisioned with several BDs in the same Tenant | |||
| Domain MUST originate an IMET route for each such BD. To indicate | Domain MUST originate an IMET route for each such BD. To indicate | |||
| its support of [RFC9251], it SHOULD attach the EVPN Multicast Flags | its support of [RFC9251], it SHOULD attach the EVPN Multicast Flags | |||
| Extended Community to each such IMET route, but it MUST attach the EC | Extended Community to each such IMET route, but it MUST attach the EC | |||
| to at least one such IMET route. | to at least one such IMET route. | |||
| Suppose PE1 is provisioned with both BD1 and BD2, and is provisioned | Suppose PE1 is provisioned with both BD1 and BD2 and considers them | |||
| to consider them to be part of the same Tenant Domain. It is | to be part of the same Tenant Domain. It is possible that PE1 will | |||
| possible that PE1 will receive from PE2 both an IMET route for BD1 | receive both an IMET route for BD1 and an IMET route for BD2 from | |||
| and an IMET route for BD2. If either of these IMET routes has the | PE2. If either of these IMET routes has the EVPN Multicast Flags | |||
| EVPN Multicast Flags Extended Community, PE1 MUST assume that PE2 is | Extended Community, PE1 MUST assume that PE2 is supporting the | |||
| supporting the procedures of [RFC9251] for ALL BDs in the Tenant | procedures of [RFC9251] for ALL BDs in the Tenant Domain. | |||
| Domain. | ||||
| If a PE supports OISM functionality, it indicates that by setting the | If a PE supports OISM functionality, it indicates that by setting | |||
| "OISM-supported" flag in the Multicast Flags Extended Community that | the OISM-supported flag in the Multicast Flags Extended Community that it | |||
| it attaches to some or all of its IMET routes. An OISM PE SHOULD | attaches to some or all of its IMET routes. An OISM PE SHOULD attach | |||
| attach this EC with the OISM-supported flag set to all the IMET | this EC with the OISM-supported flag set to all the IMET routes it | |||
| routes it originates. However, if PE1 imports IMET routes from PE2, | originates. However, if PE1 imports IMET routes from PE2, and at | |||
| and at least one of PE2's IMET routes indicates that PE2 is an OISM | least one of PE2's IMET routes indicates that PE2 is an OISM PE, PE1 | |||
| PE, PE1 MUST assume that PE2 is following OISM procedures. | MUST assume that PE2 is following OISM procedures. | |||
| 1.5.2. Data Plane | 1.5.2. Data Plane | |||
| Suppose PE1 has an AC to a segment in BD1, and PE1 receives from that | Suppose PE1 has an AC to a segment in BD1 and PE1 receives an (S,G) | |||
| AC an (S,G) multicast frame (as defined in Section 1.1). | multicast frame (as defined in Section 1.1) from that AC. | |||
| There may be other ACs of PE1 on which TSes have indicated an | There may be other ACs of PE1 on which TSs have indicated an interest | |||
| interest (via IGMP/MLD) in receiving (S,G) multicast packets. PE1 is | (via IGMP/MLD) in receiving (S,G) multicast packets. PE1 is | |||
| responsible for sending the received multicast packet on those ACs. | responsible for sending the received multicast packet on those ACs. | |||
| There are two cases to consider: | There are two cases to consider: | |||
| * Intra-Subnet Forwarding: In this case, an attachment AC with | * Intra-Subnet Forwarding: In this case, an AC with interest in | |||
| interest in (S,G) is connected to a segment that is part of the | (S,G) is connected to a segment that is part of the source BD, | |||
| source BD, BD1. If the segment is not multi-homed, or if PE1 is | BD1. If the segment is not multihomed, or if PE1 is the | |||
| the Designated Forwarder (DF) (see [RFC7432]) for that segment, | Designated Forwarder (DF) (see [RFC7432]) for that segment, PE1 | |||
| PE1 sends the multicast frame on that AC without changing the MAC | sends the multicast frame on that AC without changing the MAC SA. | |||
| SA. The IP header is not modified at all; in particular, the TTL | The IP header is not modified at all; in particular, the TTL is | |||
| is not decremented. | not decremented. | |||
| * Inter-Subnet Forwarding: An AC with interest in (S,G) is connected | * Inter-Subnet Forwarding: An AC with interest in (S,G) is connected | |||
| to a segment of BD2, where BD2 is different than BD1. If PE1 is | to a segment of BD2, where BD2 is different than BD1. If PE1 is | |||
| the DF for that segment (or if the segment is not multi-homed), | the DF for that segment (or if the segment is not multihomed), PE1 | |||
| PE1 decapsulates the IP multicast packet, performs any necessary | decapsulates the IP multicast packet, performs any necessary IP | |||
| IP processing (including TTL decrement), then re-encapsulates the | processing (including TTL decrement), and then re-encapsulates the | |||
| packet appropriately for BD2. PE1 then sends the packet on the | packet appropriately for BD2. PE1 then sends the packet on the | |||
| AC. Note that after re-encapsulation, the MAC SA will be PE1's | AC. Note that after re-encapsulation, the MAC SA will be PE1's | |||
| MAC address on BD2. The IP TTL will have been decremented by 1. | MAC address on BD2. The IP TTL will have been decremented by 1. | |||
| In addition, there may be other PEs that are interested in (S,G) | In addition, there may be other PEs that are interested in (S,G) | |||
| traffic. Suppose PE2 is such a PE. Then PE1 tunnels a copy of the | traffic. Suppose PE2 is such a PE. Then, PE1 tunnels a copy of the | |||
| IP multicast frame (with its original MAC SA, and with no alteration | IP multicast frame (with its original MAC SA and with no alteration | |||
| of the payload's IP header) to PE2. The tunnel encapsulation | of the payload's IP header) to PE2. The tunnel encapsulation | |||
| contains information that PE2 can use to associate the frame with an | contains information that PE2 can use to associate the frame with an | |||
| "apparent source BD". If the actual source BD of the frame is BD1, | apparent source BD. If the actual source BD of the frame is BD1, | |||
| then: | then: | |||
| * If PE2 is attached to BD1, the tunnel encapsulation used to send | * If PE2 is attached to BD1, the tunnel encapsulation used to send | |||
| the frame to PE2 will cause PE2 to identify BD1 as the apparent | the frame to PE2 will cause PE2 to identify BD1 as the apparent | |||
| source BD. | source BD. | |||
| * If PE2 is not attached to BD1, the tunnel encapsulation used to | * If PE2 is not attached to BD1, the tunnel encapsulation used to | |||
| send the frame to PE2 will cause PE2 to identify the SBD as the | send the frame to PE2 will cause PE2 to identify the SBD as the | |||
| apparent source BD. | apparent source BD. | |||
| Note that the tunnel encapsulation used for a particular BD will have | Note that the tunnel encapsulation used for a particular BD will have | |||
| been advertised in an IMET route or S-PMSI route | been advertised in an IMET route or a Selective Provider Multicast | |||
| [I-D.ietf-bess-evpn-bum-procedure-updates] for that BD. That route | Service Interface (S-PMSI) route [RFC9572] for that BD. That route | |||
| carries a PMSI Tunnel attribute, which specifies how packets | carries a PMSI Tunnel Attribute (PTA), which specifies how packets | |||
| originating from that BD are encapsulated. This information enables | originating from that BD are encapsulated. This information enables | |||
| the PE receiving a tunneled packet to identify the apparent source BD | the PE receiving a tunneled packet to identify the apparent source BD | |||
| as stated above. See Section 3.2 for more details. | as stated above. See Section 3.2 for more details. | |||
| When PE2 receives the tunneled frame, it will forward it on any of | When PE2 receives the tunneled frame, it will forward it on any of | |||
| its ACs that have interest in (S,G). | its ACs that have interest in (S,G). | |||
| If PE2 determines from the tunnel encapsulation that the apparent | If PE2 determines from the tunnel encapsulation that the apparent | |||
| source BD is BD1, then | source BD is BD1, then: | |||
| * For those ACs that connect PE2 to BD1, the intra-subnet forwarding | * For those ACs that connect PE2 to BD1, the intra-subnet forwarding | |||
| procedure described above is used, except that it is now PE2, not | procedure described above is used, except that it is now PE2, not | |||
| PE1, carrying out that procedure. Unmodified EVPN procedures from | PE1, carrying out that procedure. Unmodified EVPN procedures from | |||
| [RFC7432] are used to ensure that a packet originating from a | [RFC7432] are used to ensure that a packet originating from a | |||
| multi-homed segment is never sent back to that segment. | multihomed segment is never sent back to that segment. | |||
| * For those ACs that do not connect to BD1, the inter-subnet | * For those ACs that do not connect to BD1, the inter-subnet | |||
| forwarding procedure described above is used, except that it is | forwarding procedure described above is used, except that it is | |||
| now PE2, not PE1, carrying out that procedure. | now PE2, not PE1, carrying out that procedure. | |||
| If the tunnel encapsulation identifies the apparent source BD as the | If the tunnel encapsulation identifies the apparent source BD as the | |||
| SBD, PE2 applies the inter-subnet forwarding procedures described | SBD, PE2 applies the inter-subnet forwarding procedures described | |||
| above to all of its ACs that have interest in the flow. | above to all of its ACs that have interest in the flow. | |||
| These procedures ensure that an IP multicast frame travels from its | These procedures ensure that an IP multicast frame travels from its | |||
| ingress PE to all egress PEs that are interested in receiving it. | ingress PE to all egress PEs that are interested in receiving it. | |||
| While in transit, the frame retains its original MAC SA, and the | While in transit, the frame retains its original MAC SA, and the | |||
| payload of the frame retains its original IP header. Note that in | payload of the frame retains its original IP header. Note that in | |||
| all cases, when an IP multicast packet is sent from one BD to | all cases, when an IP multicast packet is sent from one BD to | |||
| another, these procedures cause its TTL to be decremented by 1. | another, these procedures cause its TTL to be decremented by 1. | |||
| So far we have assumed that an IP multicast packet arrives at its | So far, we have assumed that an IP multicast packet arrives at its | |||
| ingress PE over an AC that belongs to one of the BDs in a given | ingress PE over an AC that belongs to one of the BDs in a given | |||
| Tenant Domain. However, it is possible for a packet to arrive at its | Tenant Domain. However, it is possible for a packet to arrive at its | |||
| ingress PE in other ways. Since an EVPN-PE supporting IRB has an | ingress PE in other ways. Since an EVPN PE supporting IRB has an | |||
| IP-VRF, it is possible that the IP-VRF will have a "VRF interface" | IP-VRF, it is possible that the IP-VRF will have a VRF interface that is | |||
| that is not an IRB interface. For example, there might be a VRF | not an IRB interface. For example, there might be a VRF interface | |||
| interface that is actually a physical link to an external Ethernet | that is actually a physical link to an external Ethernet switch, a | |||
| switch, or to a directly attached host, or to a router. When an | directly attached host, or a router. When an EVPN PE, say PE1, | |||
| EVPN-PE, say PE1, receives a packet through such means, we will say | receives a packet through such means, we will say that the packet has | |||
| that the packet has an "external" source (i.e., a source "outside the | an external source (i.e., a source outside the Tenant Domain). There | |||
| Tenant Domain"). There are also other scenarios in which a multicast | are also other scenarios in which a multicast packet might have an | |||
| packet might have an external source, e.g., it might arrive over an | external source, e.g., it might arrive over an MVPN tunnel from an | |||
| MVPN tunnel from an L3VPN PE. In such cases, we will still refer to | L3VPN PE. In such cases, we will still refer to PE1 as the "ingress | |||
| PE1 as the "ingress EVPN-PE". | EVPN PE". | |||
| When an EVPN-PE, say PE1, receives an externally sourced multicast | When an EVPN PE, say PE1, receives an externally sourced multicast | |||
| packet, and there are receivers for that packet inside the Tenant | packet, and there are receivers for that packet inside the Tenant | |||
| Domain, it does the following: | Domain, it does the following: | |||
| * Suppose PE1 has an AC in BD1 that has interest in (S,G). Then PE1 | * Suppose PE1 has an AC in BD1 that has interest in (S,G). Then, | |||
| encapsulates the packet for BD1, filling in the MAC SA field with | PE1 encapsulates the packet for BD1, filling in the MAC SA field | |||
| PE1's own MAC address on BD1. It sends the resulting frame on the | with PE1's own MAC address on BD1. It sends the resulting frame | |||
| AC. | on the AC. | |||
| * Suppose some other EVPN-PE, say PE2, has interest in (S,G). PE1 | * Suppose some other EVPN PE, say PE2, has interest in (S,G). PE1 | |||
| encapsulates the packet for Ethernet, filling in the MAC SA field | encapsulates the packet for Ethernet, filling in the MAC SA field | |||
| with PE1's own MAC address on the SBD. PE1 then tunnels the | with PE1's own MAC address on the SBD. PE1 then tunnels the | |||
| packet to PE2. The tunnel encapsulation will identify the | packet to PE2. The tunnel encapsulation will identify the | |||
| apparent source BD as the SBD. Since the apparent source BD is | apparent source BD as the SBD. Since the apparent source BD is | |||
| the SBD, PE2 will know to treat the frame as an inter-subnet | the SBD, PE2 will know to treat the frame as an inter-subnet | |||
| multicast. | multicast. | |||
| When ingress replication is used to transmit IP multicast frames from | When IR is used to transmit IP multicast frames from an ingress EVPN | |||
| an ingress EVPN-PE to a set of egress PEs, then the ingress PE has to | PE to a set of egress PEs, then the ingress PE has to send multiple | |||
| send multiple copies of the frame. Each copy is the original | copies of the frame. Each copy is the original Ethernet frame; | |||
| Ethernet frame; decapsulation and IP processing take place only at | decapsulation and IP processing take place only at the egress PE. | |||
| the egress PE. | ||||
| If a Point-to-Multipoint (P2MP) tree or BIER [I-D.ietf-bier-evpn] is | If a P2MP tree or Bit Index Explicit Replication (BIER) [RFC9624] is | |||
| used to transmit an IP multicast frame from an ingress PE to a set of | used to transmit an IP multicast frame from an ingress PE to a set of | |||
| egress PEs, then the ingress PE only has to send one copy of the | egress PEs, then the ingress PE only has to send one copy of the | |||
| frame to each of its next hops. Again, each egress PE receives the | frame to each of its next hops. Again, each egress PE receives the | |||
| original frame and does any necessary IP processing. | original frame and does any necessary IP processing. | |||
| 2. Detailed Model of Operation | 2. Detailed Model of Operation | |||
| The model described in Section 1.5.2 can be expressed more precisely | The model described in Section 1.5.2 can be expressed more precisely | |||
| using the notion of "IRB interface" (see Appendix A). For a given | using the notion of IRB interface (see Appendix A). For a given | |||
| Tenant Domain: | Tenant Domain: | |||
| * A given PE has one IRB interface for each BD to which it is | * A given PE has one IRB interface for each BD to which it is | |||
| attached. This IRB interface connects L3 routing to that BD. | attached. This IRB interface connects L3 routing to that BD. | |||
| When IP multicast packets are sent or received on the IRB | When IP multicast packets are sent or received on the IRB | |||
| interfaces, the semantics of the interface is modified from the | interfaces, the semantics of the interface are modified from the | |||
| semantics described in Appendix A. See Section 2.3 for the | semantics described in Appendix A. See Section 2.3 for the | |||
| details of the modification. | details of the modification. | |||
| * Each PE also has an IRB interface that connects L3 routing to the | * Each PE also has an IRB interface that connects L3 routing to the | |||
| SBD. The semantics of this interface are different than the | SBD. The semantics of this interface are different than the | |||
| semantics of the IRB interface to the real BDs. See Section 2.3. | semantics of the IRB interface to the real BDs. See Section 2.3. | |||
| In this section we assume that PIM is not enabled on the IRB | In this section, we assume that PIM is not enabled on the IRB | |||
| interfaces. In general, it is not necessary to enable PIM on the IRB | interfaces. In general, it is not necessary to enable PIM on the IRB | |||
| interfaces unless there are PIM routers on one of the Tenant Domain's | interfaces unless there are PIM routers on one of the Tenant Domain's | |||
| BDs, or unless there is some other scenario requiring a Tenant | BDs or there is some other scenario requiring a Tenant Domain's L3 | |||
| Domain's L3 routing instance to become a PIM adjacency of some other | routing instance to become a PIM adjacency of some other system. | |||
| system. These cases will be discussed in Section 7. | These cases will be discussed in Section 7. | |||
| 2.1. Supplementary Broadcast Domain | 2.1. Supplementary Broadcast Domain | |||
| Suppose a given Tenant Domain contains three BDs (BD1, BD2, BD3) and | Suppose a given Tenant Domain contains three BDs (BD1, BD2, and BD3) | |||
| two PEs (PE1, PE2). PE1 attaches to BD1 and BD2, while PE2 attaches | and two PEs (PE1 and PE2). PE1 attaches to BD1 and BD2, while PE2 | |||
| to BD2 and BD3. | attaches to BD2 and BD3. | |||
| To carry out the procedures described above, all the PEs attached to | To carry out the procedures described above, all the PEs attached to | |||
| the Tenant Domain must be provisioned with the SBD for that tenant | the Tenant Domain must be provisioned with the SBD for that Tenant | |||
| domain. A Route Target (RT) must be associated with the SBD, and | Domain. An RT must be associated with the SBD and provisioned on | |||
| provisioned on each of those PEs. We will refer to that RT as the | each of those PEs. We will refer to that RT as the "SBD-RT". | |||
| "SBD-RT". | ||||
| A Tenant Domain is also configured with an IP-VRF [RFC9135], and the | A Tenant Domain is also configured with an IP-VRF [RFC9135], and the | |||
| IP-VRF is associated with an RT. This RT MAY be the same as the | IP-VRF is associated with an RT. This RT MAY be the same as the | |||
| SBD-RT. | SBD-RT. | |||
| Suppose an (S,G) multicast frame originating on BD1 has a receiver on | Suppose an (S,G) multicast frame originating on BD1 has a receiver on | |||
| BD3. PE1 will transmit the packet to PE2 as a frame, and the | BD3. PE1 will transmit the packet to PE2 as a frame, and the | |||
| encapsulation will identify the frame's source BD as BD1. Since PE2 | encapsulation will identify the frame's source BD as BD1. Since PE2 | |||
| is not provisioned with BD1, it will treat the packet as if its | is not provisioned with BD1, it will treat the packet as if its | |||
| source BD were the SBD. That is, a packet can be transmitted from | source BD were the SBD. That is, a packet can be transmitted from | |||
| BD1 to BD3 even though its ingress PE is not configured for BD3, | BD1 to BD3 even though its ingress PE is not configured for BD3 | |||
| and/or its egress PE is not configured for BD1. | and/or its egress PE is not configured for BD1. | |||
| EVPN supports service models in which a given EVPN Instance (EVI) can | EVPN supports service models in which a given EVI can contain only | |||
| contain only one BD. It also supports service models in which a | one BD. It also supports service models in which a given EVI can | |||
| given EVI can contain multiple BDs. No matter which service model is | contain multiple BDs. No matter which service model is being used | |||
| being used for a particular tenant, it is highly RECOMMENDED that an | for a particular tenant, it is highly RECOMMENDED that an EVI | |||
| EVI containing only the SBD be provisioned for that tenant. | containing only the SBD be provisioned for that tenant. | |||
| If, for some reason, it is not feasible to provision an EVI that | If, for some reason, it is not feasible to provision an EVI that | |||
| contains only the SBD, it is possible to put the SBD in an EVI that | contains only the SBD, it is possible to put the SBD in an EVI that | |||
| contains other BDs. However, in that case, the SBD-RT MUST be | contains other BDs. However, in that case, the SBD-RT MUST be | |||
| different than the RT associated with any other BD. Otherwise the | different than the RT associated with any other BD. Otherwise, the | |||
| procedures of this document (as detailed in Sections 2.2 and 3.1) | procedures of this document (as detailed in Sections 2.2 and 3.1) | |||
| will not produce correct results. | will not produce correct results. | |||
| 2.2. Detecting When a Route is For/From a Particular BD | 2.2. Detecting When a Route is for/from a Particular BD | |||
| In this document, we frequently say that a particular multicast route | In this document, we frequently say that a particular multicast route | |||
| is "from" a particular BD, or is "for" a particular BD, or is | is "from" or "for" a particular BD or is "related to" or "associated | |||
| "related to" a particular BD, or "is associated with" a particular | with" a particular BD. These terms are used interchangeably. | |||
| BD. These terms are used interchangeably. Subsequent sections of | Subsequent sections of this document explain when various routes must | |||
| this document explain when various routes must be originated for | be originated for particular BDs. In this section, we explain how | |||
| particular BDs. In this section, we explain how the PE originating a | the PE originating a route marks the route to indicate which BD it is | |||
| route marks the route to indicate which BD it is for. We also | for. We also explain how a PE receiving the route determines which | |||
| explain how a PE receiving the route determines which BD the route is | BD the route is for. | |||
| for. | ||||
| In EVPN, each BD is assigned a Route Target (RT). An RT is a BGP | In EVPN, each BD is assigned an RT. An RT is a BGP Extended | |||
| extended community that can be attached to the BGP routes used by the | Community that can be attached to the BGP routes used by the EVPN | |||
| EVPN control plane. In some EVPN service models, each BD is assigned | control plane. In some EVPN service models, each BD is assigned a | |||
| a unique RT. In other service models, a set of BDs (all in the same | unique RT. In other service models, a set of BDs (all in the same | |||
| EVI) may be assigned the same RT. The RT that is assigned to the SBD | EVI) may be assigned the same RT. The RT that is assigned to the SBD | |||
| is called the "SBD-RT". | is called the "SBD-RT". | |||
| In those service models that allow a set of BDs to share a single RT, | In those service models that allow a set of BDs to share a single RT, | |||
| each BD is assigned a non-zero Tag ID. The Tag ID appears in the | each BD is assigned a non-zero Tag ID. The Tag ID appears in the | |||
| Network Layer Reachability Information (NLRI) of many of the BGP | Network Layer Reachability Information (NLRI) of many of the BGP | |||
| routes that are used by the EVPN control plane. | routes that are used by the EVPN control plane. | |||
| A given route may be for the SBD, or for an "ordinary BD" (a BD that | A given route may be for the SBD or an ordinary BD (a BD that is not | |||
| is not the SBD). An RT that has been assigned to an ordinary BD will | the SBD). An RT that has been assigned to an ordinary BD will be | |||
| be known as an "ordinary BD-RT". | known as an "ordinary BD-RT". | |||
| When constructing an IMET, SMET, S-PMSI, or Leaf | When constructing an IMET, SMET, S-PMSI, or Leaf [RFC9572] route that | |||
| [I-D.ietf-bess-evpn-bum-procedure-updates] route that is for a given | is for a given BD, the following rules apply: | |||
| BD, the following rules apply: | ||||
| * If the route is for an ordinary BD, say BD1, then | * If the route is for an ordinary BD, say BD1, then: | |||
| - the route MUST carry the ordinary BD-RT associated with BD1, | - the route MUST carry the ordinary BD-RT associated with BD1 and | |||
| and | ||||
| - the route MUST NOT carry any RT that is associated with an | - the route MUST NOT carry any RT that is associated with an | |||
| ordinary BD other than BD1. | ordinary BD other than BD1. | |||
| * If the route is for the SBD, the route MUST carry the SBD-RT, and | * If the route is for the SBD, the route MUST carry the SBD-RT and | |||
| MUST NOT carry any RT that is associated with any other BD. | MUST NOT carry any RT that is associated with any other BD. | |||
| * As detailed in subsequent sections, under certain circumstances a | * As detailed in subsequent sections, under certain circumstances, a | |||
| route that is for BD1 may carry both the RT of BD1 and the | route that is for BD1 may carry both the RT of BD1 and the | |||
| SBD-RT. | SBD-RT. | |||
| The IMET route for the SBD MUST carry a Multicast Flags Extended | The IMET route for the SBD MUST carry a Multicast Flags Extended | |||
| Community, in which an "OISM SBD" flag is set. | Community in which an OISM SBD flag is set. | |||
| The IMET route for a BD other than the SBD SHOULD carry an EVI-RT EC | The IMET route for a BD other than the SBD SHOULD carry an EVI-RT EC | |||
| as defined in [RFC9251]. The EC is constructed from the SBD-RT, to | as defined in [RFC9251]. The EC is constructed from the SBD-RT to | |||
| indicate the BD's corresponding SBD. This allows all PEs to check | indicate the BD's corresponding SBD. This allows all PEs to check | |||
| that they have consistent SBD provisioning and allow an Assisted | that they have consistent SBD provisioning and allows an Assisted | |||
| Replication (AR) replicator to automatically determine a BD's | Replication (AR) replicator to automatically determine a BD's | |||
| corresponding SBD without any provisioning, as explained in | corresponding SBD without any provisioning, as explained in | |||
| Section 3.2.3.1. | Section 3.2.3.1. | |||
| When receiving an IMET, SMET, S-PMSI, or Leaf route, it is necessary | When receiving an IMET, SMET, S-PMSI, or Leaf route, it is necessary | |||
| for the receiving PE to determine the BD to which the route belongs. | for the receiving PE to determine the BD to which the route belongs. | |||
| This is done by examining the RTs carried by the route, as well as | This is done by examining the RTs carried by the route, as well as | |||
| the Tag ID field of the route's NLRI. There are several cases to | the Tag ID field of the route's NLRI. There are several cases to | |||
| consider. Some of these cases are error cases that arise when the | consider. Some of these cases are error cases that arise when the | |||
| route has not been properly constructed. | route has not been properly constructed. | |||
| When one of the error cases is detected, the route MUST be regarded | When one of the error cases is detected, the route MUST be regarded | |||
| as a malformed route, and the "treat-as-withdraw" procedure of | as a malformed route, and the treat-as-withdraw procedure of | |||
| [RFC7606] MUST be applied. Note that these error cases are only | [RFC7606] MUST be applied. Note that these error cases are only | |||
| detectable by EVPN procedures at the receiving PE; BGP procedures at | detectable by EVPN procedures at the receiving PE; BGP procedures at | |||
| intermediate nodes will generally not detect the existence of such | intermediate nodes will generally not detect the existence of such | |||
| error cases, and in general SHOULD NOT attempt to do so. | error cases and in general SHOULD NOT attempt to do so. | |||
| Case 1: The receiving PE recognizes more than one of the route's RTs | Case 1: The receiving PE recognizes more than one of the route's RTs | |||
| as being an SBD-RT (i.e., the route carries SBD-RTs of more | as being an SBD-RT (i.e., the route carries SBD-RTs of more | |||
| than one Tenant Domain). | than one Tenant Domain). | |||
| This is an error case; the route has not been properly | This is an error case; the route has not been properly | |||
| constructed. | constructed. | |||
| Case 2: The receiving PE recognizes one of the route's RTs as being | Case 2: The receiving PE recognizes one of the route's RTs as being | |||
| associated with an ordinary BD, and recognizes one of the | associated with an ordinary BD and recognizes one of the | |||
| route's other RTs as being associated with a different | route's other RTs as being associated with a different | |||
| ordinary BD. | ordinary BD. | |||
| This is an error case; the route has not been properly | This is an error case; the route has not been properly | |||
| constructed. | constructed. | |||
| Case 3: The receiving PE recognizes one of the route's RTs as being | Case 3: The receiving PE recognizes one of the route's RTs as being | |||
| associated with an ordinary BD in a particular Tenant | associated with an ordinary BD in a particular Tenant Domain | |||
| Domain, and recognizes another of the route's RTs as being | and recognizes another of the route's RTs as being | |||
| associated with the SBD of a different Tenant Domain. | associated with the SBD of a different Tenant Domain. | |||
| This is an error case; the route has not been properly | This is an error case; the route has not been properly | |||
| constructed. | constructed. | |||
| Case 4: The receiving PE does not recognize any of the route's RTs | Case 4: The receiving PE does not recognize any of the route's RTs | |||
| as being associated with an ordinary BD in any of its tenant | as being associated with an ordinary BD in any of its Tenant | |||
| domains, but does recognize one of the RTs as the SBD-RT of | Domains but does recognize one of the RTs as the SBD-RT of | |||
| one of its Tenant Domains. | one of its Tenant Domains. | |||
| In this case, the receiving PE associates the route with the | In this case, the receiving PE associates the route with the | |||
| SBD of that Tenant Domain. This association is made even if | SBD of that Tenant Domain. This association is made even if | |||
| the Tag ID field of the route's NLRI is not the Tag ID of | the Tag ID field of the route's NLRI is not the Tag ID of | |||
| the SBD. | the SBD. | |||
| This is a normal use case where either (a) the route is for | This is a normal use case where either (a) the route is for | |||
| a BD to which the receiving PE is not attached, or (b) the | a BD to which the receiving PE is not attached or (b) the | |||
| route is for the SBD. In either case, the receiving PE | route is for the SBD. In either case, the receiving PE | |||
| associates the route with the SBD. | associates the route with the SBD. | |||
| Case 5: The receiving PE recognizes exactly one of the RTs as an | Case 5: The receiving PE recognizes exactly one of the RTs as an | |||
| ordinary BD-RT that is associated with one of the PE's EVIs, | ordinary BD-RT that is associated with one of the PE's EVIs, | |||
| say EVI-1. The receiving PE also recognizes one of the RTs | say EVI-1. The receiving PE also recognizes one of the RTs | |||
| as being the SBD-RT of the Tenant Domain containing EVI-1. | as being the SBD-RT of the Tenant Domain containing EVI-1. | |||
| In this case, the route is associated with the BD in EVI-1 | In this case, the route is associated with the BD in EVI-1 | |||
| that is identified (in the context of EVI-1) by the Tag ID | that is identified (in the context of EVI-1) by the Tag ID | |||
| field of the route's NLRI. (If EVI-1 contains only a single | field of the route's NLRI. (If EVI-1 contains only a single | |||
| BD, the Tag ID is likely to be zero.) | BD, the Tag ID is likely to be zero.) | |||
| This is the case where the route is for a BD to which the | This is the case where the route is for a BD to which the | |||
| receiving PE is attached, but the route also carries the | receiving PE is attached, but the route also carries the | |||
| SBD-RT. In this case, the receiving PE associates the route | SBD-RT. In this case, the receiving PE associates the route | |||
| with the ordinary BD, not with the SBD. | with the ordinary BD, not with the SBD. | |||
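
Cases 4 and 5 above can be sketched as a small classification function. This is a minimal illustration, not a normative procedure: the `TenantDomain` record, the `associate_route` name, and the returned tuples are hypothetical, and the sketch returns `None` for anything that is not Case 4 or Case 5 (including the error cases described earlier in this section). It also assumes that at most one SBD-RT matches.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass
class TenantDomain:
    """Hypothetical per-PE provisioning record for one Tenant Domain."""
    name: str
    sbd_rt: str
    # Ordinary BD-RT -> name of the local EVI that the RT is associated with.
    bd_rt_to_evi: Dict[str, str] = field(default_factory=dict)


def associate_route(rts: FrozenSet[str], tag_id: int,
                    domains: List[TenantDomain]) -> Optional[Tuple]:
    """Return the association for Case 4 or Case 5, else None."""
    # RTs recognized as ordinary BD-RTs of some locally provisioned EVI.
    bd_hits = [(d, rt) for d in domains for rt in sorted(rts)
               if rt in d.bd_rt_to_evi]
    # Tenant Domains whose SBD-RT is carried by the route.
    sbd_hits = [d for d in domains if d.sbd_rt in rts]

    # Case 4: no ordinary BD-RT recognized, one SBD-RT recognized.
    # The route is associated with the SBD regardless of its Tag ID.
    if not bd_hits and len(sbd_hits) == 1:
        return ("SBD", sbd_hits[0].name)

    # Case 5: exactly one ordinary BD-RT recognized, plus the SBD-RT of the
    # same Tenant Domain. The Tag ID selects the BD within that EVI.
    if (len(bd_hits) == 1 and len(sbd_hits) == 1
            and bd_hits[0][0] is sbd_hits[0]):
        domain, rt = bd_hits[0]
        return ("BD", domain.name, domain.bd_rt_to_evi[rt], tag_id)

    return None  # Other cases are not modeled in this sketch.
```

Note that in Case 4 the Tag ID is deliberately ignored, whereas in Case 5 it is carried through to identify the BD in the context of the EVI.
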
| N.B.: According to the above rules, the mapping from BD to RT is a | Note that according to the above rules, the mapping from BD to RT is | |||
| many-to-one or one-to-one mapping. A route that an EVPN-PE | a many-to-one or one-to-one mapping. A route that an EVPN PE | |||
| originates for a particular BD carries that BD's RT, and an EVPN-PE | originates for a particular BD carries that BD's RT, and an EVPN PE | |||
| that receives the route associates it with a BD as described above. | that receives the route associates it with a BD as described above. | |||
| However, RTs are not used only to help identify the BD to which a | However, RTs are not used only to help identify the BD to which a | |||
| route belongs; they may also used by BGP to determine the path along | route belongs; they may also be used by BGP to determine the path | |||
| which the route is distributed, and to determine which PEs receive | along which the route is distributed and to determine which PEs | |||
| the route. There may be cases where it is desirable to originate a | receive the route. There may be cases where it is desirable to | |||
| route for a particular BD, but have that route distributed to only | originate a route for a particular BD but have that route distributed | |||
| some of the EVPN-PEs attached to that BD. Or one might want the | to only some of the EVPN PEs attached to that BD. Or one might want | |||
| route distributed to some intermediate set of systems, where it might | the route distributed to some intermediate set of systems, where it | |||
| be modified or replaced before being propagated further. Such | might be modified or replaced before being propagated further. Such | |||
| situations are outside the scope of this document. | situations are outside the scope of this document. | |||
| Additionally, there may be situations where it is desirable to | Additionally, there may be situations where it is desirable to | |||
| exchange routes among two or more different Tenant Domains ("EVPN | exchange routes among two or more different Tenant Domains (EVPN | |||
| Extranet"). Such situations are outside the scope of this document. | Extranet). Such situations are outside the scope of this document. | |||
| 2.3. Use of IRB Interfaces at Ingress PE | 2.3. Use of IRB Interfaces at Ingress PE | |||
| When an (S,G) multicast frame is received from an AC belonging to a | When an (S,G) multicast frame is received from an AC belonging to a | |||
| particular BD, say BD1: | particular BD, say BD1: | |||
| 1. The frame is sent unchanged to other EVPN-PEs that are interested | 1. The frame is sent unchanged to other EVPN PEs that are interested | |||
| in (S,G) traffic. The encapsulation used to send the frame to | in (S,G) traffic. The encapsulation used to send the frame to | |||
| the other EVPN-PEs depends on the tunnel type being used for | the other EVPN PEs depends on the tunnel type being used for | |||
| multicast transmission. (For our purposes, we consider Ingress | multicast transmission. (For our purposes, we consider IR, AR, | |||
| Replication (IR), Assisted Replication (AR) and BIER to be | and BIER to be tunnel types, even though IR, AR, and BIER do not | |||
| "tunnel types", even though IR, AR and BIER do not actually use | actually use P2MP tunnels.) At the egress PE, the apparent | |||
| P2MP tunnels.) At the egress PE, the apparent source BD of the | source BD of the frame can be inferred from the tunnel | |||
| frame can be inferred from the tunnel encapsulation. If the | encapsulation. If the egress PE is not attached to the actual | |||
| egress PE is not attached to the actual source BD, it will infer | source BD, it will infer that the apparent source BD is the SBD. | |||
| that the apparent source BD is the SBD. | ||||
| Note that the the inter-PE transmission of a multicast frame | Note that the inter-PE transmission of a multicast frame among | |||
| among EVPN-PEs of the same Tenant Domain does NOT involve the IRB | EVPN PEs of the same Tenant Domain does NOT involve the IRB | |||
| interfaces, as long as the multicast frame was received over an | interfaces as long as the multicast frame was received over an AC | |||
| AC attached to one of the Tenant Domain's BDs. | attached to one of the Tenant Domain's BDs. | |||
| 2. The frame is also sent up the IRB interface that attaches BD1 to | 2. The frame is also sent up the IRB interface that attaches BD1 to | |||
| the Tenant Domain's L3 routing instance in this PE. That is, the | the Tenant Domain's L3 routing instance in this PE. That is, the | |||
| L3 routing instance, behaving as if it were a multicast router, | L3 routing instance, behaving as if it were a multicast router, | |||
| receives the IP multicast frames that arrive at the PE from its | receives the IP multicast frames that arrive at the PE from its | |||
| local ACs. The L3 routing instance decapsulates the frame's | local ACs. The L3 routing instance decapsulates the frame's | |||
| payload to extract the IP multicast packet, decrements the IP | payload to extract the IP multicast packet, decrements the IP | |||
| TTL, adjusts the header checksum, and does any other necessary IP | TTL, adjusts the header checksum, and does any other necessary IP | |||
| processing (e.g., fragmentation). | processing (e.g., fragmentation). | |||
| 3. The L3 routing instance keeps track of which BDs have local | 3. The L3 routing instance keeps track of which BDs have local | |||
| receivers for (S,G) traffic. (A "local receiver" is a TS, | receivers for (S,G) traffic. (A local receiver is a TS, | |||
| reachable via a local AC, that has expressed interest in (S,G) | reachable via a local AC, that has expressed interest in (S,G) | |||
| traffic.) If the L3 routing instance has an IRB interface to | traffic.) If the L3 routing instance has an IRB interface to | |||
| BD2, and it knows that BD2 has a LOCAL receiver interested in | BD2, and it knows that BD2 has a LOCAL receiver interested in | |||
| (S,G) traffic, it encapsulates the packet in an Ethernet header | (S,G) traffic, it encapsulates the packet in an Ethernet header | |||
| for BD2, putting its own MAC address in the MAC SA field. Then | for BD2, putting its own MAC address in the MAC SA field. Then, | |||
| it sends the packet down the IRB interface to BD2. | it sends the packet down the IRB interface to BD2. | |||
| If a packet is sent from the L3 routing instance to a particular BD | If a packet is sent from the L3 routing instance to a particular BD | |||
| via the IRB interface (step 3 in the above list), and if the BD in | via the IRB interface (step 3 in the above list), and if the BD in | |||
| question is NOT the SBD, the packet is sent ONLY to LOCAL ACs of that | question is NOT the SBD, the packet is sent ONLY to LOCAL ACs of that | |||
| BD. If the packet needs to go to other PEs, it has already been sent | BD. If the packet needs to go to other PEs, it has already been sent | |||
| to them in step 1. Note that this is a change in the IRB interface | to them in step 1. Note that this is a change in the IRB interface | |||
| semantics from what is described in [RFC9135] and Figure 2. | semantics from what is described in [RFC9135] and Figure 3. | |||
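
Steps 1-3 above, together with the rule that a packet sent down a non-SBD IRB interface goes only to local ACs, can be sketched as follows. This is an illustrative model only. The `Frame` type, the `ingress_actions` function, and the action tuples are hypothetical names. Two assumptions not stated in the text are made: a packet whose TTL reaches zero is not routed (ordinary IP behavior), and the L3 routing instance does not route the packet back to its source BD. Bridging to local ACs of the source BD is not modeled.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Frame:
    src_mac: str            # MAC SA of the Ethernet frame
    ttl: int                # TTL of the IP multicast packet in the payload
    flow: Tuple[str, str]   # (S, G)


def ingress_actions(frame: Frame, source_bd: str, interested_pes: Set[str],
                    local_receiver_bds: Set[str],
                    irb_mac: Dict[str, str]) -> List[tuple]:
    actions: List[tuple] = []

    # Step 1: send the frame unchanged to every interested remote EVPN PE.
    # The IRB interfaces are not involved in this inter-PE transmission.
    for pe in sorted(interested_pes):
        actions.append(("tunnel", pe, frame))

    # Step 2: the frame goes up the IRB interface of the source BD; the L3
    # routing instance decrements the TTL (checksum update not modeled).
    routed_ttl = frame.ttl - 1
    if routed_ttl <= 0:
        return actions  # Assumption: an expired packet is not routed.

    # Step 3: for each other BD with LOCAL receivers, re-encapsulate with the
    # PE's own MAC as MAC SA and send down that BD's IRB interface. For a BD
    # that is not the SBD, delivery is to local ACs only.
    for bd in sorted(local_receiver_bds):
        if bd == source_bd:
            continue  # Assumption: no routing back into the source BD.
        routed = replace(frame, src_mac=irb_mac[bd], ttl=routed_ttl)
        actions.append(("local_acs", bd, routed))
    return actions
```

The remote copy in step 1 keeps the original MAC SA and TTL, while the locally routed copy in step 3 carries the PE's MAC and a decremented TTL.
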
| If a given locally attached segment is multi-homed, existing EVPN | If a given locally attached segment is multihomed, existing EVPN | |||
| procedures ensure that a packet is not sent by a given PE to that | procedures ensure that a packet is not sent by a given PE to that | |||
| segment unless the PE is the DF for that segment. Those procedures | segment unless the PE is the DF for that segment. Those procedures | |||
| also ensure that a packet is never sent by a PE to its segment of | also ensure that a packet is never sent by a PE to its segment of | |||
| origin. Thus EVPN segment multi-homing is fully supported; duplicate | origin. Thus, EVPN segment multihoming is fully supported; duplicate | |||
| delivery to a segment or looping on a segment are thereby prevented, | delivery to a segment or looping on a segment are thereby prevented | |||
| without the need for any new procedures to be defined in this | without the need for any new procedures to be defined in this | |||
| document. | document. | |||
| What if an IP multicast packet is received from outside the tenant | What if an IP multicast packet is received from outside the Tenant | |||
| domain? For instance, perhaps PE1's IP-VRF for a particular tenant | Domain? For instance, perhaps PE1's IP-VRF for a particular Tenant | |||
| domain also has a physical interface leading to an external switch, | Domain also has a physical interface leading to an external switch, | |||
| host, or router, and PE1 receives an IP multicast packet or frame on | host, or router and PE1 receives an IP multicast packet or frame on | |||
| that interface. Or perhaps the packet is from an L3VPN, or a | that interface, or perhaps the packet is from an L3VPN or a different | |||
| different EVPN Tenant Domain. | EVPN Tenant Domain. | |||
| Such a packet is first processed by the L3 routing instance, which | Such a packet is first processed by the L3 routing instance, which | |||
| decrements TTL and does any other necessary IP processing. Then the | decrements TTL and does any other necessary IP processing. Then, the | |||
| packet is sent into the Tenant Domain by sending it down the IRB | packet is sent into the Tenant Domain by sending it down the IRB | |||
| interface to the SBD of that Tenant Domain. This requires | interface to the SBD of that Tenant Domain. This requires | |||
| encapsulating the packet in an Ethernet header. The MAC SA field | encapsulating the packet in an Ethernet header. The MAC SA field | |||
| will contain the PE's own MAC on the SBD. | will contain the PE's own MAC on the SBD. | |||
| An IP multicast packet sent by the L3 routing instance down the IRB | An IP multicast packet sent by the L3 routing instance down the IRB | |||
| interface to the SBD is treated as if it had arrived from a local AC, | interface to the SBD is treated as if it had arrived from a local AC, | |||
| and steps 1-3 are applied. Note that the semantics of sending a | and steps 1-3 are applied. Note that the semantics of sending a | |||
| packet down the IRB interface to the SBD are thus slightly different | packet down the IRB interface to the SBD are thus slightly different | |||
| than the semantics of sending a packet down other IRB interfaces. IP | than the semantics of sending a packet down other IRB interfaces. IP | |||
| multicast packets sent down the SBD's IRB interface may be | multicast packets sent down the SBD's IRB interface may be | |||
| distributed to other PEs, but IP multicast packets sent down other | distributed to other PEs, but IP multicast packets sent down other | |||
| IRB interfaces are distributed only to local ACs. | IRB interfaces are distributed only to local ACs. | |||
| If a PE sends a link-local multicast packet down the SBD IRB | If a PE sends a link-local multicast packet down the SBD IRB | |||
| interface, that packet will be distributed (as an Ethernet frame) to | interface, that packet will be distributed (as an Ethernet frame) to | |||
| other PEs of the Tenant Domain, but will not appear on any of the | other PEs of the Tenant Domain but will not appear on any of the | |||
| actual BDs. | actual BDs. | |||
| 2.4. Use of IRB Interfaces at an Egress PE | 2.4. Use of IRB Interfaces at an Egress PE | |||
| Suppose an egress EVPN-PE receives an (S,G) multicast frame from the | Suppose an egress EVPN PE receives an (S,G) multicast frame from the | |||
| frame's ingress EVPN-PE. As described above, the packet will arrive | frame's ingress EVPN PE. As described above, the packet will arrive | |||
| as an Ethernet frame over a tunnel from the ingress PE, and the | as an Ethernet frame over a tunnel from the ingress PE, and the | |||
| tunnel encapsulation will identify the source BD of the Ethernet | tunnel encapsulation will identify the source BD of the Ethernet | |||
| frame. | frame. | |||
| We define the notion of the frame's "apparent source BD" as follows. | We define the notion of the frame's apparent source BD as follows. | |||
| If the egress PE is attached to the actual source BD, the actual | If the egress PE is attached to the actual source BD, the actual | |||
| source BD is the apparent source BD. If the egress PE is not | source BD is the apparent source BD. If the egress PE is not | |||
| attached to the actual source BD, the SBD is the apparent source BD. | attached to the actual source BD, the SBD is the apparent source BD. | |||
| The egress PE now takes the following steps: | The egress PE now takes the following steps: | |||
| 1. If the egress PE has ACs belonging to the apparent source BD of | 1. If the egress PE has ACs belonging to the apparent source BD of | |||
| the frame, it sends the frame unchanged to any ACs of that BD | the frame, it sends the frame unchanged to any ACs of that BD | |||
| that have interest in (S,G) packets. The MAC SA of the frame is | that have interest in (S,G) packets. The MAC SA of the frame is | |||
| not modified, and the IP header of the frame's payload is not | not modified, and the IP header of the frame's payload is not | |||
| modified in any way. | modified in any way. | |||
| 2. The frame is also sent to the L3 routing instance by being sent | 2. The frame is also sent to the L3 routing instance by being sent | |||
| up the IRB interface that attaches the L3 routing instance to the | up the IRB interface that attaches the L3 routing instance to the | |||
| apparent source BD. Steps 2 and 3 of Section 2.3 are then | apparent source BD. Steps 2 and 3 listed in Section 2.3 are then | |||
| applied. | applied. | |||
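
The definition of the apparent source BD and the two egress steps can be sketched as below. The function names and the action tuples are hypothetical and serve only as an illustration. The sketch relies on the fact, stated in Section 3.1, that the SBD has no ACs, so step 1 yields nothing when the apparent source BD is the SBD.

```python
from typing import Dict, List, Set


def apparent_source_bd(actual_bd: str, attached_bds: Set[str], sbd: str) -> str:
    """The actual source BD if the egress PE is attached to it, else the SBD."""
    return actual_bd if actual_bd in attached_bds else sbd


def egress_actions(actual_bd: str, attached_bds: Set[str], sbd: str,
                   interested_acs: Dict[str, List[str]]) -> List[tuple]:
    bd = apparent_source_bd(actual_bd, attached_bds, sbd)
    actions: List[tuple] = []

    # Step 1: send the frame unchanged (MAC SA and IP header untouched) to
    # the ACs of the apparent source BD that have interest in (S,G).
    for ac in interested_acs.get(bd, []):
        actions.append(("bridge_unchanged", bd, ac))

    # Step 2: send the frame up the IRB interface of the apparent source BD;
    # steps 2 and 3 of Section 2.3 then apply in the L3 routing instance.
    actions.append(("up_irb", bd))
    return actions
```
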
| 2.5. Announcing Interest in (S,G) | 2.5. Announcing Interest in (S,G) | |||
| [RFC9251] defines procedures used by an egress PE to announce its | [RFC9251] defines procedures used by an egress PE to announce its | |||
| interest in a multicast flow or set of flows. If an egress PE | interest in a multicast flow or set of flows. If an egress PE | |||
| determines it has LOCAL receivers in a particular BD, say BD1, that | determines it has LOCAL receivers in a particular BD, say BD1, that | |||
| are interested in a particular set of flows, it originates one or | are interested in a particular set of flows, it originates one or | |||
| more SMET routes for BD1. Each SMET route specifies a particular | more SMET routes for BD1. Each SMET route specifies a particular | |||
| (S,G) or (*,G) flow. By originating an SMET route for BD1, a PE is | (S,G) or (*,G) flow. By originating a SMET route for BD1, a PE is | |||
| announcing "I have receivers for (S,G) or (*,G) in BD1". Such an | announcing "I have receivers for (S,G) or (*,G) in BD1". Such a SMET | |||
| SMET route carries the Route Target (RT) for BD1, ensuring that it | route carries the RT for BD1, ensuring that it will be distributed to | |||
| will be distributed to all PEs that are attached to BD1. | all PEs that are attached to BD1. | |||
| The OISM procedures for originating SMET routes differ slightly from | The OISM procedures for originating SMET routes differ slightly from | |||
| those in [RFC9251]. In most cases, the SMET routes are considered to | those in [RFC9251]. In most cases, the SMET routes are considered to | |||
| be for the SBD, rather than for the BD containing local receivers. | be for the SBD rather than the BD containing local receivers. These | |||
| These SMET routes carry the SBD-RT, and do not carry any ordinary BD- | SMET routes carry the SBD-RT and do not carry any ordinary BD-RT. | |||
| RT. Details on the processing of SMET routes can be found in | Details on the processing of SMET routes can be found in Section 3.3. | |||
| Section 3.3. | ||||
| Since the SMET routes carry the SBD-RT, every ingress PE attached to | Since the SMET routes carry the SBD-RT, every ingress PE attached to | |||
| a particular Tenant Domain will learn of all other PEs (attached to | a particular Tenant Domain will learn of all other PEs (attached to | |||
| the same Tenant Domain) that have interest in a particular set of | the same Tenant Domain) that have interest in a particular set of | |||
| flows. Note that a PE that receives a given SMET route does not | flows. Note that a PE that receives a given SMET route does not | |||
| necessarily have any BDs (other than the SBD) in common with the PE | necessarily have any BDs (other than the SBD) in common with the PE | |||
| that originates that SMET route. | that originates that SMET route. | |||
| If all the sources and receivers for a given (*,G) are in the Tenant | If all the sources and receivers for a given (*,G) are in the Tenant | |||
| Domain, inter-subnet "Any Source Multicast" traffic will be properly | Domain, inter-subnet ASM traffic will be properly routed without | |||
| routed without requiring any Rendezvous Points, shared trees, or | requiring any RPs, shared trees, or other complex aspects of | |||
| other complex aspects of multicast routing infrastructure. Suppose, | multicast routing infrastructure. Suppose, for example, that: | |||
| for example, that: | ||||
| * PE1 has a local receiver, on BD1, for (*,G) | * PE1 has a local receiver, on BD1, for (*,G) and | |||
| * PE2 has a local source, on BD2, for (*,G). | * PE2 has a local source, on BD2, for (*,G). | |||
| PE1 will originate an SMET(*,G) route for the SBD, and PE2 will | PE1 will originate a SMET(*,G) route for the SBD, and PE2 will | |||
| receive that route, even if PE2 is not attached to BD1. PE2 will | receive that route, even if PE2 is not attached to BD1. PE2 will | |||
| thus know to forward (S,G) traffic to PE1. PE1 does not need to do | thus know to forward (S,G) traffic to PE1. PE1 does not need to do | |||
| any "source discovery". (This does assume that source S does not | any source discovery. (This does assume that source S does not send | |||
| send the same (S,G) datagram on two different BDs, and that the | the same (S,G) datagram on two different BDs and that the Tenant | |||
| Tenant Domain does not contain two or more sources with the same IP | Domain does not contain two or more sources with the same IP address | |||
| address S. The use of multicast sources that have IP "anycast" | S. The use of multicast sources that have IP anycast addresses is | |||
| addresses is outside the scope of this document.) | outside the scope of this document.) | |||
| If some PE attached to the Tenant Domain does not support [RFC9251], | If some PE attached to the Tenant Domain does not support [RFC9251], | |||
| it will be assumed to be interested in all flows. Whether a | it will be assumed to be interested in all flows. Whether a | |||
| particular remote PE supports [RFC9251] is determined by the presence | particular remote PE supports [RFC9251] or not is determined by the | |||
| of the Multicast Flags Extended Community in its IMET route; this is | presence of the Multicast Flags Extended Community in its IMET route; | |||
| specified in [RFC9251]. | this is specified in [RFC9251]. | |||
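
As an illustration of how an ingress PE could derive the set of egress PEs for a flow from the received SMET and IMET routes, consider the sketch below. The data structures and the `egress_pes` name are hypothetical. A (*,G) SMET route is modeled as a flow whose source is `None`, and `supports_rfc9251` stands in for the presence of the Multicast Flags Extended Community in a PE's IMET route.

```python
from typing import Dict, Optional, Set, Tuple

Flow = Tuple[Optional[str], str]  # (S, G); S is None for (*,G)


def egress_pes(source: str, group: str,
               smet_routes: Dict[str, Set[Flow]],
               supports_rfc9251: Dict[str, bool]) -> Set[str]:
    """Return the remote PEs that should receive (source, group) traffic."""
    result: Set[str] = set()
    for pe, supported in supports_rfc9251.items():
        if not supported:
            # A PE that does not support [RFC9251] is assumed to be
            # interested in all flows.
            result.add(pe)
            continue
        flows = smet_routes.get(pe, set())
        # Interest is expressed by an (S,G) or a (*,G) SMET route.
        if (source, group) in flows or (None, group) in flows:
            result.add(pe)
    return result
```

With this model, the example in the text corresponds to PE1 advertising `(None, G)`: PE2 then includes PE1 in the egress set for any source S sending to G, without any source discovery by PE1.
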
| 2.6. Tunneling Frames from Ingress PE to Egress PEs | 2.6. Tunneling Frames from Ingress PEs to Egress PEs | |||
| [RFC7432] specifies the procedures for setting up and using "BUM | [RFC7432] specifies the procedures for setting up and using BUM | |||
| tunnels". A BUM tunnel is a tunnel used to carry traffic on a | tunnels. A BUM tunnel is a tunnel used to carry traffic on a | |||
| particular BD if that traffic is (a) broadcast traffic, or (b) | particular BD if that traffic is (a) broadcast traffic, (b) unicast | |||
| unicast traffic with an unknown MAC DA, or (c) Ethernet multicast | traffic with an unknown Destination MAC Address, or (c) Ethernet | |||
| traffic. | multicast traffic. | |||
| This document allows the BUM tunnels to be used as the default | This document allows the BUM tunnels to be used as the default | |||
| tunnels for transmitting IP multicast frames. It also allows a | tunnels for transmitting IP multicast frames. It also allows a | |||
| separate set of tunnels to be used, instead of the BUM tunnels, as | separate set of tunnels to be used, instead of the BUM tunnels, as | |||
| the default tunnels for carrying IP multicast frames. Let's call | the default tunnels for carrying IP multicast frames. Let's call | |||
| these "IP Multicast Tunnels". | these "IP multicast tunnels". | |||
| When the tunneling is done via Ingress Replication or via BIER, this | When the tunneling is done via IR or via BIER, this difference is of | |||
| difference is of no significance. However, when P2MP tunnels are | no significance. However, when P2MP tunnels are used, there is a | |||
| used, there is a significant advantage to having separate IP | significant advantage to having separate IP multicast tunnels. | |||
| multicast tunnels. | ||||
| Other things being equal, it is desirable for an ingress PE to | It is desirable for an ingress PE to transmit a copy of a given (S,G) | |||
| transmit a copy of a given (S,G) multicast frame on only one P2MP | multicast frame on only one P2MP tunnel. All egress PEs interested | |||
| tunnel. All egress PEs interested in (S,G) packets then have to join | in (S,G) packets then have to join that tunnel. If the source BD and | |||
| that tunnel. If the source BD and PE for an (S,G) frame are BD1 and | PE for an (S,G) frame are BD1 and PE1, respectively, and if PE2 has | |||
| PE1 respectively, and if PE2 has receivers on BD2 for (S,G), then PE2 | receivers on BD2 for (S,G), then PE2 must join the P2MP Label | |||
| must join the P2MP LSP on which PE1 transmits the (S,G) frame. PE2 | Switched Path (LSP) on which PE1 transmits the (S,G) frame. PE2 must | |||
| must join this P2MP LSP even if PE2 is not attached to the source BD | join this P2MP LSP even if PE2 is not attached to the source BD, BD1. | |||
| (BD1). If PE1 were transmitting the multicast frame on its BD1 BUM | If PE1 was transmitting the multicast frame on its BD1 BUM tunnel, | |||
| tunnel, then PE2 would have to join the BD1 BUM tunnel, even though | then PE2 would have to join the BD1 BUM tunnel, even though PE2 has | |||
| PE2 has no BD1 attachment circuits. This would cause PE2 to pull all | no BD1 Attachment Circuits. This would cause PE2 to pull all the BUM | |||
| the BUM traffic from BD1, most of which it would just have to | traffic from BD1, most of which it would just have to discard. Thus, | |||
| discard. Thus it is RECOMMENDED that the default IP multicast | it is RECOMMENDED that the default IP multicast tunnels be distinct | |||
| tunnels be distinct from the BUM tunnels. | from the BUM tunnels. | |||
| Notwithstanding the above, link-local IP multicast traffic MUST | Notwithstanding the above, link-local IP multicast traffic MUST | |||
| always be carried on the BUM tunnels, and ONLY on the BUM tunnels. | always be carried on the BUM tunnels and ONLY on the BUM tunnels. | |||
| link-local IP multicast traffic consists of IPv4 traffic with a | Link-local IP multicast traffic consists of IPv4 traffic with a | |||
| destination address prefix of 224/24 and IPv6 traffic with a | destination address prefix of 224/24 and IPv6 traffic with a | |||
| destination address prefix of FF02/16. In this document, the terms | destination address prefix of FF02/16. In this document, the terms | |||
| "IP multicast packet" and "IP multicast frame" are defined in | "IP multicast packet" and "IP multicast frame" are defined in | |||
| Section 1.1 so as to exclude link-local traffic. | Section 1.1 so as to exclude link-local traffic. | |||
| Note that it is also possible to use "selective tunnels" to carry | Note that it is also possible to use selective tunnels to carry | |||
| particular multicast flows (see Section 3.2). When an (S,G) frame is | particular multicast flows (see Section 3.2). When an (S,G) frame is | |||
| transmitted on a selective tunnel, it is not transmitted on the BUM | transmitted on a selective tunnel, it is not transmitted on the BUM | |||
| tunnel or on the default IP Multicast tunnel. | tunnel or on the default IP multicast tunnel. | |||
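
The tunnel selection rules of this section can be sketched with the Python standard `ipaddress` module. The prefixes 224/24 and FF02/16 are written out as 224.0.0.0/24 and ff02::/16. The function names, the string labels for the tunnel kinds, and the `separate_ip_mcast_tunnels` flag are hypothetical and chosen only for this illustration.

```python
import ipaddress
from typing import Set, Tuple

IPV4_LINK_LOCAL = ipaddress.ip_network("224.0.0.0/24")
IPV6_LINK_LOCAL = ipaddress.ip_network("ff02::/16")


def is_link_local_multicast(dst: str) -> bool:
    """True if dst falls within the link-local multicast prefixes above."""
    addr = ipaddress.ip_address(dst)
    net = IPV4_LINK_LOCAL if addr.version == 4 else IPV6_LINK_LOCAL
    return addr in net


def select_tunnel(dst: str, flow: Tuple[str, str],
                  selective_flows: Set[Tuple[str, str]],
                  separate_ip_mcast_tunnels: bool = True) -> str:
    # Link-local IP multicast MUST be carried on the BUM tunnels only.
    if is_link_local_multicast(dst):
        return "bum"
    # A flow bound to a selective tunnel is sent on that tunnel and on
    # neither the BUM tunnel nor the default IP multicast tunnel.
    if flow in selective_flows:
        return "selective"
    # Otherwise use the default tunnel: a separate IP multicast tunnel
    # (RECOMMENDED) or the BUM tunnel.
    return "ip_multicast" if separate_ip_mcast_tunnels else "bum"
```
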
| 2.7. Advanced Scenarios | 2.7. Advanced Scenarios | |||
| There are some deployment scenarios that require special procedures: | There are some deployment scenarios that require special procedures: | |||
| 1. Some multicast sources or receivers are attached to PEs that | 1. Some multicast sources or receivers are attached to PEs that | |||
| support [RFC7432], but do not support this document or [RFC9135]. | support [RFC7432] but do not support this document or [RFC9135]. | |||
| To interoperate with these "non-OISM PEs", it is necessary to | To interoperate with these non-OISM PEs, it is necessary to have | |||
| have one or more gateway PEs that interface the tunnels discussed | one or more gateway PEs that interface the tunnels discussed in | |||
| in this document with the BUM tunnels of the legacy PEs. This is | this document with the BUM tunnels of the legacy PEs. This is | |||
| discussed in Section 5. | discussed in Section 5. | |||
| 2. Sometimes multicast traffic originates from outside the EVPN | 2. Sometimes multicast traffic originates from outside the EVPN | |||
| domain, or needs to be sent outside the EVPN domain. This is | domain or needs to be sent outside the EVPN domain. This is | |||
| discussed in Section 6. An important special case of this, | discussed in Section 6. An important special case of this, | |||
| integration with MVPN, is discussed in Section 6.1.2. | integration with MVPN, is discussed in Section 6.1.2. | |||
| 3. In some scenarios, one or more of the tenant systems is a PIM | 3. In some scenarios, one or more of the tenant systems is a PIM | |||
| router, and the Tenant Domain is used as a transit network that | router, and the Tenant Domain is used as a transit network that | |||
| is part of a larger multicast domain. This is discussed in | is part of a larger multicast domain. This is discussed in | |||
| Section 7. | Section 7. | |||
| 3. EVPN-aware Multicast Solution Control Plane | 3. EVPN-Aware Multicast Solution Control Plane | |||
| 3.1. Supplementary Broadcast Domain (SBD) and Route Targets | 3.1. Supplementary Broadcast Domain (SBD) and Route Targets | |||
| As discussed in Section 2.1, every Tenant Domain is associated with a | As discussed in Section 2.1, every Tenant Domain is associated with a | |||
| single Supplementary Broadcast Domain (SBD). Recall that a Tenant | single SBD. Recall that a Tenant Domain is defined to be a set of | |||
| Domain is defined to be a set of BDs that can freely send and receive | BDs that can freely send and receive IP multicast traffic to/from | |||
| IP multicast traffic to/from each other. If an EVPN-PE has one or | each other. If an EVPN PE has one or more ACs in a BD of a | |||
| more ACs in a BD of a particular Tenant Domain, and if the EVPN-PE | particular Tenant Domain, and if the EVPN PE supports the procedures | |||
| supports the procedures of this document, that EVPN-PE MUST be | of this document, that EVPN PE MUST be provisioned with the SBD of | |||
| provisioned with the SBD of that Tenant Domain. | that Tenant Domain. | |||
| At each EVPN-PE attached to a given Tenant Domain, there is an IRB | At each EVPN PE attached to a given Tenant Domain, there is an IRB | |||
| interface leading from the L3 routing instance of that Tenant Domain | interface leading from the L3 routing instance of that Tenant Domain | |||
| to the SBD. However, the SBD has no ACs. | to the SBD. However, the SBD has no ACs. | |||
| Each SBD is provisioned with a Route Target (RT). All the EVPN-PEs | Each SBD is provisioned with an RT. All the EVPN PEs supporting a | |||
| supporting a given SBD are provisioned with that RT as an import RT. | given SBD are provisioned with that RT as an import RT. That RT MUST | |||
| That RT MUST NOT be the same as the RT associated with any other BD. | NOT be the same as the RT associated with any other BD. | |||
| We will use the term "SBD-RT" to denote the RT that has been assigned | We will use the term "SBD-RT" to denote the RT that has been assigned | |||
| to the SBD. Routes carrying this RT will be propagated to all | to the SBD. Routes carrying this RT will be propagated to all EVPN | |||
| EVPN-PEs in the same Tenant Domain as the originator. | PEs in the same Tenant Domain as the originator. | |||
| Section 2.2 specifies the rules by which an EVPN-PE that receives a | Section 2.2 specifies the rules by which an EVPN PE that receives a | |||
| route determines whether a received route "belongs to" a particular | route determines whether a received route belongs to a particular | |||
| ordinary BD or SBD. | ordinary BD or SBD. | |||
| Section 2.2 also specifies additional rules that must be followed | Section 2.2 also specifies additional rules that must be followed | |||
| when constructing routes that belong to a particular BD, including | when constructing routes that belong to a particular BD, including | |||
| the SBD. | the SBD. | |||
| The SBD SHOULD be in an EVPN Instance (EVI) of its own. Even if the | The SBD SHOULD be in an EVI of its own. Even if the SBD is not in an | |||
| SBD is not in an EVI of its own, the SBD-RT MUST be different than | EVI of its own, the SBD-RT MUST be different than the RT associated | |||
| the RT associated with any other BD. This restriction is necessary | with any other BD. This restriction is necessary in order for the | |||
| in order for the rules of Sections 2.2 and 3.1 to work correctly. | rules of Sections 2.2 and 3.1 to work correctly. | |||
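
A provisioning system could check the SBD-RT restriction before applying a configuration. The following is a minimal sketch under the assumption that the provisioned RTs are available as a mapping from BD name to RT; the function name and the data layout are hypothetical.

```python
from typing import Dict, List


def bds_conflicting_with_sbd_rt(sbd_rt: str, bd_rts: Dict[str, str]) -> List[str]:
    """Return the ordinary BDs whose RT equals the SBD-RT (should be empty)."""
    return sorted(bd for bd, rt in bd_rts.items() if rt == sbd_rt)
```

An empty result means the SBD-RT is distinct from the RT of every other BD, as required.
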
| Note that an SBD, just like any other BD, is associated on each | Note that an SBD, just like any other BD, is associated on each EVPN | |||
| EVPN-PE with a MAC-VRF. Per [RFC7432], each MAC-VRF is associated | PE with a MAC-VRF. Per [RFC7432], each MAC-VRF is associated with a | |||
| with a Route Distinguisher (RD). When constructing a route that is | Route Distinguisher (RD). When constructing a route that is for an | |||
| "for" an SBD, an EVPN-PE will place the RD of the associated MAC-VRF | SBD, an EVPN PE will place the RD of the associated MAC-VRF in the | |||
| in the "Route Distinguisher" field of the NLRI. (If the Tenant | Route Distinguisher field of the NLRI. (If the Tenant Domain has | |||
| Domain has several MAC-VRFs on a given PE, the EVPN-PE has a choice | several MAC-VRFs on a given PE, the EVPN PE has a choice of which RD | |||
| of which RD to use.) | to use.) | |||
| If Assisted Replication (AR, see [I-D.ietf-bess-evpn-optimized-ir]) | If AR [RFC9574] is used, each AR-REPLICATOR for a given Tenant Domain | |||
| is used, each AR-REPLICATOR for a given Tenant Domain must be | must be provisioned with the SBD of that Tenant Domain, even if the | |||
| provisioned with the SBD of that Tenant Domain, even if the | AR-REPLICATOR does not have any L3 routing instances. | |||
| AR-REPLICATOR does not have any L3 routing instance. | ||||
| 3.2. Advertising the Tunnels Used for IP Multicast | 3.2. Advertising the Tunnels Used for IP Multicast | |||
| The procedures used for advertising the tunnels that carry IP | The procedures used for advertising the tunnels that carry IP | |||
| multicast traffic depend upon the type of tunnel being used. If the | multicast traffic depend upon the type of tunnel being used. If the | |||
| tunnel type is neither Ingress Replication, Assisted Replication, nor | tunnel type is neither IR, AR, nor BIER, there are procedures for | |||
| BIER, there are procedures for advertising both "inclusive tunnels" | advertising both inclusive tunnels and selective tunnels. | |||
| and "selective tunnels". | ||||
| When IR, AR or BIER are used to transmit IP multicast packets across | When IR, AR, or BIER are used to transmit IP multicast packets across | |||
| the core, there are no P2MP tunnels. Once an ingress EVPN-PE | the core, there are no P2MP tunnels. Once an ingress EVPN PE | |||
| determines the set of egress EVPN-PEs for a given flow, the IMET | determines the set of egress EVPN PEs for a given flow, the IMET | |||
| routes contain all the information needed to transport packets of | routes contain all the information needed to transport packets of | |||
| that flow to the egress PEs. | that flow to the egress PEs. | |||
| If AR is used, the ingress EVPN-PE is also an AR-LEAF and the IMET | If AR is used, the ingress EVPN PE is also an AR-LEAF, and the IMET | |||
| route coming from the selected AR-REPLICATOR contains the information | route coming from the selected AR-REPLICATOR contains the information | |||
| needed. The AR-REPLICATOR will behave as an ingress EVPN-PE when | needed. The AR-REPLICATOR will behave as an ingress EVPN PE when | |||
| sending a flow to the egress EVPN-PEs. | sending a flow to the egress EVPN PEs. | |||
| If the tunneling technique requires P2MP tunnels to be set up (e.g., | If the tunneling technique requires P2MP tunnels to be set up (e.g., | |||
| RSVP-TE P2MP, mLDP, PIM), some of the tunnels may be selective | RSVP-TE P2MP, Multipoint LDP (mLDP), or PIM), some of the tunnels may | |||
| tunnels and some may be inclusive tunnels. | be selective tunnels and some may be inclusive tunnels. | |||
| Selective P2MP tunnels are always advertised by the ingress PE using | Selective P2MP tunnels are always advertised by the ingress PE using | |||
| S-PMSI A-D routes [I-D.ietf-bess-evpn-bum-procedure-updates]. | S-PMSI Auto-Discovery (A-D) routes [RFC9572]. | |||
| For inclusive tunnels, there is a choice between using a BD's | For inclusive tunnels, there is a choice between using a BD's | |||
| ordinary "BUM tunnel" [RFC7432] as the default inclusive tunnel for | ordinary BUM tunnel as the default inclusive tunnel for carrying IP | |||
| carrying IP multicast traffic, or using a separate IP multicast | multicast traffic or using a separate IP multicast tunnel as the | |||
| tunnel as the default inclusive tunnel for carrying IP multicast. In | default inclusive tunnel for carrying IP multicast. In the former | |||
| the former case, the inclusive tunnel is advertised in an IMET route. | case, the inclusive tunnel is advertised in an IMET route. In the | |||
| In the latter case, the inclusive tunnel is advertised in a (C-*,C-*) | latter case, the inclusive tunnel is advertised in a (C-*,C-*) S-PMSI | |||
| S-PMSI A-D route [I-D.ietf-bess-evpn-bum-procedure-updates]. Details | A-D route [RFC9572]. Details may be found in subsequent sections. | |||
| may be found in subsequent sections. | ||||
| 3.2.1. Constructing Routes for the SBD | 3.2.1. Constructing Routes for the SBD | |||
| There are situations in which an EVPN-PE needs to originate IMET, | There are situations in which an EVPN PE needs to originate IMET, | |||
| SMET, and/or SPMSI routes for the SBD. Throughout this document, we | SMET, and/or S-PMSI routes for the SBD. Throughout this document, we | |||
| will refer to such routes respectively as "SBD-IMET routes", | will refer to such routes respectively as "SBD-IMET routes", "SBD- | |||
| "SBD-SMET routes", and "SBD-SPMSI routes". Subsequent sections | SMET routes", and "SBD-SPMSI routes". Subsequent sections detail the | |||
| detail the conditions under which these routes need to be originated. | conditions under which these routes need to be originated. | |||
| When an EVPN-PE needs to originate an SBD-IMET, SBD-SMET, or | When an EVPN PE needs to originate an SBD-IMET, SBD-SMET, or SBD- | |||
| SBD-SPMSI route, it constructs the route as follows: | SPMSI route, it constructs the route as follows: | |||
| * the RD field of the route's NLRI is set to the RD of the MAC-VRF | * The RD field of the route's NLRI is set to the RD of the MAC-VRF | |||
| that is associated with the SBD; | that is associated with the SBD. | |||
| * the SBD-RT is attached to the route; | * The SBD-RT is attached to the route. | |||
| * the "Tag ID" field of the route's NLRI is set to the Tag ID that | * The Tag ID field of the route's NLRI is set to the Tag ID that has | |||
| has been assigned to the SBD. This is most likely 0 if a | been assigned to the SBD. This is most likely 0 if a VLAN-based | |||
| VLAN-based or VLAN-bundle service is being used, but non-zero if a | or VLAN-bundle service is being used but non-zero if a VLAN-aware | |||
| VLAN-aware bundle service is being used. | bundle service is being used. | |||
| 3.2.2. Ingress Replication | 3.2.2. Ingress Replication | |||
| When Ingress Replication (IR) is used to transport IP multicast | When IR is used to transport IP multicast frames of a given Tenant | |||
| frames of a given Tenant Domain, each EVPN-PE attached to that Tenant | Domain, each EVPN PE attached to that Tenant Domain MUST originate an | |||
| Domain MUST originate an SBD-IMET route (see Section 3.2.1). | SBD-IMET route (see Section 3.2.1). | |||
| The SBD-IMET route MUST carry a PMSI Tunnel attribute (PTA), and the | The SBD-IMET route MUST carry a PTA, and the MPLS Label field of the | |||
| MPLS label field of the PTA MUST specify a downstream-assigned MPLS | PTA MUST specify a downstream-assigned MPLS label that maps uniquely | |||
| label that maps uniquely (in the context of the originating EVPN-PE) | (in the context of the originating EVPN PE) to the SBD. | |||
| to the SBD. | ||||
| Following the procedures of [RFC7432], an EVPN-PE MUST also originate | Following the procedures of [RFC7432], an EVPN PE MUST also originate | |||
| an IMET route for each BD to which it is attached. Each of these | an IMET route for each BD to which it is attached. Each of these | |||
| IMET routes carries a PTA specifying a downstream-assigned label that | IMET routes carries a PTA specifying a downstream-assigned label that | |||
| maps uniquely, in the context of the originating EVPN-PE, to the BD | maps uniquely, in the context of the originating EVPN PE, to the BD | |||
| in question. These IMET routes need not carry the SBD-RT. | in question. These IMET routes need not carry the SBD-RT. | |||
| When an ingress EVPN-PE needs to use IR to send an IP multicast frame | When an ingress EVPN PE needs to use IR to send an IP multicast frame | |||
| from a particular source BD to an egress EVPN-PE, the ingress PE | from a particular source BD to an egress EVPN PE, the ingress PE | |||
| determines whether the egress PE has originated an IMET route for | determines whether or not the egress PE has originated an IMET route | |||
| that BD. If so, that IMET route contains the MPLS label that the | for that BD. If so, that IMET route contains the MPLS label that the | |||
| egress PE has assigned to the source BD. The ingress PE uses that | egress PE has assigned to the source BD. The ingress PE uses that | |||
| label when transmitting the packet to the egress PE. Otherwise, the | label when transmitting the packet to the egress PE. Otherwise, the | |||
| ingress PE uses the label that the egress PE has assigned to the SBD | ingress PE uses the label that the egress PE has assigned to the SBD | |||
| (in the SBD-IMET route originated by the egress). | (in the SBD-IMET route originated by the egress). | |||
| Note that the set of IMET routes originated by a given egress PE, and | Note that the set of IMET routes originated by a given egress PE, and | |||
| installed by a given ingress PE, may change over time. If the egress | installed by a given ingress PE, may change over time. If the egress | |||
| PE withdraws its IMET route for the source BD, the ingress PE MUST | PE withdraws its IMET route for the source BD, the ingress PE MUST | |||
| stop using the label carried in that IMET route, and instead MUST use | stop using the label carried in that IMET route and instead MUST use | |||
| the label carried in the SBD-IMET route from that egress PE. | the label carried in the SBD-IMET route from that egress PE. | |||
| Implementors must also take into account that an IMET route from a | Implementors must also take into account that an IMET route from a | |||
| particular PE for a particular BD may arrive after that PE's SBD-IMET | particular PE for a particular BD may arrive after that PE's SBD-IMET | |||
| route. | route. | |||
| 3.2.3. Assisted Replication | 3.2.3. Assisted Replication | |||
| When Assisted Replication is used to transport IP multicast frames of | When AR is used to transport IP multicast frames of a given Tenant | |||
| a given Tenant Domain, each EVPN-PE (including the AR-REPLICATOR) | Domain, each EVPN PE (including the AR-REPLICATOR) attached to the | |||
| attached to the Tenant Domain MUST originate an SBD-IMET route (see | Tenant Domain MUST originate an SBD-IMET route (see Section 3.2.1). | |||
| Section 3.2.1). | ||||
| An AR-REPLICATOR attached to a given Tenant Domain is considered to | An AR-REPLICATOR attached to a given Tenant Domain is considered to | |||
| be an EVPN-PE of that Tenant Domain. It is attached to all the BDs | be an EVPN PE of that Tenant Domain. It is attached to all the BDs | |||
| in the Tenant Domain, but it does not necessarily have L3 routing | in the Tenant Domain, but it does not necessarily have L3 routing | |||
| instances. | instances. | |||
| As with Ingress Replication, the SBD-IMET route carries a PTA where | As with IR, the SBD-IMET route carries a PTA where the MPLS Label | |||
| the MPLS label field specifies the downstream-assigned MPLS label | field specifies the downstream-assigned MPLS label that identifies | |||
| that identifies the SBD. However, the AR-REPLICATOR and AR-LEAF | the SBD. However, the AR-REPLICATOR and AR-LEAF EVPN PEs will set | |||
| EVPN-PEs will set the PTA's flags differently, as per | the PTA's flags differently, as per [RFC9574]. | |||
| [I-D.ietf-bess-evpn-optimized-ir]. | ||||
| In addition, each EVPN-PE originates an IMET route for each BD to | In addition, each EVPN PE originates an IMET route for each BD to | |||
| which it is attached. As in the case of Ingress Replication, these | which it is attached. As in the case of IR, these routes carry the | |||
| routes carry the downstream-assigned MPLS labels that identify the | downstream-assigned MPLS labels that identify the BDs and do not | |||
| BDs and do not carry the SBD-RT. | carry the SBD-RT. | |||
| When an ingress EVPN-PE, acting as AR-LEAF, needs to send an IP | When an ingress EVPN PE, acting as AR-LEAF, needs to send an IP | |||
| multicast frame from a particular source BD to an egress EVPN-PE, the | multicast frame from a particular source BD to an egress EVPN PE, the | |||
| ingress PE determines whether there is any AR-REPLICATOR that | ingress PE determines whether or not there is any AR-REPLICATOR that | |||
| originated an IMET route for that BD. After the AR-REPLICATOR | originated an IMET route for that BD. After the AR-REPLICATOR | |||
| selection (if there is more than one), the AR-LEAF uses the label | selection (if there is more than one), the AR-LEAF uses the label | |||
| contained in the IMET route of the AR-REPLICATOR when transmitting | contained in the IMET route of the AR-REPLICATOR when transmitting | |||
| packets to it. The AR-REPLICATOR receives the packet and, based on | packets to it. The AR-REPLICATOR receives the packet and, based on | |||
| the procedures specified in [I-D.ietf-bess-evpn-optimized-ir] and in | the procedures specified in [RFC9574] and in Section 3.2.2 of this | |||
| Section 3.2.2 of this document, transmits the packets to the egress | document, transmits the packets to the egress EVPN PEs using the | |||
| EVPN-PEs using the labels contained in the received IMET routes for | labels contained in the received IMET routes for either the source BD | |||
| either the source BD or the SBD. | or the SBD. | |||
| If an ingress AR-LEAF for a given BD has not received any IMET route | If an ingress AR-LEAF for a given BD has not received any IMET route | |||
| for that BD from an AR-REPLICATOR, the ingress AR-LEAF follows the | for that BD from an AR-REPLICATOR, the ingress AR-LEAF follows the | |||
| procedures in Section 3.2.2. | procedures in Section 3.2.2. | |||
| 3.2.3.1. Automatic SBD Matching | 3.2.3.1. Automatic SBD Matching | |||
| Each PE needs to know a BD's corresponding SBD. Configuring that | Each PE needs to know a BD's corresponding SBD. Configuring that | |||
| information in each BD is one way but it requires repetitive | information in each BD is one way, but it requires repetitive | |||
| configuration and consistency checking (to make sure that all the BDs | configuration and consistency checking (to make sure that all the BDs | |||
| of the same tenant are configured with the same SBD). A better way | of the same tenant are configured with the same SBD). A better way | |||
| is to configure the SBD info in the L3 routing instance so that all | is to configure the SBD info in the L3 routing instance so that all | |||
| related BDs will derive the SBD information. | related BDs will derive the SBD information. | |||
| An AR-replicator also needs to know same information, though it does | An AR-REPLICATOR also needs to know the same information, though it | |||
| not necessarily have an L3 routing instance. However, from the EVI- | does not necessarily have an L3 routing instance. However, from the | |||
| RT EC in a BD's IMET route, an AR-replicator can derive the | EVI-RT EC in a BD's IMET route, an AR-REPLICATOR can derive the | |||
| corresponding SBD of that BD without any configuration. | corresponding SBD of that BD without any configuration. | |||
| 3.2.4. BIER | 3.2.4. BIER | |||
| When BIER is used to transport multicast packets of a given Tenant | When BIER is used to transport multicast packets of a given Tenant | |||
| Domain, and a given EVPN-PE attached to that Tenant Domain is a | Domain, and a given EVPN PE attached to that Tenant Domain is a | |||
| possible ingress EVPN-PE for traffic originating outside that Tenant | possible ingress EVPN PE for traffic originating outside that Tenant | |||
| Domain, the given EVPN-PE MUST originate an SBD-IMET route, (see | Domain, the given EVPN PE MUST originate an SBD-IMET route (see | |||
| Section 3.2.1). | Section 3.2.1). | |||
| In addition, IMET routes that are originated for other BDs in the | In addition, IMET routes that are originated for other BDs in the | |||
| Tenant Domain MUST carry the SBD-RT. | Tenant Domain MUST carry the SBD-RT. | |||
| Each IMET route (including but not limited to the SBD-IMET route) | Each IMET route (including but not limited to the SBD-IMET route) | |||
| MUST carry a PMSI Tunnel attribute (PTA). The MPLS label field of | MUST carry a PTA. The MPLS Label field of the PTA MUST specify an | |||
| the PTA MUST specify an upstream-assigned MPLS label that maps | upstream-assigned MPLS label that maps uniquely (in the context of | |||
| uniquely (in the context of the originating EVPN-PE) to the BD for | the originating EVPN PE) to the BD for which the route is originated. | |||
| which the route is originated. | ||||
| Suppose an ingress EVPN-PE, PE1, needs to use BIER to tunnel an IP | Suppose an ingress EVPN PE, say PE1, needs to use BIER to tunnel an | |||
| multicast frame to a set of egress EVPN-PEs. And suppose the frame's | IP multicast frame to a set of egress EVPN PEs. And suppose the | |||
| source BD is BD1. The frame is encapsulated as follows: | frame's source BD is BD1. The frame is encapsulated as follows: | |||
| * A four-octet MPLS label stack entry [RFC3032] is prepended to the | * A four-octet MPLS label stack entry [RFC3032] is prepended to the | |||
| frame. The Label field is set to the upstream-assigned label that | frame. The Label field is set to the upstream-assigned label that | |||
| PE1 has assigned to BD1. | PE1 has assigned to BD1. | |||
| * The resulting MPLS packet is then encapsulated in a BIER | * The resulting MPLS packet is then encapsulated in a BIER | |||
| encapsulation [RFC8296], [I-D.ietf-bier-evpn]. The BIER BitString | encapsulation [RFC8296] [RFC9624]. The BIER BitString is set to | |||
| is set to identify the egress EVPN-PEs. The BIER "proto" field is | identify the egress EVPN PEs. The BIER Proto field is set to the | |||
| set to the value for "MPLS packet with upstream-assigned label at | value for "MPLS packet with an upstream-assigned label at top of | |||
| top of stack". | the stack". | |||
| Note: It is possible that the packet being tunneled from PE1 | Note: It is possible that the packet being tunneled from PE1 | |||
| originated outside the Tenant Domain. In this case, the actual | originated outside the Tenant Domain. In this case, the actual | |||
| source BD (BD1) is considered to be the SBD, and the | source BD, BD1, is considered to be the SBD, and the upstream- | |||
| upstream-assigned label it carries will be the label that PE1 | assigned label it carries will be the label that PE1 assigned to the | |||
| assigned to the SBD, and advertised in its SBD-IMET route. | SBD and advertised in its SBD-IMET route. | |||
| Suppose an egress PE, PE2, receives such a BIER packet. The BFIR-id | Suppose an egress PE, say PE2, receives such a BIER packet. The | |||
| field of the BIER header allows PE2 to determine that the ingress PE | BFIR-id field of the BIER header allows PE2 to determine that the | |||
| is PE1. There are then two cases to consider: | ingress PE is PE1. There are then two cases to consider: | |||
| 1. PE2 has received and installed an IMET route for BD1 from PE1. | 1. PE2 has received and installed an IMET route for BD1 from PE1. | |||
| In this case, the BIER packet will be carrying the | In this case, the BIER packet will be carrying the upstream- | |||
| upstream-assigned label that is specified in the PTA of that IMET | assigned label that is specified in the PTA of that IMET route. | |||
| route. This enables PE2 to determine the "apparent source BD" | This enables PE2 to determine the apparent source BD (as defined | |||
| (as defined in Section 2.4). | in Section 2.4). | |||
| 2. PE2 has not received and installed an IMET route for BD1 from | 2. PE2 has not received and installed an IMET route for BD1 from | |||
| PE1. | PE1. | |||
| In this case, PE2 will not recognize the upstream-assigned label | In this case, PE2 will not recognize the upstream-assigned label | |||
| carried in the BIER packet. PE2 MUST discard the packet. | carried in the BIER packet. PE2 MUST discard the packet. | |||
| Further details on the use of BIER to support EVPN can be found in | Further details on the use of BIER to support EVPN can be found in | |||
| [I-D.ietf-bier-evpn]. | [RFC9624]. | |||
| 3.2.5. Inclusive P2MP Tunnels | 3.2.5. Inclusive P2MP Tunnels | |||
| 3.2.5.1. Using the BUM Tunnels as IP Multicast Inclusive Tunnels | 3.2.5.1. Using the BUM Tunnels as IP Multicast Inclusive Tunnels | |||
| The procedures in this section apply only when | The procedures in this section apply only when: | |||
| (a) it is desired to use the BUM tunnels to carry IP multicast | a) it is desired to use the BUM tunnels to carry IP multicast | |||
| traffic across the backbone, and | traffic across the backbone and | |||
| (b) the BUM tunnels are P2MP tunnels (i.e., neither IR, AR, nor BIER | b) the BUM tunnels are P2MP tunnels (i.e., neither IR, AR, nor BIER | |||
| are being used to transport the BUM traffic). | are being used to transport the BUM traffic). | |||
| In this case, an IP multicast frame (whether inter-subnet or | In this case, an IP multicast frame (whether inter-subnet or intra- | |||
| intra-subnet) will be carried across the backbone in the BUM tunnel | subnet) will be carried across the backbone in the BUM tunnel | |||
| belonging to its source BD. Each EVPN-PE attached to a given Tenant | belonging to its source BD. Each EVPN PE attached to a given Tenant | |||
| Domain needs to join the BUM tunnels for every BD in the Tenant | Domain needs to join the BUM tunnels for every BD in the Tenant | |||
| Domain, even those BDs to which the EVPN-PE is not locally attached. | Domain, even those BDs to which the EVPN PE is not locally attached. | |||
| This ensures that an IP multicast packet from any source BD can reach | This ensures that an IP multicast packet from any source BD can reach | |||
| all PEs attached to the Tenant Domain. | all PEs attached to the Tenant Domain. | |||
| Note that this will cause all the BUM traffic from a given BD in a | Note that this will cause all the BUM traffic from a given BD in a | |||
| Tenant Domain to be sent to all PEs that attach to that Tenant | Tenant Domain to be sent to all PEs that attach to that Tenant | |||
| Domain, even the PEs that don't attach to the given BD. To avoid | Domain, even the PEs that don't attach to the given BD. To avoid | |||
| this, it is RECOMMENDED that the BUM tunnels not be used as IP | this, it is RECOMMENDED that the BUM tunnels not be used as IP | |||
| Multicast inclusive tunnels, and that the procedures of | multicast inclusive tunnels and that the procedures of | |||
| Section 3.2.5.2 be used instead. | Section 3.2.5.2 be used instead. | |||
| If a PE is a possible ingress EVPN-PE for traffic originating outside | If a PE is a possible ingress EVPN PE for traffic originating outside | |||
| the Tenant Domain, the PE MUST originate an SBD-IMET route (see | the Tenant Domain, the PE MUST originate an SBD-IMET route (see | |||
| Section 3.2.1). This route MUST carry a PTA specifying the P2MP | Section 3.2.1). This route MUST carry a PTA specifying the P2MP | |||
| tunnel used for transmitting IP multicast packets that originate | tunnel used for transmitting IP multicast packets that originate | |||
| outside the tenant domain. All EVPN-PEs of the Tenant Domain MUST | outside the Tenant Domain. All EVPN PEs of the Tenant Domain MUST | |||
| join the tunnel specified in the PTA of an SBD-IMET route: | join the tunnel specified in the PTA of an SBD-IMET route: | |||
| * If the tunnel is an RSVP-TE P2MP tunnel, the originator of the | * If the tunnel is an RSVP-TE P2MP tunnel, the originator of the | |||
| route MUST use RSVP-TE P2MP procedures to add each PE of the | route MUST use RSVP-TE P2MP procedures to add each PE of the | |||
| Tenant Domain to the tunnel, even PEs that have not originated an | Tenant Domain to the tunnel, even PEs that have not originated an | |||
| SBD-IMET route. | SBD-IMET route. | |||
| * If the tunnel is an mLDP or PIM tunnel, each PE importing the | * If the tunnel is an mLDP or PIM tunnel, each PE importing the SBD- | |||
| SBD-IMET route MUST add itself to the tunnel, using mLDP or PIM | IMET route MUST add itself to the tunnel, using mLDP or PIM | |||
| procedures, respectively. | procedures, respectively. | |||
| Whether or not a PE originates an SBD-IMET route, it will of course | Whether or not a PE originates an SBD-IMET route, it will of course | |||
| originate an IMET route for each BD to which it is attached. Each of | originate an IMET route for each BD to which it is attached. Each of | |||
| these IMET routes MUST carry the SBD-RT, as well as the RT for the BD | these IMET routes MUST carry the SBD-RT, as well as the RT for the BD | |||
| to which it belongs. | to which it belongs. | |||
| If a received IMET route is not the SBD-IMET route, it will also be | If a received IMET route is not the SBD-IMET route, it will also be | |||
| carrying the RT for its source BD. The route's NLRI will carry the | carrying the RT for its source BD. The route's NLRI will carry the | |||
| Tag ID for the source BD. From the RT and the Tag ID, any PE | Tag ID for the source BD. From the RT and the Tag ID, any PE | |||
| receiving the route can determine the route's source BD. | receiving the route can determine the route's source BD. | |||
| If the MPLS label field of the PTA contains zero, the specified P2MP | If the MPLS Label field of the PTA contains zero, the specified P2MP | |||
| tunnel is used only to carry frames of a single source BD. | tunnel is used only to carry frames of a single source BD. | |||
| If the MPLS label field of the PTA does not contain zero, it MUST | If the MPLS Label field of the PTA does not contain zero, it MUST | |||
| contain an upstream-assigned MPLS label that maps uniquely (in the | contain an upstream-assigned MPLS label that maps uniquely (in the | |||
| context of the originating EVPN-PE) to the source BD (or, in the case | context of the originating EVPN PE) to the source BD (or in the case | |||
| of an SBD-IMET route, to the SBD). The tunnel may then be used to | of an SBD-IMET route, to the SBD). The tunnel may then be used to | |||
| carry frames of multiple source BDs. The apparent source BD of a | carry frames of multiple source BDs. The apparent source BD of a | |||
| particular packet is inferred from the label carried by the packet. | particular packet is inferred from the label carried by the packet. | |||
| IP multicast traffic originating outside the Tenant Domain is | IP multicast traffic originating outside the Tenant Domain is | |||
| transmitted with the label corresponding to the SBD, as specified in | transmitted with the label corresponding to the SBD, as specified in | |||
| the ingress EVPN-PE's SBD-IMET route. | the ingress EVPN PE's SBD-IMET route. | |||
| 3.2.5.2. Using Wildcard S-PMSI A-D Routes to Advertise Inclusive | 3.2.5.2. Using Wildcard S-PMSI A-D Routes to Advertise Inclusive | |||
| Tunnels Specific to IP Multicast | Tunnels Specific to IP Multicast | |||
| The procedures of this section apply when (and only when) it is | The procedures of this section apply when (and only when) it is | |||
| desired to transmit IP multicast traffic on an inclusive tunnel, but | desired to transmit IP multicast traffic on an inclusive tunnel but | |||
| not on the same tunnel used to transmit BUM traffic. | not on the same tunnel used to transmit BUM traffic. | |||
| However, these procedures do NOT apply when the tunnel type is | However, these procedures do NOT apply when the tunnel type is IR or | |||
| Ingress Replication or BIER, EXCEPT in the case where it is necessary | BIER, EXCEPT in the case where it is necessary to interwork between | |||
| to interwork between non-OISM PEs and OISM PEs, as specified in | non-OISM PEs and OISM PEs, as specified in Section 5. | |||
| Section 5. | ||||
| Each EVPN-PE attached to the given Tenant Domain MUST originate an | Each EVPN PE attached to the given Tenant Domain MUST originate an | |||
| SBD-SPMSI A-D route. The NLRI of that route MUST contain (C-*,C-*) | SBD-SPMSI A-D route. The NLRI of that route MUST contain (C-*,C-*) | |||
| (see [RFC6625]). Additional rules for constructing that route are | (see [RFC6625]). Additional rules for constructing that route are | |||
| given in Section 3.2.1. | given in Section 3.2.1. | |||
| In addition, an EVPN-PE MUST originate an S-PMSI A-D route containing | In addition, an EVPN PE MUST originate an S-PMSI A-D route containing | |||
| (C-*,C-*) in its NLRI for each of the other BDs, in the given Tenant | (C-*,C-*) in its NLRI for each of the other BDs, in the given Tenant | |||
| Domain, to which it is attached. All such routes MUST carry the | Domain, to which it is attached. All such routes MUST carry the SBD- | |||
| SBD-RT. This ensures that those routes are imported by all EVPN-PEs | RT. This ensures that those routes are imported by all EVPN PEs | |||
| attached to the Tenant Domain. | attached to the Tenant Domain. | |||
| A PE receiving these routes follows the procedures of Section 2.2 to | A PE receiving these routes follows the procedures of Section 2.2 to | |||
| determine which BD the route is for. | determine which BD the route is for. | |||
| If the MPLS label field of the PTA contains zero, the specified | If the MPLS Label field of the PTA contains zero, the specified | |||
| tunnel is used only to carry frames of a single source BD. | tunnel is used only to carry frames of a single source BD. | |||
| If the MPLS label field of the PTA does not contain zero, it MUST | If the MPLS Label field of the PTA does not contain zero, it MUST | |||
| specify an upstream-assigned MPLS label that maps uniquely (in the | specify an upstream-assigned MPLS label that maps uniquely (in the | |||
| context of the originating EVPN-PE) to the source BD. The tunnel may | context of the originating EVPN PE) to the source BD. The tunnel may | |||
| be used to carry frames of multiple source BDs, and the apparent | be used to carry frames of multiple source BDs, and the apparent | |||
| source BD for a particular packet is inferred from the label carried | source BD for a particular packet is inferred from the label carried | |||
| by the packet. | by the packet. | |||
| The EVPN-PE advertising these S-PMSI A-D route routes is specifying | The EVPN PE advertising these S-PMSI A-D routes is specifying the | |||
| the default tunnel that it will use (as ingress PE) for transmitting | default tunnel that it will use (as ingress PE) for transmitting IP | |||
| IP multicast packets. The upstream-assigned label allows an egress | multicast packets. The upstream-assigned label allows an egress PE | |||
| PE to determine the apparent source BD of a given packet. | to determine the apparent source BD of a given packet. | |||
| 3.2.6. Selective Tunnels | 3.2.6. Selective Tunnels | |||
| An ingress EVPN-PE for a given multicast flow or set of flows can | An ingress EVPN PE for a given multicast flow or set of flows can | |||
| always assign the flow to a particular P2MP tunnel by originating an | always assign the flow to a particular P2MP tunnel by originating an | |||
| S-PMSI A-D route whose NLRI identifies the flow or set of flows. The | S-PMSI A-D route whose NLRI identifies the flow or set of flows. The | |||
| NLRI of the route could be (C-*,C-G), or (C-S,C-G). The S-PMSI A-D | NLRI of the route could be (C-*,C-G) or (C-S,C-G). The S-PMSI A-D | |||
| route MUST carry the SBD-RT, so that it is imported by all EVPN-PEs | route MUST carry the SBD-RT so that it is imported by all EVPN PEs | |||
| attached to the Tenant Domain. | attached to the Tenant Domain. | |||
| An S-PMSI A-D route is "for" a particular source BD. It MUST carry | An S-PMSI A-D route is for a particular source BD. It MUST carry the | |||
| the RT associated with that BD, and it MUST have the Tag ID for that | RT associated with that BD, and it MUST have the Tag ID for that BD | |||
| BD in its NLRI. | in its NLRI. | |||
| When an EVPN-PE imports an S-PMSI A-D route, it applies the rules of | When an EVPN PE imports an S-PMSI A-D route, it applies the rules of | |||
| Section 2.2 to associate the route with a particular BD. | Section 2.2 to associate the route with a particular BD. | |||
| Each such route MUST contain a PTA, as specified in Section 3.2.5.2. | Each such route MUST contain a PTA, as specified in Section 3.2.5.2. | |||
| An egress EVPN-PE interested in the specified flow or flows MUST join | An egress EVPN PE interested in the specified flow or flows MUST join | |||
| the specified tunnel. Procedures for joining the specified tunnel | the specified tunnel. Procedures for joining the specified tunnel | |||
| are specific to the tunnel type. (Note that if the tunnel type is | are specific to the tunnel type. (Note that if the tunnel type is | |||
| RSVP-TE P2MP LSP, the Leaf Information Required (LIR) flag of the PTA | RSVP-TE P2MP LSP, the Leaf Information Required (LIR) flag of the PTA | |||
| SHOULD NOT be set. An ingress OISM PE knows which OISM EVPN PEs are | SHOULD NOT be set. An ingress OISM PE knows which OISM EVPN PEs are | |||
| interested in any given flow, and hence can add them to the RSVP-TE | interested in any given flow and hence can add them to the RSVP-TE | |||
| P2MP tunnel that carries such flows.) | P2MP tunnel that carries such flows.) | |||
| If the PTA does not specify a non-zero MPLS label, the apparent | If the PTA does not specify a non-zero MPLS label, the apparent | |||
| source BD of any packets that arrive on that tunnel is considered to | source BD of any packets that arrive on that tunnel is considered to | |||
| be the BD associated with the route that carries the PTA. If the PTA | be the BD associated with the route that carries the PTA. If the PTA | |||
| does specify a non-zero MPLS label, the apparent source BD of any | does specify a non-zero MPLS label, the apparent source BD of any | |||
| packets that arrive on that tunnel carrying the specified label is | packets that arrive on that tunnel carrying the specified label is | |||
| considered to be the BD associated with the route that carries the | considered to be the BD associated with the route that carries the | |||
| PTA. | PTA. | |||
| It should be noted that when either IR or BIER is used, there is no | It should be noted that, when either IR or BIER is used, there is no | |||
| need for an ingress PE to use S-PMSI A-D routes to assign specific | need for an ingress PE to use S-PMSI A-D routes to assign specific | |||
| flows to selective tunnels. The procedures of Section 3.3, along | flows to selective tunnels. The procedures of Section 3.3, along | |||
| with the procedures of Section 3.2.2, Section 3.2.3, or | with the procedures of Sections 3.2.2, 3.2.3, and 3.2.4, provide the | |||
| Section 3.2.4, provide the functionality of selective tunnels without | functionality of selective tunnels without the need to use S-PMSI A-D | |||
| the need to use S-PMSI A-D routes. | routes. | |||
| 3.3. Advertising SMET Routes | 3.3. Advertising SMET Routes | |||
| [RFC9251] allows an egress EVPN-PE to express its interest in a | [RFC9251] allows an egress EVPN PE to express its interest in a | |||
| particular multicast flow or set of flows by originating an SMET | particular multicast flow or set of flows by originating a SMET | |||
| route. The NLRI of the SMET route identifies the flow or set of | route. The NLRI of the SMET route identifies the flow or set of | |||
| flows as (C-*,C-*) or (C-*,C-G) or (C-S,C-G). | flows as (C-*,C-*), (C-*,C-G), or (C-S,C-G). | |||
| Each SMET route belongs to a particular BD. The Tag ID for the BD | Each SMET route belongs to a particular BD. The Tag ID for the BD | |||
| appears in the NLRI of the route, and the route carries the RT | appears in the NLRI of the route, and the route carries the RT | |||
| associated with that BD. From this <RT, tag> pair, other EVPN-PEs | associated with that BD. From this <RT, tag> pair, other EVPN PEs | |||
| can identify the BD to which a received SMET route belongs. | can identify the BD to which a received SMET route belongs. | |||
| (Remember though that the route may be carrying multiple RTs.) | (Remember though that the route may be carrying multiple RTs.) | |||
| There are three cases to consider: | There are three cases to consider: | |||
| * Case 1: It is known that no BD of a Tenant Domain contains a | Case 1: It is known that no BD of a Tenant Domain contains a | |||
| multicast router. | multicast router. | |||
| In this case, an egress PE advertises its interest in a flow or | In this case, an egress PE advertises its interest in a flow | |||
| set of flows by originating an SMET route that belongs to the SBD. | or set of flows by originating a SMET route that belongs to | |||
| We refer to this as an SBD-SMET route. The SBD-SMET route carries | the SBD. We refer to this as an SBD-SMET route. The SBD- | |||
| the SBD-RT, and has the Tag ID for the SBD in its NLRI. SMET | SMET route carries the SBD-RT and has the Tag ID for the SBD | |||
| routes for the individual BDs are not needed, because there is no | in its NLRI. SMET routes for the individual BDs are not | |||
| need for a PE that receives an SMET route to send a corresponding | needed, because there is no need for a PE that receives a | |||
| IGMP/MLD Join message on any of its ACs. | SMET route to send a corresponding IGMP/MLD Join message on | |||
| | any of its ACs. | |||
| * Case 2: It is known that more than one BD of a Tenant Domain may | Case 2: It is known that more than one BD of a Tenant Domain may | |||
| contain a multicast router. | contain a multicast router. | |||
| This is very like Case 1. An egress PE advertises its interest in | This is much like Case 1. An egress PE advertises its | |||
| a flow or set of flows by originating an SBD-SMET route. The | interest in a flow or set of flows by originating an SBD- | |||
| SBD-SMET route carries the SBD-RT, and has the Tag ID for the SBD | SMET route. The SBD-SMET route carries the SBD-RT and has | |||
| in its NLRI. | the Tag ID for the SBD in its NLRI. | |||
| In this case, it is important to be sure that SMET routes for the | In this case, it is important to be sure that SMET routes | |||
| individual BDs are not originated. Suppose, for example, that PE1 | for the individual BDs are not originated. For example, | |||
| had local receivers for a given flow on both BD1 and BD2, and that | suppose that PE1 had local receivers for a given flow on | |||
| it originated SMET routes for both those BDs. Then PEs receiving | both BD1 and BD2 and that it originated SMET routes for both | |||
| those SMET routes might send IGMP/MLD Joins on both those BDs. | those BDs. Then, PEs receiving those SMET routes might send | |||
| This could cause externally sourced multicast traffic to enter the | IGMP/MLD Joins on both those BDs. This could cause | |||
| Tenant Domain at both BDs, which could result in duplication of | externally sourced multicast traffic to enter the Tenant | |||
| data. | Domain at both BDs, which could result in duplication of | |||
| | data. | |||
| Note that if it is possible that more than one BD contains a | Note that if it is possible that more than one BD contains a | |||
| tenant multicast router, then in order to receive multicast data | tenant multicast router, then in order to receive multicast | |||
| originating from outside EVPN, the PEs MUST follow the procedures | data originating from outside EVPN, the PEs MUST follow the | |||
| of Section 6. | procedures of Section 6. | |||
| * Case 3: It is known that only a single BD of a Tenant Domain | Case 3: It is known that only a single BD of a Tenant Domain | |||
| contains a multicast router. | contains a multicast router. | |||
| Suppose that an egress PE is attached to a BD on which there might | Suppose that an egress PE is attached to a BD on which there | |||
| be a tenant multicast router. (The tenant router is not | might be a tenant multicast router. (The tenant router is | |||
| necessarily on a segment that is attached to that PE.) And | not necessarily on a segment that is attached to that PE.) | |||
| suppose that the PE has one or more ACs attached to that BD which | And suppose that the PE has one or more ACs attached to that | |||
| are interested in a given multicast flow. In this case, in | BD, which are interested in a given multicast flow. In this | |||
| addition to the SMET route for the SBD, the egress PE MAY | case, in addition to the SMET route for the SBD, the egress | |||
| originate an SMET route for that BD. This will enable the ingress | PE MAY originate a SMET route for that BD. This will enable | |||
| PE(s) to send IGMP/MLD messages on ACs for the BD, as specified in | the ingress PE(s) to send IGMP/MLD messages on ACs for the | |||
| [RFC9251]. As long as that is the only BD on which there is a | BD, as specified in [RFC9251]. As long as that is the only | |||
| tenant multicast router, there is no possibility of duplication of | BD on which there is a tenant multicast router, there is no | |||
| data. | possibility of duplication of data. | |||
| This document does not specify procedures for dynamically determining | This document does not specify procedures for dynamically determining | |||
| which of the three cases applies to a given deployment; the PEs of a | which of the three cases applies to a given deployment; the PEs of a | |||
| given Tenant Domain MUST be provisioned to know which case applies. | given Tenant Domain MUST be provisioned to know which case applies. | |||
| As detailed in [RFC9251], an SMET route carries flags indicating | As detailed in [RFC9251], a SMET route carries flags indicating | |||
| whether IGMP (v1, v2 or v3) or MLD (v1 or v2) messages should be | whether IGMP (v1, v2, or v3) or MLD (v1 or v2) messages should be | |||
| triggered on the ACs of the BD to which the SMET route belongs. For | triggered on the ACs of the BD to which the SMET route belongs. For | |||
| IGMP v3 and MLD v2, the IE flag also indicates whether the source | IGMP v3 and MLD v2, the Include/Exclude (IE) flag also indicates | |||
| information in the SMET route is of an Include Group type or Exclude | whether the source information in the SMET route is of an Include | |||
| Group type. If an SBD PE needs to generate IGMP/MLD reports as it is | Group type or Exclude Group type. If an SBD PE needs to generate | |||
| the case in section 6.2), or the route is for an (S, G) state, the | IGMP/MLD reports (as it is the case in Section 6.2) or the route is | |||
| value of the flags MUST be set according to the rules in [RFC9251]. | for an (S, G) state, the value of the flags MUST be set according to | |||
| Otherwise, the flags SHOULD be set to 0. | the rules in [RFC9251]. Otherwise, the flags SHOULD be set to 0. | |||
| Note that a PE only needs to originate the set of SBD-SMET routes | Note that a PE only needs to originate the set of SBD-SMET routes | |||
| that are needed to receive multicast traffic in which it is | that are needed in order to receive multicast traffic that the PE is | |||
| interested. Suppose PE1 has ACs attached to BD1 that are interested | interested in. Suppose PE1 has ACs attached to BD1 that are | |||
| in (C-*,C-G) traffic, and ACs attached to BD2 that are interested in | interested in (C-*,C-G) traffic and ACs attached to BD2 that are | |||
| (C-S,C-G) traffic. A single SBD-SMET route specifying (C-*,C-G) will | interested in (C-S,C-G) traffic. A single SBD-SMET route specifying | |||
| attract all the necessary flows. | (C-*,C-G) will attract all the necessary flows. | |||
| As another example, suppose the ACs attached to BD1 are interested in | As another example, suppose the ACs attached to BD1 are interested in | |||
| (C-*,C-G) but not in (C-S,C-G), while the ACs attached to BD2 are | (C-*,C-G) but not in (C-S,C-G), while the ACs attached to BD2 are | |||
| interested in (C-S,C-G). A single SBD-SMET route specifying | interested in (C-S,C-G). A single SBD-SMET route specifying | |||
| (C-*,C-G) will pull in all the necessary flows. | (C-*,C-G) will pull in all the necessary flows. | |||
| In other words, to determine the set of SBD-SMET routes that have to | In other words, to determine the set of SBD-SMET routes that have to | |||
| be sent for a given C-G, the PE has to merge the IGMP/MLD state for | be sent for a given C-G, the PE has to merge the IGMP/MLD state for | |||
| all the BDs (of the given Tenant Domain) to which it is attached. | all the BDs (of the given Tenant Domain) to which it is attached. | |||
| Per [RFC9251], importing an SMET route for a particular BD will cause | Per [RFC9251], importing a SMET route for a particular BD will cause | |||
| IGMP/MLD state to be instantiated for the IRB interface to that BD. | the IGMP/MLD state to be instantiated for the IRB interface to that | |||
| This applies as well when the BD is the SBD. | BD. This also applies when the BD is the SBD. | |||
| However, traffic that originates in one of the actual BDs of a | However, traffic that originates in one of the actual BDs of a | |||
| particular Tenant Domain MUST NOT be sent down the IRB interface that | particular Tenant Domain MUST NOT be sent down the IRB interface that | |||
| connects the L3 routing instance of that Tenant Domain to the SBD. | connects the L3 routing instance of that Tenant Domain to the SBD. | |||
| That would cause duplicate delivery of traffic, since such traffic | That would cause duplicate delivery of traffic, since such traffic | |||
| will have already been distributed throughout the Tenant Domain. | will have already been distributed throughout the Tenant Domain. | |||
| Therefore, when setting up the IGMP/MLD state based on SBD-SMET | Therefore, when setting up the IGMP/MLD state based on SBD-SMET | |||
| routes, care must be taken to ensure that the IRB interface to the | routes, care must be taken to ensure that the IRB interface to the | |||
| SBD is not added to the Outgoing Interface (OIF) list if the traffic | SBD is not added to the Outgoing Interface (OIF) list if the traffic | |||
| originates within the Tenant Domain. | originates within the Tenant Domain. | |||
| There are some multicast scenarios that make use of "anycast | There are some multicast scenarios that make use of anycast sources. | |||
| sources". For example, two different sources may share the same | For example, two different sources may share the same anycast IP | |||
| anycast IP address, say S1, and each may transmit an (S1,G) multicast | address, say S1, and each may transmit an (S1,G) multicast flow. In | |||
| flow. In such a scenario, the two (S1,G) flows are typically | such a scenario, the two (S1,G) flows are typically identical. | |||
| identical. Ordinary PIM procedures will cause only one the flows to | Ordinary PIM procedures will cause only one of the flows to be | |||
| be delivered to each receiver that has expressed interest in either | delivered to each receiver that has expressed interest in either | |||
| (*,G) or (S1,G). However, the OISM procedures described in this | (*,G) or (S1,G). However, the OISM procedures described in this | |||
| document will result in both of the (S1,G) flows being distributed in | document will result in both of the (S1,G) flows being distributed in | |||
| the Tenant Domain, and duplicate delivery will result. Therefore, if | the Tenant Domain, and duplicate delivery will result. Therefore, if | |||
| there are receivers for (*,G) in a given Tenant Domain, there MUST | there are receivers for (*,G) in a given Tenant Domain, there MUST | |||
| NOT be anycast sources for G within that Tenant Domain. (This | NOT be anycast sources for G within that Tenant Domain. (This | |||
| restriction could be lifted by defining additional procedures; | restriction could be lifted by defining additional procedures; | |||
| however that is outside the scope of this document.) | however, that is outside the scope of this document.) | |||
| 4. Constructing Multicast Forwarding State | 4. Constructing Multicast Forwarding State | |||
| 4.1. Layer 2 Multicast State | 4.1. Layer 2 Multicast State | |||
| An EVPN-PE maintains "layer 2 multicast state" for each BD to which | An EVPN PE maintains Layer 2 multicast state for each BD to which it | |||
| it is attached. Note that this is used for forwarding IP multicast | is attached. Note that this is used for forwarding IP multicast | |||
| frames based on the inner IP header. The state is learned through | frames based on the inner IP header. The state is learned through | |||
| IGMP/MLD snooping [RFC4541] and procedures in this document. | IGMP/MLD snooping [RFC4541] and procedures in this document. | |||
| Let PE1 be an EVPN-PE, and BD1 be a BD to which it is attached. At | Let PE1 be an EVPN PE and BD1 be a BD to which it is attached. At | |||
| PE1, BD1's layer 2 multicast state for a given (C-S,C-G) or (C-*,C-G) | PE1, BD1's Layer 2 multicast state for a given (C-S,C-G) or (C-*,C-G) | |||
| governs the disposition of an IP multicast packet that is received by | governs the disposition of an IP multicast packet that is received by | |||
| BD1's layer 2 multicast function on an EVPN-PE. | BD1's Layer 2 multicast function on an EVPN PE. | |||
| An IP multicast (S,G) packet is considered to have been received by | An IP multicast (S,G) packet is considered to have been received by | |||
| BD1's layer 2 multicast function in PE1 in the following cases: | BD1's Layer 2 multicast function in PE1 in the following cases: | |||
| * The packet is the payload of an Ethernet frame received by PE1 | * The packet is the payload of an Ethernet frame received by PE1 | |||
| from an AC that attaches to BD1. | from an AC that attaches to BD1. | |||
| * The packet is the payload of an Ethernet frame whose apparent | * The packet is the payload of an Ethernet frame whose apparent | |||
| source BD is BD1, and which is received by the PE1 over a tunnel | source BD is BD1, which is received by the PE1 over a tunnel from | |||
| from another EVPN-PE. | another EVPN PE. | |||
| * The packet is received from BD1's IRB interface (i.e., has been | * The packet is received from BD1's IRB interface (i.e., has been | |||
| transmitted by PE1's L3 routing instance down BD1's IRB | transmitted by PE1's L3 routing instance down BD1's IRB | |||
| interface). | interface). | |||
| According to the procedures of this document, all transmission of IP | According to the procedures of this document, all transmissions of IP | |||
| multicast packets from one EVPN-PE to another is done at layer 2. | multicast packets from one EVPN PE to another are done at Layer 2. | |||
| That is, the packets are transmitted as Ethernet frames, according to | That is, the packets are transmitted as Ethernet frames, according to | |||
| the layer 2 multicast state. | the Layer 2 multicast state. | |||
| Each layer 2 multicast state (S,G) or (*,G) contains a set of "output | Each Layer 2 multicast state (S,G) or (*,G) contains a set of | |||
| interfaces" (OIF list). The disposition of an (S,G) multicast frame | outgoing interfaces (an OIF list). The disposition of an (S,G) | |||
| received by BD1's layer 2 multicast function is determined as | multicast frame received by BD1's Layer 2 multicast function is | |||
| follows: | determined as follows: | |||
| * The OIF list is taken from BD1's layer 2 (S,G) state, or if there | * The OIF list is taken from BD1's Layer 2 (S,G) state, or if there | |||
| is no such (S,G) state, then from BD1's (*,G) state. (If neither | is no such (S,G) state, then it is taken from BD1's (*,G) state. | |||
| state exists, the OIF list is considered to be null.) | (If neither state exists, the OIF list is considered to be null.) | |||
| * The rules of Section 4.1.2 are applied to the OIF list. This will | * The rules of Section 4.1.2 are applied to the OIF list. This will | |||
| generally result in the frame being transmitted to some, but not | generally result in the frame being transmitted to some, but not | |||
| all, elements of the OIF list. | all, elements of the OIF list. | |||
| Note that there is no Reverse Path Forwarding (RPF) check at layer 2. | Note that there is no Reverse Path Forwarding (RPF) check at Layer 2. | |||
| 4.1.1. Constructing the OIF List | 4.1.1. Constructing the OIF List | |||
| In this document, we have extended the procedures of [RFC9251] so | In this document, we have extended the procedures of [RFC9251] so | |||
| that IMET and SMET routes for a particular BD are distributed not | that IMET and SMET routes for a particular BD are distributed not | |||
| just to PEs that attach to that BD, but to PEs that attach to any BD | just to PEs that attach to that BD but to PEs that attach to any BD | |||
| in the Tenant Domain. In this way, each PE attached to a given | in the Tenant Domain. In this way, each PE attached to a given | |||
| Tenant Domain learns, from other PE attached to the same Tenant | Tenant Domain learns, from another PE attached to the same Tenant | |||
| Domain, the set of flows that are of interest to each of those other | Domain, the set of flows that are of interest to each of those other | |||
| PEs. (If some PE attached to the Tenant Domain does not support | PEs. (If some PE attached to the Tenant Domain does not support | |||
| [RFC9251], it will be assumed to be interested in all flows. Whether | [RFC9251], it will be assumed to be interested in all flows. Whether | |||
| a particular remote PE supports [RFC9251] is determined by the | or not a particular remote PE supports [RFC9251] is determined by the | |||
| presence of an Extended Community in its IMET route; this is | presence of an Extended Community in its IMET route; this is | |||
| specified in [RFC9251].) If a set of remote PEs are interested in a | specified in [RFC9251].) If a set of remote PEs are interested in a | |||
| particular flow, the tunnels used to reach those PEs are added to the | particular flow, the tunnels used to reach those PEs are added to the | |||
| OIF list of the multicast states corresponding to that flow. | OIF list of the multicast states corresponding to that flow. | |||
| An EVPN-PE may run IGMP/MLD snooping procedures [RFC4541] on each of | An EVPN PE may run IGMP/MLD snooping procedures [RFC4541] on each of | |||
| its ACs, in order to determine the set of flows of interest to each | its ACs in order to determine the set of flows of interest to each | |||
| AC. (An AC is said to be interested in a given flow if it connects | AC. (An AC is said to be interested in a given flow if it connects | |||
| to a segment that has tenant systems interested in that flow.) If | to a segment that has tenant systems interested in that flow.) If | |||
| IGMP/MLD procedures are not being run on a given AC, that AC is | IGMP/MLD procedures are not being run on a given AC, that AC is | |||
| considered to be interested in all flows. For each BD, the set of | considered to be interested in all flows. For each BD, the set of | |||
| ACs interested in a given flow is determined, and the ACs of that set | ACs interested in a given flow is determined, and the ACs of that set | |||
| are added to the OIF list of that BD's multicast state for that flow. | are added to the OIF list of that BD's multicast state for that flow. | |||
| The OIF list for each multicast state must also contain the IRB | The OIF list for each multicast state must also contain the IRB | |||
| interface for the BD to which the state belongs. | interface for the BD to which the state belongs. | |||
| Implementors should note that the OIF list of a multicast state will | Implementors should note that the OIF list of a multicast state will | |||
| change from time to time as ACs and/or remote PEs either become | change from time to time as ACs and/or remote PEs either become | |||
| interested in, or lose interest in, particular multicast flows. | interested in or lose interest in particular multicast flows. | |||
| 4.1.2. Data Plane: Applying the OIF List to an (S,G) Frame | 4.1.2. Data Plane: Applying the OIF List to an (S,G) Frame | |||
| When an (S,G) multicast frame is received by the layer 2 multicast | When an (S,G) multicast frame is received by the Layer 2 multicast | |||
| function of a given EVPN-PE, say PE1, its disposition depends (a) on | function of a given EVPN PE, say PE1, its disposition depends upon | |||
| the way it was received, (b) upon the OIF list of the corresponding | (a) the way it was received, (b) the OIF list of the corresponding | |||
| multicast state (see Section 4.1.1), (c) upon the "eligibility" of an | multicast state (see Section 4.1.1), (c) the eligibility of an AC to | |||
| AC to receive a given frame (see Section 4.1.2.1) and (d) upon its | receive a given frame (see Section 4.1.2.1), and (d) its apparent | |||
| apparent source BD (see Section 3.2 for information about determining | source BD (see Section 3.2 for information about determining the | |||
| the apparent source BD of a frame received over a tunnel from another | apparent source BD of a frame received over a tunnel from another | |||
| PE). | PE). | |||
| 4.1.2.1. Eligibility of an AC to Receive a Frame | 4.1.2.1. Eligibility of an AC to Receive a Frame | |||
| A given (S,G) multicast frame is eligible to be transmitted by a | A given (S,G) multicast frame is eligible to be transmitted by a | |||
| given PE, say PE1, on a given AC, say AC1, only if one of the | given PE, say PE1, on a given AC, say AC1, only if one of the | |||
| following conditions holds: | following conditions holds: | |||
| 1. ESI labels are being used, PE1 is the DF for the segment to which | 1. Ethernet Segment Identifier (ESI) labels are being used, PE1 is | |||
| AC1 is connected, and the frame did not originate from that same | the DF for the segment to which AC1 is connected, and the frame | |||
| segment (as determined by the ESI label), or | did not originate from that same segment (as determined by the | |||
| | ESI label). | |||
| 2. The ingress PE for the frame is a remote PE, say PE2, local bias | 2. The ingress PE for the frame is a remote PE, say PE2, local bias | |||
| is being used, and PE2 is not connected to the same segment as | is being used, and PE2 is not connected to the same segment as | |||
| AC1. | AC1. | |||
| 4.1.2.2. Applying the OIF List | 4.1.2.2. Applying the OIF List | |||
| Assume a given (S,G) multicast frame has been received by a given PE, | Assume a given (S,G) multicast frame has been received by a given PE, | |||
| say PE1. PE1 determines the apparent source BD of the frame, finds | say PE1. PE1 determines the apparent source BD of the frame, finds | |||
| the layer 2 (S,G) state for that BD (or the (*,G) state if there is | the Layer 2 (S,G) state for that BD (or the (*,G) state if there is | |||
| no (S,G) state), and uses the OIF list from that state. (Note that | no (S,G) state), and uses the OIF list from that state. (Note that | |||
| if PE1 is not attached to the actual source BD, the apparent source | if PE1 is not attached to the actual source BD, the apparent source | |||
| BD will be the SBD.) | BD will be the SBD.) | |||
| Suppose PE1 has determined the frame's apparent source BD to be BD1 | If PE1 has determined the frame's apparent source BD to be BD1 (which | |||
| (which may or may not be the SBD.) There are the following cases to | may or may not be the SBD), then the following cases should be | |||
| consider: | considered: | |||
| 1. The frame was received by PE1 from a local AC, say AC1, that | 1. The frame was received by PE1 from a local AC, say AC1, that | |||
| attaches to BD1. | attaches to BD1. | |||
| a. The frame MUST be sent on all local ACs of BD1 that appear in | a. The frame MUST be sent on all local ACs of BD1 that appear in | |||
| the OIF list, except for AC1 itself. | the OIF list, except for AC1 itself. | |||
| b. The frame MUST also be delivered to any other EVPN-PEs that | b. The frame MUST also be delivered to any other EVPN PEs that | |||
| have interest in it. This is achieved as follows: | have interest in it. This is achieved as follows: | |||
| i. If (a) AR is being used, and (b) PE1 is an AR-LEAF, and | i. If (a) AR is being used, (b) PE1 is an AR-LEAF, and (c) | |||
| (c) the OIF list is non-null, PE1 MUST send the frame to | the OIF list is non-null, PE1 MUST send the frame to the | |||
| the AR-REPLICATOR. | AR-REPLICATOR. | |||
| ii. Otherwise the frame MUST be sent on all tunnels in the | ii. Otherwise, the frame MUST be sent on all tunnels in the | |||
| OIF list. | OIF list. | |||
| c. The frame MUST be sent to the local L3 routing instance by | c. The frame MUST be sent to the local L3 routing instance by | |||
| being sent up the IRB interface of BD1. It MUST NOT be sent | being sent up the IRB interface of BD1. It MUST NOT be sent | |||
| up any other IRB interfaces. | up any other IRB interfaces. | |||
| 2. The frame was received by PE1 over a tunnel from another PE. | 2. The frame was received by PE1 over a tunnel from another PE. | |||
| (See Section 3.2 for the rules to determine the apparent source | (See Section 3.2 for the rules to determine the apparent source | |||
| BD of a packet received from another PE. Note that if PE1 is not | BD of a packet received from another PE. Note that if PE1 is not | |||
| attached to the source BD, it will regard the SBD as the apparent | attached to the source BD, it will regard the SBD as the apparent | |||
| skipping to change at page 41, line 34 | skipping to change at line 1875 | |||
| a. The frame MUST be sent on all local ACs in the OIF list that | a. The frame MUST be sent on all local ACs in the OIF list that | |||
| connect to BD1 and that are eligible (per Section 4.1.2.1) to | connect to BD1 and that are eligible (per Section 4.1.2.1) to | |||
| receive the frame. | receive the frame. | |||
| b. The frame MUST be sent up the IRB interface of the apparent | b. The frame MUST be sent up the IRB interface of the apparent | |||
| source BD. (Note that this may be the SBD.) The frame MUST | source BD. (Note that this may be the SBD.) The frame MUST | |||
| NOT be sent up any other IRB interfaces. | NOT be sent up any other IRB interfaces. | |||
| c. If PE1 is not an AR-REPLICATOR, it MUST NOT send the frame to | c. If PE1 is not an AR-REPLICATOR, it MUST NOT send the frame to | |||
| any other EVPN-PEs. However, if PE1 is an AR-REPLICATOR, it | any other EVPN PEs. However, if PE1 is an AR-REPLICATOR, it | |||
| MUST send the frame to all tunnels in the OIF list, except | MUST send the frame to all tunnels in the OIF list, except | |||
| for the tunnel over which the frame was received. | for the tunnel over which the frame was received. | |||
| 3. The frame was received by PE1 from the BD1 IRB interface (i.e., | 3. The frame was received by PE1 from the BD1 IRB interface (i.e., | |||
| the frame has been transmitted by PE1's L3 routing instance down | the frame has been transmitted by PE1's L3 routing instance down | |||
| the BD1 IRB interface), and BD1 is NOT the SBD. | the BD1 IRB interface), and BD1 is NOT the SBD. | |||
| a. The frame MUST be sent on all local ACs in the OIF list that | a. The frame MUST be sent on all local ACs in the OIF list that | |||
| are eligible, as per Section 4.1.2.1, to receive the frame. | are eligible, as per Section 4.1.2.1, to receive the frame. | |||
| b. The frame MUST NOT be sent to any other EVPN-PEs. | b. The frame MUST NOT be sent to any other EVPN PEs. | |||
| c. The frame MUST NOT be sent up any IRB interfaces. | c. The frame MUST NOT be sent up any IRB interfaces. | |||
| 4. The frame was received from the SBD IRB interface (i.e., has been | 4. The frame was received from the SBD IRB interface (i.e., has been | |||
| transmitted by PE1's L3 routing instance down the SBD IRB | transmitted by PE1's L3 routing instance down the SBD IRB | |||
| interface). | interface). | |||
| a. The frame MUST be sent on all tunnels in the OIF list. This | a. The frame MUST be sent on all tunnels in the OIF list. This | |||
| causes the frame to be delivered to any other EVPN-PEs that | causes the frame to be delivered to any other EVPN PEs that | |||
| have interest in it. | have interest in it. | |||
| b. The frame MUST NOT be sent on any local ACs. | b. The frame MUST NOT be sent on any local ACs. | |||
| c. The frame MUST NOT be sent up any IRB interfaces. | c. The frame MUST NOT be sent up any IRB interfaces. | |||
| 4.2. Layer 3 Forwarding State | 4.2. Layer 3 Forwarding State | |||
| If an EVPN-PE is performing IGMP/MLD procedures on the ACs of a given | If an EVPN PE is performing IGMP/MLD procedures on the ACs of a given | |||
| BD, it processes those messages at layer 2 to help form the layer 2 | BD, it processes those messages at Layer 2 to help form the Layer 2 | |||
| multicast state. It also sends those messages up that BD's IRB | multicast state. It also sends those messages up that BD's IRB | |||
| interface to the L3 routing instance of a particular tenant domain. | interface to the L3 routing instance of a particular Tenant Domain. | |||
| This causes (C-S,C-G) or (C-*,C-G) L3 state to be created/updated. | This causes the (C-S,C-G) or (C-*,C-G) L3 state to be created/ | |||
| updated. | ||||
| A layer 3 multicast state has both an Input Interface (IIF) and an | A Layer 3 multicast state has both an Input Interface (IIF) and an | |||
| OIF list. | OIF list. | |||
| For a (C-S,C-G) state, if the source BD is present on the PE, the IIF | For a (C-S,C-G) state, if the source BD is present on the PE, the IIF | |||
| is set to the IRB interface that attaches to that BD. Otherwise the | is set to the IRB interface that attaches to that BD. Otherwise, the | |||
| IIF is set to the SBD IRB interface. | IIF is set to the SBD IRB interface. | |||
| For (C-*,C-G) states, traffic can arrive from any BD, so the IIF | For (C-*,C-G) states, traffic can arrive from any BD, so the IIF | |||
| needs to be set to a wildcard value meaning "any IRB interface". | needs to be set to a wildcard value meaning "any IRB interface". | |||
| The OIF list of these states includes one or more of the IRB | The OIF list of these states includes one or more of the IRB | |||
| interfaces of the Tenant Domain. In general, maintenance of the OIF | interfaces of the Tenant Domain. In general, maintenance of the OIF | |||
| list does not require any EVPN-specific procedures. However, there | list does not require any EVPN-specific procedures. However, there | |||
| is one EVPN-specific rule: | is one EVPN-specific rule: | |||
| If the IIF is one of the IRB interfaces (or the wild card meaning | If the IIF is one of the IRB interfaces (or the wildcard meaning | |||
| "any IRB interface"), then the SBD IRB interface MUST NOT be added | "any IRB interface"), then the SBD IRB interface MUST NOT be added | |||
| to the OIF list. Traffic originating from within a particular | to the OIF list. Traffic originating from within a particular | |||
| EVPN Tenant Domain must not be sent down the SBD IRB interface, as | EVPN Tenant Domain must not be sent down the SBD IRB interface, as | |||
| such traffic has already been distributed to all EVPN-PEs attached | such traffic has already been distributed to all EVPN PEs attached | |||
| to that Tenant Domain. | to that Tenant Domain. | |||
| Please also see Section 6.1.1, which states a modification of this | Please also see Section 6.1.1, which states a modification of this | |||
| rule for the case where OISM is interworking with external Layer 3 | rule for the case where OISM is interworking with external Layer 3 | |||
| multicast routing. | multicast routing. | |||
| 5. Interworking with non-OISM EVPN-PEs | 5. Interworking with Non-OISM EVPN PEs | |||
| It is possible that a given Tenant Domain will be attached to both | It is possible that a given Tenant Domain will be attached to both | |||
| OISM PEs and non-OISM PEs. Inter-subnet IP multicast should be | OISM PEs and non-OISM PEs. Inter-subnet IP multicast should be | |||
| possible and fully functional even if not all PEs attaching to a | possible and fully functional even if not all PEs attaching to a | |||
| Tenant Domain can be upgraded to support OISM functionality. | Tenant Domain can be upgraded to support OISM functionality. | |||
| Note that the non-OISM PEs are not required to have IRB support, or | Note that the non-OISM PEs are not required to have IRB support or | |||
| support for [RFC9251]. It is however advantageous for the non-OISM | support for [RFC9251]. However, it is advantageous for the non-OISM | |||
| PEs to support [RFC9251]. | PEs to support [RFC9251]. | |||
| In this section, we will use the following terminology: | In this section, we will use the following terminology: | |||
| * PE-S: the ingress PE for an (S,G) flow. | PE-S: The ingress PE for an (S,G) flow. | |||
| * PE-R: an egress PE for an (S,G) flow. | PE-R: An egress PE for an (S,G) flow. | |||
| * BD-S: the source BD for an (S,G) flow. PE-S must have one or more | BD-S: The source BD for an (S,G) flow. PE-S must have one or more | |||
| ACs attached BD-S, at least one of which attaches to host S. | ACs attached to BD-S, at least one of which attaches to host S. | |||
| * BD-R: a BD that contains a host interested in the flow. The host | BD-R: A BD that contains a host interested in the flow. The host is | |||
| is attached to PE-R via an AC that belongs to BD-R. | attached to PE-R via an AC that belongs to BD-R. | |||
| To allow OISM PEs to interwork with non-OISM PEs, a given Tenant | To allow OISM PEs to interwork with non-OISM PEs, a given Tenant | |||
| Domain needs to contain one or more "IP Multicast Gateways" (IPMGs). | Domain needs to contain one or more IP Multicast Gateways (IPMGs). | |||
| An IPMG is an OISM PE with special responsibilities regarding the | An IPMG is an OISM PE with special responsibilities regarding the | |||
| interworking between OISM and non-OISM PEs. | interworking between OISM and non-OISM PEs. | |||
| If a PE is functioning as an IPMG, it MUST signal this fact by | If a PE is functioning as an IPMG, it MUST signal this fact by | |||
| setting the "IPMG" flag in the Multicast Flags EC that it attaches to | setting the IPMG flag in the Multicast Flags EC that it attaches to | |||
| its IMET routes. An IPMG SHOULD attach this EC, with the IPMG flag | its IMET routes. An IPMG SHOULD attach this EC, with the IPMG flag | |||
| set, to all IMET routes it originates. Furthermore, if PE1 imports | set, to all IMET routes it originates. Furthermore, if PE1 imports | |||
| any IMET route from PE2 that has the EC present with the "IPMG" flag | any IMET route from PE2 that has the EC present with the IPMG flag | |||
| set, then PE1 will assume that PE2 is an IPMG. | set, then PE1 will assume that PE2 is an IPMG. | |||
| An IPMG Designated Forwarder (IPMG-DF) selection procedure is used to | An IPMG Designated Forwarder (IPMG-DF) selection procedure is used to | |||
| ensure that, at any given time, there is exactly one active IPMG-DF | ensure that there is exactly one active IPMG-DF for any given BD at | |||
| for any given BD. Details of the IPMG-DF selection procedure are in | any given time. Details of the IPMG-DF selection procedure are in | |||
| Section 5.1. The IPMG-DF for a given BD, say BD-S, has special | Section 5.1. The IPMG-DF for a given BD, say BD-S, has special | |||
| functions to perform when it receives (S,G) frames on that BD: | functions to perform when it receives (S,G) frames on that BD: | |||
| * If the frames are from a non-OISM PE-S: | * If the frames are from a non-OISM PE-S: | |||
| - The IPMG-DF forwards them to OISM PEs that do not attach to | - The IPMG-DF forwards them to OISM PEs that do not attach to | |||
| BD-S but have interest in (S,G). | BD-S but have interest in (S,G). | |||
| Note that OISM PEs that do attach to BD-S will have received | Note that OISM PEs that do attach to BD-S will have received | |||
| the frames on the BUM tunnel from the non-OISM PE-S. | the frames on the BUM tunnel from the non-OISM PE-S. | |||
| skipping to change at page 44, line 16 | skipping to change at line 2000 | |||
| with interest in (S,G), it will receive one copy of the frame | with interest in (S,G), it will receive one copy of the frame | |||
| for each such BD. This is necessary because the non-OISM PEs | for each such BD. This is necessary because the non-OISM PEs | |||
| cannot move IP multicast traffic from one BD to another. | cannot move IP multicast traffic from one BD to another. | |||
| * If the frames are from an OISM PE, the IPMG-DF forwards them to | * If the frames are from an OISM PE, the IPMG-DF forwards them to | |||
| non-OISM PEs that have interest in (S,G) on ACs that do not belong | non-OISM PEs that have interest in (S,G) on ACs that do not belong | |||
| to BD-S. | to BD-S. | |||
| If a non-OISM PE has interest in (S,G) on an AC belonging to BD-S, | If a non-OISM PE has interest in (S,G) on an AC belonging to BD-S, | |||
| it will have received a copy of the (S,G) frame, encapsulated for | it will have received a copy of the (S,G) frame, encapsulated for | |||
| BD-S, from the OISM PE-S. (See Section 3.2.2.) If the non-OISM | BD-S, from the OISM PE-S (see Section 3.2.2). If the non-OISM PE | |||
| PE has interest in (S,G) on one or more ACs belonging to | has interest in (S,G) on one or more ACs belonging to BD- | |||
| BD-R1,...,BD-Rk where the BD-Ri are distinct from BD-S, the | R1,...,BD-Rk where the BD-Ri are distinct from BD-S, the IPMG-DF | |||
| IPMG-DF needs to send it a copy of the frame for each BD-Ri. | needs to send it a copy of the frame for each BD-Ri. | |||
| If an IPMG receives a frame on a BD for which it is not the IPMG-DF, | If an IPMG receives a frame on a BD for which it is not the IPMG-DF, | |||
| it just follows normal OISM procedures. | it just follows normal OISM procedures. | |||
| This section specifies several sets of procedures: | This section specifies several sets of procedures: | |||
| * the procedures that the IPMG-DF for a given BD needs to follow | * the procedures that the IPMG-DF for a given BD needs to follow | |||
| when receiving, on that BD, an IP multicast frame from a non-OISM | when receiving, on that BD, an IP multicast frame from a non-OISM | |||
| PE; | PE; | |||
| * the procedures that the IPMG-DF for a given BD needs to follow | * the procedures that the IPMG-DF for a given BD needs to follow | |||
| when receiving, on that BD, an IP multicast frame from an OISM PE; | when receiving, on that BD, an IP multicast frame from an OISM PE; | |||
| and | ||||
| * the procedures that an OISM PE needs to follow when receiving, on | * the procedures that an OISM PE needs to follow when receiving, on | |||
| a given BD, an IP multicast frame from a non-OISM PE, when the | a given BD, an IP multicast frame from a non-OISM PE, when the | |||
| OISM PE is not the IPMG-DF for that BD. | OISM PE is not the IPMG-DF for that BD. | |||
| To enable OISM/non-OISM interworking in a given Tenant Domain, the | To enable OISM/non-OISM interworking in a given Tenant Domain, the | |||
| Tenant Domain MUST have some EVPN-PEs that can function as IPMGs. An | Tenant Domain MUST have some EVPN PEs that can function as IPMGs. An | |||
| IPMG must be configured with the SBD. It must also be configured | IPMG must be configured with the SBD. It must also be configured | |||
| with every BD of the Tenant Domain that exists on any of the non-OISM | with every BD of the Tenant Domain that exists on any of the non-OISM | |||
| PEs of that domain. (Operationally, it may be simpler to configure | PEs of that domain. (Operationally, it may be simpler to configure | |||
| the IPMG with all the BDs of the Tenant Domain.) | the IPMG with all the BDs of the Tenant Domain.) | |||
| A non-OISM PE of course only needs to be configured with BDs for | Of course, a non-OISM PE only needs to be configured with BDs for | |||
| which it has ACs. An OISM PE that is not an IPMG only needs to be | which it has ACs. An OISM PE that is not an IPMG only needs to be | |||
| configured with the SBD and with the BDs for which it has ACs. | configured with the SBD and with the BDs for which it has ACs. | |||
| An IPMG MUST originate a wildcard SMET route (with (C-*,C-*) in the | An IPMG MUST originate a wildcard SMET route (with (C-*,C-*) in the | |||
| NLRI) for each BD in the Tenant Domain. This will cause it to | NLRI) for each BD in the Tenant Domain. This will cause it to | |||
| receive all the IP multicast traffic that is sourced in the Tenant | receive all the IP multicast traffic that is sourced in the Tenant | |||
| Domain. Note that non-OISM nodes that do not support [RFC9251] will | Domain. Note that non-OISM nodes that do not support [RFC9251] will | |||
| send all the multicast traffic from a given BD to all PEs attached to | send all the multicast traffic from a given BD to all PEs attached to | |||
| that BD, even if those PEs do not originate an SMET route. | that BD, even if those PEs do not originate a SMET route. | |||
| The interworking procedures vary somewhat depending upon whether | The interworking procedures vary somewhat depending upon whether | |||
| packets are transmitted from PE to PE via Ingress Replication (IR) or | packets are transmitted from PE to PE via IR or via P2MP tunnels. In | |||
| via Point-to-Multipoint (P2MP) tunnels. We do not consider the use | this section, we do not consider the use of BIER due to the low | |||
| of BIER in this section, due to the low likelihood of there being a | likelihood of there being a non-OISM PE that supports BIER. | |||
| non-OISM PE that supports BIER. | ||||
| 5.1. IPMG Designated Forwarder | 5.1. IPMG Designated Forwarder | |||
| Every PE that is eligible for selection as an IPMG-DF for a | Every PE that is eligible for selection as an IPMG-DF for a | |||
| particular BD originates both an IMET route for that BD and an | particular BD originates both an IMET route for that BD and an SBD- | |||
| SBD-IMET route. As stated in Section 5, these SBD-IMET routes carry | IMET route. As stated in Section 5, these SBD-IMET routes carry a | |||
| a Multicast Flags EC with the IPMG Flag set. | Multicast Flags EC with the IPMG flag set. | |||
| These SBD-IMET routes SHOULD also carry a DF Election EC. The DF | These SBD-IMET routes SHOULD also carry a DF Election EC. The DF | |||
| Election EC and its use are specified in [RFC8584]. When the route is | Election EC and its use are specified in [RFC8584]. When the route is | |||
| originated, the AC-DF bit in the DF Election EC SHOULD not be set. | originated, the AC-DF bit in the DF Election EC SHOULD NOT be set. | |||
| This bit is not used when selecting an IPMSG-DF, i.e., it MUST be | This bit is not used when selecting an IPMG-DF, i.e., it MUST be | |||
| ignored by the receiver of an SBD-IMET route. | ignored by the receiver of an SBD-IMET route. | |||
| In the context of a given Tenant Domain, to select the IPMG-DF for a | In the context of a given Tenant Domain, to select the IPMG-DF for a | |||
| particular BD, say BD1, the IPMGs of the Tenant Domain perform the | particular BD, say BD1, the IPMGs of the Tenant Domain perform the | |||
| following procedure: | following procedures: | |||
| * From the set of received SBD-IMET routes for the given tenant | * From the set of received SBD-IMET routes for the given Tenant | |||
| domain, determine the candidate set of PEs that support IPMG | Domain, determine the candidate set of PEs that support IPMG | |||
| functionality for that domain. | functionality for that domain. | |||
| * Eliminate from that candidate set any PEs from which an IMET route | * From that candidate set, eliminate any PEs from which an IMET | |||
| for BD1 has not been received. | route for BD1 has not been received. | |||
| * Select a DF Election algorithm as specified in [RFC8584]. Some of | * Select a DF election algorithm as specified in [RFC8584]. Some of | |||
| the possible algorithms can be found, e.g., in [RFC8584], | the possible algorithms can be found, e.g., in [RFC8584], | |||
| [RFC7432], and [I-D.ietf-bess-evpn-pref-df]. | [RFC7432], and [EVPN-DF]. | |||
| * Apply the DF Election Algorithm (see [RFC8584]) to the candidate | * Apply the DF election algorithm (see [RFC8584]) to the candidate | |||
| set of PEs. The "winner' becomes the IPMG-DF for BD1. | set of PEs. The winner becomes the IPMG-DF for BD1. | |||
| Note that even if a given PE supports MEG Section 6.1.2) and/or PEG | Note that even if a given PE supports MEG (Section 6.1.2) and/or PEG | |||
| (Section 6.1.4) functionality, as well as IPMG functionality, its | (Section 6.1.4) functionality, as well as IPMG functionality, its | |||
| SBD-IMET routes carry only one DF Election EC. | SBD-IMET routes carry only one DF Election EC. | |||
| 5.2. Ingress Replication | 5.2. Ingress Replication | |||
| The procedures of this section are used when Ingress Replication is | The procedures of this section are used when IR is used to transmit | |||
| used to transmit packets from one PE to another. | packets from one PE to another. | |||
| When a non-OISM PE-S transmits a multicast frame from BD-S to another | When a non-OISM PE-S transmits a multicast frame from BD-S to another | |||
| PE, PE-R, PE-S will use the encapsulation specified in the BD-S IMET | PE, say PE-R, PE-S will use the encapsulation specified in the BD-S | |||
| route that was originated by PE-R. This encapsulation will include | IMET route that was originated by PE-R. This encapsulation will | |||
| the label that appears in the "MPLS label" field of the PMSI Tunnel | include the label that appears in the MPLS Label field of the PTA of | |||
| attribute (PTA) of the IMET route. If the tunnel type is VXLAN, the | the IMET route. If the tunnel type is VXLAN, the label is actually a | |||
| "label" is actually a Virtual Network Identifier (VNI); for other | Virtual Network Identifier (VNI); for other tunnel types, the label | |||
| tunnel types, the label is an MPLS label. In either case, we will | is an MPLS label. In either case, the frames are transmitted with a | |||
| speak of the transmitted frames as carrying a label that was assigned | label that was assigned to a particular BD by the PE-R to which the | |||
| to a particular BD by the PE-R to which the frame is being | frame is being transmitted. | |||
| transmitted. | ||||
| To support OISM/non-OISM interworking, an OISM PE-R MUST originate, | To support OISM/non-OISM interworking, an OISM PE-R MUST originate, | |||
| for each of its BDs, both an IMET route and an S-PMSI (C-*,C-*) A-D | for each of its BDs, both an IMET route and a (C-*,C-*) S-PMSI A-D | |||
| route. Note that even when IR is being used, interworking between | route. Note that even when IR is being used, interworking between | |||
| OISM and non-OISM PEs requires the OISM PEs to follow the rules of | OISM and non-OISM PEs requires the OISM PEs to follow the rules of | |||
| Section 3.2.5.2, as modified below. | Section 3.2.5.2, as modified below. | |||
| Non-OISM PEs will not understand S-PMSI A-D routes. So when a | Non-OISM PEs will not understand S-PMSI A-D routes. So when a non- | |||
| non-OISM PE-S transmits an IP multicast frame with a particular | OISM PE-S transmits an IP multicast frame with a particular source BD | |||
| source BD to an IPMG, it encapsulates the frame using the label | to an IPMG, it encapsulates the frame using the label specified in | |||
| specified in that IPMG's BD-S IMET route. (This is just the | that IPMG's BD-S IMET route. (This is just the procedure of | |||
| procedure of [RFC7432].) | [RFC7432].) | |||
| The (C-*,C-*) S-PMSI A-D route originated by a given OISM PE will | The (C-*,C-*) S-PMSI A-D route originated by a given OISM PE will | |||
| have a PTA that specifies IR. | have a PTA that specifies IR. | |||
| * If MPLS tunneling is being used, the MPLS label field SHOULD | * If MPLS tunneling is being used, the MPLS Label field SHOULD | |||
| contain a non-zero value, and the LIR flag SHOULD be zero. (The | contain a non-zero value, and the LIR flag SHOULD be zero. (The | |||
| case where the MPLS label field is zero or the LIR flag is set is | case where the MPLS Label field is zero or the LIR flag is set is | |||
| outside the scope of this document.) | outside the scope of this document.) | |||
| * If the tunnel encapsulation is VXLAN, the MPLS label field MUST | * If the tunnel encapsulation is VXLAN, the MPLS Label field MUST | |||
| contain a non-zero value, and the LIR flag MUST be zero. | contain a non-zero value, and the LIR flag MUST be zero. | |||
| When an OISM PE-S transmits an IP multicast frame to an IPMG, it will | When an OISM PE-S transmits an IP multicast frame to an IPMG, it will | |||
| use the label specified in that IPMG's (C-*,C-*) S-PMSI A-D route. | use the label specified in that IPMG's (C-*,C-*) S-PMSI A-D route. | |||
| When a PE originates both an IMET route and a (C-*,C-*) S-PMSI A-D | When a PE originates both an IMET route and a (C-*,C-*) S-PMSI A-D | |||
| route, the values of the MPLS label field in the respective PTAs must | route, the values of the MPLS Label field in the respective PTAs must | |||
| be distinct. Further, each MUST map uniquely (in the context of the | be distinct. Further, each MUST map uniquely (in the context of the | |||
| originating PE) to the route's BD. | originating PE) to the route's BD. | |||
| As a result, an IPMG receiving an MPLS-encapsulated IP multicast | As a result, an IPMG receiving an MPLS-encapsulated IP multicast | |||
| frame can always tell by the label whether the frame's ingress PE is | frame can always tell by the label whether the frame's ingress PE is | |||
| an OISM PE or a non-OISM PE. When an IPMG receives a VXLAN- | an OISM PE or a non-OISM PE. When an IPMG receives a VXLAN- | |||
| encapsulated IP multicast frame it may need to determine the identity | encapsulated IP multicast frame, it may need to determine the | |||
| of the ingress PE from the outer IP encapsulation; it can then | identity of the ingress PE from the outer IP encapsulation; it can | |||
| determine whether the ingress PE is an OISM PE or a non-OISM PE by | then determine whether the ingress PE is an OISM PE or a non-OISM PE | |||
| looking the IMET route from that PE. | by looking at the IMET route from that PE. | |||
| Suppose an IPMG receives an IP multicast frame from another EVPN-PE | Suppose an IPMG receives an IP multicast frame from another EVPN PE | |||
| in the Tenant Domain, and the IPMG is not the IPMG-DF for the frame's | in the Tenant Domain and the IPMG is not the IPMG-DF for the frame's | |||
| source BD. Then the IPMG performs only the ordinary OISM functions; | source BD. Then, the IPMG performs only the ordinary OISM functions; | |||
| it does not perform the IPMG-specific functions for that frame. In | it does not perform the IPMG-specific functions for that frame. In | |||
| the remainder of this section, when we discuss the procedures applied | the remainder of this section, when we discuss the procedures applied | |||
| by an IPMG when it receives an IP multicast frame, we are presuming | by an IPMG when it receives an IP multicast frame, we are presuming | |||
| that the source BD of the frame is a BD for which the IPMG is the | that the source BD of the frame is a BD for which the IPMG is the | |||
| IPMG-DF. | IPMG-DF. | |||
| We have two basic cases to consider: (1) a frame's ingress PE is a | We have two basic cases to consider: (1) a frame's ingress PE is a | |||
| non-OISM node, and (2) a frame's ingress PE is an OISM node. | non-OISM node and (2) a frame's ingress PE is an OISM node. | |||
| 5.2.1. Ingress PE is non-OISM | 5.2.1. Ingress PE is Non-OISM | |||
| In this case, a non-OISM PE, PE-S, has received an (S,G) multicast | In this case, a non-OISM PE, say PE-S, has received an (S,G) | |||
| frame over an AC that is attached to a particular BD, BD-S. By | multicast frame over an AC that is attached to a particular BD, say | |||
| virtue of normal EVPN procedures, PE-S has sent a copy of the frame | BD-S. By virtue of normal EVPN procedures, PE-S has sent a copy of | |||
| to every PE-R (both OISM and non-OISM) in the Tenant Domain that is | the frame to every PE-R (both OISM and non-OISM) in the Tenant Domain | |||
| attached to BD-S. If the non-OISM node supports [RFC9251], only PEs | that is attached to BD-S. If the non-OISM node supports [RFC9251], | |||
| that have expressed interest in (S,G) receive the frame. The IPMG | only PEs that have expressed interest in (S,G) receive the frame. | |||
| will have expressed interest via a (C-*,C-*) SMET route and thus | The IPMG will have expressed interest via a (C-*,C-*) SMET route and | |||
| receives the frame. | thus receives the frame. | |||
| Any OISM PE (including an IPMG) receiving the frame will apply normal | Any OISM PE (including an IPMG) receiving the frame will apply normal | |||
| OISM procedures. As a result it will deliver the frame to any of its | OISM procedures. As a result, it will deliver the frame to any of | |||
| local ACs (in BD-S or in any other BD) that have interest in (S,G). | its local ACs (in BD-S or in any other BD) that have interest in | |||
| (S,G). | ||||
| An OISM PE that is also the IPMG-DF for a particular BD, say BD-S, | An OISM PE that is also the IPMG-DF for a particular BD, say BD-S, | |||
| has additional procedures that it applies to frames received on BD-S | has additional procedures that it applies to frames received on BD-S | |||
| from non-OISM PEs: | from non-OISM PEs: | |||
| 1. When the IPMG-DF for BD-S receives an (S,G) frame from a non-OISM | 1. When the IPMG-DF for BD-S receives an (S,G) frame from a non-OISM | |||
| node, it MUST forward a copy of the frame to every OISM PE that | node, it MUST forward a copy of the frame to every OISM PE that | |||
| is NOT attached to BD-S but has interest in (S,G). The copy sent | is NOT attached to BD-S but has interest in (S,G). The copy sent | |||
| to a given OISM PE-R must carry the label that PE-R has assigned | to a given OISM PE-R must carry the label that PE-R has assigned | |||
| to the SBD in an S-PMSI A-D route. The IPMG MUST NOT do any IP | to the SBD in an S-PMSI A-D route. The IPMG MUST NOT do any IP | |||
| processing of the frame's IP payload. TTL decrement and other IP | processing of the frame's IP payload. TTL decrement and other IP | |||
| processing will be done by PE-R, per the normal OISM procedures. | processing will be done by PE-R, per the normal OISM procedures. | |||
| There is no need for the IPMG to include an ESI label in the | There is no need for the IPMG to include an ESI label in the | |||
| frame's tunnel encapsulation, because it is already known that | frame's tunnel encapsulation, because it is already known that | |||
| the frame's source BD has no presence on PE-R. There is also no | the frame's source BD has no presence on PE-R. There is also no | |||
| need for the IPMG to modify the frame's MAC SA. | need for the IPMG to modify the frame's MAC SA. | |||
| 2. In addition, when the IPMG-DF for BD-S receives an (S,G) frame | 2. In addition, when the IPMG-DF for BD-S receives an (S,G) frame | |||
| from a non-OISM node, it may need to forward copies of the frame | from a non-OISM node, it may need to forward copies of the frame | |||
| to other non-OISM nodes. Before it does so, it MUST decapsulate | to other non-OISM nodes. Before it does so, it MUST decapsulate | |||
| the (S,G) packet, and do the IP processing (e.g., TTL decrement). | the (S,G) packet and do the IP processing (e.g., TTL decrement). | |||
| Suppose PE-R is a non-OISM node that has an AC to BD-R, where | Suppose PE-R is a non-OISM node that has an AC to BD-R, where | |||
| BD-R is not the same as BD-S, and that AC has interest in (S,G). | BD-R is not the same as BD-S, and that AC has interest in (S,G). | |||
| The IPMG must then encapsulate the (S,G) packet (after the IP | The IPMG must then encapsulate the (S,G) packet (after the IP | |||
| processing has been done) in an Ethernet header. The MAC SA | processing has been done) in an Ethernet header. The MAC SA | |||
| field will have the MAC address of the IPMG's IRB interface for | field will have the MAC address of the IPMG's IRB interface for | |||
| BD-R. The IPMG then sends the frame to PE-R. The tunnel | BD-R. The IPMG then sends the frame to PE-R. The tunnel | |||
| encapsulation will carry the label that PE-R advertised in its | encapsulation will carry the label that PE-R advertised in its | |||
| IMET route for BD-R. There is no need to include an ESI label, | IMET route for BD-R. There is no need to include an ESI label, | |||
| as the source and destination BDs are known to be different. | as the source and destination BDs are known to be different. | |||
| Note that if a non-OISM PE-R has several BDs (other than BD-S) | Note that if a non-OISM PE-R has several BDs (other than BD-S) | |||
| with local ACs that have interest in (S,G), the IPMG will send it | with local ACs that have interest in (S,G), the IPMG will send it | |||
| one copy for each such BD. This is necessary because the | one copy for each such BD. This is necessary because the non- | |||
| non-OISM PE cannot move packets from one BD to another. | OISM PE cannot move packets from one BD to another. | |||
| There may be deployment scenarios in which every OISM PE is | There may be deployment scenarios in which every OISM PE is | |||
| configured with every BD that is present on any non-OISM PE. In such | configured with every BD that is present on any non-OISM PE. In such | |||
| scenarios, the procedures of item 1 above will not actually result in | scenarios, the procedures of item 1 above will not actually result in | |||
| the transmission of any packets. Hence if it is known a priori that | the transmission of any packets. Hence, if it is known a priori that | |||
| this deployment scenario exists for a given tenant domain, the | this deployment scenario exists for a given Tenant Domain, the | |||
| procedures of item 1 above can be disabled. | procedures of item 1 above can be disabled. | |||
| 5.2.2. Ingress PE is OISM | 5.2.2. Ingress PE is OISM | |||
| In this case, an OISM PE, PE-S, has received an (S,G) multicast frame | In this case, an OISM PE, say PE-S, has received an (S,G) multicast | |||
| over an AC that attaches to a particular BD, BD-S. | frame over an AC that attaches to a particular BD, say BD-S. | |||
| By virtue of receiving all the IMET routes for BD-S, PE-S will know | By virtue of receiving all the IMET routes for BD-S, PE-S will know | |||
| all the PEs attached to BD-S. By virtue of normal OISM procedures: | all the PEs attached to BD-S. By virtue of normal OISM procedures: | |||
| * PE-S will send a copy of the frame to every OISM PE-R (including | * PE-S will send a copy of the frame to every OISM PE-R (including | |||
| the IPMG) in the Tenant Domain that is attached to BD-S and has | the IPMG) in the Tenant Domain that is attached to BD-S and has | |||
| interest in (S,G). The copy sent to a given PE-R carries the | interest in (S,G). The copy sent to a given PE-R carries the | |||
| label that that the PE-R has assigned to BD-S in its (C-*,C-*) | label that the PE-R has assigned to BD-S in its (C-*,C-*) S-PMSI | |||
| S-PMSI A-D route. | A-D route. | |||
| * PE-S will also transmit a copy of the (S,G) frame to every OISM | * PE-S will also transmit a copy of the (S,G) frame to every OISM | |||
| PE-R that has interest in (S,G) but is not attached to BD-S. The | PE-R that has interest in (S,G) but is not attached to BD-S. The | |||
| copy will contain the label that the PE-R has assigned to the SBD. | copy will contain the label that the PE-R has assigned to the SBD. | |||
| (As specified in Section 5.2.1, an IPMG is assumed to have | (As specified in Section 5.2.1, an IPMG is assumed to have | |||
| indicated interest in all multicast flows.) | indicated interest in all multicast flows.) | |||
| * PE-S will also transmit a copy of the (S,G) frame to every | * PE-S will also transmit a copy of the (S,G) frame to every non- | |||
| non-OISM PE-R that is attached to BD-S. It does this using the | OISM PE-R that is attached to BD-S. It does this using the label | |||
| label advertised by that PE-R in its IMET route for BD-S. | advertised by that PE-R in its IMET route for BD-S. | |||
| The PE-Rs follow their normal procedures. An OISM PE that receives | The PE-Rs follow their normal procedures. An OISM PE that receives | |||
| the (S,G) frame on BD-S applies the OISM procedures to deliver the | the (S,G) frame on BD-S applies the OISM procedures to deliver the | |||
| frame to its local ACs, as necessary. A non-OISM PE that receives | frame to its local ACs as necessary. A non-OISM PE that receives the | |||
| the (S,G) frame on BD-S delivers the frame only to its local BD-S | (S,G) frame on BD-S delivers the frame only to its local BD-S ACs as | |||
| ACs, as necessary. | necessary. | |||
| Suppose that a non-OISM PE-R has interest in (S,G) on a BD, BD-R, | Suppose that a non-OISM PE-R has interest in (S,G) on a BD that is | |||
| that is different than BD-S. If the non-OISM PE-R is attached to | different than BD-S, say BD-R. If the non-OISM PE-R is attached to | |||
| BD-S, the OISM PE-S will send it the original (S,G) multicast frame, | BD-S, the OISM PE-S will send it the original (S,G) multicast frame, | |||
| but the non-OISM PE-R will not be able to send the frame to ACs that | but the non-OISM PE-R will not be able to send the frame to ACs that | |||
| are not in BD-S. If PE-R is not even attached to BD-S, the OISM PE-S | are not in BD-S. If PE-R is not even attached to BD-S, the OISM PE-S | |||
| will not send it a copy of the frame at all, because PE-R is not | will not send it a copy of the frame at all, because PE-R is not | |||
| attached to the SBD. In these cases, the IPMG needs to relay the | attached to the SBD. In these cases, the IPMG needs to relay the | |||
| (S,G) multicast traffic from OISM PE-S to non-OISM PE-R. | (S,G) multicast traffic from OISM PE-S to non-OISM PE-R. | |||
| When the IPMG-DF for BD-S receives an (S,G) frame from an OISM PE-S, | When the IPMG-DF for BD-S receives an (S,G) frame from an OISM PE-S, | |||
| it has to forward it to every non-OISM PE-R that that has interest in | it has to forward it to every non-OISM PE-R that has interest in | |||
| (S,G) on a BD-R that is different than BD-S. The IPMG MUST | (S,G) on a BD-R that is different than BD-S. The IPMG MUST | |||
| decapsulate the IP multicast packet, do the IP processing, re- | decapsulate the IP multicast packet, do the IP processing, re- | |||
| encapsulate it for BD-R (changing the MAC SA to the IPMG's own MAC | encapsulate it for BD-R (changing the MAC SA to the IPMG's own MAC | |||
| address for BD-R), and send a copy of the frame to PE-R. Note that a | address for BD-R), and send a copy of the frame to PE-R. Note that a | |||
| given non-OISM PE-R will receive multiple copies of the frame, if it | given non-OISM PE-R will receive multiple copies of the frame if it | |||
| has multiple BDs on which there is interest in the frame. | has multiple BDs on which there is interest in the frame. | |||
| 5.3. P2MP Tunnels | 5.3. P2MP Tunnels | |||
| When IR is used to distribute the multicast traffic among the | When IR is used to distribute the multicast traffic among the EVPN | |||
| EVPN-PEs, the procedures of Section 5.2 ensure that there will be no | PEs, the procedures described in Section 5.2 ensure that there will | |||
| duplicate delivery of multicast traffic. That is, no egress PE will | be no duplicate delivery of multicast traffic. That is, no egress PE | |||
| ever send a frame twice on any given AC. If P2MP tunnels are being | will ever send a frame twice on any given AC. If P2MP tunnels are | |||
| used to distribute the multicast traffic, it is necessary to have | being used to distribute the multicast traffic, it is necessary to | |||
| additional procedures to prevent duplicate delivery. | have additional procedures to prevent duplicate delivery. | |||
| At the present time, it is not clear that there will be a use case in | At the present time, it is not clear that there will be a use case in | |||
| which OISM nodes need to interwork with non-OISM nodes that use P2MP | which OISM nodes need to interwork with non-OISM nodes that use P2MP | |||
| tunnels. If it is determined that there is such a use case, | tunnels. If it is determined that there is such a use case, | |||
| procedures for P2MP may be specified in a separate document. | procedures for P2MP may be specified in a separate document. | |||
| 6. Traffic to/from Outside the EVPN Tenant Domain | 6. Traffic to/from Outside the EVPN Tenant Domain | |||
| In this section, we discuss scenarios where a multicast source | In this section, we discuss scenarios where a multicast source | |||
| outside a given EVPN Tenant Domain sends traffic to receivers inside | outside a given EVPN Tenant Domain sends traffic to receivers inside | |||
| the domain (as well as, possibly, to receivers outside the domain). | the domain (as well as, possibly, to receivers outside the domain). | |||
| This requires the OISM procedures to interwork with various layer 3 | This requires the OISM procedures to interwork with various Layer 3 | |||
| multicast routing procedures. | multicast routing procedures. | |||
| We assume in this section that the Tenant Domain is not being used as | In this section, we assume that the Tenant Domain is not being used | |||
| an intermediate transit network for multicast traffic; that is, we do | as an intermediate transit network for multicast traffic; that is, we | |||
| not consider the case where the Tenant Domain contains multicast | do not consider the case where the Tenant Domain contains multicast | |||
| routers that will receive traffic from sources outside the domain and | routers that will receive traffic from sources outside the domain and | |||
| forward the traffic to receivers outside the domain. The transit | forward the traffic to receivers outside the domain. The transit | |||
| scenario is considered in Section 7. | scenario is considered in Section 7. | |||
| We can divide the non-transit scenarios into two classes: | We can divide the non-transit scenarios into two classes: | |||
| 1. One or more of the EVPN PE routers provide the functionality | 1. One or more of the EVPN PE routers provide the functionality | |||
| needed to interwork with layer 3 multicast routing procedures. | needed to interwork with Layer 3 multicast routing procedures. | |||
| 2. A single BD in the Tenant Domain contains external multicast | 2. A single BD in the Tenant Domain contains external multicast | |||
| routers ("tenant multicast routers"), and those tenant multicast | routers (tenant multicast routers), and those tenant multicast | |||
| routers are used to interwork, on behalf of the entire Tenant | routers are used to interwork, on behalf of the entire Tenant | |||
| Domain, with layer 3 multicast routing procedures. | Domain, with Layer 3 multicast routing procedures. | |||
| 6.1. Layer 3 Interworking via EVPN OISM PEs | 6.1. Layer 3 Interworking via EVPN OISM PEs | |||
| 6.1.1. General Principles | 6.1.1. General Principles | |||
| Sometimes it is necessary to interwork an EVPN Tenant Domain with an | Sometimes it is necessary to interwork an EVPN Tenant Domain with an | |||
| external layer 3 multicast domain (the "external domain"), e.g., a | external Layer 3 multicast domain (the external domain), e.g., a PIM | |||
| PIM or MVPN domain. This is needed to allow EVPN tenant systems to | or MVPN domain. This is needed to allow EVPN tenant systems to | |||
| receive multicast traffic from sources ("external sources") outside | receive multicast traffic from sources (external sources) outside the | |||
| the EVPN Tenant Domain. It is also needed to allow receivers | EVPN Tenant Domain. It is also needed to allow receivers (external | |||
| ("external receivers") outside the EVPN Tenant Domain to receive | receivers) outside the EVPN Tenant Domain to receive traffic from | |||
| traffic from sources inside the Tenant Domain. | sources inside the Tenant Domain. | |||
| In order to allow interworking between an EVPN Tenant Domain and an | In order to allow interworking between an EVPN Tenant Domain and an | |||
| external domain, one or more OISM PEs must be "L3 Gateways". An L3 | external domain, one or more OISM PEs must be L3 Gateways. An L3 | |||
| Gateway participates both in the OISM procedures and in the L3 | Gateway participates both in the OISM procedures and in the L3 | |||
| multicast routing procedures of the external domain, as shown in the | multicast routing procedures of the external domain, as shown in the | |||
| following figure. | following figure. | |||
| src1 rcvr1 | src1 rcvr1 | |||
| | | | | | | |||
| R1 RP R2 | R1 RP R2 | |||
| PIM/MVPN | PIM/MVPN | |||
| domain | Domain | |||
| +---+ +---+ | +---+ +---+ | |||
| -----|GW1|----------------------|GW2|---- | -----|GW1|----------------------|GW2|---- | |||
| +---+ +---+ | +---+ +---+ | |||
| | \ \ / / | | | \ \ / / | | |||
| | \ \ / / | | | \ \ / / | | |||
| BD1 BD2 SBD SBD BD2 BD1 | BD1 BD2 SBD SBD BD2 BD1 | |||
| EVPN Domain | EVPN Domain | |||
| SBD SBD | SBD SBD | |||
| / \ | / \ | |||
| / \ | / \ | |||
| +---+ +---+ | +---+ +---+ | |||
| |PE1| |PE2| | |PE1| |PE2| | |||
| +---+ +---+ | +---+ +---+ | |||
| | \ / | | | \ / | | |||
| BD1 BD2 BD2 BD1 | BD1 BD2 BD2 BD1 | |||
| | | | | | | | | | | |||
| src2 rcvr2 src3 rcvr3 | src2 rcvr2 src3 rcvr3 | |||
| | Figure 1: Interworking via OISM PEs | |||
| An L3 Gateway that has interest in receiving (S,G) traffic must be | An L3 Gateway that has interest in receiving (S,G) traffic must be | |||
| able to determine the best route to S. If an L3 Gateway has interest | able to determine the best route to S. If an L3 Gateway has interest | |||
| in (*,G), it must be able to determine the best route to G's RP. In | in (*,G), it must be able to determine the best route to G's RP. In | |||
| these interworking scenarios, the L3 Gateway must be running a layer | these interworking scenarios, the L3 Gateway must be running a Layer | |||
| 3 unicast routing protocol. Via this protocol, it imports unicast | 3 unicast routing protocol. Via this protocol, it imports unicast | |||
| routes (either IP routes or VPN-IP routes) from routers other than | routes (either IP routes or VPN-IP routes) from routers other than | |||
| EVPN PEs. And since there may be multicast sources inside the EVPN | EVPN PEs. And since there may be multicast sources inside the EVPN | |||
| Tenant Domain, the EVPN PEs also need to export, either as IP routes | Tenant Domain, the EVPN PEs also need to export, either as IP routes | |||
| or as VPN-IP routes (depending upon the external domain), unicast | or as VPN-IP routes (depending upon the external domain), unicast | |||
| routes to those sources. | routes to those sources. | |||
| When selecting the best route to a multicast source or RP, an L3 | When selecting the best route to a multicast source or RP, an L3 | |||
| Gateway might have a choice between an EVPN route and an IP/VPN-IP | Gateway might have a choice between an EVPN route and an IP/VPN-IP | |||
| route. When such a choice exists, the L3 Gateway SHOULD always | route. When such a choice exists, the L3 Gateway SHOULD always | |||
| prefer the EVPN route. This will ensure that when traffic originates | prefer the EVPN route. This will ensure that when traffic originates | |||
| in the Tenant Domain and has a receiver in the Tenant Domain, the | in the Tenant Domain and has a receiver in the Tenant Domain, the | |||
| path to that receiver will remain within the EVPN Tenant Domain, even | path to that receiver will remain within the EVPN Tenant Domain, even | |||
| if the source is also reachable via a routed path. This also | if the source is also reachable via a routed path. This also | |||
| provides protection against sub-optimal routing that might occur if | provides protection against sub-optimal routing that might occur if | |||
| two EVPN PEs export IP/VPN-IP routes and each imports the other's IP/ | two EVPN PEs export IP/VPN-IP routes and each imports the other's IP/ | |||
| VPN-IP routes. | VPN-IP routes. | |||
| Section 4.2 discusses the way layer 3 multicast states are | Section 4.2 discusses the way Layer 3 multicast states are | |||
| constructed by OISM PEs. These layer 3 multicast states have IRB | constructed by OISM PEs. These Layer 3 multicast states have IRB | |||
| interfaces as their IIF and OIF list entries, and are the basis for | interfaces as their IIF and OIF list entries and are the basis for | |||
| interworking OISM with other layer 3 multicast procedures such as | interworking OISM with other Layer 3 multicast procedures such as | |||
| MVPN or PIM. From the perspective of the layer 3 multicast | MVPN or PIM. From the perspective of the Layer 3 multicast | |||
| procedures running in a given L3 Gateway, an EVPN Tenant Domain is a | procedures running in a given L3 Gateway, an EVPN Tenant Domain is a | |||
| set of IRB interfaces. | set of IRB interfaces. | |||
| When interworking an EVPN Tenant Domain with an external domain, the | When interworking an EVPN Tenant Domain with an external domain, the | |||
| L3 Gateway's layer 3 multicast states will not only have IRB | L3 Gateway's Layer 3 multicast states will not only have IRB | |||
| interfaces as IIF and OIF list entries, but also other "interfaces" | interfaces as IIF and OIF list entries but also other interfaces that | |||
| that lead outside the Tenant Domain. For example, when interworking | lead outside the Tenant Domain. For example, when interworking with | |||
| with MVPN, the multicast states may have MVPN tunnels as well as IRB | MVPN, the multicast states may have MVPN tunnels as well as IRB | |||
| interfaces as IIF or OIF list members. When interworking with PIM, | interfaces as IIF or OIF list members. When interworking with PIM, | |||
| the multicast states may have PIM-enabled non-IRB interfaces as IIF | the multicast states may have PIM-enabled non-IRB interfaces as IIF | |||
| or OIF list members. | or OIF list members. | |||
| As long as a Tenant Domain is not being used as an intermediate | As long as a Tenant Domain is not being used as an intermediate | |||
| transit network for IP multicast traffic, it is not necessary to | transit network for IP multicast traffic, it is not necessary to | |||
| enable PIM on its IRB interfaces. | enable PIM on its IRB interfaces. | |||
| In general, an L3 Gateway has the following responsibilities: | In general, an L3 Gateway has the following responsibilities: | |||
| skipping to change at page 52, line 41 | skipping to change at line 2395 | |||
| * It imports, from the external domain, unicast routes to multicast | * It imports, from the external domain, unicast routes to multicast | |||
| sources that are in the external domain. | sources that are in the external domain. | |||
| * It executes the procedures necessary to draw externally sourced | * It executes the procedures necessary to draw externally sourced | |||
| multicast traffic that is of interest to locally attached | multicast traffic that is of interest to locally attached | |||
| receivers in the EVPN Tenant Domain. When such traffic is | receivers in the EVPN Tenant Domain. When such traffic is | |||
| received, the traffic is sent down the IRB interfaces of the BDs | received, the traffic is sent down the IRB interfaces of the BDs | |||
| on which the locally attached receivers reside. | on which the locally attached receivers reside. | |||
| One of the L3 Gateways in a given Tenant Domain becomes the "DR" for | One of the L3 Gateways in a given Tenant Domain becomes the DR for | |||
| the SBD. (See Section 6.1.2.4.) This L3 gateway has the following | the SBD (see Section 6.1.2.4). This L3 Gateway has the following | |||
| additional responsibilities: | additional responsibilities: | |||
| * It exports, to the external domain, unicast routes to multicast | * It exports, to the external domain, unicast routes to multicast | |||
| sources in the EVPN Tenant Domain that are not locally attached to | sources in the EVPN Tenant Domain that are not locally attached to | |||
| any L3 gateway. | any L3 Gateway. | |||
| * It imports, from the external domain, unicast routes to multicast | * It imports, from the external domain, unicast routes to multicast | |||
| sources that are in the external domain. | sources that are in the external domain. | |||
| * It executes the procedures necessary to draw externally sourced | * It executes the procedures necessary to draw externally sourced | |||
| multicast traffic that is of interest to receivers in the EVPN | multicast traffic that is of interest to receivers in the EVPN | |||
| Tenant Domain that are not locally attached to an L3 gateway. | Tenant Domain that are not locally attached to an L3 Gateway. | |||
| When such traffic is received, the traffic is sent down the SBD | When such traffic is received, the traffic is sent down the SBD | |||
| IRB interface. OISM procedures already described in this document | IRB interface. OISM procedures already described in this document | |||
| will then ensure that the IP multicast traffic gets distributed | will then ensure that the IP multicast traffic gets distributed | |||
| throughout the Tenant Domain to any EVPN PEs that have interest in | throughout the Tenant Domain to any EVPN PEs that have interest in | |||
| it. Thus to an OISM PE that is not an L3 gateway the externally | it. Thus, to an OISM PE that is not an L3 Gateway, the externally | |||
| sourced traffic will appear to have been sourced on the SBD. | sourced traffic will appear to have been sourced on the SBD. | |||
| In order for this to work, some special care is needed when an L3 | In order for this to work, some special care is needed when an L3 | |||
| gateway creates or modifies a layer 3 (*,G) multicast state. Suppose | Gateway creates or modifies a Layer 3 (*,G) multicast state. Suppose | |||
| group G has both external sources (sources outside the EVPN Tenant | group G has both external sources (sources outside the EVPN Tenant | |||
| Domain) and internal sources (sources inside the EVPN tenant domain). | Domain) and internal sources (sources inside the EVPN Tenant Domain). | |||
| Section 4.2 states that when there are internal sources, the SBD IRB | Section 4.2 states that when there are internal sources, the SBD IRB | |||
| interface must not be added to the OIF list of the (*,G) state. | interface must not be added to the OIF list of the (*,G) state. | |||
| Traffic from internal sources will already have been delivered to all | Traffic from internal sources will already have been delivered to all | |||
| the EVPN PEs that have interest in it. However, if the OIF list of | the EVPN PEs that have interest in it. However, if the OIF list of | |||
| the (*,G) state does not contain its SBD IRB interface, then traffic | the (*,G) state does not contain its SBD IRB interface, then traffic | |||
| from external sources will not get delivered to other EVPN PEs. | from external sources will not get delivered to other EVPN PEs. | |||
| One way of handling this is the following. When an L3 gateway | One way of handling this is the following. When an L3 Gateway | |||
| receives (S,G) traffic from other than an IRB interface, and the | receives (S,G) traffic that is from an interface other than IRB, and | |||
| traffic corresponds to a layer 3 (*,G) state, the L3 gateway can | the traffic corresponds to a Layer 3 (*,G) state, the L3 Gateway can | |||
| create (S,G) state. The IIF will be set to the external interface | create (S,G) state. The IIF will be set to the external interface | |||
| over which the traffic is expected. The OIF list will contain the | over which the traffic is expected. The OIF list will contain the | |||
| SBD IRB interface, as well as the IRB interfaces of any other BDs | SBD IRB interface, as well as the IRB interfaces of any other BDs | |||
| attached to the PEG DR that have locally attached receivers with | attached to the PEG DR that have locally attached receivers with | |||
| interest in the (S,G) traffic. The (S,G) state will ensure that the | interest in the (S,G) traffic. The (S,G) state will ensure that the | |||
| external traffic is sent down the SBD IRB interface. The following | external traffic is sent down the SBD IRB interface. The following | |||
| text will assume this procedure; however other implementation | text will assume this procedure; however, other implementation | |||
| techniques may also be possible. | techniques may also be possible. | |||
| If a particular BD is attached to several L3 Gateways, one of the L3 | If a particular BD is attached to several L3 Gateways, one of the L3 | |||
| Gateways becomes the DR for that BD. (See Section 6.1.2.4.) If the | Gateways becomes the DR for that BD (see Section 6.1.2.4). If the | |||
| interworking scenario requires FHR functionality, it is generally the | interworking scenario requires FHR functionality, it is generally the | |||
| DR for a particular BD that is responsible for performing that | DR for a particular BD that is responsible for performing that | |||
| functionality on behalf of the source hosts on that BD. (E.g., if | functionality on behalf of the source hosts on that BD (e.g., if the | |||
| the interworking scenario requires that PIM Register messages be sent | interworking scenario requires that PIM Register messages be sent by | |||
| by an FHR, the DR for a given BD would send the PIM Register messages | an FHR, the DR for a given BD would send the PIM Register messages | |||
| for sources on that BD.) Note though that the DR for the SBD does | for sources on that BD). Although, note that the DR for the SBD does | |||
| not perform FHR functionality on behalf of external sources. | not perform FHR functionality on behalf of external sources. | |||
| An optional alternative is to have each L3 gateway perform FHR | An optional alternative is to have each L3 Gateway perform FHR | |||
| functionality for locally attached sources. Then the DR would only | functionality for locally attached sources. Then, the DR would only | |||
| have to perform FHR functionality on behalf of sources that are | have to perform FHR functionality on behalf of sources that are | |||
| locally attached to itself AND sources that are not attached to any | locally attached to itself AND sources that are not attached to any | |||
| L3 gateway. | L3 Gateway. | |||
| N.B.: If it is possible that more than one BD contains a tenant | Note that if it is possible that more than one BD contains a tenant | |||
| multicast router, then a PE receiving an SMET route for that BD MUST | multicast router, then a PE receiving a SMET route for that BD MUST | |||
| NOT reconstruct IGMP/MLD Join Reports from the SMET route, and MUST | NOT reconstruct IGMP/MLD Join Reports from the SMET route and MUST | |||
| NOT transmit any such IGMP/MLD Join Reports on its local ACs | NOT transmit any such IGMP/MLD Join Reports on its local ACs | |||
| attaching to that BD. Otherwise, multicast traffic may be | attaching to that BD. Otherwise, multicast traffic may be | |||
| duplicated. | duplicated. | |||
| 6.1.2. Interworking with MVPN | 6.1.2. Interworking with MVPN | |||
| In this section, we specify the procedures necessary to allow EVPN | In this section, we specify the procedures necessary to allow EVPN | |||
| PEs running OISM procedures to interwork with L3VPN PEs that run BGP- | PEs running OISM procedures to interwork with L3VPN PEs that run BGP- | |||
| based MVPN [RFC6514] procedures. More specifically, the procedures | based MVPN [RFC6514] procedures. More specifically, the procedures | |||
| herein allow a given EVPN Tenant Domain to become part of an L3VPN/ | herein allow a given EVPN Tenant Domain to become part of an L3VPN/ | |||
| MVPN, and support multicast flows where either: | MVPN and support multicast flows where either of the following | |||
| | occurs: | |||
| * The source of a given multicast flow is attached to an Ethernet | * The source of a given multicast flow is attached to an Ethernet | |||
| segment whose BD is part of an EVPN Tenant Domain, and one or more | segment whose BD is part of an EVPN Tenant Domain, and one or more | |||
| receivers of the flow are attached to the network via L3VPN/MVPN. | receivers of the flow are attached to the network via L3VPN/MVPN. | |||
| (Other receivers may be attached to the network via EVPN.) | (Other receivers may be attached to the network via EVPN.) | |||
| * The source of a given multicast flow is attached to the network | * The source of a given multicast flow is attached to the network | |||
| via L3VPN/MVPN, and one or more receivers of the flow are attached | via L3VPN/MVPN, and one or more receivers of the flow are attached | |||
| to an Ethernet segment that is part of an EVPN tenant domain. | to an Ethernet segment that is part of an EVPN Tenant Domain. | |||
| (Other receivers may be attached via L3VPN/MVPN.) | (Other receivers may be attached via L3VPN/MVPN.) | |||
| In this interworking model, existing L3VPN/MVPN PEs are unaware that | In this interworking model, existing L3VPN/MVPN PEs are unaware that | |||
| certain sources or receivers are part of an EVPN Tenant Domain. The | certain sources or receivers are part of an EVPN Tenant Domain. The | |||
| existing L3VPN/MVPN nodes run only their standard procedures and are | existing L3VPN/MVPN nodes run only their standard procedures and are | |||
| entirely unaware of EVPN. Interworking is achieved by having some or | entirely unaware of EVPN. Interworking is achieved by having some or | |||
| all of the EVPN PEs function as L3 Gateways running L3VPN/MVPN | all of the EVPN PEs function as L3 Gateways running L3VPN/MVPN | |||
| procedures, as detailed in the following sub-sections. | procedures, as detailed in the following subsections. | |||
| In this section, we assume that there are no tenant multicast routers | In this section, we assume that there are no tenant multicast routers | |||
| on any of the EVPN-attached Ethernet segments. (There may of course | on any of the EVPN-attached Ethernet segments. (Of course, there may | |||
| be multicast routers in the L3VPN.) Consideration of the case where | be multicast routers in the L3VPN.) Consideration of the case where | |||
| there are tenant multicast routers is deferred till Section 7.) | there are tenant multicast routers is addressed in Section 7. | |||
| To support MVPN/EVPN interworking, we introduce the notion of an | To support MVPN/EVPN interworking, we introduce the notion of an | |||
| MVPN/EVPN Gateway, or MEG. | MVPN/EVPN Gateway (MEG). | |||
| A MEG is an L3 Gateway (see Section 6.1.1), hence is both an OISM PE | A MEG is an L3 Gateway (see Section 6.1.1); hence, it is both an OISM | |||
| and an L3VPN/MVPN PE. For a given EVPN Tenant Domain, it will have | PE and an L3VPN/MVPN PE. For a given EVPN Tenant Domain, it will | |||
| an IP-VRF. If the Tenant Domain is part of an L3VPN/MVPN, the IP-VRF | have an IP-VRF. If the Tenant Domain is part of an L3VPN/MVPN, the | |||
| also serves as an L3VPN VRF [RFC4364]. The IRB interfaces of the | IP-VRF also serves as an L3VPN VRF [RFC4364]. The IRB interfaces of | |||
| IP-VRF are considered to be "VRF interfaces" of the L3VPN VRF. The | the IP-VRF are considered to be VRF interfaces of the L3VPN VRF. The | |||
| L3VPN VRF may also have other local VRF interfaces that are not EVPN | L3VPN VRF may also have other local VRF interfaces that are not EVPN | |||
| IRB interfaces. | IRB interfaces. | |||
| The VRF on the MEG will import VPN-IP routes [RFC4364] from other | The VRF on the MEG will import VPN-IP routes [RFC4364] from other | |||
| L3VPN Provider Edge (PE) routers. It will also export VPN-IP routes | L3VPN PE routers. It will also export VPN-IP routes to other L3VPN | |||
| to other L3VPN PE routers. In order to do so, it must be | PE routers. In order to do so, it must be appropriately configured | |||
| appropriately configured with the Route Targets used in the L3VPN to | with the RTs used in the L3VPN to control the distribution of the | |||
| control the distribution of the VPN-IP routes. These Route Targets | VPN-IP routes. In general, these RTs will be different than the RTs | |||
| will in general be different than the Route Targets used for | used for controlling the distribution of EVPN routes, as there is no | |||
| controlling the distribution of EVPN routes, as there is no need to | need to distribute EVPN routes to L3VPN-only PEs and no reason to | |||
| distribute EVPN routes to L3VPN-only PEs and no reason to distribute | distribute L3VPN/MVPN routes to EVPN-only PEs. | |||
| L3VPN/MVPN routes to EVPN-only PEs. | ||||
| Note that the RDs in the imported VPN-IP routes will not necessarily | Note that the RDs in the imported VPN-IP routes will not necessarily | |||
| conform to the EVPN rules (as specified in [RFC7432]) for creating | conform to the EVPN rules (as specified in [RFC7432]) for creating | |||
| RDs. Therefore a MEG MUST NOT expect the RDs of the VPN-IP routes to | RDs. Therefore, a MEG MUST NOT expect the RDs of the VPN-IP routes | |||
| be of any particular format other than what is required by the L3VPN/ | to be of any particular format other than what is required by the | |||
| MVPN specifications. | L3VPN/MVPN specifications. | |||
| The VPN-IP routes that a MEG exports to L3VPN are subnet routes and/ | The VPN-IP routes that a MEG exports to L3VPN are subnet routes and/ | |||
| or host routes for the multicast sources that are part of the EVPN | or host routes for the multicast sources that are part of the EVPN | |||
| tenant domain. The exact set of routes that need to be exported is | Tenant Domain. The exact set of routes that need to be exported is | |||
| discussed in Section 6.1.2.2. | discussed in Section 6.1.2.2. | |||
| Each IMET route originated by a MEG SHOULD carry a Multicast Flags | Each IMET route originated by a MEG SHOULD carry a Multicast Flags | |||
| Extended Community with the "MEG" flag set, indicating that the | Extended Community with the MEG flag set, indicating that the | |||
| originator of the IMET route is a MEG. However, PE1 will consider | originator of the IMET route is a MEG. However, PE1 will consider | |||
| PE2 to be a MEG if PE1 imports at least one IMET route from PE2 that | PE2 to be a MEG if PE1 imports at least one IMET route from PE2 that | |||
| carries the Multicast Flags EC with the MEG flag set. | carries the Multicast Flags EC with the MEG flag set. | |||
| All the MEGs of a given Tenant Domain attach to the SBD of that | All the MEGs of a given Tenant Domain attach to the SBD of that | |||
| domain, and one of them is selected to be the SBD's Designated Router | domain, and one of them is selected to be the SBD's Designated Router | |||
| (the "MEG SBD-DR") for the domain. The selection procedure is | (the MEG SBD-DR) for the domain. The selection procedure is | |||
| discussed in Section 6.1.2.4. | discussed in Section 6.1.2.4. | |||
| In this model of operation, MVPN procedures and EVPN procedures are | In this model of operation, MVPN procedures and EVPN procedures are | |||
| largely independent. In particular, there is no assumption that MVPN | largely independent. In particular, there is no assumption that MVPN | |||
| and EVPN use the same kind of tunnels. Thus no special procedures | and EVPN use the same kind of tunnels. Thus, no special procedures | |||
| are needed to handle the common scenarios where, e.g., EVPN uses | are needed to handle the common scenarios where, e.g., EVPN uses | |||
| VXLAN tunnels but MVPN uses MPLS P2MP tunnels, or where EVPN uses | VXLAN tunnels but MVPN uses MPLS P2MP tunnels, or where EVPN uses IR | |||
| Ingress Replication but MVPN uses MPLS P2MP tunnels. | but MVPN uses MPLS P2MP tunnels. | |||
| Similarly, no special procedures are needed to prevent duplicate data | Similarly, no special procedures are needed to prevent duplicate data | |||
| delivery on Ethernet segments that are multi-homed. | delivery on Ethernet segments that are multihomed. | |||
| The MEG does have some special procedures (described below) for | The MEG does have some special procedures (described below) for | |||
| interworking between EVPN and MVPN; these have to do with selection | interworking between EVPN and MVPN; these have to do with selection | |||
| of the Upstream PE for a given multicast source, with the exporting | of the Upstream PE for a given multicast source, with the exporting | |||
| of VPN-IP routes, and with the generation of MVPN C-multicast routes | of VPN-IP routes and with the generation of MVPN C-multicast routes | |||
| triggered by the installation of SMET routes. | triggered by the installation of SMET routes. | |||
| 6.1.2.1. MVPN Sources with EVPN Receivers | 6.1.2.1. MVPN Sources with EVPN Receivers | |||
| 6.1.2.1.1. Identifying MVPN Sources | 6.1.2.1.1. Identifying MVPN Sources | |||
| Consider a multicast source S. It is possible that a MEG will import | Consider a multicast source S. It is possible that a MEG will import | |||
| both an EVPN unicast route to S and a VPN-IP route (or an ordinary IP | both an EVPN unicast route to S and a VPN-IP route (or an ordinary IP | |||
| route), where the prefix length of each route is the same. In order | route), where the prefix length of each route is the same. In order | |||
| to draw (S,G) multicast traffic for any group G, the MEG SHOULD use | to draw (S,G) multicast traffic for any group G, the MEG SHOULD use | |||
| the EVPN route rather than the VPN-IP or IP route to determine the | the EVPN route rather than the VPN-IP or IP route to determine the | |||
| "Upstream PE" (see section 5 of [RFC6513]). | Upstream PE (see Section 5 of [RFC6513]). | |||
| Doing so ensures that when an EVPN tenant system desires to receive a | Doing so ensures that when an EVPN tenant system desires to receive a | |||
| multicast flow from another EVPN tenant system, the traffic from the | multicast flow from another EVPN tenant system, the traffic from the | |||
| source to that receiver stays within the EVPN domain. This prevents | source to that receiver stays within the EVPN domain. This prevents | |||
| problems that might arise if there is a unicast route via L3VPN to S, | problems that might arise if there is a unicast route via L3VPN to S | |||
| but no multicast routers along the routed path. This also prevents | but no multicast routers along the routed path. This also prevents | |||
| problems that might arise as a result of the fact that the MEGs will | problems that might arise as a result of the fact that the MEGs will | |||
| import each other's VPN-IP routes. | import each other's VPN-IP routes. | |||
| In the Section 6.1.2.1.2, we describe the procedures to be used when | In Section 6.1.2.1.2, we describe the procedures to be used when the | |||
| the selected route to S is a VPN-IP route. | selected route to S is a VPN-IP route. | |||
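The preference described in Section 6.1.2.1.1 can be sketched as follows. This is a minimal illustration, not a definitive implementation: the `Route` record and its `kind` labels ("evpn", "vpn-ip", "ip") are hypothetical, and applying longest-prefix match before the tie-break is an assumption of this sketch. The text above only states that the EVPN route is preferred when the prefix lengths are equal.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical VRF route entry."""
    prefix: str    # e.g. "10.1.1.0/24"
    kind: str      # "evpn", "vpn-ip", or "ip" (labels assumed for this sketch)
    next_hop: str  # PE that would be chosen as the Upstream PE


def select_route_to_source(source, routes):
    """Pick the route used to determine the Upstream PE for source S."""
    addr = ipaddress.ip_address(source)
    matching = [r for r in routes if addr in ipaddress.ip_network(r.prefix)]
    if not matching:
        return None
    # Assumption: longest prefix wins; on equal prefix length,
    # the EVPN route is preferred over a VPN-IP or IP route.
    return max(
        matching,
        key=lambda r: (ipaddress.ip_network(r.prefix).prefixlen, r.kind == "evpn"),
    )
```

If the selected route is of kind "vpn-ip", the procedures of Section 6.1.2.1.2 apply.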
| 6.1.2.1.2. Joining a Flow from an MVPN Source | 6.1.2.1.2. Joining a Flow from an MVPN Source | |||
| Consider a tenant system, R, on a particular BD, BD-R. Suppose R | Consider a tenant system, say R, on a particular BD, say BD-R. | |||
| wants to receive (S,G) multicast traffic, where source S is not | Suppose R wants to receive (S,G) multicast traffic, where source S is | |||
| attached to any PE in the EVPN Tenant Domain, but is attached to an | not attached to any PE in the EVPN Tenant Domain but is attached to | |||
| MVPN PE. | an MVPN PE. | |||
| * Suppose R is on a singly homed Ethernet segment of BD-R, and that | * Suppose R is on a singly homed Ethernet segment of BD-R and that | |||
| segment is attached to PE1, where PE1 is a MEG. PE1 learns via | segment is attached to PE1, where PE1 is a MEG. PE1 learns via | |||
| IGMP/MLD listening that R is interested in (S,G). PE1 determines | IGMP/MLD listening that R is interested in (S,G). PE1 determines | |||
| from its VRF that there is no route to S within the Tenant Domain | from its VRF that there is no route to S within the Tenant Domain | |||
| (i.e., no EVPN RT-2 route matching on S's IP address), but that | (i.e., no EVPN RT-2 route matching on S's IP address) but that | |||
| there is a route to S via L3VPN (i.e., the VRF contains a subnet | there is a route to S via L3VPN (i.e., the VRF contains a subnet | |||
| or host route to S that was received as a VPN-IP route). PE1 thus | or host route to S that was received as a VPN-IP route). Thus, | |||
| originates (if it hasn't already) an MVPN C-multicast Source Tree | PE1 originates (if it hasn't already) an MVPN C-multicast Source | |||
| Join(S,G) route. The route is constructed according to normal | Tree Join (S,G) route. The route is constructed according to | |||
| MVPN procedures. | normal MVPN procedures. | |||
| The layer 2 multicast state is constructed as specified in | The Layer 2 multicast state is constructed as specified in | |||
| Section 4.1. | Section 4.1. | |||
| In the layer 3 multicast state, the IIF is the appropriate MVPN | In the Layer 3 multicast state, the IIF is the appropriate MVPN | |||
| tunnel, and the IRB interface to BD-R is added to the OIF list. | tunnel, and the IRB interface to BD-R is added to the OIF list. | |||
| When PE1 receives (S,G) traffic from the appropriate MVPN tunnel, | When PE1 receives (S,G) traffic from the appropriate MVPN tunnel, | |||
| it performs IP processing of the traffic, and then sends the | it performs IP processing of the traffic and then sends the | |||
| traffic down its IRB interface to BD-R. Following normal OISM | traffic down its IRB interface to BD-R. Following normal OISM | |||
| procedures, the (S,G) traffic will be encapsulated for Ethernet | procedures, the (S,G) traffic will be encapsulated for Ethernet | |||
| and sent on the AC to which R is attached. | and sent on the AC to which R is attached. | |||
| * Suppose R is on a singly homed Ethernet segment of BD-R, and that | * Suppose R is on a singly homed Ethernet segment of BD-R and that | |||
| segment is attached to PE1, where PE1 is an OISM PE but is NOT a | segment is attached to PE1, where PE1 is an OISM PE but is NOT a | |||
| MEG. PE1 learns via IGMP/MLD listening that R is interested in | MEG. PE1 learns via IGMP/MLD listening that R is interested in | |||
| (S,G). PE1 follows normal OISM procedures, originating an SBD- | (S,G). PE1 follows normal OISM procedures, originating an SBD- | |||
| SMET route for (S,G); this route will be received by all the MEGs | SMET route for (S,G); this route will be received by all the MEGs | |||
| of the Tenant Domain, including the MEG SBD-DR. The MEG SBD-DR | of the Tenant Domain, including the MEG SBD-DR. From PE1's IMET | |||
| can determine from PE1's IMET routes whether PE1 is itself a MEG. | routes, the MEG SBD-DR can determine whether or not PE1 is itself | |||
| If PE1 is not a MEG, the MEG SBD-DR will originate (if it hasn't | a MEG. If PE1 is not a MEG, the MEG SBD-DR will originate (if it | |||
| already) an MVPN C-multicast Source Tree Join(S,G) route. This | hasn't already) an MVPN C-multicast Source Tree Join (S,G) route. | |||
| will cause the MEG SBD-DR to receive (S,G) traffic on an MVPN | This will cause the MEG SBD-DR to receive (S,G) traffic on an MVPN | |||
| tunnel. | tunnel. | |||
| The layer 2 multicast state is constructed as specified in | The Layer 2 multicast state is constructed as specified in | |||
| Section 4.1. | Section 4.1. | |||
| In the layer 3 multicast state, the IIF is the appropriate MVPN | In the Layer 3 multicast state, the IIF is the appropriate MVPN | |||
| tunnel, and the IRB interface to the SBD is added to the OIF list. | tunnel, and the IRB interface to the SBD is added to the OIF list. | |||
| When the MEG SBD-DR receives (S,G) traffic on an MVPN tunnel, it | When the MEG SBD-DR receives (S,G) traffic on an MVPN tunnel, it | |||
| performs IP processing of the traffic, and the sends the traffic | performs IP processing of the traffic and then sends the traffic | |||
| down its IRB interface to the SBD. Following normal OISM | down its IRB interface to the SBD. Following normal OISM | |||
| procedures, the traffic will be encapsulated for Ethernet and | procedures, the traffic will be encapsulated for Ethernet and | |||
| delivered to all PEs in the Tenant Domain that have interest in | delivered to all PEs in the Tenant Domain that have interest in | |||
| (S,G), including PE1. | (S,G), including PE1. | |||
| * If R is on a multi-homed Ethernet segment of BD-R, one of the PEs | * If R is on a multihomed Ethernet segment of BD-R, one of the PEs | |||
| attached to the segment will be its DF (following normal EVPN | attached to the segment will be its DF (following normal EVPN | |||
| procedures), and the DF will know (via IGMP/MLD listening or the | procedures), and the DF will know (via IGMP/MLD listening or the | |||
| procedures of [RFC9251]) that a tenant system reachable via one of | procedures of [RFC9251]) that a tenant system reachable via one of | |||
| its local ACs to BD-R is interested in (S,G) traffic. The DF is | its local ACs to BD-R is interested in (S,G) traffic. The DF is | |||
| responsible for originating an SBD-SMET route for (S,G), following | responsible for originating an SBD-SMET route for (S,G), following | |||
| normal OISM procedures. If the DF is a MEG, it MUST originate the | normal OISM procedures. If the DF is a MEG, it MUST originate the | |||
| corresponding MVPN C-multicast Source Tree Join(S,G) route; if the | corresponding MVPN C-multicast Source Tree Join (S,G) route; if | |||
| DF is not a MEG, the MEG SBD-DR MUST originate the C-multicast | the DF is not a MEG, the MEG SBD-DR MUST originate the | |||
| route when it receives the SMET route. | C-multicast route when it receives the SMET route. | |||
| Optionally, if the non-DF is a MEG, it MAY originate the | Optionally, if the non-DF is a MEG, it MAY originate the | |||
| corresponding MVPN C-multicast Source Tree Join(S,G) route. This | corresponding MVPN C-multicast Source Tree Join (S,G) route. This | |||
| will cause the traffic to flow to both the DF and the non-DF, but | will cause the traffic to flow to both the DF and the non-DF, but | |||
| only the DF will forward the traffic out an AC. This allows for | only the DF will forward the traffic out an AC. This allows for | |||
| quicker recovery if the DF's local AC to R fails. | quicker recovery if the DF's local AC to R fails. | |||
| * If R is attached to a non-OISM PE, it will receive the traffic via | * If R is attached to a non-OISM PE, it will receive the traffic via | |||
| an IPMG, as specified in Section 5. | an IPMG, as specified in Section 5. | |||
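The cases in the list above, for an OISM PE with a locally attached receiver and a source S reachable only via L3VPN, can be summarized in a short Python sketch. The function name and its parameters are hypothetical and exist only for illustration. The optional join by a non-DF MEG and the non-OISM PE case (Section 5) are not modeled.

```python
def c_multicast_join_originator(attached_pe, megs, meg_sbd_dr):
    """Return the PE expected to originate the MVPN C-multicast
    Source Tree Join (S,G) route.

    attached_pe: the PE that learned of R's interest; on a multihomed
                 segment this is the DF (assumed input).
    megs:        set of PEs known to be MEGs (e.g., from IMET routes).
    meg_sbd_dr:  the MEG selected as the SBD's Designated Router.
    """
    if attached_pe in megs:
        # A MEG with a local receiver originates the join itself.
        return attached_pe
    # Otherwise the MEG SBD-DR originates it on receipt of the SBD-SMET route.
    return meg_sbd_dr
```

In the first case, the IRB interface to BD-R is added to the OIF list; in the second, the IRB interface to the SBD is added instead.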
| If an EVPN-attached receiver is interested in (*,G) traffic, and if | If an EVPN-attached receiver is interested in (*,G) traffic, and if | |||
| it is possible for there to be sources of (*,G) traffic that are | it is possible for there to be sources of (*,G) traffic that are | |||
| attached only to L3VPN nodes, the MEGs will have to know the group- | attached only to L3VPN nodes, the MEGs will have to know the group- | |||
| to-RP mappings. That will enable them to originate MVPN C-multicast | to-RP mappings. That will enable them to originate MVPN C-multicast | |||
| Shared Tree Join(*,G) routes and to send them towards the RP. (Since | Shared Tree Join (*,G) routes and to send them toward the RP. (Since | |||
| we are assuming in this section that there are no tenant multicast | we are assuming in this section that there are no tenant multicast | |||
| routers attached to the EVPN Tenant Domain, the RP must be attached | routers attached to the EVPN Tenant Domain, the RP must be attached | |||
| via L3VPN. Alternatively, the MEG itself could be configured to | via L3VPN. Alternatively, the MEG itself could be configured to | |||
| function as an RP for group G.) | function as an RP for group G.) | |||
| The layer 2 multicast states are constructed as specified in | The Layer 2 multicast states are constructed as specified in | |||
| Section 4.1. | Section 4.1. | |||
| In the layer 3 (*,G) multicast state, the IIF is the appropriate MVPN | In the Layer 3 (*,G) multicast state, the IIF is the appropriate MVPN | |||
| tunnel. A MEG will add to the (*,G) OIF list its IRB interfaces for | tunnel. A MEG will add its IRB interfaces to the (*,G) OIF list for | |||
| any BDs containing locally attached receivers. If there are | any BDs containing locally attached receivers. If there are | |||
| receivers attached to other EVPN PEs, then whenever (S,G) traffic | receivers attached to other EVPN PEs, then whenever (S,G) traffic | |||
| from an external source matches a (*,G) state, the MEG will create | from an external source matches a (*,G) state, the MEG will create | |||
| (S,G) state, with the MVPN tunnel as the IIF, the OIF list copied | (S,G) state, with the MVPN tunnel as the IIF, the OIF list copied | |||
| from the (*,G) state, and the SBD IRB interface added to the OIF | from the (*,G) state, and the SBD IRB interface added to the OIF | |||
| list. (Please see the discussion in Section 6.1.1 regarding the | list. (Please see the discussion in Section 6.1.1 regarding the | |||
| inclusion of the SBD IRB interface in a (*,G) state; the SBD IRB | inclusion of the SBD IRB interface in a (*,G) state; the SBD IRB | |||
| interface is used in the OIF list only for traffic from external | interface is only used in the OIF list for traffic from external | |||
| sources.) | sources.) | |||
| Normal MVPN procedures will then result in the MEG getting the (*,G) | Normal MVPN procedures will then result in the MEG getting the (*,G) | |||
| traffic from all the multicast sources for G that are attached via | traffic from all the multicast sources for G that are attached via | |||
| L3VPN. This traffic arrives on MVPN tunnels. When the MEG removes | L3VPN. This traffic arrives on MVPN tunnels. When the MEG removes | |||
| the traffic from these tunnels, it does the IP processing. If there | the traffic from these tunnels, it does the IP processing. If there | |||
| are any receivers on a given BD, BD-R, that are attached via local | are any receivers on a given BD, say BD-R, that are attached via | |||
| EVPN ACs, the MEG sends the traffic down its BD-R IRB interface. If | local EVPN ACs, the MEG sends the traffic down its BD-R IRB | |||
| there are any other EVPN PEs that are interested in the (*,G) | interface. If there are any other EVPN PEs that are interested in | |||
| traffic, the MEG sends the traffic down the SBD IRB interface. | the (*,G) traffic, the MEG sends the traffic down the SBD IRB | |||
| Normal OISM procedures then distribute the traffic as needed to other | interface. Normal OISM procedures then distribute the traffic as | |||
| EVPN-PEs. | needed to other EVPN PEs. | |||
| 6.1.2.2. EVPN Sources with MVPN Receivers | 6.1.2.2. EVPN Sources with MVPN Receivers | |||
| 6.1.2.2.1. General procedures | 6.1.2.2.1. General Procedures | |||
| Consider the case where an EVPN tenant system S is sending IP | Consider the case where an EVPN tenant system S is sending IP | |||
| multicast traffic to group G, and there is a receiver R for the (S,G) | multicast traffic to group G and there is a receiver R for the (S,G) | |||
| traffic that is attached to the L3VPN, but not attached to the EVPN | traffic that is attached to the L3VPN but not attached to the EVPN | |||
| Tenant Domain. (We assume in this document that the L3VPN/MVPN-only | Tenant Domain. (In this document, we assume that the L3VPN-/MVPN- | |||
| nodes will not have any special procedures to deal with the case | only nodes will not have any special procedures to deal with the case | |||
| where a source is inside an EVPN domain.) | where a source is inside an EVPN domain.) | |||
| In this case, an L3VPN PE through which R can be reached has to send | In this case, an L3VPN PE through which R can be reached has to send | |||
| an MVPN C-multicast Join(S,G) route to one of the MEGs that is | an MVPN C-multicast Join (S,G) route to one of the MEGs that is | |||
| attached to the EVPN Tenant Domain. For this to happen, the L3VPN PE | attached to the EVPN Tenant Domain. For this to happen, the L3VPN PE | |||
| must have imported a VPN-IP route for S (either a host route or a | must have imported a VPN-IP route for S (either a host route or a | |||
| subnet route) from a MEG. | subnet route) from a MEG. | |||
| If a MEG determines that there is a multicast source transmitting on | If a MEG determines that there is a multicast source transmitting on | |||
| one of its ACs, the MEG SHOULD originate a VPN-IP host route for that | one of its ACs, the MEG SHOULD originate a VPN-IP host route for that | |||
| source. This determination SHOULD be made by examining the IP | source. This determination SHOULD be made by examining the IP | |||
| multicast traffic that arrives on the ACs. (It MAY be made by | multicast traffic that arrives on the ACs. (It MAY be made by | |||
| provisioning.) A MEG SHOULD NOT export a VPN-IP host route for any | provisioning.) A MEG SHOULD NOT export a VPN-IP host route for any | |||
| IP address that is not known to be a multicast source (unless it has | IP address that is not known to be a multicast source (unless it has | |||
| some other reason for exporting such a route). The VPN-IP host route | some other reason for exporting such a route). The VPN-IP host route | |||
| for a given multicast source MUST be withdrawn if the source goes | for a given multicast source MUST be withdrawn if the source goes | |||
| silent for a configurable period of time, or if it can be determined | silent for a configurable period of time or if it can be determined | |||
| that the source is no longer reachable via a local AC. | that the source is no longer reachable via a local AC. | |||
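One way to picture the origination and withdrawal rules for VPN-IP host routes is the sketch below. It is an illustration under assumptions, not a prescribed mechanism: the `SourceTracker` class, its method names, and the use of caller-supplied timestamps in seconds are all hypothetical.

```python
class SourceTracker:
    """Tracks multicast sources seen on local ACs (hypothetical helper)."""

    def __init__(self, silence_timeout):
        self.silence_timeout = silence_timeout  # configurable period, in seconds
        self.last_seen = {}  # source IP -> time of last multicast packet seen

    def packet_seen(self, source, now):
        """IP multicast traffic from `source` arrived on a local AC."""
        self.last_seen[source] = now

    def ac_lost(self, source):
        """`source` is known to be no longer reachable via a local AC."""
        self.last_seen.pop(source, None)

    def expire(self, now):
        """Return the sources that have gone silent; their host routes
        are to be withdrawn."""
        stale = [s for s, t in self.last_seen.items()
                 if now - t >= self.silence_timeout]
        for s in stale:
            del self.last_seen[s]
        return stale

    def advertised(self):
        """Sources for which a VPN-IP host route is currently originated."""
        return set(self.last_seen)
```

Only addresses observed sending multicast traffic ever enter `last_seen`, which mirrors the rule that host routes are not exported for addresses not known to be multicast sources.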
| A MEG SHOULD also originate a VPN-IP subnet route for each of the BDs | A MEG SHOULD also originate a VPN-IP subnet route for each of the BDs | |||
| in the Tenant Domain. | in the Tenant Domain. | |||
| VPN-IP routes exported by a MEG must carry any attributes or extended | VPN-IP routes exported by a MEG must carry any attributes or Extended | |||
| communities that are required by L3VPN and MVPN. In particular, a | Communities that are required by L3VPN and MVPN. In particular, a | |||
| VPN-IP route exported by a MEG must carry a VRF Route Import Extended | VPN-IP route exported by a MEG must carry a VRF Route Import Extended | |||
| Community corresponding to the IP-VRF from which it is imported, and | Community corresponding to the IP-VRF from which it is imported and a | |||
| a Source AS Extended Community. | Source AS Extended Community. | |||
| As a result, if S is attached to a MEG, the L3VPN nodes will direct | As a result, if S is attached to a MEG, the L3VPN nodes will direct | |||
| their MVPN C-multicast Join routes to that MEG. Normal MVPN | their MVPN C-multicast Join routes to that MEG. Normal MVPN | |||
| procedures will cause the traffic to be delivered to the L3VPN nodes. | procedures will cause the traffic to be delivered to the L3VPN nodes. | |||
| The layer 3 multicast state for (S,G) will have the MVPN tunnel on | The Layer 3 multicast state for (S,G) will have the MVPN tunnel on | |||
| its OIF list. The IIF will be the IRB interface leading to the BD | its OIF list. The IIF will be the IRB interface leading to the BD | |||
| containing S. | containing S. | |||
| If S is not attached to a MEG, the L3VPN nodes will direct their | If S is not attached to a MEG, the L3VPN nodes will direct their | |||
| C-multicast Join routes to whichever MEG appears to be on the best | C-multicast Join routes to whichever MEG appears to be on the best | |||
| route to S's subnet. Upon receiving the C-multicast Join, that MEG | route to S's subnet. Upon receiving the C-multicast Join, that MEG | |||
| will originate an EVPN SMET route for (S,G). As a result, the MEG | will originate an EVPN SMET route for (S,G). As a result, the MEG | |||
| will receive the (S,G) traffic at layer 2 via the OISM procedures. | will receive the (S,G) traffic at Layer 2 via the OISM procedures. | |||
| The (S,G) traffic will be sent up the appropriate IRB interface, and | The (S,G) traffic will be sent up the appropriate IRB interface, and | |||
| the layer 3 MVPN procedures will ensure that the traffic is delivered | the Layer 3 MVPN procedures will ensure that the traffic is delivered | |||
| to the L3VPN nodes that have requested it. The layer 3 multicast | to the L3VPN nodes that have requested it. The Layer 3 multicast | |||
| state for (S,G) will have the MVPN tunnel in the OIF list, and the | state for (S,G) will have the MVPN tunnel in the OIF list, and the | |||
| IIF will be one of the following: | IIF will be one of the following: | |||
| * If S belongs to a BD that is attached to the MEG, the IIF will be | * If S belongs to a BD that is attached to the MEG, the IIF will be | |||
| the IRB interface to that BD; | the IRB interface to that BD. | |||
| * Otherwise the IIF will be the SBD IRB interface. | * Otherwise, the IIF will be the SBD IRB interface. | |||
| Note that this works even if S is attached to a non-OISM PE, per the | Note that this works even if S is attached to a non-OISM PE, per the | |||
| procedures of Section 5. | procedures of Section 5. | |||
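The resulting Layer 3 (S,G) state at the MEG can be sketched as follows. The function name, the interface naming scheme ("irb-<BD>", "irb-SBD"), and the dictionary layout are assumptions made for illustration only.

```python
def l3_state_for_evpn_source(source_bd, attached_bds, mvpn_tunnel):
    """Build the (S,G) state at a MEG for an EVPN source with MVPN receivers.

    source_bd:    BD to which S belongs.
    attached_bds: set of BDs attached to this MEG.
    mvpn_tunnel:  identifier of the MVPN tunnel toward the L3VPN nodes.
    """
    if source_bd in attached_bds:
        iif = f"irb-{source_bd}"  # IRB interface to the BD containing S
    else:
        iif = "irb-SBD"           # otherwise, the SBD IRB interface
    return {"iif": iif, "oif": [mvpn_tunnel]}
```

For example, a MEG attached to BD1 and BD2 would use the SBD IRB interface as the IIF for a source in BD3.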
| 6.1.2.2.2. Any-Source Multicast (ASM) Groups | 6.1.2.2.2. Any-Source Multicast (ASM) Groups | |||
| Suppose the MEG SBD-DR learns that one of the PEs in its Tenant | Suppose the MEG SBD-DR learns that one of the PEs in its Tenant | |||
| Domain is interested in (*,G), traffic, where G is an Any-Source | Domain is interested in (*,G) traffic, where G is an ASM group. If | |||
| Multicast (ASM) group. If there are no tenant multicast routers, the | there are no tenant multicast routers, the MEG SBD-DR SHOULD perform | |||
| MEG SBD-DR SHOULD perform the "First Hop Router" (FHR) functionality | the First Hop Router (FHR) functionality for group G on behalf of the | |||
| for group G on behalf of the Tenant Domain, as described in | Tenant Domain, as described in [RFC7761]. This means that the MEG | |||
| [RFC7761]. This means that the MEG SBD-DR must know the identity of | SBD-DR must know the identity of the RP for each group, must send | |||
| the Rendezvous Point (RP) for each group, must send Register messages | Register messages to the RP, etc. | |||
| to the Rendezvous Point, etc. | | |||
| If the MEG SBD-DR is to be the FHR for the Tenant Domain, it must see | If the MEG SBD-DR is to be the FHR for the Tenant Domain, it must see | |||
| all the multicast traffic that is sourced from within the domain and | all the multicast traffic that is sourced from within the domain and | |||
| destined to an ASM group address. The MEG can ensure this by | destined to an ASM group address. The MEG can ensure this by | |||
| originating an SBD-SMET route for (*,*). | originating an SBD-SMET route for (*,*). | |||
| (As a possible optimization, an SBD-SMET route for (*, "any ASM | (As a possible optimization, an SBD-SMET route for (*, any ASM group) | |||
| group") may be defined in a separate document.) | may be defined in a separate document.) | |||
| In some deployment scenarios, it may be preferred that the MEG that | In some deployment scenarios, it may be preferred that the MEG that | |||
| receives the (S,G) traffic over an AC be the one providing the FHR | receives the (S,G) traffic over an AC be the one providing the FHR | |||
| functionality. This behavior is OPTIONAL. If this option is used, | functionality. This behavior is OPTIONAL. If this option is used, | |||
| it MUST be ensured that the MEG DR does not provide the FHR | it MUST be ensured that the MEG DR does not provide the FHR | |||
| functionality for (S,G) traffic that is attached to another MEG; FHR | functionality for (S,G) traffic that is attached to another MEG; FHR | |||
| functionality for (S,G) traffic from a particular source S MUST be | functionality for (S,G) traffic from a particular source S MUST be | |||
| provided by only a single router. | provided by only a single router. | |||
| Other deployment scenarios are also possible. For example, one might | Other deployment scenarios are also possible. For example, one might | |||
| want to configure the MEGs themselves to be RPs. In this case, the | want to configure the MEGs themselves to be RPs. In this case, the | |||
| RPs would have to exchange with each other information about which | RPs would have to exchange with each other information about which | |||
| sources are active. The method of exchanging such information is | sources are active. The method of exchanging such information is | |||
| outside the scope of this document. | outside the scope of this document. | |||
| 6.1.2.2.3. Source on Multihomed Segment | 6.1.2.2.3. Source on Multihomed Segment | |||
| Suppose S is attached to a segment that is all-active multi-homed to | Suppose S is attached to a segment that is all-active multihomed to | |||
| PEl and PE2. If S is transmitting to two groups, say G1 and G2, it | PE1 and PE2. If S is transmitting to two groups, say G1 and G2, it | |||
| is possible that PE1 will receive the (S,G1) traffic from S while PE2 | is possible that PE1 will receive the (S,G1) traffic from S, whereas | |||
| receives the (S,G2) traffic from S. | PE2 will receive the (S,G2) traffic from S. | |||
| This creates an issue for MVPN/EVPN interworking, because there is no | This creates an issue for MVPN/EVPN interworking, because there is no | |||
| way to cause L3VPN/MVPN nodes to select PE1 as the ingress PE for | way to cause L3VPN/MVPN nodes to select PE1 as the ingress PE for | |||
| (S,G1) traffic while selecting PE2 as the ingress PE for (S,G2) | (S,G1) traffic while selecting PE2 as the ingress PE for (S,G2) | |||
| traffic. | traffic. | |||
| However, the following procedure ensures that the IP multicast | However, the following procedure ensures that the IP multicast | |||
| traffic will still flow, even if the L3VPN/MVPN nodes picks the | traffic will still flow, even if the L3VPN/MVPN nodes pick the wrong | |||
| "wrong" EVPN-PE as the Upstream PE for (say) the (S,G1) traffic. | EVPN PE as the Upstream PE for, e.g., the (S,G1) traffic. | |||
| Suppose S is on an Ethernet segment, belonging to BD1, that is | Suppose S is on an Ethernet segment, belonging to BD1, that is | |||
| multi-homed to both PE1 and PE2, where PE1 is a MEG. And suppose | multihomed to both PE1 and PE2, where PE1 is a MEG. And suppose that | |||
| that IP multicast traffic from S to G travels over the AC that | IP multicast traffic from S to G travels over the AC that attaches | |||
| attaches the segment to PE2. If PE1 receives a C-multicast Source | the segment to PE2. If PE1 receives a C-multicast Source Tree Join | |||
| Tree Join (S,G) route, it MUST originate an SMET route for (S,G). | (S,G) route, it MUST originate a SMET route for (S,G). Normal OISM | |||
| Normal OISM procedures will then cause PE2 to send the (S,G) traffic | procedures will then cause PE2 to send the (S,G) traffic to PE1 on an | |||
| to PE1 on an EVPN IP multicast tunnel. Normal OISM procedures will | EVPN IP multicast tunnel. Normal OISM procedures will also cause PE1 | |||
| also cause PE1 to send the (S,G) traffic up its BD1 IRB interface. | to send the (S,G) traffic up its BD1 IRB interface. Normal MVPN | |||
| Normal MVPN procedures will then cause PE1 to forward the traffic on | procedures will then cause PE1 to forward the traffic on an MVPN | |||
| an MVPN tunnel. In this case, the routing is not optimal, but the | tunnel. In this case, the routing is not optimal, but the traffic | |||
| traffic does flow correctly. | does flow correctly. | |||
| 6.1.2.3. Obtaining Optimal Routing of Traffic Between MVPN and EVPN | 6.1.2.3. Obtaining Optimal Routing of Traffic between MVPN and EVPN | |||
| The routing of IP multicast traffic between MVPN nodes and EVPN nodes | The routing of IP multicast traffic between MVPN nodes and EVPN nodes | |||
| will be optimal as long as there is a MEG along the optimal route. | will be optimal as long as there is a MEG along the optimal route. | |||
| There are various deployment strategies that can be used to obtain | There are various deployment strategies that can be used to obtain | |||
| optimal routing between MVPN and EVPN. | optimal routing between MVPN and EVPN. | |||
| In one such scenario, a Tenant Domain will have a small number of | In one such scenario, a Tenant Domain will have a small number of | |||
| strategically placed MEGs. For example, a Data Center may have a | strategically placed MEGs. For example, a data center may have a | |||
| small number of MEGs that connect it to a wide-area network. Then | small number of MEGs that connect it to a wide-area network. Then, | |||
| the optimal route into or out of the Data Center would be through the | the optimal route into or out of the data center would be through the | |||
| MEGs. | MEGs. | |||
| In this scenario, the MEGs do not need to originate VPN-IP host | In this scenario, the MEGs do not need to originate VPN-IP host | |||
| routes for the multicast sources, they only need to originate VPN-IP | routes for the multicast sources; they only need to originate VPN-IP | |||
| subnet routes. The internal structure of the EVPN is completely | subnet routes. The internal structure of the EVPN is completely | |||
| hidden from the MVPN node. EVPN actions such as MAC Mobility and | hidden from the MVPN node. EVPN actions, such as MAC Mobility and | |||
| Mass Withdrawal [RFC7432] have zero impact on the MVPN control plane. | Mass Withdrawal [RFC7432], have zero impact on the MVPN control | |||
| | plane. | |||
| While this deployment scenario provides the most optimal routing and | While this deployment scenario provides the most optimal routing and | |||
| has the least impact on the installed base of MVPN nodes, it does | has the least impact on the installed base of MVPN nodes, it does | |||
| complicate network planning considerations. | complicate network planning considerations. | |||
| Another way of providing routing that is close to optimal is to turn | Another way of providing routing that is close to optimal is to turn | |||
| each EVPN PE into a MEG. Then routing of MVPN-to-EVPN traffic is | each EVPN PE into a MEG. Then, routing of MVPN-to-EVPN traffic is | |||
| optimal. However, routing of EVPN-to-MVPN traffic is not guaranteed | optimal. However, routing of EVPN-to-MVPN traffic is not guaranteed | |||
| to be optimal when a source host is on a multi-homed Ethernet segment | to be optimal when a source host is on a multihomed Ethernet segment | |||
| (as discussed in Section 6.1.2.2). | (as discussed in Section 6.1.2.2). | |||
| The obvious disadvantage of this method is that it requires every | The obvious disadvantage of this method is that it requires every | |||
| EVPN PE to be a MEG. | EVPN PE to be a MEG. | |||
| The procedures specified in this document allow an operator to add | The procedures specified in this document allow an operator to add | |||
| MEG functionality to any subset of his EVPN OISM PEs. This allows an | MEG functionality to any subset of its EVPN OISM PEs. This allows an | |||
| operator to make whatever trade-offs are deemed appropriate between | operator to make whatever trade-offs are deemed appropriate between | |||
| optimal routing and MEG deployment. | optimal routing and MEG deployment. | |||
| 6.1.2.4. Selecting the MEG SBD-DR | 6.1.2.4. Selecting the MEG SBD-DR | |||
| Every PE that is eligible for selection as the MEG SBD-DR originates | Every PE that is eligible for selection as the MEG SBD-DR originates | |||
| an SBD-IMET route. As stated in Section 5, these SBD-IMET routes | an SBD-IMET route. As stated in Section 5, these SBD-IMET routes | |||
| carry a Multicast Flags EC with the MEG Flag set. | carry a Multicast Flags EC with the MEG flag set. | |||
| These SBD-IMET routes SHOULD also carry a DF Election EC. The DF | These SBD-IMET routes SHOULD also carry a DF Election EC. The DF | |||
| Election EC and its use are specified in [RFC8584]. When the route | Election EC and its use are specified in [RFC8584]. When the route | |||
| is originated, the AC-DF bit in the DF Election EC SHOULD be set to | is originated, the AC-DF bit in the DF Election EC SHOULD be set to | |||
| zero. This bit is not used when selecting a MEG SBD-DR, i.e., it | zero. This bit is not used when selecting a MEG SBD-DR, i.e., it | |||
| MUST be ignored by the receiver of an SBD-IMET route. | MUST be ignored by the receiver of an SBD-IMET route. | |||
| In the context of a given Tenant Domain, to select the MEG SBD-DR, | In the context of a given Tenant Domain, to select the MEG SBD-DR, | |||
| the MEGs of the Tenant Domain perform the following procedure: | the MEGs of the Tenant Domain perform the following procedure: | |||
| * From the set of received SBD-IMET routes for the given tenant | * From the set of received SBD-IMET routes for the given Tenant | |||
| domain, determine the candidate set of PEs that support MEG | Domain, determine the candidate set of PEs that support MEG | |||
| functionality for that domain. | functionality for that domain. | |||
| * Select a DF Election algorithm as specified in [RFC8584]. Some of | * Select a DF election algorithm as specified in [RFC8584]. Some of | |||
| the possible algorithms can be found, e.g., in [RFC7432], | the possible algorithms can be found, e.g., in [RFC7432], | |||
| [RFC8584], and [I-D.ietf-bess-evpn-pref-df]. | [RFC8584], and [EVPN-DF]. | |||
| * Apply the DF Election Algorithm (see [RFC8584]) to the candidate | * Apply the DF election algorithm (see [RFC8584]) to the candidate | |||
| set of PEs. The "winner" becomes the MEG SBD-DR. | set of PEs. The winner becomes the MEG SBD-DR. | |||
| Note that if a given PE supports IPMG (Section 6.1.2) or PEG | Note that if a given PE supports IPMG (Section 6.1.2) or PEG | |||
| (Section 6.1.4) functionality as well as MEG functionality, its | (Section 6.1.4) functionality as well as MEG functionality, its SBD- | |||
| SBD-IMET routes carry only one DF Election EC. | IMET routes carry only one DF Election EC. | |||
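The selection procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: the `SbdImetRoute` record, the `modulo_election` helper, and the `tag` parameter are hypothetical names, and the modulo rule is modeled on the default DF election of [RFC7432] (order the candidates by IP address, then pick index `tag mod N`). Which value a real implementation feeds into the algorithm for an SBD is not stated here, so `tag` is a placeholder.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class SbdImetRoute:
    """Hypothetical view of a received SBD-IMET route."""
    originator_ip: str  # IP address of the originating PE
    meg_flag: bool      # MEG flag in the Multicast Flags EC


def modulo_election(candidates: List[str], tag: int) -> str:
    """Modulo-style election: sort by IP address, pick index tag mod N."""
    ordered = sorted(candidates, key=ip_address)
    return ordered[tag % len(ordered)]


def select_meg_sbd_dr(
    routes: Iterable[SbdImetRoute],
    tag: int = 0,  # assumption: stand-in for the algorithm's input value
    algorithm: Callable[[List[str], int], str] = modulo_election,
) -> Optional[str]:
    # Step 1: the candidate set is every PE whose SBD-IMET route has the MEG flag set.
    candidates = sorted({r.originator_ip for r in routes if r.meg_flag}, key=ip_address)
    if not candidates:
        return None
    # Steps 2 and 3: apply the chosen DF election algorithm; the winner is the MEG SBD-DR.
    return algorithm(candidates, tag)


routes = [
    SbdImetRoute("10.0.0.3", True),
    SbdImetRoute("10.0.0.1", True),
    SbdImetRoute("10.0.0.2", False),  # not MEG-capable, so not a candidate
]
winner = select_meg_sbd_dr(routes, tag=1)
```

Because every MEG runs the same deterministic function over the same candidate set, all of them arrive at the same winner without any extra signaling.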
| 6.1.3. Interworking with 'Global Table Multicast' | 6.1.3. Interworking with Global Table Multicast | |||
| If multicast service to the outside sources and/or receivers is | If multicast service to the outside sources and/or receivers is | |||
| provided via the BGP-based "Global Table Multicast" (GTM) procedures | provided via the BGP-based Global Table Multicast (GTM) procedures of | |||
| of [RFC7716], the procedures of Section 6.1.2 can easily be adapted | [RFC7716], the procedures of Section 6.1.2 can easily be adapted for | |||
| for EVPN/GTM interworking. The way to adapt the MVPN procedures to | EVPN/GTM interworking. The way to adapt the MVPN procedures to GTM | |||
| GTM is explained in [RFC7716]. | is explained in [RFC7716]. | |||
| 6.1.4. Interworking with PIM | 6.1.4. Interworking with PIM | |||
| As we have been discussing, there may be receivers in an EVPN tenant | As discussed, there may be receivers in an EVPN Tenant Domain that | |||
| domain that are interested in multicast flows whose sources are | are interested in multicast flows whose sources are outside the EVPN | |||
| outside the EVPN Tenant Domain. Or there may be receivers outside an | Tenant Domain. Or there may be receivers outside an EVPN Tenant | |||
| EVPN Tenant Domain that are interested in multicast flows whose | Domain that are interested in multicast flows whose sources are | |||
| sources are inside the Tenant Domain. | inside the Tenant Domain. | |||
| If the outside sources and/or receivers are part of an MVPN, | If the outside sources and/or receivers are part of an MVPN, see the | |||
| interworking procedures are covered in Section 6.1.2. | procedures for interworking that are covered in Section 6.1.2. | |||
| There are also cases where an external source or receiver is | There are also cases where an external source or receiver is | |||
| attached via IP, and the layer 3 multicast routing is done via PIM. | attached via IP and the Layer 3 multicast routing is done via PIM. | |||
| In this case, the interworking between the "PIM domain" and the EVPN | In this case, the interworking between the PIM domain and the EVPN | |||
| tenant domain is done at L3 Gateways that perform "PIM/EVPN Gateway" | Tenant Domain is done at L3 Gateways that perform PIM/EVPN Gateway | |||
| (PEG) functionality. A PEG is very similar to a MEG, except that its | (PEG) functionality. A PEG is very similar to a MEG, except that its | |||
| layer 3 multicast routing is done via PIM rather than via BGP. | Layer 3 multicast routing is done via PIM rather than via BGP. | |||
| If external sources or receivers for a given group are attached to a | If external sources or receivers for a given group are attached to a | |||
| PEG via a layer 3 interface, that interface should be treated as a | PEG via a Layer 3 interface, that interface should be treated as a | |||
| VRF interface attached to the Tenant Domain's L3VPN VRF. The layer 3 | VRF interface attached to the Tenant Domain's L3VPN VRF. The Layer 3 | |||
| multicast routing instance for that Tenant Domain will either run PIM | multicast routing instance for that Tenant Domain will either run PIM | |||
| on the VRF interface or will listen for IGMP/MLD messages on that | on the VRF interface or listen for IGMP/MLD messages on that | |||
| interface. If the external receiver is attached elsewhere on an IP | interface. If the external receiver is attached elsewhere on an IP | |||
| network, the PE has to enable PIM on its interfaces to the backbone | network, the PE has to enable PIM on its interfaces to the backbone | |||
| network. In both cases, the PE needs to perform PEG functionality, | network. In both cases, the PE needs to perform PEG functionality, | |||
| and its IMET routes must carry the Multicast Flags EC with the PEG | and its IMET routes must carry the Multicast Flags EC with the PEG | |||
| flag set. | flag set. | |||
| For each BD on which there is a multicast source or receiver, one of | For each BD on which there is a multicast source or receiver, one of | |||
| the PEGs will becomes the PEG DR. DR selection can be done using the | the PEGs will become the PEG DR. DR selection can be done using the | |||
| same procedures specified in Section 6.1.2.4, except with "PEG" | same procedures specified in Section 6.1.2.4, except with PEG | |||
| substituted for "MEG". | substituted for MEG. | |||
| As long as there are no tenant multicast routers within the EVPN | As long as there are no tenant multicast routers within the EVPN | |||
| Tenant Domain, the PEGs do not need to run PIM on their IRB | Tenant Domain, the PEGs do not need to run PIM on their IRB | |||
| interfaces. | interfaces. | |||
| 6.1.4.1. Source Inside EVPN Domain | 6.1.4.1. Source Inside EVPN Domain | |||
| If a PEG receives a PIM Join(S,G) from outside the EVPN tenant | If a PEG receives a PIM Join (S,G) from outside the EVPN Tenant | |||
| domain, it may find it necessary to create (S,G) state. The PE needs | Domain, it may find it necessary to create (S,G) state. The PE needs | |||
| to determine whether S is within the Tenant Domain. If S is not | to determine whether S is within the Tenant Domain. If S is not | |||
| within the EVPN Tenant Domain, the PE carries out normal layer 3 | within the EVPN Tenant Domain, the PE carries out normal Layer 3 | |||
| multicast routing procedures. If S is within the EVPN tenant domain, | multicast routing procedures. If S is within the EVPN Tenant Domain, | |||
| the IIF of the (S,G) state is set as follows: | the IIF of the (S,G) state is set as follows: | |||
| * if S is on a BD that is attached to the PE, the IIF is the PE's | * If S is on a BD that is attached to the PE, the IIF is the PE's | |||
| IRB interface to that BD; | IRB interface to that BD. | |||
| * if S is not on a BD that is attached to the PE, the IIF is the | * If S is not on a BD that is attached to the PE, the IIF is the | |||
| PE's IRB interface to the SBD. | PE's IRB interface to the SBD. | |||
| When the PE creates such an (S,G) state, it MUST originate (if it | When the PE creates such an (S,G) state, it MUST originate (if it | |||
| hasn't already) an SBD-SMET route for (S,G). This will cause it to | hasn't already) an SBD-SMET route for (S,G). This will cause it to | |||
| pull the (S,G) traffic via layer 2. When the traffic arrives over an | pull the (S,G) traffic via Layer 2. When the traffic arrives over an | |||
| EVPN tunnel, it gets sent up an IRB interface where the layer 3 | EVPN tunnel, it gets sent up an IRB interface where the Layer 3 | |||
| multicast routing determines the packet's disposition. The SBD-SMET | multicast routing determines the packet's disposition. The SBD-SMET | |||
| route is withdrawn when the (S,G) state no longer exists (unless | route is withdrawn when the (S,G) state no longer exists (unless | |||
| there is some other reason for not withdrawing it). | there is some other reason for not withdrawing it). | |||
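A minimal sketch of the IIF rule and the SBD-SMET origination described above, assuming the PEG already knows which sources are inside the Tenant Domain. The `Peg` class, its fields, and the `irb-<BD>` interface naming are hypothetical and serve only to make the decision explicit.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class Peg:
    """Hypothetical model of a PEG's view of one Tenant Domain."""
    attached_bds: Set[str]
    source_bd: Dict[str, str]  # source IP -> BD, for sources inside the Tenant Domain
    sbd_smet_routes: Set[Tuple[str, str]] = field(default_factory=set)

    def on_external_join(self, s: str, g: str) -> Optional[str]:
        """Return the IIF for the (S,G) state, or None if S is outside the domain."""
        bd = self.source_bd.get(s)
        if bd is None:
            # S is not within the Tenant Domain: normal Layer 3 multicast routing applies.
            return None
        # Originate an SBD-SMET route for (S,G) if not already done (set add is idempotent).
        self.sbd_smet_routes.add((s, g))
        # IIF is the IRB interface to S's BD if that BD is attached, else the SBD IRB interface.
        return f"irb-{bd}" if bd in self.attached_bds else "irb-SBD"


peg = Peg(attached_bds={"BD1"}, source_bd={"192.0.2.10": "BD1", "192.0.2.20": "BD2"})
iif_local = peg.on_external_join("192.0.2.10", "232.1.1.1")
iif_remote = peg.on_external_join("192.0.2.20", "232.1.1.1")
```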
| If there are no tenant multicast routers within the EVPN tenant | If there are no tenant multicast routers within the EVPN Tenant | |||
| domain, there cannot be an RP in the Tenant Domain, so a PEG does not | Domain, there cannot be an RP in the Tenant Domain, so a PEG does not | |||
| have to handle externally arriving PIM Join(*,G) messages. | have to handle externally arriving PIM Join (*,G) messages. | |||
| The PEG DR for a particular BD MUST act as a First Hop Router for | The PEG DR for a particular BD MUST act as a First Hop Router for | |||
| that BD. It will examine all (S,G) traffic on the BD, and whenever G | that BD. It will examine all (S,G) traffic on the BD, and whenever G | |||
| is an ASM group, the PEG DR will send Register messages to the RP for | is an ASM group, the PEG DR will send Register messages to the RP for | |||
| G. This means that the PEG DR will need to pull all the (S,G) | G. This means that the PEG DR will need to pull all the (S,G) | |||
| traffic originating on a given BD, by originating an SMET (*,*) route | traffic originating on a given BD by originating a SMET (*,*) route | |||
| for that BD. If a PEG DR is the DR for all the BDs, in SHOULD | for that BD. If a PEG DR is the DR for all the BDs, it SHOULD | |||
| originate just an SBD-SMET (*,*) route rather than an SMET (*,*) | originate just an SBD-SMET (*,*) route rather than a SMET (*,*) route | |||
| route for each BD. | for each BD. | |||
| The rules for exporting IP routes to multicast sources are the same | The rules for exporting IP routes to multicast sources are the same | |||
| as those specified for MEGs in Section 6.1.2.2, except that the | as those specified for MEGs in Section 6.1.2.2, except that the | |||
| exported routes will be IP routes rather than VPN-IP routes, and it | exported routes will be IP routes rather than VPN-IP routes, and it | |||
| is not necessary to attach the VRF Route Import EC or the Source AS | is not necessary to attach the VRF Route Import EC or the Source AS | |||
| EC. | EC. | |||
| When a source is on a multi-homed segment, the same issue discussed | When a source is on a multihomed segment, the same issue discussed in | |||
| in Section 6.1.2.2.3 exists. Suppose S is on an Ethernet segment, | Section 6.1.2.2.3 exists. Suppose S is on an Ethernet segment, | |||
| belonging to BD1, that is multi-homed to both PE1 and PE2, where PE1 | belonging to BD1, that is multihomed to both PE1 and PE2, where PE1 | |||
| is a PEG. And suppose that IP multicast traffic from S to G travels | is a PEG. And suppose that IP multicast traffic from S to G travels | |||
| over the AC that attaches the segment to PE2. If PE1 receives an | over the AC that attaches the segment to PE2. If PE1 receives an | |||
| external PIM Join (S,G) message, it MUST originate an SMET route for | external PIM Join (S,G) message, it MUST originate a SMET route for | |||
| (S,G). Normal OISM procedures will cause PE2 to send the (S,G) | (S,G). Normal OISM procedures will cause PE2 to send the (S,G) | |||
| traffic to PE1 on an EVPN IP multicast tunnel. Normal OISM | traffic to PE1 on an EVPN IP multicast tunnel. Normal OISM | |||
| procedures will also cause PE1 to send the (S,G) traffic up its BD1 | procedures will also cause PE1 to send the (S,G) traffic up its BD1 | |||
| IRB interface. Normal PIM procedures will then cause PE1 to forward | IRB interface. Normal PIM procedures will then cause PE1 to forward | |||
| the traffic along a PIM tree. In this case, the routing is not | the traffic along a PIM tree. In this case, the routing is not | |||
| optimal, but the traffic does flow correctly. | optimal, but the traffic does flow correctly. | |||
| 6.1.4.2. Source Outside EVPN Domain | 6.1.4.2. Source Outside EVPN Domain | |||
| By means of normal OISM procedures, a PEG learns whether there are | By means of normal OISM procedures, a PEG learns whether there are | |||
| receivers in the Tenant Domain that are interested in receiving (*,G) | receivers in the Tenant Domain that are interested in receiving (*,G) | |||
| or (S,G) traffic. The PEG must determine whether S (or the RP for G) | or (S,G) traffic. The PEG must determine whether or not S (or the RP | |||
| is outside the EVPN Tenant Domain. If so, and if there is a receiver | for G) is outside the EVPN Tenant Domain. If so, and if there is a | |||
| on BD1 interested in receiving such traffic, the PEG DR for BD1 is | receiver on BD1 interested in receiving such traffic, the PEG DR for | |||
| responsible for originating a PIM Join(S,G) or Join(*,G) control | BD1 is responsible for originating a PIM Join (S,G) or Join (*,G) | |||
| message. | control message. | |||
| An alternative would be to allow any PEG that is directly attached to | An alternative would be to allow any PEG that is directly attached to | |||
| a receiver to originate the PIM Joins. Then the PEG DR would only | a receiver to originate the PIM Joins. Then, the PEG DR would only | |||
| have to originate PIM Joins on behalf of receivers that are not | have to originate PIM Joins on behalf of receivers that are not | |||
| attached to a PEG. However, if this is done, it is necessary for the | attached to a PEG. However, if this is done, it is necessary for the | |||
| PEGs to run PIM on all their IRB interfaces, so that the PIM Assert | PEGs to run PIM on all their IRB interfaces so that the PIM Assert | |||
| procedures can be used to prevent duplicate delivery to a given BD. | procedures can be used to prevent duplicate delivery to a given BD. | |||
| The IIF for the layer 3 (S,G) or (*,G) state is determined by normal | The IIF for the Layer 3 (S,G) or (*,G) state is determined by normal | |||
| PIM procedures. If a receiver is on BD1, and the PEG DR is attached | PIM procedures. If a receiver is on BD1, and the PEG DR is attached | |||
| to BD1, its IRB interface to BD1 is added to the OIF list. This | to BD1, its IRB interface to BD1 is added to the OIF list. This | |||
| ensures that any receivers locally attached to the PEG DR will | ensures that any receivers locally attached to the PEG DR will | |||
| receive the traffic. If there are receivers attached to other EVPN | receive the traffic. If there are receivers attached to other EVPN | |||
| PEs, then whenever (S,G) traffic from an external source matches a | PEs, then whenever (S,G) traffic from an external source matches a | |||
| (*,G) state, the PEG will create (S,G) state. The IIF will be set to | (*,G) state, the PEG will create (S,G) state. The IIF will be set to | |||
| whatever external interface the traffic is expected to arrive on | whatever external interface the traffic is expected to arrive on | |||
| (copied from the (*,G) state), the OIF list is copied from the (*,G) | (copied from the (*,G) state), the OIF list is copied from the (*,G) | |||
| state, and the SBD IRB interface is added to the OIF list. | state, and the SBD IRB interface is added to the OIF list. | |||
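The (S,G) state creation in the last step can be illustrated as follows. The `McastState` record and the interface names are hypothetical; the sketch only shows that the IIF and OIF list are copied from the matching (*,G) state and that the SBD IRB interface is then added to the OIF list.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class McastState:
    """Hypothetical Layer 3 multicast state entry."""
    iif: str
    oifs: frozenset


def sg_state_from_star_g(star_g: McastState, sbd_irb: str = "irb-SBD") -> McastState:
    # IIF: the external interface, copied from the (*,G) state.
    # OIF list: copied from the (*,G) state, plus the SBD IRB interface so that
    # receivers attached to other EVPN PEs get the traffic.
    return McastState(iif=star_g.iif, oifs=star_g.oifs | {sbd_irb})


star_g = McastState(iif="ext0", oifs=frozenset({"irb-BD1"}))
sg = sg_state_from_star_g(star_g)
```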
| 6.2. Interworking with PIM via an External PIM Router | 6.2. Interworking with PIM via an External PIM Router | |||
| Section 6.1 describes how to use an OISM PE router as the gateway to | Section 6.1 describes how to use an OISM PE router as the gateway to | |||
| a non-EVPN multicast domain, when the EVPN tenant domain is not being | a non-EVPN multicast domain when the EVPN Tenant Domain is not being | |||
| used as an intermediate transit network for multicast. An | used as an intermediate transit network for multicast. An | |||
| alternative approach is to have one or more external PIM routers | alternative approach is to have one or more external PIM routers | |||
| (perhaps operated by a tenant) on one of the BDs of the tenant | (perhaps operated by a tenant) on one of the BDs of the Tenant | |||
| domain. We will refer to this BD as the "gateway BD". | Domain. We will refer to this BD as the "gateway BD". | |||
| In this model: | In this model: | |||
| * The EVPN Tenant Domain is treated as a stub network attached to | * The EVPN Tenant Domain is treated as a stub network attached to | |||
| the external PIM routers. | the external PIM routers. | |||
| * The external PIM routers follow normal PIM procedures, and provide | * The external PIM routers follow normal PIM procedures and provide | |||
| the FHR and LHR functionality for the entire Tenant Domain. | the FHR and LHR functionality for the entire Tenant Domain. | |||
| * The OISM PEs do not run PIM. | * The OISM PEs do not run PIM. | |||
| * There MUST NOT be more than one gateway BD. | * There MUST NOT be more than one gateway BD. | |||
| * If an OISM PE not attached to the gateway BD has interest in a | * If an OISM PE not attached to the gateway BD has interest in a | |||
| given multicast flow, it conveys that interest, following normal | given multicast flow, it conveys that interest, following normal | |||
| OISM procedures, by originating an SBD-SMET route for that flow. | OISM procedures, by originating an SBD-SMET route for that flow. | |||
| * If a PE attached to the gateway BD receives an SBD-SMET, it may | * If a PE attached to the gateway BD receives an SBD-SMET, it may | |||
| need to generate and transmit a corresponding IGMP/MLD Join on one | need to generate and transmit a corresponding IGMP/MLD Join on one | |||
| or more of its ACs. (Procedures for generating an IGMP/MLD Join | or more of its ACs. (Procedures for generating an IGMP/MLD Join | |||
| as a result of receiving an SMET route are given in [RFC9251].) | as a result of receiving a SMET route are given in [RFC9251].) | |||
| The PE MUST know which BD is the Gateway BD and MUST NOT transmit | The PE MUST know which BD is the gateway BD and MUST NOT transmit | |||
| an IGMP/MLD Join to any other BDs. Furthermore, even if a | an IGMP/MLD Join to any other BDs. Furthermore, even if a | |||
| particular AC is part of that BD, the PE SHOULD NOT transmit an | particular AC is part of that BD, the PE SHOULD NOT transmit an | |||
| IGMP/MLD Join on that AC unless there is an external PIM router | IGMP/MLD Join on that AC unless there is an external PIM router | |||
| attached via that AC. | attached via that AC. | |||
| As a result, IGMP/MLD messages will be received by the external | As a result, IGMP/MLD messages will be received by the external | |||
| PIM routers on the gateway BD, and those external PIM routers will | PIM routers on the gateway BD, and those external PIM routers will | |||
| send PIM Join messages externally as required. Traffic for the | send PIM Join messages externally as required. Traffic for the | |||
| given multicast flow will then be received by one of the external | given multicast flow will then be received by one of the external | |||
| PIM routers, and that traffic will be forwarded by that router to | PIM routers, and that traffic will be forwarded by that router to | |||
| the gateway BD. | the gateway BD. | |||
| The normal OISM procedures will then cause the given multicast | The normal OISM procedures will then cause the given multicast | |||
| flow to be tunneled to any PEs of the EVPN Tenant Domain that have | flow to be tunneled to any PEs of the EVPN Tenant Domain that have | |||
| interest in the flow. PEs attached to the gateway BD will see the | interest in the flow. PEs attached to the gateway BD will see the | |||
| flow as originating from the gateway BD and other PEs will see the | flow as originating from the gateway BD, and other PEs will see | |||
| flow as originating from the SBD. | the flow as originating from the SBD. | |||
| * An OISM PE attached to a gateway BD MUST set its layer 2 multicast | * An OISM PE attached to a gateway BD MUST set its Layer 2 multicast | |||
| state to indicate that each AC to the gateway BD has interest in | state to indicate that each AC to the gateway BD has interest in | |||
| all multicast flows. It MUST also originate an SMET route for | all multicast flows. It MUST also originate a SMET route for | |||
| (*,*). The procedures for originating SMET routes are discussed | (*,*). The procedures for originating SMET routes are discussed | |||
| in Section 2.5. | in Section 2.5. | |||
| This will cause the OISM PEs attached to the gateway BD to receive | This will cause the OISM PEs attached to the gateway BD to receive | |||
| all the IP multicast traffic that is sourced within the EVPN | all the IP multicast traffic that is sourced within the EVPN | |||
| tenant domain, and to transmit that traffic to the gateway BD, | Tenant Domain and to transmit that traffic to the gateway BD, | |||
| where the external PIM routers will receive it. This enables the | where the external PIM routers will receive it. This enables the | |||
| external PIM routers to perform FHR functions on behalf of the | external PIM routers to perform FHR functions on behalf of the | |||
| entire Tenant Domain. (Of course, if the gateway BD has a | entire Tenant Domain. (Of course, if the gateway BD has a | |||
| multi-homed segment, only the PE that is the DF for that segment | multihomed segment, only the PE that is the DF for that segment | |||
| will transmit the multicast traffic to the segment.) | will transmit the multicast traffic to the segment.) | |||
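The restriction on where a PE may send a generated IGMP/MLD Join in this model can be sketched as a simple filter. The `AttachmentCircuit` record and its `has_external_pim_router` field are assumptions; how a PE learns that an external PIM router sits behind an AC is not specified here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class AttachmentCircuit:
    """Hypothetical description of one AC on a PE."""
    name: str
    bd: str
    has_external_pim_router: bool


def acs_for_generated_join(acs: Iterable[AttachmentCircuit], gateway_bd: str) -> List[str]:
    # A Join triggered by an SBD-SMET route goes only to the gateway BD, and
    # only on ACs behind which an external PIM router is attached.
    return [
        ac.name
        for ac in acs
        if ac.bd == gateway_bd and ac.has_external_pim_router
    ]


acs = [
    AttachmentCircuit("ac1", "BD-GW", True),
    AttachmentCircuit("ac2", "BD-GW", False),  # gateway BD, but no PIM router
    AttachmentCircuit("ac3", "BD2", True),     # not the gateway BD
]
targets = acs_for_generated_join(acs, "BD-GW")
```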
| 7. Using an EVPN Tenant Domain as an Intermediate (Transit) Network for | 7. Using an EVPN Tenant Domain as an Intermediate (Transit) Network for | |||
| Multicast traffic | Multicast Traffic | |||
| In this section, we consider the scenario where one or more BDs of an | In this section, we consider the scenario where one or more BDs of an | |||
| EVPN Tenant Domain are being used to carry IP multicast traffic for | EVPN Tenant Domain are being used to carry IP multicast traffic for | |||
| which the source and at least one receiver are not part of the tenant | which the source and at least one receiver are not part of the Tenant | |||
| domain. That is, one or more BDs of the Tenant Domain are | Domain. That is, one or more BDs of the Tenant Domain are | |||
| intermediate "links" of a larger multicast tree created by PIM. | intermediate links of a larger multicast tree created by PIM. | |||
| We define a "tenant multicast router" as a multicast router, running | We define a "tenant multicast router" as a multicast router, running | |||
| PIM, that is: | PIM, that: | |||
| 1. attached to one or more BDs of the Tenant Domain, but | 1. is attached to one or more BDs of the Tenant Domain but | |||
| 2. is not an EVPN PE router. | 2. is not an EVPN PE router. | |||
| In order an EVPN Tenant Domain to be used as a transit network for IP | In order for an EVPN Tenant Domain to be used as a transit network | |||
| multicast, one or more of its BDs must have tenant multicast routers, | for IP multicast, one or more of its BDs must have tenant multicast | |||
| and an OISM PE that attaching to such a BD MUST be provisioned to | routers, and an OISM PE attached to such a BD MUST be provisioned to | |||
| enable PIM on its IRB interface to that BD. (This is true even if | enable PIM on its IRB interface to that BD. (This is true even if | |||
| none of the tenant routers is on a segment attached to the PE.) | none of the tenant routers is on a segment attached to the PE.) | |||
| Further, all the OISM PEs (even ones not attached to a BD with tenant | Further, all the OISM PEs (even ones not attached to a BD with tenant | |||
| multicast routers) MUST be provisioned to enable PIM on their SBD IRB | multicast routers) MUST be provisioned to enable PIM on their SBD IRB | |||
| interfaces. | interfaces. | |||
| If PIM is enabled on a particular BD, the DR Selection procedure of | If PIM is enabled on a particular BD, the DR selection procedure of | |||
| Section 6.1.2.4 MUST be replaced by the normal PIM DR Election | Section 6.1.2.4 MUST be replaced by the normal PIM DR Election | |||
| procedure of [RFC7761]. Note that this may result in one of the | procedure of [RFC7761]. Note that this may result in one of the | |||
| tenant routers being selected as the DR, rather than one of the OISM | tenant routers being selected as the DR rather than one of the OISM | |||
| PE routers. In this case, First Hop Router and Last Hop Router | PE routers. In this case, First Hop Router and Last Hop Router | |||
| functionality will not be performed by any of the EVPN PEs. | functionality will not be performed by any of the EVPN PEs. | |||
| A PIM control message on a particular BD is considered to be a | A PIM control message on a particular BD is considered to be a link- | |||
| link-local multicast message, and as such is sent transparently from | local multicast message and, as such, is sent transparently from PE | |||
| PE to PE via the BUM tunnel for that BD. This is true whether the | to PE via the BUM tunnel for that BD. This is true whether the | |||
| control message was received from an AC, or whether it was received | control message was received from an AC or from the local Layer 3 | |||
| from the local layer 3 routing instance via an IRB interface. | routing instance via an IRB interface. | |||
| A PIM Join/Prune message contains three fields that are relevant to | A PIM Join/Prune message contains three fields that are relevant to | |||
| the present discussion: | the present discussion: | |||
| * Upstream Neighbor | * Upstream Neighbor | |||
| * Group Address (G) | * Group Address (G) | |||
| * Source Address (S), omitted in the case of (*,G) Join/Prune | * Source Address (S), omitted in the case of (*,G) Join/Prune | |||
| messages. | messages | |||
| We will generally speak of a PIM Join as a "Join(S,G)" or a | We will generally speak of a PIM Join as a Join (S,G) or a Join (*,G) | |||
| "Join(*,G)" message, and will use the term "Join(X,G)" to mean | message and will use the term "Join (X,G)" to mean either "Join | |||
| "either Join(S,G) or Join(*,G)". In the context of a Join(X,G), we | (S,G)" or "Join (*,G)". In the context of a Join (X,G), we will use | |||
| will use the term "X" to mean "S in the case of (S,G), or G's RP in | the term "X" to mean "S" in the case of (S,G) or "G's RP" in the case | |||
| the case of (*,G)". | of (*,G). | |||
| Suppose BD1 contains two tenant multicast routers, C1 and C2. | Suppose BD1 contains two tenant multicast routers, say C1 and C2. | |||
| Suppose C1 is on a segment attached to PE1, and C2 is on a segment | Suppose C1 is on a segment attached to PE1 and C2 is on a segment | |||
| attached to PE2. When C1 sends a PIM Join(X,G) to BD1, the Upstream | attached to PE2. When C1 sends a PIM Join (X,G) to BD1, the Upstream | |||
| Neighbor field might be set to either PE1, PE2, or C2. C1 chooses | Neighbor field might be set to PE1, PE2, or C2. C1 chooses the | |||
| the Upstream Neighbor based on its unicast routing. Typically, it | Upstream Neighbor based on its unicast routing. Typically, it will | |||
| will choose as the Upstream Neighbor the PIM router on BD1 that is | choose the PIM router on BD1 that is closest (according to the | |||
| "closest" (according to the unicast routing) to X. Note that this | unicast routing) to X as the Upstream Neighbor. Note that this will | |||
| will not necessarily be PE1. PE1 may not even be visible to the | not necessarily be PE1. PE1 may not even be visible to the unicast | |||
| unicast routing algorithm used by the tenant routers. Even if it is, | routing algorithm used by the tenant routers. Even if it is, it is | |||
| it is unlikely to be the PIM router that is closest to X. So we need | unlikely to be the PIM router that is closest to X. So we need to | |||
| to consider the following two cases: | consider the following two cases: | |||
| 1. C1 sends a PIM Join(X,G) to BD1, with PE1 as the Upstream | 1. C1 sends a PIM Join (X,G) to BD1, with PE1 as the Upstream | |||
| Neighbor. | Neighbor. | |||
| PE1's PIM routing instance will see the Join arrive on the | PE1's PIM routing instance will see the Join arrive on the | |||
| BD1 IRB interface. If X is not within the Tenant Domain, PE1 | BD1 IRB interface. If X is not within the Tenant Domain, PE1 | |||
| handles the Join according to normal PIM procedures. This will | handles the Join according to normal PIM procedures. This will | |||
| generally result in PE1 selecting an Upstream Neighbor and | generally result in PE1 selecting an Upstream Neighbor and | |||
| sending it a Join(X,G). | sending it a Join (X,G). | |||
| If X is within the Tenant Domain, but is attached to some other | If X is within the Tenant Domain but is attached to some other | |||
| PE, PE1 sends (if it hasn't already) an SBD-SMET route for (X,G). | PE, PE1 sends (if it hasn't already) an SBD-SMET route for (X,G). | |||
| The IIF of the layer 3 (X,G) state will be the SBD IRB interface, | The IIF of the Layer 3 (X,G) state will be the SBD IRB interface, | |||
| and the OIF list will include the IRB interface to BD1. | and the OIF list will include the IRB interface to BD1. | |||
| The SBD-SMET route will pull the (X,G) traffic to PE1, and the | The SBD-SMET route will pull the (X,G) traffic to PE1, and the | |||
| (X,G) state will result in the (X,G) traffic being forwarded to | (X,G) state will result in the (X,G) traffic being forwarded to | |||
| C1. | C1. | |||
| If X is within the Tenant Domain, but is attached to PE1 itself, | If X is within the Tenant Domain but is attached to PE1 itself, | |||
| no SBD-SMET route is sent. The IIF of the layer 3 (X,G) state | no SBD-SMET route is sent. The IIF of the Layer 3 (X,G) state | |||
| will be the IRB interface to X's BD, and the OIF list will | will be the IRB interface to X's BD, and the OIF list will | |||
| include the IRB interface to BD1. | include the IRB interface to BD1. | |||
| 2. C1 sends a PIM Join(X,G) to BD1, with either PE2 or C2 as the | 2. C1 sends a PIM Join (X,G) to BD1, with either PE2 or C2 as the | |||
| Upstream Neighbor. | Upstream Neighbor. | |||
| PE1's PIM routing instance will see the Join arrive on the | PE1's PIM routing instance will see the Join arrive on the | |||
| BD1 IRB interface. If neither X nor Upstream Neighbor is within | BD1 IRB interface. If neither X nor Upstream Neighbor is within | |||
| the tenant domain, PE1 handles the Join according to normal PIM | the Tenant Domain, PE1 handles the Join according to normal PIM | |||
| procedures. This will NOT result in PE1 sending a Join(X,G). | procedures. This will NOT result in PE1 sending a Join (X,G). | |||
| If either X or Upstream Neighbor is within the Tenant Domain, PE1 | If either X or Upstream Neighbor is within the Tenant Domain, PE1 | |||
| sends (if it hasn't already) an SBD-SMET route for (X,G). The | sends (if it hasn't already) an SBD-SMET route for (X,G). The | |||
| IIF of the layer 3 (X,G) state will be the SBD IRB interface, and | IIF of the Layer 3 (X,G) state will be the SBD IRB interface, and | |||
| the OIF list will include the IRB interface to BD1. | the OIF list will include the IRB interface to BD1. | |||
| The SBD-SMET route will pull the (X,G) traffic to PE1, and the | The SBD-SMET route will pull the (X,G) traffic to PE1, and the | |||
| (X,G) state will result in the (X,G) traffic being forwarded to | (X,G) state will result in the (X,G) traffic being forwarded to | |||
| C1. | C1. | |||
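The decision PE1 makes in case 2 above (a PIM Join (X,G) seen on BD1 whose Upstream Neighbor is PE2 or C2, not PE1) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the boolean inputs, and the returned dictionary are hypothetical, and the check for whether an address lies within the Tenant Domain is assumed to be done elsewhere.

```python
def handle_transit_join(x_in_tenant_domain, upstream_in_tenant_domain):
    """Sketch of PE1's reaction to a Join (X,G) on BD1 that names another
    router as Upstream Neighbor. All names here are hypothetical."""
    if not (x_in_tenant_domain or upstream_in_tenant_domain):
        # Neither X nor the Upstream Neighbor is in the Tenant Domain:
        # normal PIM procedures apply, and PE1 does not send a Join (X,G).
        return {"procedure": "normal-pim", "send_sbd_smet": False}

    # Either X or the Upstream Neighbor is in the Tenant Domain:
    # PE1 originates an SBD-SMET route for (X,G) if it has not already,
    # and installs Layer 3 (X,G) state pulling traffic in via the SBD.
    return {
        "procedure": "oism",
        "send_sbd_smet": True,
        "iif": "SBD-IRB",      # incoming interface: SBD IRB interface
        "oif": ["BD1-IRB"],    # OIF list includes the IRB interface to BD1
    }
```

With this state in place, the SBD-SMET route pulls the (X,G) traffic to PE1, and the (X,G) state forwards it toward C1 on BD1.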
| 8. IANA Considerations | 8. IANA Considerations | |||
| IANA is requested to assign new flags in the "Multicast Flags | IANA has assigned new flags in the "Multicast Flags Extended | |||
| Extended Community Flags" registry. These flags are: | Community" registry under the "Border Gateway Protocol (BGP) Extended | |||
| Communities" registry as shown below. | ||||
| * IPMG | ||||
| * MEG | ||||
| * PEG | ||||
| * OISM SBD | +=====+================+===========+===================+ | |||
| | Bit | Name | Reference | Change Controller | | ||||
| +=====+================+===========+===================+ | ||||
| | 7 | OISM SBD | RFC 9625 | IETF | | ||||
| +-----+----------------+-----------+-------------------+ | ||||
| | 9 | IPMG | RFC 9625 | IETF | | ||||
| +-----+----------------+-----------+-------------------+ | ||||
| | 10 | MEG | RFC 9625 | IETF | | ||||
| +-----+----------------+-----------+-------------------+ | ||||
| | 11 | PEG | RFC 9625 | IETF | | ||||
| +-----+----------------+-----------+-------------------+ | ||||
| | 12 | OISM-supported | RFC 9625 | IETF | | ||||
| +-----+----------------+-----------+-------------------+ | ||||
| * OISM-supported | Table 1: Multicast Flags Extended Community Registry | |||
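As an illustration of how the bit assignments in Table 1 might be used, the sketch below turns each bit number into a mask and decodes a flags value. It assumes a 16-bit Flags field with bit 0 as the most significant bit, which is the usual IETF convention but is not stated in this section; the constant and function names are hypothetical.

```python
# Bit numbers taken from Table 1.
OISM_FLAG_BITS = {
    "OISM SBD": 7,
    "IPMG": 9,
    "MEG": 10,
    "PEG": 11,
    "OISM-supported": 12,
}

# Assumption: 16-bit Flags field, bit 0 is the most significant bit.
FLAGS_WIDTH = 16


def flag_mask(name):
    """Return the integer mask for a named flag under the MSB-0 assumption."""
    return 1 << (FLAGS_WIDTH - 1 - OISM_FLAG_BITS[name])


def decode_flags(value):
    """List the names of the Table 1 flags that are set in 'value'."""
    return [name for name in OISM_FLAG_BITS if value & flag_mask(name)]
```

If the field were numbered from the least significant bit instead, only `flag_mask` would need to change.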
| 9. Security Considerations | 9. Security Considerations | |||
| This document uses protocols and procedures defined in the normative | This document uses protocols and procedures defined in the normative | |||
| references, and inherits the security considerations of those | references and inherits the security considerations of those | |||
| references. | references. | |||
| This document adds flags or Extended Communities (ECs) to a number of | This document adds flags or Extended Communities (ECs) to a number of | |||
| BGP routes, in order to signal that particular nodes support the | BGP routes in order to signal that particular nodes support the OISM, | |||
| OISM, IPMG, MEG, and/or PEG functionalities that are defined in this | IPMG, MEG, and/or PEG functionalities that are defined in this | |||
| document. Incorrect addition, removal, or modification of those | document. Incorrect addition, removal, or modification of those | |||
| flags and/or ECs will cause the procedures defined herein to | flags and/or ECs will cause the procedures defined herein to | |||
| malfunction, in which case loss or diversion of data traffic is | malfunction, in which case loss or diversion of data traffic is | |||
| possible. Implementations should provide tools to easily debug | possible. Implementations should provide tools to easily debug | |||
| configuration mistakes that cause the signaling of incorrect | configuration mistakes that cause the signaling of incorrect | |||
| information. | information. | |||
| The interworking with non-OISM networks described in sections 5 and | The interworking with non-OISM networks described in Sections 5 and 6 | |||
| 6, require gateway functions in multiple redundant PEs, among which | requires gateway functions in multiple redundant PEs, among which one | |||
| one of them is elected as Designated Forwarder for a given BD (or | of them is elected as Designated Forwarder for a given BD (or SBD). | |||
| SBD). The election of the MEG or PEG Designated Router, as well as | The election of the MEG or PEG DR, as well as the IPMG Designated | |||
| the IPMG Designated Forwarder makes use of the RFC8584 Designated | Forwarder, makes use of the Designated Forwarder election procedures | |||
| Forwarder election procedures. An attacker with access to one of | [RFC8584]. An attacker with access to one of these Gateways may | |||
| these Gateways may influence such election and therefore modify the | influence such election and therefore modify the forwarding of | |||
| forwarding of multicast traffic between the OISM network and the | multicast traffic between the OISM network and the external domain. | |||
| external domain. The operator should be especially careful with the | The operator should be especially careful with the protection of | |||
| protection of these gateways by making sure the management interfaces | these gateways by making sure the management interfaces to access the | |||
| to access the gateways are only allowed to authorized operators. | gateways are only allowed to authorized operators. | |||
| The document also introduces the concept of per-Tenant-Domain | The document also introduces the concept of per-Tenant-Domain | |||
| dissemination for the SMET routes, as opposed to per-BD distribution | dissemination for the SMET routes, as opposed to per-BD distribution | |||
| in [RFC9251]. That is, e.g., an SMET route triggered by the | in [RFC9251]. That is, a SMET route triggered by the reception of an | |||
| reception of an IGMP/MLD join in BD-1 on PE1, needs to be distributed | IGMP/MLD Join in BD-1 on PE1 needs to be distributed and imported by | |||
| and imported by all PEs of the Tenant Domain, even to those PEs that | all PEs of the Tenant Domain, even to those PEs that are not attached | |||
| are not attached to BD-1. This means that an attacker with access to | to BD-1. This means that an attacker with access to only one BD in a | |||
| only one BD in a PE of the Tenant Domain, might force the | PE of the Tenant Domain might force the advertisement of SMET routes | |||
| advertisement of SMET routes and impact the resources of all the PEs | and impact the resources of all the PEs of the Tenant Domain, as | |||
| of the Tenant Domain, as opposed to only the PEs of that particular | opposed to only the PEs of that particular BD (as in [RFC9251]). The | |||
| BD (as in RFC9251). The implementation should provide ways to | implementation should provide ways to filter/control the client IGMP/ | |||
| filter/control the client IGMP/MLD reports that are received by the | MLD reports that are received by the attached hosts. | |||
| attached hosts. | ||||
| 10. Acknowledgements | ||||
| The authors thank Vikram Nagarajan and Princy Elizabeth for their | ||||
| work on Section 6.2 and Section 3.2.3.1. The authors also benefited | ||||
| tremendously from discussions with Aldrin Isaac on EVPN multicast | ||||
| optimizations. | ||||
| 11. References | ||||
| 11.1. Normative References | ||||
| [I-D.ietf-bess-evpn-bum-procedure-updates] | 10. References | |||
| Zhang, Z. J., Lin, W., Rabadan, J., Patel, K., and A. | ||||
| Sajassi, "Updates on EVPN BUM Procedures", Work in | ||||
| Progress, Internet-Draft, draft-ietf-bess-evpn-bum- | ||||
| procedure-updates-14, 18 November 2021, | ||||
| <https://datatracker.ietf.org/doc/html/draft-ietf-bess- | ||||
| evpn-bum-procedure-updates-14>. | ||||
| [I-D.ietf-bess-evpn-optimized-ir] | 10.1. Normative References | |||
| Rabadan, J., Sathappan, S., Lin, W., Katiyar, M., and A. | ||||
| Sajassi, "Optimized Ingress Replication Solution for | ||||
| Ethernet VPN (EVPN)", Work in Progress, Internet-Draft, | ||||
| draft-ietf-bess-evpn-optimized-ir-12, 25 January 2022, | ||||
| <https://datatracker.ietf.org/doc/html/draft-ietf-bess- | ||||
| evpn-optimized-ir-12>. | ||||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| <https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., | [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., | |||
| Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack | Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack | |||
| Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001, | Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001, | |||
| <https://www.rfc-editor.org/info/rfc3032>. | <https://www.rfc-editor.org/info/rfc3032>. | |||
| skipping to change at page 72, line 16 ¶ | skipping to change at line 3303 ¶ | |||
| A. Sajassi, "IP Prefix Advertisement in Ethernet VPN | A. Sajassi, "IP Prefix Advertisement in Ethernet VPN | |||
| (EVPN)", RFC 9136, DOI 10.17487/RFC9136, October 2021, | (EVPN)", RFC 9136, DOI 10.17487/RFC9136, October 2021, | |||
| <https://www.rfc-editor.org/info/rfc9136>. | <https://www.rfc-editor.org/info/rfc9136>. | |||
| [RFC9251] Sajassi, A., Thoria, S., Mishra, M., Patel, K., Drake, J., | [RFC9251] Sajassi, A., Thoria, S., Mishra, M., Patel, K., Drake, J., | |||
| and W. Lin, "Internet Group Management Protocol (IGMP) and | and W. Lin, "Internet Group Management Protocol (IGMP) and | |||
| Multicast Listener Discovery (MLD) Proxies for Ethernet | Multicast Listener Discovery (MLD) Proxies for Ethernet | |||
| VPN (EVPN)", RFC 9251, DOI 10.17487/RFC9251, June 2022, | VPN (EVPN)", RFC 9251, DOI 10.17487/RFC9251, June 2022, | |||
| <https://www.rfc-editor.org/info/rfc9251>. | <https://www.rfc-editor.org/info/rfc9251>. | |||
| 11.2. Informative References | [RFC9572] Zhang, Z., Lin, W., Rabadan, J., Patel, K., and A. | |||
| Sajassi, "Updates to EVPN Broadcast, Unknown Unicast, or | ||||
| Multicast (BUM) Procedures", RFC 9572, | ||||
| DOI 10.17487/RFC9572, May 2024, | ||||
| <https://www.rfc-editor.org/info/rfc9572>. | ||||
| [I-D.ietf-bess-evpn-pref-df] | [RFC9574] Rabadan, J., Ed., Sathappan, S., Lin, W., Katiyar, M., and | |||
| Rabadan, J., Sathappan, S., Lin, W., Drake, J., and A. | A. Sajassi, "Optimized Ingress Replication Solution for | |||
| Ethernet VPNs (EVPNs)", RFC 9574, DOI 10.17487/RFC9574, | ||||
| May 2024, <https://www.rfc-editor.org/info/rfc9574>. | ||||
| 10.2. Informative References | ||||
| [EVPN-DF] Rabadan, J., Sathappan, S., Lin, W., Drake, J., and A. | ||||
| Sajassi, "Preference-based EVPN DF Election", Work in | Sajassi, "Preference-based EVPN DF Election", Work in | |||
| Progress, Internet-Draft, draft-ietf-bess-evpn-pref-df-13, | Progress, Internet-Draft, draft-ietf-bess-evpn-pref-df-13, | |||
| 9 October 2023, <https://datatracker.ietf.org/doc/html/ | 9 October 2023, <https://datatracker.ietf.org/doc/html/ | |||
| draft-ietf-bess-evpn-pref-df-13>. | draft-ietf-bess-evpn-pref-df-13>. | |||
| [I-D.ietf-bier-evpn] | ||||
| Zhang, Z. J., Przygienda, T., Sajassi, A., and J. Rabadan, | ||||
| "EVPN BUM Using BIER", Work in Progress, Internet-Draft, | ||||
| draft-ietf-bier-evpn-14, 2 January 2024, | ||||
| <https://datatracker.ietf.org/doc/html/draft-ietf-bier- | ||||
| evpn-14>. | ||||
| [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private | [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private | |||
| Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February | Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February | |||
| 2006, <https://www.rfc-editor.org/info/rfc4364>. | 2006, <https://www.rfc-editor.org/info/rfc4364>. | |||
| [RFC4541] Christensen, M., Kimball, K., and F. Solensky, | [RFC4541] Christensen, M., Kimball, K., and F. Solensky, | |||
| "Considerations for Internet Group Management Protocol | "Considerations for Internet Group Management Protocol | |||
| (IGMP) and Multicast Listener Discovery (MLD) Snooping | (IGMP) and Multicast Listener Discovery (MLD) Snooping | |||
| Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006, | Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006, | |||
| <https://www.rfc-editor.org/info/rfc4541>. | <https://www.rfc-editor.org/info/rfc4541>. | |||
| skipping to change at page 73, line 28 ¶ | skipping to change at line 3364 ¶ | |||
| Multicast - Sparse Mode (PIM-SM): Protocol Specification | Multicast - Sparse Mode (PIM-SM): Protocol Specification | |||
| (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March | (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March | |||
| 2016, <https://www.rfc-editor.org/info/rfc7761>. | 2016, <https://www.rfc-editor.org/info/rfc7761>. | |||
| [RFC8296] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., | [RFC8296] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., | |||
| Tantsura, J., Aldrin, S., and I. Meilik, "Encapsulation | Tantsura, J., Aldrin, S., and I. Meilik, "Encapsulation | |||
| for Bit Index Explicit Replication (BIER) in MPLS and Non- | for Bit Index Explicit Replication (BIER) in MPLS and Non- | |||
| MPLS Networks", RFC 8296, DOI 10.17487/RFC8296, January | MPLS Networks", RFC 8296, DOI 10.17487/RFC8296, January | |||
| 2018, <https://www.rfc-editor.org/info/rfc8296>. | 2018, <https://www.rfc-editor.org/info/rfc8296>. | |||
| [RFC9624] Zhang, Z., Przygienda, T., Sajassi, A., and J. Rabadan, | ||||
| "EVPN Broadcast, Unknown Unicast, or Multicast (BUM) Using | ||||
| Bit Index Explicit Replication (BIER)", RFC 9624, | ||||
| DOI 10.17487/RFC9624, August 2024, | ||||
| <https://www.rfc-editor.org/info/rfc9624>. | ||||
| Appendix A. Integrated Routing and Bridging | Appendix A. Integrated Routing and Bridging | |||
| This Appendix provides a short tutorial on the interaction of routing | This appendix provides a short tutorial on the interaction of routing | |||
| and bridging. First it shows the traditional model, where bridging | and bridging. First, it shows a model, where bridging and routing | |||
| and routing are performed in separate devices. Then it shows the | are performed in separate devices. Then, it shows the model | |||
| model specified in [RFC9135], where a single device contains both | specified in [RFC9135], where a single device contains both routing | |||
| routing and bridging functions. The latter model is presupposed in | and bridging functions. The latter model is presupposed in the body | |||
| the body of this document. | of this document. | |||
| Figure 1 shows a "traditional" router that only does routing and has | Figure 2 shows the model where a router only does routing and has no | |||
| no L2 bridging capabilities. There are two LANs, LAN1 and LAN2. | L2 bridging capabilities. There are two LANs: LAN1 and LAN2. LAN1 | |||
| LAN1 is realized by switch1, LAN2 by switch2. The router has an | is realized by switch1, and LAN2 is realized by switch2. The router | |||
| interface, "lan1" that attaches to LAN1 (via switch1) and an | has an interface, lan1, that attaches to LAN1 (via switch1) and an | |||
| interface "lan2" that attachs to LAN2 (via switch2). Each intreface | interface, lan2, that attaches to LAN2 (via switch2). Each interface | |||
| is configured, as an IP interface, with an IP address and a subnet | is configured, as an IP interface, with an IP address and a subnet | |||
| mask. | mask. | |||
| +-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| | | lan1| |lan2 | | | | | lan1| |lan2 | | | |||
| H1 -----+Switch1+--------+ Router1+--------+Switch2+------H3 | H1 -----+Switch1+--------+ Router1+--------+Switch2+------H3 | |||
| | | | | | | | | | | | | | | |||
| H2 -----| | | | | | | H2 -----| | | | | | | |||
| +-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| |_________________| |__________________| | |_________________| |__________________| | |||
| LAN1 LAN2 | LAN1 LAN2 | |||
| Figure 1: Conventional Router with LAN Interfaces | Figure 2: Conventional Router with LAN Interfaces | |||
| IP traffic (unicast or multicast) that remains within a single subnet | IP traffic (unicast or multicast) that remains within a single subnet | |||
| never reaches the router. For instance, if H1 emits an Ethernet | never reaches the router. For instance, if H1 emits an Ethernet | |||
| frame with H2's MAC address in the Ethernet destination address | frame with H2's MAC address in the Ethernet Destination Address | |||
| field, the frame will go from H1 to Switch1 to H2, without ever | field, the frame will go from H1 to Switch1 to H2 without ever | |||
| reaching the router. Since the frame is never seen by a router, the | reaching the router. Since the frame is never seen by a router, the | |||
| IP datagram within the frame remains entirely unchanged, e.g., its | IP datagram within the frame remains entirely unchanged, e.g., its | |||
| TTL is not decremented. The Ethernet Source and Destination MAC | TTL is not decremented. The Ethernet Source and Destination MAC | |||
| addresses are not changed either. | addresses are not changed either. | |||
| If H1 wants to send a unicast IP datagram to H3, which is on a | If H1 wants to send a unicast IP datagram to H3, which is on a | |||
| different subnet, H1 has to be configured with the IP address of a | different subnet, H1 has to be configured with the IP address of a | |||
| "default router". Let's assume that H1 is configured with an IP | default router. Let's assume that H1 is configured with an IP | |||
| address of Router1 as its default router address. H1 compares H3's | address of Router1 as its default router address. H1 compares H3's | |||
| IP address with its own IP address and IP subnet mask, and determines | IP address with its own IP address and IP subnet mask and determines | |||
| that H3 is on a different subnet. So the packet has to be routed. | that H3 is on a different subnet. So the packet has to be routed. | |||
| H1 uses ARP to map Router1's IP address to a MAC address on LAN1. H1 | H1 uses ARP to map Router1's IP address to a MAC address on LAN1. H1 | |||
| then encapsulates the datagram in an Ethernet frame, using router1's | then encapsulates the datagram in an Ethernet frame, using Router1's | |||
| MAC address as the destination MAC address, and sends the frame to | MAC address as the destination MAC address, and sends the frame to | |||
| Router1. | Router1. | |||
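The host-side decision described above (compare the destination with the host's own address and subnet mask, then ARP either for the destination or for the default router) can be sketched with Python's standard ipaddress module. The function name and the example addresses are illustrative assumptions, not taken from the document.

```python
import ipaddress


def arp_target(own_ip, prefix_len, dst_ip, default_router_ip):
    """Return the IP address the host must resolve to a MAC address.

    On-subnet destinations are reached directly; off-subnet destinations
    are sent to the configured default router.
    """
    own_subnet = ipaddress.ip_interface(f"{own_ip}/{prefix_len}").network
    if ipaddress.ip_address(dst_ip) in own_subnet:
        return dst_ip             # same subnet: frame is bridged
    return default_router_ip      # different subnet: frame goes to the router


# Hypothetical addressing: H1 = 10.1.1.10/24, Router1 = 10.1.1.1 on LAN1.
print(arp_target("10.1.1.10", 24, "10.1.1.20", "10.1.1.1"))  # on-subnet (H2)
print(arp_target("10.1.1.10", 24, "10.1.2.30", "10.1.1.1"))  # off-subnet (H3)
```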
| Router1 then receives the frame over its lan1 interface. Router1 | Router1 then receives the frame over its lan1 interface. Router1 | |||
| sees that the frame is addressed to it, so it removes the Ethernet | sees that the frame is addressed to it, so it removes the Ethernet | |||
| encapsulation and processes the IP datagram. The datagram is not | encapsulation and processes the IP datagram. The datagram is not | |||
| addressed to Router1, so it must be forwarded further. Router1 does | addressed to Router1, so it must be forwarded further. Router1 does | |||
| a lookup of the datagram's IP destination field, and determines that | a lookup of the datagram's IP Destination Address field and | |||
| the destination (H3) can be reached via Router1's lan2 interface. | determines that the destination (H3) can be reached via Router1's | |||
| Router1 now performs the IP processing of the datagram: it decrements | lan2 interface. Router1 now performs the IP processing of the | |||
| the IP TTL, adjusts the IP header checksum (if present), may fragment | datagram: it decrements the IP TTL, adjusts the IP header checksum | |||
| the packet is necessary, etc. Then the datagram (or its fragments) | (if present), may fragment the packet as necessary, etc. Then, the | |||
| are encapsulated in an Ethernet header, with Router1's MAC address on | datagram (or its fragments) is encapsulated in an Ethernet header, | |||
| LAN2 as the MAC Source Address, and H3's MAC address on LAN2 (which | with Router1's MAC address on LAN2 as the MAC Source Address and H3's | |||
| Router1 determines via ARP) as the MAC Destination Address. Finally | MAC address on LAN2 (which Router1 determines via ARP) as the | |||
| the packet is sent on the lan2 interface. | Destination MAC Address. Finally, the packet is sent on the lan2 | |||
| interface. | ||||
| If H1 has an IP multicast datagram to send (i.e., an IP datagram | If H1 has an IP multicast datagram to send (i.e., an IP datagram | |||
| whose Destination Address field is an IP Multicast Address), it | whose Destination Address field is an IP Multicast Address), it | |||
| encapsulates it in an Ethernet frame whose MAC Destination Address is | encapsulates it in an Ethernet frame whose Destination MAC Address is | |||
| computed from the IP Destination Address. | computed from the IP Destination Address. | |||
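The text does not spell out how the MAC address is computed. Assuming the standard mappings (for IPv4, the prefix 01:00:5e followed by the low-order 23 bits of the group address; for IPv6, the prefix 33:33 followed by the low-order 32 bits), a minimal sketch looks like this. The function name is hypothetical.

```python
import ipaddress


def multicast_mac(group):
    """Derive the Ethernet destination MAC for an IP multicast group."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IP multicast address")
    if addr.version == 4:
        # 01:00:5e prefix, high bit of the next octet zero, low 23 bits of IP
        mac = (0x01005E << 24) | (int(addr) & 0x7FFFFF)
    else:
        # 33:33 prefix followed by the low 32 bits of the IPv6 address
        mac = (0x3333 << 32) | (int(addr) & 0xFFFFFFFF)
    # Render the 48-bit value as six colon-separated octets
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))
```

Because only 23 bits of an IPv4 group address survive the mapping, several groups share one MAC address; for example, 224.1.1.1 and 239.129.1.1 both map to 01:00:5e:01:01:01.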
| If H2 is a receiver for that multicast address, H2 will receive a | If H2 is a receiver for that multicast address, H2 will receive a | |||
| copy of the frame, unchanged, from H1. The MAC Source Address in the | copy of the frame, unchanged, from H1. The MAC Source Address in the | |||
| Ethernet encapsulation does not change, the IP TTL field does not get | Ethernet encapsulation does not change, the IP TTL field does not get | |||
| decremented, etc. | decremented, etc. | |||
| If H3 is a receiver for that multicast address, the datagram must be | If H3 is a receiver for that multicast address, the datagram must be | |||
| routed to H3. In order for this to happen, Router1 must be | routed to H3. In order for this to happen, Router1 must be | |||
| configured as a multicast router, and it must accept traffic sent to | configured as a multicast router, and it must accept traffic sent to | |||
| Ethernet multicast addresses. Router1 will receive H1's multicast | Ethernet multicast addresses. Router1 will receive H1's multicast | |||
| frame on its lan1 interface, will remove the Ethernet encapsulation, | frame on its lan1 interface, remove the Ethernet encapsulation, and | |||
| and will determine how to dispatch the IP datagram based on Router1's | determine how to dispatch the IP datagram based on Router1's | |||
| multicast forwarding states. If Router1 knows that there is a | multicast forwarding states. If Router1 knows that there is a | |||
| receiver for the multicast datagram on LAN2, it makes a copy of the | receiver for the multicast datagram on LAN2, it makes a copy of the | |||
| datagram, decrements the TTL (and performs any other necessary IP | datagram, decrements the TTL (and performs any other necessary IP | |||
| processing), then encapsulates the datagram in Ethernet frame for | processing), and then encapsulates the datagram in the Ethernet frame | |||
| LAN2. The MAC Source Address for this frame will be Router1's MAC | for LAN2. The MAC Source Address for this frame will be Router1's | |||
| Source Address on LAN2. The MAC Destination Address is computed from | MAC Source Address on LAN2. The Destination MAC Address is computed | |||
| the IP Destination Address. Finally, the frame is sent on Router1's | from the IP Destination Address. Finally, the frame is sent on | |||
| LAN2 interface. | Router1's LAN2 interface. | |||
| Figure 2 shows an Integrated Router/Bridge that supports the routing/ | Figure 3 shows an integrated router/bridge that supports the routing/ | |||
| bridging integration model of [RFC9135]. | bridging integration model of [RFC9135]. | |||
| +------------------------------------------+ | +------------------------------------------+ | |||
| | Integrated Router/Bridge | | | Integrated Router/Bridge | | |||
| +-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| | | IRB1| L3 |IRB2 | | | | | IRB1| L3 |IRB2 | | | |||
| H1 -----+ BD1 +--------+Routing +--------+ BD2 +------H3 | H1 -----+ BD1 +--------+Routing +--------+ BD2 +------H3 | |||
| | | |Instance| | | | | | |Instance| | | | |||
| H2 -----| | | | | | | H2 -----| | | | | | | |||
| +-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| |___________________| |____________________| | |___________________| |____________________| | |||
| LAN1 LAN2 | LAN1 LAN2 | |||
| Figure 2: Integrated Router/Bridge | Figure 3: Integrated Router/Bridge | |||
| In Figure 2, a single device consists of one or more "L3 Routing | In Figure 3, a single device consists of one or more L3 Routing | |||
| Instances". The routing/forwarding tables of a given routing | Instances. The routing/forwarding tables of a given routing instance | |||
| instance is known as an IP-VRF [RFC9135]. In the context of EVPN, it | is known as an IP-VRF [RFC9135]. In the context of EVPN, it is | |||
| is convenient to think of each routing instance as representing the | convenient to think of each routing instance as representing the | |||
| routing of a particular tenant. Each IP-VRF is attached to one or | routing of a particular tenant. Each IP-VRF is attached to one or | |||
| more interfaces. | more interfaces. | |||
| When several EVPN PEs have a routing instance of the same tenant | When several EVPN PEs have a routing instance of the same Tenant | |||
| domain, those PEs advertise IP routes to the attached hosts. This is | Domain, those PEs advertise IP routes to the attached hosts. This is | |||
| done as specified in [RFC9135]. | done as specified in [RFC9135]. | |||
| The integrated router/bridge shown in Figure 2 also attaches to a | The integrated router/bridge shown in Figure 3 also attaches to a | |||
| number of "Broadcast Domains" (BDs). Each BD performs the functions | number of Broadcast Domains (BDs). Each BD performs the functions | |||
| that are performed by the bridges in Figure 1. To the L3 routing | that are performed by the bridges in Figure 2. To the L3 routing | |||
| instance, each BD appears to be a LAN. The interface attaching a | instance, each BD appears to be a LAN. The interface attaching a | |||
| particular BD to a particular IP-VRF is known as an "IRB Interface". | particular BD to a particular IP-VRF is known as an "IRB interface". | |||
| From the perspective of L3 routing, each BD is a subnet. Thus each | From the perspective of L3 routing, each BD is a subnet. Thus, each | |||
| IRB interface is configured with a MAC address (which is the router's | IRB interface is configured with a MAC address (which is the router's | |||
| MAC address on the corresponding LAN), as well as an IP address and | MAC address on the corresponding LAN), as well as an IP address and | |||
| subnet mask. | subnet mask. | |||
| The integrated router/bridge shown in Figure 2 may have multiple ACs | The integrated router/bridge shown in Figure 3 may have multiple ACs | |||
| to each BD. These ACs are visible only to the bridging function, not | to each BD. These ACs are visible only to the bridging function, not | |||
| to the routing instance. To the L3 routing instance, there is just | to the routing instance. To the L3 routing instance, there is just | |||
| one "interface" to each BD. | one interface to each BD. | |||
| If the L3 routing instance represents the IP routing of a particular | If the L3 routing instance represents the IP routing of a particular | |||
| tenant, the BDs attached to that routing instance are BDs belonging | tenant, the BDs attached to that routing instance are BDs belonging | |||
| to that same tenant. | to that same tenant. | |||
| Bridging and routing now proceed exactly as in the case of Figure 1, | Bridging and routing now proceed exactly as in the case of Figure 2, | |||
| except that BD1 replaces Switch1, BD2 replaces Switch2, interface | except that BD1 replaces Switch1, BD2 replaces Switch2, interface | |||
| IRB1 replaces interface lan1, and interface IRB2 replaces interface | IRB1 replaces interface lan1, and interface IRB2 replaces interface | |||
| lan2. | lan2. | |||
| It is important to understand that an IRB interface connects an L3 | It is important to understand that an IRB interface connects an L3 | |||
| routing instance to a BD, NOT to a "MAC-VRF". (See [RFC7432] for the | routing instance to a BD, NOT to a MAC-VRF (see [RFC7432] for the | |||
| definition of "MAC-VRF".) A MAC-VRF may contain several BDs, as long | definition of MAC-VRF). A MAC-VRF may contain several BDs, as long | |||
| as no MAC address appears in more than one BD. From the perspective | as no MAC address appears in more than one BD. From the perspective | |||
| of the L3 routing instance, each individual BD is an individual IP | of the L3 routing instance, each individual BD is an individual IP | |||
| subnet; whether each BD has its own MAC-VRF or not is irrelevant to | subnet; whether or not each BD has its own MAC-VRF is irrelevant to | |||
| the L3 routing instance. | the L3 routing instance. | |||
| Figure 3 illustrates IRB when a pair of BDs (subnets) are attached to | Figure 4 illustrates IRB when a pair of BDs (subnets) are attached to | |||
| two different PE routers. In this example, each BD has two segments, | two different PE routers. In this example, each BD has two segments, | |||
| and one segment of each BD is attached to one PE router. | and one segment of each BD is attached to one PE router. | |||
| +------------------------------------------+ | +------------------------------------------+ | |||
| | Integrated Router/Bridges | | | Integrated Router/Bridges | | |||
| +-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| | | IRB1| |IRB2 | | | | | IRB1| |IRB2 | | | |||
| H1 -----+ BD1 +--------+ PE1 +--------+ BD2 +------H3 | H1 -----+ BD1 +--------+ PE1 +--------+ BD2 +------H3 | |||
| |(Seg-1)| |(L3 Rtg)| |(Seg-1)| | |(Seg-1)| |(L3 Rtg)| |(Seg-1)| | |||
| skipping to change at page 77, line 25 ¶ | skipping to change at line 3542 ¶ | |||
| LAN1 | LAN2 | LAN1 | LAN2 | |||
| | | | | |||
| | | | | |||
| +-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| | | IRB1| |IRB2 | | | | | IRB1| |IRB2 | | | |||
| H4 -----+ BD1 +--------+ PE2 +--------+ BD2 +------H5 | H4 -----+ BD1 +--------+ PE2 +--------+ BD2 +------H5 | |||
| |(Seg-2)| |(L3 Rtg)| |(Seg-2)| | |(Seg-2)| |(L3 Rtg)| |(Seg-2)| | |||
| | | | | | | | | | | | | | | |||
| +-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| Figure 3: Integrated Router/Bridges with Distributed Subnet | Figure 4: Integrated Router/Bridges with Distributed Subnet | |||
| If H1 needs to send an IP packet to H4, it determines from its IP | If H1 needs to send an IP packet to H4, it determines from its IP | |||
| address and subnet mask that H4 is on the same subnet as H1. | address and subnet mask that H4 is on the same subnet as H1. | |||
| Although H1 and H4 are not attached to the same PE router, EVPN | Although H1 and H4 are not attached to the same PE router, EVPN | |||
| provides Ethernet communication among all hosts that are on the same | provides Ethernet communication among all hosts that are on the same | |||
| BD. H1 thus uses ARP to find H4's MAC address, and sends an Ethernet | BD. Thus, H1 uses ARP to find H4's MAC address and sends an Ethernet | |||
| frame with H4's MAC address in the Destination MAC address field. | frame with H4's MAC address in the Destination MAC Address field. | |||
| The frame is received at PE1, but since the Destination MAC address | The frame is received at PE1, but since the Destination MAC address | |||
| is not PE1's MAC address, PE1 assumes that the frame is to remain on | is not PE1's MAC address, PE1 assumes that the frame is to remain on | |||
| BD1. Therefore the packet inside the frame is NOT decapsulated, and | BD1. Therefore, the packet inside the frame is NOT decapsulated and | |||
| is NOT send up the IRB interface to PE1's routing instance. Rather, | is NOT sent up the IRB interface to PE1's routing instance. Rather, | |||
| standard EVPN intra-subnet procedures (as detailed in [RFC7432]) are | standard EVPN intra-subnet procedures (as detailed in [RFC7432]) are | |||
| used to deliver the frame to PE2, which then sends it to H4. | used to deliver the frame to PE2, which then sends it to H4. | |||
| If H1 needs to send an IP packet to H5, it determines from its IP | If H1 needs to send an IP packet to H5, it determines from its IP | |||
| address and subnet mask that H5 is NOT on the same subnet as H1. | address and subnet mask that H5 is NOT on the same subnet as H1. | |||
| Assuming that H1 has been configured with the IP address of PE1 as | Assuming that H1 has been configured with the IP address of PE1 as | |||
| its default router, H1 sends the packet in an Ethernet frame with | its default router, H1 sends the packet in an Ethernet frame with | |||
| PE1's MAC address in its Destination MAC Address field. PE1 receives | PE1's MAC address in its Destination MAC Address field. PE1 receives | |||
| the frame, and sees that the frame is addressed to it. PE1 thus | the frame and sees that the frame is addressed to it. Thus, PE1 | |||
| sends the frame up its IRB1 interface to the L3 routing instance. | sends the frame up its IRB1 interface to the L3 routing instance. | |||
| Appropriate IP processing is done, e.g., TTL decrement. The L3 | Appropriate IP processing is done, e.g., TTL decrement. The L3 | |||
| routing instance determines that the "next hop" for H5 is PE2, so the | routing instance determines that the next hop for H5 is PE2, so the | |||
| packet is encapsulated (e.g., in MPLS) and sent across the backbone | packet is encapsulated (e.g., in MPLS) and sent across the backbone | |||
| to PE2's routing instance. PE2 will see that the packet's | to PE2's routing instance. PE2 will see that the packet's | |||
| destination, H5, is on BD2 segment-2, and will send the packet down | destination, H5, is on BD2 segment-2 and will send the packet down | |||
| its IRB2 interface. This causes the IP packet to be encapsulated in | its IRB2 interface. This causes the IP packet to be encapsulated in | |||
| an Ethernet frame with PE2's MAC address (on BD2) in the Source | an Ethernet frame with PE2's MAC address (on BD2) in the Source | |||
| Address field and H5's MAC address in the Destination Address field. | Address field and H5's MAC address in the Destination Address field. | |||
| Note that if H1 has an IP packet to send to H3, the forwarding of the | Note that if H1 has an IP packet to send to H3, the forwarding of the | |||
| packet is handled entirely within PE1. PE1's routing instance sees | packet is handled entirely within PE1. PE1's routing instance sees | |||
| the packet arrive on its IRB1 interface, and then transmits the | the packet arrive on its IRB1 interface and then transmits the packet | |||
| packet by sending it down its IRB2 interface. | by sending it down its IRB2 interface. | |||
| Often, all the hosts in a particular Tenant Domain will be | Often, all the hosts in a particular Tenant Domain will be | |||
| provisioned with the same value of the default router IP address. | provisioned with the same value of the default router IP address. | |||
| This IP address can be provisioned as an "anycast address" in all the | This IP address can be provisioned as an anycast address in all the | |||
| EVPN PEs attached to that Tenant Domain. Thus although all hosts are | EVPN PEs attached to that Tenant Domain. Thus, although all hosts | |||
| provisioned with the same "default router address", the actual | are provisioned with the same default router address, the actual | |||
| default router for a given host will be one of the PEs attached to | default router for a given host will be one of the PEs attached to | |||
| the same Ethernet segment as the host. This provisioning method | the same Ethernet segment as the host. This provisioning method | |||
| ensures that IP packets from a given host are handled by the closest | ensures that IP packets from a given host are handled by the closest | |||
| EVPN PE that supports IRB. | EVPN PE that supports IRB. | |||
| In the topology of Figure 3, one could imagine that H1 is configured | In the topology of Figure 4, one could imagine that H1 is configured | |||
| with a default router address that belongs to PE2 but not to PE1. | with a default router address that belongs to PE2 but not to PE1. | |||
| Inter-subnet routing would still work, but IP packets from H1 to H3 | Inter-subnet routing would still work, but IP packets from H1 to H3 | |||
| would then follow the non-optimal path H1-->PE1-->PE2-->PE1-->H3. | would then follow the non-optimal path H1-->PE1-->PE2-->PE1-->H3. | |||
| Sending traffic on this sort of path, where it leaves a router and | Sending traffic on this sort of path, where it leaves a router and | |||
| then comes back to the same router, is sometimes known as | then comes back to the same router, is sometimes known as | |||
| "hairpinning". Similarly, if PE2 supports IRB but PE1 does not, the | "hairpinning". Similarly, if PE2 supports IRB but PE1 does not, the | |||
| same non-optimal path from H1 to H3 would have to be followed. To | same non-optimal path from H1 to H3 would have to be followed. To | |||
| avoid hairpinning, each EVPN PE needs to support IRB. | avoid hairpinning, each EVPN PE needs to support IRB. | |||
| It is worth pointing out the way IRB interfaces interact with | It is worth pointing out the way IRB interfaces interact with | |||
| multicast traffic. Referring again to Figure 3, suppose PE1 and PE2 | multicast traffic. Referring again to Figure 4, suppose PE1 and PE2 | |||
| are functioning as IP multicast routers. Also Suppose that H3 | are functioning as IP multicast routers. Also, suppose that H3 | |||
| transmits a multicast packet, and both H1 and H4 are interested in | transmits a multicast packet and both H1 and H4 are interested in | |||
| receiving that packet. PE1 will receive the packet from H3 via its | receiving that packet. PE1 will receive the packet from H3 via its | |||
| IRB2 interface. The Ethernet encapsulation from BD2 is removed, the | IRB2 interface. The Ethernet encapsulation from BD2 is removed, the | |||
| IP header processing is done, and the packet is then reencapsulated | IP header processing is done, and the packet is then re-encapsulated | |||
| for BD1, with PE1's MAC address in the MAC Source Address field. | for BD1, with PE1's MAC address in the MAC Source Address field. | |||
| Then the packet is sent down the IRB1 interface. Layer 2 procedures | Then, the packet is sent down the IRB1 interface. Layer 2 procedures | |||
| (as defined in [RFC7432] would then be used to deliver a copy of the | (as defined in [RFC7432]) would then be used to deliver a copy of the | |||
| packet locally to H1, and remotely to H4. | packet locally to H1 and remotely to H4. | |||
| Please be aware that his document modifies the semantics, described | Please be aware that this document modifies the semantics, described | |||
| in the previous paragraph, of sending/receiving multicast traffic on | in the previous paragraph, of sending/receiving multicast traffic on | |||
| an IRB interface. This is explained in Section 1.5.1 and subsequent | an IRB interface. This is explained in Section 1.5.1 and subsequent | |||
| sections. | sections. | |||
| Acknowledgements | ||||
| The authors thank Vikram Nagarajan and Princy Elizabeth for their | ||||
| work on Sections 6.2 and 3.2.3.1. The authors also benefited | ||||
| tremendously from discussions with Aldrin Isaac on EVPN multicast | ||||
| optimizations. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Wen Lin | Wen Lin | |||
| Juniper Networks, Inc. | Juniper Networks, Inc. | |||
| 10 Technology Park Drive | 10 Technology Park Drive | |||
| Westford, Massachusetts 01886 | Westford, MA 01886 | |||
| United States | United States of America | |||
| Email: wlin@juniper.net | Email: wlin@juniper.net | |||
| Zhaohui Zhang | Zhaohui Zhang | |||
| Juniper Networks, Inc. | Juniper Networks, Inc. | |||
| 10 Technology Park Drive | 10 Technology Park Drive | |||
| Westford, Massachusetts 01886 | Westford, MA 01886 | |||
| United States | United States of America | |||
| Email: zzhang@juniper.net | Email: zzhang@juniper.net | |||
| John Drake | John Drake | |||
| Juniper Networks, Inc. | Juniper Networks, Inc. | |||
| 1194 N. Mathilda Ave | 1194 N. Mathilda Ave | |||
| Sunnyvale, CA 94089 | Sunnyvale, CA 94089 | |||
| United States | United States of America | |||
| Email: jdrake@juniper.net | Email: jdrake@juniper.net | |||
| Eric C. Rosen (editor) | Eric C. Rosen (editor) | |||
| Juniper Networks, Inc. | Juniper Networks, Inc. | |||
| 10 Technology Park Drive | 10 Technology Park Drive | |||
| Westford, Massachusetts 01886 | Westford, MA 01886 | |||
| United States | United States of America | |||
| Email: erosen52@gmail.com | Email: erosen52@gmail.com | |||
| Jorge Rabadan | Jorge Rabadan | |||
| Nokia | Nokia | |||
| 777 E. Middlefield Road | 777 E. Middlefield Road | |||
| Mountain View, CA 94043 | Mountain View, CA 94043 | |||
| United States | United States of America | |||
| Email: jorge.rabadan@nokia.com | Email: jorge.rabadan@nokia.com | |||
| Ali Sajassi | Ali Sajassi | |||
| Cisco Systems | Cisco Systems | |||
| 170 West Tasman Drive | 170 West Tasman Drive | |||
| San Jose, CA 95134 | San Jose, CA 95134 | |||
| United States | United States of America | |||
| Email: sajassi@cisco.com | Email: sajassi@cisco.com | |||
| End of changes. 552 change blocks. | ||||
| 1362 lines changed or deleted | 1341 lines changed or added | |||