PCE Working Group                                          Shankar Raman
Internet draft                                             Balaji Venkat
Intended Status: Experimental RFC                           Gaurav Raina
Expires: February 2013                                     I.I.T, Madras
                                                          August 1, 2012

   Constructing protection paths for inter-AS, inter-sub-AS P2MP TE-LSPs
              draft-mjsraman-pce-inter-as-p2mp-protect-02

Abstract

   Constructing primary and backup explicit-path Point-to-Multipoint (P2MP) Label Switched Paths (LSPs) is important for providing protection switching in case the primary fails. It is essential that the backup P2MP LSPs do not share risk with any of the links and nodes of the primary path. For inter-AS P2MP TE-LSPs, or for inter-sub-AS P2MP TE-LSPs where BGP confederations are deployed within an AS, such protection switching can be provided by calculating primary and backup multicast distribution trees (that is, P2MP TE-LSPs) that do not intersect each other. In this document we propose a method by which inter-sub-AS P2MP TE-LSPs (and hence inter-AS P2MP TE-LSPs) can be constructed by first finding the AS-level topology of the network in question (be it inter-AS, or inter-sub-AS within a single AS) and then computing the paths so that they do not intersect or, in the worst case, intersect only partially. The proposed scheme is explained with an example, followed by a discussion of its benefits, to multicast in particular.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

Shankar Raman.et.al       Expires February 2013                 [Page 1]
INTERNET DRAFT   Inter-AS/sub-AS disjoint P2MP TE-LSPs       August 2012

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Terminology
   2  Methodology of the proposal
   3  Discussion
     3.1  Discussion of Algorithms to be used
   4  Security Considerations
   5  IANA Considerations
   6  References
     6.1  Normative References
     6.2  Informative References
   Authors' Addresses
1 Introduction

   When P2MP TE-LSPs are constructed to carry multicast traffic across multiple areas within an AS that employs BGP confederations (such an autonomous system may comprise more than one BGP confederation instance), and that multicast traffic is mission critical, it is useful to have backup protection trees (where such trees represent entire P2MP TE-LSPs rooted at the same BGP confederation instance as the primary) or sub-trees that cover certain areas of the primary P2MP TE-LSP.

   The intent is to provide a solution whereby a backup protection P2MP TE-LSP can be laid out so that the set of BGP confederation instances traversed by the primary inter-sub-AS P2MP TE-LSP (except, say, for the egress BGP confederation instances) is not traversed at all by the backup P2MP TE-LSP. If such a totally distinct backup TE-LSP is not possible, an effort is made to construct a partially disjoint backup P2MP TE-LSP that avoids, to the extent possible, those BGP confederation instances that fall in the primary. While this type of solution is available in the unicast world for inter-AS cases, no such solution seems to have been proposed for the multicast world, either for inter-AS P2MP TE-LSPs or for cases within an autonomous system in which BGP confederations are deployed.

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2 Methodology of the proposal

   Assume that a centralized path calculation engine (PCE) is available as a facility in the head end BGP confederation instance of the P2MP TE-LSP.
   This centralized PCE has a backup PCE that does much the same work as the primary PCE in the head end BGP confederation instance, except that it is not contacted for path calculation requests while the active PCE is still up. At the active PCE the following happens.

   The active PCE gathers all routes advertised by its BGP peers, internal and external/internal (a.k.a. EIBGP, used for confederations), and collects from each of these routes the AS path attribute (a well-known mandatory attribute). From these AS path attributes, which we will call strands from now on, a BGP-confederation-instance-level topology of the AS involved (in the inter-AS case, of the Internet) is captured to the extent possible. The strands give us information about sections of a BGP-confederation-instance-level graph, in which each BGP-confederation-instance is a node and each edge is the connectivity between two such confederation instances. Where BGP-confederation-instances are consecutive in a strand, they are assumed to be connected to each other. Note that the information about such strands may come from iBGP, eiBGP, or (in the inter-AS case) MBGP (Multi-protocol BGP, used for multicast), provided it carries AS path information.

   After the BGP-confederation-instance-level topology of the immediate neighborhood of the ingress BGP-confederation-instance, and of the egress BGP-confederation-instances to which the multicast subscribers connect, has been constructed, the P2MP TE-LSP calculation algorithm is run. It selects the optimal paths in that topology to connect the source to the destinations / receivers.
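   The strand-merging step described above can be sketched as follows. This is a minimal illustration and not a normative procedure: the function name build_topology is hypothetical, edges are assumed to be undirected, and the three input strands are the ones listed in Figure 1.0. How the strands are extracted from BGP updates is outside the sketch.

```python
from collections import defaultdict


def build_topology(strands):
    """Merge AS-path strands into an undirected AS-level adjacency map."""
    adjacency = defaultdict(set)
    for strand in strands:
        # Consecutive entries in a strand are assumed to be directly connected.
        for left, right in zip(strand, strand[1:]):
            if left == right:
                continue  # ignore repeated entries (e.g. AS-path prepending)
            adjacency[left].add(right)
            adjacency[right].add(left)
    return {asn: sorted(neighbours) for asn, neighbours in adjacency.items()}


# The three strands of Figure 1.0 (each entry is a BGP-confederation-instance).
strands = [
    ["AS1", "AS2", "AS4"],  # Prefix-A
    ["AS2", "AS3", "AS4"],  # Prefix-B
    ["AS2", "AS5", "AS6"],  # Prefix-C
]
topology = build_topology(strands)
for asn in sorted(topology):
    print(asn, "->", ", ".join(topology[asn]))
```

   Feeding more prefixes and their AS paths into the same routine widens the radius of the discovered topology around the ingress BGP-confederation-instance.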
   It is possible that sources and receivers, which may be distributed across multiple BGP-confederation-instances in the neighborhood, are multihomed, in that they are connected to more than one BGP-confederation-instance. This is of particular interest to the algorithm presented here, because the backup P2MP TE-LSP needs to be spread across links that are disjoint from the primary P2MP TE-LSP to the extent possible, and multi-homing helps a great deal in this case.

   Once the primary has been built and the secondary is to be built, the proposed method takes into account the multi-homing characteristics of the egress BGP-confederation-instances with respect to their receivers (which may be enterprise networks / sites) and builds a totally disjoint P2MP TE-LSP or, if that is not possible, a partially disjoint P2MP TE-LSP. Manual policies may also be stated in a suitable format to avoid certain BGP-confederation-instances when calculating either a totally disjoint or a partially disjoint backup P2MP TE-LSP.

   It is assumed that RSVP-TE would be used to build such primary and secondary P2MP TE-LSPs, be they disjoint or partially disjoint.

   The following example builds a BGP-confederation-instance topology of the immediate neighborhood of the ingress and egress BGP-confederation-instances for a multicast tree that carries one or more streams of multicast traffic. Here the term AS refers to a BGP-confederation-instance.

      Prefix-A: AS-Path Info: AS1 AS2 AS4
      Prefix-B: AS-Path Info: AS2 AS3 AS4
      Prefix-C: AS-Path Info: AS2 AS5 AS6
      and so on.

   Figure 1.0 Strands obtained from a subgraph of the AS having confederations

   Given the above three strands of information we could build out the following graph.

                 AS5---------> AS6
                  |
      AS1 -----> AS2 --------> AS4
                  |            /
                  --------> AS3

   Figure 2.0 Strands put together to form a graph of the AS with confederations

   If we extrapolate this to a number of such prefixes and their respective AS paths, we could end up discovering the BGP-confederation-instance-level topology for a greater radius than is otherwise possible, with the ingress BGP-confederation-instance to which the source is connected as the center.

   Multi-homing links appear as duplicate edges between the receivers / source and their respective egress / ingress BGP-confederation-instances. Building disjoint and partially disjoint P2MP TE-LSPs using RSVP-TE is thus a helpful technique for protecting mission critical multicast traffic on the primary, since it avoids, to the extent possible, fate-sharing on the path that such traffic takes.

   Assume that the following BGP-confederation-instance topology in the vicinity of the sender / senders is computed.

                 +----------------+
                /                 V
               /         +----> (AS2) ------> (AS3)<--(RcvrB)
              /         /            \          |
             /         /              \  +------+
            /         /                \V
   (source/s)--->(AS1)------> (AS5) ------> (AS4)<-(RcvrA)
                  \             /\                   /
                   \ ----------+  \                 /
                    \ /            \               /
                   (AS6)----> (AS7) ------> (AS8)<+

   Figure 3.0 Strands put together to form a graph of the AS with confederations

   In the above diagram the source / sources are multihomed to the same ISP through BGP-confederation-instances AS1 and AS2. Similarly, the two receiver sites RcvrA and RcvrB are each multihomed to two BGP-confederation-instances: RcvrB to AS3 and AS4, and RcvrA to AS4 and AS8.
   Given that the path calculation engine in BGP-confederation-instance AS1 is given this picture as it absorbs the AS paths and builds out the BGP-confederation-instance-level topology, there are at least 2 P2MP TE-LSPs that are disjoint to the maximum extent possible. Note, of course, that the egress BGP-confederation-instance sometimes prevents a totally disjoint P2MP TE-LSP, since two P2MP TE-LSPs that are totally disjoint up to the egress BGP-confederation-instance may need to merge to enter the receiver domains / sites. They are illustrated below.

   The primary P2MP TE-LSP advised could be as illustrated by the dotted lines:

                 +----------------+
                /                 V
               /         +----> (AS2) ------> (AS3)<--(RcvrB)
              /         /            \          |
             /         /              \  +------+
            /         /                \V
   (source/s)...>(AS1)------> (AS5) ------> (AS4)<-(RcvrA)
                  .             .\                   /
                   . ...........  \                 /
                    . .            \               /
                   (AS6).....>(AS7) .......>(AS8)<+

   Figure 4.0 Primary P2MP TE-LSP on the graph of the AS with confederations

   The secondary P2MP TE-LSP advised could be as illustrated by the dotted lines:

                 ..................
                .                 V
               .         +----> (AS2).......> (AS3)<--(RcvrB)
              .         /            .          |
             .         /              .  +------+
            .         /                .V
   (source/s)--->(AS1)------> (AS5) ------> (AS4)<-(RcvrA)
                  \             /\                   /
                   \ ----------+  \                 /
                    \ /            \               /
                   (AS6)----> (AS7) ------> (AS8)<+

   It is important to note that the multicast P2MP TE-LSPs constructed are considered to be stitched sections of unicast paths, and may include topologies reserved for multicast use alone that have been advertised by Multi-protocol BGP just for multicast purposes (in the inter-AS case). The normal mechanism of discovering receivers could be employed.
   Note that while constructing the BGP-confederation-instance-level topology, routes advertised in BGP that carry multicast NLRIs can be preferred, and the corresponding inter-BGP-confederation-instance links can be given a higher priority when constructing the P2MP TE-LSPs.

   In the following paragraphs, please read the term AS to mean a BGP-confederation-instance.

   Note that this applies to centralized schemes of path calculation. What is being calculated is a tree of ASes that forms a P2MP tree, where each node can be conceptualized as an AS and each edge as the border link connecting one AS to another, to carry multicast traffic from several sources located at the head end ingress AS to several receiver nodes connected to egress ASes. We will call this calculated tree a P2MP AS tree. Once the tree is calculated, the PCE in the head end / ingress AS through which the sources connect calculates the intra-AS P2MP path (the literal P2MP TE-LSP within the AS) within that ingress AS and hands off the elements of the P2MP AS tree, in suitable form, to the branch ASes that branch off from the ingress AS. This continues, with the information about the elements of the P2MP AS tree handed off at branches from the PCE of one AS to the next, until the final literal inter-AS P2MP TE-LSP is complete. The backup P2MP AS tree is constructed as well, and the same procedure is followed. This completes the construction of the primary and the backup P2MP AS trees.

   The failover mechanism could be derived from any one of the mechanisms already in existence, or a novel method for failover could be devised. That is outside the scope of this document.

   This method could also be applied to multicast traffic transported through L3VPNs and L2VPNs, through such inter-AS construction of a P2MP AS tree.
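   The hand-off of the P2MP AS tree from the PCE of the ingress AS to the PCEs of the branch ASes, as described earlier in this section, can be pictured with the sketch below. It is only an illustration under stated assumptions: this document does not define the "suitable form" of the hand-off, so the dictionary representation, the function name hand_off, and the example tree are all hypothetical.

```python
def hand_off(as_tree, asn, visits=None):
    """Walk the P2MP AS tree from `asn`, recording the branch ASes that
    the PCE of each AS hands the remaining tree elements to."""
    if visits is None:
        visits = []
    branches = as_tree.get(asn, [])
    # Here the PCE of `asn` would compute its intra-AS P2MP path towards
    # the border links leading to `branches` (not modelled in this sketch).
    visits.append((asn, list(branches)))
    for branch in branches:
        hand_off(as_tree, branch, visits)
    return visits


# Hypothetical P2MP AS tree: each key is an AS, each value its branch ASes.
p2mp_as_tree = {
    "AS1": ["AS2", "AS6"],
    "AS2": ["AS3", "AS4"],
    "AS6": ["AS7"],
    "AS7": ["AS8"],
}
visits = hand_off(p2mp_as_tree, "AS1")
for asn, branches in visits:
    print(asn, "hands off to", branches if branches else "no further AS (egress)")
```

   The same walk would be repeated for the backup P2MP AS tree.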
   This method of constructing inter-AS disjoint P2MP AS trees is also independent of how the egress ASes and the ingress AS are discovered. Such discovery is left to existing mechanisms. The primary input to the proposed method is a list of ingress ASes and their respective egress ASes. The other input to the construction of the P2MP AS tree is the BGP-confederation-instance-level topology (in the inter-AS case, the AS-level topology of the neighborhood).

   It is also possible to apply policies so that the P2MP AS tree is constructed in a partially disjoint fashion. That is, if a policy dictates that a given AS in the path of the P2MP AS tree is to be present in both the primary and the backup P2MP AS trees, the policy is applied at the PCE and the resultant tree contains that AS as the policy dictates. The mechanism can thus produce totally disjoint P2MP AS trees or partially disjoint P2MP AS trees, depending on what policy dictates.

3 Discussion

   A centralized picture of the AS-level topology in the immediate vicinity of the egress and ingress ASes of a multicast tree to be built using a P2MP TE-LSP helps build a backup P2MP TE-LSP without traversing, wherever possible, the ASes in the primary. Backup / protection switching is then a lot more reliable, since fate-sharing is precluded to the maximum extent possible.

   The multi-homing characteristics of the sites / receivers, as well as the multi-homing characteristics of the source towards its ingress PEs, which may be located in different ISP ASes, also help in building a disjoint path that avoids fate-sharing if an edge between two ASes, or an AS itself, goes down.
   Whether Option A, B or C as listed in the inter-AS RFCs is in use, the method we propose can be used to calculate the P2MP AS tree, with the actual construction left to the methods that apply to the respective option.

   Multicast NLRI information could be used to give priority to routes learnt in that fashion, and the P2MP AS tree may preferentially be routed through the ASes listed in the AS path information of such routes. Thus, if incongruent multicast and unicast topologies exist, this method could use the appropriate multicast topology hints, in addition to the unicast topology information, to construct P2MP AS trees that are better suited to the multicast topology.

   AS path segments that are not ordered sequences of ASes representing a path toward a destination may be dropped from consideration while constructing the topology of ASes, either in the immediate neighborhood or in toto.

   This proposal is also applicable to ASes that are independent administrative domains in the literal sense (as against BGP-confederation-instances), as outlined.

3.1 Discussion of Algorithms to be used

   A basic first-cut algorithm for implementing this scheme is to collate the list of ASes that the primary P2MP TE-LSP has passed through. This set of ASes is placed in a set EXCLUDE_ASes, and the algorithm for the secondary P2MP TE-LSP is run with the constraint that the elements / ASes in this set are to be avoided. The set EXCLUDE_ASes is ordered with the branch points in order of traversal from the egress ASes, where the receivers are located, to the source AS. The first check is therefore applied to the elements / ASes at the beginning of the EXCLUDE_ASes set, working backwards from the receivers to the source AS. Overlap occurs only in cases where disjoint paths are not possible.
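   A minimal sketch of the EXCLUDE_ASes constraint follows. It makes several assumptions that are not part of this document: the topology is hypothetical, source and receiver sites are modelled as graph nodes ("SRC", "RCV_A", "RCV_B"), a plain breadth-first search stands in for the path calculation algorithm, and the ordering of EXCLUDE_ASes and the backwards traversal from the receivers are not modelled.

```python
from collections import deque


def bfs_path(graph, src, dst, excluded=frozenset()):
    """Shortest hop-count path from src to dst that avoids excluded ASes."""
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous and neighbour not in excluded:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # no path under the given constraint


def as_tree(graph, src, receivers, excluded=frozenset()):
    """P2MP AS tree expressed as one AS-level path per receiver."""
    return {rcv: bfs_path(graph, src, rcv, excluded) for rcv in receivers}


# Hypothetical topology with a multihomed source and multihomed receivers.
graph = {
    "SRC": ["AS1", "AS2"],
    "AS1": ["SRC", "AS3"],
    "AS2": ["SRC", "AS4"],
    "AS3": ["AS1", "RCV_A", "RCV_B"],
    "AS4": ["AS2", "RCV_A", "RCV_B"],
    "RCV_A": ["AS3", "AS4"],
    "RCV_B": ["AS3", "AS4"],
}
receivers = ["RCV_A", "RCV_B"]

primary = as_tree(graph, "SRC", receivers)
# EXCLUDE_ASes: every AS the primary passes through (end points left out).
exclude_ases = {asn for path in primary.values() for asn in path[1:-1]}
backup = as_tree(graph, "SRC", receivers, exclude_ases)
print("primary:", primary)
print("backup: ", backup)
```

   A receiver whose entry in the backup tree is None cannot be reached under the hard constraint; that is the case in which partial overlap with the primary has to be allowed.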
   Another way is to assign higher weights to the edges in the primary P2MP TE-LSP and simply run the algorithm on the topology again. When the CSPF algorithm encounters edges with high weights, it can be made to avoid those edges and the nodes they connect, thus excluding the ASes in the primary path. The weight that can be assigned to each primary P2MP tree edge is 1/n, where n was the (lower) weight of that edge when the primary P2MP LSP was constructed. All the edges in the graph are assigned 1/n as well. This way the lowest weights of the previous run become the higher weights of the secondary run. Once this is done, one could use graph clipping and rejoining as explained below. This algorithm too can run backwards from the egress ASes to the source AS.

   Another way is to use graph clipping and rejoining without coupling it with the previous algorithm. The primary path ASes are simply clipped, along with their edges, from the graph at the PCE, and the algorithm is run again to build a P2MP TE-LSP covering all receivers. Notably, if the receivers are multi-homed, the primary P2MP TE-LSP ASes can be removed first for such cases. If they are single-homed, it is worthwhile retaining such egress ASes for the second run. The re-run can then try to build the tree; if such an attempt fails, those ASes in the primary P2MP TE-LSP that are necessarily needed in the secondary P2MP tree can be added back incrementally until the secondary P2MP TE-LSP is complete. The CSPF would thus operate on the clipped graph first and then rejoin those ASes without which a secondary cannot be built. This again could be run backwards from the egress AS to the source AS. An intermediary meeting point akin to a Rendezvous Point (RP) can be chosen such that the primary RP AS is disjoint from the secondary RP AS.
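   Graph clipping and rejoining can be sketched as below. This is one possible reading and not a definitive algorithm: the topology is hypothetical, a reachability check stands in for the CSPF run, the function names are illustrative, and the final pruning pass (re-clipping any rejoined AS that turns out not to be needed) is an assumption added here to keep the overlap small.

```python
from collections import deque


def reachable(graph, src, removed):
    """Nodes reachable from src once the `removed` ASes are clipped."""
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen and neighbour not in removed:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def clip_and_rejoin(graph, src, receivers, primary_ases):
    """Return the primary ASes that must be rejoined so that every receiver
    is covered, or None if the receivers are unreachable even unclipped.
    primary_ases is assumed to be ordered from the source side to the egress."""
    def covers_all(removed):
        seen = reachable(graph, src, removed)
        return all(rcv in seen for rcv in receivers)

    clipped = list(primary_ases)
    rejoined = []
    while not covers_all(set(clipped)):
        if not clipped:
            return None
        rejoined.append(clipped.pop())  # rejoin, starting at the egress side
    for asn in list(rejoined):
        # Pruning pass: clip again any rejoined AS that is not really needed.
        if covers_all(set(clipped) | {asn}):
            clipped.append(asn)
            rejoined.remove(asn)
    return rejoined


# Hypothetical topology: RCV_A is multihomed, RCV_B is single-homed to AS3.
graph = {
    "SRC": ["AS1", "AS2"],
    "AS1": ["SRC", "AS3"],
    "AS2": ["SRC", "AS4"],
    "AS3": ["AS1", "AS4", "RCV_A", "RCV_B"],
    "AS4": ["AS2", "AS3", "RCV_A"],
    "RCV_A": ["AS3", "AS4"],
    "RCV_B": ["AS3"],
}
rejoined = clip_and_rejoin(graph, "SRC", ["RCV_A", "RCV_B"], ["AS1", "AS3"])
print("ASes shared with the primary:", rejoined)
```

   In this example the single-homed receiver forces its egress AS back into the secondary, which corresponds to the partially disjoint case; with both receivers multihomed nothing needs to be rejoined.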
   Working backwards in this manner will in most cases produce the optimal result. Further proof of this algorithm will be furnished in future versions of this document.

4 Security Considerations

   When receiving BGP updates from BGP-confederation-instances, the PCE or the head end that is expected to calculate the totally disjoint or partially disjoint P2MP TE-LSP paths MUST ensure the authenticity of the updates received from other BGP-confederation-instances, using existing mechanisms that assure that the updates are not spoofed and so do not mislead the receiver of such updates. No additional security exposure is seen under these circumstances apart from the above.

5 IANA Considerations

   No IANA requirements exist as of the time of writing of this draft.

6 References

6.1 Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC1776]  Crocker, S., "The Address is the Message", RFC 1776, April 1 1995.

   [TRUTHS]   Callon, R., "The Twelve Networking Truths", RFC 1925, April 1 1996.

   [RFC4875]  Aggarwal, R., Ed., Papadimitriou, D., Ed., and S. Yasukawa, Ed., "Extensions to Resource Reservation Protocol - Traffic Engineering (RSVP-TE) for Point-to-Multipoint TE Label Switched Paths (LSPs)", RFC 4875, May 2007.

   [RFC5151]  Farrel, A., Ed., Ayyangar, A., and JP. Vasseur, "Inter-Domain MPLS and GMPLS Traffic Engineering -- Resource Reservation Protocol-Traffic Engineering (RSVP-TE) Extensions", RFC 5151, February 2008.

   [RFC4920]  Farrel, A., Ed., Satyanarayana, A., Iwata, A., Fujita, N., and G. Ash, "Crankback Signaling Extensions for MPLS and GMPLS RSVP-TE", RFC 4920, July 2007.

   [RFC5920]  Fang, L., Ed., "Security Framework for MPLS and GMPLS Networks", RFC 5920, July 2010.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, December 2001.

6.2 Informative References

   [EVILBIT]  Bellovin, S., "The Security Flag in the IPv4 Header", RFC 3514, April 1 2003.

   [RFC5513]  Farrel, A., "IANA Considerations for Three Letter Acronyms", RFC 5513, April 1 2009.

   [RFC5514]  Vyncke, E., "IPv6 over Social Networks", RFC 5514, April 1 2009.

   [RFC4726]  Farrel, A., Vasseur, J.-P., and A. Ayyangar, "A Framework for Inter-Domain Multiprotocol Label Switching Traffic Engineering", RFC 4726, November 2006.

   [RFC4206]  Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) Hierarchy with Generalized Multi-Protocol Label Switching (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005.

   [RFC5150]  Ayyangar, A., Kompella, K., Vasseur, JP., and A. Farrel, "Label Switched Path Stitching with Generalized Multiprotocol Label Switching Traffic Engineering (GMPLS TE)", RFC 5150, February 2008.

   [RFC5331]  Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream Label Assignment and Context-Specific Label Space", RFC 5331, August 2008.

Authors' Addresses

   Shankar Raman
   Department of Computer Science and Engineering
   I.I.T Madras,
   Chennai - 600036
   TamilNadu,
   India.

   EMail: mjsraman@cse.iitm.ac.in

   Balaji Venkat Venkataswami
   Department of Electrical Engineering,
   I.I.T Madras,
   Chennai - 600036,
   TamilNadu,
   India.

   EMail: balajivenkat299@gmail.com

   Prof.Gaurav Raina
   Department of Electrical Engineering,
   I.I.T Madras,
   Chennai - 600036,
   TamilNadu,
   India.

   EMail: gaurav@ee.iitm.ac.in