NVO3 Workgroup                                               Ali Sajassi
INTERNET-DRAFT                                               Samer Salam
Intended Status: Standards Track                             Keyur Patel
                                                                   Cisco
                                                                        
                                                             Nabil Bitar
                                                                 Verizon
                                                                        
                                                          Wim Henderickx
                                                          Alcatel-Lucent
                                                                        
Expires: April 22, 2013                                 October 22, 2012


         A Network Virtualization Overlay Solution using E-VPN 
                   draft-sajassi-nvo3-evpn-overlay-01


Abstract

   This document describes how E-VPN can be used as an NVO solution and
   explores the various tunnel encapsulation options and their impact on
   the E-VPN control-plane and procedures. In particular, the following
   three encapsulation options are analyzed: MPLS over GRE, VXLAN and
   NVGRE.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html


 


Sajassi et al.           Expires April 22, 2013                 [Page 1]

INTERNET DRAFT               E-VPN Overlay              October 22, 2012


Copyright and License Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.



Table of Contents

   1  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.1  Terminology . . . . . . . . . . . . . . . . . . . . . . . .  4
   2  E-VPN Main Features . . . . . . . . . . . . . . . . . . . . . .  5
     2.1 Multi-homed Ethernet Segment Auto-Discovery  . . . . . . . .  5
     2.2 Fast Convergence and Mass Withdraw . . . . . . . . . . . . .  5
     2.3 Split-Horizon  . . . . . . . . . . . . . . . . . . . . . . .  5
     2.4 Aliasing . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     2.5 DF Election  . . . . . . . . . . . . . . . . . . . . . . . .  6
   3 Encapsulation Options for E-VPN Overlays . . . . . . . . . . . .  7
     3.1 MPLS over GRE  . . . . . . . . . . . . . . . . . . . . . . .  7
       3.1.1 Benefits of MPLS over GRE  . . . . . . . . . . . . . . .  7
     3.2 VXLAN/NVGRE Encapsulation  . . . . . . . . . . . . . . . . .  8
       3.2.1 Impact on E-VPN Routes for VXLAN/NVGRE Encapsulation . .  8
       3.2.2 Impact on E-VPN Procedures for VXLAN/NVGRE
             Encapsulation  . . . . . . . . . . . . . . . . . . . . .  9
         3.2.2.1 NVE with No Redundancy . . . . . . . . . . . . . . .  9
         3.2.2.2 NVE with Active/Standby Redundancy . . . . . . . . . 10
         3.2.2.3 NVE with All-Active Redundancy . . . . . . . . . . . 10
       3.2.3 Support for Multicast  . . . . . . . . . . . . . . . . . 13
       3.2.4 Inter-AS Challenges  . . . . . . . . . . . . . . . . . . 13
   4 Comparison between MPLSoGRE and VXLAN/NVGRE Encapsulation  . . . 14
   5  Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . . 15
   6  Security Considerations . . . . . . . . . . . . . . . . . . . . 15
   7  IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 15
   8  References  . . . . . . . . . . . . . . . . . . . . . . . . . . 15
     8.1  Normative References  . . . . . . . . . . . . . . . . . . . 15
     8.2  Informative References  . . . . . . . . . . . . . . . . . . 15
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 16


1  Introduction

   In the context of this document, a Network Virtualization Overlay
   (NVO) is a solution to address the requirements of a multi-tenant
   data center, especially one with virtualized hosts (i.e. Virtual
   Machines or VMs). The key requirements of such a solution as
   described in [Problem-Statement] are:

   - Isolation of network traffic per tenant

   - Support for a large number of tenants (tens or hundreds of
   thousands)

   - Extending L2 connectivity among different VMs belonging to a given
   tenant segment (subnet) across different PODs within a data center or
   between different data centers

   The underlay network for NVO solutions is assumed to provide IP
   connectivity.

   This document describes how E-VPN can be used as an NVO solution and
   explores the various tunnel encapsulation options for E-VPN over IP,
   and their impact on the E-VPN control-plane and procedures. Note that
   the use of E-VPN as an NVO solution does not necessarily mandate that
   the BGP control-plane be running on the NVE. This may not be
   desirable, for example, when the NVE resides on the hypervisor. For
   such scenarios, it is still possible to leverage the E-VPN solution
   by using XMPP, or alternative mechanisms, to extend the control-plane
   to the NVE as discussed in [L3VPN-ENDSYSTEMS].

   The possible encapsulation options for E-VPN overlays that are
   analyzed in this document are:

   - MPLS over GRE

   - VXLAN and NVGRE

   Before describing the different encapsulation options for E-VPN over
   IP, it is important to highlight the main features of the E-VPN
   solution, how those features are currently supported, and any impact
   that the encapsulation may have on them.

1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [KEYWORDS].


2  E-VPN Main Features

   This section recaps the main features of E-VPN in order to highlight
   their encapsulation dependencies. It describes the features and
   functions only at a high level. For more details, the reader should
   refer to [E-VPN].

2.1 Multi-homed Ethernet Segment Auto-Discovery

   E-VPN NV Edge devices (NVEs) connected to the same Ethernet segment
   (e.g. server) can automatically discover each other with minimal to
   no configuration through the exchange of BGP routes.

2.2 Fast Convergence and Mass Withdraw

   E-VPN defines a mechanism to efficiently and quickly signal, to
   remote NVEs, the need to update their forwarding tables upon the
   occurrence of a failure in connectivity to an Ethernet segment. This
   is done by having each NVE advertise an Ethernet A-D Route per
   Ethernet segment for each locally attached segment. Upon a failure in
   connectivity to the attached segment, the NVE withdraws the
   corresponding Ethernet A-D route. This triggers all NVEs that receive
   the withdrawal to update their next-hop adjacencies for all MAC
   addresses associated with the Ethernet segment in question. If no
   other NVE had advertised an Ethernet A-D route for the same segment,
   then the NVE that received the withdrawal simply invalidates the MAC
   entries for that segment. Otherwise, the NVE updates the next-hop
   adjacencies to point to the backup NVE(s).
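
   The mass-withdraw behavior described above can be illustrated with
   a minimal sketch. The class and method names are hypothetical and
   are not defined by [E-VPN]; routes are reduced to plain (ESI, NVE)
   and (MAC, ESI) pairs, and BGP itself is not modeled.

```python
from collections import defaultdict


class RemoteNveState:
    """Hypothetical forwarding state kept by a remote NVE (illustration only)."""

    def __init__(self):
        # ESI -> set of NVEs that advertised an Ethernet A-D route for it
        self.esi_to_nves = defaultdict(set)
        # MAC -> ESI, as learned from MAC advertisement routes
        self.mac_to_esi = {}

    def on_ad_route(self, esi, nve):
        self.esi_to_nves[esi].add(nve)

    def on_mac_route(self, mac, esi):
        self.mac_to_esi[mac] = esi

    def on_ad_withdraw(self, esi, nve):
        """One withdrawal updates the adjacencies of every MAC on the segment."""
        self.esi_to_nves[esi].discard(nve)
        if not self.esi_to_nves[esi]:
            # No other NVE advertised the segment: invalidate its MAC entries.
            self.mac_to_esi = {
                m: e for m, e in self.mac_to_esi.items() if e != esi
            }

    def next_hops(self, mac):
        """Return the NVEs currently usable to reach the given MAC."""
        esi = self.mac_to_esi.get(mac)
        return set() if esi is None else set(self.esi_to_nves[esi])
```

   Note that the per-MAC entries are never rewritten individually on a
   withdrawal; the adjacency change is made once per Ethernet segment.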

2.3 Split-Horizon

   Consider a station that is multi-homed to two or more NVEs on an
   Ethernet segment ES1, with all-active redundancy. If the station
   sends a multicast, broadcast or unknown unicast packet to a
   particular NVE, say NE1, then NE1 will forward that packet to all or
   a subset of the other NVEs in the E-VPN instance. In this case, the
   NVEs, other than NE1, that the station is multi-homed to MUST drop
   the packet and not forward it back to the station. This is referred
   to as "split horizon" filtering. In order to achieve this split
   horizon function, every multicast, broadcast or unknown unicast
   packet is encapsulated with an MPLS label that identifies the
   Ethernet segment of origin (i.e., the segment from which the frame
   entered the E-VPN network). This label is referred to as the ESI MPLS
   label, and is distributed using the "Ethernet A-D route per Ethernet
   Segment". This route is imported by the PEs connected to the Ethernet
   Segment and also by the PEs that have at least one E-VPN instance in
   common with the Ethernet Segment in the route. The disposition PEs
   rely on the value of the ESI MPLS label to determine whether or not a
   flooded frame is allowed to egress a specific Ethernet segment.
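
   As a rough illustration of the egress check, the following sketch
   assumes that each local Ethernet segment is associated with exactly
   one ESI MPLS label and that the label of a received frame has
   already been extracted. The function names are hypothetical.

```python
def may_egress(frame_esi_label, segment_esi_label):
    """Return True if a flooded frame may egress the given local segment.

    The frame is blocked when its ESI MPLS label identifies the same
    Ethernet segment as the one it is about to be sent out on.
    """
    return frame_esi_label != segment_esi_label


def flood(frame_esi_label, local_segments):
    """Return the names of the local segments a flooded frame is sent to.

    local_segments maps a segment name to the ESI MPLS label associated
    with that segment (an assumed, simplified representation).
    """
    return [
        name
        for name, label in local_segments.items()
        if may_egress(frame_esi_label, label)
    ]
```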

2.4 Aliasing

   In the case where a station is multi-homed to multiple NVEs, it is
   possible that only a single NVE learns a set of the MAC addresses
   associated with traffic transmitted by the station. This leads to a
   situation where remote NVEs receive MAC advertisement routes, for
   these addresses, from a single NVE even though multiple PEs are
   connected to the multi-homed segment. As a result, the remote PEs are
   not able to effectively load-balance traffic among the NVEs connected
   to the multi-homed Ethernet segment. This could be the case, for
   example, when the PEs perform data-path learning on the access, and
   the load-balancing function on the station hashes traffic from a
   given source MAC address to a single PE. Another scenario where this
   occurs is when the PEs rely on control-plane learning on the access
   (e.g., using ARP), since ARP traffic will be hashed to a single link
   in the LAG.

   To alleviate this issue, E-VPN introduces the concept of 'Aliasing'.
   Aliasing refers to the ability of an NVE to signal that it has
   reachability to a given locally attached Ethernet segment, even when
   it has learnt no MAC addresses from that segment. The Ethernet A-D
   route per EVI is used to that end. Remote PEs which receive MAC
   advertisement routes with non-zero ESI SHOULD consider the advertised
   MAC address as reachable via all PEs which have advertised
   reachability to the relevant Segment using Ethernet A-D routes with
   the same ESI (and Ethernet Tag if applicable) and with the Active-
   Standby flag reset.

2.5 DF Election

   Consider a station that is a host or a VM that is multi-homed
   directly to more than one NVE in an E-VPN on a given Ethernet
   segment. One or more Ethernet Tags may be configured on the Ethernet
   segment. In this scenario only one of the PEs, referred to as the
   Designated Forwarder (DF), is responsible for certain actions:

    -   Sending multicast and broadcast traffic, on a given Ethernet
        Tag on a particular Ethernet segment, to the station.

    -   Flooding unknown unicast traffic (i.e. traffic for
        which an NVE does not know the destination MAC address),
        on a given Ethernet Tag on a particular Ethernet segment
        to the station, if the environment requires flooding of
        unknown unicast traffic.

   This is required in order to prevent duplicate delivery of multi-
   destination frames to a multi-homed host or VM, in case of all-active
   redundancy.

3 Encapsulation Options for E-VPN Overlays 


3.1 MPLS over GRE 

   The E-VPN data-plane is modeled as an E-VPN MPLS client layer sitting
   over an MPLS PSN tunnel. The Split-Horizon and Aliasing functions of
   E-VPN are tied to the MPLS client layer. In order to keep the E-VPN
   procedures intact and data-plane operation as is, an ideal
   encapsulation would allow the E-VPN MPLS client layer to be carried
   over an IP PSN tunnel transparently - i.e., without any changes. The
   existing standards-based GRE encapsulation as defined by [RFC2890]
   and [RFC2784] provides such a solution: 

       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |C| |K|S| Reserved0       | Ver |         Protocol Type         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         Key                                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   The Key field can be used to carry a 32-bit entropy value.

   The C (Checksum Present) and S (Sequence Number Present) bits in the
   GRE header are set to zero. The K bit is set to 1.
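
   The header layout above can be sketched as follows. This is an
   illustration only: the function name is hypothetical, and the
   Protocol Type value 0x8847 (MPLS unicast) is an assumption taken
   from common MPLS-in-GRE usage rather than from this document.

```python
import struct

GRE_K_BIT = 0x2000               # Key Present flag (third bit of the header)
ETHERTYPE_MPLS_UNICAST = 0x8847  # assumed Protocol Type for the MPLS payload


def gre_header(key):
    """Build the 8-byte GRE header with C=0, K=1, S=0 and Ver=0."""
    if not 0 <= key <= 0xFFFFFFFF:
        raise ValueError("GRE key must fit in 32 bits")
    # 16 bits of flags/version, 16 bits of Protocol Type, 32-bit Key
    return struct.pack("!HHI", GRE_K_BIT, ETHERTYPE_MPLS_UNICAST, key)
```

   The MPLS label stack and the tenant frame would follow this header
   unchanged, which is why the E-VPN client layer is unaffected.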

   [MPLSoUDP] discusses using a UDP header instead of the GRE header to
   transport the MPLS client layer over an IP PSN tunnel. The main
   advantage of doing so is better load-balancing over existing IP
   networks, where some core routers can perform ECMP based on the UDP
   header but not based on the GRE Key field. However, routers that are
   capable of supporting the [NVGRE] encapsulation can also perform
   load-balancing based on the GRE Key, which accommodates a 32-bit
   entropy value, whereas the UDP encapsulation accommodates a 16-bit
   entropy value.


3.1.1 Benefits of MPLS over GRE 

   The benefits of using the MPLS over GRE encapsulation are as follows:
   - Uses existing standard for transporting MPLS over IP.
   - Uses E-VPN control plane (BGP routes and attributes), as well as 
     E-VPN procedures and functions exactly as is.
   - Consistent with L3VPN over IP (RFC 4797)
   - The MPLS label can be a global value (instead of downstream-
     assigned) just like a VXLAN or NVGRE service-instance ID.
   - Provides seamless interoperability with E-VPN PEs. There is no
     need for a gateway device.


3.2 VXLAN/NVGRE Encapsulation 

   If either the VXLAN or the NVGRE encapsulation were to be used with
   the E-VPN control plane, there would be an impact on the E-VPN client
   layer and the associated procedures and BGP routes. In order to
   assess this impact, the first step is to identify which subset of the
   service interfaces defined in [E-VPN] is needed for the NVO solutions
   defined in [VXLAN] and [NVGRE]. The next step is to examine how the
   E-VPN BGP routes and procedures should be modified to support these
   service interfaces with the new encapsulation.

   [E-VPN] defines the following four service interface types:

   - VLAN Based Service Interface
   - VLAN Bundle Service Interface
   - Port-based Service Interface
   - VLAN Aware Bundle Service Interface

   For a detailed description of these service interface types, refer to
   [EVPN-REQ] and [E-VPN]. As described in [E-VPN], the first three
   service interface types don't require encoding the VLAN Tag in the
   BGP routes, because there is a one-to-one mapping between an EVI and
   a broadcast domain represented by a virtual network or a virtual
   segment.

   [NVGRE] requires only the VLAN-based service interface, and it
   clearly states that the tenant VLAN Tag (inner VLAN Tag) is not part
   of the encapsulated frames because there is a one-to-one mapping
   between the Virtual Subnet Identifier (VSID) and the inner VLAN ID.

   The [VXLAN] default mode of operation requires only the VLAN-based
   service interface, as it specifies that the VTEP does not include an
   inner VLAN tag upon encapsulation; moreover, decapsulated frames
   with an inner VLAN tag should be discarded. However, [VXLAN]
   provides an option of including an inner VLAN tag in the encapsulated
   packet if it is configured explicitly at the VTEP. If an inner VLAN
   tag is included, then VXLAN requires a VLAN-bundle service interface.
   However, as discussed above, this service interface type does not
   require that the tenant VLAN tag be sent in the BGP routes.

3.2.1 Impact on E-VPN Routes for VXLAN/NVGRE Encapsulation

   As discussed above, neither [NVGRE] nor [VXLAN] requires the tenant
   VLAN tag to be sent in BGP routes. Therefore, the 32-bit Ethernet Tag
   field in the E-VPN BGP routes can be used to represent the NVGRE VSID
   or VXLAN VNI. This is not accidental, but rather by design: the
   Ethernet Tag field in E-VPN was designed not just for C-tagged or
   S-tagged interfaces [802.1Q] but also for I-tagged interfaces
   [802.1ah], where an I-SID is a 24-bit entity representing a virtual
   segment just like a VSID or VNI. Therefore, there is no need to
   re-purpose the MPLS label field in the E-VPN BGP routes, and this
   field can be omitted from them. As a result, the length field of the
   NLRI in E-VPN routes will be three octets shorter for the VXLAN and
   NVGRE encapsulations.

   Since the VXLAN VNI or NVGRE VSID is assumed to be a global value,
   one might question the need for the Route Distinguisher (RD) in the
   E-VPN routes. In the scenario where all data centers are under a
   single administrative domain, and there is a single global VNI/VSID
   space, the RD can be set to zero in the E-VPN routes. However, in
   scenarios where different groups of data centers are under different
   administrative domains, and these data centers are connected via one
   or more backbone core providers as described in [NOV3-Framework], the
   RD must be a unique value per EVI or per NVE as described in [E-VPN].
   In other words, whenever there is more than one administrative domain
   for VNI or VSID, a non-zero RD MUST be used.

3.2.2 Impact on E-VPN Procedures for VXLAN/NVGRE Encapsulation


   In order to analyze the impact of the VXLAN/NVGRE encapsulation on E-
   VPN procedures, we must distinguish three NVE redundancy models:

   - No redundancy

   - Active/Standby redundancy

   - All-active redundancy

   The impact of the encapsulation varies depending on the employed
   model.


3.2.2.1 NVE with No Redundancy 

   This is the scenario where, for example, the NVE is implemented on
   the hypervisor. In this case, neither the Split-Horizon nor the
   Aliasing function is required or applicable. Therefore, the choice of
   the VXLAN/NVGRE encapsulation has no impact on E-VPN procedures.

   For all practical purposes, in this scenario, the only difference
   between the GRE and VXLAN/NVGRE encapsulations is the size of the
   entropy field (32 bits vs. 16 bits).

3.2.2.2 NVE with Active/Standby Redundancy

   This is the scenario where the hosts are multi-homed to a set of
   NVEs; however, only a single NVE is active at any given point in time
   for a given VNI or VSID. In this case as well, the Split-Horizon
   function is not required. However, in order to support fast
   convergence in the case where the primary NVE fails, the Aliasing
   function of E-VPN is needed. Note that Aliasing in this scenario is
   used to quickly identify the backup NVE rather than for traffic
   load-balancing. In this case, the impact of the use of the
   VXLAN/NVGRE encapsulation on the E-VPN procedures is as discussed in
   Section 3.2.2.3.2, with the difference being that a remote NVE uses
   the received Ethernet A-D routes to build primary and backup paths to
   the advertising NVEs, instead of a load-balancing path-list.

   If fast convergence is not required or not used, then the VXLAN/NVGRE
   encapsulation would have no impact on the E-VPN procedures.

3.2.2.3 NVE with All-Active Redundancy

   Out of the E-VPN features listed in Section 2, the use of the VXLAN
   or NVGRE encapsulation impacts the Split-Horizon and Aliasing
   features, since those two rely on the MPLS client layer. Given that
   this MPLS client layer is absent with these types of encapsulations,
   alternative procedures and mechanisms are needed to provide the
   required functions. Those are discussed in detail next.

3.2.2.3.1 Split Horizon

   In E-VPN, an MPLS label is used for split-horizon filtering to
   support active/active multi-homing, where an ingress NV Edge device
   (NVE) adds a label corresponding to the site of origin (aka ESI MPLS
   Label) when encapsulating the packet. The egress NVE checks the ESI
   MPLS label when attempting to forward a multi-destination frame out
   an interface, and if the label corresponds to the same site
   identifier (ESI) associated with that interface, the packet is
   dropped. This prevents the occurrence of forwarding loops.

   Since the VXLAN or NVGRE encapsulation does not include this ESI MPLS
   label, other means of performing the split-horizon filtering function
   MUST be devised. One way of supporting this function is to assign an
   IP address for each site of origin (e.g., for each ESI in the E-VPN
   terminology) and advertise this IP address in the BGP Remote-Next-Hop
   attribute associated with the E-VPN Ethernet A-D route (refer to
   Section 3.2.2.3.3 for details). The "Active-Standby" bit in the flags
   of the ESI MPLS Label Extended Community MUST be set to 0 to indicate
   active/active multi-homing and the MPLS label field MUST be set to
   zero to indicate that the IP address in the BGP Remote-Next-Hop
   attribute will be used for split-horizon filtering. The ingress NVE
   uses the IP address associated with a given site as the source IP
   address for all traffic originating from said site. The egress NVE
   will program its egress ACL with this IP address for the interfaces
   corresponding to that same site.
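
   A minimal sketch of such an egress ACL is shown below. The class
   and method names are hypothetical, the addresses in any usage are
   documentation examples, and a real NVE would program this state in
   hardware rather than in a dictionary.

```python
import ipaddress


class EgressAcl:
    """Hypothetical per-interface egress filter keyed on outer source IP."""

    def __init__(self):
        # local interface -> set of per-site source IPs whose flooded
        # traffic must not be sent back out of that interface
        self.blocked = {}

    def program(self, interface, site_ip):
        """Install the per-site IP learned from the Remote-Next-Hop attribute."""
        self.blocked.setdefault(interface, set()).add(
            ipaddress.ip_address(site_ip)
        )

    def permits(self, interface, outer_src_ip):
        """Return True if a multi-destination frame may egress the interface."""
        src = ipaddress.ip_address(outer_src_ip)
        return src not in self.blocked.get(interface, set())
```

   The check replaces the ESI MPLS label comparison: the tunnel source
   address, not a label, identifies the site of origin.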

   Although the impact on the control plane is minimal and the existing
   E-VPN BGP routes can be used with minimal modifications to their
   corresponding procedures, the same cannot be said in terms of network
   operations, management, and data plane. The use of IP addresses to
   represent the site of origin requires many IP addresses to be
   allocated and configured on a single NVE. For example, a TOR with N
   interfaces may require one IP address per interface in the worst
   case, which may impact management and operational aspects of the Data
   Center Network. Also, the data-plane operation for Split-Horizon
   filtering will be different from that of the MPLS client layer, and
   it cannot be assumed that platforms/ASICs that support Split-Horizon
   filtering based on an MPLS label can also support such a function
   based on IP addresses. However, there are alternative options for
   performing the Split-Horizon filtering function with the VXLAN/NVGRE
   encapsulation, while retaining a single IP address per NVE, and those
   will be described in a future revision of this document.

   It should be noted that such a filtering function is not required
   when doing active/standby multi-homing, where load-balancing from a
   tenant can still be performed on a per-VLAN basis - e.g., different
   VLANs are active on different NVEs connected to a multi-homed site.
   Furthermore, active/active multi-homing is primarily applicable when
   NVEs are on physical devices as opposed to on the hypervisor. For
   example, [VXLAN] describes the use of physical devices as VXLAN
   gateways to connect a legacy network with a VXLAN overlay network. In
   such scenarios, one would expect: a) that the number of such gateways
   is not very large and/or b) that not all of them require
   active/active multi-homing.     

3.2.2.3.2 Aliasing 

   In E-VPN, the NVEs connected to a multi-homed site optionally
   advertise a VPN label used to load-balance traffic between NVEs, even
   when a given MAC address is learnt by only a single NVE connected to
   the site. In the case where VXLAN or NVGRE encapsulation is used,
   some alternative means that does not rely on MPLS labels is required
   to support aliasing. One solution would be to rely on the IP address
   per site assignment described in the previous section for aliasing as
   well: effectively, every NVE advertises an Ethernet A-D route for a
   given site with the BGP Remote-Next-Hop attribute set to an IP
   address that has a 1:1 mapping to the site. The remote NVEs resolve
   an ESI (site ID) to a list of IP addresses corresponding to that
   site. Furthermore, a given MAC address that is associated with an
   ESI, in turn, gets resolved to this list of IP addresses. When a
   remote NVE wants to forward a packet for a given MAC address, it
   selects one of the IP addresses from the list (using a hash value for
   load balancing) and encapsulates the packet using that IP address as
   the destination IP address in the VXLAN or NVGRE encapsulation. The
   source IP address will be that of the source multi-homed site. In the
   case where the source site is single-homed, the source IP address
   will be the loopback address of the NVE.
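
   The selection step can be sketched as follows. The hash function
   (CRC-32) and the flow key are assumptions made for illustration;
   this document does not specify which hash an NVE uses.

```python
import zlib


def pick_tunnel_destination(site_ips, flow_key):
    """Choose one per-site IP address for a flow.

    site_ips: IP addresses (strings) resolved for the ESI of the
              destination MAC address.
    flow_key: bytes identifying the flow, e.g. inner header fields.
    """
    if not site_ips:
        raise ValueError("no tunnel endpoints known for this ESI")
    # Sort so that the choice does not depend on route arrival order.
    ordered = sorted(site_ips)
    return ordered[zlib.crc32(flow_key) % len(ordered)]
```

   Packets of the same flow always map to the same destination address,
   while different flows are spread across the NVEs attached to the
   site.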


3.2.2.3.3 Tunnel Endpoint Identification

   To accommodate the Split Horizon as well as Aliasing functions of E-
   VPN, multiple IP tunnel endpoints (one per site) must be associated
   with the same NVE. As such, the mechanisms of [RFC5512] cannot be
   used to specify the tunnel endpoint and encapsulation, since those
   mechanisms only allow a single tunnel endpoint IP address to be
   associated with the BGP speaker. To alleviate this, the BGP Remote-
   Next-Hop attribute defined in [REMOTE-NH] can be used. Two new Tunnel
   Types would be required for VXLAN and NVGRE.

   This attribute will be carried with the E-VPN Ethernet A-D route. The
   IP address field of this attribute serves two functions:

   - It indicates the tunnel endpoint destination IP address that must
   be used when load-balancing traffic associated with a given site
   (i.e. ESI).

   - It is used to build the egress ACL for filtering multi-destination
   traffic on multi-homed Ethernet Segments. In this context, the IP
   address is the tunnel endpoint source address. 

   It is worth noting that for multi-homed Ethernet segments, the NVE
   will always advertise an Ethernet A-D route with the Remote-Next-Hop
   attribute, in addition to the MAC Advertisement routes. In this case,
   the NVEs which receive the routes derive the tunnel endpoint IP
   address for a given MAC address as follows:

   1- The NVE identifies the Ethernet Segment Identifier (ESI)
   associated with the MAC address, as encoded in the MAC Advertisement
   route.

   2- The NVE then sets the tunnel endpoint IP address for that MAC to
   the value encoded in the Remote-Next-Hop attribute of the Ethernet
   A-D route advertised for the ESI identified in step 1.

   On the other hand, for single-homed Ethernet segments, the NVE will
   only advertise the MAC Advertisement routes. In this latter case, the
   tunnel endpoint IP address is derived from the BGP Next-Hop attribute
   associated with the MAC Advertisement route.
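
   The derivation rules for multi-homed and single-homed segments can
   be summarized in a short sketch. Routes are modeled as plain
   dictionaries with hypothetical field names; a single-homed segment
   is assumed to carry no ESI (None).

```python
def tunnel_endpoints(mac_route, ad_routes):
    """Return the tunnel endpoint IP addresses for a MAC Advertisement route.

    mac_route: {"esi": ..., "bgp_next_hop": ...}
    ad_routes: list of {"esi": ..., "remote_next_hop": ...} entries, one
               per received Ethernet A-D route.
    """
    esi = mac_route.get("esi")
    if esi:
        # Multi-homed segment: step 1 identifies the ESI, step 2 takes the
        # Remote-Next-Hop of each Ethernet A-D route advertised for it.
        return [r["remote_next_hop"] for r in ad_routes if r["esi"] == esi]
    # Single-homed segment: fall back to the BGP Next-Hop of the MAC route.
    return [mac_route["bgp_next_hop"]]
```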

3.2.3 Support for Multicast

   The E-VPN Inclusive Multicast BGP route can be used to discover the
   multicast endpoints associated with a given VXLAN VNI or NVGRE VSID.
   The Ethernet Tag field of this route is used to encode the VNI or
   VSID. This route is tagged with the PMSI Tunnel attribute, which is
   used to encode the type of multicast tunnel to be used as well as the
   multicast tunnel identifier. The following tunnel types can be used
   for VXLAN/NVGRE:

   - PIM-SSM Tree
   - PIM-SM Tree
   - BIDIR-PIM Tree
   - Ingress Replication

   In the scenario where the multicast tunnel is a tree, both the
   Inclusive as well as the Aggregate Inclusive variants may be used. In
   the former case, a multicast tree is dedicated to a VNI or VSID,
   whereas in the latter, a multicast tree is shared among multiple
   VNIs or VSIDs. This is done by having the NVEs advertise multiple
   Inclusive Multicast routes with different VNI or VSID encoded in the
   Ethernet Tag field, but with the same tunnel identifier encoded in
   the PMSI Tunnel attribute.
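
   The difference between the two variants can be seen by grouping
   received Inclusive Multicast routes by tunnel identifier, as in the
   sketch below. Routes are modeled as dictionaries with hypothetical
   field names, and the tunnel identifier is treated as an opaque
   value.

```python
from collections import defaultdict


def group_by_tunnel(imet_routes):
    """Map each PMSI tunnel identifier to the VNIs/VSIDs that use it.

    imet_routes: list of {"ethernet_tag": vni_or_vsid, "tunnel_id": ...}
    """
    trees = defaultdict(set)
    for route in imet_routes:
        trees[route["tunnel_id"]].add(route["ethernet_tag"])
    return dict(trees)


def is_aggregate(members):
    """A tree shared by more than one VNI/VSID is an Aggregate Inclusive tree."""
    return len(members) > 1
```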

3.2.4 Inter-AS Challenges

   For inter-AS operation, two scenarios must be considered:

   - Scenario 1: The tunnel endpoint IP addresses are public
   - Scenario 2: The tunnel endpoint IP addresses are private

   In the first scenario, inter-AS operation is straightforward and
   follows existing BGP inter-AS procedures.

   The second scenario is more challenging, because the absence of the
   MPLS client layer from the VXLAN encapsulation creates a situation
   where the ASBR has no fully qualified indication within the tunnel
   header as to where the tunnel endpoint resides. To elaborate on this,
   recall that with MPLS, the client layer labels (i.e., the VPN labels)
   are downstream assigned. As such, each label implicitly identifies
   the tunnel endpoint, and it is sufficient for the ASBR to look up the
   client layer label in order to identify both the label translation
   required and the tunnel endpoint to which a given packet is
   destined. With the VXLAN encapsulation, the VNI is globally assigned
   and hence is shared among all endpoints. The destination IP address
   is the only field in the tunnel header that identifies the tunnel
   endpoint, and this address is privately managed by every data center
   network. Since the tunnel address is allocated out of a private
   address pool, the ASBR must either perform a lookup based on the VTEP
   IP address in the context of a VRF (e.g., using IP-VPN), or terminate
   the VXLAN tunnel and perform a lookup based on the tenant's MAC
   address to identify the egress tunnel. This effectively mandates
   that the ASBR either run another overlay solution, such as IP-VPN
   over an MPLS/IP core network, or be aware of the MAC addresses of
   all VMs in its local AS, at the very least.
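
   The difference between the two lookups can be sketched as follows.
   This is a toy model under stated assumptions, not an ASBR
   implementation: the table contents, label values, addresses, VRF
   names, and function names are all hypothetical. It shows that a
   downstream-assigned label resolves to exactly one endpoint, whereas
   a (VNI, private destination IP) pair can match endpoints in more
   than one data center unless the lookup is scoped by a VRF.

```python
# MPLS: each VPN label is downstream assigned, so a label value seen by the
# ASBR maps to exactly one (outgoing label, tunnel endpoint) pair.
mpls_label_table = {
    1001: (2001, "NVE-A in AS 1"),
    1002: (2002, "NVE-B in AS 2"),
}


def asbr_mpls_lookup(label):
    """Return the label translation and tunnel endpoint for a VPN label."""
    return mpls_label_table[label]


# VXLAN: the VNI is global and the outer destination address is private, so
# (assumption for this sketch) two data centers reuse the same address.
vxlan_endpoints = [
    ("AS 1", 5000, "10.0.0.1"),
    ("AS 2", 5000, "10.0.0.1"),
]


def asbr_vxlan_candidates(vni, dst_ip):
    """Return every AS whose endpoint matches the tunnel header fields."""
    return [asn for asn, v, ip in vxlan_endpoints if v == vni and ip == dst_ip]


# Scoping the lookup by a VRF (e.g. via IP-VPN) removes the ambiguity.
vrf_tables = {
    "dc-as1": {"10.0.0.1": "NVE-A in AS 1"},
    "dc-as2": {"10.0.0.1": "NVE-B in AS 2"},
}


def asbr_vxlan_lookup_in_vrf(vrf, dst_ip, tables):
    """Resolve a private VTEP address in the context of a given VRF."""
    return tables[vrf][dst_ip]


print(asbr_mpls_lookup(1001))
print(asbr_vxlan_candidates(5000, "10.0.0.1"))
print(asbr_vxlan_lookup_in_vrf("dc-as2", "10.0.0.1", vrf_tables))
```

   The second call returns two candidates, which is the ambiguity
   described above. The third call resolves it only because the VRF
   supplies the context that the tunnel header lacks.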

   Even in the first scenario, where the tunnel endpoint IP addresses
   are public, there may be security concerns regarding the distribution
   of these addresses among different ASes. These security concerns are
   one of the main reasons for having the so-called inter-AS "option-B"
   in MPLS VPN solutions such as E-VPN.

   Using MPLS over GRE encapsulation addresses both of these concerns.



4 Comparison between MPLSoGRE and VXLAN/NVGRE Encapsulation

   The comparison between MPLSoGRE and VXLAN/NVGRE encapsulation depends
   on the functionality required of the NVEs. If the hosts are
   single-homed to NVEs without any need to support a redundancy group
   on the NVEs, or if the hosts are multi-homed to two or more NVEs with
   active/standby redundancy but without the need for fast convergence
   upon a failure, then both MPLSoGRE and VXLAN/NVGRE do equally well
   with the E-VPN control plane.

   If we need to support active/standby multi-homing with fast
   convergence upon a failure, or if we need to support active/active
   multi-homing, then MPLSoGRE encapsulation can provide this additional
   functionality without any impact on E-VPN routes and procedures.
   Furthermore, it can provide complete support for inter-AS operation
   and the complete set of E-VPN functions without impacting IP address
   assignment and management of the underlying network. VXLAN/NVGRE,
   however, impacts E-VPN routes and procedures as well as the
   underlying data plane behavior, as noted above. Furthermore, there
   are implications for IP address assignment, security, and inter-AS
   operation. It should be noted that the additional requirements on
   the data plane behavior, as well as the above implications, are a
   consequence of the functionality that needs to be supported and are
   independent of the control-plane choice.

   As noted previously, there are existing core switches that do not
   support ECMP by hashing on the GRE key; however, the vast majority of
   existing core switches support ECMP by hashing on the UDP header.
   VXLAN encapsulation can therefore provide better ECMP behavior on
   these existing switches. Thus, the choice of overlay encapsulation
   depends on the needed functionality, inter-AS scenarios, security
   requirements, and the ECMP capabilities of the core switches.
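
   The ECMP point can be illustrated with a small sketch. It is a toy
   model under stated assumptions rather than a description of any
   switch: the CRC-based hash, the number of paths, the addresses, the
   port values, and the assumption that the encapsulating NVE varies
   the UDP source port per inner flow are all chosen for illustration.
   A switch that hashes on the UDP header sees different inputs for
   different flows, while a switch that ignores the GRE key sees the
   same outer header for every flow between a pair of NVEs.

```python
import zlib


def ecmp_path(fields, n_paths):
    """Pick a path index by hashing the header fields the switch can see."""
    key = "|".join(str(f) for f in fields).encode()
    return zlib.crc32(key) % n_paths


N_PATHS = 4
SRC_NVE, DST_NVE = "192.0.2.1", "192.0.2.2"
DST_PORT = 4789        # illustrative destination UDP port (assumption)
PROTO_UDP, PROTO_GRE = 17, 47

# UDP-based encapsulation: the source port differs per inner flow, and the
# switch hashes on the outer 5-tuple.
udp_paths = {
    ecmp_path((SRC_NVE, DST_NVE, PROTO_UDP, sport, DST_PORT), N_PATHS)
    for sport in range(49152, 49152 + 64)
}

# GRE on a switch that does not hash on the key: the key varies per flow,
# but only the outer addresses and protocol number feed the hash.
gre_paths = {
    ecmp_path((SRC_NVE, DST_NVE, PROTO_GRE), N_PATHS)
    for _gre_key in range(64)
}

print("paths used with UDP header hashing:", len(udp_paths))
print("paths used without GRE key hashing:", len(gre_paths))
```

   In this model, all 64 GRE flows between the two NVEs follow a single
   path, because the hash input never changes, while the UDP-based
   flows can spread across the available paths.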


5  Acknowledgement

   The authors would like to thank John Mullooly and Dave Smith for
   providing valuable comments and feedback.

6  Security Considerations


7  IANA Considerations


8  References

8.1  Normative References

   [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.



   [REMOTE-NH] Van de Velde et al., "BGP Remote-Next-Hop",
              draft-vandevelde-idr-remote-next-hop-01.txt, work in
              progress, July 2012.


8.2  Informative References

   [NVGRE]   Sridharan, M., et al., "NVGRE: Network Virtualization using
             Generic Routing Encapsulation",
             draft-sridharan-virtualization-nvgre-01.txt, July 8, 2012.

   [VXLAN]   Dutt, D., et al., "VXLAN: A Framework for Overlaying
             Virtualized Layer 2 Networks over Layer 3 Networks",
             draft-mahalingam-dutt-dcops-vxlan-02.txt, August 22, 2012.

   [E-VPN]   Sajassi et al., "BGP MPLS Based Ethernet VPN",
             draft-ietf-l2vpn-evpn-01.txt, work in progress,
             February 2012.

   [Problem-Statement] Narten et al., "Problem Statement: Overlays for
             Network Virtualization",
             draft-ietf-nvo3-overlay-problem-statement-00,
             September 2012.

   [L3VPN-ENDSYSTEMS] Marques et al., "BGP-signaled end-system
             IP/VPNs", draft-ietf-l3vpn-end-system, work in progress,
             October 2012.

Authors' Addresses


   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com


   Samer Salam
   Cisco
   595 Burrard Street
   Vancouver, BC V7X 1J1, Canada
   Email: ssalam@cisco.com


   Keyur Patel
   Cisco
   170 West Tasman Drive
   San Jose, CA  95134, US
   Email: Keyupate@cisco.com


   Nabil Bitar
   Verizon Communications
   Email: nabil.n.bitar@verizon.com


   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.com













