Network Working Group                                 Bhumip Khasnabish
Internet-Draft                                            ZTE USA, Inc.
Intended status: Informational                                  Bin Liu
Expires: April 4, 2013                                  ZTE Corporation
                                                              Baohua Lei
                                                               Feng Wang
                                                           China Telecom
                                                                Oct 2012


    Requirements for Mobility and Interconnection of Virtual Machine
                      and Virtual Network Elements
                 draft-khasnabish-vmmi-problems-02.txt

Abstract

   In this draft, we discuss the challenges and requirements related to
   migration, mobility, and interconnection of Virtual Machines (VMs)
   and Virtual Network Elements (VNEs).  A VM migration scheme that
   works across IP subnets is needed to implement sharing of virtual
   computing resources across multiple network administrative domains.
   For seamless online migration in various scenarios, many problems
   need to be resolved on the control plane, and the VM migration
   process should be adapted accordingly.  We also describe the
   limitations of the various virtual local area networking (VLAN) and
   virtual private networking (VPN) techniques that are traditionally
   expected to support such migration, mobility, and interconnection.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 4, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions used in this document
   2.  Terminology and Concepts
   3.  Network Related Problem Specification
     3.1.  The Mobility Problems of VM Migration Across IP subnets/WAN
       3.1.1.  IP Tunnel Problems
       3.1.2.  IP Allocation Strategy Problems
       3.1.3.  Routing Synchronization Strategy Problems
       3.1.4.  The migration protocol state machine of the VM online
               migration across subnets
       3.1.5.  Resource Gateway Problems
       3.1.6.  Optimized Location of Default Gateway
       3.1.7.  Other Problems
     3.2.  The Virtual Network Model
     3.3.  The Processing Flow
     3.4.  Problems of NVE/OBP location
       3.4.1.  NVE/OBP on the Server
       3.4.2.  NVE/OBP on the ToR
     3.5.  The Evolution Problems of The Logical Network Topology in
           VMMI Environments
     3.6.  Cloud Service Virtualization Requirements
       3.6.1.  Requirement of logical element
       3.6.2.  Requirements for Resource Allocation Gateway (RA GW)
               Function
       3.6.3.  Performance Requirements
       3.6.4.  Fault Tolerance Capability Requirements
       3.6.5.  Network Model
       3.6.6.  Types and Applications of VPNs Interconnection between
               DCs which provide Cloud Services
         3.6.6.1.  Types of VPNs
         3.6.6.2.  Applications of L2VPN in DCs
         3.6.6.3.  Applications of L3VPN in DCs
       3.6.7.  VN Requirements
       3.6.8.  Packet Encapsulation Problems
       3.6.9.  VM Migration Problem in mixed IPv4 and IPv6 Environment
         3.6.9.1.  Real-time Perception of Availability of Global
                   Network and Storage Resources
         3.6.9.2.  The real-time perception of global available network
                   resource and requested network resource for matching
                   with storage resources
         3.6.9.3.  The real-time perception of global requested network
                   resource for matching with storage resources
       3.6.10. Selection of Migration
         3.6.10.1. Requirements with Different Network Environments and
                   Protocol
         3.6.10.2. Requirements for Live Migration of Virtual Machines
       3.6.11. Access and Migration of VMs without users' Perception
         3.6.11.1. VM Migration Problems and Strategies in the WAN with
                   Traffic Roundabout as a Prerequisite
         3.6.11.2. VM Migration Problems and Strategies in the WAN
                   without Traffic Roundabout as a Target
       3.6.12. Review of VXLAN, NVGRE, and NVO3
       3.6.13. The East-West Traffic Problem
       3.6.14. Data Center Interconnection Fabric Related Problems
       3.6.15. MAC, IP, and ARP Explosion Problems
       3.6.16. Suppressing Flooding within VLAN
       3.6.17. Convergence and Multipath Support
       3.6.18. Routing Control - Multicast Processing
       3.6.19. Problems and Requirements related to DMTF
   4.  Control & Mobility Related Problem Specification
     4.1.  General Requirements and Problems of State Migration
       4.1.1.  Foundation of Migration Scheduling
       4.1.2.  Authentication for Migration
       4.1.3.  Consultation for Assessing Migratability
       4.1.4.  Standardization of Migration State
     4.2.  Mobility in Virtualized Environments
     4.3.  VM Mobility Requirements
       4.3.1.  Summarization of Mobility
       4.3.2.  Problem Statement
   5.  Network Management Related Problem Specification
     5.1.  Data Center Maintenance
     5.2.  Load Balancing after VM Migration and Integration
     5.3.  Security and Authentication of VMMI
     5.4.  Efficiency of Data Migration and Fault Processing
     5.5.  Robustness Problems
       5.5.1.  Robustness of VM Migration
       5.5.2.  Robustness of VNE
   6.  Acknowledgement
   7.  References
   8.  Security Considerations
   9.  IANA Consideration
   10. Normative References
   Authors' Addresses

1. Introduction

   There are many challenges related to VM migration and
   interconnection among two or more data centers (DCs).  The
   techniques used for VM migration and data center interconnection
   should support the required levels of performance, security, and
   scalability, along with simplicity and cost-effective management,
   operations, and maintenance.

   In this draft, the issues and requirements for moving virtual
   machines are summarized with reference to the necessary conditions
   for migration, business needs, state classification, security, and
   efficiency.  We then list the requirements for VM migration in the
   current mixed IPv4 and IPv6 environment.  On the choice of migration
   solution, we discuss the requirements for techniques that are useful
   on large-scale Layer-2 networks and on segmented IP networks/WANs.

   A VM migration scheme that works across IP subnets/WANs is therefore
   needed to implement sharing of virtual computing resources across
   multiple network administrative domains.  This will make a wider
   range of VM migration possible, and can allow migration of VMs to
   different types of DC.  It can be adapted to different types of
   physical networks, different network topologies, and various
   protocols.  Supporting seamless online migration in these scenarios
   requires a very intelligent VM online migration capability on the
   control plane.

   We summarize the requirements of virtual networks for VM migration,
   virtual networking, and operations in DCI modes.

   In the following sections of this draft, we first describe the
   general challenges at a high level, and then analyze the
   requirements for VM migration.  We then discuss the commonly-used
   solutions and their limitations, along with the desired features of
   a potential reference solution.  A more detailed solution survey
   will be presented in a companion draft.

1.1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Terminology and Concepts

   o ACL: Access Control List
   o ARP: Address Resolution Protocol

   o DC: Data Center

   o DCB/DCBR: Data Center Border Router

   o DC GW: Data Center Gateway

   o DCI: Data Center Interconnection

   o DCS: Data Center Switch

   o FDB: Forwarding DataBase

   o HPC: High-Performance Computing

   o IDC: Internet Data Center

   o IGMP: Internet Group Management Protocol

   o IOMMU: Input/Output Memory Management Unit

   o IP: Internet Protocol

   o IP VPN: Layer 3 VPN, defined in the L3VPN working group

   o ISATAP: Intra-Site Automatic Tunnel Addressing Protocol

   o LISP: Locator ID Separation Protocol

   o MatrixDCN: Matrix-based fabric for Data Center Network

   o NHRP: Next Hop Resolution Protocol

   o NVO3: Network Virtualization Overlays (Over Layer-3)

   o OBP: Overlay network Boundary Point

   o OTV: Overlay Transport Virtualization

   o PaaS: Platform as a Service

   o PIM: Protocol Independent Multicast

   o PBB: Provider Backbone Bridge

   o PM: Physical Machine

   o QoS: Quality of Service

   o RA GW: Resource Allocation GateWay

   o STP: Spanning Tree Protocol

   o TNI: Tenant Network Identifier

   o ToR: Top of the Rack

   o TRILL: Transparent Interconnection of Lots of Links

   o VLAN: Virtual Local Area Networking

   o VM: Virtual Machine

   o VMMI: Virtual Machine Mobility and Interconnection

   o VN: Virtual Network

   o VNI: Virtual Network Identifier

   o VNE: Virtual Network Entity (a virtualized layer-3/network entity
     with associated virtualized ports and virtualized processing
     capabilities)

   o VPN: Virtual Private Network

   o VPLS: Virtual Private LAN Service

   o VRRP: Virtual Router Redundancy Protocol

   o VSE: Virtual Switching Entity (a virtualized layer-2/switch entity
     with associated virtualized ports and virtualized processing
     capabilities)

   o VSw: Virtual Switch

   o WAN: Wide Area Network

3. Network Related Problem Specification

   In this section, we describe the background of VM and VNE migration
   between data centers.

   Why do VMs and VNEs need to be migrated?  First of all, in case of
   overload and during natural disasters, business-critical data center
   applications need to be migrated to other data centers as quickly as
   possible.  As a pre-condition of data center migration and/or
   integration, some of the applications can be migrated without
   interruption from one data center to another.  Considering address
   resources, cooling, and physical space in the primary data center,
   some of the virtual machines can be migrated to the backup data
   center(s) even under normal operating conditions.

   Secondly, through seamless management of VM migration, it may be
   possible to save operations, maintenance, and upgrade costs.  For
   example, older servers may be physically large while present-day
   servers are relatively compact; VM migration allows users to share a
   single newer server, replacing a set of older servers, and thus
   saves a substantial amount of physical rack space.  In addition, a
   virtual machine server presents unified "virtual hardware", unlike
   an older server that may contain a number of different hardware
   resources.  After migration, the server can be managed through a
   unified interface.
   We note that by using virtual machine software such as the high
   availability tools provided by VMware, when a server shuts down due
   to a failure, it is possible to automatically switch to another
   virtual server in the network without causing any disruption in
   operation.

   In short, migration of VMs under many desirable scenarios has the
   advantages of lowering operations costs, simplifying maintenance,
   improving system load balancing, enhancing system error tolerance,
   and optimizing system-wide power and space management.

   In general, a data center architecture consists of the following
   components:

   o Gateways (Data Center Gateway, Resource Allocation Gateway)

   o Core Router / Switch

   o Aggregation layer switch

   o Access layer ToR switch

   o Virtual switch

   o Interconnection network between DCs

   o Servers

   o Firewall system, etc.

   Overall, the requirement of VM migration brings the following
   challenges to the forefront of data center operations and
   management:

   (A) How to accommodate a large number of tenants in isolated
   networks in a data center.

   (B) From one DC to another within one administrative domain, (i) how
   to ensure that the necessary conditions of migration are satisfied,
   (ii) how to ensure that a successful migration occurs without
   service disruption, and (iii) how to ensure successful rollback when
   any unforeseen problem occurs in the migration process.

   (C) From one administrative domain to another, how to solve the
   problem of seamless communication between the domains.  There are
   several different solutions to the current Layer-2 (L2) based DC
   interconnect technology, and each can solve different problems in
   different scenarios.  In the L2 network, VXLAN
   [draft-mahalingam-dutt-dcops-vxlan-01] is used to resolve the VLAN
   number limitation problem, and NVGRE
   [draft-sridharan-virtualization-nvgre-00] attempts to solve similar
   problems but introduces interoperability problems between domains.
   If unification of packet encapsulation across the different
   solutions can be achieved, it is bound to promote seamless migration
   of VMs among DCs along with the desired integration of cloud
   computing and networking.

   (D) How to utilize IP-based technology to resolve migration of VMs
   over a Layer-3 (L3) network.  For example, VPN technology can be
   used to carry L2 and L3 traffic across the IP/MPLS core network.

   (E) How to resolve the problems related to mobility and portability
   of VMs among DCs is also an important aspect to consider.

   We discuss the above in more detail in the following sections.  A
   related draft [DCN Ops Req] discusses data center network and
   operations requirements.

3.1. The Mobility Problems of VM Migration Across IP subnets/WAN

   Why is there a need to implement VM migration across IP subnets/WAN?

   There are many existing implementable solutions for migrating a VM
   within a LAN.  These solutions include Xen, KVM, and VMware, which
   all implement VM image file sharing based on NFS, so that only CPU
   and memory state is migrated.  These are layer-2 VM migration
   techniques.  The advantage of this implementation is that the VM's
   IP addresses do not need to change after the migration.

   With the development and popularization of DC and virtualization
   technology, the number of servers and the network environment in a
   single
   LAN will limit the scalability of the virtual computing environment.
   In addition, when re-configuring VLANs in the traditional DC
   network, STP (MSTP) will lead to VLAN isolation.  This is a very
   serious problem in the DC network, especially in the storage
   network, because the services that storage networks support demand
   uninterrupted operation.

   In cloud computing and DC networks, this problem is fatal for VM
   migration, and the techniques and standards for existing large L2
   domains cannot completely solve it.  However huge a large L2 network
   is made, it remains restricted by the scope of the Ethernet
   broadcast domain and easily reaches this upper limit.  So even when
   the scope of the L2 network is very large, the maximum number of VMs
   it can accommodate may not be sufficient, which limits the scope of
   sharing of virtual computing resources.

   A VM migration scheme across IP subnets is therefore needed to
   implement virtual computing resource sharing across multiple network
   administrative domains.  This will make a wider range of VM
   migration possible, and can allow for migration of VMs to different
   types of DC.  It can be adapted to different types of physical
   networks, different network topologies, and various protocols.  For
   example, in the process of VM migration in an IDC, there are
   scenarios in which a VM in a traditional three-tier topology network
   is migrated through the WAN to a Fat-Tree topology network, or to a
   variety of other topology networks.  For seamless online migration
   in these scenarios, a very intelligent VM online migration
   capability needs to be implemented on the control plane.

   If VM migration is implemented only in the L2 domain, the concern
   for the network is the expansion of the number of VLANs or isolated
   domains, such as the 16,000,000 isolated domains in PBB.  Limitless,
   seamless, arbitrary online VM migration across IP subnets means that
   the following issues need to be addressed in order to achieve our
   goal: to create a true virtual network environment that is separated
   from the physical network.

   Migration across IP subnets: VM migration in the overlay network
   needs to adapt to heterogeneous network topologies.  How does the
   source network environment adapt its configuration to the target
   environment?  Network redirection technology, IP-in-IP technology,
   and dynamic IP tunnel configuration will be used to allow online VM
   migration across subnets.

   An IP allocation management module for VMs is needed, which manages
   each IP allocation in the virtual network.  The IP allocations
   should not conflict with each other, and should keep the path cost
   of routing/forwarding as small as possible.  It is necessary to know
   the DC network topology, its routing protocols, and the real-time
   results of path-cost computation to realize minimum path cost.  We
   know that the network topology of different DCs is not necessarily
   the same; for example, the network topologies and routing protocols
   of a traditional DC and a Fat-Tree network DC are different.  The
   addition of related protocol processing on the control plane is
   needed for seamless VM migration between them.  Otherwise, online VM
   migration cannot be implemented across DCs or across IP subnets.
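   As a concrete illustration of the dynamic IP tunnel configuration
   mentioned above, the following sketch (Python, standard library
   only) shows how an IP-in-IP outer header (IP protocol 4, RFC 2003)
   could wrap a VM's packet so that the inner addresses remain
   unchanged while the outer addresses track the VM's current subnet.
   The function names and tunnel endpoints are illustrative
   assumptions, not mechanisms specified by this draft.

      import socket
      import struct

      def ipv4_checksum(header: bytes) -> int:
          # Standard 16-bit one's-complement checksum over the header.
          if len(header) % 2:
              header += b"\x00"
          total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
          while total >> 16:
              total = (total & 0xFFFF) + (total >> 16)
          return ~total & 0xFFFF

      def ipip_encapsulate(inner: bytes, outer_src: str,
                           outer_dst: str) -> bytes:
          # Wrap a complete IPv4 packet in an outer IPv4 header with
          # protocol number 4 (IP-in-IP).  outer_src/outer_dst are the
          # hypothetical tunnel endpoints (e.g., the OBP/NVE nodes in
          # the source and destination subnets); the inner packet
          # keeps the VM's original, unchanged addresses.
          header = struct.pack(
              "!BBHHHBBH4s4s",
              0x45, 0,                  # version 4, IHL 5; DSCP/ECN
              20 + len(inner),          # total length of outer packet
              0, 0,                     # identification, flags/offset
              64, 4, 0,                 # TTL, protocol 4, checksum 0
              socket.inet_aton(outer_src),
              socket.inet_aton(outer_dst))
          csum = ipv4_checksum(header)
          return header[:10] + struct.pack("!H", csum) \
              + header[12:] + inner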
   The scheme of IP-in-IP tunneling resolves the contradiction between
   IP addresses remaining unchanged during VM migration and IP
   addresses changing when VMs migrate across IP subnets.  Therefore,
   the VM's mobility problem can be resolved only after the above
   problems are solved.  Service providers can implement it by
   upgrading their software to support new protocols; the hardware
   devices need not be upgraded.  These problems are described below.

3.1.1. IP Tunnel Problems

   During VM migration, it is required to establish the IP-in-IP
   tunnel.  The purpose is to make users/applications have no
   perception of the migration process; the IP addresses they see at
   the relevant level remain the same.  As noted above, IP-in-IP
   tunneling reconciles unchanged IP addresses during VM migration with
   the changed IP addresses of VMs that migrate across IP subnets.

   The OBP is involved in setting up IP tunnels.  According to the NVO3
   control plane protocol work, there are two positions for the OBP
   (NVE/VTEP) in the IDC: on the server and on the ToR.  Placing the
   OBP on the server minimizes its correlation with network elements in
   the specific network topology; placing the OBP on the ToR faces more
   problems.  The NVE is therefore preferably placed on the server
   (unless there are other, stronger reasons).  It creates a virtual
   network for VM communications, and the traffic between VMs is not
   directly exposed on the wire and switches.  An OBP on the server
   reduces the coupling with the DC topology to a certain extent, but
   the two cannot be made completely unrelated.

   The disadvantage of network connection solutions for online VM
   migration across different subnets is that the network configuration
   of the VM needs to be changed after the migration, and the migration
   process is not transparent.  Transparent VM migration therefore
   needs to be implemented, and network connection redirection
   technology needs to be considered.  Since users cannot otherwise
   maintain use of the VM when the network access point changes during
   online migration across subnets, a network connection redirection
   scheme based on Proxy Mobile IP (PMIP) can be used.  A VM migrated
   to an external subnet is regarded as a mobile node and does not
   change its IP address.  All data to/from the VM is transmitted
   through the bi-directional tunnel between the external network and
   the home network, in order to implement online transparent migration
   across subnets, preferably at switching speed.  The source VM and
   the destination VM need to be activated simultaneously and must be
   dynamically configured with the IP tunnel.

   In order to make the VM migration process completely transparent
   (including transparent to the VMs' applications and to outside
   users), the migration environment of the VMs should be regarded as a
   mobile network environment, and the migrated VM should be regarded
   as a mobile node.  After the VM is migrated to the external network,
   its network configuration does not need any changes.  Full advantage
   should be taken of the host's mobile agent function to communicate
   with the external network.

3.1.2. IP Allocation Strategy Problems

   In the packet encapsulation described above, the IP address of the
   VM is a critical entity.  In small networks, its allocation is
   achieved with DHCP.  With the expansion of the network scale, IP
   address conflicts become more likely to occur.
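   To make the non-conflict rules of this section concrete, here is a
   minimal allocation-check sketch using Python's standard ipaddress
   module.  It merely illustrates the checks described here; a real
   allocator would also weigh routing path cost and topology-encoded
   address ranges, as discussed later, and all names are illustrative
   assumptions.

      import ipaddress

      def pick_vm_address(destination_segment, vm_cluster_segments,
                          physical_host_segments, addresses_in_use):
          # Segments are prefix strings such as "10.1.0.0/24";
          # addresses_in_use is a set of address strings already
          # assigned in the destination segment.
          forbidden = [ipaddress.ip_network(p) for p in
                       list(vm_cluster_segments) +
                       list(physical_host_segments)]
          used = {ipaddress.ip_address(a) for a in addresses_in_use}
          for addr in ipaddress.ip_network(destination_segment).hosts():
              if addr in used:                  # already assigned
                  continue
              if any(addr in net for net in forbidden):
                  continue                      # would collide with a
                                                # VM cluster or host net
              return addr
          return None    # pool exhausted: the "insufficient IP" case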
   When a VM is migrated to another network, its IP address may
   conflict with the IP address of a VM or physical host in that
   network.  For example, duplicate IP addresses can allow otherwise
   isolated VM networks to communicate with each other, which causes
   confusion and migration failure.  Therefore, the IP allocation
   management module for VMs is needed, which manages each IP
   allocation in the virtual network.  The IP allocations should not
   conflict with each other, and should keep the path cost of
   routing/forwarding as small as possible.

   When allocating an IP address, it should not conflict with the
   currently assigned IP network segment of the VM cluster.  In
   addition, it cannot conflict with the IP network segment where the
   physical hosts are located, and it cannot conflict with the
   destination IP network segment after the migration.  So
   synchronization of IP address allocation information needs to be
   done.  Of course, synchronization across the whole network is not
   necessary as long as there are ways to ensure that no conflict
   exists.  Moreover, the allocation method needs to introduce as
   little network overhead as possible, and must consider
   insufficient-IP issues in the destination network segment.

   When allocating IP addresses to hosts based on the DHCP protocol,
   the IP addresses in the address pool are allocated from small to
   large.  An insufficient number of addresses in the pool may lead to
   assigned VM IP conflicts, which hinder VM migration.  Allocation
   from small to large also means the assigned VM IP affects the
   routing protocol's ability to choose the better path.  Especially
   for specific architectures like Fat-Tree, the specific network
   topology and the protocol architecture of the specific routing
   strategy (such as OSPF) should be utilized.  The VM migration
   process must be adapted to these aspects, and a purely Layer-2
   migration approach cannot simply be copied.  So VM migration is
   inherently related to network topology and network routing
   protocols.

   In the Fat-Tree topology, the IP addressing and IP allocation
   methods of the network's servers and switches are related to the
   routing protocols.  Two routing methods can be chosen: the OSPF
   protocol (the OSPF domain cannot be too large), or fixed routing
   configuration.  As the destination VM needs to be assigned an IP
   address, the routing protocols used in the destination DC need to be
   known in order to prevent IP conflicts: for example, whether the
   OSPF routing protocol is used (in which case a newly added network
   node is assigned an IP address by using DHCP), or fixed-
   configuration IP routing is used in the Fat-Tree topology.  In the
   former case, the number and distribution of reserved IP addresses in
   the address pool are different from the latter.  Therefore a scheme
   is required to learn the adopted topology and address allocation
   strategy, the IP usage for each segment, the remaining number of IP
   addresses, and so on.  This information cannot be acquired purely by
   the existing DHCP protocol.

   Different routing strategies require different routing management
   mechanisms for VM migration across DCs, for the following reasons:
   (a) it involves the uniqueness problem of IP address assignment and
   IP tunnel establishment, and (b) it involves global unified
   management issues.  These problems will be discussed later.

   In the addressing method of the fixed routing protocol, the IP
   address assigned to a device located within the DC actually contains
   location information.
   The type of the corresponding device can easily be determined from
   its IP address, and the location of the device in the topology can
   also be judged visually.  So these addresses must be avoided in the
   automated allocation of IP addresses to VMs.  The function of the
   DHCP protocol needs to be greatly enhanced, or the protocol and
   tooling for IP address allocation need to be re-designed.

   Moreover, consultation is needed before migration; it may be
   required to migrate the VMs only after the migration process has
   been confirmed and cleared.  Reservation of the future source and
   destination IP allocations should be considered.  The reserved IP
   addresses depend on the network topology and IP address allocation
   strategy at the destination, as well as the network topology and IP
   allocation strategy in use before the migration.  For the control
   plane protocol of the NVO3 network, IP addresses should be allocated
   reasonably according to the network topologies and routing protocols
   adopted in the source and destination DCs, in order to achieve
   seamless VM migration and a path that is as close to optimal as
   possible.

   In addition, the above problems and requirements also apply to the
   PortLand network topology, which is similar to the Fat-Tree network
   topology.  Future server-centric network topologies, such as the
   DCell/BCube network topologies, also need to achieve compatibility
   on the control plane.

3.1.3. Routing Synchronization Strategy Problems

   In order to ensure normal data forwarding after VM migration,
   routing synchronization between the source network and destination
   network is needed.

3.1.4. The migration protocol state machine of the VM online migration
       across subnets

   Regarding the routing strategy discussed earlier, compared to
   migration within the same IP subnet, the changes include the IP
   allocation strategy and the routing synchronization strategy.  So
   the state and handling of routing updates must be included in the
   state machine of VM migration across subnets, at the preparation
   phase before the VM migration.  If crossing subnets is allowed,
   network redirection technology should be used.  IP-in-IP technology
   has the advantage of good compatibility with network equipment,
   requiring only software upgrades.

3.1.5. Resource Gateway Problems

   A resource gateway is needed to record the IP address resources that
   have been used, and the IP network segments to which the used IP
   addresses belong.

3.1.6. Optimized Location of Default Gateway

   The VM's default gateway should be in close topological proximity to
   the ToR that connects the server presently hosting that VM.

3.1.7. Other Problems

   Migration across domains imposes new requirements on network
   protocols; for example, the ARP response packet mechanism is no
   longer applicable in the WAN.  In addition, some packets will be
   lost during the migration, which is not acceptable for parallel
   computing.  There are also problems such as computing resource
   sharing across multiple administrative domains, etc.
3.2. The Virtual Network Model

   Based on the above problems, two requirements are added to the
   virtual network model.  First, routing information is adjusted
   automatically according to the physical location of the VM after the
   VM is migrated to a new subnet.  Second, a logical entity, namely
   the "virtual network communications agent", is added, which is
   responsible for data routing, storage, and forwarding in cross-
   subnet communications.  The agent can be dynamically created and
   revoked, and a data communications agent can be running on each
   server.

   The overlay layer consists of VMs and the communications agents.
   Each of the virtual networks on top, such as VN1 and VN2, is
   composed of the VMs and the communications agents as needed.  Since
   membership is as required, VMs and agents may come from different
   networks, and the connections are established through dedicated
   tunnels between the communications agents.

3.3. The Processing Flow

   During the process, VM migration messages will trigger topology
   updates of the VM clusters in the source virtual network and the
   destination virtual network.  It is therefore required that the two
   ends acquire each other's network topology, routing protocols, and
   IP address assignment rules, so the VM can be assigned a unique IP
   address.  The routing information of the communications agents is
   updated.  The communications agent captures the corresponding VM
   packets, encapsulates them into the data section of tunnel packets,
   and adds the necessary control information (such as self-defined
   forwarding rules).  After encapsulation, these packets are
   transferred to the destination network through the tunnels between
   the communications agents.  The communications agent in the
   destination network de-capsulates the packets, processes the
   information, and then delivers the packets into the destination
   network.  The data transfer process across subnets is now complete.

   According to the above processing flow, the modules that need to be
   modified can be divided by function as follows: routing management,
   MAC capture, tunnel packet encapsulation, tunnel forwarding, tunnel
   packet de-capsulation, and forwarding in the destination network.

3.4. Problems of NVE/OBP location

   VMs communicate with each other through the interconnecting network,
   either within the same domain or between different domains.
   Depending on the NVE/OBP position, the processing on the control
   plane is different.

3.4.1. NVE/OBP on the Server

   Assume that a set of VMs and the network that interconnects them are
   allowed to communicate with each other; the MAC source and
   destination addresses in the Ethernet headers of the packets
   exchanged among these VMs are preserved.  This is L2-based VM
   communication within a LAN.  Any VM should have its own IP address.
   If a VM belongs to more than one domain, the VM will have multiple
   IP addresses and multiple logical interfaces, similar to the model
   of L3 switches.

   Different VM clusters are distinguished by the VLAN mechanism within
   the same L2 physical domain.  Apart from these cases, the VLAN-ID
   does not work.  For example, in the case of VM communication across
   IP subnets, the packets are encapsulated at the NVE, delivered
   directly to the peer NVE, and then transferred to the destination
   VM.
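   The per-VN encapsulation just described can be pictured as a lookup
   that maps (VN identifier, inner VM MAC address) to the IP address of
   the peer NVE currently hosting that VM.  The sketch below is a
   hypothetical illustration of such a table and of how a migration
   event re-points an entry; the class and method names are assumptions
   for illustration only, not part of any NVO3 specification.

      from typing import Dict, Optional, Tuple

      class NveForwardingTable:
          def __init__(self) -> None:
              # (vni, vm_mac) -> IP address of the peer NVE.
              self._entries: Dict[Tuple[int, str], str] = {}

          def learn(self, vni: int, vm_mac: str, nve_ip: str) -> None:
              # Install or move a mapping, e.g. after a VM migrates.
              self._entries[(vni, vm_mac)] = nve_ip

          def lookup(self, vni: int, vm_mac: str) -> Optional[str]:
              # None is where a real NVE would fall back to the
              # control plane or to flooding within the VN.
              return self._entries.get((vni, vm_mac))

      table = NveForwardingTable()
      table.learn(100, "52:54:00:aa:bb:cc", "192.0.2.10")    # before
      table.learn(100, "52:54:00:aa:bb:cc", "198.51.100.7")  # after
      assert table.lookup(100, "52:54:00:aa:bb:cc") == "198.51.100.7"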
   Once migration across the L3 network occurs, some scenarios will
   cause the MAC source address to be modified.  A VM may belong to a
   different VM cluster network (similarly distinguished by VNI).  As
   it is very transparent to the network topology and to L2/L3
   protocols, NVE/OBP on the server should be the default configuration
   mode.

   Scenario classification:

   (1) The NVEs on two servers are members of the same VN (NVEs in the
   same L2 physical domain).  This is the simplest case.

   (2) The NVEs on two servers are members of different VNs, and there
   is interconnection between the two VNs.  The NVEs on the servers are
   required to be added to the L3 domain.

   (3) The NVEs on two servers are members of different VNs, and there
   is no interconnection between the two VNs.  The routing mechanism
   needs to be considered.

   (4) The NVEs on two servers are members of the same VN, and there is
   no interconnection between the two domains.  The discovery mechanism
   between the two domains needs to be considered.

3.4.2. NVE/OBP on the ToR

   For an NVE on the ToR, the NVE needs to deal with the VIDs of
   various packets.  Once a VM is migrated, the rules of the source
   network need to be migrated as well, causing physical network
   configuration changes.  Therefore, it is required to develop a
   series of rules to deal with this.  In this case, the VID used by
   the VM has global significance, and the various rules and the usage
   range of the VID are required to be provided.  The VLAN-ID used by a
   given VM refers to the VLAN-ID carried by the traffic that is
   originated by that VM within the same L2 physical domain as the VM.

   Scenario classification:

   (1) The NVEs on two servers are members of the same VN (NVEs in the
   same L2 physical domain).  This is the simplest case.

   (2) The NVEs on two servers are members of different VNs, and there
   is interconnection between the two VNs.  The NVEs on the servers are
   required to be added to the L3 domain.

   (3) The NVEs on two servers are members of different VNs, and there
   is no interconnection between the two VNs.  The routing mechanism
   needs to be considered.  This will result in changed MAC source and
   destination addresses in the Ethernet headers of the packets being
   exchanged.

   (4) The NVEs on two servers are members of the same VN, and there is
   no interconnection between the two domains.  The discovery mechanism
   between the two domains needs to be considered.

   As NVEs may belong to different domains, if an NVE communicates with
   another NVE in the same domain, the VLAN-IDs of the packets
   exchanged should be the same; to simplify processing, the VLAN-IDs
   are allowed to be removed.  But when an NVE communicates with
   another NVE in a different domain, the VLAN-IDs of the packets
   exchanged may be different.

3.5. The Evolution Problems of The Logical Network Topology in VMMI
     Environments

   The question is whether there is any relation between VM migration
   and the topology of the network within a data center.  In simple
   implementations, seamless VM migration should be realized over the
   Layer-2 network.  Since a large number of VMs and their applications
   are running in the same Layer-2 domain, VM migration may be very
   stressful from the bandwidth utilization viewpoint of the data
   center switching network.  In order to improve bandwidth
   utilization, it is required to upgrade the load balancing capability
   of the network, which has numerous equal-cost multipaths (ECMP)
   between different points.
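   For readers unfamiliar with how ECMP spreads load, the sketch below
   shows the usual flow-hashing idea: hash the flow's 5-tuple and take
   the result modulo the number of equal-cost paths, so that all
   packets of one flow stay on one path (avoiding reordering) while
   distinct flows spread across the fabric.  This is a generic
   illustration, not a mechanism defined by this draft; real switches
   implement such hashes in hardware.

      import hashlib

      def ecmp_path_index(src_ip: str, dst_ip: str, proto: int,
                          src_port: int, dst_port: int,
                          num_paths: int) -> int:
          # MD5 here is only for illustration; hardware uses simpler,
          # faster hash functions.
          key = "%s|%s|%d|%d|%d" % (src_ip, dst_ip, proto,
                                    src_port, dst_port)
          digest = hashlib.md5(key.encode()).digest()
          return int.from_bytes(digest[:4], "big") % num_paths

      # Two flows between the same hosts may take different paths:
      print(ecmp_path_index("10.0.0.1", "10.0.1.1", 6, 33000, 80, 4))
      print(ecmp_path_index("10.0.0.1", "10.0.1.1", 6, 33001, 80, 4))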
   Although multi-root trees (such as Fat-Tree, MatrixDCN, and other
   network topologies) and their protocols support ECMP, we can achieve
   this by configuring appropriate routing, or through TRILL or SPB.
   However, implementing TRILL or SPB requires eliminating or upgrading
   the existing equipment.  If we can encode positions in the topology
   into IP or MAC addresses while using Fat-Tree or MatrixDCN network
   topologies, we can realize seamless and transparent VM migration
   within the data center, on the premise that the large layer-2
   network is composed of existing low-end switching equipment.

   Note that although the Ethernet and IP protocols are meant to
   support arbitrary topologies, these Layer-2 and Layer-3 network
   protocols are not flexible enough for use in data center
   environments.  The lack of flexibility may result in lack of
   scalability, management difficulties, inflexible communications, and
   poor fault tolerance.  These ultimately result in a lack of support
   for flexible VM migration in increasingly large and complex Layer-2
   networks.  However, if we can solve these problems, we will be able
   to achieve flexible migration of VMs in scalable, fault-tolerant
   layer-2 data center networks.

   Some solutions are moving forward in the direction of solving the
   problems above, and there have been several new topological models
   and routing architectures.  These include the Fat-Tree fabric and
   the MatrixDCN fabric.  MatrixDCN is a new style of network fabric
   for data center networks.  These fabrics can support
   super-large-scale networks, including more than 100,000 servers,
   without performance degradation.  Furthermore, through ECMP
   technology, MatrixDCN can eliminate the bandwidth bottleneck
   problems of canonical tree-structured data center networks.  The
   MatrixDCN fabric is described in [Matrix DCN, I-D.sun-matrix-dcn].

3.6. Cloud Service Virtualization Requirements

   The following sub-sections present the requirements of logical and
   physical elements for Cloud/DC service virtualization and their
   operations.

3.6.1. Requirement of logical element

   o Resource Allocation Gateway (RA GW)

   Network service providers provide virtualized basic network
   resources for tenants between data centers; within the data center,
   the facilities include virtualized computing and virtualized storage
   resources.  The RA gateway's role is to provide access to the
   virtualized resources.  These resources are divided into the
   following three categories: networking resources, computing
   resources, and storage resources.  The RA gateway compares the
   demanded networking, computing, and storage resources with the
   available resources, finds the corresponding relations, and achieves
   globally reasonable matching in resource scheduling.  The DC GW's
   function, described below, is a subset of the RA GW functions.

   o Data Center Gateway (DC GW)

   The DC gateway provides access to the data center for different
   outside users, including Internet access and VPN connection users.
   In the existing DC network model, the DC GW may be a router with
   virtual routing capabilities, or may be a PE device of an
   IPVPN/L2VPN connection.  Core nodes that perform the roles of DC GWs
   may also provide Internet connectivity, inter-DC connectivity, and
   VPN support.

   o Core Router / Switch

   These are high-end core nodes/switches with routing capabilities
   located in the core layer, connecting aggregation layer switches.
   o Aggregation Layer Switch

   This switch aggregates traffic from the ToR switches and forwards
   the downstream traffic.  The switch can be a normal aggregation
   switch, or multiple switches virtualized into a single stack switch.

   o Access Layer ToR Switch

   Access layer ToR switches are usually dual-homed to the parent node
   switch.

   o Virtual Switch

   This is a virtual software switch which runs on a server.

   The requirements related to the above demand that the L2/L3 tunnel
   be terminated at one of the entities mentioned above.

3.6.2. Requirements for Resource Allocation Gateway (RA GW) Function

   The emerging DC and network providers offer virtualized computing,
   storage, and networking resources and related services.  Tenants are
   identified despite overlapping addresses, and share a pool of
   storage and networking resources.  Therefore, a virtual platform is
   needed, with control and management capabilities for virtual
   machines, virtual services, virtual storage, and virtual networks.
   What tenants see is a subset of the above four entities.  The
   virtualized platform is built on the framework of the physical
   network, physical servers, physical switches and routers, and
   physical storage devices.  Through the virtual platform, the tenants
   are offered globally scheduled resources for sharing throughout the
   entire system.

   The RA GW collects information related to system-wide availability
   of computing, storage, and networking resources.  The RA GW then
   allocates appropriate quantities of computing, storage, and
   networking resources to the tenants according to certain policies
   and the demands for resources.  Note that in order to prevent any
   single point of failure, the RA GW needs to have backup support.
   The global resource availability information and scheduling
   information (between the resource allocation gateway and the backup
   resource allocation gateway) also needs real-time backup.

   It is possible to provide automatic matching and scheduling of the
   virtualized resources, dynamically adjusted according to operating
   conditions.  This can optimize utilization of computing resources,
   networking resources such as IDC interconnection resources and IDC
   internal routing and switching resources, and storage resources.  It
   should consider optimization of network path routing for matching
   with network resources.  Routing selection can be based on the
   degree of matching between the required bandwidth and the bandwidth
   that can be provided, the shortest path, service level, and user and
   usage level.  These factors need to be considered in the
   decision-making process.

3.6.3. Performance Requirements

   Any preferred solution should be able to easily support a large
   number of tenants sharing the data center resources.  It is also
   required to support a large (more than 4K) number of VLANs.  For
   example, there are a number of VPN applications -- VPLS or IP VPN --
   which serve more than 10K tenants, each requiring multiple VLANs.
   In this scenario the availability of 4K VLANs is not sufficient for
   the tenants.

   The solution should guarantee high quality of service, and must
   ensure that a large number of network connections are not
   interrupted even during overloads or minor failure conditions.  The
   connectivity should meet carrier-class reliability and availability
   requirements.
3.6.4. Fault Tolerance Capability Requirements

   In the event of any fault or error, it is required to quickly
   recover from the error condition.  Error recovery includes network
   fault recovery, computing power recovery, VM migration recovery, and
   storage recovery.  Among these, network fault recovery and computing
   power recovery are the fundamental requirements for VM migration
   recovery and storage recovery.

   Network fault recovery: Once an error or fault condition is
   identified in virtual network connectivity, alarms should be
   triggered, and recovery using a backup virtual network should be
   automatically activated.

   Computing capability recovery: Once computing capability fails, an
   efficient detection mechanism is needed to find the problem, so that
   services can be scheduled onto backup virtual machines.

   VM migration recovery: In the event of VM migration failure, it is
   required to automatically restore the original state of the virtual
   machines so that users' services are not adversely impacted.

   Storage recovery: In the event of storage failures, it is required
   to automatically find a backup virtual storage resource so that it
   can be enabled or activated immediately.  The response and recovery
   times should be very short in order to minimize service delay and
   disruptions.

   After a VM migration, it is required to consider the impact on the
   switching network, such as whether the new network environment will
   have a problem of insufficient bandwidth.  Although an initial
   judgment is made at the consultation phase before the migration, it
   cannot guarantee that no problems will occur after the migration.
   In addition, if the destination DC needs to activate standby servers
   and additional network resources, it may be worthwhile to consider
   allocating and activating additional server and network resources.
   And, in some cases, some routing policies -- on network segments and
   server clusters -- may need to be adjusted as well after migration.

3.6.5. Network Model

   Traditionally, the DCs have their own private networks for
   interconnection among themselves.  Alternatively, the data centers
   can use an independent WAN service provider's interconnection
   facilities for primary and/or secondary connections.

3.6.6. Types and Applications of VPNs Interconnection between DCs
       which provide Cloud Services

3.6.6.1. Types of VPNs

   Layer-3 VPN:

   o BGP/MPLS IP Virtual Private Networks (VPNs), RFC 4364

   Layer-2 VPN:

   o PBB + L2VPN

   o TRILL + L2VPN

   o VLAN + L2VPN

   o NVGRE [draft-sridharan-virtualization-nvgre-00]

   o PBB VPLS

   o E-VPN

   o PBB-EVPN

   o VPLS

   o VPWS

3.6.6.2. Applications of L2VPN in DCs

   It is a very common practice to use L2 interconnection technologies
   for DC interconnection across geographical regions.  Note that VPN
   technology is also used to carry L2 and L3 traffic across the
   IP/MPLS core network.  This technology can be used within the same
   DC to support scalability, or for interconnection across L3 domains.
   VPLS is commonly used for IP/MPLS connection over the WAN, and it
   supports transparent LAN services.  IP VPN, including BGP/MPLS IP
   VPN and IPsec VPN, has been used in a common IP/MPLS core network to
   provide virtual IP routing instances.
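   The scale comparisons made in the remainder of this section (4K
   VLANs versus more than 16M PBB or VXLAN instances) fall directly out
   of the identifier field widths; the following quick check is
   provided only as a worked illustration of those numbers.

      # 802.1Q VLAN ID: 12 bits, with two reserved values.
      usable_vlans = 2**12 - 2
      # PBB I-SID and VXLAN VNI: 24-bit identifiers.
      instance_space = 2**24

      print(usable_vlans)     # 4094     -> the "4K VLANs" limit
      print(instance_space)   # 16777216 -> the "more than 16M" figure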
   The implementation of PBB plus L2VPN can take advantage of some of
   the existing technologies.  It is flexible to use a VPN network in
   the cloud computing environment, and it can support a sufficient
   number of VPN connections/sessions (networking resources), much
   larger than the 4K-VLAN mode of L2VPN.  Therefore, it can achieve an
   effect similar to that of VXLAN.  Note that PBB can not only support
   access to more than 16M virtual LAN instances, it can also separate
   customers and provide different domains through isolated MAC address
   spaces.

   The use of PBB encapsulation has one major advantage.  Since the
   VMs' MAC addresses are not processed by the ToRs and core switches,
   the MAC table sizes of the ToRs and core switches may be reduced by
   two orders of magnitude; the specific number is related to the
   number of virtual machines on each server and the VMs' virtual
   interfaces.

   One way to solve problems in the DC is to deploy other technologies
   in the existing DC network.  A service provider can separate its
   VLAN domains into different VLAN islands; in this way each island
   can support up to 4K VLANs.  The VLAN domains can be interconnected
   via VPLS, with the DC GWs used as VPLS PEs.  If the existing
   VLAN-based solutions are retained only in the VSw, while the number
   of tenants in some VLAN islands is more than 4K, the service
   provider needs to deploy VPLS deeper in the DC network.  This is
   equivalent to supporting L2VPN from the ToRs, and using the existing
   VPLS solutions to enable MPLS on the ToR and core DC elements.

3.6.6.3. Applications of L3VPN in DCs

   IP VPN technology can also be used for data center network
   virtualization.  For example, multi-tenant L3 virtualization can be
   achieved by assigning a different IP VPN instance to each tenant who
   needs L3 virtualization in a DC network.  There are many advantages
   of using IP VPN as a Layer-3 virtualization solution within the DC
   compared to using existing virtual routing DC technology.  Some of
   the advantages are as mentioned below:

   (1) It supports many VRF-to-VRF tunneling options covering different
   operational models: BGP/MPLS IP VPN, IP or GRE tunneling for L3 VPN,
   etc.

   (2) The IP VPN instances used for cloud services below the WAN can
   connect directly to the IP VPNs involved in the WAN.

3.6.7. VN Requirements

   The Virtual Networks (VNs) consist of the virtual IDC network and
   the virtual DC-internal switching network.  These VNs are built on
   the basis of the physical networks.  VM migration is not affected by
   the physical network; as long as it stays within the scope of the
   VN, a VM is free to migrate if it satisfies the necessary
   conditions.  In addition, the network architecture and
   forwarding/switching capacity should match between the source
   network and destination network, without causing any concern for the
   physical network.

   The physical characteristics of the network, such as VLANs, IP
   subnets, L2 protocol entities, QoS-supporting entities, etc., are
   abstracted as logical elements of the VN.  Because the VMs operate
   in the VN environment, each VM has its associated logical elements,
   such as CPU processes, I/O, memory, disk, etc., and the VN also has
   a corresponding set of logical elements.  In general, the VNs are
   isolated from each other.  The VMs within each VN communicate using
   their own internal addresses, and send and receive Ethernet packets.
   VNs are not tied to a specific implementation; the implementation
   can use the Internet, L2VPN, L3VPN, GRE, etc.  At the VN layer, IP
   can be used to make that distinction.  Traffic traverses a firewall
   into the VN, and ACLs and other security policies are also needed in
   the access layer.

3.6.8. Packet Encapsulation Problems

   In order to implement a virtual network (VN), a method similar to an
   overlay address is required.  The overlay address can be realized by
   VXLAN or by the I-SID of PBB+L2VPN.  The overlay address works as an
   identifier corresponding to each instance of a VN.  The
   implementation model requires that an edge switch or router act as
   the DC GW for the encapsulation and de-encapsulation of the tunnel
   packets.  The various VNs within the DC rely on the overlay address
   in order to distinguish and separate one from the other.  Each VN
   also contains 4K VLANs for its internal use.  The data packets
   travel to the DC interconnection network through the DC GW, and are
   encapsulated for subsequent transmission.

   The main issue related to the above is the support of encapsulation.
   In the L2 network, VXLAN supports the VLAN expansion requirements;
   in NVGRE, a similar problem is resolved in a different way.
   Therefore, in order to achieve seamless migration of VMs across DCs
   that support different VLAN expansion mechanisms, unification of the
   packet encapsulation methods is required.

3.6.9. VM Migration Problem in mixed IPv4 and IPv6 Environment

   With the proliferation of IPv6 technology, the existing IPv4
   networks will have IPv6 hosts attached to them.  This is driving the
   development of a series of tunnel technologies, e.g., 6to4 tunnel
   technology, ISATAP tunnel technology, and so on.  ISATAP is a
   point-to-point automatic tunnel technology, and 6to4 is a multipoint
   automatic tunnel technology mainly used for attaching multiple IPv6
   islands over an IPv4 network to the IPv6 network.  ISATAP and 6to4
   tunnel technologies work through an IPv4 address embedded in the
   destination address of the IPv6 packets, which is automatically
   obtained at the end of the tunnel.

   The following issues are pertinent to the migration of VMs across
   data centers in a mixed (IPv4 and IPv6) network environment.

3.6.9.1. Real-time Perception of Availability of Global Network and
         Storage Resources

   In the current system, the status of availability of network
   resources and storage resources may not be reported in hard real
   time.  This may cause a mismatch between the reported and actually
   available virtual machine/storage system resources in the data
   centers.  However, at the global scale, the compute and storage
   resources in the distributed data center system may need to be used
   more efficiently.  Without real-time, up-to-date information about
   system resource availability, the network resources cannot be used
   more efficiently.

   Therefore, a management model needs to be established.  This model
   needs to keep track of system-wide network resources and storage
   resources, and dispatch them on an as-needed basis.  The management
   model can be integrated into the framework of virtual machine
   migration currently being discussed in DMTF [DMTF VSMP].  The real
   challenges here are how to learn about the availability of
   system-wide networking, compute, and storage resources.  A set of
   uniform methods, mechanisms, and protocols would be very useful to
   resolve these issues.
3.6.9.2. The real-time perception of global available network resource
         and requested network resource for matching with storage
         resources

   In mixed IPv4 and IPv6 networks, a multi-tunneling VPN gateway
   solution may be useful to resolve the problem of establishing
   communication between heterogeneous networks.  This will be helpful
   for supporting seamless communication across heterogeneous data
   centers about the availability of system-wide resources.

3.6.9.3. The real-time perception of global requested network resource
         for matching with storage resources

   Access to data center virtual machine / storage resources can be
   performed accurately when we have a set of standardized APIs,
   resource formats (memory, storage, processing, communications,
   etc.), and communication protocols.  The availability of virtual
   machine / storage system resources in the global scope needs to be
   registered, and their status needs to be reported to the resource
   management system in the cloud system.  Eventually, the resource
   management system in the cloud system is kept well informed of
   system-wide network resources.

3.6.10. Selection of Migration

3.6.10.1. Requirements with Different Network Environments and Protocol

   Currently in large-scale DCs, Layer-2 interconnection techniques are
   mainly used for migration of virtual machines, but Layer-3
   interconnection techniques for VM migration also exist.  These two
   technologies are suitable for different implementation environments
   and scenarios.  The former is often used for frequent data migration
   with strict requirements on data security, such as data migration
   and backup in a bank, whereas the latter is commonly used for data
   migration for personal or mobile users, or bulk data transfer
   between different service providers.

   Because of users' demands for the establishment of a unified
   management platform, it will become more and more important to build
   a distributed PaaS across different cloud/DC service providers.  No
   user is willing to maintain too many independent platforms.  At the
   same time, sharing of resources across multiple data centers is
   becoming a major trend.  As a result, it will become very cumbersome
   for data center managers to build a large number of VPN connections
   for all data centers.  What may be needed is a portal operator, who
   can manage all the internal VPN connections between the clouds/DCs
   and can unify the scheduling of data/VM migration in order to
   achieve optimum utilization of resources.

3.6.10.2. Requirements for Live Migration of Virtual Machines

   The scenarios for live migration of VMs across DCs include the
   following: (a) migration across IPv4 networks and across IPv6
   networks, (b) migration from IPv4 to IPv6 networks and vice versa,
   and (c) migration based on mobile IP.

   Live migration of VMs may be more suitable for mobile applications
   for small-scale and home users.  The complexity of the network can
   be fully shielded from the users, as long as both source and
   destination have either IPv4 or IPv6 addresses.  This migration
   paradigm can be more secure and applicable in a Layer-3 networking
   environment.
3.6.11. Access and Migration of VMs without users' Perception

For VM migration without the users' perception, it is required to
migrate VMs from one DC to another without causing any significant
disruption of services. In essence, the users should not be able to
perceive that the VM migration has occurred. To achieve this, no (or
only an insignificant number of) critical data packets may be lost
during the process of VM migration. The following two considerations
are helpful in achieving this:

i. First, consider how to avoid traffic roundabout while taking the
existence of the traffic roundabout problem as a prerequisite.

ii. Second, consider how to achieve a state with no user-perceptible
migration and no traffic roundabout, taking the absence of the traffic
roundabout problem as a target.

The following are the relevant problems and possible solutions in
these two areas.

3.6.11.1. VM Migration Problems and Strategies in the WAN with having
Traffic Roundabout as a Prerequisite

                               _____________
                              /             \
                    user c --+     MAN C     +
                              \_____________/
                                     |
                                    \|/
                            =--=--=--=--=--=--=
                           =                   =
                           =  backbone network =
                           =                   =
                            =--=--=--=--=--=--=
                             /               \
                           \|/               \|/
                      ___________         ___________
                     /           \       /           \
          user a ---+    MAN A    +     +    MAN B    +--- user b
                     \___________/       \___________/
                           |                   |
                       _________           _________
                      |  VM-A   |         | gateway |
                      | gateway |         |_________|
                      |_________|              |
                           |                   |
                        ________            ________
                       | |VM-A| | migration| |VM-A| |
                       | |____| | ========>| |____| |
                       | Server |          | Server |
                       |________|          |________|

              Figure 1: Roundabout Traffic Scenario

3.6.11.1.1. VM Migration Requirements

For migration in a Layer-2 (L2) network, it is required to keep the VM
MAC/IP address the same as it is in the source domain. This will help
live VM migration and seamless inter-DC communications among the
service providers.

3.6.11.1.2. A Scenario

Let us consider the scenario where a VM needs to be migrated from the
IDC in metro A to the IDC in metro B. There is almost no traffic
roundabout for users within the metro (such as user a). For access to
IDC services via the WAN, such as from a user in metro C, the client
traffic must first reach the VM-A gateway after the VM migration, and
is then sent to the migrated VM through the Layer-2 tunnel.

3.6.11.1.3. A Possible Solution

Through mechanisms such as the DNS service, businesses can access
services from a location/DC that is as close as possible, and the
roundabout routes can be minimized after migration. However, the
shortcoming of this approach is that, for access across the metro
network, traffic roundabout issues still remain. This approach works
around the problem rather than completely solving it. Moreover,
additional processing is involved in the control of the DNS service,
which increases the complexity of the solution.
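The following Python sketch illustrates the DNS-based workaround
described above, under the assumption of a hypothetical distance
metric between client metros and DCs; the service name and numeric
values are placeholders.

   # A sketch of the DNS-based workaround: answer a service query
   # with the DC closest to the client, minimizing roundabout routes
   # after migration. Distances are hypothetical metrics.
   SERVICE_LOCATIONS = {"example-service": ["dc-metro-a", "dc-metro-b"]}

   # illustrative client-metro -> DC "distance" (hops or latency)
   DISTANCE = {
       ("metro-a", "dc-metro-a"): 1, ("metro-a", "dc-metro-b"): 5,
       ("metro-c", "dc-metro-a"): 4, ("metro-c", "dc-metro-b"): 3,
   }

   def resolve(service, client_metro):
       # Return the nearest DC; clients reaching across the metro
       # network may still see roundabout paths, as noted above.
       dcs = SERVICE_LOCATIONS[service]
       return min(dcs, key=lambda dc: DISTANCE[(client_metro, dc)])

   print(resolve("example-service", "metro-c"))  # dc-metro-b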
3.6.11.2. VM Migration Problems and Strategies in the WAN without
having Traffic Roundabout as a Target

In this process of VM migration, in order to achieve real-time
migration without the users' perception, the entire state of the
management programs (including firewalls) needs to migrate as the VMs
migrate. The state migration of the firewalls is the key to ensuring
that the packets in the original firewalls' data flows are neither
lost nor mis-routed during the VM migration.

Before a VM migrates to a new DC environment, the firewalls have
recorded the session tables of the existing VM connections. In the
event of VM migration, the firewalls in the new DC location will be
used for access to the VM. If the firewalls in the new location do not
have the session tables of the original firewalls' data flows, packets
will be lost or mis-routed. The original sessions will be
disconnected, and the users' data flows will fail to reach the VM. To
solve this problem, the original firewall's session tables in use need
to be migrated to and synchronized with the session tables of the
firewall in the new VM location. The session table should contain at
least the following information: source IP address, destination IP
address, source port, destination port, protocol type, VLAN ID, time
of expiration, and public guard information for firewall defense.

Since the firewall's session table needs to migrate when a VM
migrates, the deployment of the source and destination firewalls
should be known in advance. There are at least two kinds of firewall
deployment. The first kind is the centralized deployment. In this
case, the firewalls are placed at the connection point of the DC and
the WAN. Each DC has firewalls either on or adjacent to the core
switches. The second kind is the distributed deployment. In this case,
the firewalls are distributed on the aggregation switches or access
switches. The advantages of the former are convenient management and
deployment; the disadvantage is that the firewalls can easily become a
bottleneck because of centralized/aggregated processing. The advantage
of the latter is distributed processing of huge VM data flows in a
large L2 network.

Once the deployment of the firewalls is known, it is necessary to
determine how to migrate the firewall session table from the source
location to the destination location. Since the location and number of
firewalls differ between the centralized and distributed deployments,
the mechanisms utilized to migrate the session tables in these two
deployments are not exactly the same. These are new challenges to be
addressed for VM migration.
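As an illustration, the following Python sketch models a session-table
record with the fields listed above and a minimal synchronization
step. The inter-firewall transport is abstracted away, and all names
are hypothetical.

   # A sketch of the minimum session-table record and its
   # synchronization to the firewall at the new VM location.
   from dataclasses import dataclass

   @dataclass
   class SessionEntry:
       src_ip: str
       dst_ip: str
       src_port: int
       dst_port: int
       protocol: str      # e.g., "tcp" or "udp"
       vlan_id: int
       expires_at: float  # time of expiration
       guard_info: str    # public guard info for firewall defense

   def migrate_sessions(src_fw_table, dst_fw_table, vm_ip):
       # Copy only sessions terminating at the migrating VM, so the
       # destination firewall can forward in-flight flows without
       # loss or mis-routing.
       moved = [e for e in src_fw_table
                if e.dst_ip == vm_ip or e.src_ip == vm_ip]
       dst_fw_table.extend(moved)
       return len(moved)

   table_a = [SessionEntry("203.0.113.5", "10.0.0.7", 40000, 443,
                           "tcp", 100, 1700000000.0, "none")]
   table_b = []
   print(migrate_sessions(table_a, table_b, "10.0.0.7"))  # 1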
3.6.12. Review of VXLAN, NVGRE, and NVO3

In order to solve the problem of an insufficient number of VLANs in
the DC, techniques like VXLAN and NVGRE have adopted two major
strategies: one is encapsulation, and the other is tunneling. Both
VXLAN and NVGRE use encapsulation and tunneling to create a large
number of VLAN subnets, which can be extended to Layer-2 and Layer-3
networks. This solves the problem of the limited number of VLANs as
defined by IEEE 802.1Q, and helps achieve shared load balancing in
multi-tenant environments in both public and private networks.

The VXLAN technology was introduced in 2011, and it is designed to
address the number restrictions of 802.1Q VLANs. Technologies like
MAC-in-MAC and MAC-in-GRE also extend the number of VLANs. However,
VXLAN additionally attempts to address the issues of inadequate
utilization of link resources and of monitoring packets after
re-encapsulation of the header more effectively. The frame format of
VXLAN is the same as that of OTV and LISP, although these three
solutions solve different problems of IDC interconnection and VM
migration. In VXLAN, the packet is encapsulated as MAC-in-UDP, and the
addressing is extended to 24 bits, which is an effective solution to
the restriction on VLAN numbers. The UDP encapsulation enables the
logical virtual network to extend across different subnets. It also
supports the migration of VMs across subnets.

The change of the frame structure adds the field for extending the
VLAN space. Note that VXLAN solves a different problem than OTV. OTV
solves the problem of IDC interconnection by building an IP tunnel
between different data centers through MAC-in-IP. VXLAN mainly solves
the problem of the limitation of VLAN resources in DCs due to the
increase in the number of tenants; the key is the 24-bit VNI field
that increases the number of VLANs. Both techniques can be applied to
VM migration, since the two packet formats are almost the same and
compatible.

NVGRE specifies a 24-bit Tenant Network Identifier (TNI) and resolves
some issues related to supporting multiple tenants in the DC network.
It uses GRE to create an independent virtual Layer-2 network, allowing
an otherwise limited physical Layer-2 network to expand across subnet
borders. Endpoints supporting NVGRE insert the TNI indicator into the
GRE header to separate the TNIs.

NVGRE and VXLAN solve the same problem, and the two technologies were
proposed at almost the same time. However, there are some differences
between them. VXLAN not only adds the VXLAN header (VNI), but also
adds an outer UDP encapsulation to the packet, which facilitates live
migration of VMs across subnets. In addition, differentiated services
can be offered to the tenants in the same subnet because of the use of
UDP. Both proposals are built on the assumption that load balancing is
a necessary condition for efficient operation. VXLAN randomly assigns
the source port number to achieve load balancing, while NVGRE uses the
reserved 8 bits in the GRE key field. However, there may be
opportunities to improve the capability of the control plane for both
mechanisms in the future.
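For concreteness, the following Python sketch packs the two identifier
fields compared above, following the field layouts in the VXLAN and
NVGRE drafts [VXLAN] [NVGRE]; the outer Ethernet/IP/UDP/GRE headers
and the inner frame are omitted.

   # The 8-byte VXLAN header carrying a 24-bit VNI, and the 32-bit
   # GRE key word carrying NVGRE's 24-bit TNI plus an 8-bit FlowID.
   import struct

   def vxlan_header(vni):
       # Byte 0: flags with the I bit (0x08) set, marking a valid
       # VNI; bytes 1-3 reserved; bytes 4-6: VNI; byte 7 reserved.
       assert 0 <= vni < 2**24
       return struct.pack("!II", 0x08 << 24, vni << 8)

   def nvgre_key(tni, flow_id):
       # NVGRE places the 24-bit TNI and an 8-bit FlowID (usable
       # for load balancing) in the 32-bit GRE key field.
       assert 0 <= tni < 2**24 and 0 <= flow_id < 2**8
       return struct.pack("!I", (tni << 8) | flow_id)

   print(vxlan_header(5000).hex())  # 0800000000138800
   print(nvgre_key(5000, 7).hex())  # 00138807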
3.6.13. The East-West Traffic Problem

Let us first discuss the background of the East-West traffic problem.
There is a variety of applications in the DC, such as distributed
computing, distributed storage, and distributed search. These
applications and services need frequent exchanges of transactions
between the business servers across the DCs. According to the
traditional three-tier network model, the data streams first flow
north-south and only then flow east-west. In order to improve the
forwarding efficiency of the data streams, it is necessary to update
the existing network model and network forwarding technology. Among
others, the Layer-2 multi-path technology being studied is one of the
directions for solving this problem.

Distributed computing is the basis of the transformation of the
existing IT services. It allows scalable and efficient use of the
sometimes underutilized computing and storage resources scattered
across the data centers. In typical data centers, the average server
utilization is often low in the existing network. The concepts of
virtualization and distributed computing can effectively solve the
problem of the capacity limitation of a single server in demanding
environments in certain DCs, via on-demand utilization of resources
and without impacting performance. This revolutionary technology of
distributed computing and services using resources in the DCs also
produces several horizontal flows of traffic. The application of
distributed computing technology on the servers produces a large
number of interactive traffic streams between servers.

In addition, the type of IDC influences the traffic model both within
and across data centers. The first type of IDC is operated by telecom
operators, who usually not only operate DCs but also supply bandwidth
to Level-2 ISPs. The second type is operated by large traditional ISP
companies. The third type is operated by IT enterprises that invest in
the construction of DCs. The fourth type is high-performance computing
(HPC) centers that are built by universities, research institutes, and
other organizations. Note that in these types of DCs, the south-north
traffic flow is significantly smaller than the horizontal flow, and
this poses the greatest challenges to network design and installation.
In addition to the normal flow of traffic due to distributed
computing, storage, communications, and management, hot backup and VM
migration requirements produce sudden lateral flows of traffic, with
associated challenges.

There are three potential solutions to the distributed horizontal flow
of traffic, as described below.

A. The first one solves the problem of east-west traffic within the
server clusters by exploiting representative technologies such as
vSwitch, DCell, BCube, and DCTCP.

B. The second solution works through the server network and the
Ethernet network, by exploiting technologies such as IEEE 802.1Qbg,
VEPA, and UCS.

C. The third solution is the network-based solution. The tree
structure of the traditional DC network is not inherently efficient
for horizontal flows of traffic. The problems can be solved in two
ways: (i) the direction of radical change: radically deforming the
tree structure into a multi-path structure, and (ii) the direction of
mild improvement: changing large L2 trees into small L2 trees and
meeting the requirements by expanding the interconnection capacity of
the upper nodes, clustering/stacking systems, and link trunking.

The requirements related to the above are as follows: stacking
technology across the data center requires specialized interfaces, and
the feasible transmission distance is limited.

The problems related to the above statement include the following: (a)
although TRILL resolves the multi-path problem of the Layer-2
protocol, it negatively impacts the multi-path properties of the
Layer-3 protocol. This is because only one active default router
supports the Virtual Router Redundancy Protocol (VRRP), which means
that the multi-path characteristics cannot be fully utilized by the
Layer-3 protocol. In addition, TRILL neither defines how to deal with
the problem of overlapping namespaces, nor provides any solution to
the requirement of supporting more than 4K VLANs.

3.6.14. Data Center Interconnection Fabric Related Problems

One of the most important factors that directly impact VMMI is the
connectivity among the relevant data centers. Many features determine
this required connectivity, including bandwidth, security, quality of
service, load-balancing capability, etc. These are frequently utilized
to decide whether a VM can join a host in real time or whether it
needs to join a VRF in a certain unit of VMs. This connectivity fabric
should be open and transparent, which can be achieved by developing
simple extensions to some of the existing technologies. The program
should have strong openness and compatibility; it must be easy to
deploy any required extensions as well.
The requirements related to the above are as follows:

o The negative impact of ARP, MAC, and IP entry explosion on an
  individual network that contains a large number of tenants should
  be minimized by DC and DC-interconnect technologies.

o The link capacity of both the intra-DC and the inter-DC network
  should be effectively utilized. Efficient utilization of the link
  capacity requires that traffic be forwarded on the shortest path
  between two VMs, both within the DC and across DCs.

o Support of east-west traffic between customers' applications
  located in different DCs.

o Management of VMs across DCs.

o Mobility of VMs and their migration across DCs.

Many mature VPN technologies can be utilized to provide connectivity
between DCs. The extension of VLANs and virtual domains between DCs
may also be utilized for this purpose.

3.6.15. MAC, IP, and ARP Explosion Problems

Network devices within data centers encounter many problems in
supporting the conventional communication framework, because they need
to accommodate a huge number of IP addresses, MAC addresses, and ARP
entries. Each blade server usually supports at least 16-40 VMs, and
each VM has its own MAC address and IP address. Entities like disks,
memory, FDB tables, MAC tables, etc. cause an increase in convergence
time. In order to accommodate this large number of servers, different
options for the network topology, for example a fat-tree topology or a
conventional network topology, may be considered.

The number of ARP packets grows not only with the number of virtual L2
domains or ELANs instantiated on a server, but also with the number of
VMs in each such domain. Therefore, scenarios like overload of ARP
entries on the server/hypervisor, exhaustion of ARP entries on the
routers/PEs, and processing overload of L3 service appliances must be
efficiently resolved. These problems easily propagate throughout the
Layer-2 switching network. Consequently, what is needed to resolve
these problems includes (a) automated management of MAC/IP/ARP in the
IDC, and (b) network deployments that reduce the explosion in the
number of MAC addresses required in DCs.

3.6.16. Suppressing Flooding within VLAN

Efficient operation of data centers requires that flooding of
broadcast, multicast, and unknown unicast frames within a VLAN (which
may be caused by improper configuration) be reduced.
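One commonly discussed direction, sketched below in Python, is
directory-assisted ARP suppression: an edge device answers ARP
requests from a pre-populated IP-to-MAC table instead of flooding them
through the VLAN. The directory source (e.g., a VM manager) is
assumed, not specified by this draft, and all bindings are
illustrative.

   # A sketch of directory-assisted ARP suppression at an edge
   # switch; unknown bindings fall back to flooding.
   ARP_DIRECTORY = {
       "10.1.0.11": "52:54:00:aa:bb:01",
       "10.1.0.12": "52:54:00:aa:bb:02",
   }

   def handle_arp_request(target_ip):
       mac = ARP_DIRECTORY.get(target_ip)
       if mac is not None:
           # Known binding: reply locally; no broadcast is forwarded.
           return ("reply", mac)
       # Unknown binding: fall back to flooding (the case to minimize).
       return ("flood", None)

   print(handle_arp_request("10.1.0.12"))  # ('reply', '52:54:00:aa:bb:02')
   print(handle_arp_request("10.1.0.99"))  # ('flood', None)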
3.6.17. Convergence and Multipath Support

Although STP is used to solve the broadcast storm problem in loops, it
may cause network oscillation, resulting in inefficient utilization of
resources. Possible solutions to this problem include switch
virtualization and the use of TRILL, SPB, etc. Consequently,
standardization of switch virtualization and support of complex
network topologies in TRILL/SPB would be very helpful.

3.6.18. Routing Control - Multicast Processing

In order to achieve efficient operation of data centers, the overheads
and delays due to the processing of (a) different types of packets
such as unicast, multicast, and broadcast packets, (b) ARP packets,
and (c) load-balancing/-sharing mechanisms must be minimized.

Note that STP bridging is often combined with IGMP and/or PIM snooping
to optimize multicast data delivery. However, since this snooping
mechanism operates on the local STP topology, all traffic goes through
the root bridge. This type of traversal may lead to sub-optimal
multicast traffic transmission. There are also additional overheads,
because each customer multicast group is associated with a forwarding
tree throughout the Ethernet switching network. Consequently,
development and standardization of an efficient Layer-2 multicast
mechanism to support intra- and inter-DC VM mobility would be very
useful.

3.6.19. Problems and Requirement related to DMTF

o Computing Resources

  It is required to standardize the format for virtualizing computing
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized computing
  resources would also be very useful.

o Storage Resources

  It is required to standardize the format for virtualizing storage
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized storage
  resources would also be very useful.

o Memory Resources

  It is required to standardize the format for virtualizing memory
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized memory
  resources would also be very useful.

o Switching Resources

  It is required to standardize the format for virtualizing switching
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized switching
  resources would also be very useful.

o Networking Resources

  It is required to standardize the format for virtualizing networking
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized networking
  resources would also be very useful.

4. Control & Mobility Related Problem Specification

4.1. General Requirements and Problems of State Migration

4.1.1. Foundation of Migration Scheduling

A series of inspections needs to be done before initiating the VM
migration process. The hypervisor should be able to confirm which data
centers need to be interconnected for migrating the VM data over the
network. The hypervisor should also be able to confirm which subnets
and servers in the current network are most suitable to accommodate
the migrated VMs.

4.1.2. Authentication for Migration

For VM migration, authentication is required for all of the following
entities: network resources, processor, memory and storage resources,
load balancer, firewall, etc.

4.1.3. Consultation for Assessing Migratability

After successful authentication, it is required to check that the
inter-DC networking resources can support the migration of the VMs.
The required resources include network bandwidth resources, storage
resources, resource pool scheduling or management resources, and so
on.
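A minimal Python sketch of these pre-migration inspections follows;
the entity records and resource keys are placeholders, not a
prescribed schema.

   # A sketch of the inspections of Sections 4.1.2 and 4.1.3:
   # authenticate every involved entity, then verify that inter-DC
   # resources can support the move. All checks are placeholders.
   def authenticate(entities):
       # 4.1.2: network, processor, memory/storage, load balancer,
       # firewall, etc. must all authenticate successfully.
       return all(e.get("credential") == "valid" for e in entities)

   def consult_resources(required, offered):
       # 4.1.3: bandwidth, storage, and pool-scheduling resources
       # must cover what the migration requires.
       return all(offered.get(k, 0) >= v for k, v in required.items())

   def may_migrate(entities, required, offered):
       return authenticate(entities) and consult_resources(required,
                                                           offered)

   entities = [{"name": "firewall", "credential": "valid"},
               {"name": "load-balancer", "credential": "valid"}]
   required = {"bandwidth_mbps": 1000, "storage_gb": 200}
   offered = {"bandwidth_mbps": 2000, "storage_gb": 500}
   print(may_migrate(entities, required, offered))  # True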
4.1.4. Standardization of Migration State

As an example of standardization of the VM state migration process,
the following related entities should be aware of the state of each
other. The flow of activities may be as follows:

Global detection -> authentication processing -> capability
negotiation -> session establishment -> initialization of instance ->
establish the beginning stage -> begin migration -> migration &
migration exception handling -> finish migration -> end stage ->
deletion of instances -> global detection

        +----------------------------+
        |                           \|/
        |                 +------------------+
        |                 | Global detection |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |  authentication  |
        |                 |    processing    |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |    capability    |
        |                 |   negotiation    |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |     session      |
        |                 |  establishment   |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |  initialization  |  establish the
        |                 |   of instance    |  beginning stage
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |        +------->| begin migration  |
        |        |        +------------------+
        |        |                 |
        |        |                \|/
        |  +------------+  Y
        |  | exception  |<-----  migration
        |  | processing |        exception?
        |  +------------+          |
        |                          | N
        |                         \|/
        |                 +------------------+
        |                 | finish migration |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |   destruction    |  end stage
        |                 |   of instances   |
        |                 +------------------+
        |                          |
        +--------------------------+

   Figure 2: A Flow Chart for State Migration between Data Centers

4.2. Mobility in Virtualized Environments

In order to support VM mobility, it is required to allow VMs to
migrate easily and repeatedly -- that is, as often as needed by the
applications and services -- among a large number (more than two) of
DCs. Seamless migration of VMs in mixed IPv4 and IPv6 VPN environments
should be supported by using appropriate DC GWs.

VMs in the resource pool should support mobility. These mobile VMs can
move either within a DC or from one DC to another remote DC. The
mobility can be triggered by factors like a natural disaster, load
imbalance, a cost (of space, electricity, etc.) reduction campaign,
and so on. When a VM is migrated to a new location, it should maintain
the existing client sessions. The VM's MAC and IP addresses should be
preserved, and the state of the VM sessions should be copied to the
new location.

Some widely used virtual machine migration tools require that the
management programs on the source server and the destination server be
directly connected via an L2 network. The objective is to facilitate
the implementation of smooth VM migration. One example of such a tool
is VMware's VMotion virtual machine migration tool.

(1) Firstly, a VMotion ELAN may need to provide protection and load
balancing across multiple DC networks.

(2) Secondly, in the current VMotion procedure, the new location of
the VM must be part of the tenant ELAN domain. When a new VM is
activated, a gratuitous ARP is sent, and the MAC FIB entries in the
"tenant ELAN" are updated to direct the traffic for that VM to the new
VM location.

(3) Thirdly, if the path needs IP forwarding, the reachability
information of the VM must be updated to the shortest-path information
to the VM.
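As an illustration of step (2), the following Python sketch emits a
gratuitous ARP for a migrated VM. It assumes the third-party scapy
library; the interface name and addresses are illustrative, and
sending raw frames typically requires elevated privileges.

   # A sketch of announcing a migrated VM with a gratuitous ARP so
   # that MAC FIB entries in the tenant ELAN are refreshed.
   from scapy.all import ARP, Ether, sendp

   def announce_vm(vm_ip, vm_mac, iface):
       # Gratuitous ARP: an unsolicited "is-at" reply, broadcast,
       # with sender and target IP both set to the VM's preserved IP.
       pkt = (Ether(src=vm_mac, dst="ff:ff:ff:ff:ff:ff") /
              ARP(op=2, hwsrc=vm_mac, psrc=vm_ip,
                  hwdst="ff:ff:ff:ff:ff:ff", pdst=vm_ip))
       sendp(pkt, iface=iface, verbose=False)

   # announce_vm("10.1.0.11", "52:54:00:aa:bb:01", "eth0")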
4.3. VM Mobility Requirements

4.3.1. Summarization of Mobility

Mobility refers to the movement of a VM from one server to another
server, within one DC or to a different DC, while maintaining the VM's
original IP and MAC address throughout the process. VM mobility does
not change the VLAN/subnet connection of the VM, and it requires that
the serving VLAN be extended to the new location of the VM. In
summary, the seamless mobility solution in the DC is based on IP
routing, BGP/MPLS MAC-VPN, BGP/MPLS IP VPNs, and NHRP.

4.3.2. Problem Statement

The following are the major issues related to supporting seamless
mobility of VMs.

The first problem is that the participating source server and
destination server in the VM migration process may be located in
different data centers. It may be required to extend the Layer-2
network beyond what is covered by the L2 network of the source DC.
This may create islands of the same VLAN in different (geographically
dispersed) data centers.

The second problem is that optimal forwarding in a VLAN that supports
VM mobility may involve traffic management over multiple data centers.

The third problem is that the support of seamless mobility of VMs
across DCs may not necessarily always achieve optimal intra-VLAN
forwarding.

The fourth problem is that the support of seamless mobility of VMs
across DCs may not necessarily always result in optimal routing.

5. Network Management Related Problem Specification

5.1. Data Center Maintenance

We note that the servers and the applications/services in the data
center should maintain uninterrupted service during the migration
process. In order to provide uninterrupted service during the
migration process, the following are some prerequisites:

o It is required to ensure that the networking and communication
  services remain uninterrupted between the source node and the
  destination node during the migration.

o A stateful migration may be preferred. It may be desirable not to
  respond to users' requests until a successful migration occurs. The
  service management program in the source server records the current
  state of the VM and saves users' requests for any service/operation
  on the VM in the source node.

o It is required to copy the state data of the source VM to the
  target VM in another DC; then the new VM in the target node (DC)
  can be activated for accepting the service requests.

o The service management program in the source server needs to store
  (in cache) both the operation requests and the current state of the
  source VM, and send those over the network to the service
  management program in the target server. As soon as the target
  server and VM become ready, the service management program in the
  target server publishes the received operation requests to the
  target VM. The target VM takes the received final state information
  of the source VM as its initial operational parameters.
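A minimal Python sketch of this request-caching step follows; the
service-manager interface and VM objects are hypothetical stand-ins.

   # While VM state is copied, the source-side manager withholds
   # responses and queues requests, then replays them to the target.
   import queue

   class MigrationProxy:
       def __init__(self):
           self.migrating = False
           self.pending = queue.Queue()

       def on_request(self, request, vm):
           if self.migrating:
               # Do not respond until migration succeeds; record the
               # request so no operation on the source VM is lost.
               self.pending.put(request)
               return None
           return vm.handle(request)

       def finish_migration(self, target_vm):
           # Replay cached requests against the activated target VM,
           # which starts from the source VM's final state.
           self.migrating = False
           while not self.pending.empty():
               target_vm.handle(self.pending.get())

   class EchoVM:
       def handle(self, request):
           print("handled:", request)

   proxy, vm_src, vm_dst = MigrationProxy(), EchoVM(), EchoVM()
   proxy.migrating = True
   proxy.on_request("write x=1", vm_src)  # queued, no response yet
   proxy.finish_migration(vm_dst)         # replayed at new location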
However, in real-life operations, system malfunctions may occur in any
one of the above four steps/scenarios. For example, it may be
difficult to ensure uninterrupted communication/networking between the
source node and the destination node during the entire migration
process. Maintaining sustainable network QoS may be complex, and VM
migration may take an excessively long time due to the lack of timely
availability of the required nodal/DC resources.

Now, if the VM migration time is excessively long, the users may need
to be allowed to continue using the source VM, and the changes to the
data during the migration must also be recorded. At the same time, it
is required to take measures to ensure that the amount of change in
the database and the application is as small as possible. This will
help achieve faster recovery, and at the same time the interruption
due to VM migration will be almost imperceptible to the users.

It may be useful if the IETF proposes a standard definition of
uninterrupted service for the VM migration scenario. This definition,
along with its parameters, can be the basis for checking the maturity
of various VM migration solutions. The definition should take into
account the time that the users/services can tolerate without any
perception of interruption in the operation. The total time is the sum
of the times required for executing the four steps/processes mentioned
at the beginning of this section. It may be expected that the most
mature solution for each of the steps/processes will offer the fastest
and best solution to the VM migration process.

The next problem related to this topic is the physical device
compatibility problem. When migrating a VM from one Physical Machine
(PM) to another, if the VM depends on some special driver or hardware
that is NOT available on the target PM, the migration process will
fail. For example, if a VM uses IOMMU technology, which is used to
access real hardware directly from the VM (not emulated by the
hypervisor, for high performance), and this device is not available on
the target PM, the VM migration process will fail. Therefore, a basic
requirement related to VM migration is checking for strict
compatibility between the source and target PMs before initiating the
migration process.

Another problem related to this topic is the migration of VMs between
heterogeneous hypervisors. We note that some virtual network functions
are implemented in the hypervisor, such as the vSwitch in VMware.

Additional requirements related to the above are as follows: stateful
and stateless VMMI processing need to be treated separately. Stateless
VMMI processing refers to the fact that the protocol state for a
transaction does not need to be preserved in memory. This lack of
state means that if follow-up processing is needed, the information
must be retransmitted before it can be processed. This could lead to a
significant increase in the amount of data that needs to be
transferred as the number of connections increases. For stateless VM
migration, there is no need to transfer previous state information,
and hence lightweight processing and fast response can be achieved.
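The contrast can be sketched in Python as follows (interfaces are
illustrative): a stateless transfer retransmits from the beginning
after an interruption, while a stateful transfer resumes from
preserved state.

   def stateless_transfer(data, send):
       # No protocol state kept: any follow-up processing requires
       # retransmitting everything, inflating traffic as the number
       # of connections grows.
       for chunk in data:
           send(chunk)

   def stateful_transfer(data, send, state):
       # State (the last completed offset) is preserved in memory,
       # so an interrupted transfer resumes where it stopped.
       for i in range(state["offset"], len(data)):
           send(data[i])
           state["offset"] = i + 1

   sent, state = [], {"offset": 0}
   stateful_transfer(["a", "b", "c"], sent.append, state)
   print(sent, state)  # ['a', 'b', 'c'] {'offset': 3}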
5.2. Load Balancing after VM Migration and Integration

In the migration of virtual machines between data centers, users may
be served according to the nearest-computation, "follow the sun"
principle, or according to multi-site load-balancing requirements. In
addition, for reducing energy consumption, cooling costs, and other
similar considerations, the virtual machines can be consolidated into
less dynamic data centers, which is the future trend of the so-called
"Green" data centers. The challenge related to this topic is how to
solve the problem of load balancing.

For example, before the migration of a VM, the load on the source VM
server and the network traffic distribution may be balanced locally,
and the load on the destination VM server and its network traffic
distribution may likewise be balanced locally. However, after the
migration of the VM from the source server to the destination server,
both the loading condition and the traffic distribution may remain
unbalanced, even for some extended time period.

Therefore, it may be useful to define and enforce a set of policies in
order to allocate VMs and other networking and computing resources
uniformly across data centers. Of course, the software, hardware, and
networking environments of the source and destination servers should
also be as similar as possible.

5.3. Security and Authentication of VMMI

During the VMMI / VM migration process, it is required to give proper
consideration to security-related matters; this includes solving
traffic roundabout issues, ensuring that the firewall functionalities
are appropriately enacted, and so on.

Therefore, in addition to authorization and authentication,
appropriate policies and measures to check/enforce the security level
must be in place while migrating VMs from one DC to another,
especially from a private DC to a public DC in the Cloud [NIST
800-145, Cloud/DataCenter SDO Survey]. For example, when a VM is
migrated to the destination DC network, the corresponding switch port
of the VM and its host server should utilize the port strategy of the
source switch. The end time of the VM migration and the issue time of
the strategy must be synchronized. If the former is earlier than the
latter, the services may not get a timely response; if the former is
later than the latter, the network may not have the exact level of
security for a period of time. What may be helpful in such an
environment is the creation and maintenance of a reasonable
interactive state machine.

5.4. Efficiency of Data Migration and Fault Processing

It may be useful to streamline data before commencing VM migration.
Incremental migration may help improve VM migration efficiency; for
example, one may plan to transfer only differentiated data during the
VM migration process between two DCs. However, this strategy may carry
the risk of propagating faults between DCs. In addition, if VM
migration occurs between heterogeneous database systems, such as
transferring data from an ORACLE database on a Linux system to an SQL
Server database on a Windows system, it is necessary to define the
security and policy for the case when a fault occurs. The processing
of VM migration may be slower when a database migration operation
fails, and there may be a need to roll back to the previous stable
states of all the databases involved in the VM migration. Similar
issues are being discussed in DMTF [DMTF VSMP] as well.
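As an illustration, the following Python sketch transfers only
differentiated data and rolls back to the last stable state on
failure; checkpointing is reduced to a dictionary copy, and the data
model is purely illustrative.

   # A sketch of incremental (differential) migration with rollback
   # on fault, as discussed above.
   def incremental_migrate(source, target):
       checkpoint = dict(target)      # last stable state of target
       try:
           # Transfer only differentiated data between the two DCs.
           delta = {k: v for k, v in source.items()
                    if target.get(k) != v}
           target.update(delta)
           return len(delta)
       except Exception:
           # On failure, roll back to the previous stable state.
           target.clear()
           target.update(checkpoint)
           raise

   src = {"row1": "v2", "row2": "v1"}
   dst = {"row1": "v1", "row2": "v1"}
   print(incremental_migrate(src, dst), dst)
   # 1 {'row1': 'v2', 'row2': 'v1'}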
5.5. Robustness Problems

5.5.1. Robustness of VM Migration

During normal operations, VMs may encounter a series of challenges,
e.g., CPU overload, memory and storage stress, disk space limitations,
excessive program response time, database write failures, file system
failures, etc. If any of the above issues cannot be resolved in a
timely fashion, it will lead to the collapse of the VM migration
process.

As a part of the recovery process, the VM management process should
take a snapshot of all data in the VM and copy it into a blank VM (VM
template) on the current or a distant server, with the objective of
preventing any service disruption. The snapshot can be stateful or
stateless, depending on (a) the status, nature, and function of the
owner to which the various data in the VM belong, and (b) the
replication strategy. For example, for the data in a database, a
stateful snapshot needs to be taken, because the database itself has
the ability to record its running state.

We note that incremental migration of VM state alone is not sufficient
to guarantee service continuity; an alternative solution may be
warranted. During the VM migration process, if the speed of writing is
faster than the data transfer rate (from the source VM location to the
destination VM location), the VM state transfer has to be paused to
allow time for bulk data transfer. During this adjustment period,
service downtime will occur. It is required to develop methods and
mechanisms to overcome such service discontinuity.

5.5.2. Robustness of VNE

During normal operations, VNEs may encounter a series of challenges,
e.g., CPU overload, memory stress, space limitations of the MAC table
and the forwarding table, lack of routing convergence, excessive
program response time, file system failures, etc. If any of the above
issues cannot be resolved in a timely fashion, it will lead to the
collapse of the VNE migration.

As a part of the recovery process, the VNE management process should
take a snapshot of all data in the VNE and copy it into an
idle/unassigned VNE on the current or a distant node, with the
objective of preventing any service disruption. The snapshot can be
stateful or stateless, depending on (a) the status, nature, and
function of the owner to which the various data in the VNE belong, and
(b) the replication strategy. For example, for a stateful snapshot of
a VNE, both the protocol state and the status of the forwarding table
need to be captured and transferred to the new (migrated) location of
the VNE.
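A simplified Python sketch of this snapshot-based recovery follows;
the dictionaries stand in for real VM/VNE data and state, and the
stateful/stateless distinction is reduced to whether running state is
captured alongside the data.

   # Copy a snapshot of a failing VM or VNE into a blank template on
   # another node before service collapses.
   import copy

   def take_snapshot(element, stateful):
       snap = {"data": copy.deepcopy(element["data"])}
       if stateful:
           # e.g., a database's running state, or a VNE's protocol
           # state and forwarding table, captured with the data.
           snap["state"] = copy.deepcopy(element.get("state", {}))
       return snap

   def restore_to_blank(snapshot, blank_template):
       blank_template.update(snapshot)
       return blank_template

   vne = {"data": {"fib": {"10.2.0.0/16": "port3"}},
          "state": {"ospf": "full"}}
   replica = restore_to_blank(take_snapshot(vne, stateful=True), {})
   print(replica["state"])  # {'ospf': 'full'}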
6. Acknowledgement

The following experts have provided valuable comments on the earlier
version of this draft: Thomas Narten, Christopher Liljenstolpe, Steven
Blake, Ashish Dalela, Melinda Shore, David Black, Joel M. Halpern,
Vishwas Manral, Lizhong Jin, Juergen Schoenwaelder, Donald Eastlake,
and Truman Boyes. We express our sincere thanks to them, and expect
that they will continue to provide suggestions in the future.

7. References

   [PBB-VPLS] Balus, F., et al., "Extensions to VPLS PE model for
   Provider Backbone Bridging", draft-ietf-l2vpn-pbb-vpls-pe-model-04
   (work in progress), October 2011.

   [VM-Mobility] Aggarwal, R., et al., "Data Center Mobility based on
   BGP/MPLS, IP Routing and NHRP", draft-raggarwa-data-center-
   mobility-01 (work in progress), September 2011.

   [DCN Ops Req] Dalela, A., "Datacenter Network and Operations
   Requirements", draft-dalela-dc-requirements-00 (work in progress),
   December 2011.

   [DMTF VSMP] DMTF, "Virtual System Migration Profile", DSP1081,
   Version 1.0.0c, May 2010.

   [VPN Applicability] Bitar, N., "Cloud Networking: Framework and VPN
   Applicability", draft-bitar-datacenter-vpn-applicability-01 (work
   in progress), October 2011.

   [VXLAN] Mahalingam, M., et al., "VXLAN: A Framework for Overlaying
   Virtualized Layer 2 Networks over Layer 3 Networks",
   draft-mahalingam-dutt-dcops-vxlan-01 (work in progress), February
   2012.

   [NIST 800-145] Mell, P. and T. Grance, "The NIST Definition of
   Cloud Computing", NIST Special Publication 800-145,
   http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf,
   September 2011.

   [Cloud/DataCenter SDO Survey] Khasnabish, B. and C. JunSheng,
   "Cloud/DataCenter SDO Activities Survey and Analysis",
   draft-khasnabish-cloud-sdo-survey-02 (work in progress), December
   2011.

   [NVGRE] Sridharan, M., et al., "NVGRE: Network Virtualization using
   Generic Routing Encapsulation", draft-sridharan-virtualization-
   nvgre-00 (work in progress), September 2011.

   [NVO3] Narten, T., "NVO3: Network Virtualization", l2vpn-9.pdf,
   November 2011.

   [Network State Migration] Gu, Y.,
   draft-gu-opsawg-policies-migration-01 (work in progress), October
   2011.

   [Matrix DCN] Sun, et al., "Matrix Fabric based Data Center
   Network", draft-sun-matrix-dcn-00 (work in progress), 2012.

8. Security Considerations

To be added later, on an as-needed basis.

9. IANA Consideration

The extensions that are discussed in this draft are related to the DC
operations environment.

10. Normative References

   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
   Networks (VPNs)", RFC 4364, February 2006.

Authors' Addresses

   Bhumip Khasnabish
   ZTE USA, Inc.
   55 Madison Avenue, Suite 160
   Morristown, NJ 07960
   USA
   Phone: +001-781-752-8003
   Email: vumip1@gmail.com, bhumip.khasnabish@zteusa.com

   Bin Liu
   ZTE Corporation
   15F, ZTE Plaza, No.19 East Huayuan Road, Haidian District
   Beijing 100191
   P.R.China
   Phone: +86-10-59932098
   Email: richard.bohan.liu@gmail.com, liu.bin21@zte.com.cn

   Baohua Lei
   China Telecom
   118, St. Xizhimennei, Office 709, Xicheng District
   Beijing
   P.R.China
   Phone: +86-10-58552124
   Email: leibh@ctbri.com.cn

   Feng Wang
   China Telecom
   118, St. Xizhimennei, Office 709, Xicheng District
   Beijing
   P.R.China
   Phone: +86-10-58552866
   Email: wangfeng@ctbri.com.cn