Network Working Group                                 Bhumip Khasnabish
Internet-Draft                                            ZTE USA, Inc.
Intended status: Informational                                  Bin Liu
Expires: April 4, 2013                                  ZTE Corporation
                                                              Baohua Lei
                                                               Feng Wang
                                                           China Telecom
                                                                Oct 2012


    Requirements for Mobility and Interconnection of Virtual Machine
                      and Virtual Network Elements
                 draft-khasnabish-vmmi-problems-02.txt

Abstract

   In this draft, we discuss the challenges and requirements related to
   migration, mobility, and interconnection of Virtual Machines (VMs)
   and Virtual Network Elements (VNEs).  A VM migration scheme that
   works across IP subnets is needed to implement sharing of virtual
   computing resources across multiple network administrative domains.
   For seamless online migration in various scenarios, many problems
   need to be resolved on the control plane, and the VM migration
   process should be adapted accordingly.  We also describe the
   limitations of the various virtual local area networking (VLAN) and
   virtual private networking (VPN) techniques that are traditionally
   expected to support such migration, mobility, and interconnection.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 4, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions used in this document
   2.  Terminology and Concepts
   3.  Network Related Problem Specification
     3.1.  The Mobility Problems of VM Migration Across IP subnets/WAN
       3.1.1.  IP Tunnel Problems
       3.1.2.  IP Allocation Strategy Problems
       3.1.3.  Routing Synchronization Strategy Problems
       3.1.4.  The migration protocol state machine of the VM online
               migration across subnets
       3.1.5.  Resource Gateway Problems
       3.1.6.  Optimized Location of Default Gateway
       3.1.7.  Other Problems
     3.2.  The Virtual Network Model
     3.3.  The Processing Flow
     3.4.  Problems of NVE/OBP location
       3.4.1.  NVE/OBP on the Server
       3.4.2.  NVE/OBP on the ToR
     3.5.  The Evolution Problems of The Logical Network Topology in
           VMMI Environments
     3.6.  Cloud Service Virtualization Requirements
       3.6.1.  Requirement of logical element
       3.6.2.  Requirements for Resource Allocation Gateway (RA GW)
               Function
       3.6.3.  Performance Requirements
       3.6.4.  Fault Tolerance Capability Requirements
       3.6.5.  Network Model
       3.6.6.  Types and Applications of VPNs Interconnection between
               DCs which provide Cloud Services
         3.6.6.1.  Types of VPNs
         3.6.6.2.  Applications of L2VPN in DCs
         3.6.6.3.  Applications of L3VPN in DCs
       3.6.7.  VN Requirements
       3.6.8.  Packet Encapsulation Problems
       3.6.9.  VM Migration Problem in mixed IPv4 and IPv6 Environment
         3.6.9.1.  Real-time Perception of Availability of Global
                   Network and Storage Resources
         3.6.9.2.  The real-time perception of global available network
                   resource and requested network resource for matching
                   with storage resources
         3.6.9.3.  The real-time perception of global requested network
                   resource for matching with storage resources
       3.6.10. Selection of Migration
         3.6.10.1. Requirements with Different Network Environments and
                   Protocol
         3.6.10.2. Requirements for Live Migration of Virtual Machines
       3.6.11. Access and Migration of VMs without users' Perception
         3.6.11.1. VM Migration Problems and Strategies in the WAN with
                   Traffic Roundabout as a Prerequisite
         3.6.11.2. VM Migration Problems and Strategies in the WAN
                   without Traffic Roundabout as a Target
       3.6.12. Review of VXLAN, NVGRE, and NVO3
       3.6.13. The East-West Traffic Problem
       3.6.14. Data Center Interconnection Fabric Related Problems
       3.6.15. MAC, IP, and ARP Explosion Problems
       3.6.16. Suppressing Flooding within VLAN
       3.6.17. Convergence and Multipath Support
       3.6.18. Routing Control - Multicast Processing
       3.6.19. Problems and Requirements related to DMTF
   4.  Control & Mobility Related Problem Specification
     4.1.  General Requirements and Problems of State Migration
       4.1.1.  Foundation of Migration Scheduling
       4.1.2.  Authentication for Migration
       4.1.3.  Consultation for Assessing Migratability
       4.1.4.  Standardization of Migration State
     4.2.  Mobility in Virtualized Environments
     4.3.  VM Mobility Requirements
       4.3.1.  Summarization of Mobility
       4.3.2.  Problem Statement
   5.  Network Management Related Problem Specification
     5.1.  Data Center Maintenance
     5.2.  Load Balancing after VM Migration and Integration
     5.3.  Security and Authentication of VMMI
     5.4.  Efficiency of Data Migration and Fault Processing
     5.5.  Robustness Problems
       5.5.1.  Robustness of VM Migration
       5.5.2.  Robustness of VNE
   6.  Acknowledgement
   7.  References
   8.  Security Considerations
   9.  IANA Consideration
   10. Normative References
   Authors' Addresses

1. Introduction

   There are many challenges related to VM migration and
   interconnection among two or more data centers (DCs).  The
   techniques used for VM migration and data center interconnection
   should support the required levels of performance, security, and
   scalability, along with simplicity and cost-effective management,
   operations, and maintenance.

   In this draft, the issues and requirements for moving virtual
   machines are summarized with reference to the necessary conditions
   for migration, business needs, state classification, security, and
   efficiency.  We then list the requirements for VM migration in the
   current mixed IPv4 and IPv6 environment.  On the choice of migration
   solution, we discuss the requirements for techniques that are useful
   on large-scale Layer-2 networks and on segmented IP networks/WANs.

   A VM migration scheme that works across IP subnets/WANs is therefore
   needed to implement sharing of virtual computing resources across
   multiple network administrative domains.  This will make a wider
   range of VM migration possible, and can allow migration of VMs to
   different types of DC.  It can be adapted to different types of
   physical networks, different network topologies, and various
   protocols.  Supporting seamless online migration in these scenarios
   requires a very intelligent VM online migration capability on the
   control plane.

   We summarize the requirements of virtual networks for VM migration,
   virtual networking, and operations in DCI modes.

   In the following sections of this draft, we first describe the
   general challenges at a high level, and then analyze the
   requirements for VM migration.  We then discuss the commonly-used
   solutions and their limitations, along with the desired features of
   a potential reference solution.  A more detailed solution survey
   will be presented in a companion draft.

1.1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Terminology and Concepts

   o ACL: Access Control List
   o ARP: Address Resolution Protocol

   o DC: Data Center

   o DCB/DCBR: Data Center Border Router

   o DC GW: Data Center Gateway

   o DCI: Data Center Interconnection

   o DCS: Data Center Switch

   o FDB: Forwarding DataBase

   o HPC: High-Performance Computing

   o IDC: Internet Data Center

   o IGMP: Internet Group Management Protocol

   o IOMMU: Input/Output Memory Management Unit

   o IP: Internet Protocol

   o IP VPN: Layer 3 VPN, defined in the L3VPN working group

   o ISATAP: Intra-Site Automatic Tunnel Addressing Protocol

   o LISP: Locator ID Separation Protocol

   o MatrixDCN: Matrix-based fabric for Data Center Network

   o NHRP: Next Hop Resolution Protocol

   o NVO3: Network Virtualization Overlays (Over Layer-3)

   o OBP: Overlay network Boundary Point

   o OTV: Overlay Transport Virtualization

   o PaaS: Platform as a Service

   o PIM: Protocol Independent Multicast

   o PBB: Provider Backbone Bridge

   o PM: Physical Machine

   o QoS: Quality of Service

   o RA GW: Resource Allocation GateWay

   o STP: Spanning Tree Protocol

   o TNI: Tenant Network Identifier

   o ToR: Top of the Rack

   o TRILL: Transparent Interconnection of Lots of Links

   o VLAN: Virtual Local Area Networking

   o VM: Virtual Machine

   o VMMI: Virtual Machine Mobility and Interconnection

   o VN: Virtual Network

   o VNI: Virtual Network Identifier

   o VNE: Virtual Network Entity (a virtualized layer-3/network entity
     with associated virtualized ports and virtualized processing
     capabilities)

   o VPN: Virtual Private Network

   o VPLS: Virtual Private LAN Service

   o VRRP: Virtual Router Redundancy Protocol

   o VSE: Virtual Switching Entity (a virtualized layer-2/switch entity
     with associated virtualized ports and virtualized processing
     capabilities)

   o VSw: Virtual Switch

   o WAN: Wide Area Network

3. Network Related Problem Specification

   In this section, we describe the background of VM and VNE migration
   between data centers.

   Why do VMs and VNEs need to be migrated?  First of all, in case of
   overload and during natural disasters, business-critical data center
   applications need to be migrated to other data centers as quickly as
   possible.  As a pre-condition of data center migration and/or
   integration, some of the applications can be migrated without
   interruption from one data center to another.  Considering address
   resources, cooling, and physical space in the primary data center,
   some of the virtual machines can be migrated to the backup data
   center(s) even under normal operating conditions.

   Secondly, through seamless management of VM migration, it may be
   possible to save operations, maintenance, and upgrade costs.  For
   example, older servers may be physically large while present-day
   servers are relatively compact; VM migration allows users to share a
   single newer server, replacing a set of older servers, and thus
   saves a substantial amount of physical rack space.  In addition, a
   virtual machine server presents unified "virtual hardware", unlike
   an older server that may contain a number of different hardware
   resources.  After migration, the server can be managed through a
   unified interface.
   We note that by using virtual machine software such as the high
   availability tools provided by VMware, when a server shuts down due
   to a failure, it is possible to automatically switch to another
   virtual server in the network without causing any disruption in
   operation.

   In short, migration of VMs under many desirable scenarios has the
   advantages of lowering operations costs, simplifying maintenance,
   improving system load balancing, enhancing system error tolerance,
   and optimizing system-wide power and space management.

   In general, a data center architecture consists of the following
   components:

   o Gateways (Data Center Gateway, Resource Allocation Gateway)

   o Core Router / Switch

   o Aggregation layer switch

   o Access layer ToR switch

   o Virtual switch

   o Interconnection network between DCs

   o Servers

   o Firewall system, etc.

   Overall, the requirement of VM migration brings the following
   challenges to the forefront of data center operations and
   management:

   (A) How to accommodate a large number of tenants in isolated
   networks in a data center.

   (B) From one DC to another within one administrative domain, (i) how
   to ensure that the necessary conditions of migration are satisfied,
   (ii) how to ensure that a successful migration occurs without
   service disruption, and (iii) how to ensure successful rollback when
   any unforeseen problem occurs in the migration process.

   (C) From one administrative domain to another, how to solve the
   problem of seamless communication between the domains.  There are
   several different solutions to the current Layer-2 (L2) based DC
   interconnect technology, and each can solve different problems in
   different scenarios.  In the L2 network, VXLAN
   [draft-mahalingam-dutt-dcops-vxlan-01] is used to resolve the VLAN
   number limitation problem, and NVGRE
   [draft-sridharan-virtualization-nvgre-00] attempts to solve similar
   problems but introduces interoperability problems between domains.
   If unification of packet encapsulation across the different
   solutions can be achieved, it is bound to promote seamless migration
   of VMs among DCs along with the desired integration of cloud
   computing and networking.

   (D) How to utilize IP-based technology to resolve migration of VMs
   over a Layer-3 (L3) network.  For example, VPN technology can be
   used to carry L2 and L3 traffic across the IP/MPLS core network.

   (E) How to resolve the problems related to mobility and portability
   of VMs among DCs is also an important aspect to consider.

   We discuss the above in more detail in the following sections.  A
   related draft [DCN Ops Req] discusses data center network and
   operations requirements.

3.1. The Mobility Problems of VM Migration Across IP subnets/WAN

   Why is there a need to implement VM migration across IP subnets/WAN?

   There are many existing implementable solutions for migrating a VM
   within a LAN.  These solutions include Xen, KVM, and VMware, which
   all implement VM image file sharing based on NFS, so that only CPU
   and memory state is migrated.  These are layer-2 VM migration
   techniques.  The advantage of this implementation is that the VM's
   IP addresses do not need to change after the migration.

   With the development and popularization of DC and virtualization
   technology, the number of servers and the network environment in a
   single
   LAN will limit the scalability of the virtual computing environment.
   In addition, when re-configuring VLANs in the traditional DC
   network, STP (MSTP) will lead to VLAN isolation.  This is a very
   serious problem in the DC network, especially in the storage
   network, because the services that storage networks support demand
   uninterrupted operation.

   In cloud computing and DC networks, this problem is fatal for VM
   migration, and the techniques and standards for existing large L2
   domains cannot completely solve it.  However huge a large L2 network
   is made, it remains restricted by the scope of the Ethernet
   broadcast domain and easily reaches this upper limit.  So even when
   the scope of the L2 network is very large, the maximum number of VMs
   it can accommodate may not be sufficient, which limits the scope of
   sharing of virtual computing resources.

   A VM migration scheme across IP subnets is therefore needed to
   implement virtual computing resource sharing across multiple network
   administrative domains.  This will make a wider range of VM
   migration possible, and can allow for migration of VMs to different
   types of DC.  It can be adapted to different types of physical
   networks, different network topologies, and various protocols.  For
   example, in the process of VM migration in an IDC, there are
   scenarios in which a VM in a traditional three-tier topology network
   is migrated through the WAN to a Fat-Tree topology network, or to a
   variety of other topology networks.  For seamless online migration
   in these scenarios, a very intelligent VM online migration
   capability needs to be implemented on the control plane.

   If VM migration is implemented only in the L2 domain, the concern
   for the network is the expansion of the number of VLANs or isolated
   domains, such as the 16,000,000 isolated domains in PBB.  Limitless,
   seamless, arbitrary online VM migration across IP subnets means that
   the following issues need to be addressed in order to achieve our
   goal: to create a true virtual network environment that is separated
   from the physical network.

   Migration across IP subnets: VM migration in the overlay network
   needs to adapt to heterogeneous network topologies.  How does the
   source network environment adapt its configuration to the target
   environment?  Network redirection technology, IP-in-IP technology,
   and dynamic IP tunnel configuration will be used to allow online VM
   migration across subnets.

   An IP allocation management module for VMs is needed, which manages
   each IP allocation in the virtual network.  The IP allocations
   should not conflict with each other, and should keep the path cost
   of routing/forwarding as small as possible.  It is necessary to know
   the DC network topology, its routing protocols, and the real-time
   results of path-cost computation to realize minimum path cost.  We
   know that the network topology of different DCs is not necessarily
   the same; for example, the network topologies and routing protocols
   of a traditional DC and a Fat-Tree network DC are different.  The
   addition of related protocol processing on the control plane is
   needed for seamless VM migration between them.  Otherwise, online VM
   migration cannot be implemented across DCs or across IP subnets.
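   As a concrete illustration of the dynamic IP tunnel configuration
   mentioned above, the following sketch (Python, standard library
   only) shows how an IP-in-IP outer header (IP protocol 4, RFC 2003)
   could wrap a VM's packet so that the inner addresses remain
   unchanged while the outer addresses track the VM's current subnet.
   The function names and tunnel endpoints are illustrative
   assumptions, not mechanisms specified by this draft.

      import socket
      import struct

      def ipv4_checksum(header: bytes) -> int:
          # Standard 16-bit one's-complement checksum over the header.
          if len(header) % 2:
              header += b"\x00"
          total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
          while total >> 16:
              total = (total & 0xFFFF) + (total >> 16)
          return ~total & 0xFFFF

      def ipip_encapsulate(inner: bytes, outer_src: str,
                           outer_dst: str) -> bytes:
          # Wrap a complete IPv4 packet in an outer IPv4 header with
          # protocol number 4 (IP-in-IP).  outer_src/outer_dst are the
          # hypothetical tunnel endpoints (e.g., the OBP/NVE nodes in
          # the source and destination subnets); the inner packet
          # keeps the VM's original, unchanged addresses.
          header = struct.pack(
              "!BBHHHBBH4s4s",
              0x45, 0,                  # version 4, IHL 5; DSCP/ECN
              20 + len(inner),          # total length of outer packet
              0, 0,                     # identification, flags/offset
              64, 4, 0,                 # TTL, protocol 4, checksum 0
              socket.inet_aton(outer_src),
              socket.inet_aton(outer_dst))
          csum = ipv4_checksum(header)
          return header[:10] + struct.pack("!H", csum) \
              + header[12:] + inner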
   The scheme of IP-in-IP tunneling resolves the contradiction between
   IP addresses remaining unchanged during VM migration and IP
   addresses changing when VMs migrate across IP subnets.  Therefore,
   the VM's mobility problem can be resolved only after the above
   problems are solved.  Service providers can implement it by
   upgrading their software to support new protocols; the hardware
   devices need not be upgraded.  These problems are described below.

3.1.1. IP Tunnel Problems

   During VM migration, it is required to establish the IP-in-IP
   tunnel.  The purpose is to make users/applications have no
   perception of the migration process; the IP addresses they see at
   the relevant level remain the same.  As noted above, IP-in-IP
   tunneling reconciles unchanged IP addresses during VM migration with
   the changed IP addresses of VMs that migrate across IP subnets.

   The OBP is involved in setting up IP tunnels.  According to the NVO3
   control plane protocol work, there are two positions for the OBP
   (NVE/VTEP) in the IDC: on the server and on the ToR.  Placing the
   OBP on the server minimizes its correlation with network elements in
   the specific network topology; placing the OBP on the ToR faces more
   problems.  The NVE is therefore preferably placed on the server
   (unless there are other, stronger reasons).  It creates a virtual
   network for VM communications, and the traffic between VMs is not
   directly exposed on the wire and switches.  An OBP on the server
   reduces the coupling with the DC topology to a certain extent, but
   the two cannot be made completely unrelated.

   The disadvantage of network connection solutions for online VM
   migration across different subnets is that the network configuration
   of the VM needs to be changed after the migration, and the migration
   process is not transparent.  Transparent VM migration therefore
   needs to be implemented, and network connection redirection
   technology needs to be considered.  Since users cannot otherwise
   maintain use of the VM when the network access point changes during
   online migration across subnets, a network connection redirection
   scheme based on Proxy Mobile IP (PMIP) can be used.  A VM migrated
   to an external subnet is regarded as a mobile node and does not
   change its IP address.  All data to/from the VM is transmitted
   through the bi-directional tunnel between the external network and
   the home network, in order to implement online transparent migration
   across subnets, preferably at switching speed.  The source VM and
   the destination VM need to be activated simultaneously and must be
   dynamically configured with the IP tunnel.

   In order to make the VM migration process completely transparent
   (including transparent to the VMs' applications and to outside
   users), the migration environment of the VMs should be regarded as a
   mobile network environment, and the migrated VM should be regarded
   as a mobile node.  After the VM is migrated to the external network,
   its network configuration does not need any changes.  Full advantage
   should be taken of the host's mobile agent function to communicate
   with the external network.

3.1.2. IP Allocation Strategy Problems

   In the packet encapsulation described above, the IP address of the
   VM is a critical entity.  In small networks, its allocation is
   achieved with DHCP.  With the expansion of the network scale, IP
   address conflicts become more likely to occur.
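   To make the non-conflict rules of this section concrete, here is a
   minimal allocation-check sketch using Python's standard ipaddress
   module.  It merely illustrates the checks described here; a real
   allocator would also weigh routing path cost and topology-encoded
   address ranges, as discussed later, and all names are illustrative
   assumptions.

      import ipaddress

      def pick_vm_address(destination_segment, vm_cluster_segments,
                          physical_host_segments, addresses_in_use):
          # Segments are prefix strings such as "10.1.0.0/24";
          # addresses_in_use is a set of address strings already
          # assigned in the destination segment.
          forbidden = [ipaddress.ip_network(p) for p in
                       list(vm_cluster_segments) +
                       list(physical_host_segments)]
          used = {ipaddress.ip_address(a) for a in addresses_in_use}
          for addr in ipaddress.ip_network(destination_segment).hosts():
              if addr in used:                  # already assigned
                  continue
              if any(addr in net for net in forbidden):
                  continue                      # would collide with a
                                                # VM cluster or host net
              return addr
          return None    # pool exhausted: the "insufficient IP" case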
   When a VM is migrated to another network, its IP address may
   conflict with the IP address of a VM or physical host in that
   network.  For example, duplicate IP addresses can allow otherwise
   isolated VM networks to communicate with each other, which causes
   confusion and migration failure.  Therefore, the IP allocation
   management module for VMs is needed, which manages each IP
   allocation in the virtual network.  The IP allocations should not
   conflict with each other, and should keep the path cost of
   routing/forwarding as small as possible.

   When allocating an IP address, it should not conflict with the
   currently assigned IP network segment of the VM cluster.  In
   addition, it cannot conflict with the IP network segment where the
   physical hosts are located, and it cannot conflict with the
   destination IP network segment after the migration.  So
   synchronization of IP address allocation information needs to be
   done.  Of course, synchronization across the whole network is not
   necessary as long as there are ways to ensure that no conflict
   exists.  Moreover, the allocation method needs to introduce as
   little network overhead as possible, and must consider
   insufficient-IP issues in the destination network segment.

   When allocating IP addresses to hosts based on the DHCP protocol,
   the IP addresses in the address pool are allocated from small to
   large.  An insufficient number of addresses in the pool may lead to
   assigned VM IP conflicts, which hinder VM migration.  Allocation
   from small to large also means the assigned VM IP affects the
   routing protocol's ability to choose the better path.  Especially
   for specific architectures like Fat-Tree, the specific network
   topology and the protocol architecture of the specific routing
   strategy (such as OSPF) should be utilized.  The VM migration
   process must be adapted to these aspects, and a purely Layer-2
   migration approach cannot simply be copied.  So VM migration is
   inherently related to network topology and network routing
   protocols.

   In the Fat-Tree topology, the IP addressing and IP allocation
   methods of the network's servers and switches are related to the
   routing protocols.  Two routing methods can be chosen: the OSPF
   protocol (the OSPF domain cannot be too large), or fixed routing
   configuration.  As the destination VM needs to be assigned an IP
   address, the routing protocols used in the destination DC need to be
   known in order to prevent IP conflicts: for example, whether the
   OSPF routing protocol is used (in which case a newly added network
   node is assigned an IP address by using DHCP), or fixed-
   configuration IP routing is used in the Fat-Tree topology.  In the
   former case, the number and distribution of reserved IP addresses in
   the address pool are different from the latter.  Therefore a scheme
   is required to learn the adopted topology and address allocation
   strategy, the IP usage for each segment, the remaining number of IP
   addresses, and so on.  This information cannot be acquired purely by
   the existing DHCP protocol.

   Different routing strategies require different routing management
   mechanisms for VM migration across DCs, for the following reasons:
   (a) it involves the uniqueness problem of IP address assignment and
   IP tunnel establishment, and (b) it involves global unified
   management issues.  These problems will be discussed later.

   In the addressing method of the fixed routing protocol, the IP
   address assigned to a device located within the DC actually contains
   location information.
   The type of the corresponding device can easily be determined from
   its IP address, and the location of the device in the topology can
   also be judged visually.  So these addresses must be avoided in the
   automated allocation of IP addresses to VMs.  The function of the
   DHCP protocol needs to be greatly enhanced, or the protocol and
   tooling for IP address allocation need to be re-designed.

   Moreover, consultation is needed before migration; it may be
   required to migrate the VMs only after the migration process has
   been confirmed and cleared.  Reservation of the future source and
   destination IP allocations should be considered.  The reserved IP
   addresses depend on the network topology and IP address allocation
   strategy at the destination, as well as the network topology and IP
   allocation strategy in use before the migration.  For the control
   plane protocol of the NVO3 network, IP addresses should be allocated
   reasonably according to the network topologies and routing protocols
   adopted in the source and destination DCs, in order to achieve
   seamless VM migration and a path that is as close to optimal as
   possible.

   In addition, the above problems and requirements also apply to the
   PortLand network topology, which is similar to the Fat-Tree network
   topology.  Future server-centric network topologies, such as the
   DCell/BCube network topologies, also need to achieve compatibility
   on the control plane.

3.1.3. Routing Synchronization Strategy Problems

   In order to ensure normal data forwarding after VM migration,
   routing synchronization between the source network and destination
   network is needed.

3.1.4. The migration protocol state machine of the VM online migration
       across subnets

   Regarding the routing strategy discussed earlier, compared to
   migration within the same IP subnet, the changes include the IP
   allocation strategy and the routing synchronization strategy.  So
   the state and handling of routing updates must be included in the
   state machine of VM migration across subnets, at the preparation
   phase before the VM migration.  If crossing subnets is allowed,
   network redirection technology should be used.  IP-in-IP technology
   has the advantage of good compatibility with network equipment,
   requiring only software upgrades.

3.1.5. Resource Gateway Problems

   A resource gateway is needed to record the IP address resources that
   have been used, and the IP network segments to which the used IP
   addresses belong.

3.1.6. Optimized Location of Default Gateway

   The VM's default gateway should be in close topological proximity to
   the ToR that connects the server presently hosting that VM.

3.1.7. Other Problems

   Migration across domains imposes new requirements on network
   protocols; for example, the ARP response packet mechanism is no
   longer applicable in the WAN.  In addition, some packets will be
   lost during the migration, which is not acceptable for parallel
   computing.  There are also problems such as computing resource
   sharing across multiple administrative domains, etc.
3.2. The Virtual Network Model

   Based on the above problems, two requirements are added to the
   virtual network model.  First, routing information is adjusted
   automatically according to the physical location of the VM after the
   VM is migrated to a new subnet.  Second, a logical entity, namely
   the "virtual network communications agent", is added, which is
   responsible for data routing, storage, and forwarding in cross-
   subnet communications.  The agent can be dynamically created and
   revoked, and a data communications agent can be running on each
   server.

   The overlay layer consists of VMs and the communications agents.
   Each of the virtual networks on top, such as VN1 and VN2, is
   composed of the VMs and the communications agents as needed.  Since
   membership is as required, VMs and agents may come from different
   networks, and the connections are established through dedicated
   tunnels between the communications agents.

3.3. The Processing Flow

   During the process, VM migration messages will trigger topology
   updates of the VM clusters in the source virtual network and the
   destination virtual network.  It is therefore required that the two
   ends acquire each other's network topology, routing protocols, and
   IP address assignment rules, so the VM can be assigned a unique IP
   address.  The routing information of the communications agents is
   updated.  The communications agent captures the corresponding VM
   packets, encapsulates them into the data section of tunnel packets,
   and adds the necessary control information (such as self-defined
   forwarding rules).  After encapsulation, these packets are
   transferred to the destination network through the tunnels between
   the communications agents.  The communications agent in the
   destination network de-capsulates the packets, processes the
   information, and then delivers the packets into the destination
   network.  The data transfer process across subnets is now complete.

   According to the above processing flow, the modules that need to be
   modified can be divided by function as follows: routing management,
   MAC capture, tunnel packet encapsulation, tunnel forwarding, tunnel
   packet de-capsulation, and forwarding in the destination network.

3.4. Problems of NVE/OBP location

   VMs communicate with each other through the interconnecting network,
   either within the same domain or between different domains.
   Depending on the NVE/OBP position, the processing on the control
   plane is different.

3.4.1. NVE/OBP on the Server

   Assume that a set of VMs and the network that interconnects them are
   allowed to communicate with each other; the MAC source and
   destination addresses in the Ethernet headers of the packets
   exchanged among these VMs are preserved.  This is L2-based VM
   communication within a LAN.  Any VM should have its own IP address.
   If a VM belongs to more than one domain, the VM will have multiple
   IP addresses and multiple logical interfaces, similar to the model
   of L3 switches.

   Different VM clusters are distinguished by the VLAN mechanism within
   the same L2 physical domain.  Apart from these cases, the VLAN-ID
   does not work.  For example, in the case of VM communication across
   IP subnets, the packets are encapsulated at the NVE, delivered
   directly to the peer NVE, and then transferred to the destination
   VM.
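   The per-VN encapsulation just described can be pictured as a lookup
   that maps (VN identifier, inner VM MAC address) to the IP address of
   the peer NVE currently hosting that VM.  The sketch below is a
   hypothetical illustration of such a table and of how a migration
   event re-points an entry; the class and method names are assumptions
   for illustration only, not part of any NVO3 specification.

      from typing import Dict, Optional, Tuple

      class NveForwardingTable:
          def __init__(self) -> None:
              # (vni, vm_mac) -> IP address of the peer NVE.
              self._entries: Dict[Tuple[int, str], str] = {}

          def learn(self, vni: int, vm_mac: str, nve_ip: str) -> None:
              # Install or move a mapping, e.g. after a VM migrates.
              self._entries[(vni, vm_mac)] = nve_ip

          def lookup(self, vni: int, vm_mac: str) -> Optional[str]:
              # None is where a real NVE would fall back to the
              # control plane or to flooding within the VN.
              return self._entries.get((vni, vm_mac))

      table = NveForwardingTable()
      table.learn(100, "52:54:00:aa:bb:cc", "192.0.2.10")    # before
      table.learn(100, "52:54:00:aa:bb:cc", "198.51.100.7")  # after
      assert table.lookup(100, "52:54:00:aa:bb:cc") == "198.51.100.7"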
   Once migration across the L3 network occurs, some scenarios will
   cause the MAC source address to be modified.  A VM may belong to a
   different VM cluster network (similarly distinguished by VNI).  As
   it is very transparent to the network topology and to L2/L3
   protocols, NVE/OBP on the server should be the default configuration
   mode.

   Scenario classification:

   (1) The NVEs on two servers are members of the same VN (NVEs in the
   same L2 physical domain).  This is the simplest case.

   (2) The NVEs on two servers are members of different VNs, and there
   is interconnection between the two VNs.  The NVEs on the servers are
   required to be added to the L3 domain.

   (3) The NVEs on two servers are members of different VNs, and there
   is no interconnection between the two VNs.  The routing mechanism
   needs to be considered.

   (4) The NVEs on two servers are members of the same VN, and there is
   no interconnection between the two domains.  The discovery mechanism
   between the two domains needs to be considered.

3.4.2. NVE/OBP on the ToR

   For an NVE on the ToR, the NVE needs to deal with the VIDs of
   various packets.  Once a VM is migrated, the rules of the source
   network need to be migrated as well, causing physical network
   configuration changes.  Therefore, it is required to develop a
   series of rules to deal with this.  In this case, the VID used by
   the VM has global significance, and the various rules and the usage
   range of the VID are required to be provided.  The VLAN-ID used by a
   given VM refers to the VLAN-ID carried by the traffic that is
   originated by that VM within the same L2 physical domain as the VM.

   Scenario classification:

   (1) The NVEs on two servers are members of the same VN (NVEs in the
   same L2 physical domain).  This is the simplest case.

   (2) The NVEs on two servers are members of different VNs, and there
   is interconnection between the two VNs.  The NVEs on the servers are
   required to be added to the L3 domain.

   (3) The NVEs on two servers are members of different VNs, and there
   is no interconnection between the two VNs.  The routing mechanism
   needs to be considered.  This will result in changed MAC source and
   destination addresses in the Ethernet headers of the packets being
   exchanged.

   (4) The NVEs on two servers are members of the same VN, and there is
   no interconnection between the two domains.  The discovery mechanism
   between the two domains needs to be considered.

   As NVEs may belong to different domains, if an NVE communicates with
   another NVE in the same domain, the VLAN-IDs of the packets
   exchanged should be the same; to simplify processing, the VLAN-IDs
   are allowed to be removed.  But when an NVE communicates with
   another NVE in a different domain, the VLAN-IDs of the packets
   exchanged may be different.

3.5. The Evolution Problems of The Logical Network Topology in VMMI
     Environments

   The question is whether there is any relation between VM migration
   and the topology of the network within a data center.  In simple
   implementations, seamless VM migration should be realized over the
   Layer-2 network.  Since a large number of VMs and their applications
   are running in the same Layer-2 domain, VM migration may be very
   stressful from the bandwidth utilization viewpoint of the data
   center switching network.  In order to improve bandwidth
   utilization, it is required to upgrade the load balancing capability
   of the network, which has numerous equal-cost multipaths (ECMP)
   between different points.
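   For readers unfamiliar with how ECMP spreads load, the sketch below
   shows the usual flow-hashing idea: hash the flow's 5-tuple and take
   the result modulo the number of equal-cost paths, so that all
   packets of one flow stay on one path (avoiding reordering) while
   distinct flows spread across the fabric.  This is a generic
   illustration, not a mechanism defined by this draft; real switches
   implement such hashes in hardware.

      import hashlib

      def ecmp_path_index(src_ip: str, dst_ip: str, proto: int,
                          src_port: int, dst_port: int,
                          num_paths: int) -> int:
          # MD5 here is only for illustration; hardware uses simpler,
          # faster hash functions.
          key = "%s|%s|%d|%d|%d" % (src_ip, dst_ip, proto,
                                    src_port, dst_port)
          digest = hashlib.md5(key.encode()).digest()
          return int.from_bytes(digest[:4], "big") % num_paths

      # Two flows between the same hosts may take different paths:
      print(ecmp_path_index("10.0.0.1", "10.0.1.1", 6, 33000, 80, 4))
      print(ecmp_path_index("10.0.0.1", "10.0.1.1", 6, 33001, 80, 4))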
   Although multi-root trees (such as Fat-Tree, MatrixDCN, and other
   network topologies) and their protocols support ECMP, we can achieve
   this by configuring appropriate routing, or through TRILL or SPB.
   However, implementing TRILL or SPB requires eliminating or upgrading
   the existing equipment.  If we can encode positions in the topology
   into IP or MAC addresses while using Fat-Tree or MatrixDCN network
   topologies, we can realize seamless and transparent VM migration
   within the data center, on the premise that the large layer-2
   network is composed of existing low-end switching equipment.

   Note that although the Ethernet and IP protocols are meant to
   support arbitrary topologies, these Layer-2 and Layer-3 network
   protocols are not flexible enough for use in data center
   environments.  The lack of flexibility may result in lack of
   scalability, management difficulties, inflexible communications, and
   poor fault tolerance.  These ultimately result in a lack of support
   for flexible VM migration in increasingly large and complex Layer-2
   networks.  However, if we can solve these problems, we will be able
   to achieve flexible migration of VMs in scalable, fault-tolerant
   layer-2 data center networks.

   Some solutions are moving forward in the direction of solving the
   problems above, and there have been several new topological models
   and routing architectures.  These include the Fat-Tree fabric and
   the MatrixDCN fabric.  MatrixDCN is a new style of network fabric
   for data center networks.  These fabrics can support
   super-large-scale networks, including more than 100,000 servers,
   without performance degradation.  Furthermore, through ECMP
   technology, MatrixDCN can eliminate the bandwidth bottleneck
   problems of canonical tree-structured data center networks.  The
   MatrixDCN fabric is described in [Matrix DCN, I-D.sun-matrix-dcn].

3.6. Cloud Service Virtualization Requirements

   The following sub-sections present the requirements of logical and
   physical elements for Cloud/DC service virtualization and their
   operations.

3.6.1. Requirement of logical element

   o Resource Allocation Gateway (RA GW)

   Network service providers provide virtualized basic network
   resources for tenants between data centers; within the data center,
   the facilities include virtualized computing and virtualized storage
   resources.  The RA gateway's role is to provide access to the
   virtualized resources.  These resources are divided into the
   following three categories: networking resources, computing
   resources, and storage resources.  The RA gateway compares the
   demanded networking, computing, and storage resources with the
   available resources, finds the corresponding relations, and achieves
   globally reasonable matching in resource scheduling.  The DC GW's
   function, described below, is a subset of the RA GW functions.

   o Data Center Gateway (DC GW)

   The DC gateway provides access to the data center for different
   outside users, including Internet access and VPN connection users.
   In the existing DC network model, the DC GW may be a router with
   virtual routing capabilities, or may be a PE device of an
   IPVPN/L2VPN connection.  Core nodes that perform the roles of DC GWs
   may also provide Internet connectivity, inter-DC connectivity, and
   VPN support.

   o Core Router / Switch

   These are high-end core nodes/switches with routing capabilities
   located in the core layer, connecting aggregation layer switches.
   o Aggregation Layer Switch

   This switch aggregates traffic from the ToR switches and forwards
   the downstream traffic.  The switch can be a normal aggregation
   switch, or multiple switches virtualized into a single stack switch.

   o Access Layer ToR Switch

   Access layer ToR switches are usually dual-homed to the parent node
   switch.

   o Virtual Switch

   This is a virtual software switch which runs on a server.

   The requirements related to the above demand that the L2/L3 tunnel
   be terminated at one of the entities mentioned above.

3.6.2. Requirements for Resource Allocation Gateway (RA GW) Function

   The emerging DC and network providers offer virtualized computing,
   storage, and networking resources and related services.  Tenants are
   identified despite overlapping addresses, and share a pool of
   storage and networking resources.  Therefore, a virtual platform is
   needed, with control and management capabilities for virtual
   machines, virtual services, virtual storage, and virtual networks.
   What tenants see is a subset of the above four entities.  The
   virtualized platform is built on the framework of the physical
   network, physical servers, physical switches and routers, and
   physical storage devices.  Through the virtual platform, the tenants
   are offered globally scheduled resources for sharing throughout the
   entire system.

   The RA GW collects information related to system-wide availability
   of computing, storage, and networking resources.  The RA GW then
   allocates appropriate quantities of computing, storage, and
   networking resources to the tenants according to certain policies
   and the demands for resources.  Note that in order to prevent any
   single point of failure, the RA GW needs to have backup support.
   The global resource availability information and scheduling
   information (between the resource allocation gateway and the backup
   resource allocation gateway) also needs real-time backup.

   It is possible to provide automatic matching and scheduling of the
   virtualized resources, dynamically adjusted according to operating
   conditions.  This can optimize utilization of computing resources,
   networking resources such as IDC interconnection resources and IDC
   internal routing and switching resources, and storage resources.  It
   should consider optimization of network path routing for matching
   with network resources.  Routing selection can be based on the
   degree of matching between the required bandwidth and the bandwidth
   that can be provided, the shortest path, service level, and user and
   usage level.  These factors need to be considered in the
   decision-making process.

3.6.3. Performance Requirements

   Any preferred solution should be able to easily support a large
   number of tenants sharing the data center resources.  It is also
   required to support a large (more than 4K) number of VLANs.  For
   example, there are a number of VPN applications -- VPLS or IP VPN --
   which serve more than 10K tenants, each requiring multiple VLANs.
   In this scenario the availability of 4K VLANs is not sufficient for
   the tenants.

   The solution should guarantee high quality of service, and must
   ensure that a large number of network connections are not
   interrupted even during overloads or minor failure conditions.  The
   connectivity should meet carrier-class reliability and availability
   requirements.
3.6.4. Fault Tolerance Capability Requirements

   In the event of any fault or error, it is required to quickly
   recover from the error condition.  Error recovery includes network
   fault recovery, computing power recovery, VM migration recovery, and
   storage recovery.  Among these, network fault recovery and computing
   power recovery are the fundamental requirements for VM migration
   recovery and storage recovery.

   Network fault recovery: Once an error or fault condition is
   identified in virtual network connectivity, alarms should be
   triggered, and recovery using a backup virtual network should be
   automatically activated.

   Computing capability recovery: Once computing capability fails, an
   efficient detection mechanism is needed to find the problem, so that
   services can be scheduled onto backup virtual machines.

   VM migration recovery: In the event of VM migration failure, it is
   required to automatically restore the original state of the virtual
   machines so that users' services are not adversely impacted.

   Storage recovery: In the event of storage failures, it is required
   to automatically find a backup virtual storage resource so that it
   can be enabled or activated immediately.  The response and recovery
   times should be very short in order to minimize service delay and
   disruptions.

   After a VM migration, it is required to consider the impact on the
   switching network, such as whether the new network environment will
   have a problem of insufficient bandwidth.  Although an initial
   judgment is made at the consultation phase before the migration, it
   cannot guarantee that no problems will occur after the migration.
   In addition, if the destination DC needs to activate standby servers
   and additional network resources, it may be worthwhile to consider
   allocating and activating additional server and network resources.
   And, in some cases, some routing policies -- on network segments and
   server clusters -- may need to be adjusted as well after migration.

3.6.5. Network Model

   Traditionally, the DCs have their own private networks for
   interconnection among themselves.  Alternatively, the data centers
   can use an independent WAN service provider's interconnection
   facilities for primary and/or secondary connections.

3.6.6. Types and Applications of VPNs Interconnection between DCs
       which provide Cloud Services

3.6.6.1. Types of VPNs

   Layer-3 VPN:

   o BGP/MPLS IP Virtual Private Networks (VPNs), RFC 4364

   Layer-2 VPN:

   o PBB + L2VPN

   o TRILL + L2VPN

   o VLAN + L2VPN

   o NVGRE [draft-sridharan-virtualization-nvgre-00]

   o PBB VPLS

   o E-VPN

   o PBB-EVPN

   o VPLS

   o VPWS

3.6.6.2. Applications of L2VPN in DCs

   It is a very common practice to use L2 interconnection technologies
   for DC interconnection across geographical regions.  Note that VPN
   technology is also used to carry L2 and L3 traffic across the
   IP/MPLS core network.  This technology can be used within the same
   DC to support scalability, or for interconnection across L3 domains.
   VPLS is commonly used for IP/MPLS connection over the WAN, and it
   supports transparent LAN services.  IP VPN, including BGP/MPLS IP
   VPN and IPsec VPN, has been used in a common IP/MPLS core network to
   provide virtual IP routing instances.
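   The scale comparisons made in the remainder of this section (4K
   VLANs versus more than 16M PBB or VXLAN instances) fall directly out
   of the identifier field widths; the following quick check is
   provided only as a worked illustration of those numbers.

      # 802.1Q VLAN ID: 12 bits, with two reserved values.
      usable_vlans = 2**12 - 2
      # PBB I-SID and VXLAN VNI: 24-bit identifiers.
      instance_space = 2**24

      print(usable_vlans)     # 4094     -> the "4K VLANs" limit
      print(instance_space)   # 16777216 -> the "more than 16M" figure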
   The implementation of PBB plus L2VPN can take advantage of some of
   the existing technologies.  It is flexible to use a VPN network in
   the cloud computing environment, and it can support a sufficient
   number of VPN connections/sessions (networking resources), much
   larger than the 4K-VLAN mode of L2VPN.  Therefore, it can achieve an
   effect similar to that of VXLAN.  Note that PBB can not only support
   access to more than 16M virtual LAN instances, it can also separate
   customers and provide different domains through isolated MAC address
   spaces.

   The use of PBB encapsulation has one major advantage.  Since the
   VMs' MAC addresses are not processed by the ToRs and core switches,
   the MAC table sizes of the ToRs and core switches may be reduced by
   two orders of magnitude; the specific number is related to the
   number of virtual machines on each server and the VMs' virtual
   interfaces.

   One way to solve problems in the DC is to deploy other technologies
   in the existing DC network.  A service provider can separate its
   VLAN domains into different VLAN islands; in this way each island
   can support up to 4K VLANs.  The VLAN domains can be interconnected
   via VPLS, with the DC GWs used as VPLS PEs.  If the existing
   VLAN-based solutions are retained only in the VSw, while the number
   of tenants in some VLAN islands is more than 4K, the service
   provider needs to deploy VPLS deeper in the DC network.  This is
   equivalent to supporting L2VPN from the ToRs, and using the existing
   VPLS solutions to enable MPLS on the ToR and core DC elements.

3.6.6.3. Applications of L3VPN in DCs

   IP VPN technology can also be used for data center network
   virtualization.  For example, multi-tenant L3 virtualization can be
   achieved by assigning a different IP VPN instance to each tenant who
   needs L3 virtualization in a DC network.  There are many advantages
   of using IP VPN as a Layer-3 virtualization solution within the DC
   compared to using existing virtual routing DC technology.  Some of
   the advantages are as mentioned below:

   (1) It supports many VRF-to-VRF tunneling options covering different
   operational models: BGP/MPLS IP VPN, IP or GRE tunneling for L3 VPN,
   etc.

   (2) The IP VPN instances used for cloud services below the WAN can
   connect directly to the IP VPNs involved in the WAN.

3.6.7. VN Requirements

   The Virtual Networks (VNs) consist of the virtual IDC network and
   the virtual DC-internal switching network.  These VNs are built on
   the basis of the physical networks.  VM migration is not affected by
   the physical network; as long as it stays within the scope of the
   VN, a VM is free to migrate if it satisfies the necessary
   conditions.  In addition, the network architecture and
   forwarding/switching capacity should match between the source
   network and destination network, without causing any concern for the
   physical network.

   The physical characteristics of the network, such as VLANs, IP
   subnets, L2 protocol entities, QoS-supporting entities, etc., are
   abstracted as logical elements of the VN.  Because the VMs operate
   in the VN environment, each VM has its associated logical elements,
   such as CPU processes, I/O, memory, disk, etc., and the VN also has
   a corresponding set of logical elements.  In general, the VNs are
   isolated from each other.  The VMs within each VN communicate using
   their own internal addresses, and send and receive Ethernet packets.
   VNs are not tied to a specific implementation; the implementation
   can use the Internet, L2VPN, L3VPN, GRE, etc.  At the VN layer, IP
   can be used to make that distinction.  Traffic traverses a firewall
   into the VN, and ACLs and other security policies are also needed in
   the access layer.

3.6.8. Packet Encapsulation Problems

   In order to implement a virtual network (VN), a method similar to an
   overlay address is required.  The overlay address can be realized by
   VXLAN or by the I-SID of PBB+L2VPN.  The overlay address works as an
   identifier corresponding to each instance of a VN.  The
   implementation model requires that an edge switch or router act as
   the DC GW for the encapsulation and de-encapsulation of the tunnel
   packets.  The various VNs within the DC rely on the overlay address
   in order to distinguish and separate one from the other.  Each VN
   also contains 4K VLANs for its internal use.  The data packets
   travel to the DC interconnection network through the DC GW, and are
   encapsulated for subsequent transmission.

   The main issue related to the above is the support of encapsulation.
   In the L2 network, VXLAN supports the VLAN expansion requirements;
   in NVGRE, a similar problem is resolved in a different way.
   Therefore, in order to achieve seamless migration of VMs across DCs
   that support different VLAN expansion mechanisms, unification of the
   packet encapsulation methods is required.

3.6.9. VM Migration Problem in mixed IPv4 and IPv6 Environment

   With the proliferation of IPv6 technology, the existing IPv4
   networks will have IPv6 hosts attached to them.  This is driving the
   development of a series of tunnel technologies, e.g., 6to4 tunnel
   technology, ISATAP tunnel technology, and so on.  ISATAP is a
   point-to-point automatic tunnel technology, and 6to4 is a multipoint
   automatic tunnel technology mainly used for attaching multiple IPv6
   islands over an IPv4 network to the IPv6 network.  ISATAP and 6to4
   tunnel technologies work through an IPv4 address embedded in the
   destination address of the IPv6 packets, which is automatically
   obtained at the end of the tunnel.

   The following issues are pertinent to the migration of VMs across
   data centers in a mixed (IPv4 and IPv6) network environment.

3.6.9.1. Real-time Perception of Availability of Global Network and
         Storage Resources

   In the current system, the status of availability of network
   resources and storage resources may not be reported in hard real
   time.  This may cause a mismatch between the reported and actually
   available virtual machine/storage system resources in the data
   centers.  However, at the global scale, the compute and storage
   resources in the distributed data center system may need to be used
   more efficiently.  Without real-time, up-to-date information about
   system resource availability, the network resources cannot be used
   more efficiently.

   Therefore, a management model needs to be established.  This model
   needs to keep track of system-wide network resources and storage
   resources, and dispatch them on an as-needed basis.  The management
   model can be integrated into the framework of virtual machine
   migration currently being discussed in DMTF [DMTF VSMP].  The real
   challenges here are how to learn about the availability of
   system-wide networking, compute, and storage resources.  A set of
   uniform methods, mechanisms, and protocols would be very useful to
   resolve these issues.
3.6.9.2. The real-time perception of global available network resource
         and requested network resource for matching with storage
         resources

   In mixed IPv4 and IPv6 networks, a multi-tunneling VPN gateway
   solution may be useful to resolve the problem of establishing
   communication between heterogeneous networks.  This will be helpful
   for supporting seamless communication across heterogeneous data
   centers about the availability of system-wide resources.

3.6.9.3. The real-time perception of global requested network resource
         for matching with storage resources

   Access to data center virtual machine / storage resources can be
   performed accurately when we have a set of standardized APIs,
   resource formats (memory, storage, processing, communications,
   etc.), and communication protocols.  The availability of virtual
   machine / storage system resources in the global scope needs to be
   registered, and their status needs to be reported to the resource
   management system in the cloud system.  Eventually, the resource
   management system in the cloud system is kept well informed of
   system-wide network resources.

3.6.10. Selection of Migration

3.6.10.1. Requirements with Different Network Environments and Protocol

   Currently in large-scale DCs, Layer-2 interconnection techniques are
   mainly used for migration of virtual machines, but Layer-3
   interconnection techniques for VM migration also exist.  These two
   technologies are suitable for different implementation environments
   and scenarios.  The former is often used for frequent data migration
   with strict requirements on data security, such as data migration
   and backup in a bank, whereas the latter is commonly used for data
   migration for personal or mobile users, or bulk data transfer
   between different service providers.

   Because of users' demands for the establishment of a unified
   management platform, it will become more and more important to build
   a distributed PaaS across different cloud/DC service providers.  No
   user is willing to maintain too many independent platforms.  At the
   same time, sharing of resources across multiple data centers is
   becoming a major trend.  As a result, it will become very cumbersome
   for data center managers to build a large number of VPN connections
   for all data centers.  What may be needed is a portal operator, who
   can manage all the internal VPN connections between the clouds/DCs
   and can unify the scheduling of data/VM migration in order to
   achieve optimum utilization of resources.

3.6.10.2. Requirements for Live Migration of Virtual Machines

   The scenarios for live migration of VMs across DCs include the
   following: (a) migration across IPv4 networks and across IPv6
   networks, (b) migration from IPv4 to IPv6 networks and vice versa,
   and (c) migration based on mobile IP.

   Live migration of VMs may be more suitable for mobile applications
   for small-scale and home users.  The complexity of the network can
   be fully shielded from the users, as long as both source and
   destination have either IPv4 or IPv6 addresses.  This migration
   paradigm can be more secure and applicable in a Layer-3 networking
   environment.
3.6.11. Access and Migration of VMs without users' Perception

For VM migration without the users' perception, it is required to
migrate VMs from one DC to another without causing any significant
disruption of services. In essence, the users should not be able to
perceive that the VM migration has occurred. To achieve this, no (or
only an insignificant number of) critical data packets may be lost
during the process of VM migration. The following two considerations
are helpful in achieving this:

i. First, consider how to avoid traffic roundabout while taking the
existence of the traffic roundabout problem as a prerequisite.

ii. Second, consider how to achieve a state with no user-perceptible
migration and no traffic roundabout, taking the absence of the traffic
roundabout problem as a target.

The following are the relevant problems and possible solutions in
these two areas.

3.6.11.1. VM Migration Problems and Strategies in the WAN with having
Traffic Roundabout as a Prerequisite

                               _____________
                              /             \
                    user c --+     MAN C     +
                              \_____________/
                                     |
                                    \|/
                            =--=--=--=--=--=--=
                           =                   =
                           =  backbone network =
                           =                   =
                            =--=--=--=--=--=--=
                             /               \
                           \|/               \|/
                      ___________         ___________
                     /           \       /           \
          user a ---+    MAN A    +     +    MAN B    +--- user b
                     \___________/       \___________/
                           |                   |
                       _________           _________
                      |  VM-A   |         | gateway |
                      | gateway |         |_________|
                      |_________|              |
                           |                   |
                        ________            ________
                       | |VM-A| | migration| |VM-A| |
                       | |____| | ========>| |____| |
                       | Server |          | Server |
                       |________|          |________|

              Figure 1: Roundabout Traffic Scenario

3.6.11.1.1. VM Migration Requirements

For migration in a Layer-2 (L2) network, it is required to keep the VM
MAC/IP address the same as it is in the source domain. This will help
live VM migration and seamless inter-DC communications among the
service providers.

3.6.11.1.2. A Scenario

Let us consider the scenario where a VM needs to be migrated from the
IDC in metro A to the IDC in metro B. There is almost no traffic
roundabout for users within the metro (such as user a). For access to
IDC services via the WAN, such as from a user in metro C, the client
traffic must first reach the VM-A gateway after the VM migration, and
is then sent to the migrated VM through the Layer-2 tunnel.

3.6.11.1.3. A Possible Solution

Through mechanisms such as the DNS service, businesses can access
services from a location/DC that is as close as possible, and the
roundabout routes can be minimized after migration. However, the
shortcoming of this approach is that, for access across the metro
network, traffic roundabout issues still remain. This approach works
around the problem rather than completely solving it. Moreover,
additional processing is involved in the control of the DNS service,
which increases the complexity of the solution.
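The following Python sketch illustrates the DNS-based workaround
described above, under the assumption of a hypothetical distance
metric between client metros and DCs; the service name and numeric
values are placeholders.

   # A sketch of the DNS-based workaround: answer a service query
   # with the DC closest to the client, minimizing roundabout routes
   # after migration. Distances are hypothetical metrics.
   SERVICE_LOCATIONS = {"example-service": ["dc-metro-a", "dc-metro-b"]}

   # illustrative client-metro -> DC "distance" (hops or latency)
   DISTANCE = {
       ("metro-a", "dc-metro-a"): 1, ("metro-a", "dc-metro-b"): 5,
       ("metro-c", "dc-metro-a"): 4, ("metro-c", "dc-metro-b"): 3,
   }

   def resolve(service, client_metro):
       # Return the nearest DC; clients reaching across the metro
       # network may still see roundabout paths, as noted above.
       dcs = SERVICE_LOCATIONS[service]
       return min(dcs, key=lambda dc: DISTANCE[(client_metro, dc)])

   print(resolve("example-service", "metro-c"))  # dc-metro-b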
3.6.11.2. VM Migration Problems and Strategies in the WAN without
having Traffic Roundabout as a Target

In this process of VM migration, in order to achieve real-time
migration without the users' perception, the entire state of the
management programs (including firewalls) needs to migrate as the VMs
migrate. The state migration of the firewalls is the key to ensuring
that the packets in the original firewalls' data flows are neither
lost nor mis-routed during the VM migration.

Before a VM migrates to a new DC environment, the firewalls have
recorded the session tables of the existing VM connections. In the
event of VM migration, the firewalls in the new DC location will be
used for access to the VM. If the firewalls in the new location do not
have the session tables of the original firewalls' data flows, packets
will be lost or mis-routed. The original sessions will be
disconnected, and the users' data flows will fail to reach the VM. To
solve this problem, the original firewall's session tables in use need
to be migrated to and synchronized with the session tables of the
firewall in the new VM location. The session table should contain at
least the following information: source IP address, destination IP
address, source port, destination port, protocol type, VLAN ID, time
of expiration, and public guard information for firewall defense.

Since the firewall's session table needs to migrate when a VM
migrates, the deployment of the source and destination firewalls
should be known in advance. There are at least two kinds of firewall
deployment. The first kind is the centralized deployment. In this
case, the firewalls are placed at the connection point of the DC and
the WAN. Each DC has firewalls either on or adjacent to the core
switches. The second kind is the distributed deployment. In this case,
the firewalls are distributed on the aggregation switches or access
switches. The advantages of the former are convenient management and
deployment; the disadvantage is that the firewalls can easily become a
bottleneck because of centralized/aggregated processing. The advantage
of the latter is distributed processing of huge VM data flows in a
large L2 network.

Once the deployment of the firewalls is known, it is necessary to
determine how to migrate the firewall session table from the source
location to the destination location. Since the location and number of
firewalls differ between the centralized and distributed deployments,
the mechanisms utilized to migrate the session tables in these two
deployments are not exactly the same. These are new challenges to be
addressed for VM migration.
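As an illustration, the following Python sketch models a session-table
record with the fields listed above and a minimal synchronization
step. The inter-firewall transport is abstracted away, and all names
are hypothetical.

   # A sketch of the minimum session-table record and its
   # synchronization to the firewall at the new VM location.
   from dataclasses import dataclass

   @dataclass
   class SessionEntry:
       src_ip: str
       dst_ip: str
       src_port: int
       dst_port: int
       protocol: str      # e.g., "tcp" or "udp"
       vlan_id: int
       expires_at: float  # time of expiration
       guard_info: str    # public guard info for firewall defense

   def migrate_sessions(src_fw_table, dst_fw_table, vm_ip):
       # Copy only sessions terminating at the migrating VM, so the
       # destination firewall can forward in-flight flows without
       # loss or mis-routing.
       moved = [e for e in src_fw_table
                if e.dst_ip == vm_ip or e.src_ip == vm_ip]
       dst_fw_table.extend(moved)
       return len(moved)

   table_a = [SessionEntry("203.0.113.5", "10.0.0.7", 40000, 443,
                           "tcp", 100, 1700000000.0, "none")]
   table_b = []
   print(migrate_sessions(table_a, table_b, "10.0.0.7"))  # 1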
3.6.12. Review of VXLAN, NVGRE, and NVO3

In order to solve the problem of an insufficient number of VLANs in
the DC, techniques like VXLAN and NVGRE have adopted two major
strategies: one is encapsulation, and the other is tunneling. Both
VXLAN and NVGRE use encapsulation and tunneling to create a large
number of VLAN subnets, which can be extended to Layer-2 and Layer-3
networks. This solves the problem of the limited number of VLANs as
defined by IEEE 802.1Q, and helps achieve shared load balancing in
multi-tenant environments in both public and private networks.

The VXLAN technology was introduced in 2011, and it is designed to
address the number restrictions of 802.1Q VLANs. Technologies like
MAC-in-MAC and MAC-in-GRE also extend the number of VLANs. However,
VXLAN additionally attempts to address the issues of inadequate
utilization of link resources and of monitoring packets after
re-encapsulation of the header more effectively. The frame format of
VXLAN is the same as that of OTV and LISP, although these three
solutions solve different problems of IDC interconnection and VM
migration. In VXLAN, the packet is encapsulated as MAC-in-UDP, and the
addressing is extended to 24 bits, which is an effective solution to
the restriction on VLAN numbers. The UDP encapsulation enables the
logical virtual network to extend across different subnets. It also
supports the migration of VMs across subnets.

The change of the frame structure adds the field for extending the
VLAN space. Note that VXLAN solves a different problem than OTV. OTV
solves the problem of IDC interconnection by building an IP tunnel
between different data centers through MAC-in-IP. VXLAN mainly solves
the problem of the limitation of VLAN resources in DCs due to the
increase in the number of tenants; the key is the 24-bit VNI field
that increases the number of VLANs. Both techniques can be applied to
VM migration, since the two packet formats are almost the same and
compatible.

NVGRE specifies a 24-bit Tenant Network Identifier (TNI) and resolves
some issues related to supporting multiple tenants in the DC network.
It uses GRE to create an independent virtual Layer-2 network, allowing
an otherwise limited physical Layer-2 network to expand across subnet
borders. Endpoints supporting NVGRE insert the TNI indicator into the
GRE header to separate the TNIs.

NVGRE and VXLAN solve the same problem, and the two technologies were
proposed at almost the same time. However, there are some differences
between them. VXLAN not only adds the VXLAN header (VNI), but also
adds an outer UDP encapsulation to the packet, which facilitates live
migration of VMs across subnets. In addition, differentiated services
can be offered to the tenants in the same subnet because of the use of
UDP. Both proposals are built on the assumption that load balancing is
a necessary condition for efficient operation. VXLAN randomly assigns
the source port number to achieve load balancing, while NVGRE uses the
reserved 8 bits in the GRE key field. However, there may be
opportunities to improve the capability of the control plane for both
mechanisms in the future.
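For concreteness, the following Python sketch packs the two identifier
fields compared above, following the field layouts in the VXLAN and
NVGRE drafts [VXLAN] [NVGRE]; the outer Ethernet/IP/UDP/GRE headers
and the inner frame are omitted.

   # The 8-byte VXLAN header carrying a 24-bit VNI, and the 32-bit
   # GRE key word carrying NVGRE's 24-bit TNI plus an 8-bit FlowID.
   import struct

   def vxlan_header(vni):
       # Byte 0: flags with the I bit (0x08) set, marking a valid
       # VNI; bytes 1-3 reserved; bytes 4-6: VNI; byte 7 reserved.
       assert 0 <= vni < 2**24
       return struct.pack("!II", 0x08 << 24, vni << 8)

   def nvgre_key(tni, flow_id):
       # NVGRE places the 24-bit TNI and an 8-bit FlowID (usable
       # for load balancing) in the 32-bit GRE key field.
       assert 0 <= tni < 2**24 and 0 <= flow_id < 2**8
       return struct.pack("!I", (tni << 8) | flow_id)

   print(vxlan_header(5000).hex())  # 0800000000138800
   print(nvgre_key(5000, 7).hex())  # 00138807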
3.6.13. The East-West Traffic Problem

Let us first discuss the background of the East-West traffic problem.
There is a variety of applications in the DC, such as distributed
computing, distributed storage, and distributed search. These
applications and services need frequent exchanges of transactions
between the business servers across the DCs. According to the
traditional three-tier network model, the data streams first flow
north-south and only then flow east-west. In order to improve the
forwarding efficiency of the data streams, it is necessary to update
the existing network model and network forwarding technology. Among
others, the Layer-2 multi-path technology being studied is one of the
directions for solving this problem.

Distributed computing is the basis of the transformation of the
existing IT services. It allows scalable and efficient use of the
sometimes underutilized computing and storage resources scattered
across the data centers. In typical data centers, the average server
utilization is often low in the existing network. The concepts of
virtualization and distributed computing can effectively solve the
problem of the capacity limitation of a single server in demanding
environments in certain DCs, via on-demand utilization of resources
and without impacting performance. This revolutionary technology of
distributed computing and services using resources in the DCs also
produces several horizontal flows of traffic. The application of
distributed computing technology on the servers produces a large
number of interactive traffic streams between servers.

In addition, the type of IDC influences the traffic model both within
and across data centers. The first type of IDC is operated by telecom
operators, who usually not only operate DCs but also supply bandwidth
to Level-2 ISPs. The second type is operated by large traditional ISP
companies. The third type is operated by IT enterprises that invest in
the construction of DCs. The fourth type is high-performance computing
(HPC) centers that are built by universities, research institutes, and
other organizations. Note that in these types of DCs, the south-north
traffic flow is significantly smaller than the horizontal flow, and
this poses the greatest challenges to network design and installation.
In addition to the normal flow of traffic due to distributed
computing, storage, communications, and management, hot backup and VM
migration requirements produce sudden lateral flows of traffic, with
associated challenges.

There are three potential solutions to the distributed horizontal flow
of traffic, as described below.

A. The first one solves the problem of east-west traffic within the
server clusters by exploiting representative technologies such as
vSwitch, DCell, BCube, and DCTCP.

B. The second solution works through the server network and the
Ethernet network, by exploiting technologies such as IEEE 802.1Qbg,
VEPA, and UCS.

C. The third solution is the network-based solution. The tree
structure of the traditional DC network is not inherently efficient
for horizontal flows of traffic. The problems can be solved in two
ways: (i) the direction of radical change: radically deforming the
tree structure into a multi-path structure, and (ii) the direction of
mild improvement: changing large L2 trees into small L2 trees and
meeting the requirements by expanding the interconnection capacity of
the upper nodes, clustering/stacking systems, and link trunking.

The requirements related to the above are as follows: stacking
technology across the data center requires specialized interfaces, and
the feasible transmission distance is limited.

The problems related to the above statement include the following: (a)
although TRILL resolves the multi-path problem of the Layer-2
protocol, it negatively impacts the multi-path properties of the
Layer-3 protocol. This is because only one active default router
supports the Virtual Router Redundancy Protocol (VRRP), which means
that the multi-path characteristics cannot be fully utilized by the
Layer-3 protocol. In addition, TRILL neither defines how to deal with
the problem of overlapping namespaces, nor provides any solution to
the requirement of supporting more than 4K VLANs.

3.6.14. Data Center Interconnection Fabric Related Problems

One of the most important factors that directly impact VMMI is the
connectivity among the relevant data centers. Many features determine
this required connectivity, including bandwidth, security, quality of
service, load-balancing capability, etc. These are frequently utilized
to decide whether a VM can join a host in real time or whether it
needs to join a VRF in a certain unit of VMs. This connectivity fabric
should be open and transparent, which can be achieved by developing
simple extensions to some of the existing technologies. The program
should have strong openness and compatibility; it must be easy to
deploy any required extensions as well.
The requirements related to the above are as follows:

o The negative impact of ARP, MAC, and IP entry explosion on an
  individual network that contains a large number of tenants should
  be minimized by DC and DC-interconnect technologies.

o The link capacity of both the intra-DC and the inter-DC network
  should be effectively utilized. Efficient utilization of the link
  capacity requires that traffic be forwarded on the shortest path
  between two VMs, both within the DC and across DCs.

o Support of east-west traffic between customers' applications
  located in different DCs.

o Management of VMs across DCs.

o Mobility of VMs and their migration across DCs.

Many mature VPN technologies can be utilized to provide connectivity
between DCs. The extension of VLANs and virtual domains between DCs
may also be utilized for this purpose.

3.6.15. MAC, IP, and ARP Explosion Problems

Network devices within data centers encounter many problems in
supporting the conventional communication framework, because they need
to accommodate a huge number of IP addresses, MAC addresses, and ARP
entries. Each blade server usually supports at least 16-40 VMs, and
each VM has its own MAC address and IP address. Entities like disks,
memory, FDB tables, MAC tables, etc. cause an increase in convergence
time. In order to accommodate this large number of servers, different
options for the network topology, for example a fat-tree topology or a
conventional network topology, may be considered.

The number of ARP packets grows not only with the number of virtual L2
domains or ELANs instantiated on a server, but also with the number of
VMs in each such domain. Therefore, scenarios like overload of ARP
entries on the server/hypervisor, exhaustion of ARP entries on the
routers/PEs, and processing overload of L3 service appliances must be
efficiently resolved. These problems easily propagate throughout the
Layer-2 switching network. Consequently, what is needed to resolve
these problems includes (a) automated management of MAC/IP/ARP in the
IDC, and (b) network deployments that reduce the explosion in the
number of MAC addresses required in DCs.

3.6.16. Suppressing Flooding within VLAN

Efficient operation of data centers requires that flooding of
broadcast, multicast, and unknown unicast frames within a VLAN (which
may be caused by improper configuration) be reduced.
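One commonly discussed direction, sketched below in Python, is
directory-assisted ARP suppression: an edge device answers ARP
requests from a pre-populated IP-to-MAC table instead of flooding them
through the VLAN. The directory source (e.g., a VM manager) is
assumed, not specified by this draft, and all bindings are
illustrative.

   # A sketch of directory-assisted ARP suppression at an edge
   # switch; unknown bindings fall back to flooding.
   ARP_DIRECTORY = {
       "10.1.0.11": "52:54:00:aa:bb:01",
       "10.1.0.12": "52:54:00:aa:bb:02",
   }

   def handle_arp_request(target_ip):
       mac = ARP_DIRECTORY.get(target_ip)
       if mac is not None:
           # Known binding: reply locally; no broadcast is forwarded.
           return ("reply", mac)
       # Unknown binding: fall back to flooding (the case to minimize).
       return ("flood", None)

   print(handle_arp_request("10.1.0.12"))  # ('reply', '52:54:00:aa:bb:02')
   print(handle_arp_request("10.1.0.99"))  # ('flood', None)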
3.6.17. Convergence and Multipath Support

Although STP is used to solve the broadcast storm problem in loops, it
may cause network oscillation, resulting in inefficient utilization of
resources. Possible solutions to this problem include switch
virtualization and the use of TRILL, SPB, etc. Consequently,
standardization of switch virtualization and support of complex
network topologies in TRILL/SPB would be very helpful.

3.6.18. Routing Control - Multicast Processing

In order to achieve efficient operation of data centers, the overheads
and delays due to the processing of (a) different types of packets
such as unicast, multicast, and broadcast packets, (b) ARP packets,
and (c) load-balancing/-sharing mechanisms must be minimized.

Note that STP bridging is often combined with IGMP and/or PIM snooping
to optimize multicast data delivery. However, since this snooping
mechanism operates on the local STP topology, all traffic goes through
the root bridge. This type of traversal may lead to sub-optimal
multicast traffic transmission. There are also additional overheads,
because each customer multicast group is associated with a forwarding
tree throughout the Ethernet switching network. Consequently,
development and standardization of an efficient Layer-2 multicast
mechanism to support intra- and inter-DC VM mobility would be very
useful.

3.6.19. Problems and Requirement related to DMTF

o Computing Resources

  It is required to standardize the format for virtualizing computing
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized computing
  resources would also be very useful.

o Storage Resources

  It is required to standardize the format for virtualizing storage
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized storage
  resources would also be very useful.

o Memory Resources

  It is required to standardize the format for virtualizing memory
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized memory
  resources would also be very useful.

o Switching Resources

  It is required to standardize the format for virtualizing switching
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized switching
  resources would also be very useful.

o Networking Resources

  It is required to standardize the format for virtualizing networking
  resources. Best practices for utilizing a standardized format for
  mobility and interconnection management of virtualized networking
  resources would also be very useful.

4. Control & Mobility Related Problem Specification

4.1. General Requirements and Problems of State Migration

4.1.1. Foundation of Migration Scheduling

A series of inspections needs to be done before initiating the VM
migration process. The hypervisor should be able to confirm which data
centers need to be interconnected for migrating the VM data over the
network. The hypervisor should also be able to confirm which subnets
and servers in the current network are most suitable to accommodate
the migrated VMs.

4.1.2. Authentication for Migration

For VM migration, authentication is required for all of the following
entities: network resources, processor, memory and storage resources,
load balancer, firewall, etc.

4.1.3. Consultation for Assessing Migratability

After successful authentication, it is required to check that the
inter-DC networking resources can support the migration of the VMs.
The required resources include network bandwidth resources, storage
resources, resource pool scheduling or management resources, and so
on.
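A minimal Python sketch of these pre-migration inspections follows;
the entity records and resource keys are placeholders, not a
prescribed schema.

   # A sketch of the inspections of Sections 4.1.2 and 4.1.3:
   # authenticate every involved entity, then verify that inter-DC
   # resources can support the move. All checks are placeholders.
   def authenticate(entities):
       # 4.1.2: network, processor, memory/storage, load balancer,
       # firewall, etc. must all authenticate successfully.
       return all(e.get("credential") == "valid" for e in entities)

   def consult_resources(required, offered):
       # 4.1.3: bandwidth, storage, and pool-scheduling resources
       # must cover what the migration requires.
       return all(offered.get(k, 0) >= v for k, v in required.items())

   def may_migrate(entities, required, offered):
       return authenticate(entities) and consult_resources(required,
                                                           offered)

   entities = [{"name": "firewall", "credential": "valid"},
               {"name": "load-balancer", "credential": "valid"}]
   required = {"bandwidth_mbps": 1000, "storage_gb": 200}
   offered = {"bandwidth_mbps": 2000, "storage_gb": 500}
   print(may_migrate(entities, required, offered))  # True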
4.1.4. Standardization of Migration State

As an example of standardization of the VM state migration process,
the following related entities should be aware of the state of each
other. The flow of activities may be as follows:

Global detection -> authentication processing -> capability
negotiation -> session establishment -> initialization of instance ->
establish the beginning stage -> begin migration -> migration &
migration exception handling -> finish migration -> end stage ->
deletion of instances -> global detection

        +----------------------------+
        |                           \|/
        |                 +------------------+
        |                 | Global detection |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |  authentication  |
        |                 |    processing    |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |    capability    |
        |                 |   negotiation    |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |     session      |
        |                 |  establishment   |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |  initialization  |  establish the
        |                 |   of instance    |  beginning stage
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |        +------->| begin migration  |
        |        |        +------------------+
        |        |                 |
        |        |                \|/
        |  +------------+  Y
        |  | exception  |<-----  migration
        |  | processing |        exception?
        |  +------------+          |
        |                          | N
        |                         \|/
        |                 +------------------+
        |                 | finish migration |
        |                 +------------------+
        |                          |
        |                         \|/
        |                 +------------------+
        |                 |   destruction    |  end stage
        |                 |   of instances   |
        |                 +------------------+
        |                          |
        +--------------------------+

   Figure 2: A Flow Chart for State Migration between Data Centers

4.2. Mobility in Virtualized Environments

In order to support VM mobility, it is required to allow VMs to
migrate easily and repeatedly -- that is, as often as needed by the
applications and services -- among a large number (more than two) of
DCs. Seamless migration of VMs in mixed IPv4 and IPv6 VPN environments
should be supported by using appropriate DC GWs.

VMs in the resource pool should support mobility. These mobile VMs can
move either within a DC or from one DC to another remote DC. The
mobility can be triggered by factors like a natural disaster, load
imbalance, a cost (of space, electricity, etc.) reduction campaign,
and so on. When a VM is migrated to a new location, it should maintain
the existing client sessions. The VM's MAC and IP addresses should be
preserved, and the state of the VM sessions should be copied to the
new location.

Some widely used virtual machine migration tools require that the
management programs on the source server and the destination server be
directly connected via an L2 network. The objective is to facilitate
the implementation of smooth VM migration. One example of such a tool
is VMware's VMotion virtual machine migration tool.

(1) Firstly, a VMotion ELAN may need to provide protection and load
balancing across multiple DC networks.

(2) Secondly, in the current VMotion procedure, the new location of
the VM must be part of the tenant ELAN domain. When a new VM is
activated, a gratuitous ARP is sent, and the MAC FIB entries in the
"tenant ELAN" are updated to direct the traffic for that VM to the new
VM location.

(3) Thirdly, if the path needs IP forwarding, the reachability
information of the VM must be updated to the shortest-path information
to the VM.
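As an illustration of step (2), the following Python sketch emits a
gratuitous ARP for a migrated VM. It assumes the third-party scapy
library; the interface name and addresses are illustrative, and
sending raw frames typically requires elevated privileges.

   # A sketch of announcing a migrated VM with a gratuitous ARP so
   # that MAC FIB entries in the tenant ELAN are refreshed.
   from scapy.all import ARP, Ether, sendp

   def announce_vm(vm_ip, vm_mac, iface):
       # Gratuitous ARP: an unsolicited "is-at" reply, broadcast,
       # with sender and target IP both set to the VM's preserved IP.
       pkt = (Ether(src=vm_mac, dst="ff:ff:ff:ff:ff:ff") /
              ARP(op=2, hwsrc=vm_mac, psrc=vm_ip,
                  hwdst="ff:ff:ff:ff:ff:ff", pdst=vm_ip))
       sendp(pkt, iface=iface, verbose=False)

   # announce_vm("10.1.0.11", "52:54:00:aa:bb:01", "eth0")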
4.3. VM Mobility Requirements

4.3.1. Summarization of Mobility

Mobility refers to the movement of a VM from one server to another
server, within one DC or to a different DC, while maintaining the VM's
original IP and MAC address throughout the process. VM mobility does
not change the VLAN/subnet connection of the VM, and it requires that
the serving VLAN be extended to the new location of the VM. In
summary, the seamless mobility solution in the DC is based on IP
routing, BGP/MPLS MAC-VPN, BGP/MPLS IP VPNs, and NHRP.

4.3.2. Problem Statement

The following are the major issues related to supporting seamless
mobility of VMs.

The first problem is that the participating source server and
destination server in the VM migration process may be located in
different data centers. It may be required to extend the Layer-2
network beyond what is covered by the L2 network of the source DC.
This may create islands of the same VLAN in different (geographically
dispersed) data centers.

The second problem is that optimal forwarding in a VLAN that supports
VM mobility may involve traffic management over multiple data centers.

The third problem is that the support of seamless mobility of VMs
across DCs may not necessarily always achieve optimal intra-VLAN
forwarding.

The fourth problem is that the support of seamless mobility of VMs
across DCs may not necessarily always result in optimal routing.

5. Network Management Related Problem Specification

5.1. Data Center Maintenance

We note that the servers and the applications/services in the data
center should maintain uninterrupted service during the migration
process. In order to provide uninterrupted service during the
migration process, the following are some prerequisites:

o It is required to ensure that the networking and communication
  services remain uninterrupted between the source node and the
  destination node during the migration.

o A stateful migration may be preferred. It may be desirable not to
  respond to users' requests until a successful migration occurs. The
  service management program in the source server records the current
  state of the VM and saves users' requests for any service/operation
  on the VM in the source node.

o It is required to copy the state data of the source VM to the
  target VM in another DC; then the new VM in the target node (DC)
  can be activated for accepting the service requests.

o The service management program in the source server needs to store
  (in cache) both the operation requests and the current state of the
  source VM, and send those over the network to the service
  management program in the target server. As soon as the target
  server and VM become ready, the service management program in the
  target server publishes the received operation requests to the
  target VM. The target VM takes the received final state information
  of the source VM as its initial operational parameters.
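A minimal Python sketch of this request-caching step follows; the
service-manager interface and VM objects are hypothetical stand-ins.

   # While VM state is copied, the source-side manager withholds
   # responses and queues requests, then replays them to the target.
   import queue

   class MigrationProxy:
       def __init__(self):
           self.migrating = False
           self.pending = queue.Queue()

       def on_request(self, request, vm):
           if self.migrating:
               # Do not respond until migration succeeds; record the
               # request so no operation on the source VM is lost.
               self.pending.put(request)
               return None
           return vm.handle(request)

       def finish_migration(self, target_vm):
           # Replay cached requests against the activated target VM,
           # which starts from the source VM's final state.
           self.migrating = False
           while not self.pending.empty():
               target_vm.handle(self.pending.get())

   class EchoVM:
       def handle(self, request):
           print("handled:", request)

   proxy, vm_src, vm_dst = MigrationProxy(), EchoVM(), EchoVM()
   proxy.migrating = True
   proxy.on_request("write x=1", vm_src)  # queued, no response yet
   proxy.finish_migration(vm_dst)         # replayed at new location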
However, in real-life operations, system malfunctions may occur in any
one of the above four steps/scenarios. For example, it may be
difficult to ensure uninterrupted communication/networking between the
source node and the destination node during the entire migration
process. Maintaining sustainable network QoS may be complex, and VM
migration may take an excessively long time due to the lack of timely
availability of the required nodal/DC resources.

Now, if the VM migration time is excessively long, the users may need
to be allowed to continue using the source VM, and the changes to the
data during the migration must also be recorded. At the same time, it
is required to take measures to ensure that the amount of change in
the database and the application is as small as possible. This will
help achieve faster recovery, and at the same time the interruption
due to VM migration will be almost imperceptible to the users.

It may be useful if the IETF proposes a standard definition of
uninterrupted service for the VM migration scenario. This definition,
along with its parameters, can be the basis for checking the maturity
of various VM migration solutions. The definition should take into
account the time that the users/services can tolerate without any
perception of interruption in the operation. The total time is the sum
of the times required for executing the four steps/processes mentioned
at the beginning of this section. It may be expected that the most
mature solution for each of the steps/processes will offer the fastest
and best solution to the VM migration process.

The next problem related to this topic is the physical device
compatibility problem. When migrating a VM from one Physical Machine
(PM) to another, if the VM depends on some special driver or hardware
that is NOT available on the target PM, the migration process will
fail. For example, if a VM uses IOMMU technology, which is used to
access real hardware directly from the VM (not emulated by the
hypervisor, for high performance), and this device is not available on
the target PM, the VM migration process will fail. Therefore, a basic
requirement related to VM migration is checking for strict
compatibility between the source and target PMs before initiating the
migration process.

Another problem related to this topic is the migration of VMs between
heterogeneous hypervisors. We note that some virtual network functions
are implemented in the hypervisor, such as the vSwitch in VMware.

Additional requirements related to the above are as follows: stateful
and stateless VMMI processing need to be treated separately. Stateless
VMMI processing refers to the fact that the protocol state for a
transaction does not need to be preserved in memory. This lack of
state means that if follow-up processing is needed, the information
must be retransmitted before it can be processed. This could lead to a
significant increase in the amount of data that needs to be
transferred as the number of connections increases. For stateless VM
migration, there is no need to transfer previous state information,
and hence lightweight processing and fast response can be achieved.
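The contrast can be sketched in Python as follows (interfaces are
illustrative): a stateless transfer retransmits from the beginning
after an interruption, while a stateful transfer resumes from
preserved state.

   def stateless_transfer(data, send):
       # No protocol state kept: any follow-up processing requires
       # retransmitting everything, inflating traffic as the number
       # of connections grows.
       for chunk in data:
           send(chunk)

   def stateful_transfer(data, send, state):
       # State (the last completed offset) is preserved in memory,
       # so an interrupted transfer resumes where it stopped.
       for i in range(state["offset"], len(data)):
           send(data[i])
           state["offset"] = i + 1

   sent, state = [], {"offset": 0}
   stateful_transfer(["a", "b", "c"], sent.append, state)
   print(sent, state)  # ['a', 'b', 'c'] {'offset': 3}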
5.2. Load Balancing after VM Migration and Integration

In the migration of virtual machines between data centers, users may
be served according to the nearest-computation, "follow the sun"
principle, or according to multi-site load-balancing requirements. In
addition, for reducing energy consumption, cooling costs, and other
similar considerations, the virtual machines can be consolidated into
less dynamic data centers, which is the future trend of the so-called
"Green" data centers. The challenge related to this topic is how to
solve the problem of load balancing.

For example, before the migration of a VM, the load on the source VM
server and the network traffic distribution may be balanced locally,
and the load on the destination VM server and its network traffic
distribution may likewise be balanced locally. However, after the
migration of the VM from the source server to the destination server,
both the loading condition and the traffic distribution may remain
unbalanced, even for some extended time period.

Therefore, it may be useful to define and enforce a set of policies in
order to allocate VMs and other networking and computing resources
uniformly across data centers. Of course, the software, hardware, and
networking environments of the source and destination servers should
also be as similar as possible.

5.3. Security and Authentication of VMMI

During the VMMI / VM migration process, it is required to give proper
consideration to security-related matters; this includes solving
traffic roundabout issues, ensuring that the firewall functionalities
are appropriately enacted, and so on.

Therefore, in addition to authorization and authentication,
appropriate policies and measures to check/enforce the security level
must be in place while migrating VMs from one DC to another,
especially from a private DC to a public DC in the Cloud [NIST
800-145, Cloud/DataCenter SDO Survey]. For example, when a VM is
migrated to the destination DC network, the corresponding switch port
of the VM and its host server should utilize the port strategy of the
source switch. The end time of the VM migration and the issue time of
the strategy must be synchronized. If the former is earlier than the
latter, the services may not get a timely response; if the former is
later than the latter, the network may not have the exact level of
security for a period of time. What may be helpful in such an
environment is the creation and maintenance of a reasonable
interactive state machine.

5.4. Efficiency of Data Migration and Fault Processing

It may be useful to streamline data before commencing VM migration.
Incremental migration may help improve VM migration efficiency; for
example, one may plan to transfer only differentiated data during the
VM migration process between two DCs. However, this strategy may carry
the risk of propagating faults between DCs. In addition, if VM
migration occurs between heterogeneous database systems, such as
transferring data from an ORACLE database on a Linux system to an SQL
Server database on a Windows system, it is necessary to define the
security and policy for the case when a fault occurs. The processing
of VM migration may be slower when a database migration operation
fails, and there may be a need to roll back to the previous stable
states of all the databases involved in the VM migration. Similar
issues are being discussed in DMTF [DMTF VSMP] as well.
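As an illustration, the following Python sketch transfers only
differentiated data and rolls back to the last stable state on
failure; checkpointing is reduced to a dictionary copy, and the data
model is purely illustrative.

   # A sketch of incremental (differential) migration with rollback
   # on fault, as discussed above.
   def incremental_migrate(source, target):
       checkpoint = dict(target)      # last stable state of target
       try:
           # Transfer only differentiated data between the two DCs.
           delta = {k: v for k, v in source.items()
                    if target.get(k) != v}
           target.update(delta)
           return len(delta)
       except Exception:
           # On failure, roll back to the previous stable state.
           target.clear()
           target.update(checkpoint)
           raise

   src = {"row1": "v2", "row2": "v1"}
   dst = {"row1": "v1", "row2": "v1"}
   print(incremental_migrate(src, dst), dst)
   # 1 {'row1': 'v2', 'row2': 'v1'}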
5.5. Robustness Problems

5.5.1. Robustness of VM Migration

During normal operations, VMs may encounter a series of challenges,
e.g., CPU overload, memory and storage stress, disk space limitations,
excessive program response time, database write failures, file system
failures, etc. If any of the above issues cannot be resolved in a
timely fashion, it will lead to the collapse of the VM migration
process.

As a part of the recovery process, the VM management process should
take a snapshot of all data in the VM and copy it into a blank VM (VM
template) on the current or a distant server, with the objective of
preventing any service disruption. The snapshot can be stateful or
stateless, depending on (a) the status, nature, and function of the
owner to which the various data in the VM belong, and (b) the
replication strategy. For example, for the data in a database, a
stateful snapshot needs to be taken, because the database itself has
the ability to record its running state.

We note that incremental migration of VM state alone is not sufficient
to guarantee service continuity; an alternative solution may be
warranted. During the VM migration process, if the speed of writing is
faster than the data transfer rate (from the source VM location to the
destination VM location), the VM state transfer has to be paused to
allow time for bulk data transfer. During this adjustment period,
service downtime will occur. It is required to develop methods and
mechanisms to overcome such service discontinuity.

5.5.2. Robustness of VNE

During normal operations, VNEs may encounter a series of challenges,
e.g., CPU overload, memory stress, space limitations of the MAC table
and the forwarding table, lack of routing convergence, excessive
program response time, file system failures, etc. If any of the above
issues cannot be resolved in a timely fashion, it will lead to the
collapse of the VNE migration.

As a part of the recovery process, the VNE management process should
take a snapshot of all data in the VNE and copy it into an
idle/unassigned VNE on the current or a distant node, with the
objective of preventing any service disruption. The snapshot can be
stateful or stateless, depending on (a) the status, nature, and
function of the owner to which the various data in the VNE belong, and
(b) the replication strategy. For example, for a stateful snapshot of
a VNE, both the protocol state and the status of the forwarding table
need to be captured and transferred to the new (migrated) location of
the VNE.
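A simplified Python sketch of this snapshot-based recovery follows;
the dictionaries stand in for real VM/VNE data and state, and the
stateful/stateless distinction is reduced to whether running state is
captured alongside the data.

   # Copy a snapshot of a failing VM or VNE into a blank template on
   # another node before service collapses.
   import copy

   def take_snapshot(element, stateful):
       snap = {"data": copy.deepcopy(element["data"])}
       if stateful:
           # e.g., a database's running state, or a VNE's protocol
           # state and forwarding table, captured with the data.
           snap["state"] = copy.deepcopy(element.get("state", {}))
       return snap

   def restore_to_blank(snapshot, blank_template):
       blank_template.update(snapshot)
       return blank_template

   vne = {"data": {"fib": {"10.2.0.0/16": "port3"}},
          "state": {"ospf": "full"}}
   replica = restore_to_blank(take_snapshot(vne, stateful=True), {})
   print(replica["state"])  # {'ospf': 'full'}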
6. Acknowledgement

The following experts have provided valuable comments on the earlier
version of this draft: Thomas Narten, Christopher Liljenstolpe, Steven
Blake, Ashish Dalela, Melinda Shore, David Black, Joel M. Halpern,
Vishwas Manral, Lizhong Jin, Juergen Schoenwaelder, Donald Eastlake,
and Truman Boyes. We express our sincere thanks to them, and expect
that they will continue to provide suggestions in the future.

7. References

   [PBB-VPLS] Balus, F., et al., "Extensions to VPLS PE model for
   Provider Backbone Bridging", draft-ietf-l2vpn-pbb-vpls-pe-model-04
   (work in progress), October 2011.

   [VM-Mobility] Aggarwal, R., et al., "Data Center Mobility based on
   BGP/MPLS, IP Routing and NHRP", draft-raggarwa-data-center-
   mobility-01 (work in progress), September 2011.

   [DCN Ops Req] Dalela, A., "Datacenter Network and Operations
   Requirements", draft-dalela-dc-requirements-00 (work in progress),
   December 2011.

   [DMTF VSMP] DMTF, "Virtual System Migration Profile", DSP1081,
   Version 1.0.0c, May 2010.

   [VPN Applicability] Bitar, N., "Cloud Networking: Framework and VPN
   Applicability", draft-bitar-datacenter-vpn-applicability-01 (work
   in progress), October 2011.

   [VXLAN] Mahalingam, M., et al., "VXLAN: A Framework for Overlaying
   Virtualized Layer 2 Networks over Layer 3 Networks",
   draft-mahalingam-dutt-dcops-vxlan-01 (work in progress), February
   2012.

   [NIST 800-145] Mell, P. and T. Grance, "The NIST Definition of
   Cloud Computing", NIST Special Publication 800-145,
   http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf,
   September 2011.

   [Cloud/DataCenter SDO Survey] Khasnabish, B. and C. JunSheng,
   "Cloud/DataCenter SDO Activities Survey and Analysis",
   draft-khasnabish-cloud-sdo-survey-02 (work in progress), December
   2011.

   [NVGRE] Sridharan, M., et al., "NVGRE: Network Virtualization using
   Generic Routing Encapsulation", draft-sridharan-virtualization-
   nvgre-00 (work in progress), September 2011.

   [NVO3] Narten, T., "NVO3: Network Virtualization", l2vpn-9.pdf,
   November 2011.

   [Network State Migration] Gu, Y.,
   draft-gu-opsawg-policies-migration-01 (work in progress), October
   2011.

   [Matrix DCN] Sun, et al., "Matrix Fabric based Data Center
   Network", draft-sun-matrix-dcn-00 (work in progress), 2012.

8. Security Considerations

To be added later, on an as-needed basis.

9. IANA Consideration

The extensions that are discussed in this draft are related to the DC
operations environment.

10. Normative References

   [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
   Networks (VPNs)", RFC 4364, February 2006.

Authors' Addresses

   Bhumip Khasnabish
   ZTE USA, Inc.
   55 Madison Avenue, Suite 160
   Morristown, NJ 07960
   USA
   Phone: +001-781-752-8003
   Email: vumip1@gmail.com, bhumip.khasnabish@zteusa.com

   Bin Liu
   ZTE Corporation
   15F, ZTE Plaza, No.19 East Huayuan Road, Haidian District
   Beijing 100191
   P.R.China
   Phone: +86-10-59932098
   Email: richard.bohan.liu@gmail.com, liu.bin21@zte.com.cn

   Baohua Lei
   China Telecom
   118, St. Xizhimennei, Office 709, Xicheng District
   Beijing
   P.R.China
   Phone: +86-10-58552124
   Email: leibh@ctbri.com.cn

   Feng Wang
   China Telecom
   118, St. Xizhimennei, Office 709, Xicheng District
   Beijing
   P.R.China
   Phone: +86-10-58552866
   Email: wangfeng@ctbri.com.cn