Network Working Group                                            N. Zong
Internet-Draft                                                 L. Dunbar
Intended status: Informational                       Huawei Technologies
Expires: October 12, 2014                                       M. Shore
                                                    No Mountain Software
                                                                D. Lopez
                                                              Telefonica
                                                          April 10, 2014


       Virtualized Network Function (VNF) Pool Problem Statement
                draft-zong-vnfpool-problem-statement-04

Abstract

Network functions are traditionally implemented on specialized hardware rather than on general purpose servers, but there is a clear trend to implement a number of network functions, such as firewalls or load balancers, as software on virtualized computing platforms. These virtualized functions are called Virtualized Network Functions (VNFs). We call a group of such VNFs a VNF set, which can be used to build network services. The use of a VNF set to build network services introduces additional reliability challenges, such as additional points of failure and the need to coordinate various VNFs.

This document introduces a general idea of VNF Pools to support reliable function provision by the VNF set. We then highlight the reliability challenges and problems when using the VNF set to build services. Related IETF work is also briefly described.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on October 12, 2014.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1.  Introduction
2.  Terminology
3.  Background
  3.1.  From Specialized Hardware to Virtualized Network Function
  3.2.  The Concept of VNF Set
4.  VNF Pools
5.  Problems
  5.1.  Risk factors for unreliable instance
  5.2.  Redundancy model inside VNF
  5.3.  State synchronization inside VNF
  5.4.  Reliability impact on adjacent VNFs
  5.5.  Reliable transport
  5.6.  Scope Considerations
6.  Related Works
  6.1.  Reliable Server Pool (RSerPool)
  6.2.  Virtual Router Redundancy Protocol (VRRP)
  6.3.  Service Function Chaining (SFC)
7.  Security Considerations
8.  IANA Considerations
9.  Acknowledgements
10. References
  10.1.  Normative References
  10.2.  Informative References
Authors' Addresses

1.  Introduction

Network functions such as firewalls, load balancers, and WAN optimizers are conventionally deployed as specialized hardware servers in both network operators' networks and data center networks, where they serve as the building blocks of network services.

A Virtualized Network Function (VNF) provides such a network function through its implementation as software instances running on general purpose servers via a virtualization layer (i.e., a hypervisor). VNFs potentially offer benefits such as elastic service offering and reduced operational and equipment costs [NFV-WP].

There is a trend to move network functions from specialized hardware servers to general purpose servers based on virtualized computing platforms, in order to build network services by using VNFs. For example, in Service Function Chaining (SFC), a network service can be built using a group of sequentially connected VNF instances deployed at different points in the network [SFC].

We call a group of VNF instances a VNF set. A VNF set can include one or more VNFs (e.g., virtual firewall, virtual load balancer), and each VNF may have a number of instances. A VNF set can be used not only as part of an SFC, but also simply as multiple VNFs without any specific topological constraint.

VNF Pools can be used by a VNF set to provide reliable functions. In VNF Pools, each VNF has a pool containing a number of VNF instances. Each VNF also has a Pool Manager that manages the pool and interacts with the Service Control Entity to provide the network function.
The major benefit of VNF Pools is that reliability mechanisms such as redundancy management become internal to the VNF itself and are not visible to the Service Control Entity. The Service Control Entity simply talks to the Pool Manager representing each VNF to request and orchestrate the network functions with the desired reliability level.

Nevertheless, a VNF set and the virtualized functions can pose additional challenges to the reliability of the provided services, such as additional points of failure and the need to coordinate various VNF instances. A VNF instance typically does not have built-in reliability mechanisms on its host (i.e., a general purpose server). Instead, there are more risk factors, such as software failure, hardware failure, and instance migration, that may make a VNF instance unreliable.

We are specifically concerned with the reliability of a VNF instance as managed internally by the VNF. For example, how is the redundancy model (e.g., active/standby) for a VNF instance managed? How are the service states of a VNF instance held and accessed for efficient synchronization with backup instances in a VNF pool? We are only concerned with the whole VNF set to the extent that it involves a reliability impact on adjacent VNF instances of different VNFs. For example, when a live VNF instance goes out of service, how does an adjacent VNF instance in another VNF pool learn which instance will replace it? How does a VNF instance learn the states of an adjacent VNF instance in another VNF pool before the adjacent VNF instance fails?

Note that we only focus on reliability mechanisms based on VNF Pools. Other mechanisms such as VNF scaling and load balancing are out of scope, although they are probably complementary to reliability in order to provide network services.

This document introduces a general idea of VNF Pools to support reliable function provision by the VNF set. We then highlight the reliability challenges and problems when using the VNF set to build services. Related IETF work is also briefly described.

2.  Terminology

Reliability: the capability of a functional entity to consistently provide its function under various dynamic and even unexpected conditions such as fault, overload, etc.

Service Control Entity: an entity of the service provider that decides how to combine the network functions to build network services. Examples are an orchestrator of DC services, an SFC control plane, etc.

Virtualized Network Function (VNF): a VNF provides the same functional behavior and interfaces as the equivalent network function, but is deployed as software instance(s) built on top of a virtualization layer [NFV-TERM].

VNF Pool: a group of VNF instances providing the same network function.

VNF Pool Element: a VNF instance inside a VNF pool.

VNF Pool Manager: an entity that manages a VNF pool and interacts with the Service Control Entity to provide the network function.

VNF Set: a group of VNF instances that can be used to build network services.

3.  Background

3.1.  From Specialized Hardware to Virtualized Network Function

Network functions are traditionally implemented on specialized hardware. There is a trend to implement a number of network functions as software instances on general purpose servers, via virtualized computing platforms. These virtualized functions are called Virtualized Network Functions (VNFs). For example, in Figure 1, a virtual firewall (vFW) can be deployed as software instances on general purpose servers, which could be located in Data Center (DC) networks, network operators' networks, or end user premises.
Compared with a traditional FW deployed as a "standalone box" combining specialized hardware and software, a vFW has potential advantages such as agility and scalability [NFV-WP].

       FW                 vFW            vFW            vFW
+-------------+      +-----------+  +-----------+  +-----------+
| Specialized |      |FW Software|  |FW Software|  |FW Software| ...
|  Hardware   |----\ +-----------+  +-----------+  +-----------+
|      +      |----/ +-----------------------------------------+
|  Software   |      |         Virtualization Platform         |
+-------------+      +-----------------------------------------+
                     +-----------------+     +-----------------+
                     | General Purpose |     | General Purpose |
                     |     Server      |     |     Server      | ...
                     +-----------------+     +-----------------+

                    Figure 1: Example of vFW.

3.2.  The Concept of VNF Set

We call a group of VNF instances a VNF set. A VNF set can include one or more VNFs, and each VNF can have a number of instances. The following examples are all valid VNF sets.

1.  n vFW instances: {vFW#1,vFW#2,...,vFW#n}.

2.  m vFW instances and k virtual load balancer (vLB) instances: {vFW#1,...,vFW#m,vLB#1,...,vLB#k}.

To be more generic, we denote by VNF-A#x the xth instance of VNF type A, by VNF-B#y the yth instance of VNF type B, and so on.

A VNF set can be used as part of a Service Function Chaining (SFC) [SFC], where the instances of various functions are sequentially connected to build a network service. A simple example is shown in Figure 2.

                     Network Service
+----------+           +----------+           +----------+
| VNF-A#x  | data conn | VNF-B#y  | data conn | VNF-C#z  |
|          |-----------|          |-----------|          |
+----------+           +----------+           +----------+

         Figure 2: A VNF set used as part of an SFC.

Alternatively, a VNF set can also be used simply as multiple VNFs, where the VNF instances provide network services in parallel. An example is shown in Figure 3.

+----------+      +----------+      +----------+
| VNF-A#x  |      | VNF-B#y  |      | VNF-C#z  |
+----------+      +----------+      +----------+
         \             |              /
 data conn \           |data        /data conn
             \         |conn      /
               \       |        /
               +---------------+
               |    Client     |
               +---------------+

     Figure 3: A VNF set used as multiple VNFs.

More detailed use cases of VNFs are documented in several separate drafts [VNFPOOL-UC1] [VNFPOOL-UC2] [VNFPOOL-UC3].

4.  VNF Pools

There are a number of existing technologies for providing reliable functions, such as Reliable Server Pool (RSerPool) [RFC5351] and Virtual Router Redundancy Protocol (VRRP) [RFC5798], amongst many others. Both technologies provide the service with an abstract object (e.g., pool handle in RSerPool, virtual router ID in VRRP) representing a group of functional instances. The dynamic mapping of such an abstract object to the actual serving instance is managed internally in the group to cover the failover procedure. The advantage is that reliable functions are provided in a manner transparent to both end-hosts and service control entities.

We adopt a similar idea, VNF Pools, in the context of a VNF set to provide reliable network functions, as shown in Figure 4.

                     +------------------------+
                     | Service Control Entity |
                     +------------------------+
                           ^            ^
                           |            |
                +----------+            +----------+
                |                                  |
                v                                  v
+ - - - - - - - - - - - - - - - +  + - - - - - - - - - - - - - - - +
| VNF-A  +--------------+       |  | VNF-B  +--------------+       |
|        | Pool Manager |       |  |        | Pool Manager |       |
|        +--------------+       |  |        +--------------+       |
| + - - - - - - - - - - - - - + |  | + - - - - - - - - - - - - - + |
| |+---------+     +---------+| |  | |+---------+     +---------+| |
| || VNF-A#1 | ... | VNF-A#n || |  | || VNF-B#1 | ... | VNF-B#m || |
| |+---------+     +---------+| |  | |+---------+     +---------+| |
| |        VNF-A Pool         | |  | |        VNF-B Pool         | |
| + - - - - - - - - - - - - - + |  | + - - - - - - - - - - - - - + |
+ - - - - - - - - - - - - - - - +  + - - - - - - - - - - - - - - - +

                  Figure 4: VNF Pools Architecture.

In VNF Pools, each VNF has a VNF Pool containing a number of VNF instances (or VNF Pool Elements). Each VNF also has a Pool Manager that manages the instances in the VNF Pool. The Pool Manager interacts with the Service Control Entity to provide the network function. Similar to RSerPool and VRRP, the Pool Manager can provide either an identifier such as "vFW" or a virtual address representing the VNF to the Service Control Entity.

The benefit of VNF Pools is that the pooling mechanism, such as redundancy management, is internal to the VNF itself and not visible to the Service Control Entity. Moreover, the reliability capability of each VNF can be customized and provided to the Service Control Entity. The Service Control Entity simply talks to the Pool Manager representing each VNF to request and orchestrate the network functions with the desired reliability level.

We only focus on reliability mechanisms based on VNF Pools. In the following problem statement, we are specifically concerned with the reliability of a VNF instance as managed internally by the VNF. We are only concerned with the whole VNF set to the extent that it involves a reliability impact on, and the relevant coordination between, adjacent VNF instances of different VNFs. Other mechanisms such as VNF scaling and load balancing are out of scope, although they are probably complementary to reliability in order to provide network services.

5.  Problems

5.1.  Risk factors for unreliable instance

A VNF instance typically does not have built-in reliability mechanisms on its host (i.e., a general purpose server).
Instead, there are more risk factors that may make a VNF instance unreliable:

1.  Instance failure due to hardware failure or a status change such as server overload.

2.  Instance failure due to software failure at various levels, including the hypervisor, the Virtual Machine (VM), and the VNF.

3.  Instance migration triggered by performance degradation under load (e.g., CPU, memory, disk I/O), server consolidation, or other service requirement changes. This is distinct from a hard failure, although it may give the appearance of one.

5.2.  Redundancy model inside VNF

Before a live VNF instance fails, one or more backup instances in the same VNF pool need to be selected and advertised to adjacent entities such as the VNF instances in other VNF pools. Who is responsible for selecting and advertising such backup instances, and how is it done?

Moreover, there are policies that influence the appropriate selection of backup instances. For example, placing a live VNF instance and its backup instances on a single physical server, or in locations with shared risks in the network, should be avoided. On the other hand, it would be desirable to place the live and backup instances in geographically close locations. The VNF Pool Manager may need to collect information from the underlying network via, e.g., an interface with Application-Layer Traffic Optimization (ALTO) [ALTO] or Interface to the Routing System (I2RS) [I2RS].

5.3.  State synchronization inside VNF

Service states related to the specific function performed by a VNF instance (e.g., NAT translation table, TCP connection states) should be synchronized between a live VNF instance and its backup instances for stateful failover. Who is responsible for collecting, holding, and accessing such service states, and how can efficient synchronization be achieved?
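
As a rough illustration of the kind of synchronization in question, the following Python sketch pushes every change to a NAT translation table from a live instance to its backups, so that any backup holds identical state at failover. This document does not define such a mechanism: Python is used only for illustration, and the class, the function, the "vNAT" instance names, and the version counter are all assumptions made for the example.

```python
class PoolElement:
    """Hypothetical VNF Pool Element holding NAT translation state."""

    def __init__(self, name):
        self.name = name
        self.nat_table = {}   # internal endpoint -> external endpoint
        self.version = 0      # number of updates applied so far

    def apply(self, version, internal, external):
        # Accept updates strictly in order so that no change is skipped.
        if version != self.version + 1:
            raise ValueError(
                f"{self.name}: expected update {self.version + 1}, got {version}"
            )
        self.nat_table[internal] = external
        self.version = version


def replicate(live, backups, internal, external):
    """Apply one state change on the live element, then on every backup."""
    version = live.version + 1
    live.apply(version, internal, external)
    for backup in backups:
        backup.apply(version, internal, external)


# Illustrative names and documentation-range addresses only.
live = PoolElement("vNAT#1")
backups = [PoolElement("vNAT#2"), PoolElement("vNAT#3")]
replicate(live, backups, "10.0.0.5:1234", "192.0.2.1:40001")
replicate(live, backups, "10.0.0.6:5678", "192.0.2.1:40002")
```

This sketch synchronizes eagerly on every update, which is only one possible choice. Who performs the replication, and whether state is pushed, pulled, or held in a shared store, are exactly the open questions raised in this section.
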
A VNF instance should provide a negotiated level of state sharing with the necessary performance to fulfill the service requirements, covering, e.g., the state synchronization method, the format of state data, and the location of and mechanism to access state data.

5.4.  Reliability impact on adjacent VNFs

A VNF instance may need to learn the states of an adjacent VNF instance in another VNF pool before the adjacent VNF instance fails. Some critical states include performance degradation due to resource contention between instances, instance migration, and so on. Who is responsible for notifying such critical states, and how?

5.5.  Reliable transport

The transport mechanism used to carry the control messages dealing with reliability should provide reliable message delivery. Transport redundancy mechanisms such as Multipath TCP (MPTCP) [MPTCP] and the Stream Control Transmission Protocol (SCTP) [RFC3286] will need to be evaluated for applicability. Latency requirements for pool control message delivery must also be evaluated.

5.6.  Scope Considerations

Ideally, the reliability goal is that the network service provided by a VNF set continues throughout an interruption within the VNF set, and that VNF instance failure or migration is not visible to external entities.

The VNFPool work initially focuses on several reliability mechanisms, mainly the redundancy model and the stateful failover of a VNF instance, as well as the relevant coordination between adjacent VNF instances. Additional reliability mechanisms might be included after future gap analysis between identified requirements and existing IETF technologies. A detailed analysis of VNF reliability can be found in [NFV-REL].
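
To make the goal stated at the start of this section more concrete, the following Python sketch shows a Pool Manager that maps one stable identifier to whichever instance is currently live and promotes a backup on failure, so that a caller holding the identifier never sees the change. It is an illustration under stated assumptions only: the class and method names are hypothetical, instances are represented by plain strings, and "promote the first backup" stands in for whatever redundancy model a real design would use.

```python
class PoolManager:
    """Hypothetical Pool Manager for one VNF.

    External entities see only `identifier`; which element is live is
    managed internally.
    """

    def __init__(self, identifier, elements):
        if not elements:
            raise ValueError("a pool needs at least one element")
        self.identifier = identifier        # e.g. "vFW"
        self.live = elements[0]
        self.backups = list(elements[1:])

    def resolve(self):
        """Return the element currently serving the identifier."""
        return self.live

    def report_failure(self, element):
        """Handle a failed element; promote a backup if it was the live one."""
        if element == self.live:
            if not self.backups:
                raise RuntimeError(f"{self.identifier}: no backup left")
            # Assumption: simplest possible policy, first backup takes over.
            self.live = self.backups.pop(0)
        elif element in self.backups:
            self.backups.remove(element)


vfw = PoolManager("vFW", ["vFW#1", "vFW#2", "vFW#3"])
before = vfw.resolve()
vfw.report_failure("vFW#1")
after = vfw.resolve()
```

The identifier "vFW" is unchanged across the failover while the serving instance moves from vFW#1 to vFW#2. The sketch ignores backup placement policy and state transfer, which Sections 5.2 and 5.3 raise as separate problems.
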

VNFPool does not address every aspect of VNF instance management, such as scaling or load balancing, even though these aspects may be complementary to reliability in order to provide network services. VNFPool does not intend to resolve service availability, which usually involves more factors, including interruptions in various OSI layers and even user perception of service performance.

6.  Related Works

6.1.  Reliable Server Pool (RSerPool)

RSerPool supports high availability and scalability of applications through the use of pools of servers [RFC5351]. The main functions of RSerPool involve server pool management, as well as receiving requests from a client to bind to a desired server. The applicability of RSerPool to VNFPool, and the gaps between them, are described in a separate draft [VNFPOOL-RSP].

6.2.  Virtual Router Redundancy Protocol (VRRP)

VRRP specifies an election protocol that dynamically assigns responsibility for a virtual router to one of the VRRP routers on a LAN, called the Master [RFC5798]. The election process provides dynamic failover in the forwarding responsibility should the Master become unavailable. The advantage of VRRP is a higher-availability default path without requiring configuration of dynamic routing or router discovery protocols on every end-host.

6.3.  Service Function Chaining (SFC)

A service chain defines an ordered set of service functions that must be applied to packets [SFC]. Although a VNF set can be used as part of an SFC, SFC and VNFPool have different focuses. In particular, VNFPool focuses on the following key problem domains that are not covered by SFC: 1) the redundancy model inside a VNF, such as M:N protection where any instance can be used, and backup selection considering shared risks; 2) efficient state sharing among multiple VNF instances. We are specifically concerned with the reliability of a VNF instance as managed internally by a VNF.
We only deal with adjacencies in the VNF set in the context of coordination traffic that specifically deals with reliability, e.g., backup and state notification.

The reliability mechanisms in VNFPool are mostly internal to the VNF itself and not visible to the SFC control entity. The SFC control entity simply talks to the Pool Manager representing each VNF to request and orchestrate the network functions with the desired reliability level. VNFPool is not only used in the case of "chained VNFs", but is also applicable to other cases where the VNF instances have no specific topological constraint.

7.  Security Considerations

Any technology which allows the insertion, deletion, reordering, or manipulation of network functions has the potential to be subverted by an attacker, with serious consequences. Distributed VNFs introduce an additional attack vector, in which bad actors join several VNFs of a service. Replay attacks have the potential to cause denial of service or to reorder, add, or remove VNFs.

VNF reliability technologies must provide cryptographic protections against spoofing and insertion attacks as well as replay attacks, in the form of client authentication, origin authentication on VNF reliability management (control plane) traffic, and replay protections. There may be circumstances under which an attacker masquerading as a VNF manager can introduce data leakage or similar attacks, and consequently server authentication would be required as well.

Failing over a VNF or otherwise transferring service state raises issues related to the transfer of security state, including VNF element identity and credentials, session-associated cryptographic state, and so on. Where possible, transfer of security state should be avoided as a matter of good practice, and this will require particular attention as solutions are drafted.

8.  IANA Considerations

This document has no actions for IANA.

9.  Acknowledgements

The authors would like to thank Chidung Lac from Orange, Daniel King from Lancaster University, Lingli Deng and Zhen Cao from China Mobile, Richard Yang from Yale University, Hidetoshi Yokota from KDDI, Mukhtiar Shaikh from Brocade, and Susan Hares for their valuable comments.

10.  References

10.1.  Normative References

TBD.

10.2.  Informative References

[NFV-WP]  NFV Whitepaper: "Network Function Virtualization", issue 1, 2012, http://portal.etsi.org/NFV/NFV_White_Paper.pdf.

[SFC]  "Service Function Chaining (SFC)".

[NFV-TERM]  ETSI GS NFV 003: "Terminology for Main Conceptional Entities in NFV", Version 0.0.4, 2013.

[VNFPOOL-UC1]  L. Xia, Q. Wu, D. King, H. Yokota, and N. Khan, "Requirements and Use Cases for Virtual Network Functions", draft-xia-vnfpool-use-cases-00, February 2014.

[VNFPOOL-UC2]  D. King, M. Liebsch, P. Willis, and J. Ryoo, "Virtualization of Mobile Core Network Use Case", draft-king-vnfpool-mobile-use-case-00, February 2014.

[VNFPOOL-UC3]  S. Hares and K. Subramaniam, "Use Cases for Resource Pools with Virtual Network Functions (VNFs)", draft-hares-vnf-pool-use-case-00, January 2014.

[ALTO]  "Application-Layer Traffic Optimization (alto)".

[I2RS]  "Interface to the Routing System (i2rs)".

[MPTCP]  "Multipath TCP (mptcp)".

[RFC3286]  L. Ong and J. Yoakum, "An Introduction to the Stream Control Transmission Protocol (SCTP)", RFC 3286, May 2002.

[NFV-REL]  ETSI GS NFV REL 001: "Network Function Virtualization; Resiliency Requirements", Version 0.0.1, 2013.

[RFC5351]  P. Lei, L. Ong, M. Tuexen, and T. Dreibholz, "An Overview of Reliable Server Pooling Protocols", RFC 5351, September 2008.

[RFC5798]  S. Nadas, "Virtual Router Redundancy Protocol (VRRP) Version 3 for IPv4 and IPv6", RFC 5798, March 2010.

[VNFPOOL-RSP]  T. Dreibholz, M. Tuexen, M. Shore, and N. Zong, "The Applicability of Reliable Server Pooling (RSerPool) for Virtual Network Function Resource Pooling (VNFPOOL)", draft-dreibholz-vnfpool-rserpool-applic-00, October 2013.

Authors' Addresses

Ning Zong
Huawei Technologies

Email: zongning@huawei.com


Linda Dunbar
Huawei Technologies

Email: linda.dunbar@huawei.com


Melinda Shore
No Mountain Software

Email: melinda.shore@nomountain.net


Diego Lopez
Telefonica

Email: diego@tid.es