INTERNET-DRAFT X. Wei Intended Status: Informational Zhu Lei Expires: December 27, 2013 Huawei June 25, 2013 Tunnel Congestion Feedback draft-wei-tsvwg-tunnel-congestion-feedback-00 Abstract This document describes a mechanism to calculate congestion of a tunnel segment based on RFC 6040 recommendations, and a feedback protocol by which to send the measured congestion of the tunnel from egress to ingress router. A basic model for measuring tunnel congestion and feedback is described, and a protocol for carrying the feedback data is outlined. Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Copyright and License Notice Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Wei Expires December 27, 2013 [Page 1] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Conventions and Terminology . . . . . . . . . . . . . . . . . . 3 2.1 Conventions . . . . . . . . . . . . . . . . . . . . . . . . 3 2.2 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Problem Statement . . . . . . . . . . . . . . . . . . . . . . . 4 4. Congestion Model . . . . . . . . . . . . . . . . . . . . . . . 5 4.1 Congestion Calculation . . . . . . . . . . . . . . . . . . . 5 4.2 Congestion Feedback . . . . . . . . . . . . . . . . . . . . 7 5. Congestion Feedback Protocol . . . . . . . . . . . . . . . . . 9 5.1 Properties of Candidate Protocol . . . . . . . . . . . . . . 9 5.2 IPFIX Extensions for Congestion Feedback . . . . . . . . . . 9 5.3 Other Protocols . . . . . . . . . . . . . . . . . . . . . . 13 6. Security Considerations . . . . . . . . . . . . . . . . . . . . 14 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 14 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 14 8.1 Normative References . . . . . . . . . . . . . . . . . . . 14 8.2 Informative References . . . . . . . . . . . . . . . . . . 14 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 15 Wei Expires December 27, 2013 [Page 2] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 1. Introduction This document describes a mechanism for feedback of congestion observed in IP-in-IP tunnels (also referred to as IP tunnels).Common tunnel deployments such as mobile backhaul networks, VPNs and other IP-in-IP tunnels can be congested as a result of sustained high load. Network providers use a number of methods to deal with high load conditions including proper network dimensioning, policies for preferential flow treatment and selective offloading among others. The mechanism proposed in this draft is expected to complement them and provide congestion information that to allow making better, policies and decisions. The model and general solution proposed in chapters 4 consist of identifying congestion marks set in the tunnel segment, and feeding back the congestion signal from the egress to the ingress of the tunnel. Measuring congestion of a tunnel segment is based on counting outer packet CE marks for packets that have ECT marks in the inner packet. This proposal depends on statistical marking of congestion and uses the method described in RFC 6040, Appendix C. Chapter 5 describes the protocol mechanisms to feed back the calculated congestion from egress to ingress. The desired properties of a protocol are outlined and IPFIX as a candidate protocol for these extensions is explored further. 2. Conventions and Terminology 2.1 Conventions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119] 2.2 Terminology Tunnel: A channel over which encapsulated packets traverse across a network. Encapsulation: The process of adding control information as it passes through the layered model. Encapsulator: The tunnel endpoint function that adds an outer IP header to tunnel a packet, the encapsulator is considered as the "ingress" of the tunnel. Wei Expires December 27, 2013 [Page 3] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 Decapsulator: The tunnel endpoint function that removes an outer IP header from a tunnelled packet, the decapsulator is considered as the "egress" of the tunnel. Outer header: The header added to encapsulate a tunneled packet. Inner header: The header encapsulated by the outer header. E2E: End to End. VPN: Virtual Private Network is a technology for using the Internet or another intermediate network to connect computers to isolated remote computer networks that would otherwise be inaccessible. GRE: Generic Routing Encapsulation. IPFIX IP Flow Information Export. An IETF protocol to export flow information from routers and other devices. RED Random Early Detection 3. Problem Statement IP backhaul networks such as those of mobile networks are provisioned and managed to provide the subscribed levels of end user service. These networks are traffic engineered, and have defined mechanisms for providing differentiated services and QoS per user or flow. Policy to configure per user flow attributes in these networks have traditionally been based on monitoring and static configuration. Currently, these networks are increasingly used for applications that demand high bandwidth. The nature of the flows and length of end user sessions can lead to significant variability in aggregate bandwidth demands and latency. In such cases, it would be useful to have a more dynamic feedback of congestion information. This aggregate congestion feedback could be used to determine flow handling and admission control. Tunnels such as PMIP, GTP or other IP-in-IP maybe used to carry end user flows within the backhaul network such as shown in Figure 1. ECN handling mechanisms in RFC 6040 specifies how ECN should be handled for tunneling. In addition, RFC 6040, Appendix C provides guidance on how to calculate congestion experienced in the tunnel itself. However, there is no standardized mechanism by which the congestion information inside the tunnel can be fed back from egress to ingress Wei Expires December 27, 2013 [Page 4] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 router. In addition, it should be noted that these tunnels may carry ECT or Not-ECT traffic. A well defined mechanism for aggregate congestion calculation should be able to work in the presence of all kinds of traffic and would benefit from a common feedback mechanism and protocol. \|/ | | +-|---+ +------+ +------+ +--+ | | Tunnel1 | | Tunnel2 | | Ext |UE|-(RAN)-| eNB |===========| S-GW |=========| P-GW |-------- +--+ | | RAN | | Core | |Network +-+---+ Backhaul +---+--+ Network +---+--+ \ | / \ | / \ .-------------+----------+. / ( ) ( ) / IP \ ( Backhaul ) \ Network / ( ) ( ) .------------------------. Figure 1: Example - Mobile Network and Tunnels 4. Congestion Model To support traffic management and congestion feedback in tunnel, there are mainly two issues that this document discusses: calculation of congestion information in ECT and Not-ECT flows, and feeding back the congestion information from egress to ingress router. 4.1 Congestion Calculation Calculation of congestion in the tunnel is based on the method described in RFC 6040, Appendix C. The egress can calculate congestion using moving averages. The Wei Expires December 27, 2013 [Page 5] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 proportion of packets not marked in the inner header that have a CE marking in the outer header is considered to have experienced congestion in the tunnel. Note that the packets are ECN capable and not congestion-marked before tunnel. Since routers implementing RED randomly select a percentage of packets to mark, this method can be effectively used to expose congestion in the tunnel. When the ingress is RFC6040 compliant, the packets collected by egress can be divided into to 4 categories, shown in figure 2. The tag before "|" stands for ECN field in outer header; and the tag after "|" stands for ECN field in inner header. "Not-ECN|Not-ECN" indicates traffic that does not support ECN, for example UDP and Not-ECT marked TCP; "CE|CE" indicates ECN capable packets that have CE-mark before entering the tunnel; "CE|ECT" indicates ECN capable packets that are CE-marked in the tunnel; "ECT|ECT" indicates ECN capable packets that have not experienced congested in tunnel (or outside the tunnel). +--------------------------+ | Not-ECN|Not-ECN | +--------------------------+ | CE|CE | +--------------------------+ | CE|ECT | +--------------------------+ | ECT|ECT | +--------------------------+ Figure 2: ECN marking categories by outer/inner packet Out of the total number of packets, if the quantity of CE|ECT packets is A, the quantity of ECT|ECT packets is B, then the congestion level (C) can be calculated as follows: C=A/(A+B) As an example, consider 100 packets to calculate the moving average as shown in RFC 6040, Appendix C. Say that there are 12 packets that have CE|ECT marks indicating that they have experienced congestion in the tunnel. And, there are 58 packets that have ECT|ECT marks indicating that there was no congestion in either the tunnel or elsewhere. The egress can calculate congestions as: Wei Expires December 27, 2013 [Page 6] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 C = 12/ (12 + 58) = 12/70 (17% congestion) 4.2 Congestion Feedback The figure below introduces an abstract view of the tunnel and outlines a tunnel congestion feedback model. ,-----------. IP-in-IP TUNNEL ,-----------. | Ingress |========================================| Egress | | | | | | ,-----------Congestion-Feedback-Signals--------------. | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ,-----------. '\ | | | | | |____________| |__________| \ | | | |+----v----+| Outer Header(IP Layer) Data Flow \ |+----+----+| ||Collector|| | | \||Feedback || |+---------+| |(Congested)| /|+---------+| || Manager || | Router |Outer-CE-Signals--> Meter || |+---------+|____________| |________ / |+---------+| | | | | |/ | | | | `-----------' ' | | | |========================================| | `-----------' `-----------' Figure 3: Basic Feedback Model The basic model consists of the following components: Ingress, Egress, Feedback, Meter, Collector and Manager. At the egress, a module named Meter is used to estimate the congestion level of the tunnel as described in the section above. A congestion information feedback module, called Feedback, is used to control the congestion information feedback. The metering module (Meter) in the Egress node accounts the congestion marks it receives. The Feedback module calculates the amount of congestion and feeds back the congestion information to the Wei Expires December 27, 2013 [Page 7] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 Ingress node. The Collector at the Ingress receives the congestion information which is fed back from the Egress node. The Manager has admission control and flow control functions which are out of the scope of this document. It should be assumed that the ingress and egress of the tunnel are ECN-enabled and the intermediate routers in the tunnel path are also ECN-enabled. Congestion feedback signals in the figure are fed back using protocols described in section 5. Wei Expires December 27, 2013 [Page 8] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 5. Congestion Feedback Protocol 5.1 Properties of Candidate Protocol To feedback congestion efficiently there are some properties that are desirable in the feedback protocol. 1. Congestion friendliness. The feeding back traffics are coexistence with other traffics, so when congestion happens in the network, the feeding back traffic should be reduced, So that feedback itself will not congest the network further when the network is already getting congested. In other words, feedback frequency should adjust to network's congestion level. 2. Extensibility. The authors consider that using an existing protocol, or extensions to an existing protocol is preferable. The ability of a protocol to support modular extensions to report congestion level as feedback is a key attribute of the protocol under consideration. 3. Compactness. In different situations, there may be different congestion information to be conveyed, and in order to reduce network load, the information to be conveyed should be selectable, i.e. only the required information should be possible to convey. 5.2 IPFIX Extensions for Congestion Feedback This section outlines IPFIX extensions for feedback of congestion. The authors consider that IPFIX is a suitable protocol that is reasonably easy to extend to carry tunnel congestion reporting. Since IPFIX uses TCP transport, it has the foundation for congestion- friendly behavior. When congestion occurs in the network, the Exporter (Egress) can reduce the IPFIX traffic. Thus the feedback itself will not congest the network further when the network is already getting congested. When the Exporter detects network congestion, it can also reduce IPFIX traffic frequency to avoid more congestion in network while being able to sufficiently convey congestion status. Because the template mechanism in IPFIX is flexible, it allows the export of only the required information. Sending only the required information can also reduce network load. The basic procedure for feedback using IPFIX is as follows: (1)The exporter inform the collector how to interpret the IEs in Wei Expires December 27, 2013 [Page 9] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 IPFIX message using template. Collector just accepts template passively; which IEs to send is configured by other means that not included in IPFIX specification. (2)The exporter meters the traffic and sends the congestion level to collector. Congestion feedback using IPFIX is shown in the figures below. There are two variations to congestion feedback model using IPFIX. In the first one shown in Figure 4(a), congestion information is sent directly from egress to ingress and ingress makes decisions according this information. In the second case shown in Figure 4(b), congestion information is sent to a mediation controller instead of tunnel ingress; the controller is in charge of making decisions according to network congestion and control the behavior of ingress node, for example, reducing traffic or forbidding new traffic flows. In this model the congestion information from egress to controller is conveyed by IPFIX, but how controller controls the behavior of ingress is out of scope of this document. Wei Expires December 27, 2013 [Page 10] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 IPFIX |-----------------------------------------| | | | | | V +----------+ tunnel +-----------+ |Egress |========================== |Inress | |(Exporter)| |(Collector)| +----------+ +-----------+ (a) Direct Feedback. IPFIX +-----------+ --------->|Controller |##################### | |(Collector)| # | +-----------+ # | # +----------+ tunnel +-----V-+ |Egress | ===========================|Ingress| |(Exporter)| +-------+ +----------+ (b) Mediated Feedback. Figure 4: IPFIX Congestion Feedback Models To support feeding back congestion information, some extensions to the IPFIX protocol are necessary. A new IE conveying congestion level is defined for this purpose. Wei Expires December 27, 2013 [Page 11] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 Definition of new IE indicating congestion level. Description: The congestion level calculated by exporter. Abstract Data Type: unsigned8 Data Type Semantics: quantity ElementId: TBD. Status: current The example below shows how IPFIX can be used for congestion feedback (note: the information conveyed here may be incomplete). (1) Sending Template Set The exporter use Template Set to inform the collector how to interpret the IEs in the following Data Set. +------------------------+--------------------+ |Set ID=2 |Length=n | +------------------------+--------------------+ |Template ID=257 |Field Count=m | +------------------------+--------------------+ |exporterIPv4Address=130 |Field Length=4 | +------------------------+--------------------+ |collectorIPv4Address=211|Field Length=4 | +------------------------+--------------------+ |CongestionLevel=TBD1 |Field Length=2 | +---------------------------------------------+ |Enterprise Number=TBD2 | +---------------------------------------------+ (2) Sending Data Set The exporter meters the traffic and sends the congestion information to collector by Data Set. +------------------+-------------------+ |Set ID=257 |Length=n | +--------------------------------------+ |192.0.2.12 | +--------------------------------------+ |192.0.2.34 | +--------------------------------------+ |15 | +--------------------------------------+ Wei Expires December 27, 2013 [Page 12] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 +--------+ +---------+ |Exporter| |Collector| +--------+ +---------+ | | | | | (1)Sending Template Set | |------------------------------------->| | | +--------+ | |metering| | +--------+ | | (2)Sending Data Set | |------------------------------------->| | . | | . | | . | | | | | Figure 5: IPFIX Congestion Flow The Exporter can send congestion information periodically or when triggerd to the Collector. Before sending congestion data to the collector, the exporter sends a Template set to Collector. The Template set specifies the structure and semantics of the subsequent Data Set containing congestion-related information. The Collector understands the Data Sets that follow according to Template Set that was sent previously. The exporting Process transmits the Template Set in advance of any Data Sets that use that Template ID, to help ensure that the Collector has the Template Record before receiving the first Data Record. Data Records that correspond to a Template Record may appear in the same and/or subsequent IPFIX Message(s). The Exporter meters the traffic passing through it and generates flow records. At this point, the Exporter may cache the records and then send congestion cumulative information to the collector. When Exporter detects that the network is heavily congested, it can change the feedback frequency to avoid adding more congestion to network. When receiving congestion related information, the Collector will make decisions to control the traffic entering the tunnel to reduce tunnel congestion. 5.3 Other Protocols A thorough evaluation of other protocols have not been performed at this time. Wei Expires December 27, 2013 [Page 13] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 6. Security Considerations This document describes the tunnel congestion calculation and feedback. For feeding back congestion, security mechanisms of IPFIX are expected to be sufficient. No additional security concerns are expected. 7. IANA Considerations IANA assignment of parameters for IPFIX extension may need to be considered in this document. 8. References 8.1 Normative References [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC1776] Crocker, S., "The Address is the Message", RFC 1776, April 1 1995. [TRUTHS] Callon, R., "The Twelve Networking Truths", RFC 1925, April 1 1996. 8.2 Informative References [EVILBIT] Bellovin, S., "The Security Flag in the IPv4 Header", RFC 3514, April 1 2003. [RFC5513] Farrel, A., "IANA Considerations for Three Letter Acronyms", RFC 5513, April 1 2009. [RFC5514] Vyncke, E., "IPv6 over Social Networks", RFC 5514, April 1 2009. [RFC 5559] Eardley, P., "Pre-Congestion Notification (PCN) Architecture, RFC 5559, June 2009. [conex-mob]Kutscher et al, "Mobile Communication Congestion Exposure Wei Expires December 27, 2013 [Page 14] INTERNET DRAFT Tunnel Congestion Feedback June 25, 2013 Scenario", draft-ietf-conex-mobile-00, July 09, 2012. [TS23402] Architecture Enhancements for Non-3GPP Accesses, 3GPP TS 23.402, V11.2.0 (2012-03). Authors' Addresses Xinpeng Wei Huawei Building, Q20 No.156 Beiqing Rd.Z-park Haidian District, Beijing 100095 P. R. China E-mail: weixinpeng@huawei.com Zhu Lei Huawei Building, Q20 No.156 Beiqing Rd.Z-park Haidian District, Beijing 100095 P. R. China E-mail: lei.zhu@huawei.com Wei Expires December 27, 2013 [Page 15]