| rfc9330v7.txt | rfc9330.txt | |||
|---|---|---|---|---|
| skipping to change at line 186 ¶ | skipping to change at line 186 ¶ | |||
| It has been demonstrated that, if the sending host replaces a Classic | It has been demonstrated that, if the sending host replaces a Classic | |||
| congestion control with a 'Scalable' alternative, the performance | congestion control with a 'Scalable' alternative, the performance | |||
| under load of all the above interactive applications can be | under load of all the above interactive applications can be | |||
| significantly improved once a suitable AQM is deployed in the | significantly improved once a suitable AQM is deployed in the | |||
| network. Taking the example solution cited below that uses Data | network. Taking the example solution cited below that uses Data | |||
| Center TCP (DCTCP) [RFC8257] and a Dual-Queue Coupled AQM [RFC9332] | Center TCP (DCTCP) [RFC8257] and a Dual-Queue Coupled AQM [RFC9332] | |||
| on a DSL or Ethernet link, queuing delay under heavy load is roughly | on a DSL or Ethernet link, queuing delay under heavy load is roughly | |||
| 1-2 ms at the 99th percentile without losing link utilization | 1-2 ms at the 99th percentile without losing link utilization | |||
| [L4Seval22] [DualPI2Linux] (for other link types, see Section 6.3). | [L4Seval22] [DualPI2Linux] (for other link types, see Section 6.3). | |||
| This compares with 5-20 ms on _average_ with a Classic congestion | This compares with 5-20 ms on _average_ with a Classic congestion | |||
| control and current state-of-the-art AQMs, such as FQ-CoDel | control and current state-of-the-art AQMs, such as Flow Queue CoDel | |||
| [RFC8290], PIE [RFC8033], or DOCSIS PIE [RFC8034] and about 20-30 ms | [RFC8290], Proportional Integral controller Enhanced (PIE) [RFC8033], | |||
| at the 99th percentile [DualPI2Linux]. | or DOCSIS PIE [RFC8034] and about 20-30 ms at the 99th percentile | |||
| [DualPI2Linux]. | ||||
| L4S is designed for incremental deployment. It is possible to deploy | L4S is designed for incremental deployment. It is possible to deploy | |||
| the L4S service at a bottleneck link alongside the existing best | the L4S service at a bottleneck link alongside the existing best | |||
| efforts service [DualPI2Linux] so that unmodified applications can | efforts service [DualPI2Linux] so that unmodified applications can | |||
| start using it as soon as the sender's stack is updated. Access | start using it as soon as the sender's stack is updated. Access | |||
| networks are typically designed with one link as the bottleneck for | networks are typically designed with one link as the bottleneck for | |||
| each site (which might be a home, small enterprise, or mobile | each site (which might be a home, small enterprise, or mobile | |||
| device), so deployment at either or both ends of this link should | device), so deployment at either or both ends of this link should | |||
| give nearly all the benefit in the respective direction. With some | give nearly all the benefit in the respective direction. With some | |||
| transport protocols, namely TCP [ACCECN] and SCTP [RFC4960], the | transport protocols, namely TCP [ACCECN], the sender has to check | |||
| sender has to check that the receiver has been suitably updated to | that the receiver has been suitably updated to give more accurate | |||
| give more accurate feedback, whereas with more recent transport | feedback, whereas with more recent transport protocols, such as QUIC | |||
| protocols, such as QUIC [RFC9000] and DCCP [RFC4340], all receivers | [RFC9000] and Datagram Congestion Control Protocol (DCCP) [RFC4340], | |||
| have always been suitable. | all receivers have always been suitable. | |||
| This document presents the L4S architecture. It consists of three | This document presents the L4S architecture. It consists of three | |||
| components: network support to isolate L4S traffic from Classic | components: network support to isolate L4S traffic from Classic | |||
| traffic; protocol features that allow network elements to identify | traffic; protocol features that allow network elements to identify | |||
| L4S traffic; and host support for L4S congestion controls. The | L4S traffic; and host support for L4S congestion controls. The | |||
| protocol is defined separately in [RFC9331] as an experimental change | protocol is defined separately in [RFC9331] as an experimental change | |||
| to Explicit Congestion Notification (ECN). This document describes | to Explicit Congestion Notification (ECN). This document describes | |||
| and justifies the component parts and how they interact to provide | and justifies the component parts and how they interact to provide | |||
| the low latency, low loss, and scalable Internet service. It also | the low latency, low loss, and scalable Internet service. It also | |||
| details the approach to incremental deployment, as briefly summarized | details the approach to incremental deployment, as briefly summarized | |||
| skipping to change at line 285 ¶ | skipping to change at line 286 ¶ | |||
| utilization, whatever the flow rate, as well as ensuring that | utilization, whatever the flow rate, as well as ensuring that | |||
| high throughput is more robust to disturbances. The Scalable | high throughput is more robust to disturbances. The Scalable | |||
| control used most widely (in controlled environments) is DCTCP | control used most widely (in controlled environments) is DCTCP | |||
| [RFC8257], which has been implemented and deployed in Windows | [RFC8257], which has been implemented and deployed in Windows | |||
| Server Editions (since 2012), in Linux, and in FreeBSD. Although | Server Editions (since 2012), in Linux, and in FreeBSD. Although | |||
| DCTCP as-is functions well over wide-area round-trip times | DCTCP as-is functions well over wide-area round-trip times | |||
| (RTTs), most implementations lack certain safety features that | (RTTs), most implementations lack certain safety features that | |||
| would be necessary for use outside controlled environments, like | would be necessary for use outside controlled environments, like | |||
| data centres (see Section 6.4.3). Therefore, Scalable congestion | data centres (see Section 6.4.3). Therefore, Scalable congestion | |||
| control needs to be implemented in TCP and other transport | control needs to be implemented in TCP and other transport | |||
| protocols (QUIC, SCTP, RTP/RTCP, RTP Media Congestion Avoidance | protocols (QUIC, Stream Control Transmission Protocol (SCTP), | |||
| Techniques (RMCAT), etc.). Indeed, between the present document | RTP/RTCP, RTP Media Congestion Avoidance Techniques (RMCAT), | |||
| being drafted and published, the following Scalable congestion | etc.). Indeed, between the present document being drafted and | |||
| controls were implemented: TCP Prague [PragueLinux], QUIC Prague, | published, the following Scalable congestion controls were | |||
| implemented: Prague over TCP and QUIC [PRAGUE-CC] [PragueLinux], | ||||
| an L4S variant of the RMCAT SCReAM controller [SCReAM-L4S], and | an L4S variant of the RMCAT SCReAM controller [SCReAM-L4S], and | |||
| the L4S ECN part of BBRv2 [BBRv2] intended for TCP and QUIC | the L4S ECN part of Bottleneck Bandwidth and Round-trip | |||
| propagation time (BBRv2) [BBRv2] intended for TCP and QUIC | ||||
| transports. | transports. | |||
| 2) Network: | 2) Network: | |||
| L4S traffic needs to be isolated from the queuing latency of | L4S traffic needs to be isolated from the queuing latency of | |||
| Classic traffic. One queue per application flow (FQ) is one way | Classic traffic. One queue per application flow (FQ) is one way | |||
| to achieve this, e.g., FQ-CoDel [RFC8290]. However, using just | to achieve this, e.g., FQ-CoDel [RFC8290]. However, using just | |||
| two queues is sufficient and does not require inspection of | two queues is sufficient and does not require inspection of | |||
| transport layer headers in the network, which is not always | transport layer headers in the network, which is not always | |||
| possible (see Section 5.2). With just two queues, it might seem | possible (see Section 5.2). With just two queues, it might seem | |||
| skipping to change at line 345 ¶ | skipping to change at line 348 ¶ | |||
| negative impact on its flow rate [RFC5033]. The scaling problem | negative impact on its flow rate [RFC5033]. The scaling problem | |||
| with Classic congestion control is explained, with examples, in | with Classic congestion control is explained, with examples, in | |||
| Section 5.1 and in [RFC3649]. | Section 5.1 and in [RFC3649]. | |||
| Scalable Congestion Control: A congestion control where the average | Scalable Congestion Control: A congestion control where the average | |||
| time from one congestion signal to the next (the recovery time) | time from one congestion signal to the next (the recovery time) | |||
| remains invariant as flow rate scales, all other factors being | remains invariant as flow rate scales, all other factors being | |||
| equal. For instance, DCTCP averages 2 congestion signals per | equal. For instance, DCTCP averages 2 congestion signals per | |||
| round trip, whatever the flow rate, as do other recently developed | round trip, whatever the flow rate, as do other recently developed | |||
| Scalable congestion controls, e.g., Relentless TCP [RELENTLESS], | Scalable congestion controls, e.g., Relentless TCP [RELENTLESS], | |||
| TCP Prague [PRAGUE-CC] [PragueLinux], BBRv2 [BBRv2] [BBR-CC], and | Prague for TCP and QUIC [PRAGUE-CC] [PragueLinux], BBRv2 [BBRv2] | |||
| the L4S variant of SCReAM for real-time media [SCReAM-L4S] | [BBR-CC], and the L4S variant of SCReAM for real-time media | |||
| [RFC8298]. See Section 4.3 of [RFC9331] for more explanation. | [SCReAM-L4S] [RFC8298]. See Section 4.3 of [RFC9331] for more | |||
| explanation. | ||||
| Classic Service: The Classic service is intended for all the | Classic Service: The Classic service is intended for all the | |||
| congestion control behaviours that coexist with Reno [RFC5681] | congestion control behaviours that coexist with Reno [RFC5681] | |||
| (e.g., Reno itself, CUBIC [RFC8312], Compound [CTCP], and TFRC | (e.g., Reno itself, CUBIC [RFC8312], Compound [CTCP], and TFRC | |||
| [RFC5348]). The term 'Classic queue' means a queue providing the | [RFC5348]). The term 'Classic queue' means a queue providing the | |||
| Classic service. | Classic service. | |||
| Low Latency, Low Loss, and Scalable throughput (L4S) service: The | Low Latency, Low Loss, and Scalable throughput (L4S) service: The | |||
| 'L4S' service is intended for traffic from Scalable congestion | 'L4S' service is intended for traffic from Scalable congestion | |||
| control algorithms, such as the Prague congestion control | control algorithms, such as the Prague congestion control | |||
| skipping to change at line 781 ¶ | skipping to change at line 785 ¶ | |||
| clearly problematic for a congestion control to take multiple | clearly problematic for a congestion control to take multiple | |||
| seconds to recover from each congestion event. CUBIC [RFC8312] | seconds to recover from each congestion event. CUBIC [RFC8312] | |||
| was developed to be less unscalable, but it is approaching its | was developed to be less unscalable, but it is approaching its | |||
| scaling limit; with the same max RTT of 30 ms, at 120 Mb/s, CUBIC | scaling limit; with the same max RTT of 30 ms, at 120 Mb/s, CUBIC | |||
| is still fully in its Reno-friendly mode, so it takes about 4.3 s | is still fully in its Reno-friendly mode, so it takes about 4.3 s | |||
| to recover. However, once flow rate scales by 8 times again to | to recover. However, once flow rate scales by 8 times again to | |||
| 960 Mb/s it enters true CUBIC mode, with a recovery time of 12.2 | 960 Mb/s it enters true CUBIC mode, with a recovery time of 12.2 | |||
| s. From then on, each further scaling by 8 times doubles CUBIC's | s. From then on, each further scaling by 8 times doubles CUBIC's | |||
| recovery time (because the cube root of 8 is 2), e.g., at 7.68 Gb/ | recovery time (because the cube root of 8 is 2), e.g., at 7.68 Gb/ | |||
| s, the recovery time is 24.3 s. In contrast, a Scalable | s, the recovery time is 24.3 s. In contrast, a Scalable | |||
| congestion control like DCTCP or TCP Prague induces 2 congestion | congestion control like DCTCP or Prague induces 2 congestion | |||
| signals per round trip on average, which remains invariant for any | signals per round trip on average, which remains invariant for any | |||
| flow rate, keeping dynamic control very tight. | flow rate, keeping dynamic control very tight. | |||
| For a feel of where the global average lone-flow download sits on | For a feel of where the global average lone-flow download sits on | |||
| this scale at the time of writing (2021), according to [BDPdata], | this scale at the time of writing (2021), according to [BDPdata], | |||
| the global average fixed access capacity was 103 Mb/s in 2020 and | the global average fixed access capacity was 103 Mb/s in 2020 and | |||
| the average base RTT to a CDN was 25 to 34 ms in 2019. Averaging | the average base RTT to a CDN was 25 to 34 ms in 2019. Averaging | |||
| of per-country data was weighted by Internet user population (data | of per-country data was weighted by Internet user population (data | |||
| collected globally is necessarily of variable quality, but the | collected globally is necessarily of variable quality, but the | |||
| paper does double-check that the outcome compares well against a | paper does double-check that the outcome compares well against a | |||
| skipping to change at line 1373 ¶ | skipping to change at line 1377 ¶ | |||
| Three complementary approaches are in progress to address this issue, | Three complementary approaches are in progress to address this issue, | |||
| but they are all currently research: | but they are all currently research: | |||
| * In Prague congestion control, ignore certain losses deemed | * In Prague congestion control, ignore certain losses deemed | |||
| unlikely to be due to congestion (using some ideas from BBR | unlikely to be due to congestion (using some ideas from BBR | |||
| [BBR-CC] regarding isolated losses). This could mask any of the | [BBR-CC] regarding isolated losses). This could mask any of the | |||
| above types of loss while still coexisting with drop-based | above types of loss while still coexisting with drop-based | |||
| congestion controls. | congestion controls. | |||
| * A combination of RACK [RFC8985], L4S, and link retransmission | * A combination of Recent Acknowledgement (RACK) [RFC8985], L4S, and | |||
| without resequencing could repair transmission errors without the | link retransmission without resequencing could repair transmission | |||
| head of line blocking delay usually associated with link-layer | errors without the head of line blocking delay usually associated | |||
| retransmission [UnorderedLTE] [RFC9331]. | with link-layer retransmission [UnorderedLTE] [RFC9331]. | |||
| * Hybrid ECN/drop rate policers (see Section 8.3). | * Hybrid ECN/drop rate policers (see Section 8.3). | |||
| L4S deployment scenarios that minimize these issues (e.g., over | L4S deployment scenarios that minimize these issues (e.g., over | |||
| wireline networks) can proceed in parallel to this research, in the | wireline networks) can proceed in parallel to this research, in the | |||
| expectation that research success could continually widen L4S | expectation that research success could continually widen L4S | |||
| applicability. | applicability. | |||
| 6.4.4. L4S Flow but Classic ECN Bottleneck | 6.4.4. L4S Flow but Classic ECN Bottleneck | |||
| skipping to change at line 1851 ¶ | skipping to change at line 1855 ¶ | |||
| <https://doi.org/10.1145/3404868.3406669>. | <https://doi.org/10.1145/3404868.3406669>. | |||
| [NASA04] Bailey, R., Trey Arthur III, J., and S. Williams, "Latency | [NASA04] Bailey, R., Trey Arthur III, J., and S. Williams, "Latency | |||
| Requirements for Head-Worn Display S/EVS Applications", | Requirements for Head-Worn Display S/EVS Applications", | |||
| Proceedings of SPIE 5424, DOI 10.1117/12.554462, April | Proceedings of SPIE 5424, DOI 10.1117/12.554462, April | |||
| 2004, <https://ntrs.nasa.gov/api/citations/20120009198/ | 2004, <https://ntrs.nasa.gov/api/citations/20120009198/ | |||
| downloads/20120009198.pdf?attachment=true>. | downloads/20120009198.pdf?attachment=true>. | |||
| [NQB-PHB] White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | [NQB-PHB] White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | |||
| Behavior (NQB PHB) for Differentiated Services", Work in | Behavior (NQB PHB) for Differentiated Services", Work in | |||
| Progress, Internet-Draft, draft-ietf-tsvwg-nqb-14, 24 | Progress, Internet-Draft, draft-ietf-tsvwg-nqb-15, 11 | |||
| October 2022, <https://datatracker.ietf.org/doc/html/ | January 2023, <https://datatracker.ietf.org/doc/html/ | |||
| draft-ietf-tsvwg-nqb-14>. | draft-ietf-tsvwg-nqb-15>. | |||
| [PRAGUE-CC] | [PRAGUE-CC] | |||
| De Schepper, K., Tilmans, O., and B. Briscoe, Ed., "Prague | De Schepper, K., Tilmans, O., and B. Briscoe, Ed., "Prague | |||
| Congestion Control", Work in Progress, Internet-Draft, | Congestion Control", Work in Progress, Internet-Draft, | |||
| draft-briscoe-iccrg-prague-congestion-control-01, 11 July | draft-briscoe-iccrg-prague-congestion-control-01, 11 July | |||
| 2022, <https://datatracker.ietf.org/doc/html/draft- | 2022, <https://datatracker.ietf.org/doc/html/draft- | |||
| briscoe-iccrg-prague-congestion-control-01>. | briscoe-iccrg-prague-congestion-control-01>. | |||
| [PragueLinux] | [PragueLinux] | |||
| Briscoe, B., De Schepper, K., Albisser, O., Misund, J., | Briscoe, B., De Schepper, K., Albisser, O., Misund, J., | |||
| End of changes. 8 change blocks. | ||||
| 24 lines changed or deleted | 28 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. | ||||
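
The recovery-time figures quoted in the change block at line 781 (about 4.3 s at 120 Mb/s, 12.2 s at 960 Mb/s, 24.3 s at 7.68 Gb/s) can be checked with a short calculation. The following is a minimal sketch, not the authors' own method. It assumes 1500-byte packets, the CUBIC constants C = 0.4 and beta = 0.7 from [RFC8312], and, for the Reno-friendly case, an RTT that grows linearly with the window from 0.7 x 30 ms up to the 30 ms maximum. None of these assumptions is stated in the excerpt above, and all function names are illustrative.

```python
# Sketch: approximate recovery times for CUBIC and a Scalable control.
# All constants below are assumptions, except MAX_RTT, which is quoted in the text.
PACKET_BITS = 1500 * 8   # assumed packet size: 1500 bytes
BETA = 0.7               # CUBIC multiplicative decrease factor (RFC 8312)
C = 0.4                  # CUBIC scaling constant (RFC 8312)
MAX_RTT = 0.030          # seconds; "max RTT of 30 ms" from the text


def window_packets(rate_bps, rtt=MAX_RTT):
    """Window (in packets) needed to sustain rate_bps at the given RTT."""
    return rate_bps * rtt / PACKET_BITS


def reno_friendly_recovery(rate_bps):
    """Seconds for CUBIC in Reno-friendly mode to regain its pre-loss window."""
    w_max = window_packets(rate_bps)
    # After a loss the window drops to BETA * w_max, then grows by
    # 3 * (1 - BETA) / (1 + BETA) packets per round trip (RFC 8312).
    growth_per_round = 3 * (1 - BETA) / (1 + BETA)
    rounds = (1 - BETA) * w_max / growth_per_round
    # Assumption: RTT tracks the window, so it averages midway between
    # BETA * MAX_RTT and MAX_RTT over the recovery.
    mean_rtt = MAX_RTT * (1 + BETA) / 2
    return rounds * mean_rtt


def cubic_recovery(rate_bps):
    """Seconds for true CUBIC mode to return to w_max (the K term in RFC 8312)."""
    w_max = window_packets(rate_bps)
    return (w_max * (1 - BETA) / C) ** (1 / 3)


def scalable_signal_interval(rtt=MAX_RTT, signals_per_rtt=2):
    """Average seconds between congestion signals for a Scalable control."""
    return rtt / signals_per_rtt


if __name__ == "__main__":
    print(f"CUBIC (Reno-friendly) at 120 Mb/s: {reno_friendly_recovery(120e6):.1f} s")
    print(f"CUBIC at 960 Mb/s:  {cubic_recovery(960e6):.1f} s")
    print(f"CUBIC at 7.68 Gb/s: {cubic_recovery(7.68e9):.1f} s")
    print(f"Scalable control, any rate: {scalable_signal_interval() * 1e3:.0f} ms")
```

Under these assumptions the three CUBIC results round to 4.3 s, 12.2 s, and 24.3 s, matching the quoted text. The interval between signals for a Scalable control stays at half the RTT (15 ms here) whatever the flow rate.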