rfc9330.txt
skipping to change at line 186
It has been demonstrated that, if the sending host replaces a Classic
congestion control with a 'Scalable' alternative, the performance
under load of all the above interactive applications can be
significantly improved once a suitable AQM is deployed in the
network.  Taking the example solution cited below that uses Data
Center TCP (DCTCP) [RFC8257] and a Dual-Queue Coupled AQM [RFC9332]
on a DSL or Ethernet link, queuing delay under heavy load is roughly
1-2 ms at the 99th percentile without losing link utilization
[L4Seval22] [DualPI2Linux] (for other link types, see Section 6.3).
This compares with 5-20 ms on _average_ with a Classic congestion
control and current state-of-the-art AQMs, such as Flow Queue CoDel
[RFC8290], Proportional Integral controller Enhanced (PIE) [RFC8033],
or DOCSIS PIE [RFC8034], and about 20-30 ms at the 99th percentile
[DualPI2Linux].
L4S is designed for incremental deployment.  It is possible to deploy
the L4S service at a bottleneck link alongside the existing best
efforts service [DualPI2Linux] so that unmodified applications can
start using it as soon as the sender's stack is updated.  Access
networks are typically designed with one link as the bottleneck for
each site (which might be a home, small enterprise, or mobile
device), so deployment at either or both ends of this link should
give nearly all the benefit in the respective direction.  With some
transport protocols, namely TCP [ACCECN], the sender has to check
that the receiver has been suitably updated to give more accurate
feedback, whereas with more recent transport protocols, such as QUIC
[RFC9000] and Datagram Congestion Control Protocol (DCCP) [RFC4340],
all receivers have always been suitable.
This document presents the L4S architecture.  It consists of three
components: network support to isolate L4S traffic from Classic
traffic; protocol features that allow network elements to identify
L4S traffic; and host support for L4S congestion controls.  The
protocol is defined separately in [RFC9331] as an experimental change
to Explicit Congestion Notification (ECN).  This document describes
and justifies the component parts and how they interact to provide
the low latency, low loss, and scalable Internet service.  It also
details the approach to incremental deployment, as briefly summarized
skipping to change at line 286
utilization, whatever the flow rate, as well as ensuring that
high throughput is more robust to disturbances.  The Scalable
control used most widely (in controlled environments) is DCTCP
[RFC8257], which has been implemented and deployed in Windows
Server Editions (since 2012), in Linux, and in FreeBSD.  Although
DCTCP as-is functions well over wide-area round-trip times
(RTTs), most implementations lack certain safety features that
would be necessary for use outside controlled environments, like
data centres (see Section 6.4.3).  Therefore, Scalable congestion
control needs to be implemented in TCP and other transport
protocols (QUIC, Stream Control Transmission Protocol (SCTP),
RTP/RTCP, RTP Media Congestion Avoidance Techniques (RMCAT),
etc.).  Indeed, between the present document being drafted and
published, the following Scalable congestion controls were
implemented: Prague over TCP and QUIC [PRAGUE-CC] [PragueLinux],
an L4S variant of the RMCAT SCReAM controller [SCReAM-L4S], and
the L4S ECN part of Bottleneck Bandwidth and Round-trip
propagation time (BBRv2) [BBRv2] intended for TCP and QUIC
transports.
2) Network:
L4S traffic needs to be isolated from the queuing latency of
Classic traffic.  One queue per application flow (FQ) is one way
to achieve this, e.g., FQ-CoDel [RFC8290].  However, using just
two queues is sufficient and does not require inspection of
transport layer headers in the network, which is not always
possible (see Section 5.2).  With just two queues, it might seem
skipping to change at line 348
negative impact on its flow rate [RFC5033].  The scaling problem
with Classic congestion control is explained, with examples, in
Section 5.1 and in [RFC3649].
Scalable Congestion Control:  A congestion control where the average
time from one congestion signal to the next (the recovery time)
remains invariant as flow rate scales, all other factors being
equal.  For instance, DCTCP averages 2 congestion signals per
round trip, whatever the flow rate, as do other recently developed
Scalable congestion controls, e.g., Relentless TCP [RELENTLESS],
Prague for TCP and QUIC [PRAGUE-CC] [PragueLinux], BBRv2 [BBRv2]
[BBR-CC], and the L4S variant of SCReAM for real-time media
[SCReAM-L4S] [RFC8298].  See Section 4.3 of [RFC9331] for more
explanation.
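To illustrate this definition (the sketch below is not part of the RFC), the contrast with a Classic control can be shown numerically.  Assuming an idealized Reno model (the window halves on each congestion signal, then regrows by one segment per round trip) and a 1500-byte MSS, Reno's recovery time grows in proportion to flow rate, whereas a Scalable control's signal spacing stays at roughly half a round trip at any rate:

```python
def reno_recovery_time(rate_bps, rtt_s, mss_bytes=1500):
    """Idealized Classic Reno: the window W (in segments) halves on a
    congestion signal, then regrows by 1 segment per RTT, so recovery
    takes W/2 round trips -- a time that grows with flow rate."""
    w = rate_bps * rtt_s / (8 * mss_bytes)  # window in segments (the BDP)
    return (w / 2) * rtt_s                  # seconds

def scalable_recovery_time(rtt_s, signals_per_rtt=2):
    """Scalable control (e.g., DCTCP): ~2 congestion signals per round
    trip at any flow rate, so signal spacing stays at RTT/2."""
    return rtt_s / signals_per_rtt

rtt = 0.03  # 30 ms base RTT, as in the scaling example later in the text
for rate in (1e6, 100e6, 10e9):
    print(f"{rate / 1e6:8.0f} Mb/s: Reno recovers in "
          f"{reno_recovery_time(rate, rtt):8.2f} s; "
          f"Scalable signal spacing {scalable_recovery_time(rtt) * 1e3:.0f} ms")
```

At 10 Gb/s the idealized Reno recovery time runs to minutes, while the Scalable spacing is still 15 ms, which is the invariance this definition captures.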
Classic Service:  The Classic service is intended for all the
congestion control behaviours that coexist with Reno [RFC5681]
(e.g., Reno itself, CUBIC [RFC8312], Compound [CTCP], and TFRC
[RFC5348]).  The term 'Classic queue' means a queue providing the
Classic service.
Low Latency, Low Loss, and Scalable throughput (L4S) service:  The
'L4S' service is intended for traffic from Scalable congestion
control algorithms, such as the Prague congestion control
skipping to change at line 785
clearly problematic for a congestion control to take multiple
seconds to recover from each congestion event.  CUBIC [RFC8312]
was developed to be less unscalable, but it is approaching its
scaling limit; with the same max RTT of 30 ms, at 120 Mb/s, CUBIC
is still fully in its Reno-friendly mode, so it takes about 4.3 s
to recover.  However, once flow rate scales by 8 times again to
960 Mb/s, it enters true CUBIC mode, with a recovery time of
12.2 s.  From then on, each further scaling by 8 times doubles
CUBIC's recovery time (because the cube root of 8 is 2), e.g., at
7.68 Gb/s, the recovery time is 24.3 s.  In contrast, a Scalable
congestion control like DCTCP or Prague induces 2 congestion
signals per round trip on average, which remains invariant for any
flow rate, keeping dynamic control very tight.
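The true-CUBIC figures above can be reproduced from CUBIC's window curve W(t) = C*(t - K)^3 + Wmax with the standard constants C = 0.4 and beta = 0.7 [RFC8312]: the window regains its pre-loss value Wmax at t = K = cbrt(Wmax*(1 - beta)/C).  A quick sketch (not from the RFC; it assumes 1500-byte segments and takes Wmax as the bandwidth-delay product):

```python
def cubic_recovery_time(rate_bps, rtt_s, mss_bytes=1500, C=0.4, beta=0.7):
    """Time for true-mode CUBIC to regain its pre-loss window Wmax.
    From W(t) = C*(t - K)^3 + Wmax [RFC 8312], the window returns to
    Wmax at t = K = cbrt(Wmax*(1 - beta)/C), with Wmax in segments
    and K in seconds (C = 0.4, beta = 0.7 are the RFC 8312 defaults)."""
    w_max = rate_bps * rtt_s / (8 * mss_bytes)  # pre-loss window = BDP
    return (w_max * (1 - beta) / C) ** (1 / 3)

print(cubic_recovery_time(960e6, 0.03))   # ~12.2 s, matching the text
print(cubic_recovery_time(7.68e9, 0.03))  # ~24.3 s: 8x the rate, 2x the time
```

The doubling per 8-fold rate increase falls straight out of the cube root: scaling Wmax by 8 scales K by cbrt(8) = 2.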
For a feel of where the global average lone-flow download sits on
this scale at the time of writing (2021), according to [BDPdata],
the global average fixed access capacity was 103 Mb/s in 2020 and
the average base RTT to a CDN was 25 to 34 ms in 2019.  Averaging
of per-country data was weighted by Internet user population (data
collected globally is necessarily of variable quality, but the
paper does double-check that the outcome compares well against a
skipping to change at line 1377
Three complementary approaches are in progress to address this issue,
but they are all currently research:

*  In Prague congestion control, ignore certain losses deemed
   unlikely to be due to congestion (using some ideas from BBR
   [BBR-CC] regarding isolated losses).  This could mask any of the
   above types of loss while still coexisting with drop-based
   congestion controls.

*  A combination of Recent Acknowledgement (RACK) [RFC8985], L4S, and
   link retransmission without resequencing could repair transmission
   errors without the head of line blocking delay usually associated
   with link-layer retransmission [UnorderedLTE] [RFC9331].

*  Hybrid ECN/drop rate policers (see Section 8.3).

L4S deployment scenarios that minimize these issues (e.g., over
wireline networks) can proceed in parallel to this research, in the
expectation that research success could continually widen L4S
applicability.
6.4.4.  L4S Flow but Classic ECN Bottleneck
skipping to change at line 1855
           <https://doi.org/10.1145/3404868.3406669>.

[NASA04]   Bailey, R., Trey Arthur III, J., and S. Williams, "Latency
           Requirements for Head-Worn Display S/EVS Applications",
           Proceedings of SPIE 5424, DOI 10.1117/12.554462, April
           2004, <https://ntrs.nasa.gov/api/citations/20120009198/
           downloads/20120009198.pdf?attachment=true>.

[NQB-PHB]  White, G. and T. Fossati, "A Non-Queue-Building Per-Hop
           Behavior (NQB PHB) for Differentiated Services", Work in
           Progress, Internet-Draft, draft-ietf-tsvwg-nqb-15, 11
           January 2023, <https://datatracker.ietf.org/doc/html/
           draft-ietf-tsvwg-nqb-15>.

[PRAGUE-CC]
           De Schepper, K., Tilmans, O., and B. Briscoe, Ed., "Prague
           Congestion Control", Work in Progress, Internet-Draft,
           draft-briscoe-iccrg-prague-congestion-control-01, 11 July
           2022, <https://datatracker.ietf.org/doc/html/draft-
           briscoe-iccrg-prague-congestion-control-01>.

[PragueLinux]
           Briscoe, B., De Schepper, K., Albisser, O., Misund, J.,