rfc9840v1.txt   rfc9840.txt 
Internet Research Task Force (IRTF) M. Bagnulo Internet Research Task Force (IRTF) M. Bagnulo
Request for Comments: 9840 A. Garcia-Martinez Request for Comments: 9840 A. García-Martínez
Category: Experimental Universidad Carlos III de Madrid Category: Experimental Universidad Carlos III de Madrid
ISSN: 2070-1721 G. Montenegro ISSN: 2070-1721 G. Montenegro
P. Balasubramanian P. Balasubramanian
Confluent Confluent
August 2025 September 2025
rLEDBAT: Receiver-Driven Low Extra Delay Background Transport for TCP rLEDBAT: Receiver-Driven Low Extra Delay Background Transport for TCP
Abstract Abstract
This document specifies receiver-driven Low Extra Delay Background This document specifies receiver-driven Low Extra Delay Background
Transport (rLEDBAT) -- a set of mechanisms that enable the execution Transport (rLEDBAT) -- a set of mechanisms that enable the execution
of a less-than-best-effort congestion control algorithm for TCP at of a less-than-best-effort congestion control algorithm for TCP at
the receiver end. This document is a product of the Internet the receiver end. This document is a product of the Internet
Congestion Control Research Group (ICCRG) of the Internet Research Congestion Control Research Group (ICCRG) of the Internet Research
skipping to change at line 150 skipping to change at line 150
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
We use the following abbreviations throughout the text and include We use the following abbreviations throughout the text and include
them here for the reader's convenience: them here for the reader's convenience:
RCV.WND: The value included in the Receive Window field of the TCP RCV.WND: The value included in the Receive Window field of the TCP
header (which computation is modified by this specification). header (the computation of which is modified by its
specification).
SND.WND: The TCP sender's window. SND.WND: The TCP sender's window.
cwnd: The congestion window as computed by the congestion control cwnd: The congestion window as computed by the congestion control
algorithm running at the TCP sender. algorithm running at the TCP sender.
RLWND: The window value calculated by the rLEDBAT algorithm. RLWND: The window value calculated by the rLEDBAT algorithm.
fcwnd: The value that a standard RFC793bis TCP receiver calculates fcwnd: The value that a standard-TCP receiver compliant with
to set in the receive window for flow control purposes. [RFC9293] calculates to set in the receive window for flow control
purposes.
RCV.HGH: The highest sequence number corresponding to a received RCV.HGH: The highest sequence number corresponding to a received
byte of data at one point in time. byte of data at one point in time.
TSV.HGH: The Timestamp Value (TSval) [RFC7323] corresponding to the TSV.HGH: The Timestamp Value (TSval) [RFC7323] corresponding to the
segment in which RCV.HGH was carried at that point in time. segment in which RCV.HGH was carried at that point in time.
SEG.SEQ: The sequence number of the last received segment. SEG.SEQ: The sequence number of the last received segment.
TSV.SEQ: The TSval value of the last received segment. TSV.SEQ: The TSval of the last received segment.
3. Motivations for rLEDBAT 3. Motivations for rLEDBAT
rLEDBAT enables new use cases and new deployment models, fostering rLEDBAT enables new use cases and new deployment models, fostering
the use of LBE traffic. The following scenarios are enabled by the use of LBE traffic. The following scenarios are enabled by
rLEDBAT: rLEDBAT:
Content Delivery Networks (CDNs) and more sophisticated file Content Delivery Networks (CDNs) and more sophisticated file
distribution scenarios: distribution scenarios:
Consider the case where the source of a file to be distributed Consider the case where the source of a file to be distributed
skipping to change at line 231 skipping to change at line 233
user. rLEDBAT enables the mobile device to selectively use an LBE user. rLEDBAT enables the mobile device to selectively use an LBE
traffic class for some of the incoming traffic. For instance, by traffic class for some of the incoming traffic. For instance, by
using rLEDBAT, a user can use regular standard-TCP/UDP for a video using rLEDBAT, a user can use regular standard-TCP/UDP for a video
stream (e.g., YouTube) and use rLEDBAT for other background file stream (e.g., YouTube) and use rLEDBAT for other background file
downloads. downloads.
4. rLEDBAT Mechanisms 4. rLEDBAT Mechanisms
rLEDBAT provides the mechanisms to implement an LBE congestion rLEDBAT provides the mechanisms to implement an LBE congestion
control algorithm at the receiver end of a TCP connection. The control algorithm at the receiver end of a TCP connection. The
rLEDBAT receiver controls the sender's rate through the Receive rLEDBAT receiver controls the sender's rate through the receive
Window announced by the receiver in the TCP header. window announced by the receiver in the TCP header.
rLEDBAT assumes that the sender is a standard TCP sender. rLEDBAT rLEDBAT assumes that the sender is a standard-TCP sender. rLEDBAT
does not require any rLEDBAT-specific modifications to the TCP does not require any rLEDBAT-specific modifications to the TCP
sender. The envisioned deployment model for rLEDBAT is that the sender. The envisioned deployment model for rLEDBAT is that the
clients implement rLEDBAT and this enables rLEDBAT in communications clients implement rLEDBAT and this enables rLEDBAT in communications
with existing standard TCP senders. In particular, the sender MUST with existing standard-TCP senders. In particular, the sender MUST
implement [RFC9293] and also MUST implement the TCP Timestamps (TS) implement [RFC9293] and also MUST implement the TCP Timestamps (TS)
option as defined in [RFC7323]. Also, the sender should implement option as defined in [RFC7323]. Also, the sender should implement
some of the standard congestion control mechanisms, such as CUBIC some of the standard congestion control mechanisms, such as CUBIC
[RFC9438] or NewReno [RFC5681]. [RFC9438] or NewReno [RFC5681] [RFC6582].
rLEDBAT does not define a new congestion control algorithm. The LBE rLEDBAT does not define a new congestion control algorithm. The LBE
congestion control algorithm executed in the rLEDBAT receiver is congestion control algorithm executed in the rLEDBAT receiver is
defined in other documents. The rLEDBAT receiver MUST use an LBE defined in other documents. The rLEDBAT receiver MUST use an LBE
congestion control algorithm. Because rLEDBAT assumes a standard TCP congestion control algorithm. Because rLEDBAT assumes a standard-TCP
sender, the sender will be using a "best effort" congestion control sender, the sender will be using a "best effort" congestion control
algorithm (such as CUBIC or NewReno). Since rLEDBAT uses the Receive algorithm (such as CUBIC or NewReno). Since rLEDBAT uses the receive
Window to control the sender's rate and the sender calculates the window to control the sender's rate and the sender calculates the
sender's window as the minimum of the Receive window and the sender's window as the minimum of the receive window and the
congestion window, rLEDBAT will only be effective as long as the congestion window, rLEDBAT will only be effective as long as the
congestion control algorithm executed in the receiver yields a congestion control algorithm executed in the receiver yields a
smaller window than the one calculated by the sender. This is smaller window than the one calculated by the sender. This is
normally the case when the receiver is using an LBE congestion normally the case when the receiver is using an LBE congestion
control algorithm. The rLEDBAT receiver SHOULD use the LEDBAT control algorithm. The rLEDBAT receiver SHOULD use the LEDBAT
congestion control algorithm [RFC6817] or the LEDBAT++ congestion congestion control algorithm [RFC6817] or the LEDBAT++ congestion
control algorithm [LEDBAT++]. The rLEDBAT MAY use other LBE control algorithm [LEDBAT++]. rLEDBAT MAY use other LBE congestion
congestion control algorithms defined elsewhere. Irrespective of control algorithms defined elsewhere. Irrespective of which
which congestion control algorithm is executed in the receiver, an congestion control algorithm is executed in the receiver, a rLEDBAT
rLEDBAT connection will never be more aggressive than standard-TCP, connection will never be more aggressive than standard-TCP, since it
since it is always bounded by the congestion control algorithm is always bounded by the congestion control algorithm executed at the
executed at the sender. sender.
rLEDBAT is essentially composed of three types of mechanisms, namely rLEDBAT is essentially composed of three types of mechanisms, namely
those that provide the means to measure the packet delay (either the those that provide the means to measure the packet delay (either the
RTT or the one-way delay, depending on the selected algorithm), RTT or the one-way delay, depending on the selected algorithm),
mechanisms to detect packet loss, and the means to manipulate the mechanisms to detect packet loss, and the means to manipulate the
Receive Window to control the sender's rate. The first two provide receive window to control the sender's rate. The first two provide
input to the LBE congestion control algorithm, while the third uses input to the LBE congestion control algorithm, while the third uses
the congestion window computed by the LBE congestion control the congestion window computed by the LBE congestion control
algorithm to manipulate the Receive window, as depicted in Figure 1. algorithm to manipulate the receive window, as depicted in Figure 1.
+------------------------------------------+ +------------------------------------------+
| TCP Receiver | | TCP Receiver |
| +-----------------+ | | +-----------------+ |
| | +------------+ | | | | +------------+ | |
| +---------------------| RTT | | | | +---------------------| RTT | | |
| | | | Estimation | | | | | | | Estimation | | |
| | | +------------+ | | | | | +------------+ | |
| | | | | | | | | |
| | | +------------+ | | | | | +------------+ | |
skipping to change at line 304 skipping to change at line 306
| | +------------+ | | | | +------------+ | |
| +-----------------+ | | +-----------------+ |
+------------------------------------------+ +------------------------------------------+
Figure 1: The rLEDBAT Architecture Figure 1: The rLEDBAT Architecture
We next describe each of the rLEDBAT components. We next describe each of the rLEDBAT components.
4.1. Controlling the Receive Window 4.1. Controlling the Receive Window
rLEDBAT uses the TCP Receive Window (RCV.WND) to enable the receiver rLEDBAT uses the TCP receive window (RCV.WND) to enable the receiver
to control the sender's rate. [RFC9293] specifies that the RCV.WND to control the sender's rate. [RFC9293] specifies that the RCV.WND
is used to announce the available receive buffer to the sender for is used to announce the available receive buffer to the sender for
flow control purposes. In order to avoid confusion, we will call flow control purposes. In order to avoid confusion, we will call
fcwnd the value that a standard RFC793bis TCP receiver calculates to fcwnd the value that a standard-TCP receiver compliant with [RFC9293]
set in the receive window for flow control purposes. We call RLWND calculates to set in the receive window for flow control purposes.
the window value calculated by the rLEDBAT algorithm, and we call We call RLWND the window value calculated by the rLEDBAT algorithm,
RCV.WND the value actually included in the Receive Window field of and we call RCV.WND the value actually included in the Receive Window
the TCP header. For an RFC793bis receiver, RCV.WND == fcwnd. field of the TCP header. For a receiver compliant with [RFC9293],
RCV.WND == fcwnd.
In the case of the rLEDBAT receiver, this receiver MUST NOT set the In the case of the rLEDBAT receiver, this receiver MUST NOT set the
RCV.WND to a value larger than fcwnd and SHOULD set the RCV.WND to RCV.WND to a value larger than fcwnd and SHOULD set the RCV.WND to
the minimum of RLWND and fcwnd, honoring both. the minimum of RLWND and fcwnd, honoring both.
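As an illustration only, the rule above can be sketched as a small helper. The function name advertised_window and the use of plain byte counts are assumptions made for this example; they are not part of the specification.

```python
def advertised_window(rlwnd: int, fcwnd: int) -> int:
    """Return the value to place in RCV.WND (in bytes).

    rlwnd: window computed by the LBE congestion controller (RLWND).
    fcwnd: window a standard receiver would announce for flow control.
    The result never exceeds fcwnd and honors both limits.
    """
    if rlwnd < 0 or fcwnd < 0:
        raise ValueError("window sizes must be non-negative")
    return min(rlwnd, fcwnd)
```

With this sketch, a receiver whose buffer allows 65535 bytes but whose LBE controller allows only 30000 bytes would announce 30000 bytes; if the buffer is the tighter limit, the buffer wins.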
When using rLEDBAT, two congestion controllers are in action in the When using rLEDBAT, two congestion controllers are in action in the
flow of data from the sender to the receiver, namely the TCP flow of data from the sender to the receiver, namely the TCP
congestion control algorithm on the sender side and the LBE congestion control algorithm on the sender side and the LBE
congestion control algorithm executed in the receiver and conveyed to congestion control algorithm executed in the receiver and conveyed to
the sender through the RCV.WND. In the normal TCP operation, the the sender through the RCV.WND. In the normal TCP operation, the
skipping to change at line 355 skipping to change at line 358
increases or decreases RLWND according to congestion signals increases or decreases RLWND according to congestion signals
(variations in the estimated queuing delay and packet loss). If (variations in the estimated queuing delay and packet loss). If
RLWND is decreased and directly announced in RCV.WND, this could lead RLWND is decreased and directly announced in RCV.WND, this could lead
to an announced window that is smaller than what is currently in use. to an announced window that is smaller than what is currently in use.
This so-called "shrinking the window" is discouraged as per This so-called "shrinking the window" is discouraged as per
[RFC9293], as it may cause unnecessary packet loss and performance [RFC9293], as it may cause unnecessary packet loss and performance
penalties. To be consistent with [RFC9293], the rLEDBAT receiver penalties. To be consistent with [RFC9293], the rLEDBAT receiver
SHOULD NOT shrink the receive window. SHOULD NOT shrink the receive window.
In order to avoid window shrinking, the receiver MUST only reduce In order to avoid window shrinking, the receiver MUST only reduce
RCV.WND by the number of bytes upon of a received data packet. This RCV.WND by the number of bytes contained in a received data packet.
may fall short to honor the new calculated value of the RLWND This may fall short to honor the new calculated value of the RLWND
immediately. However, the receiver SHOULD progressively reduce the immediately. However, the receiver SHOULD progressively reduce the
advertised RCV.WND, always honoring that the reduction is less than advertised RCV.WND, always honoring that the reduction is less than
or equal to the received bytes, until the target window determined by or equal to the received bytes, until the target window determined by
the rLEDBAT algorithm is reached. This implies that it may take up the rLEDBAT algorithm is reached. This implies that it may take up
to one RTT for the rLEDBAT receiver to drain enough in-flight bytes to one RTT for the rLEDBAT receiver to drain enough in-flight bytes
to completely close its receive window without shrinking it. This is to completely close its receive window without shrinking it. This is
sufficient to honor the window output from the LEDBAT/LEDBAT++ sufficient to honor the window output from the LEDBAT/LEDBAT++
algorithms, since they only allow to perform at most one algorithms, since they are only allowed to perform at most one
multiplicative decrease per RTT. multiplicative decrease per RTT.
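The progressive reduction described above might be sketched as follows. This is a simplified model under stated assumptions: windows are plain byte counts, each call corresponds to one received in-order data packet, and the name next_rcv_wnd is hypothetical.

```python
def next_rcv_wnd(current_wnd: int, target_wnd: int, received_bytes: int) -> int:
    """Compute the next RCV.WND without shrinking the window.

    current_wnd:    the RCV.WND most recently announced.
    target_wnd:     the window the rLEDBAT algorithm wants to reach.
    received_bytes: payload bytes in the data packet just received.
    """
    if target_wnd >= current_wnd:
        # Growing (or holding) the window never shrinks it.
        return target_wnd
    # Reduce by at most the bytes just received, stopping at the target.
    return max(target_wnd, current_wnd - received_bytes)


# Usage: draining from 10000 to 4000 bytes with 1460-byte segments.
wnd = 10000
steps = []
while wnd > 4000:
    wnd = next_rcv_wnd(wnd, 4000, 1460)
    steps.append(wnd)
# steps is [8540, 7080, 5620, 4160, 4000]
```

Each announced value drops by no more than the payload just received, so the right edge of the window never moves backward in this model.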
4.1.2. Setting the Window Scale Option 4.1.2. Setting the Window Scale Option
The Window Scale (WS) option [RFC7323] is a means to increase the The Window Scale (WS) option [RFC7323] is a means to increase the
maximum window size permitted by the Receive Window. The WS option maximum window size permitted by the receive window. The WS option
defines a scale factor that restricts the granularity of the receive defines a scale factor that restricts the granularity of the receive
window that can be announced. This means that the rLEDBAT client window that can be announced. This means that the rLEDBAT client
will have to accumulate the increases resulting from multiple will have to accumulate the increases resulting from multiple
received packets and only convey a change in the window when the received packets and only convey a change in the window when the
accumulated sum of increases is equal to or higher than one increase accumulated sum of increases is equal to or higher than one increase
step as imposed by the scaling factor according to the WS option in step as imposed by the scaling factor according to the WS option in
place for the TCP connection. place for the TCP connection.
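A minimal sketch of this accumulation, assuming a negotiated shift count between 0 and 14 as allowed by [RFC7323]; the class name ScaledWindowAnnouncer and its methods are hypothetical.

```python
class ScaledWindowAnnouncer:
    """Accumulate window increases until they fill one scaled step."""

    def __init__(self, shift: int, initial_wnd: int) -> None:
        if not 0 <= shift <= 14:
            raise ValueError("window scale shift must be in 0..14")
        self.unit = 1 << shift  # granularity of the announced window, in bytes
        # Round the starting window down to a representable value.
        self.announced = (initial_wnd // self.unit) * self.unit
        self.pending = 0  # increases not yet conveyed to the sender

    def add_increase(self, nbytes: int) -> int:
        """Record an increase and return the window to announce."""
        self.pending += nbytes
        steps = self.pending // self.unit
        self.announced += steps * self.unit
        self.pending -= steps * self.unit
        return self.announced
```

For example, with a shift of 7 (steps of 128 bytes), three successive increases of 50 bytes change the announced window only on the third call, and 22 bytes remain pending.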
Changes in the receive window that are smaller than 1 MSS (Maximum Changes in the receive window that are smaller than 1 MSS (Maximum
Segment Size) are unlikely to have any immediate impact on the Segment Size) are unlikely to have any immediate impact on the
skipping to change at line 431 skipping to change at line 434
LEDBAT++ discovers the base RTT (RTTb) by taking the minimum value of LEDBAT++ discovers the base RTT (RTTb) by taking the minimum value of
the measured RTTs over a period of time. The current RTT (RTTc) is the measured RTTs over a period of time. The current RTT (RTTc) is
estimated using a number of recent samples and applying a filter, estimated using a number of recent samples and applying a filter,
such as the minimum (or the mean) of the last k samples. Using RTT such as the minimum (or the mean) of the last k samples. Using RTT
to estimate the queuing delay has a number of shortcomings and to estimate the queuing delay has a number of shortcomings and
difficulties, as discussed below. difficulties, as discussed below.
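The following sketch shows one way such an estimator could be structured. It assumes that the queuing delay is taken as RTTc minus RTTb, uses a MIN filter over the last k samples, and bounds the base history by a sample count; the class name and the default values of k and base_window are illustrative choices, not values taken from this document.

```python
from collections import deque


class QueuingDelayEstimator:
    """Estimate queuing delay from RTT samples (any consistent time unit)."""

    def __init__(self, k: int = 4, base_window: int = 100) -> None:
        self.recent = deque(maxlen=k)              # last k samples, for RTTc
        self.history = deque(maxlen=base_window)   # longer history, for RTTb

    def add_sample(self, rtt: float) -> None:
        self.recent.append(rtt)
        self.history.append(rtt)

    def base_rtt(self) -> float:
        """RTTb: minimum over the retained history."""
        return min(self.history)

    def current_rtt(self) -> float:
        """RTTc: MIN filter over the last k samples."""
        return min(self.recent)

    def queuing_delay(self) -> float:
        return self.current_rtt() - self.base_rtt()
```

Feeding the samples 50, 52, 80, 75, 90, 85 (milliseconds) with k = 4 gives RTTb = 50, RTTc = 75, and an estimated queuing delay of 25 ms.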
The queuing delay measured using RTT also includes the queuing delay The queuing delay measured using RTT also includes the queuing delay
experienced by the return packets in the direction from the rLEDBAT experienced by the return packets in the direction from the rLEDBAT
receiver to the sender. This is a fundamental limitation of this receiver to the sender. This is a fundamental limitation of this
approach. The impact of this error is that the rLEDBAT controller approach. The impact of this limitation is that the rLEDBAT
will also react to congestion in the reverse path direction, controller will also react to congestion in the reverse path
resulting in an even more conservative mechanism. direction, resulting in an even more conservative mechanism.
In order to measure RTT, the rLEDBAT client MUST enable the TS option In order to measure RTT, the rLEDBAT client MUST enable the TS option
[RFC7323]. By matching the TSval value carried in outgoing packets [RFC7323]. By matching the TSval carried in outgoing packets with
with the Timestamp Echo Reply (TSecr) value [RFC7323] observed in the Timestamp Echo Reply (TSecr) value [RFC7323] observed in incoming
incoming packets, it is possible to measure RTT. This allows the packets, it is possible to measure RTT. This allows the rLEDBAT
rLEDBAT receiver to measure RTT even if it is acting as a pure receiver to measure RTT even if it is acting as a pure receiver. In
receiver. In a pure receiver, there is no data flowing from the a pure receiver, there is no data flowing from the rLEDBAT receiver
rLEDBAT receiver to the sender, making it impossible to match data to the sender, making it impossible to match data packets with
packets with Acknowledgment packets to measure RTT, as it is usually Acknowledgment packets to measure RTT, in contrast to what is usually
done in TCP for other purposes. done in TCP for other purposes.
Depending on the frequency of the local clock used to generate the Depending on the frequency of the local clock used to generate the
values included in the TS option, several packets may carry the same values included in the TS option, several packets may carry the same
TSval value. If that happens, the rLEDBAT receiver will be unable to TSval. If that happens, the rLEDBAT receiver will be unable to match
match the different outgoing packets carrying the same TSval value the different outgoing packets carrying the same TSval with the
with the different incoming packets also carrying the same TSecr different incoming packets also carrying the same TSecr value.
value. However, it is not necessary for rLEDBAT to use all packets However, it is not necessary for rLEDBAT to use all packets to
to estimate RTT, and sampling a subset of in-flight packets per RTT estimate RTT, and sampling a subset of in-flight packets per RTT is
is enough to properly assess the queuing delay. RTT MUST then be enough to properly assess the queuing delay. RTT MUST then be
calculated as the time since the first packet with a given TSval was calculated as the time since the first packet with a given TSval was
sent and the first packet that was received with the same value sent and the first packet that was received with the same value
contained in the TSecr. Other packets with repeated TS values SHOULD contained in the TSecr. Other packets with repeated TS values SHOULD
NOT be used for RTT calculations. NOT be used for RTT calculations.
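A sketch of this sampling rule is shown below. The class name, the caller-supplied clock values, and the unbounded dictionary are simplifications assumed for the example; a real implementation would need to prune old entries.

```python
from typing import Dict, Optional


class TimestampRttSampler:
    """Take one RTT sample per TSval: first packet sent, first echo received."""

    def __init__(self) -> None:
        # TSval -> send time of the first packet carrying it,
        # or None once that TSval has already produced a sample.
        self._first_sent: Dict[int, Optional[float]] = {}

    def on_send(self, tsval: int, now: float) -> None:
        # Only the first packet with a given TSval is recorded.
        self._first_sent.setdefault(tsval, now)

    def on_receive(self, tsecr: int, now: float) -> Optional[float]:
        """Return an RTT sample, or None if this echo must not be used."""
        sent = self._first_sent.get(tsecr)
        if sent is None:
            return None  # unknown TSecr, or a repeated echo of a used TSval
        self._first_sent[tsecr] = None  # mark this TSval as consumed
        return now - sent
```

If two packets carrying TSval 100 leave at times 1000 and 1002 (ms) and the first echo arrives at 1050, the sample is 50 ms; a second echo of the same value yields no sample.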
Several issues must be addressed in order to avoid an artificial Several issues must be addressed in order to avoid an artificial
increase in the observed RTT. Different issues emerge, depending on increase in the observed RTT. Different issues emerge, depending on
whether the rLEDBAT-capable host is sending data packets or pure ACKs whether the rLEDBAT-capable host is sending data packets or pure ACKs
to measure RTT. We next consider these issues separately. to measure RTT. We next consider these issues separately.
skipping to change at line 490 skipping to change at line 493
throughout the lifetime of the communication. However, if, for throughout the lifetime of the communication. However, if, for
example, the file is structured in blocks of data, it may be the case example, the file is structured in blocks of data, it may be the case
that the sender will seldom have to wait until the next block is that the sender will seldom have to wait until the next block is
available to proceed with the data transfer. To address this available to proceed with the data transfer. To address this
situation, the filter used by the congestion control algorithm situation, the filter used by the congestion control algorithm
executed in the receiver SHOULD discard outliers (e.g., a MIN filter executed in the receiver SHOULD discard outliers (e.g., a MIN filter
[RFC6817] would achieve this) when measuring RTT using pure ACK [RFC6817] would achieve this) when measuring RTT using pure ACK
packets. packets.
This limitation of the sender's window can come from either the TCP This limitation of the sender's window can come from either the TCP
congestion window in host B or the announced receive window from the congestion window in host B or the announced receive window from
rLEDBAT in host A. Normally, the receive window will be the one to rLEDBAT in host A. Normally, the receive window will be the one to
limit the sender's transmission rate, since the LBE congestion limit the sender's transmission rate, since the LBE congestion
control algorithm used by the rLEDBAT node is designed to be more control algorithm used by the rLEDBAT node is designed to be more
restrictive on the sender's rate than standard-TCP. If the limiting restrictive on the sender's rate than standard-TCP. If the limiting
factor is the congestion window in the sender, it is less relevant if factor is the congestion window in the sender, it is less relevant if
rLEDBAT further reduces the receive window due to a bloated RTT rLEDBAT further reduces the receive window due to a bloated RTT
measurement, since the rLEDBAT node is not actively controlling the measurement, since the rLEDBAT node is not actively controlling the
sender's rate. Nevertheless, the proposed approach to discard larger sender's rate. Nevertheless, the proposed approach to discard larger
samples would also address this issue. samples would also address this issue.
To address the case in which the limiting factor is the receive To address the case in which the limiting factor is the receive
window announced by rLEDBAT, the congestion control algorithm at the window announced by rLEDBAT, the congestion control algorithm at the
receiver SHOULD discard RTT measurements during the window reduction receiver SHOULD discard RTT measurements during the window reduction
phase that are triggered by pure ACK packets. The rLEDBAT receiver phase that are triggered by pure ACK packets. The rLEDBAT receiver
is aware of whether a given TSval value was sent in a pure ACK packet is aware of whether a given TSval was sent in a pure ACK packet where
where the window was reduced, and if so, it can discard the the window was reduced, and if so, it can discard the corresponding
corresponding RTT measurement. RTT measurement.
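One possible way to track which samples to discard is sketched below. The class and method names are hypothetical, and the set of recorded TSvals is left unbounded for brevity.

```python
class WindowReductionFilter:
    """Remember TSvals sent in pure ACKs that reduced the window."""

    def __init__(self) -> None:
        self._tainted = set()

    def note_pure_ack(self, tsval: int, window_reduced: bool) -> None:
        """Call for every pure ACK sent by the rLEDBAT receiver."""
        if window_reduced:
            self._tainted.add(tsval)

    def accept_sample(self, tsecr: int) -> bool:
        """Return False if the RTT sample for this TSecr should be discarded."""
        return tsecr not in self._tainted
```

A sample whose TSecr echoes a TSval recorded during the window reduction phase is dropped before it reaches the RTT filter; all other samples pass through unchanged.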
4.2.1.2. Measuring RTT When Sending Data Packets 4.2.1.2. Measuring RTT When Sending Data Packets
In the case that the rLEDBAT node is sending data packets and In the case that the rLEDBAT node is sending data packets and
matching them with pure ACKs to measure RTT, a factor that can matching them with pure ACKs to measure RTT, a factor that can
artificially increase the RTT measured is the presence of delayed artificially increase the RTT measured is the presence of delayed
Acknowledgments. According to the TS option generation rules Acknowledgments. According to the TS option generation rules
[RFC7323], the value included in the TSecr for a delayed ACK is the [RFC7323], the value included in the TSecr for a delayed ACK is the
one in the TSval field of the earliest unacknowledged segment. This one in the TSval field of the earliest unacknowledged segment. This
may artificially increase the measured RTT. may artificially increase the measured RTT.
skipping to change at line 591 skipping to change at line 594
An additional difficulty regarding the estimation of the TS units and An additional difficulty regarding the estimation of the TS units and
clock skew in the context of (r)LEDBAT is that the LEDBAT congestion clock skew in the context of (r)LEDBAT is that the LEDBAT congestion
controller actions directly affect the (queuing) delay experienced by controller actions directly affect the (queuing) delay experienced by
packets. In particular, if there is an error in the estimation of packets. In particular, if there is an error in the estimation of
the TS units/skew, the LEDBAT controller will attempt to compensate the TS units/skew, the LEDBAT controller will attempt to compensate
for it by reducing/increasing the load. The result is that the for it by reducing/increasing the load. The result is that the
LEDBAT operation interferes with the TS units/clock skew LEDBAT operation interferes with the TS units/clock skew
measurements. Because of this, measurements are more accurate when measurements. Because of this, measurements are more accurate when
there is no traffic in the connection (in addition to the packets there is no traffic in the connection (in addition to the packets
used for the measurements). The problem is that the receiver is used for the measurements). The problem is that the receiver is
unaware if the sender is injecting traffic at any point in time, and unaware of whether the sender is injecting traffic at any point in
so, it is unable to use these quiet intervals to perform time; it is therefore unable to use these quiet intervals to perform
measurements. The receiver can, however, force periodic slowdowns, measurements. The receiver can, however, force periodic slowdowns,
reducing the announced receive window to a few packets and perform reducing the announced receive window to a few packets and performing
the measurements then. the measurements at that time.
It is possible for the rLEDBAT receiver to perform multiple It is possible for the rLEDBAT receiver to perform multiple
measurements to assess both the TS units and the relative clock skew measurements to assess both the TS units and the relative clock skew
during the lifetime of the connection, in order to obtain more during the lifetime of the connection, in order to obtain more
accurate results. Clock skew measurements are more accurate if the accurate results. Clock skew measurements are more accurate if the
time period used to discover the skew is larger, as the impact of the time period used to discover the skew is larger, as the impact of the
skew becomes more apparent. It is a reasonable approach for the skew becomes more apparent. It is a reasonable approach for the
rLEDBAT receiver to perform an early discovery of the TS units (and rLEDBAT receiver to perform an early discovery of the TS units (and
the clock skew) using the first few packets of the TCP connection and the clock skew) using the first few packets of the TCP connection and
then improve the accuracy of the TS units/clock skew estimation using then improve the accuracy of the TS units/clock skew estimation using
periodic measurements later in the lifetime of the connection. periodic measurements later in the lifetime of the connection.
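As a rough illustration of a single TS units measurement, the sketch below divides the local time elapsed between two received segments by the difference in their TSvals. The function name and the tuple layout are assumptions; any change in queuing delay between the two observations biases the result, which is why quiet intervals and repeated measurements matter.

```python
from typing import Optional, Tuple

TS_MODULUS = 1 << 32  # TSval is a 32-bit field and wraps around


def estimate_ts_tick(first: Tuple[int, float],
                     second: Tuple[int, float]) -> Optional[float]:
    """Estimate the sender's timestamp tick duration in seconds.

    Each observation is (tsval, local_arrival_time_seconds).
    Returns None when the two observations cannot yield an estimate.
    """
    tsval_a, time_a = first
    tsval_b, time_b = second
    ticks = (tsval_b - tsval_a) % TS_MODULUS  # tolerate TSval wraparound
    elapsed = time_b - time_a
    if ticks == 0 or elapsed <= 0:
        return None
    return elapsed / ticks
```

Two segments arriving one second apart whose TSvals differ by 1000 suggest a tick of about one millisecond. Observations taken further apart in time reduce the relative effect of delay jitter and make clock skew easier to see.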
4.3. Detecting Packet Losses and Retransmissions 4.3. Detecting Packet Losses and Retransmissions
The rLEDBAT receiver is capable of detecting retransmitted packets as The rLEDBAT receiver is capable of detecting retransmitted packets as
follows. We call RCV.HGH the highest sequence number corresponding follows. We call RCV.HGH the highest sequence number corresponding
to a received byte of data (not assuming that all bytes with smaller to a received byte of data (not assuming that all bytes with smaller
sequence numbers have been received already, there may be holes), and sequence numbers have been received already, there may be holes), and
we call TSV.HGH the TSval value corresponding to the segment in which we call TSV.HGH the TSval corresponding to the segment in which that
that byte was carried. SEG.SEQ stands for the sequence number of a byte was carried. SEG.SEQ stands for the sequence number of a newly
newly received segment, and we call TSV.SEQ the TSval value of the received segment, and we call TSV.SEQ the TSval of the newly received
newly received segment. segment.
If SEG.SEQ < RCV.HGH and TSV.SEQ > TSV.HGH, then the newly received If SEG.SEQ < RCV.HGH and TSV.SEQ > TSV.HGH, then the newly received
segment is a retransmission. This is so because the newly received segment is a retransmission. This is so because the newly received
segment was generated later than another already-received segment segment was generated later than another already-received segment
that contained data with a larger sequence number. This means that that contained data with a larger sequence number. This means that
this segment was lost and was retransmitted. this segment was lost and was retransmitted.
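The check above can be sketched as follows. The class name and the modular comparison helper are assumptions added for the example; the helper treats sequence numbers and TSvals as 32-bit wrapping counters, which the text does not spell out.

```python
from typing import Optional

MASK32 = 0xFFFFFFFF


def wraps_lt(a: int, b: int) -> bool:
    """True if a precedes b in 32-bit modular arithmetic."""
    return ((a - b) & MASK32) > 0x7FFFFFFF


class RetransmissionDetector:
    """Flag segments that satisfy SEG.SEQ < RCV.HGH and TSV.SEQ > TSV.HGH."""

    def __init__(self) -> None:
        self.rcv_hgh: Optional[int] = None  # highest sequence number received
        self.tsv_hgh: Optional[int] = None  # TSval of the segment carrying it

    def on_segment(self, seg_seq: int, seg_len: int, tsval: int) -> bool:
        if seg_len == 0:
            return False  # pure ACKs carry no data
        last_byte = (seg_seq + seg_len - 1) & MASK32
        if self.rcv_hgh is None:
            self.rcv_hgh, self.tsv_hgh = last_byte, tsval
            return False
        retransmitted = (wraps_lt(seg_seq, self.rcv_hgh)
                         and wraps_lt(self.tsv_hgh, tsval))
        if wraps_lt(self.rcv_hgh, last_byte):
            # New highest byte: remember it and its timestamp.
            self.rcv_hgh, self.tsv_hgh = last_byte, tsval
        return retransmitted
```

In this sketch, a segment that fills a hole but carries an older TSval than TSV.HGH is treated as reordered rather than retransmitted, matching the condition stated above.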
The proposed mechanism to detect retransmissions at the receiver The proposed mechanism to detect retransmissions at the receiver
fails when there are window tail drops. If all packets in the tail fails when there are window tail drops. If all packets in the tail
of the window are lost, the receiver will not be able to detect a of the window are lost, the receiver will not be able to detect a
mismatch between the sequence numbers of the packets and the order of mismatch between the sequence numbers of the packets and the order of
the timestamps. In this case, rLEDBAT will not react to losses but the timestamps. In this case, rLEDBAT will not react to losses;
the TCP congestion controller at the sender will, most likely however, the TCP congestion controller at the sender will, most
reducing its window to 1 MSS and take over the control of the sending likely reducing its window to 1 MSS and taking over the control of
rate, until slow start ramps up and catches the current value of the the sending rate until slow start ramps up and catches the current
rLEDBAT window. value of the rLEDBAT window.
5. Experiment Considerations 5. Experiment Considerations
The status of this document is Experimental. The general purpose of The status of this document is Experimental. The general purpose of
the proposed experiment is to gain more experience running rLEDBAT the proposed experiment is to gain more experience running rLEDBAT
over different network paths to see if the proposed rLEDBAT over different network paths to see if the proposed rLEDBAT
parameters perform well in different situations. Specifically, we parameters perform well in different situations. Specifically, we
would like to learn about the following aspects of the rLEDBAT would like to learn about the following aspects of the rLEDBAT
mechanism: mechanism:
* Interaction between the sender's and receiver's congestion control * Interaction between the sender's and receiver's congestion control
algorithms. rLEDBAT posits that because the rLEDBAT receiver is algorithms. rLEDBAT posits that because the rLEDBAT receiver is
using a less-than-best-effort congestion control algorithm, the using a less-than-best-effort congestion control algorithm, the
receiver's congestion control algorithm will expose a smaller receiver's congestion control algorithm will expose a smaller
congestion window (conveyed through the Receive Window) than the congestion window (conveyed through the receive window) than the
one resulting from the congestion control algorithm executed at one resulting from the congestion control algorithm executed at
the sender. One of the purposes of the experiment is to learn how the sender. One of the purposes of the experiment is to learn how
these two algorithms interact and if the assumption that the these two algorithms interact and if the assumption that the
receiver side is always controlling the sender's rate (and making receiver side is always controlling the sender's rate (and making
rLEDBAT effective) holds. The experiment should include the rLEDBAT effective) holds. The experiment should include the
different congestion control algorithms that are currently widely different congestion control algorithms that are currently widely
used in the Internet, including CUBIC, Bottleneck Bandwidth and used in the Internet, including CUBIC, Bottleneck Bandwidth and
Round-trip propagation time (BBR), and LEDBAT(++). Round-trip propagation time (BBR), and LEDBAT(++).
* Interaction between rLEDBAT and Active Queue Management techniques * Interaction between rLEDBAT and Active Queue Management techniques
such as Controlled Delay (CoDel); Proportional Integral controller such as Controlled Delay (CoDel); Proportional Integral controller
Enhanced (PIE); and Low Latency, Low Loss, and Scalable Throughput Enhanced (PIE); and Low Latency, Low Loss, and Scalable Throughput
(L4S). (L4S).
* How the rLEDBAT should resume after a period during which there * How rLEDBAT should resume after a period during which there was no
was no incoming traffic and the information about the rLEDBAT incoming traffic and the information about the rLEDBAT state
state information is potentially dated. information is potentially dated.
5.1. Status of the Experiment at the Time of This Writing 5.1. Status of the Experiment at the Time of This Writing
Currently, the following implementations of rLEDBAT can be used for Currently, the following implementations of rLEDBAT can be used for
experimentation: experimentation:
* Windows 11. rLEDBAT is available in Microsoft's Windows 11 22H2 * Windows 11. rLEDBAT is available in Microsoft's Windows 11 22H2
since October 2023 [Windows11]. since October 2023 [Windows11].
* Windows Server 2022. rLEDBAT is available in Microsoft's Windows * Windows Server 2022. rLEDBAT is available in Microsoft's Windows
Server 2022 since September 2022 [WindowsServer]. Server 2022 since September 2022 [WindowsServer].
* Apple. rLEDBAT is available in macOS and iOS since 2021 [Apple]. * Apple. rLEDBAT is available in macOS and iOS since 2021 [Apple].
* Linux implementation, open source, available since 2022 at * Linux implementation, open source, available since 2022
<https://github.com/net-research/rledbat_module>. [rledbat_module].
* ns3 implementation, open source, available since 2020 at * ns3 implementation, open source, available since 2020
<https://github.com/manas11/implementation-of-rLEDBAT-in-ns-3>. [rLEDBAT-in-ns-3].
In addition, rLEDBAT has been deployed by Microsoft at wide scale in In addition, rLEDBAT has been deployed by Microsoft at wide scale in
the following services: the following services:
* BITS (Background Intelligent Transfer Service) * BITS (Background Intelligent Transfer Service)
* DO (Delivery Optimization) service * DO (Delivery Optimization) service
* Windows update # using DO * Windows update: using DO
* Windows Store # using DO * Windows Store: using DO
* OneDrive * OneDrive
* Windows Error Reporting # wermgr.exe; werfault.exe * Windows Error Reporting: wermgr.exe; werfault.exe
* System Center Configuration Manager (SCCM) * System Center Configuration Manager (SCCM)
* Windows Media Player * Windows Media Player
* Microsoft Office * Microsoft Office
* Xbox (download games) # using DO * Xbox (download games): using DO
Some initial experiments involving rLEDBAT have been reported in Some initial experiments involving rLEDBAT have been reported in
[COMNET3]. Experiments involving the interaction between LEDBAT++ [COMNET3]. Experiments involving the interaction between LEDBAT++
and BBR are presented in [COMNET2]. An experimental evaluation of and BBR are presented in [COMNET2]. An experimental evaluation of
the LEDBAT++ algorithm is presented in [COMNET1]. As LEDBAT++ is one the LEDBAT++ algorithm is presented in [COMNET1]. As LEDBAT++ is one
of the less-than-best-effort congestion control algorithms that of the less-than-best-effort congestion control algorithms that
rLEDBAT relies on, the results regarding how LEDBAT++ interacts with rLEDBAT relies on, the results regarding how LEDBAT++ interacts with
other congestion control algorithms are relevant for the other congestion control algorithms are relevant for the
understanding of rLEDBAT as well. understanding of rLEDBAT as well.
6. Security Considerations 6. Security Considerations
Overall, we believe that rLEDBAT does not introduce any new Overall, we believe that rLEDBAT does not introduce any new
vulnerabilities to existing TCP endpoints, as it relies on existing vulnerabilities to existing TCP endpoints, as it relies on existing
TCP knobs, notably the Receive Window and timestamps. TCP knobs, notably the receive window and timestamps.
Specifically, rLEDBAT uses RCV.WND to modulate the rate of the Specifically, rLEDBAT uses RCV.WND to modulate the rate of the
sender. An attacker wishing to starve a flow can simply reduce the sender. An attacker wishing to starve a flow can simply reduce the
RCV.WND, irrespective of whether rLEDBAT is being used or not. RCV.WND, irrespective of whether rLEDBAT is being used or not.
We can further ask ourselves whether the attacker can use the rLEDBAT We can further ask ourselves whether the attacker can use the rLEDBAT
mechanisms in place to force the rLEDBAT receiver to reduce the mechanisms in place to force the rLEDBAT receiver to reduce the
RCV.WND. There are two ways an attacker can do this: RCV.WND. There are two ways an attacker can do this:
* One would be to introduce an artificial delay to the packets by * One would be to introduce an artificial delay to the packets by
skipping to change at line 803 skipping to change at line 806
Bagnulo, "LEDBAT++: Congestion Control for Background Bagnulo, "LEDBAT++: Congestion Control for Background
Traffic", Work in Progress, Internet-Draft, draft-irtf- Traffic", Work in Progress, Internet-Draft, draft-irtf-
iccrg-ledbat-plus-plus-02, 13 February 2025, iccrg-ledbat-plus-plus-02, 13 February 2025,
<https://datatracker.ietf.org/doc/html/draft-irtf-iccrg- <https://datatracker.ietf.org/doc/html/draft-irtf-iccrg-
ledbat-plus-plus-02>. ledbat-plus-plus-02>.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
<https://www.rfc-editor.org/info/rfc5681>. <https://www.rfc-editor.org/info/rfc5681>.
[RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The
NewReno Modification to TCP's Fast Recovery Algorithm",
RFC 6582, DOI 10.17487/RFC6582, April 2012,
<https://www.rfc-editor.org/info/rfc6582>.
[RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
"Low Extra Delay Background Transport (LEDBAT)", RFC 6817, "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
DOI 10.17487/RFC6817, December 2012, DOI 10.17487/RFC6817, December 2012,
<https://www.rfc-editor.org/info/rfc6817>. <https://www.rfc-editor.org/info/rfc6817>.
[RFC7323] Borman, D., Braden, B., Jacobson, V., and R. [RFC7323] Borman, D., Braden, B., Jacobson, V., and R.
Scheffenegger, Ed., "TCP Extensions for High Performance", Scheffenegger, Ed., "TCP Extensions for High Performance",
RFC 7323, DOI 10.17487/RFC7323, September 2014, RFC 7323, DOI 10.17487/RFC7323, September 2014,
<https://www.rfc-editor.org/info/rfc7323>. <https://www.rfc-editor.org/info/rfc7323>.
[RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)", [RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)",
STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022, STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
<https://www.rfc-editor.org/info/rfc9293>. <https://www.rfc-editor.org/info/rfc9293>.
[RFC9438] Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed., [RFC9438] Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed.,
"CUBIC for Fast and Long-Distance Networks", RFC 9438, "CUBIC for Fast and Long-Distance Networks", RFC 9438,
DOI 10.17487/RFC9438, August 2023, DOI 10.17487/RFC9438, August 2023,
<https://www.rfc-editor.org/info/rfc9438>. <https://www.rfc-editor.org/info/rfc9438>.
[rLEDBAT-in-ns-3]
"Implementation-of-rLEDBAT-in-ns-3", commit 2ab34ad, 24
June 2020, <https://github.com/manas11/implementation-of-
rLEDBAT-in-ns-3>.
[rledbat_module]
"rledbat_module", commit d82ff20, 9 September 2022,
<https://github.com/net-research/rledbat_module>.
[Windows11] [Windows11]
Microsoft, "What's new in Delivery Optimization", Microsoft, "What's new in Delivery Optimization",
Microsoft Windows Documentation, October 2024, Microsoft Windows Documentation, October 2024,
<https://learn.microsoft.com/en-us/windows/deployment/do/ <https://learn.microsoft.com/en-us/windows/deployment/do/
whats-new-do>. whats-new-do>.
[WindowsServer] [WindowsServer]
Havey, D., "LEDBAT Background Data Transfer for Windows", Havey, D., "LEDBAT Background Data Transfer for Windows",
Microsoft Networking Blog, September 2022, Microsoft Networking Blog, September 2022,
<https://techcommunity.microsoft.com/t5/networking-blog/ <https://techcommunity.microsoft.com/t5/networking-blog/
skipping to change at line 859 skipping to change at line 876
example, in the case of LEDBAT++, the WindowIncrease() function is an example, in the case of LEDBAT++, the WindowIncrease() function is an
additive increase, while the WindowDecrease() function is a additive increase, while the WindowDecrease() function is a
multiplicative decrease. In the case of the WindowIncrease() multiplicative decrease. In the case of the WindowIncrease()
function, we assume that it takes as input the current window size function, we assume that it takes as input the current window size
and the number of bytes that were acknowledged since the last window and the number of bytes that were acknowledged since the last window
update (ackedBytes) and returns as output the updated window size. update (ackedBytes) and returns as output the updated window size.
In the case of the WindowDecrease() function, it takes as input the In the case of the WindowDecrease() function, it takes as input the
current window size and returns the updated window size. current window size and returns the updated window size.
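
To make the two interfaces concrete, here is a minimal Python sketch of an additive-increase and a multiplicative-decrease function with the inputs and outputs described above. The constants MSS, GAIN, and DECREASE_FACTOR are hypothetical placeholders chosen for illustration; they are not the values used by LEDBAT++ or by any implementation.

```python
MSS = 1460             # assumed maximum segment size in bytes (illustrative)
GAIN = 1.0             # hypothetical additive-increase gain
DECREASE_FACTOR = 0.5  # hypothetical multiplicative-decrease factor


def window_increase(window: int, acked_bytes: int) -> int:
    """Additive increase: about GAIN * MSS per window's worth of acked bytes."""
    if window <= 0:
        return MSS
    return window + max(1, int(GAIN * MSS * acked_bytes / window))


def window_decrease(window: int) -> int:
    """Multiplicative decrease; the one-MSS floor is an assumption of this sketch."""
    return max(MSS, int(window * DECREASE_FACTOR))
```

With these placeholder values, acknowledging a full window of data grows the window by one MSS, and a decrease halves it.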
The data structures used in the algorithms are as follows. The The data structures used in the algorithms are as follows. The
sentList is a list that contains the TSval and the local send time of sendList is a list that contains the TSval and the local send time of
each packet sent by the rLEDBAT-enabled endpoint. The TSecr field of each packet sent by the rLEDBAT-enabled endpoint. The TSecr field of
the packets received by the rLEDBAT-enabled endpoint is matched with the packets received by the rLEDBAT-enabled endpoint is matched with
the sendList to compute the RTT. the sendList to compute the RTT.
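
The matching of TSecr against the list of sent packets could look like the following Python sketch. The function names and the tuple layout are assumptions made for illustration; times are local clock readings in seconds, and the first sent packet with a matching TSval is used, as in the comment in the receive procedure below.

```python
from typing import List, Optional, Tuple

# Each entry is (tsval, local_send_time_in_seconds), oldest first.
SendList = List[Tuple[int, float]]


def record_sent(send_list: SendList, tsval: int, send_time: float) -> None:
    """Remember the TSval and local send time of an outgoing packet."""
    send_list.append((tsval, send_time))


def compute_rtt(send_list: SendList, received_tsecr: int,
                received_time: float) -> Optional[float]:
    """RTT measured against the first sent packet whose TSval equals TSecr."""
    for tsval, send_time in send_list:
        if tsval == received_tsecr:
            return received_time - send_time
    return None  # no matching packet: no RTT sample can be taken
```

A real implementation would also prune old entries from the list; that is left out here to keep the sketch short.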
The RTT values computed for each received packet are stored in the The RTT values computed for each received packet are stored in the
RTTlist, which also contains the received TSecr (to avoid using RTTlist, which also contains the received TSecr (to avoid using
multiple packets with the same TSecr for RTT calculations, only the multiple packets with the same TSecr for RTT calculations, only the
first packet received for a given TSecr is used to compute the RTT). first packet received for a given TSecr is used to compute the RTT).
It also contains the local time at which the packet was received, to It also contains the local time at which the packet was received, to
allow selecting the RTTs measured in a given period (e.g., in the allow selecting the RTTs measured in a given period (e.g., in the
last 10 minutes). RTTlist is initialized with all its values to its last 10 minutes). RTTlist is initialized with all its values to its
maximum. maximum.
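
The following Python sketch shows one way the RTTlist and the two minimum filters used in the receive procedure could be realized. It is an illustration under stated assumptions: the names are hypothetical, the list starts empty and the filters return None when no sample is available (instead of initializing every slot to a maximum value), and K=4 and N=180 are the values that appear in the pseudocode.

```python
from typing import List, Optional, Tuple

# Each entry is (rtt, tsecr, local_receive_time), oldest first.
RTTList = List[Tuple[float, int, float]]


def insert_rtt(rtt_list: RTTList, rtt: float, tsecr: int,
               received_time: float) -> None:
    """Store only the first RTT sample seen for a given TSecr."""
    if any(entry[1] == tsecr for entry in rtt_list):
        return
    rtt_list.append((rtt, tsecr, received_time))


def min_last_k_measures(rtt_list: RTTList, k: int = 4) -> Optional[float]:
    """Minimum over the k most recent samples (filtered RTT)."""
    return min((entry[0] for entry in rtt_list[-k:]), default=None)


def min_last_n_seconds(rtt_list: RTTList, now: float,
                       n: float = 180.0) -> Optional[float]:
    """Minimum over the samples received in the last n seconds (base RTT)."""
    return min((e[0] for e in rtt_list if now - e[2] <= n), default=None)


def queuing_delay(rtt_list: RTTList, now: float) -> Optional[float]:
    """qd = filteredRTT - baseRTT, or None if there are no samples yet."""
    filtered = min_last_k_measures(rtt_list)
    base = min_last_n_seconds(rtt_list, now)
    if filtered is None or base is None:
        return None
    return filtered - base
```

The resulting queuing delay is what the receive procedure compares against the target T to decide whether to increase or decrease the window.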
procedure receivePacket() procedure receivePacket()
//Looks for first sent packet with same TSval as TSecr, and //Looks for first sent packet with same TSval as TSecr, and
//returns time difference //returns time difference
receivedRTT = computeRTT(sentList, receivedTSecr, receivedTime) receivedRTT = computeRTT(sendList, receivedTSecr, receivedTime)
//Inserts minimum value for a given receivedTSecr //Inserts minimum value for a given receivedTSecr
//Note that many received packets may contain same receivedTSecr //Note that many received packets may contain same receivedTSecr
insertRTT (RTTlist, receivedRTT, receivedTSecr, receivedTime) insertRTT (RTTlist, receivedRTT, receivedTSecr, receivedTime)
filteredRTT = minLastKMeasures(RTTlist, K=4) filteredRTT = minLastKMeasures(RTTlist, K=4)
baseRTT = minLastNSeconds(RTTlist, N=180) baseRTT = minLastNSeconds(RTTlist, N=180)
qd = filteredRTT - baseRTT qd = filteredRTT - baseRTT
//ackedBytes is the number of bytes that can be used to reduce //ackedBytes is the number of bytes that can be used to reduce
//the Receive Window - without shrinking it - if necessary //the receive window - without shrinking it - if necessary
ackedBytes = ackedBytes + receiveBytes ackedBytes = ackedBytes + receiveBytes
if retransmittedPacketDetected then if retransmittedPacketDetected then
RLWND = DecreaseWindow(RLWND) //Only once per RTT RLWND = DecreaseWindow(RLWND) //Only once per RTT
end if end if
if qd < T then if qd < T then
RLWND = IncreaseWindow(RLWND, ackedBytes) RLWND = IncreaseWindow(RLWND, ackedBytes)
else else
RLWND = DecreaseWindow(RLWND) RLWND = DecreaseWindow(RLWND)
end if end if
skipping to change at line 912 skipping to change at line 929
procedure SENDPACKET procedure SENDPACKET
if (RLWND > RLWNDPrevious) or (RLWND - RLWNDPrevious < ackedBytes) if (RLWND > RLWNDPrevious) or (RLWND - RLWNDPrevious < ackedBytes)
then then
RLWNDPrevious = RLWND RLWNDPrevious = RLWND
else else
RLWNDPrevious = RLWND - ackedBytes RLWNDPrevious = RLWND - ackedBytes
end if end if
ackedBytes = 0 ackedBytes = 0
RLWNDPrevious = RLWND RLWNDPrevious = RLWND
//Compute the RWND to include in the packet //Compute the RLWND to include in the packet
RLWND = min(RLWND, fcwnd) RLWND = min(RLWND, fcwnd)
end procedure end procedure
Figure 3: Procedure Executed When a Packet Is Sent Figure 3: Procedure Executed When a Packet Is Sent
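
The comment on ackedBytes in the receive procedure states that the receive window can be reduced, without shrinking it, by the number of bytes received. The Python sketch below illustrates that principle only; it is not a line-by-line transcription of Figure 3. The function name is hypothetical, and fcwnd is assumed here to be the window that TCP flow control would otherwise advertise.

```python
def next_advertised_window(target: int, previous: int,
                           acked_bytes: int, fcwnd: int) -> int:
    """Window to place in the next outgoing packet.

    target      -- window computed by the rLEDBAT controller (RLWND)
    previous    -- window advertised in the previous packet
    acked_bytes -- bytes received since the previous packet was sent
    fcwnd       -- flow-control window (assumed meaning, see lead-in)
    """
    # Smallest value that does not move the right edge of the window backwards.
    floor = max(0, previous - acked_bytes)
    window = max(target, floor)
    # Never advertise more than flow control allows.
    return min(window, fcwnd)
```

For example, if the controller wants to drop from 10000 to 5000 bytes but only 1460 bytes have arrived since the last advertisement, this sketch announces 8540 bytes and converges to the target as more data arrives.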
Acknowledgments Acknowledgments
This work was supported by the EU through the StandICT projects RXQ, This work was supported by the EU through the StandICT projects RXQ,
CCI, and CEL6; the NGI Pointer RIM project; and the H2020 5G-RANGE CCI, and CEL6; the NGI Pointer RIM project; and the H2020 5G-RANGE
project; and by the Spanish Ministry of Economy and Competitiveness project; and by the Spanish Ministry of Economy and Competitiveness
skipping to change at line 937 skipping to change at line 954
for his help. We would like to thank Colin Perkins, Mirja Kühlewind, for his help. We would like to thank Colin Perkins, Mirja Kühlewind,
and Vidhi Goel for their reviews and comments on earlier draft and Vidhi Goel for their reviews and comments on earlier draft
versions of this document. versions of this document.
Authors' Addresses Authors' Addresses
Marcelo Bagnulo Marcelo Bagnulo
Universidad Carlos III de Madrid Universidad Carlos III de Madrid
Email: marcelo@it.uc3m.es Email: marcelo@it.uc3m.es
Alberto Garcia-Martinez Alberto García-Martínez
Universidad Carlos III de Madrid Universidad Carlos III de Madrid
Email: alberto@it.uc3m.es Email: alberto@it.uc3m.es
Gabriel Montenegro Gabriel Montenegro
Email: g.e.montenegro@hotmail.com Email: g.e.montenegro@hotmail.com
Praveen Balasubramanian Praveen Balasubramanian
Confluent Confluent
Email: pravb.ietf@gmail.com Email: pravb.ietf@gmail.com
 End of changes. 44 change blocks. 
84 lines changed or deleted 101 lines changed or added
