Internet-Draft | Careful Congestion Control Convergence | July 2023 |
Kuhn, et al. | Expires 6 January 2024 |
This document specifies careful convergence of Congestion Control (CC), providing a cautious method that enables fast startup for a wide range of connections or reconnections.¶
The method reuses a set of computed CC parameters that are based on the previously observed path characteristics between the same pair of transport endpoints, such as the bottleneck bandwidth, available capacity, or the Round Trip Time (RTT). These parameters are stored, allowing them to be later used to modify the CC behavior of a subsequent connection. The document also discusses assumptions and defines requirements around how a sender utilizes these parameters to provide opportunities for a connection to more quickly get up to speed (i.e. utilize the available capacity). It discusses how these changes impact the capacity at a shared network bottleneck and the safe response that is needed after any indication that the new rate is inappropriate. The method is expected to be appropriate to IETF transports.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 6 January 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
All Internet transports are required to either use a CC method or constrain their rate of transmission [RFC8085]. In 2010, a survey of alternative CC methods [RFC5783] noted that there are challenges when a CC operates across an Internet path with a high and/or variable bandwidth-delay product (BDP). This mechanism targets these challenges.¶
A CC method typically takes time to ramp up the packet rate. This period is called the "slow-start phase" and is informally known as the time to "get up to speed". The slow-start phase is a period in which a sender intentionally uses less capacity than might be available, with the intention to avoid or limit overshooting the actual capacity at a bottleneck. An overshoot can increase queuing (latency/jitter) and/or congestion packet loss for the flow. Any overshoot in the capacity can also have a detrimental effect on other flows sharing a common bottleneck. In the extreme case, persistent congestion could result in unwanted starvation of other flows [RFC8867] (i.e., preventing other flows from successfully sharing a common bottleneck).¶
The method can improve performance by reducing the time to get up to speed, and hence can reduce the total duration of a transfer. It introduces an alternative method to select initial CC parameters, including a way to more rapidly and safely grow the congestion window (cwnd). This method is based on temporal sharing (sometimes known as caching) of a set of computed CC parameters that relate to a previously observed path, such as the bottleneck bandwidth, available capacity, and RTT. These parameters are stored and used to modify the CC behavior of a subsequent connection between the same local and remote endpoints.¶
When used with the QUIC transport, it provides transport services that resemble those currently available in TCP, such as TCP Control Block (TCB) [RFC9040] caching or updates to support application-limited traffic.¶
"Generally, implementations are advised to be cautious when using previous values on a new path", as stated in [RFC9000]. This advice is appropriate for any IETF transport protocol.¶
Care is therefore needed in the use of any temporal information to assure safe use of the Internet and to be robust to changes in traffic patterns, network routing and link/node failures. There are also cases where using the parameters of a previous connection is not appropriate, and there is a need to evaluate the potential for malicious use of the method.¶
Whilst a sender could take optimization decisions without considering the receiver's preference, there are cases where a client at the receiver could have information that is not available at the sender. In these cases, a client could explicitly ask for tuning the slow start when the application continues transmission, or to inhibit tuning. Examples where this could have benefit include:¶
A related document complements this CC method by allowing the sender-generated transport information to be stored at the receiver [I-D.kuhn-quic-bdpframe-extension]. This enables a receiver to implement a policy that informs a sender whether the receiver desires the sender to reuse the CC parameters. By transferring the information to a receiver, it also releases the sender from needing to retain CC parameter state for each receiver.¶
This section provides a set of examples where the method is expected to improve performance.¶
The method is expected to reduce the time to complete a transfer when the transfer sends significantly more data than allowed by the Initial Window (IW), and the BDP is also significantly more than the IW.¶
An application could use a series of connections over the same path (i.e., it resumes a connection to the same endpoint). This can be used by a sender that performs a unidirectional data transfer towards the receiver (e.g., a receiver downloading a file or a web page). Without the method, each connection would need to individually discover the CC parameters.¶
Either or both endpoints can assume the role of a sender or a receiver. The method supports a bidirectional data transfer, where both endpoints simultaneously send data to each other (e.g., remote execution of an application, or a bidirectional video conference call).¶
In this example, an application resumes using capacity after a pause in transmission. Without the method, the application that pauses would otherwise need to discover new CC parameters each time it connects to the same endpoint.¶
A variant of this example is when the application reconnects after a disruption that had temporarily reduced the path capacity (e.g., after a link propagation impairment, or where a user on a train journey travels through different areas of connectivity) before the endpoint returns to use a path with the original characteristics.¶
QUIC introduces the concept of transport parameters (Section 4 of [RFC9000]). The present document adds to this by noting that a new connection can utilize a set of key transport parameters from a previous connection to reduce the completion time for a transfer.¶
This benefit is particularly evident for a path where the RTT is much larger than for typical Internet paths. In a specific example of a high BDP path, a satellite access network, it takes up to 9 seconds to complete a 5.3 MB transfer using standard CC, whereas using the specified method the transfer time could reduce to 4 seconds [IJSCN]; and the time to complete a 1 MB transfer could be reduced by 62% [MAPRG111]. Benefit is also expected for other sizes of transfer and for different path characteristics that also result in a path with high BDP.¶
{XXX-Editor note: A future revision can provide further Path Examples here.}¶
This section provides a brief summary of key terms and the requirements language that is used.¶
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Sender-generated information is used in this document for two functions:¶
The document uses language drawn from a range of IETF RFCs. It defines current and saved values for a set of CC parameters:¶
The Endpoint Token is described in Appendix A.¶
This section defines a series of phases that the CC algorithm moves through as a connection gets up to speed when it uses the Careful Resume method.¶
During a previous connection, information about the specific path to an endpoint is saved. This is used to characterize the path and to indicate the capacity that was available. It includes the current RTT (current_rtt), bottleneck bandwidth (current_bb) and current receiver Endpoint Token (current_endpoint_token), which are stored as saved_rtt, saved_bb and saved_endpoint_token (Section 4.1). One implementation solution could be to store the information at the server. Different implementation solutions are detailed in [I-D.kuhn-quic-bdpframe-extension].¶
When a sender resumes between the same pair of endpoints (i.e., the same path), it enters the Reconnaissance Phase. The sender only enters this phase when there are saved CC parameters for the same pair of endpoints and this information is currently valid (i.e., the parameters have not expired). When a method is provided (such as the BDP_Frame), a receiver can request the sender to not enter this phase.¶
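The entry conditions above can be sketched as a single check. This is a minimal illustration only: the `saved_at` and `lifetime` fields are hypothetical (this document does not yet specify how long CC parameters can be cached), the unit of `saved_bb` is an assumption, and `receiver_opt_out` stands in for whatever signal a receiver uses to ask the sender not to enter the phase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SavedPath:
    """Saved CC parameters for one pair of endpoints (names follow the text)."""
    saved_rtt: float              # seconds
    saved_bb: float               # bytes per second (unit is an assumption)
    saved_endpoint_token: bytes
    saved_at: float               # hypothetical: time the values were stored, in seconds
    lifetime: float               # hypothetical: validity period, in seconds


def may_enter_reconnaissance(saved: Optional[SavedPath], endpoint_token: bytes,
                             now: float, receiver_opt_out: bool = False) -> bool:
    """Return True only if saved parameters exist, match the endpoints and are unexpired."""
    if saved is None or receiver_opt_out:
        return False
    if saved.saved_endpoint_token != endpoint_token:
        return False
    return (now - saved.saved_at) <= saved.lifetime
```

A sender that gets `False` from this check would simply use its normal CC method for the new connection.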
In the Reconnaissance Phase, the sender sends initial data, limited by the Initial Window, and monitors its reception in the acknowledgements from the receiver. This phase checks whether the current path is consistent with the saved path information. The sender measures the path characteristics of the present path to confirm that it is consistent with the previously characterized path, including a similar RTT (Section 4.2).¶
The Reconnaissance Phase calculates a jump_cwnd based on the saved CC parameters. The correct reception of packets sent using the jump_cwnd is monitored during the Unvalidated Phase.¶
To avoid starving other flows that started or increased their capacity after the Observation Phase, the sender MUST NOT set a jump_cwnd that corresponds to all the capacity that it previously used.¶
{XXX-Editor note: What safety factor is appropriate for the resuming sender? If using slow-start it would anyway double the rate on the next RTT, so is capacity*2/3 or 1/2 appropriate? Could this be a MUST NOT for the part about not using the values without somehow curbing them, with maybe a SHOULD for a specific value? Do we need to factor-in the degree of the indication? This could be nice, but then it makes it even harder to pick something useful? }¶
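The requirement not to reuse all of the previously used capacity can be illustrated with a short sketch. It assumes that saved_bb is held as a rate in bytes per second, so that saved_bb multiplied by saved_rtt gives the saved capacity in bytes. The function name and the default `safety_factor` of 0.5 are hypothetical; 1/2 is only one of the candidate values raised in the editor's note, and this document does not yet fix a value.

```python
def compute_jump_cwnd(saved_bb: float, saved_rtt: float,
                      safety_factor: float = 0.5) -> int:
    """Return a jump_cwnd in bytes that is strictly less than the saved capacity.

    saved_bb is assumed to be in bytes per second and saved_rtt in seconds.
    safety_factor is a hypothetical tunable; it must lie strictly between
    0 and 1 so that the sender never claims all the capacity it used before.
    """
    if not 0.0 < safety_factor < 1.0:
        raise ValueError("safety_factor must be in the open interval (0, 1)")
    saved_capacity = saved_bb * saved_rtt  # bytes in flight at the saved rate
    return int(saved_capacity * safety_factor)
```

For example, with a saved rate of 1,000,000 bytes per second and a saved RTT of 0.5 seconds, the saved capacity is 500,000 bytes and the sketch yields a jump_cwnd of 250,000 bytes.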
{XXX-Editor Note: It is possible that an implementation can start sending data using the jump_cwnd while still in the Reconnaissance Phase, before all the initial data is acknowledged. In this method, the cwnd is increased for each new ACK received, in proportion to the acknowledged volume of initial data, i.e. cwnd+=(jump_cwnd*(acknowledged_bytes/Initial_Window)). Transmission of this unvalidated data still requires pacing (see section XXX), and is tentative based on the rules for the Reconnaissance Phase. This proportional method reduces the impact of delayed acknowledgements, which could otherwise delay the start of transmission using the jump_cwnd. It also reduces additional delay when the IW was paced.}¶
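The proportional increase in the note above can be sketched as a per-ACK update. The function name is hypothetical, all quantities are assumed to be in bytes, and integer arithmetic is used only to keep the example exact.

```python
def on_ack_in_reconnaissance(cwnd: int, jump_cwnd: int,
                             acknowledged_bytes: int, initial_window: int) -> int:
    """Apply cwnd += jump_cwnd * (acknowledged_bytes / Initial_Window) for one ACK."""
    if initial_window <= 0:
        raise ValueError("initial_window must be positive")
    return cwnd + (jump_cwnd * acknowledged_bytes) // initial_window


# Usage: an Initial Window of 12000 bytes acknowledged by five ACKs of 2400 bytes each.
cwnd = 12000
for _ in range(5):
    cwnd = on_ack_in_reconnaissance(cwnd, jump_cwnd=120000,
                                    acknowledged_bytes=2400, initial_window=12000)
```

Each ACK in this usage example adds one fifth of the jump_cwnd, so the full jump_cwnd has been added once the whole Initial Window is acknowledged.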
In the Unvalidated Phase, a sender monitors the tentative use of the updated CC parameters. (These CC parameters are based on saved path information and allow a rate higher than allowed by a traditional slow-start mechanism.) The convergence towards the previous rate is expected to be faster, but should not be instantaneous, to avoid adding congestion to an already congested bottleneck. In this phase, the sender continues to check that the saved and current path information are consistent (Section 4.3).¶
In the Safe Retreat Phase, the sender stops using the saved CC parameters. This phase is designed to mitigate the impact on other flows that might have been sharing a congested bottleneck when in the Unvalidated Phase. The sender needs to re-initialize CC parameters to drain any queue that has built at the bottleneck during the Unvalidated Phase and allow other flows to then regain their share of the available capacity. This reaction differs from a traditional CC reaction to congestion, because in this case the capacity estimate was unvalidated (Section 4.4). Saved CC parameters for this path need to be removed from any cache, to prevent the parameters being used again with other flows.¶
When the sender transitions to the Safe Retreat Phase, there could still be packets that were sent in the Unvalidated Phase that have not yet been acknowledged. If these packets from the Unvalidated Phase are declared lost, they do not trigger an additional CC reaction.¶
If the data in the packets that are lost in the Unvalidated Phase needs to be recovered, this recovery commences using the reduced window set on entry to the Safe Retreat Phase. In the case of multiple losses, this could require multiple RTTs to complete successful resending of data that was lost in the Unvalidated Phase. The loss of the packets used to resend data is considered a separate congestion event, and this does trigger another CC reaction.¶
The sender then enters the Normal Phase with re-initialized CC parameters.¶
The sender continues transmission using the normal CC method.¶
If the sender experiences a Retransmission Time Out (RTO) expiry, the sender returns to the normal CC phase and processes the RTO expiry.¶
The sender is limited by any rate limitation of the transport protocol being used. For QUIC, this includes flow control mechanisms and amplification attack prevention. In particular, a QUIC receiver might need to issue proactive MAX_DATA frames to increase the flow control limits of a connection that is started with this method to gain the expected benefit.¶
As in QUIC, a TCP sender is limited by the receiver window (rwnd). The rwnd may need to be increased for a connection that is started with this method to gain the expected benefit.¶
Congestion controllers, such as CUBIC or RENO, can estimate the saved_bb and current_bb values by utilizing a combination of the cwnd/flight_size and the minimum RTT. A different method could be used to estimate the same values when using a rate-based congestion controller, such as BBR [I-D.cardwell-iccrg-bbr-congestion-control].¶
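For a window-based controller, one simple way to form such an estimate is to divide the bytes in flight by the minimum RTT. The sketch below illustrates this under that assumption; the function name is hypothetical and the exact estimator is left to the implementation.

```python
def estimate_bb(flight_size: int, min_rtt: float) -> float:
    """Estimate the bottleneck bandwidth in bytes per second.

    flight_size is the cwnd or flight size in bytes and min_rtt is the
    minimum observed RTT in seconds. A rate-based controller such as BBR
    would typically derive this value differently.
    """
    if min_rtt <= 0:
        raise ValueError("min_rtt must be positive")
    return flight_size / min_rtt
```

For example, 600,000 bytes in flight with a minimum RTT of 0.5 seconds gives an estimate of 1,200,000 bytes per second.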
The sender sends initial data limited by the IW. This value is assumed to be a safe starting point for any path where there is no path information or congestion control information. This limit avoids adding excessive congestion to a potentially congested path.¶
The sender monitors the reception of the initial data. If the path characteristics resemble those of a previously observed connection (i.e., current_rtt < 1.2*saved_rtt) and all data was acknowledged without reported congestion, the method permits the sender to utilize the saved_bb as an input to adapt current_bb to rapidly determine a new safe rate.¶
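The check described above can be sketched as a single predicate. It uses the inequality exactly as stated (current_rtt < 1.2*saved_rtt); the function and argument names other than current_rtt and saved_rtt are hypothetical, and a later revision of this document may replace the inequality with a range.

```python
def path_is_consistent(current_rtt: float, saved_rtt: float,
                       acked_bytes: int, sent_bytes: int,
                       congestion_reported: bool) -> bool:
    """Return True if the initial data confirms the saved path information.

    All three conditions must hold: the RTT resembles the saved RTT, all
    initial data was acknowledged, and no congestion was reported.
    """
    rtt_ok = current_rtt < 1.2 * saved_rtt
    all_acked = acked_bytes >= sent_bytes
    return rtt_ok and all_acked and not congestion_reported
```

With a saved_rtt of 0.6 seconds, a current_rtt of 0.65 seconds passes the check, whereas a current_rtt of 0.80 seconds does not.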
When used in a controlled network, additional information about local path characteristics could be known, which might be used to configure a non-standard IW.¶
Paths can change with respect to time for many reasons. This could result in previously measured CC parameters becoming irrelevant.¶
{NOTE: A future revision of this document needs to specify how long CC Parameters can be cached, possibly based on TCP-new-CWV or TCB}.¶
Reconnaissance Phase:¶
{XXX-Editor-note: RTT check should be a range rather than an inequality (current_rtt < 1.2*saved_rtt).}¶
This section defines the safety requirements for using saved CC parameters to update the cwnd. These safety guidelines are designed to mitigate the risk that a sender adds excessive congestion to an already congested path.¶
The method needs to be designed to avoid sending excessive data into a congested bottleneck, because this can have a material impact on any flows sharing that bottleneck, and the ability of those flows to control their own sending rate.¶
{NOTE: A later revision needs to define how to decide a significant change.}¶
The network conditions for the same path can also change over time. Bottleneck bandwidth and network traffic can change at any time. An Internet method needs to be robust to network conditions that can differ from one connection to the next, due to variations in the forwarding path, reconfiguration of equipment or changes in the link conditions.¶
The sender needs to avoid sending a burst of packets as a result of a step-increase in the congestion window [RFC8085], [RFC9000]. Various modifications to the sender to avoid line-rate bursts have been suggested (e.g., [I-D.hughes-restart]). Pacing the packets as a function of the current_rtt can provide this additional safety during the unvalidated period.¶
Identifying a relevant pacing rhythm:¶
The sender estimates a pacing rhythm using saved_rtt and saved_bb. The Inter-packet Transmission Time (ITT) is determined from the ratio between the current Maximum Message Size (MMS) and the ratio between the saved_bb and saved_rtt. A tunable safety margin can avoid sending more than a recommended maximum IW (recom_iw):¶
This follows the idea presented in [RFC4782], [I-D.irtf-iccrg-sallantin-initial-spreading] and [CONEXT15].¶
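A pacing rhythm of this kind can be illustrated by spreading a window of data evenly over one RTT. The sketch below is one possible reading, not the formula of this document: it assumes the capacity is expressed as a window in bytes, it treats recom_iw as an optional cap on that window, and the function and argument names are hypothetical.

```python
from typing import Optional


def inter_packet_time(window_bytes: float, rtt: float, mms: float,
                      recom_iw: Optional[float] = None) -> float:
    """Return an Inter-packet Transmission Time (ITT) in seconds.

    The ITT is the MMS divided by the pacing rate, where the pacing rate
    is the window divided by the RTT. If recom_iw is given, the window is
    first capped to it (an assumed form of the tunable safety margin).
    """
    if window_bytes <= 0 or rtt <= 0 or mms <= 0:
        raise ValueError("window_bytes, rtt and mms must be positive")
    if recom_iw is not None:
        window_bytes = min(window_bytes, recom_iw)
    pacing_rate = window_bytes / rtt  # bytes per second
    return mms / pacing_rate
```

For example, a window of 120,000 bytes, an RTT of 0.5 seconds and an MMS of 1,200 bytes give one packet every 5 milliseconds; capping the window at 60,000 bytes doubles the spacing to 10 milliseconds.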
This section defines the safety requirements after a path change or congestion has been detected during the Unvalidated Phase.¶
The transport parameters are adjusted in the Unvalidated Phase, resulting in a higher cwnd. If there are indications of congestion, this also indicates that the parameters no longer reflect the current path, and the cwnd needs to be reduced to avoid overshoot of the bottleneck capacity. This can result from changes in traffic at the bottleneck and/or changes in the path capacity.¶
{XXX-Editor note: A later revision will guide on the mitigation after detected congestion.}¶
The CC controller returns to the Normal Phase.¶
{XXX-Editor note: A later revision will guide on the entering normal CC.}¶
{XXX-Editor note: It would be good to have a discussion about updating the saved values, whether used or not, after reaching normal operation, for use the next time, even if that update is to just refresh the expiration time.}¶
The authors would like to thank John Border, Gabriel Montenegro, Patrick McManus, Ian Swett, Igor Lubashev, Robin Marx, Roland Bless and Franklin Simo for their fruitful comments on earlier versions of this document.¶
The authors would like to particularly thank Tom Jones for co-authoring previous versions of this document.¶
{XXX-Editor note: Text is required to register any IANA Considerations.}¶
This document does not exhibit specific security considerations since only sender-level considerations are proposed. Security considerations for the interactions with the receiver are discussed in [I-D.kuhn-quic-bdpframe-extension].¶
This appendix proposes an Endpoint Token that allows a sender to identify its own view of the network path that it is using. In [I-D.kuhn-quic-bdpframe-extension], this Endpoint Token could be shared with other parties as an opaque path identifier, and the sender can verify whether it corresponds to one of its current paths.¶
When computing the Endpoint Token, the sender includes information to identify the path on which it sends, for example:¶
When creating an Endpoint Token, the sender has to ensure the following:¶
Previous individual submissions were discussed in TSVWG and QUIC.¶