Careful convergence of congestion control from retained state with QUIC

Internet-Draft	Careful congestion control convergence	March 2023
Kuhn, et al.	Expires 4 September 2023

Abstract

This document discusses careful convergence of Congestion Control (CC) in QUIC, providing a cautious method that enables fast startup in a wide range of connections: reconnections using previous transport security credentials (0-RTT context), reconnections between two peers (prior knowledge of transport context), and application-limited traffic.¶

The method provides QUIC with transport services that resemble those currently available in TCP, such as TCP Control Block (TCB) [RFC9040] caching or updates to support application-limited traffic.¶

The method reuses a set of computed CC parameters that are based on the previously observed path characteristics between the same pair of transport endpoints, such as the bottleneck bandwidth, available capacity, or the RTT. These parameters are stored, allowing them to be used later to modify the CC behavior of a subsequent connection. The document also discusses assumptions and defines requirements around how a sender utilizes these parameters to provide opportunities for a new connection to more quickly get up to speed (i.e., utilize the available capacity). It discusses how these changes impact the capacity at a shared network bottleneck and the safe response that is needed after any indication that the new rate is inappropriate.¶

1. Introduction

All Internet transports are required to either use a CC method or constrain their rate of transmission [RFC8085]. In 2010, a survey of alternative CC methods [RFC5783] noted that there are challenges when a CC operates across an Internet path with a high and/or variable bandwidth-delay product (BDP).¶

A CC method typically takes time to ramp up the packet rate, called the "slow-start phase", informally known as the time to "get up to speed". This slow-start phase is a period in which a sender intentionally uses less capacity than might be available, with the intention to avoid or limit overshooting the actual capacity at a bottleneck. Overshooting the capacity can result in increased queuing (latency/jitter) and/or congestion packet loss to the flow, and can also have a detrimental effect on other flows sharing a common bottleneck. In the extreme case, persistent congestion could result in unwanted starvation of other flows [RFC8867] (i.e., preventing other flows from successfully sharing a common bottleneck).¶

This document specifies a method that can improve performance by reducing the time to get up to speed, and hence can reduce the total duration of a transfer. It introduces an alternative method to select initial CC parameters, including a way to more rapidly and safely grow the congestion window (cwnd). This method is based on temporal sharing (sometimes known as caching) of a set of computed CC parameters that relate to a previously observed path, such as the bottleneck bandwidth, available capacity, and RTT. These parameters are stored and used to modify the CC behaviour of a subsequent connection between the same local and remote endpoints.¶

1.1. Using the Information with Care

Care is needed in the use of any temporal information to assure safe use of the Internet and to be robust to changes in traffic patterns, network routing and link/node failures. There are also cases where using the parameters of a previous connection is not appropriate, and there is a need to evaluate the potential for malicious use of the method. The specification for the QUIC transport protocol [RFC9000] therefore notes "Generally, implementations are advised to be cautious when using previous values on a new path."¶

1.2. Receiver Preference

Whilst a sender could take optimization decisions without considering the receiver's preference, there are cases where a client at the receiver could have information that is not available at the sender. In these cases, a client could explicitly ask for tuning the slow start when the application continues transmission, or ask to inhibit tuning. Examples where this could have benefit include:¶

when a receiver understands the pattern of traffic that a connection will use (e.g., the volume of data to be sent, the length of the session, or the maximum transfer rate required);¶
when a receiver has a local indication that the path/local interface has changed since CC parameters were stored;¶
when there is information related to the current hardware limitations at the receiver;¶
where the receiver understands the capacity that will be needed for other concurrent flows that might be expected to share the capacity of the path.¶

A related document complements this CC method by allowing the sender-generated transport information to be stored at the receiver [I-D.kuhn-quic-bdpframe-extension]. This enables a receiver to implement a policy that informs a sender whether the receiver desires the sender to reuse the CC parameters. By transferring the information to a receiver, it also releases the sender from needing to retain CC parameter state for each receiver.¶

1.3. Examples of Scenarios of Interest

This section provides a set of examples where the method is expected to improve performance.¶

QUIC introduces the concept of transport parameters (Section 4 of [RFC9000]). The present document adds to this by noting that a new connection can utilize a set of key transport parameters from a previous connection to reduce the completion time for a transfer. This is expected to have benefit when the transfer is significantly larger than the IW, and the BDP is also significantly more than the IW. This benefit is particularly evident for a path where the RTT is much larger than for typical Internet paths.¶

The method can be used by a sender performing a unidirectional data transfer towards the receiver (e.g., a receiver downloading a file or a web page). This applies to a CC that sends data to a remote endpoint and that remote endpoint resumes the connection, which is the focus of the current version of the document.¶

Both endpoints can assume the role of a sender or a receiver. Receivers can therefore also perform a bidirectional data transfer, where both endpoints simultaneously send data to each other (e.g., remote execution of an application, or a bidirectional video conference call).¶

Examples where temporal sharing of CC parameters can eliminate round-trip times at the start of a new connection include the following:¶

where an application uses a series of connections over the same path (each connection which otherwise would need to individually discover the CC parameters);¶
where an application resumes using capacity after a pause in transmission (an application that pauses would otherwise need to discover new CC parameters each time it connects over the same path);¶
where an application reconnects after a disruption that had temporarily reduced the path capacity (e.g., after a link propagation impairment, or where a user on a train journey travels through different areas of connectivity before the endpoint returns to use a path with the original characteristics).¶

1.3.1. A Satellite Access Network Example

In a specific example of a high-BDP path, a satellite access network, it takes up to 9 seconds to complete a 5.3 MB transfer using standard CC, whereas using the specified method the transfer time could reduce to 4 seconds [IJSCN]; and the time to complete a 1 MB transfer could be reduced by 62% [MAPRG111]. Benefit is also expected for other sizes of transfer and for different path characteristics that also result in a path with high BDP.¶

1.3.2. Another Network Example

{XXX-Editor note: A future revision can provide other Path Examples here.}¶

3. The Phases of CC using Careful Resume

This section defines a series of phases that the CC algorithm moves through as a connection gets up to speed when it uses the Careful Resume method.¶

Observe Phase: During a previous connection, information about the specific path to an endpoint is saved. This is used to characterise the path and to indicate the capacity that was available. The current RTT (current_rtt), bottleneck bandwidth (current_bb) and current receiver Endpoint Token (current_endpoint_token) are stored as saved_rtt, saved_bb and saved_endpoint_token.¶
Reconnaissance Phase: When a sender resumes between the same pair of endpoints (i.e., the same path), it enters the Reconnaissance Phase. The sender only enters this phase when there are saved CC parameters for the same pair of endpoints and this information is currently valid (i.e., the parameters have not expired). When a method is provided (such as the BDP_Frame), a receiver can request the sender to not enter this phase. The sender sends initial data, limited by the Initial Window. The sender then measures the characteristics of the present path to confirm that it is consistent with the saved path information for the previously characterised path (including a similar RTT).¶
1. If the sender determines that the path RTT or the other saved path information are not consistent with the current path, then the sender continues using the standard CC, and enters the Normal Phase.¶
2. To ensure a sender avoids resuming under severely congested conditions, if any sent initial data was not correctly received, the sender continues using the standard CC, and enters the Normal Phase.¶
3. If the sender confirms both that the saved and current path information are consistent and that the sent initial data was correctly received, the sender enters the Unvalidated Phase.¶
Unvalidated Phase: In the Unvalidated Phase, a sender can utilize the saved path information to update its CC parameters. This phase allows a rate higher than that allowed by a traditional slow-start mechanism. The convergence towards the previous rate is expected to be faster, but should not be instantaneous, to avoid adding congestion to an already congested bottleneck. In this phase, the sender continues to check that the saved and current path information are consistent.¶
1. If a sender determines either that previous parameters are not valid (due to a detected change in the path) or congestion was experienced, then the sender needs to enter the Retreat Phase.¶
2. If acknowledgments show that the unvalidated rate was successfully used without inducing significant congestion to the path, then the sender is permitted to continue at the rate used in the Unvalidated Phase when it continues in the Normal Phase.¶
Retreat Phase: In the Retreat Phase, the sender stops using the saved CC parameters. This phase is designed to mitigate the impact on other flows that might have been sharing a congested bottleneck when in the Unvalidated Phase. The sender needs to re-initialise CC parameters to drain any queue built at the bottleneck during the Unvalidated Phase and allow other flows to then regain their share of the available capacity. This reaction differs from a traditional CC reaction to congestion, because in this case the capacity estimate was unvalidated. Saved CC parameters for this path should be removed, to prevent the parameters being used again with other flows.¶
1. The sender then enters the Normal Phase with re-initialised CC parameters.¶
Normal Phase: The sender resumes using the normal CC method.¶
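
The phase transitions listed above can be sketched as a small state machine. This is an illustrative sketch only: the `Phase` enum, the `next_phase` function and its boolean inputs are hypothetical names introduced here, and the function assumes it is called once the relevant acknowledgments for the current phase have been processed.

```python
from enum import Enum, auto


class Phase(Enum):
    OBSERVE = auto()          # previous connection: parameters are saved
    RECONNAISSANCE = auto()   # new connection: IW-limited, path is checked
    UNVALIDATED = auto()      # saved parameters in use, not yet confirmed
    RETREAT = auto()          # saved parameters abandoned after a problem
    NORMAL = auto()           # standard CC behaviour


def next_phase(phase, *, path_consistent=True, iw_data_acked=True,
               congestion_seen=False, unvalidated_data_acked=False):
    """Return the phase that follows `phase` for the given observations."""
    if phase is Phase.RECONNAISSANCE:
        # An inconsistent path or lost initial data falls back to standard CC.
        if not path_consistent or not iw_data_acked:
            return Phase.NORMAL
        return Phase.UNVALIDATED
    if phase is Phase.UNVALIDATED:
        # A path change or congestion forces a retreat.
        if not path_consistent or congestion_seen:
            return Phase.RETREAT
        # Acknowledged data validates the rate; continue in the Normal Phase.
        if unvalidated_data_acked:
            return Phase.NORMAL
        return Phase.UNVALIDATED
    if phase is Phase.RETREAT:
        # CC parameters are re-initialised before normal operation resumes.
        return Phase.NORMAL
    return phase
```

The sketch deliberately has no transition out of the Retreat Phase other than to the Normal Phase, matching the list above.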

4. Congestion Control Guidelines and Requirements

The sender is limited by any rate-limitation of the transport protocol with which the method is used. For QUIC this includes: flow control mechanisms or amplification attack prevention. In particular, a QUIC receiver may need to issue proactive MAX_DATA frames to increase the flow control limits of a connection that is started with this method.¶

4.1. Determining the Current Path Capacity in the Observe Phase

Congestion controllers, such as CUBIC or RENO, could estimate the saved_bb and current_bb values by utilizing a combination of the cwnd/flight_size and the minimum RTT. A different method could be used to estimate the same values when using a rate-based congestion controller, such as BBR [I-D.cardwell-iccrg-bbr-congestion-control].¶

(Observe Phase) The sender SHOULD NOT store and/or send CC parameter information related to an estimated bottleneck bandwidth (saved_bb) (see Section 2.3 for more details on bottleneck bandwidth definition), if the cwnd is not at least four times larger than the IW.¶
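
A minimal sketch of the Observe Phase rule follows. It assumes that saved_bb is expressed as a volume of data in bytes per RTT and that a window-based controller takes the smaller of cwnd and flight_size as a conservative estimate; both are assumptions of this sketch, not choices made by this document. The names `SavedParams` and `observe` are hypothetical.

```python
from typing import NamedTuple, Optional


class SavedParams(NamedTuple):
    saved_bb: int      # bytes per RTT (unit is an assumption of this sketch)
    saved_rtt: float   # seconds


def observe(cwnd: int, flight_size: int, min_rtt: float,
            iw: int) -> Optional[SavedParams]:
    """Return the parameters to store, or None when nothing should be stored."""
    # Do not store an estimate unless cwnd is at least four times the IW.
    if cwnd < 4 * iw:
        return None
    # Assumption: use the smaller of cwnd and flight_size so that an
    # unused or overshooting cwnd does not inflate the estimate.
    return SavedParams(saved_bb=min(cwnd, flight_size), saved_rtt=min_rtt)
```

For example, with an IW of 12000 bytes, a connection whose cwnd reached only 40000 bytes stores nothing, whereas one with a cwnd of 60000 bytes and a flight_size of 50000 bytes would store a saved_bb of 50000 bytes.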

4.2. Confirming the Path in the Reconnaissance Phase

The sender sends the first data limited by the IW; this is assumed to be a safe starting point for any path where there is no path information or congestion control information. This limit avoids adding excessive congestion to a potentially congested path.¶

The sender monitors reception of the IW data. If the path characteristics resemble those of a recent previous connection between the same pair of endpoints (i.e., current_rtt < 1.2*saved_rtt) and all data was acknowledged without reported congestion, the method permits the sender to utilize the saved_bb as an input to adapt current_bb to rapidly determine a new safe rate.¶

(Reconnaissance Phase) The sender MUST NOT send more than the IW in the first RTT of transmitted data [RFC9000].¶
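
The decision at the end of the Reconnaissance Phase can be sketched as below, using the provisional RTT test given in this section (current_rtt < 1.2*saved_rtt). The function name and its byte-count and boolean arguments are hypothetical; a real stack would derive them from its acknowledgment processing.

```python
def reconnaissance_ok(current_rtt: float, saved_rtt: float,
                      iw_bytes_sent: int, iw_bytes_acked: int,
                      congestion_reported: bool) -> bool:
    """Return True when the sender may proceed to the Unvalidated Phase."""
    # The RTT measured on the new connection resembles the saved RTT.
    rtt_ok = current_rtt < 1.2 * saved_rtt
    # All data sent within the IW was acknowledged.
    all_acked = iw_bytes_acked >= iw_bytes_sent
    # Loss or ECN CE marks reported for the IW data indicate congestion.
    return rtt_ok and all_acked and not congestion_reported
```

With a saved_rtt of 600 ms, a measured RTT of 610 ms passes this test, while a measured RTT of 800 ms does not.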

When used in a controlled network, additional information about local path characteristics could be known, which might be used to configure a non-standard IW.¶

4.3. Confirming the Path

Paths change with respect to time for many reasons. This could result in previously measured CC parameters becoming irrelevant.¶

Endpoint Token change: If the Endpoint Token changes (i.e., the saved_endpoint_token is different from the current_endpoint_token), the different Endpoint Token can be assumed as an indication of a different network path. This new path does not necessarily exhibit the same characteristics as the old one.¶
RTT change: A significant change in RTT might be an indication that the network conditions have changed. Since the CC information is directly impacted by the RTT, a significant change in the RTT is a strong indication that the previously estimated BDP parameters are likely to not be valid for the current path.¶
Lifetime of the information: The CC information is temporal. Frequent connections to the same Endpoint Token are likely to track changes, but long-term use of previous values is not appropriate.¶

{NOTE: A future revision of this document needs to specify how long CC Parameters can be cached, possibly based on TCP-new-CWV or TCB}.¶

(Reconnaissance Phase) The sender MUST compare the measured transport parameters (in particular current_rtt) of the new session with those of the previous session (in particular saved_rtt). The method MUST NOT be used when the path fails to be validated.¶
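
The three checks described in this section (Endpoint Token, RTT, and lifetime) could be combined as in the sketch below. The maximum lifetime is not specified by this document, so `max_age_s` is a hypothetical parameter, as are the `SavedPath` type, its `stored_at` field and the returned reason strings.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SavedPath:
    saved_rtt: float              # seconds
    saved_bb: int                 # bytes per RTT (unit is an assumption)
    saved_endpoint_token: bytes
    stored_at: float              # time of storage in seconds (hypothetical)


def check_saved_path(saved: SavedPath, current_endpoint_token: bytes,
                     current_rtt: float, now: float,
                     max_age_s: float) -> Optional[str]:
    """Return None when the saved parameters may be used, else the reason."""
    # A different Endpoint Token is taken as a different network path.
    if not hmac.compare_digest(saved.saved_endpoint_token,
                               current_endpoint_token):
        return "endpoint token changed"
    # The information is temporal; max_age_s is a placeholder lifetime.
    if now - saved.stored_at > max_age_s:
        return "saved parameters expired"
    # Provisional RTT test from this document (a range may replace it).
    if not current_rtt < 1.2 * saved.saved_rtt:
        return "RTT changed"
    return None
```

Returning a reason rather than a boolean is a design choice of the sketch; it makes it easier to log why a connection fell back to the standard CC.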

{XXX-Editor-note: RTT check should be a range rather than an inequality (current_rtt < 1.2*saved_rtt).}¶

4.4. Safety Requirements for the Unvalidated Phase

This section defines the safety requirements for using saved CC parameters.¶

{XXX-Editor note: The sender ought not to re-utilize all the capacity it previously used, to avoid starving other flows that started or increased their capacity after the last measurement. How strong should this be stated: ... MUST or SHOULD ... What safety factor is appropriate for the resuming sender? If using slow-start it would anyway double the rate on the next RTT, so is capacity/2 appropriate to initially try?}¶

The method needs to be designed to avoid sending excessive data into a congested bottleneck, because this can have a material impact on any flows sharing that bottleneck, and the ability of those flows to control their own sending rate.¶

(Unvalidated Phase) A new connection MUST NOT directly use the previously measured saved_rtt and saved_bb to simply initialize a new flow to resume sending at the same rate.¶

4.4.1. Variable Network Conditions - Choosing Careful Resume

The network conditions for the same path can also change over time. Bottleneck bandwidth and network traffic can change at any time. An Internet method needs to be robust to network conditions that can differ from one connection to the next, due to variations in the forwarding path, reconfiguration of equipment or changes in the link conditions.¶

(Unvalidated Phase) Careful Resume MUST be robust to changes in network traffic, including the arrival of new traffic flows that compete for the bottleneck capacity.¶
The sender MUST check the validity of any received saved_rtt and saved_bb parameters, whether these are sent by a receiver or are stored at the sender. The following events indicate cases where the use of these parameters is inappropriate:¶

{NOTE: A later revision needs to define how to decide a significant change.}¶

BB over-estimation: There are cases where using a measured cwnd would inflate the bottleneck bandwidth. At the end of the CC slow start phase, the value of cwnd can be significantly larger than the minimum value needed to utilize the path (i.e., cwnd overshoot). In most cases, the cwnd finally converges to a stable value after several more RTTs. It would be inappropriate to use an overshoot in the cwnd as a basis for estimating the bottleneck bandwidth. NOTE: One mitigation could be to further restrict to only a fraction (e.g., 1/2) of the previously used cwnd; another mitigation might be to calculate the bottleneck bandwidth based on the flight_size or an averaged cwnd.¶
Preventing Starvation of New Flows: It would not be appropriate to fully use a bottleneck bandwidth estimate based on a previous measurement of capacity, because new flows might have started using the available capacity since that measurement was made. The mitigation could be to restrict to only a fraction (e.g., 1/2) of the previously used cwnd.¶
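
The fraction-based mitigation mentioned in both items above could look like the following sketch. The fraction of 1/2 is only the example value from the text, not a decided value, and the lower bound at the IW is an assumption of the sketch. The function name is hypothetical, and saved_bb is assumed to be a volume of data in bytes per RTT.

```python
def unvalidated_window(saved_bb: int, iw: int, fraction: float = 0.5) -> int:
    """Return the window, in bytes, to try in the Unvalidated Phase."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the range (0, 1]")
    # Use only part of the saved estimate to leave room for flows that
    # started after the measurement and to absorb any cwnd overshoot.
    reduced = int(saved_bb * fraction)
    # Assumption: never go below what standard CC would allow anyway.
    return max(iw, reduced)
```

For a saved_bb of 100000 bytes and an IW of 12000 bytes, the sketch yields a window of 50000 bytes.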

These safety guidelines are designed to mitigate the risk that a sender adds excessive congestion to an already congested path. The following mechanisms help in fulfilling this objective:¶

(Unvalidated phase) The sender MUST NOT use the parameters when any of the first IW packets are detected as lost or acknowledgments indicate the packets were ECN CE-marked. These are indications of potential congestion and therefore the method MUST NOT be used;¶
(Unvalidated phase) The sender MUST implement the retreat method when packets are detected as lost or acknowledgments indicate the packets were ECN CE-marked. These are indications of potential congestion and therefore the method MUST NOT be used.¶

{XXX-Editor note: Decide on the mitigation for Starvation of New Flows.}¶

4.4.2. Pacing in Careful Resume

The following mechanisms could be implemented.¶

The sender needs to avoid sending a burst of packets as a result of a step-increase in the congestion window [RFC9000]. Pacing the packets as a function of the current_rtt can provide this additional safety during the unvalidated period.¶

Identify a relevant pacing rhythm:¶

The sender estimates a pacing rhythm using saved_rtt and saved_bb. The Inter-packet Transmission Time (ITT) is determined from the ratio between the current Maximum Message Size (MMS) and the ratio between the saved_bb and saved_rtt. A tunable safety margin can avoid sending more than a recommended maximum IW (recom_iw):¶
- current_iw = min(recom_iw, saved_bb)¶
- ITT = MMS/(current_iw/saved_rtt)¶
A successful receipt of the IW data confirms the path can be used with the method specified in this document.¶
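
As a worked example of the two formulas above, the sketch below computes the ITT. It assumes that saved_bb and recom_iw are expressed in bytes, saved_rtt in seconds, and that `packet_size` is the maximum packet size used in the ITT formula; the function name and the example values are hypothetical.

```python
def pacing_itt(saved_bb: int, saved_rtt: float, recom_iw: int,
               packet_size: int) -> float:
    """Return the inter-packet transmission time in seconds."""
    if saved_rtt <= 0 or saved_bb <= 0 or recom_iw <= 0:
        raise ValueError("saved_rtt, saved_bb and recom_iw must be positive")
    # Safety margin: never pace for more than the recommended maximum IW.
    current_iw = min(recom_iw, saved_bb)
    # Bytes per second when current_iw is spread evenly over one RTT.
    rate = current_iw / saved_rtt
    return packet_size / rate
```

With saved_bb = 600000 bytes, saved_rtt = 0.6 s, recom_iw = 120000 bytes and 1200-byte packets, current_iw is 120000 bytes, the pacing rate is 200000 bytes per second, and the ITT is about 6 ms, so the 100 packets are spread across one RTT instead of being sent as a burst.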

This follows the idea presented in [RFC4782], [I-D.irtf-iccrg-sallantin-initial-spreading] and [CONEXT15].¶

4.5. Safety Requirements for the Retreat Phase

This section defines the safety requirements after a path change or congestion is detected in the Unvalidated Phase.¶

After transport parameters are set to a previously estimated bottleneck bandwidth, if the slow-start mechanisms continue with parameters set by Careful Resume, the sender might then overshoot the bottleneck capacity. This can occur even when using the safety check described in this section.¶

4.5.1. Variable Network Conditions - Mitigating Mistakes

The impact of a mistaken decision to use Careful Resume can be mitigated by 2 potential solutions:¶

When resuming, restore the current_bb and current_rtt from the saved_bb and saved_rtt CC parameters estimated from a previous connection.¶
When resuming, implement a safety check to confirm that using the saved_bb and saved_rtt CC parameters does not cause congestion over the path. In this case, the current_bb and current_rtt might not be set directly from the saved_bb and saved_rtt: the sender might wait for the completion of the safety check before this is done.¶

{XXX-Editor note: Decide on the mitigation after detected congestion.}¶

4.6. Returning to Normal Congestion Control

At the end of Careful Resume, the CC controller returns to the Normal Phase.¶

For NewReno and CUBIC, it is recommended to exit slow-start and enter the congestion avoidance phase.¶
For BBR CC, it is recommended to enter the "probe bandwidth" state.¶

8. References

8.1. Normative References

[RFC2119]: Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC4782]: Floyd, S., Allman, M., Jain, A., and P. Sarolahti, "Quick-Start for TCP and IP", RFC 4782, DOI 10.17487/RFC4782, January 2007, <https://www.rfc-editor.org/info/rfc4782>.
[RFC8085]: Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, March 2017, <https://www.rfc-editor.org/info/rfc8085>.
[RFC8174]: Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8801]: Pfister, P., Vyncke, É., Pauly, T., Schinazi, D., and W. Shao, "Discovering Provisioning Domain Names and Data", RFC 8801, DOI 10.17487/RFC8801, July 2020, <https://www.rfc-editor.org/info/rfc8801>.
[RFC9000]: Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, May 2021, <https://www.rfc-editor.org/info/rfc9000>.
[RFC9040]: Touch, J., Welzl, M., and S. Islam, "TCP Control Block Interdependence", RFC 9040, DOI 10.17487/RFC9040, July 2021, <https://www.rfc-editor.org/info/rfc9040>.

8.2. Informative References

[CONEXT15]: Li, Q., Dong, M., and P. B. Godfrey, "Halfback: Running Short Flows Quickly and Safely", ACM CoNEXT, 2015.
[I-D.cardwell-iccrg-bbr-congestion-control]: Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. Jacobson, "BBR Congestion Control", Work in Progress, Internet-Draft, draft-cardwell-iccrg-bbr-congestion-control-02, 7 March 2022, <https://datatracker.ietf.org/doc/html/draft-cardwell-iccrg-bbr-congestion-control-02>.
[I-D.irtf-iccrg-sallantin-initial-spreading]: Sallantin, R., Baudoin, C., Arnal, F., Dubois, E., Chaput, E., and A. Beylot, "Safe increase of the TCP's Initial Window Using Initial Spreading", Work in Progress, Internet-Draft, draft-irtf-iccrg-sallantin-initial-spreading-00, 15 January 2014, <https://datatracker.ietf.org/doc/html/draft-irtf-iccrg-sallantin-initial-spreading-00>.
[I-D.kuhn-quic-bdpframe-extension]: Kuhn, N., Emile, S., Fairhurst, G., Jones, T., and C. Huitema, "BDP Frame Extension", Work in Progress, Internet-Draft, draft-kuhn-quic-bdpframe-extension-00, 6 March 2022, <https://datatracker.ietf.org/doc/html/draft-kuhn-quic-bdpframe-extension-00>.
[IJSCN]: Thomas, L., Dubois, E., Kuhn, N., and E. Lochin, "Google QUIC performance over a public SATCOM access", International Journal of Satellite Communications and Networking 10.1002/sat.1301, 2019.
[MAPRG111]: Kuhn, N., Stephan, E., Fairhurst, G., Jones, T., and C. Huitema, "Feedback from using QUIC's 0-RTT-BDP extension over SATCOM public access", IETF 111 - MAPRG meeting, 2022.
[RFC5783]: Welzl, M. and W. Eddy, "Congestion Control in the RFC Series", RFC 5783, DOI 10.17487/RFC5783, February 2010, <https://www.rfc-editor.org/info/rfc5783>.
[RFC8867]: Sarker, Z., Singh, V., Zhu, X., and M. Ramalho, "Test Cases for Evaluating Congestion Control for Interactive Real-Time Media", RFC 8867, DOI 10.17487/RFC8867, January 2021, <https://www.rfc-editor.org/info/rfc8867>.

Appendix A. Annexe: An Endpoint Token

This appendix proposes an Endpoint Token to allow a sender to identify its own view of the network path that it is using. In [I-D.kuhn-quic-bdpframe-extension], this Endpoint Token could be shared with other parties and used as an opaque path identifier, and the sender can verify whether it corresponds to one of its current paths.¶

A.1. Creating an Endpoint Token

When computing the Endpoint Token, the sender includes information to identify the path on which it sends, for example:¶

it must include a unique identifier for itself (e.g., a globally assigned address/prefix; or randomly chosen value);¶
it must include an identifier for the destination (e.g., a destination IP address or name);¶
it should include an interface identifier (e.g., an index value or a MAC address to associate the endpoint with the interface on which the path starts);¶
it could include other information such as the DSCP, ports, flow label, etc. (recognising that this additional information might improve the path differentiation, but that this can reduce the re-usability of the token);¶
it could include any other information the sender chooses to include, potentially including PvD information [RFC8801] or information relating to its public-facing IP address;¶
it could include a nonce;¶
it could include a time-dependent value to define the validity period of the token.¶

When creating an Endpoint Token, the sender has to ensure the following:¶

To reduce the likelihood of misuse of the Endpoint Token, the value should be encoded in a way that hides the component information from the recipient and any eavesdropper on the path.¶
The sender can recalculate the Endpoint Token if it needs to validate a previously issued token; and that the Endpoint Token itself can be included in the computed integrity check for any path information it provides.¶
The Endpoint Token is designed so that, if shared, it prevents another party from deriving private data from the token, or from using the token to perform unwanted linking with other information. This implies that the Endpoint Token MUST necessarily be different when used to identify different interfaces.¶
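
One possible way to meet these properties is a keyed hash over the path components, sketched below. This document does not define the token construction: the use of HMAC-SHA-256, the sender-local `secret`, the length-prefixed encoding and the function names are all assumptions of this sketch.

```python
import hashlib
import hmac


def make_endpoint_token(secret: bytes, sender_id: bytes, destination: bytes,
                        interface_id: bytes = b"") -> bytes:
    """Return an opaque token derived from the path components."""
    # The secret never leaves the sender, so a recipient or eavesdropper
    # cannot recover the components from the token.
    mac = hmac.new(secret, digestmod=hashlib.sha256)
    for field in (sender_id, destination, interface_id):
        # Length-prefix each field so distinct inputs cannot collide
        # (fields are assumed to be shorter than 65536 bytes).
        mac.update(len(field).to_bytes(2, "big"))
        mac.update(field)
    return mac.digest()


def token_matches(secret: bytes, token: bytes, sender_id: bytes,
                  destination: bytes, interface_id: bytes = b"") -> bool:
    """Recalculate the token and compare it with a previously issued one."""
    expected = make_endpoint_token(secret, sender_id, destination, interface_id)
    return hmac.compare_digest(expected, token)
```

Because the interface identifier is part of the input, the same sender and destination yield different tokens on different interfaces, and the sender can validate a returned token by recalculating it rather than by storing it.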
