TCPM Working Group | G. Fairhurst |
Internet-Draft | I. Biswas |
Intended status: Standards Track | University of Aberdeen |
Expires: September 11, 2011 | March 10, 2011 |
Updating TCP to support Variable-Rate Traffic
draft-fairhurst-tcpm-newcwv-01
This document addresses issues that arise when TCP is used to support variable-rate traffic that includes periods where the transmission rate is limited by the application. It evaluates TCP Congestion Window Validation (TCP-CWV), an IETF experimental specification defined in RFC 2861, and concludes that TCP-CWV sought to address important issues, but failed to deliver a widely used solution.
The document recommends that the IETF should consider moving RFC 2861 from Experimental to Historic status, and replacing this with the current specification, which updates TCP to allow a TCP sender to restart quickly following either an idle or data-limited period. The method is expected to benefit variable-rate TCP applications, while also providing an appropriate response if congestion is experienced.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 11, 2011.
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The TCP congestion window (cwnd) controls the number of packets a TCP flow may have in the network at any time. A bulk application that always sends as fast as possible will continue to grow the cwnd and increase its transmission rate until it reaches the maximum permitted by the receiver and congestion windows. In contrast, a variable-rate application may experience long periods when the sender is either idle or application-limited. This document focuses on the operation of TCP in such idle or application-limited cases.
Standard TCP [RFC5681] requires the cwnd to be reset to the restart window (RW) when an application becomes idle. RFC 2861 noted that this behaviour was not always observed in current implementations. Recent experiments [Bis08] confirm that this is still the case. Standard TCP does not control growth of the cwnd when the TCP sender is application-limited. An application-limited sender may therefore grow a cwnd that does not reflect any current information about the state of the network. Use of an invalid cwnd may result in reduced application performance or could significantly contribute to network congestion. These issues were noted in [RFC2861].
CWV proposed a solution to help reduce the cases where TCP experienced an invalid cwnd. The use and drawbacks of CWV are discussed in Section 2.
Section 4 discusses an alternative to CWV that seeks to address the same issues, but does so in a way that is expected to mitigate the impact on an application that varies its transmission rate. The method described applies to both an application-limited and an idle condition.
RFC 2861 described a simple modification to the TCP congestion control algorithm that decayed the cwnd after the transition from a “sufficiently-long” application-limited period, while using the slow-start threshold (ssthresh) to save information about the previous value of the congestion window. This approach relaxed the standard TCP behaviour [RFC5681] for an idle session, intended to improve application performance. CWV did not modify the behaviour for an application-limited session where a sender continues to transmit at a rate less than allowed by cwnd.
RFC 2861 has been implemented in some mainstream operating systems as the default behaviour [Bis08]. Analysis (e.g. [Bis10]) has shown that CWV is able to use the available capacity after an idle period over a shared path, and that this can be of benefit, especially over long-delay paths, when compared to the slow-start restart specified by standard TCP. However, this behaviour can be too conservative to be attractive to many common variable-rate applications. Experience from using applications with CWV suggests that the mechanism therefore does not offer the desired increase in application performance for variable-rate applications, and it is unclear whether applications actually use the mechanism.
CWV offers a benefit, compared to standard TCP, for an application that exhibits regular idleness. However, CWV would only benefit the application if the idle period were less than several RTOs, since the behaviour would otherwise be the same as for standard TCP, which resets cwnd to the RW. Although CWV benefits the network in an application-limited scenario, the conservative approach of CWV does not provide an incentive for an application to use it. It is therefore suggested that CWV is often a poor solution for many variable-rate applications. In summary, CWV has the correct motivation, but takes the wrong approach to solving this problem.
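The difference between the two idle responses can be illustrated with a short Python sketch. It is not taken from either specification: it assumes the restart window RW = min(IW, cwnd) of RFC 5681 and a simplified reading of the RFC 2861 idle decay (cwnd halved once for each RTO of idle time, with a floor of one MSS, and ignoring the receiver window and the ssthresh update). The function names and the byte values used below are illustrative assumptions.

```python
def standard_restart(cwnd, iw, idle_time, rto):
    """Standard TCP idle response: restart from RW after one RTO or more of idleness."""
    if idle_time >= rto:
        return min(iw, cwnd)  # RW = min(IW, cwnd), per RFC 5681
    return cwnd


def cwv_decay(cwnd, mss, idle_time, rto):
    """Simplified CWV idle response (assumption): halve cwnd once per elapsed RTO."""
    for _ in range(int(idle_time // rto)):
        cwnd = max(cwnd // 2, mss)  # never decay below one segment
    return cwnd


MSS = 1460           # bytes, illustrative
IW = 3 * MSS         # illustrative initial window
CWND = 64 * MSS      # cwnd held before the idle period

after_short_idle = cwv_decay(CWND, MSS, idle_time=2.0, rto=1.0)
after_long_idle = cwv_decay(CWND, MSS, idle_time=10.0, rto=1.0)
restart_window = standard_restart(CWND, IW, idle_time=2.0, rto=1.0)
```

With these numbers, two RTOs of idleness leave the CWV sender with a quarter of its previous cwnd, while the standard sender restarts from RW. After ten RTOs the decayed cwnd is no larger than RW, which is the case where CWV offers no advantage.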
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
The document also assumes familiarity with the terminology of TCP congestion control [RFC5681].
This section proposes an update to the TCP congestion control behaviour during an idle or application-limited period. The new method allows a TCP sender to preserve the cwnd when an application becomes idle for a period of time (set in this specification to 6 minutes). This allows an application to resume transmission at a previous rate without incurring the delay of slow-start. The period where actual usage is less than allowed by cwnd is named the non-validated phase. However, if the TCP sender experiences congestion using the preserved cwnd, it is required to immediately reset the cwnd to an appropriate value. If a sender does not take advantage of the preserved cwnd within 6 minutes, the value of cwnd is updated, ensuring the value then reflects the capacity that was recently used.
The new method does not differentiate between times when the sender has become idle or application-limited. It recognises that applications can generate variable-rate transmission. This reduces the incentive for an application to send data simply to keep transport congestion state. The method requires SACK to be enabled. This allows a sender to select a cwnd following a congestion event that is based on the measured path capacity, better reflecting the fair share. A similar approach was proposed by TCP Jump Start [Liu07], as a congestion response after more rapid opening of a TCP connection.
It is expected that the proposed TCP update will satisfy the requirements of many variable-rate applications and at the same time provide an appropriate method for use in the Internet. This change may also encourage applications to use standards-based congestion control methods.
The method described in this document updates RFC 5681. A TCP sender and the corresponding receiver that use the method are REQUIRED to enable the SACK option [RFC3517].
RFC 5681 defines a variable, FlightSize, that indicates the amount of outstanding data in the network. In RFC 5681 this is used during loss recovery, whereas in this method it is also used during normal data transfer. A sender is not required to continuously track this value, but SHOULD measure the volume of data in the network with a sampling period of not less than one RTT.
The updated method creates a new TCP phase that captures whether the cwnd reflects a validated or non-validated value. The phases are defined as:
A TCP sender that enters the non-validated phase MUST preserve the cwnd (i.e., this neither grows nor reduces while the sender remains in this phase). The phase is concluded after a fixed period of time (6 minutes, as explained in section 4.4) or when the sender transmits using the full cwnd (i.e. it is no longer application-limited).
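The two exit conditions of the phase can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not the specified algorithm: the `Sender` type, the function names, the use of seconds and bytes, and the choice of `>=` for the 6-minute comparison are all hypothetical, and the condition for entering the phase is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional

NVP_LIMIT_SECONDS = 6 * 60  # the 6-minute limit given in the text


@dataclass
class Sender:
    cwnd: int                               # bytes; preserved while non-validated
    nvp_entered_at: Optional[float] = None  # None means the validated phase


def enter_non_validated(s: Sender, now: float) -> None:
    """Record the entry time; the entry condition itself is decided by the caller."""
    if s.nvp_entered_at is None:
        s.nvp_entered_at = now


def phase_concluded(s: Sender, flight_size: int, now: float) -> bool:
    """True when the timer expires or the sender is using the full cwnd."""
    if s.nvp_entered_at is None:
        return False
    timed_out = (now - s.nvp_entered_at) >= NVP_LIMIT_SECONDS
    using_full_cwnd = flight_size >= s.cwnd  # no longer application-limited
    return timed_out or using_full_cwnd
```

Neither function changes `cwnd`, which reflects the requirement that the window neither grows nor reduces while the sender remains in the phase.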
The behaviour in the non-validated phase is specified as:
The threshold value of cwnd required to enter the non-validated phase is intentionally different from that required to leave the phase. This introduces hysteresis to avoid rapid oscillation between the phases. Note that the change between phases does not significantly impact an application-limited sender.
During the non-validated phase, an application may produce bursts of data of up to the cwnd in size. This is no different from normal TCP. However, it is desirable to control the maximum burst size, e.g. by setting a burst size limit, using a pacing algorithm, or some other method.
A sender that remains in the non-validated phase for a period greater than six minutes is required to adjust its congestion control state.
At the end of the non-validated phase, the sender MUST update cwnd:
cwnd = max(FlightSize*2, IW).
Where IW is the TCP initial window [RFC5681].
(The value for cwnd was chosen to allow an application to continue to send at the currently utilised rate, and not incur delay should it increase to twice the utilised rate.)
The sender also MUST reset the ssthresh:
ssthresh = max(ssthresh, 3*cwnd/4).
This adjustment of ssthresh ensures that the sender records that it has safely sustained the present rate. The change is beneficial to application-limited flows that encounter occasional congestion, and could otherwise suffer an unwanted additional delay in recovering the transmission rate.
The sender MAY re-enter the non-validated phase, if required (see section 4.2).
Reception of congestion feedback while in the non-validated phase, i.e., detection of a packet drop or receipt of an Explicit Congestion Notification (ECN), indicates that it was inappropriate for the sender to use the preserved cwnd. The sender is therefore required to quickly reduce the rate to avoid further congestion. Since cwnd does not reflect a validated value, a new cwnd value must be selected based on the utilised rate.
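The two updates made at the end of the phase can be sketched in Python as below. The function name, the byte units, and the integer division are assumptions. The text does not state whether the ssthresh formula uses the preserved cwnd or the newly computed one; this sketch assumes the preserved value, since that is the window the sender held while sustaining the present rate.

```python
def end_of_non_validated_phase(cwnd, ssthresh, flight_size, iw):
    """Return (new_cwnd, new_ssthresh) at the end of the phase.

    Assumption: ssthresh is derived from the preserved cwnd, i.e. the value
    held before the cwnd update is applied.
    """
    new_ssthresh = max(ssthresh, 3 * cwnd // 4)  # ssthresh = max(ssthresh, 3*cwnd/4)
    new_cwnd = max(flight_size * 2, iw)          # cwnd = max(FlightSize*2, IW)
    return new_cwnd, new_ssthresh


# Illustrative values in bytes: a sender that preserved a 100000-byte cwnd
# while keeping only 20000 bytes in flight.
new_cwnd, new_ssthresh = end_of_non_validated_phase(
    cwnd=100_000, ssthresh=30_000, flight_size=20_000, iw=4_380
)
```

In this example the cwnd falls to twice the utilised volume (40000 bytes) while ssthresh rises to 75000 bytes, so a later increase in rate can proceed in slow-start rather than congestion avoidance.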
When congestion is detected, the sender MUST calculate a safe cwnd:
cwnd = FlightSize - R.
Where R is the volume of data that was reported as unacknowledged by the SACK information. This follows the method proposed for Jump Start [Liu07].
At the end of the recovery phase, the TCP sender MUST reset the cwnd:
cwnd = (FlightSize/2).
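The two-step congestion response can be sketched in Python as follows. The sketch reads the first formula as FlightSize minus R, which is an assumption based on the definition of R and the reference to Jump Start. The function names, byte units, and integer division are also assumptions, and no lower bound on cwnd is applied because the text does not give one.

```python
def cwnd_on_congestion(flight_size, r):
    """Safe cwnd when congestion is detected in the non-validated phase.

    r is the volume of data reported as unacknowledged by SACK.
    Assumed reading of the formula: cwnd = FlightSize - R.
    """
    return flight_size - r


def cwnd_after_recovery(flight_size):
    """cwnd at the end of the recovery phase: cwnd = FlightSize/2."""
    return flight_size // 2


# Illustrative values in bytes: 60000 in flight, of which SACK reports
# 15000 as not having been received.
safe_cwnd = cwnd_on_congestion(flight_size=60_000, r=15_000)
final_cwnd = cwnd_after_recovery(flight_size=45_000)
```

The FlightSize passed to the second function is whatever the sender measures when recovery ends; the value used here is only an example.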
Setting a limit to the period that cwnd is preserved avoids the undesirable side effects that would result if the cwnd were to be preserved for an arbitrarily long period, which was a part of the problem that CWV originally attempted to address. The period for which a sender may safely preserve the cwnd is a function of the period that a network path is expected to sustain the capacity reflected by cwnd. There is no perfect choice for this time. The period of 6 minutes was chosen as a compromise that was larger than the idle intervals of common applications, but not significantly larger than the period for which an Internet path may commonly be regarded as stable.
The capacity of wired networks is usually relatively stable for periods of several minutes, and load stability increases with the capacity. This suggests that cwnd may be preserved for at least a few minutes.
There are cases where the TCP throughput exhibits significant variability over a time less than 6 minutes. Examples could include many wireless topologies, where the TCP rate may fluctuate on the order of a few seconds as a consequence of medium access protocol instabilities. Mobility changes may also impact TCP performance over short time scales. Senders that observe such rapid changes in the path characteristic may also experience increased congestion with the new method. However, such variation would likely also impact TCP’s behaviour when supporting interactive and bulk applications.
Routing algorithms may modify the network path, disrupting the RTT measurement and changing the capacity available to a TCP connection. However, such changes do not often occur within a time frame of a few minutes.
The value of 6 minutes is expected to be sufficient for most current applications. Simulation studies also suggest that for most practical applications, the performance using this value will not be significantly different to that observed using a non-standard method that does not reset cwnd after idle.
General security considerations concerning TCP congestion control are discussed in RFC 5681. This document describes an algorithm that updates one aspect of those congestion control procedures, and so the considerations described in RFC 5681 also apply to this algorithm.
None.
The authors acknowledge the contributions of Dr A Sathiaseelan and Dr R Secchi in supporting the evaluation of CWV and for their help in developing the protocol proposed in this draft.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. |
[RFC2861] | Handley, M., Padhye, J. and S. Floyd, "TCP Congestion Window Validation", RFC 2861, June 2000. |
[RFC3517] | Blanton, E., Allman, M., Fall, K. and L. Wang, "A Conservative Selective Acknowledgment (SACK)-based Loss Recovery Algorithm for TCP", RFC 3517, April 2003. |
[RFC5681] | Allman, M., Paxson, V. and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009. |