Internet-Draft scetcp November 2019
Grimes & Heist Expires 7 May 2020
Workgroup:
TCP Maintenance and Minor Extensions
Internet-Draft:
draft-grimes-tcpm-tcpsce-01
Published:
November 2019
Intended Status:
Experimental
Expires:
7 May 2020
Authors:
R.W. Grimes
P. Heist

Some Congestion Experienced in TCP

Abstract

This memo defines a TCP code point, ESCE ("Echo Some Congestion Experienced"), for use in feedback of the IP code point SCE ("Some Congestion Experienced").

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 7 May 2020.

1. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] and [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Introduction

This memo requests a TCP header codepoint for use as ESCE.

This memo limits its scope to the definition of the TCP codepoint ESCE, with a few brief illustrations of how it may be used.

SCE provides early and proportional feedback to the CC (congestion control) algorithms for transport protocols, including but not limited to TCP. The [sce-repo] is a Linux kernel modified to support SCE, including:

3. Background

[I-D.morton-tsvwg-sce] defines the IP SCE codepoint.

4. TCP Receiver

The mechanism defined to feed back SCE signals to the sender explicitly makes use of the ESCE ("Echo Some Congestion Experienced") code point in the TCP header.

4.1. Single ACK implementation

Upon receipt of a packet, an ACK is immediately generated and the SCE codepoint is copied into the ESCE codepoint of the ACK. This keeps the count of SCE-marked and unmarked bytes properly reflected in the ACK packet(s). This implementation is valid but has the downside of increasing ACK traffic. It is NOT RECOMMENDED, but is useful for experimental work.

4.2. Simple Delayed ACK implementation

Upon receipt of a packet without an SCE codepoint, traditional delayed ACK processing is performed. Upon receipt of a packet with an SCE codepoint, immediate ACK processing SHOULD be done. This allows some delaying of ACKs while providing earlier feedback of the congested state. It has the negative effect of over-signalling ESCE, because an immediate ACK carrying ESCE can also cover pending segments that arrived without SCE.
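The behaviour described in this section can be sketched as follows. This is a minimal illustration rather than the reference implementation: the class and field names are hypothetical, the ACK-every-second-segment threshold is an assumption, and the delayed-ACK timer is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ack:
    segments_acked: int  # number of data segments this ACK covers
    esce: bool           # value of the ESCE flag carried by the ACK


class SimpleDelayedAckReceiver:
    """Hypothetical receiver model; the delayed-ACK timer is omitted."""

    def __init__(self, ack_every: int = 2) -> None:
        self.ack_every = ack_every  # assumed delayed-ACK threshold
        self.pending = 0            # segments received but not yet ACKed

    def on_segment(self, sce: bool) -> Optional[Ack]:
        self.pending += 1
        if sce or self.pending >= self.ack_every:
            # An SCE mark forces an immediate ACK. That ACK also covers
            # any unmarked segment still pending, so more segments are
            # echoed with ESCE than arrived with SCE.
            ack = Ack(segments_acked=self.pending, esce=sce)
            self.pending = 0
            return ack
        return None
```

In this model, an unmarked segment followed by an SCE-marked segment produces a single ACK covering two segments with ESCE set, which illustrates the over-signalling.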

4.3. Dithered Delayed ACK implementation

Upon receipt of a packet, its SCE codepoint is stored in the TCP state. State for multiple packets may be stored. Upon generation of an ACK, normal or delayed, the stored SCE state is used to set the ESCE code point. If no SCE state is stored in the TCP state, the ESCE code point MUST NOT be set. If all of the packets to be ACKed have SCE state set, the ESCE code point MUST be set in the ACK. If only some of the packets to be ACKed have SCE state set, a proportional number of ACK packets SHOULD be sent with the ESCE code point set. Though this may defer an ESCE congestion signal when there is no next packet for some time, it is generally accepted that such sparse flows are not the source of congestion, and thus the delayed signal is of low impact. The goal is to have the same number of bytes marked with ESCE as arrived with SCE.
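One possible way to meet the byte-proportional goal is a running byte credit, sketched below. This is an assumption-laden illustration, not the method of the reference implementation: the class name, its fields, and the rule of carrying unechoed SCE bytes forward to later ACKs are all hypothetical choices.

```python
class DitheredAckState:
    """Hypothetical per-connection state for byte-proportional ESCE echo."""

    def __init__(self) -> None:
        self.unacked_bytes = 0  # bytes received since the last ACK
        self.sce_bytes = 0      # SCE-marked bytes not yet echoed as ESCE

    def on_segment(self, length: int, sce: bool) -> None:
        """Record one received data segment and its SCE marking."""
        self.unacked_bytes += length
        if sce:
            self.sce_bytes += length

    def build_ack(self) -> bool:
        """Return the ESCE flag for an ACK covering all unacked bytes."""
        covered = self.unacked_bytes
        self.unacked_bytes = 0
        # Set ESCE only when enough SCE credit exists to account for
        # every byte this ACK covers; otherwise keep the credit for a
        # later ACK (assumed carry-over rule).
        esce = covered > 0 and self.sce_bytes >= covered
        if esce:
            self.sce_bytes -= covered
        return esce
```

With this rule, an ACK covering only unmarked data and no carried credit never sets ESCE, an ACK covering only marked data always does, and a mixed pattern sets ESCE on a matching fraction of ACKs. Any credit left over when a flow goes idle is the deferred signal discussed above.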

4.4. Advanced ACK implementation

The Advanced ACK implementation immediately flushes any pending ACKs up to the previous segment when the state of the SCE marking changes, allowing consecutive packets with the same SCE state to be coalesced by the normal delayed-ACK logic. The ACK volume is then inflated only slightly compared to an unmarked connection, and may actually involve fewer ACKs than a connection involving CE marks or losses, during which delayed ACKs are temporarily disabled.
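The flush-on-change behaviour can be sketched as below. The names are hypothetical, the delayed-ACK logic is reduced to an assumed "ACK every second segment" rule, and the delayed-ACK timer is left out, so this is an illustration of the idea rather than the reference code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Ack:
    segments_acked: int  # number of data segments this ACK covers
    esce: bool           # value of the ESCE flag carried by the ACK


class AdvancedAckReceiver:
    """Hypothetical receiver model that flushes on SCE state changes."""

    def __init__(self, ack_every: int = 2) -> None:
        self.ack_every = ack_every  # assumed delayed-ACK threshold
        self.pending = 0            # segments received but not yet ACKed
        self.pending_sce = False    # SCE state shared by pending segments

    def on_segment(self, sce: bool) -> List[Ack]:
        acks: List[Ack] = []
        if self.pending and sce != self.pending_sce:
            # SCE state changed: flush the ACK for the earlier segments
            # so that no single ACK mixes marked and unmarked data.
            acks.append(Ack(self.pending, self.pending_sce))
            self.pending = 0
        self.pending += 1
        self.pending_sce = sce
        if self.pending >= self.ack_every:
            # Normal delayed-ACK coalescing within a run of equal state.
            acks.append(Ack(self.pending, self.pending_sce))
            self.pending = 0
        return acks
```

Because every ACK in this model covers segments of a single SCE state, the number of segments echoed with ESCE equals the number that arrived with SCE, and extra ACKs are generated only at transitions.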

4.5. ACK Thinning

ACK thinning has been considered, given that [cake] includes an optional ack-filter which does thinning. We have, for example, added consideration of the ESCE bit to Cake's ack-filter. Mathematically, the most extreme errors possible in either direction due to ACK thinning are easily corrected during subsequent RTTs.

5. TCP Sender

The recommended response to each single segment marked with ESCE is to reduce cwnd by an amortised 1/sqrt(cwnd) segments. If the growth rate is greater than that provided by the Reno-linear algorithm (e.g., slow-start exponential or CUBIC polynomial), then the growth rate SHOULD also be reduced.
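As a rough illustration of the per-segment reduction, the sketch below applies 1/sqrt(cwnd) once for each ESCE-marked segment. The function name, the use of a fractional cwnd measured in segments, and the minimum window of 2 segments are assumptions made for the example; the adjustment of the growth rate is not modelled.

```python
import math

MIN_CWND = 2.0  # assumed floor in segments; not specified by this memo


def on_esce_segments(cwnd: float, marked_segments: int) -> float:
    """Apply the 1/sqrt(cwnd) reduction once per ESCE-marked segment.

    `cwnd` is in segments and may be fractional, so that the small
    per-mark reductions are amortised across many ACKs.
    """
    for _ in range(marked_segments):
        cwnd = max(MIN_CWND, cwnd - 1.0 / math.sqrt(cwnd))
    return cwnd
```

For example, with cwnd at 100 segments a single mark removes 0.1 segment, and a full window of 100 marked segments removes a little over 10 segments in total, which is on the order of sqrt(cwnd).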

Other responses, such as the 1/cwnd from DCTCP, are also acceptable but may perform less well.

There are no changes to the response functions with respect to CE or packet loss specified by this draft; hence [RFC3168] and [RFC8511] are still applicable.

This is still an area of continued investigation.

7. IANA Considerations

This document requests one of the reserved bits in the TCP header, with the former TCP NS ("Nonce Sum") bit (bit 7) being suggested due to similarities with its previous usage. [RFC8311] (Section 3) obsoletes the NS codepoint, making it available for use.
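To make the suggested bit position concrete, the sketch below reads and writes the former NS bit in a raw TCP header. It assumes the conventional numbering in which bit 0 is the most significant bit of the 16-bit word at byte offset 12 (data offset, reserved bits, and flags), so bit 7 corresponds to mask 0x0100. The function and constant names are hypothetical, and the TCP checksum is deliberately not recomputed.

```python
import struct

FLAGS_OFFSET = 12   # byte offset of the data-offset/reserved/flags word
ESCE_MASK = 0x0100  # former NS bit: bit 7 of that word, counting from the MSB


def get_esce(tcp_header: bytes) -> bool:
    """Return True if the suggested ESCE bit is set in the header."""
    (word,) = struct.unpack_from("!H", tcp_header, FLAGS_OFFSET)
    return bool(word & ESCE_MASK)


def set_esce(tcp_header: bytes, esce: bool) -> bytes:
    """Return a copy of the header with the ESCE bit set or cleared.

    The checksum field is left untouched; a real stack would update it.
    """
    (word,) = struct.unpack_from("!H", tcp_header, FLAGS_OFFSET)
    word = (word | ESCE_MASK) if esce else (word & ~ESCE_MASK & 0xFFFF)
    out = bytearray(tcp_header)
    struct.pack_into("!H", out, FLAGS_OFFSET, word)
    return bytes(out)
```

A 20-byte header with data offset 5 and only the ACK flag set carries 0x5010 in this word; setting ESCE changes it to 0x5110.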

8. Security Considerations

There are no security considerations.

9. Acknowledgements

TBD

10. Normative References

[RFC8311]
Black, D., "Relaxing Restrictions on Explicit Congestion Notification (ECN) Experimentation", RFC 8311, DOI 10.17487/RFC8311, <https://www.rfc-editor.org/info/rfc8311>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[I-D.morton-tsvwg-sce]
Morton, J. and R. Grimes, "The Some Congestion Experienced ECN Codepoint", Work in Progress, Internet-Draft, draft-morton-tsvwg-sce-00, <https://tools.ietf.org/html/draft-morton-tsvwg-sce-00>.

11. Informative References

[cake]
"Cake - Common Applications Kept Enhanced", <http://www.bufferbloat.net/projects/codel/wiki/Cake>.
[RFC8511]
Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst, "TCP Alternative Backoff with ECN (ABE)", RFC 8511, DOI 10.17487/RFC8511, <https://www.rfc-editor.org/info/rfc8511>.
[I-D.ietf-tcpm-accurate-ecn]
Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More Accurate ECN Feedback in TCP", Work in Progress, Internet-Draft, draft-ietf-tcpm-accurate-ecn-09, <https://tools.ietf.org/html/draft-ietf-tcpm-accurate-ecn-09>.
[sce-repo]
"Some Congestion Experienced Reference Implementation GitHub Repository", <https://github.com/chromi/sce/>.
[RFC3168]
Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, <https://www.rfc-editor.org/info/rfc3168>.

Authors' Addresses

Rodney W. Grimes
Redacted
Portland, OR 97217
United States
Peter G. Heist
Redacted
463 11 Liberec 30
Czech Republic