Internet-Draft | Transport for Satellite | October 2021 |
Jones, et al. | Expires 16 April 2022 |
IETF transport protocols such as TCP, SCTP and QUIC are designed to function correctly over any network path. This includes network paths that utilise a satellite link or network. Although transport protocols function correctly over such paths, the characteristics of satellite networks can reduce performance when standard mechanisms are used with their default settings.¶
[RFC2488] and [RFC3135] describe mechanisms that enable TCP to more effectively utilize the available capacity of a network path that includes a satellite system. Since their publication, the application and transport layers and satellite systems have all evolved. The development of encrypted protocols such as QUIC challenges currently deployed solutions. For satellite systems, capacity has increased, and commercial systems are now available that use a range of satellite orbital positions.¶
This document describes the current characteristics of common satellite paths and describes considerations for implementing and deploying reliable transport protocols that are intended to work efficiently over paths that include a satellite system. It discusses available network mitigations and offers advice to designers of protocols and operators of satellite networks.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 16 April 2022.¶
Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
Satellite communications (SATCOM) systems have long been used to support point-to-point links and specialised networks. The predominant use today is to support Internet protocols. Typical applications include: use as an access technology for remote locations, backup and rapid deployment of new services, transit networks, backhaul of various types of IP networks, and provision to mobile environments (maritime, aircraft, etc.).¶
In most scenarios, the satellite IP network segment forms only one part of the end-to-end path used by an Internet transport protocol. This means that user traffic can experience a path that includes a satellite network combined with a wide variety of other network technologies (Ethernet, cable modems, WiFi, cellular, radio links, etc.). Although a user may sometimes be aware of the presence of a satellite service, a typical user does not deploy special software or applications when a satellite network is being used. Users are often unaware of the technologies underpinning the links forming a network path.¶
Satellite path characteristics affect the operation of transport protocols such as TCP, SCTP or QUIC. Transport protocol performance can be affected by the magnitude and variability of the network delay. When transport protocols perform poorly, link utilization can be low. Techniques and recommendations exist that can improve the performance of transport protocols when the path includes a satellite network.¶
The end-to-end performance of an application using an Internet path can be impacted by the path characteristics, such as the Bandwidth-Delay Product (BDP) of the links and network devices forming the path. It can also be impacted by underlying mechanisms used to manage the radio resources.¶
Performance can be impacted at several layers. For instance, the page load time for a complex page can be much longer when a path includes a satellite system. A significant contribution to the reduced performance arises from the initialisation and design of transport mechanisms.¶
Although mechanisms are designed for use across Internet paths, not all designs are performant when used over the wide diversity of path characteristics that can occur. This document therefore considers the implications of Internet paths that include a satellite system. The analysis and conclusions might also apply to other network systems whose characteristics differ from those of typical Internet paths.¶
RFC2488 specifies an Internet Best Current Practice for the Internet Community, relating to the use of standards-track Transmission Control Protocol (TCP) mechanisms over satellite channels [RFC2488]. A separate RFC, [RFC2760], identified research issues and proposed mitigations for satellite paths.¶
Since the publication of these RFCs, many TCP mechanisms have become widely used. In particular, this includes a series of mitigations based on Performance Enhancing Proxies (PEPs) [RFC3135] that split the protocol at the transport layer. Although PEPs are now a common component of satellite systems, their use slows the deployment of new transport protocols and mechanisms (each of which demands an update to the PEP functionality). This has made it difficult for new protocol extensions to achieve comparable performance over satellite channels. In addition, protocols with strong requirements on authentication and privacy, such as QUIC [RFC9000], cannot be split by a PEP and therefore need to use other methods of mitigation.¶
XXX Note from the current authors: This document currently focuses on Geosynchronous Earth Orbit (GEO) satellite systems. The authors solicit feedback and experience from users and operators of satellite systems using other orbits. XXX¶
The remainder of this document is divided as follows:¶
Satellite communications systems have been deployed using many space orbits, including Low Earth Orbit (LEO), Medium Earth Orbit (MEO), Geosynchronous Earth Orbit (GEO), elliptical orbits and more. This document considers the characteristics of all satellite networks.¶
Paths that use Geosynchronous Earth Orbit (GEO) satellites differ from paths that use only terrestrial links in the following characteristics:¶
As an example, GEO systems use the DVB-S2 specifications [EN 302 307-1], published by the European Telecommunications Standards Institute (ETSI), where the key concept is to ensure both good usage of the satellite resource and a Quasi Error Free (QEF) link. These systems typically monitor the link quality in real time using known symbol sequences, included along with regular packets, which enable estimation of the current signal-to-noise ratio. This estimate is then fed back, allowing the transmitting link to adapt its coding rate and modulation to the actual transmission conditions.¶
LEO systems exhibit significant variability. Depending on the locations of the gateways on the ground, routing within the constellation may be necessary to bring packets down to the ground. Depending on the routes currently available to an end user, high levels of jitter may occur (from 40 ms to 140 ms with the Iridium constellation). This may lead to out-of-order delivery of packets.¶
XXX The authors solicit feedback and experience from users and operators of satellite systems in LEO orbits. XXX¶
MEO systems such as O3B combine advantages and drawbacks of both LEO and GEO systems.¶
MEO systems can have large coverage, so a limited number of satellites is required to provide a broad service. The use of powerful satellites enables the provision of high data rates.¶
MEO systems have the drawback, from a transport protocol perspective, that the BDP can be very high due to the altitude of such constellations (8 063 km for O3B), and there may be delay variations when the serving satellite changes (every 45 minutes with O3B). The latter can be dealt with by means of dual-antenna terminals.¶
XXX The authors solicit feedback and experience from users and operators of satellite systems in MEO orbits. XXX¶
XXX The authors solicit feedback and experience from users and operators of satellite systems in hybrid network scenarios. XXX¶
There is an inherent delay in the delivery of a packet over a satellite system due to the finite speed of light and the altitude of communications satellites.¶
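As a rough illustration of this lower bound, the following Python sketch computes the propagation delay implied by an orbital altitude. It assumes a satellite directly overhead and a bent-pipe path in which each direction of the exchange crosses the ground-to-satellite distance twice. The function names and the GEO altitude of 35,786 km are assumptions that do not come from this document; the 8,063 km figure is the O3B altitude quoted in this document. Real paths add slant range, processing, medium access and queuing delay on top of this bound.

```python
# Speed of light in vacuum, in km/s (physical constant).
SPEED_OF_LIGHT_KM_S = 299_792.458

# Altitudes in km. GEO value is an assumption (not stated in this document);
# the MEO value is the O3B altitude quoted in this document.
GEO_ALTITUDE_KM = 35_786
O3B_ALTITUDE_KM = 8_063


def one_way_delay_ms(altitude_km: float) -> float:
    """Propagation delay for one ground-to-satellite hop, satellite overhead."""
    return altitude_km / SPEED_OF_LIGHT_KM_S * 1000.0


def min_rtt_ms(altitude_km: float) -> float:
    """Lower bound on the RTT: up and down in each direction (four hops)."""
    return 4 * one_way_delay_ms(altitude_km)


if __name__ == "__main__":
    for name, altitude in (("GEO", GEO_ALTITUDE_KM), ("O3B", O3B_ALTITUDE_KM)):
        print(f"{name}: minimum RTT of about {min_rtt_ms(altitude):.0f} ms")
```

Under these assumptions the bound is roughly 477 ms for GEO and roughly 108 ms for the O3B altitude.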
Satellite links are dominated by two fundamental characteristics, as described below:¶
Satellite systems have several characteristics that differ from most terrestrial channels and that may degrade the performance of TCP. These characteristics include:¶
Even for characteristics shared with terrestrial paths, the impact on a satellite link could be amplified by the path RTT. For example, paths using a satellite system can also exhibit a high loss-rate (e.g., a mobile user or a user behind a Wi-Fi link), where the additional delay can impact transport mechanisms.¶
Although capacity is often less than in many terrestrial systems, the larger delay can still result in a large Bandwidth-Delay Product (BDP). The BDP defines the amount of data that a protocol needs to have "in flight" at any one time to fully utilize the available capacity. "In flight" means data that has been transmitted, but not yet acknowledged.¶
The BDP is the product of a delay and a bandwidth: the delay is the path RTT and the bandwidth is the capacity of the bottleneck link along the network path. Because the delay in some satellite environments is higher, protocols need to keep a larger number of packets in flight.¶
This also impacts the size of window/credit needed to avoid flow control mechanisms throttling the sender rate.¶
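The relationship can be made concrete with a short calculation. The following Python sketch multiplies the bottleneck capacity by the path RTT to obtain the BDP. The function names, the 50 Mbit/s capacity, the 600 ms RTT and the 1500-byte packet size are illustrative assumptions, not values taken from this document.

```python
import math


def bdp_bytes(capacity_bps: int, rtt_ms: int) -> float:
    """Bandwidth-Delay Product in bytes: bottleneck capacity times path RTT."""
    return capacity_bps * rtt_ms / 8000  # /1000 for ms, /8 for bits to bytes


def packets_in_flight(capacity_bps: int, rtt_ms: int, packet_bytes: int = 1500) -> int:
    """Packets that need to be unacknowledged at once to fill the path."""
    return math.ceil(bdp_bytes(capacity_bps, rtt_ms) / packet_bytes)


if __name__ == "__main__":
    # Hypothetical path: 50 Mbit/s bottleneck, 600 ms RTT.
    print(bdp_bytes(50_000_000, 600))          # 3750000.0 bytes
    print(packets_in_flight(50_000_000, 600))  # 2500 packets
```

For this hypothetical path, both the congestion window and the flow-control window or credit would need to reach about 3.75 MB before the sender could fill the bottleneck.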
In some satellite environments, such as some Low Earth Orbit (LEO) constellations, the propagation delay to and from the satellite varies over time.¶
Even when the propagation delay varies only very slightly, the effects of medium access methods can result in significant variation in the link delay. Whether or not this will have an impact on performance of a well-designed transport is currently an open question.¶
The link delay of some satellite systems may require more time for a transport sender to determine whether or not a packet has been successfully received at the final destination. This delay impacts interactive applications as well as loss recovery, congestion control, flow control, and other algorithms (see Section 5).¶
In some non-GEO satellite orbit configurations, from time to time Internet connections need to be transferred from one satellite to another or from one ground station to another. This hand-off might cause excessive packet loss or reordering if not properly performed.¶
This section describes mitigations that operate on the path, rather than with the transport endpoints.¶
XXX Common, but includes adaptive ModCod and sometimes ARQ - which can reduce the loss at the expense of decreasing the available capacity. XXX¶
XXX Packet Size can impact performance and mitigations (such as PEP/Application Proxy) can interact with end-to-end PMTUD XXX¶
Links where packets are sent over radio channels exhibit various trade-offs in the way the signal is sent on the communications channel. These trade-offs are not necessarily the same for all packets, and network traffic flows can be optimised by mapping these onto different types of lower layer treatment (packet queues, resource management requests, resource usage, and adaptation to the channel using FEC, ARQ, etc.). Many systems differentiate classes of traffic to manage these QoS trade-offs.¶
High BDP networks commonly break the TCP end-to-end paradigm to adapt the transport protocol. Splitting a TCP connection allows adaptation for a specific use-case and to address the issues discussed in Section 2. Satellite communications systems commonly deploy Performance Enhancing Proxies (PEPs) for compression, caching and TCP acceleration services [RFC3135]. Their deployment can result in significant performance improvement (e.g., a 50% page load time reduction in a SATCOM use-case [ICCRG100]).¶
[NCT13] and [RFC3135] describe the main functions of a SATCOM TCP split solution. For traffic originated at a gateway to an endpoint connected via a satellite terminal, the TCP split proxy intercepts TCP SYN packets, acting on behalf of the endpoint and adapts the sending rate to the SATCOM scenario. The split solution can specifically tune TCP parameters to the satellite link (latency, available capacity).¶
When a proxy is used on each side of the satellite link, the transport protocol can be replaced by a protocol other than TCP, optimized for the satellite link. This can be tuned using a priori information about the satellite system and/or by measuring the properties of the network segment that includes the satellite system.¶
Split connections can also recover from packet loss locally, within the part of the connection on which the loss occurs. This eliminates the need for end-to-end recovery of lost packets.¶
One important advantage of a TCP split solution is that it does not require any end-to-end modification and is independent of both the client and server sides.¶
Split-TCP comes with a significant drawback: TCP splitters are often unable to track end-to-end improvements in protocol mechanisms (e.g., RACK, ECN, TCP Fast Open) or new protocols that share a wire format with TCP (MPTCP [RFC6824]). The set of methods configured in a split proxy usually continues to be used until the split solution is finally updated. This can delay or negate the benefit of any end-to-end improvements, contributing to ossification of the transport system.¶
Authenticated proxies:¶
This section outlines transport protocol mechanisms that may be necessary to tune or optimize in satellite or hybrid satellite/terrestrial networks to better utilize the available capacity of the link. These mechanisms may also be needed to fully utilize fast terrestrial channels. Furthermore, these mechanisms do not fundamentally hurt performance in a shared terrestrial network. Each of the following sections outlines one mechanism and why that mechanism may be needed.¶
Many transport protocols now deploy 0-RTT mechanisms [REF] to reduce the number of RTTs required to establish a connection. QUIC has the advantage that the TLS and transport protocol negotiations can be completed during the transport connection handshake. This can reduce the time to transmit the first data.¶
Results in [IJSCN19] illustrate that it can still take many RTTs for a congestion controller (CC) to increase the sending rate enough to fill the bottleneck capacity. The delay in getting up to speed can dominate performance for a path with a large RTT, and requires the congestion and flow controls to accommodate the impact of path delay.¶
One relevant solution is tuning of the initial window described in [I-D.irtf-iccrg-sallantin-initial-spreading], which has been shown to improve performance both for high BDP and more common BDP [CONEXT15] [ICC16]. Such a solution requires sender pacing to avoid generating bursts of packets into the network.¶
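To give a sense of scale, the following Python sketch counts the RTTs an idealised slow start needs before the congestion window first covers the BDP. It is a simplification, not a model of any particular congestion controller: it assumes the window doubles once per RTT, no loss occurs, and nothing else limits the sender. The function name, the initial window of 10 packets and the example figures are assumptions.

```python
def rtts_to_fill(bdp_packets: int, initial_window: int = 10) -> int:
    """RTTs of idealised slow start until the window reaches the BDP.

    Assumes the window doubles every RTT with no loss, which is a
    simplification of real congestion controllers.
    """
    rounds = 0
    cwnd = initial_window
    while cwnd < bdp_packets:
        cwnd *= 2
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Hypothetical path: BDP of 2500 packets, RTT of 600 ms.
    rounds = rtts_to_fill(2500)
    print(rounds, "RTTs, about", rounds * 600 / 1000, "seconds")
```

For a hypothetical BDP of 2500 packets, eight doublings are needed; with a 600 ms RTT that is about 4.8 seconds, longer than many short transfers last.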
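The effect of pacing a larger initial window can be sketched as follows. This Python example spaces the packets of a window evenly across one RTT instead of sending them back to back. It is a generic even-spacing illustration and is not the algorithm specified in the cited draft; the function name and the example values are assumptions.

```python
from typing import List


def paced_send_times_ms(num_packets: int, rtt_ms: float) -> List[float]:
    """Send offsets (ms) that spread num_packets evenly over one RTT."""
    if num_packets <= 0:
        return []
    interval = rtt_ms / num_packets
    return [i * interval for i in range(num_packets)]


if __name__ == "__main__":
    # Hypothetical example: 4 packets over a 600 ms RTT.
    print(paced_send_times_ms(4, 600))  # [0.0, 150.0, 300.0, 450.0]
```

Spacing the packets bounds the burst size that queues along the path have to absorb, at the cost of delaying the later packets of the window by up to one RTT.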
Size of windows required: to fully exploit the bottleneck capacity, a high BDP requires a larger number of in-flight packets.¶
The number of in-flight packets required to fill the bottleneck capacity depends on the BDP. Default values of maximum windows may not be suitable for a SATCOM context.¶
As presented in [PANRG105], increasing the initial congestion window is not the only way to improve QUIC performance in a SATCOM context: increasing the maximum congestion window can also result in much better performance. Other protocol mechanisms also need to be considered, such as flow control at the stream level in QUIC.¶
The time for end systems to perform packet loss detection and recovery/repair is a function of the path RTT.¶
The RTT also determines the time needed by a server to react to a congestion event. Both can impact the user experience. This is the case, for example, when a user uses a Wi-Fi link to access the Internet via a SATCOM terminal.¶
A solution could be to opportunistically retransmit packets that have not yet been detected as lost, when the congestion controller allows more packets to be sent.¶
XXX Packet level FEC can mitigate loss/re-ordering, with a trade-off in capacity. XXX¶
End-to-end packet Forward Error Correction offers an alternative to retransmission with different trade-offs in terms of utilised capacity and repair capability.¶
The benefits of introducing FEC need to be weighed against the additional overhead introduced by end-to-end FEC and the opportunity to use link-local ARQ and/or link-adaptive FEC. A transport connection can suffer link-related losses from a particular link (e.g., Wi-Fi), but also congestion loss (e.g., router buffer overflow in a satellite operator ground segment or along an Internet path).¶
Flow Control mechanisms allow the receiver to control the amount of data a sender can have in flight at any time. Flow Control allows the receiver to allocate the smallest buffer sizes possible, improving memory usage on receipt.¶
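The trade-off between overhead and repair capability can be illustrated with the simplest packet-level code, a single XOR parity packet per block. The Python sketch below is only an illustration: it is not the coding scheme of any draft cited in this document, it assumes equal-length packets, and all names are hypothetical. With k source packets per block the overhead is 1/k, and exactly one loss per block can be repaired without waiting an RTT for a retransmission.

```python
from typing import List, Optional


def xor_parity(packets: List[bytes]) -> bytes:
    """Build one repair packet as the byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, value in enumerate(packet):
            parity[i] ^= value
    return bytes(parity)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[Optional[bytes]]:
    """Repair at most one missing packet (marked None) in a block."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one loss per block")
    repaired = list(received)
    if missing:
        present = [packet for packet in received if packet is not None]
        # XOR of the parity with all surviving packets yields the lost one.
        repaired[missing[0]] = xor_parity(present + [parity])
    return repaired


if __name__ == "__main__":
    block = [b"abcd", b"efgh", b"ijkl"]
    parity = xor_parity(block)  # 1 repair packet per 3 source packets
    print(recover([block[0], None, block[2]], parity) == block)  # True
```

A second loss in the same block cannot be repaired by this code and would still need a retransmission, which is why stronger codes or link-local repair may be preferable on lossy links.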
The sizing of initial receive buffers requires a balance between keeping receive memory allocation small while allowing the send window to grow quickly to help ensure high utilization. The size of receive windows and their growth can govern the performance of the protocol if updates are not timely.¶
Many TCP implementations deploy Auto-scaling mechanisms to increase the size of the largest receive window over time. If these increases are not timely then sender traffic can stall while waiting to be notified of an increase in receive window size. XXX QUIC? XXX¶
Multi-streaming Protocols such as QUIC implement Flow Control using credit-based mechanisms that allow the receiver to prioritise which stream is able to send and when. Credit-based systems, when flow credit allocations are not timely, can stall sending when credit is exhausted.¶
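The cost of an undersized window or credit can be estimated directly: a sender that may have at most W bytes outstanding cannot exceed W per RTT. The Python sketch below computes this bound and the window needed for a target rate. The function names and the example values (a 65,535-byte window, a 600 ms RTT and a 50 Mbit/s target) are illustrative assumptions.

```python
def max_throughput_bps(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound on throughput when at most window_bytes are in flight."""
    return window_bytes * 8 * 1000 / rtt_ms


def required_window_bytes(target_bps: int, rtt_ms: int) -> float:
    """Window or credit needed so flow control does not limit target_bps."""
    return target_bps * rtt_ms / 8000


if __name__ == "__main__":
    # Hypothetical example: 65,535-byte window over a 600 ms RTT.
    print(max_throughput_bps(65_535, 600))         # 873800.0 bit/s
    print(required_window_bytes(50_000_000, 600))  # 3750000.0 bytes
```

In this example the window limits the flow to under 1 Mbit/s regardless of the link capacity, which shows why the growth of receive windows and the timeliness of credit updates matter on long-delay paths.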
When the links are asymmetric, for various reasons, the return path may modify the rate and/or timing of transport acknowledgment traffic, potentially changing behaviour (e.g., limiting the forward sending rate).¶
Asymmetry in capacity (or in the way capacity is granted to a flow) can lead to cases where the transmission in one direction of communication is restricted by the transmission of the acknowledgment traffic flowing in the opposite direction. A network segment could present limitations in the volume of acknowledgment traffic (e.g., limited available return path capacity) or in the number of acknowledgment packets (e.g., when a radio-resource management system has to track channel usage), or both.¶
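The first of these limitations can be approximated with a simple ratio. The Python sketch below estimates the forward rate that a given return capacity can sustain when every acknowledgment packet covers a fixed number of data packets. It ignores link-layer overhead and any other return traffic; the function name, the 40-byte acknowledgment size, the 1460-byte payload and the 256 kbit/s return capacity are assumptions chosen for illustration.

```python
def ack_limited_forward_bps(
    return_capacity_bps: int,
    ack_bytes: int,
    packets_per_ack: int,
    payload_bytes: int,
) -> float:
    """Forward payload rate sustainable by the ACK stream on the return path.

    Each ACK consumes ack_bytes of return capacity and releases
    packets_per_ack * payload_bytes of forward data.
    """
    return return_capacity_bps * packets_per_ack * payload_bytes / ack_bytes


if __name__ == "__main__":
    # Hypothetical example: 256 kbit/s return path, one 40-byte ACK
    # for every 2 data packets of 1460 payload bytes.
    print(ack_limited_forward_bps(256_000, 40, 2, 1460))  # 18688000.0 bit/s
```

In this example the forward flow cannot exceed about 18.7 Mbit/s even if the forward link is faster, unless the receiver acknowledges less often or the acknowledgments are made smaller.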
TCP Performance Implications of Network Path Asymmetry [RFC3449] describes a range of mechanisms that have been used to mitigate the impact of path asymmetry, primarily targeting operation of TCP.¶
Many mitigations have been deployed in satellite systems, often as a mechanism within a PEP. Despite their benefits over paths with high asymmetry, most mechanisms rely on being able to inspect and/or modify the transport layer header information of TCP ACK packets. This is not possible when the transport layer information is encrypted (e.g., using an IP VPN).¶
One simple mitigation is for the remote endpoint to send compound acknowledgments less frequently. A rate of one ACK for every RTT/4 can significantly reduce this traffic. The QUIC transport specification may evolve to allow the ACK Ratio to be adjusted.¶
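The scale of the reduction can be estimated by comparing the two acknowledgment policies. The Python sketch below computes the number of acknowledgments per second when one is sent for every n data packets, and when one is sent every RTT/4. The function names and the example path (50 Mbit/s forward rate, 1500-byte packets, 600 ms RTT) are assumptions.

```python
def acks_per_second_every_n(forward_bps: int, packet_bytes: int, n: int = 2) -> float:
    """ACK rate when the receiver acknowledges every n-th data packet."""
    return forward_bps / (8 * packet_bytes * n)


def acks_per_second_per_rtt(rtt_ms: int, acks_per_rtt: int = 4) -> float:
    """ACK rate when the receiver sends acks_per_rtt ACKs in each RTT."""
    return acks_per_rtt * 1000 / rtt_ms


if __name__ == "__main__":
    # Hypothetical path: 50 Mbit/s forward rate, 1500-byte packets, 600 ms RTT.
    print(round(acks_per_second_every_n(50_000_000, 1500)))  # about 2083 ACKs/s
    print(round(acks_per_second_per_rtt(600), 1))            # about 6.7 ACKs/s
```

On this hypothetical path the acknowledgment rate falls by more than two orders of magnitude. The sketch does not capture the side effects of sparser feedback, such as larger bursts released by each acknowledgment or slower loss detection.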
XXX This includes between different satellite systems and between satellite and terrestrial paths XXX¶
One relevant solution is tuning of the initial window described in [I-D.irtf-iccrg-sallantin-initial-spreading][RFC6928], which has been shown to improve performance both for high BDP and more common BDP [CONEXT15] [ICC16]. This requires sender pacing to avoid generating bursts of packets to the network.¶
Mechanisms are being proposed in TCPM for TCP [REF].¶
QUIC has an advantage that the TLS and transport protocol negotiations can be completed during the transport connection handshake. This can reduce the time to transmit the first data. Moreover, using 0-RTT may further reduce the connection time for users reconnecting to a server.¶
Getting up to speed may be easier with the usage of the 0-RTT-BDP extension proposed in [I-D.kuhn-quic-0rtt-bdp].¶
The QUIC transport specification may evolve to allow the ACK Ratio to be adjusted.¶
The default could be adapted following [I-D.fairhurst-quic-ack-scaling] or using extensions to tune acknowledgement strategies [I-D.iyengar-quic-delayed-ack].¶
Network coding as proposed in [I-D.swett-nwcrg-coding-for-quic] and [I-D.roca-nwcrg-rlc-fec-scheme-for-quic] could help QUIC recover from link or congestion loss.¶
Another approach could utilise QUIC tunnels [I-D.schinazi-masque] to apply packet FEC to all or a part of the end-to-end path or enable local retransmissions.¶
Splitting the congestion control requires the deployment of application proxies.¶
Many of the issues identified for high BDP paths already exist when using an encrypted transport service over a path that employs encryption at the IP layer. This includes endpoints that utilise IPsec at the network layer, or use VPN technology over a satellite network segment. Users are unable to benefit from enhancement within the satellite network segment, and often the user is unaware of the presence of the satellite link on their path, except through observing the impact it has on the performance they experience.¶
One solution would be to provide PEP functions at the termination of the security association (e.g., in a VPN client). Another solution could be to fall back to using TCP (possibly with TLS or similar methods being used on the transport payload). A different solution could be to deploy and maintain a bespoke protocol tailored to high BDP environments. In the future, we anticipate that fall-back to TCP will become less desirable, and methods that rely upon bespoke configurations or protocols will be unattractive. In parallel, new methods such as QUIC will become widely deployed. The opportunity therefore exists to ensure that the new generation of protocols offers acceptable performance over high BDP paths without requiring operator tuning or specific updates by users.¶
XXX A Table will be inserted here XXX¶
The authors would like to thank Mark Allman, Daniel R. Glover and Luis A. Sanchez, the authors of RFC2488, from which the format and descriptions of satellite systems in this document have taken inspiration.¶
The authors would like to thank Christian Huitema, Igor Lubashev, Alexandre Ferrieux, Francois Michel, Emmanuel Lochin, github user sedrubal and the participants of the IETF106 side-meeting on QUIC for high BDP for their useful feedback.¶
This document does not propose changes to the security functions provided by the QUIC protocol. QUIC uses TLS encryption to protect the transport header and its payload. Security is considered in the "Security Considerations" of cited IETF documents.¶
This section proposes sample profiles and a set of regression tests to evaluate transport protocols over SATCOM links and discusses how to ensure acceptable protocol performance.¶
XXX These test profiles currently focus on measuring performance and testing for regressions in the QUIC protocol. The authors solicit input to adapt these tests to apply to more transport protocols. XXX¶
This section proposes a set of regression tests for QUIC that consider high BDP scenarios. We define by:¶
The tested scenario has the following path characteristics:¶
During the transmission of 100 MB on both download and upload paths, the test should report the upload and download times for 2 MB, 10 MB and 100 MB.¶
Initial thoughts on the performance objectives for QUIC are the following:¶
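One way a test harness could produce such a report is sketched below in Python. It takes progress samples of the form (elapsed seconds, cumulative bytes transferred) and returns the first time at which each threshold was reached. The function name, the sample format and the use of decimal megabytes (1 MB = 1,000,000 bytes) are assumptions; this document does not specify a reporting format.

```python
from typing import Dict, Iterable, Optional, Tuple

MB = 1_000_000  # Assumption: decimal megabytes.


def time_to_thresholds(
    samples: Iterable[Tuple[float, int]],
    thresholds_mb: Tuple[int, ...] = (2, 10, 100),
) -> Dict[int, Optional[float]]:
    """First elapsed time at which each threshold (in MB) was reached.

    samples are (elapsed_seconds, cumulative_bytes) in increasing time
    order. A threshold that is never reached maps to None.
    """
    result: Dict[int, Optional[float]] = {mb: None for mb in thresholds_mb}
    for elapsed, total_bytes in samples:
        for mb in thresholds_mb:
            if result[mb] is None and total_bytes >= mb * MB:
                result[mb] = elapsed
    return result


if __name__ == "__main__":
    # Hypothetical progress samples from one 100 MB transfer.
    samples = [(1.0, 1_000_000), (2.5, 3_000_000), (6.0, 12_000_000), (40.0, 100_000_000)]
    print(time_to_thresholds(samples))  # {2: 2.5, 10: 6.0, 100: 40.0}
```

Reporting the three thresholds from a single transfer separates start-up behaviour (2 MB) from steady-state behaviour (100 MB), and the same function can be applied to the upload and download directions independently.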
The tested scenario has the following path characteristics:¶
During the transmission of 100 MB on the download path, the test should report the download time for 2 MB, 10 MB and 100 MB. Then, to assess the performance of QUIC with the 0-RTT extension and its variants, after 10 seconds, repeat the transmission of 100 MB on the download path where the download time for 2 MB, 10 MB and 100 MB is recorded.¶
Initial thoughts on the performance objectives for QUIC are the following:¶
There are cases where the uplink path is congested or where the capacity of the uplink path is not guaranteed.¶
The tested scenario has the following path characteristics:¶
During the transmission of 100 MB on the download path, the test should report the download time for 2 MB, 10 MB and 100 MB.¶
Initial thoughts on the performance objectives for QUIC are the following:¶
There are cases where the downlink path is congested or where, due to link layer adaptations to rain fading, the capacity of the downlink path is variable.¶
The tested scenario has the following path characteristics:¶
During the transmission of 100 MB on the download path, the test should report the download time for 2 MB, 10 MB and 100 MB.¶
Initial thoughts on the performance objectives for QUIC are the following:¶
The tested scenario has the following path characteristics:¶
During the transmission of 100 MB on the download path, the test should report the download time for 2 MB, 10 MB and 100 MB. Then, to assess the performance of QUIC with the 0-RTT extension and its variants, after 10 seconds, repeat the transmission of 100 MB on the download path where the download time for 2 MB, 10 MB and 100 MB is recorded.¶
Initial thoughts on the performance objectives for QUIC are the following:¶
The tested scenario has the following path characteristics:¶
Emulated packet loss on both downlink and uplink paths:¶
During the transmission of 100 MB on the download path, the test should report the download time for 2 MB, 10 MB and 100 MB.¶
Initial thoughts on the performance objectives for QUIC are the following:¶
Note to RFC-Editor: please remove this entire section prior to publication.¶
Individual draft -00:¶
Individual draft -01:¶