Internet-Draft | IPv6 Extended Fragment Header | December 2023 |
Templin | Expires 14 June 2024 | [Page] |
The Internet Protocol, version 4 (IPv4) header includes a 16-bit Identification field in all packets, but this length is too small to ensure reassembly integrity even at moderate data rates in modern networks. Even for Internet Protocol, version 6 (IPv6), the 32-bit Identification field included when a Fragment Header is present may be smaller than desired for some applications. This specification addresses these limitations by defining an IPv6 Destination Options Extended Fragment Header option that includes a 64-bit Identification; it further defines a control messaging service for fragment retransmission and reassembly congestion management.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 14 June 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Internet Protocol, version 4 (IPv4) header includes a 16-bit Identification in all packets [RFC0791], but this length is too small to ensure reassembly integrity even at moderate data rates in modern networks [RFC4963][RFC6864][RFC8900]. For Internet Protocol, version 6 (IPv6), the Identification field is only present in packets that include a Fragment Header, where its standard length is 32 bits [RFC8200], but even this length may be too small for some applications (such as those that reset the starting sequence number frequently). This specification therefore defines a means to extend the IPv6 Identification length through the introduction of a new IPv6 Destination Options Extended Fragment Header option.¶
The Extended Fragment Header option may be useful for networks that engage fragmentation and reassembly at extreme data rates, or for cases when advanced packet Identification uniqueness assurance is critical. (The placement of the Extended Fragment Header in a Destination Options header also makes the packet less prone to loss due to network filtering.) This specification further defines a messaging service for adaptive realtime response to loss and congestion related to fragmentation/reassembly. Together, these extensions support robust fragmentation and reassembly services as well as packet Identification uniqueness for IPv6.¶
The terms "Maximum Transmission Unit (MTU)", "Effective MTU to Receive (EMTU_R)", "Effective MTU to Send (EMTU_S)" and "Maximum Segment Lifetime (MSL)" are used exactly as in standard Internetworking terminology [RFC1122]. The term MSL is equivalent to the term "maximum datagram lifetime (MDL)" defined in [RFC0791][RFC6864].¶
The term "Packet Too Big (PTB)" refers to an IPv6 "Packet Too Big" [RFC8201][RFC4443] message.¶
The term "flow" refers to a sequence of packets sent from a particular source to a particular unicast, anycast or multicast destination [RFC6437].¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119][RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Upper layer protocols often achieve greater performance by configuring segment sizes that exceed the path Maximum Transmission Unit (MTU). When the segment size exceeds the path MTU, IP fragmentation at some layer is a natural consequence. However, the 4-octet (32-bit) Identification field of the Fragment Header may be too small to ensure reassembly integrity at sufficiently high data rates, especially when the source resets the starting sequence number often to maintain an unpredictable profile [RFC7739]. This specification therefore proposes to fortify the IPv6 Identification by extending its length.¶
Performance increases for upper layer protocols that use larger segment sizes were historically observed for NFS over UDP, and can still be readily observed today for TCP and the Delay Tolerant Network (DTN) Licklider Transmission Protocol (LTP). A simple test setup consists of a pair of modern high-performance servers with 100Gbps Ethernet cards connected via a point-to-point link and running a public domain Linux release such as Ubuntu. TCP performance using the public domain 'iperf3' tool is proven to increase with larger segment sizes even if they exceed the path MTU. LTP performance with segment sizes that exceed the path MTU is similarly proven using the HDTN and ION-DTN LTP implementations. QUIC performance testing using the 'qperf' tool does not show an advantage for the use of larger path MTUs (with or without fragmentation) since 'qperf' limits its packet sizes to 1280 octets. For this reason, 'qperf' performance was a factor of 4 less than LTP and a factor of 8 less than TCP when those protocols used larger packet sizes and/or invoked fragmentation.¶
In addition to accommodating higher data rates in the presence of fragmentation and reassembly, extending the IPv6 Identification can enable other important services. For example, an extended Identification can enable a duplicate packet detection service in which the network remembers recent Identification values for a flow to aid detection of potential duplicates (note however that the network layer must not incorrectly flag intentional lower layer retransmissions as duplicates). An extended Identification can also provide a packet sequence number that allows communicating peers to exclude any packets with values outside of a current sequence number window for a flow as potential spurious transmissions.¶
A robust IP fragmentation and reassembly service can provide a useful tool for performance maximization in the Internet when an extended Identification is available. This document therefore presents a means to extend the IPv6 Identification to better support these services through the introduction of an IPv6 Extended Fragment Header.¶
For a standard 4-octet IPv6 Identification, the source can simply include an ordinary IPv6 Fragment Header as specified in [RFC8200] with the Fragment Offset field and M flag set either to values appropriate for a fragmented packet or the value 0 for an unfragmented packet. The source then includes a 4-octet Identification value for the packet.¶
For an extended Identification and/or for paths that drop packets including the standard IPv6 Fragment Header, this specification permits the source to instead include an IPv6 Extended Fragment Header in a Destination Options Header in the same extension header order where the Fragment Header would normally appear.¶
The source includes the Extended Fragment Header as the lone option in a Destination Options header inserted after any Per-Fragment headers and before the Extension and Upper Layer Headers for the first fragment or before the Fragment data for non-first fragments - see Sections 4.1 and 4.5 of [RFC8200].¶
Since middleboxes may not recognize this as a Fragment Header, however, the source caches the Destination Options Next Header value in the Extended Fragment Header option NH-Cache field and upon fragmentation sets the Next Header field to "No Next Header" to avoid any possibility for confusion (the destination will restore the Next Header value upon reassembly).¶
The IPv6 Extended Fragment Header is formatted as shown in Figure 1:¶
The Extended Fragment Header option is therefore identified as an Option Type with the low-order 5 bits set to TBD1 (see: IANA Considerations), with the third-highest-order bit (i.e., "chg") set to 0 and with the highest-order 2 bits (i.e., "act") set as discussed below. The Identification field is 8 octets (64 bits) in length, and a Destination Options header that includes the option may appear either in an unfragmented IPv6 packet or in one for which IPv6 fragmentation is applied (with a copy of the header appearing in each fragment).¶
When the source includes an Extended Fragment Header option and applies source fragmentation (see: Section 5), it sets the highest-order 2 bits of the option code to '01', '10' or '11' so that destinations that do not recognize the option will drop the fragments and (possibly) also return an ICMPv6 Parameter Problem message. When no source fragmentation is applied, the source can optionally set the highest-order 2 bits of the option code to '00', allowing the destination to process the packet even if it does not recognize the Extended Fragment Header.¶
When the source applies source fragmentation using the Extended Fragment Header destination option, fragmentation procedures are the same as for standard IPv6 fragmentation except that the Destination Options header containing the option appears in place of the standard IPv6 Fragment Header (see: Section 4.5 of [RFC8200]) and the Fragment Header is omitted.¶
During source fragmentation, the source SHOULD produce the smallest number of fragments possible (i.e., the largest possible fragments) within current path MTU constraints. In particular, the source SHOULD limit the number of fragments produced to no more than 64 fragments per packet, allowing for all conventional packet sizes up to and including the 65535 octet maximum under the 1280 octet IPv6 minimum path MTU.¶
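The 64-fragment bound can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, the 16-octet per-fragment overhead for the Destination Options header carrying the Extended Fragment Header is an assumption (the authoritative layout is the one in Figure 1), and the whole 65535 octets is treated as fragmentable data, which slightly overestimates the count.

```python
import math

IPV6_HEADER_LEN = 40  # fixed IPv6 header length in octets (RFC 8200)


def fragment_count(data_len: int, path_mtu: int = 1280,
                   per_fragment_overhead: int = 16) -> int:
    """Estimate how many fragments are needed to carry data_len octets.

    per_fragment_overhead is an ASSUMED size for the Destination Options
    header that carries the Extended Fragment Header option.
    """
    room = path_mtu - IPV6_HEADER_LEN - per_fragment_overhead
    # Every fragment except the last carries a multiple of 8 octets.
    chunk = room - (room % 8)
    if chunk <= 0:
        raise ValueError("path MTU too small for the assumed overhead")
    return max(1, math.ceil(data_len / chunk))


if __name__ == "__main__":
    print(fragment_count(65535))  # largest conventional packet, 1280-octet MTU
```

Under these assumptions the largest conventional packet needs 54 fragments at the 1280-octet minimum path MTU, which fits within the recommended limit of 64.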
For each fragment produced during fragmentation except for the first (i.e., the one with Offset=0), the source writes an ordinal index number in the Extended Fragment Header "Index" field. Specifically, the source sets Index to 1 for the first non-first fragment, 2 for the second, 3 for the third, etc., up to and including the final fragment (i.e., the one with M=0). If there are more than 64 fragments, the source sets Index to 63 in all remaining fragments beginning with the 64th up to and including the final.¶
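The Index assignment rule can be sketched as a saturating counter. This is a minimal illustration rather than a normative procedure: the function name is hypothetical, ordinals are counted from 0 for the first fragment (Offset=0), and the first fragment's Index is assumed to be left at 0.

```python
MAX_INDEX = 63  # largest value written to Index; later fragments saturate here


def fragment_index(ordinal: int) -> int:
    """Return the Index value for the fragment with the given ordinal.

    ordinal 0 is the first fragment (Offset=0), ordinal 1 the first
    non-first fragment, and so on. The 64th fragment (ordinal 63) and all
    fragments after it share Index 63.
    """
    if ordinal < 0:
        raise ValueError("ordinal must be non-negative")
    return min(ordinal, MAX_INDEX)


if __name__ == "__main__":
    print([fragment_index(n) for n in (0, 1, 2, 62, 63, 64, 100)])
```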
The source also caches the Destination Options header Next Header value in the NH-Cache field. For each fragment produced during fragmentation, the source includes the Destination Options header in place of the standard Fragment Header and resets its Next Header field to "No Next Header".¶
The destination then reassembles the same as specified in Section 4.5 of [RFC8200]. During reassembly, the destination resets the Destination Options header Next Header field to the value cached in the NH-Cache field.¶
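The NH-Cache handling at the source and destination can be sketched as a simple save and restore. The class and function names below are hypothetical and model only the two fields involved; the value 59 is the "No Next Header" code point from [RFC8200].

```python
from dataclasses import dataclass, replace

NO_NEXT_HEADER = 59  # IPv6 "No Next Header" value (RFC 8200)


@dataclass(frozen=True)
class DestOptsHeader:
    """Hypothetical model of the two fields relevant to NH-Cache handling."""
    next_header: int
    nh_cache: int = 0


def prepare_for_fragmentation(hdr: DestOptsHeader) -> DestOptsHeader:
    """Source side: cache the real Next Header, then advertise No Next Header."""
    return replace(hdr, nh_cache=hdr.next_header, next_header=NO_NEXT_HEADER)


def restore_after_reassembly(hdr: DestOptsHeader) -> DestOptsHeader:
    """Destination side: restore Next Header from the NH-Cache field."""
    return replace(hdr, next_header=hdr.nh_cache)


if __name__ == "__main__":
    original = DestOptsHeader(next_header=17)  # 17 = UDP
    on_wire = prepare_for_fragmentation(original)
    print(on_wire, restore_after_reassembly(on_wire))
```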
Intermediate systems that forward packets fragmented in this way will therefore interpret the data that follows the Destination Options header that includes the Extended Fragment Header as undefined data (by virtue of the "No Next Header" setting) unless they are configured to more deeply inspect the data content.¶
Destinations that do not recognize the Extended Fragment Header Destination Option accept or drop the packet according to the Option Type "act" code. If the "act" code is '00', destinations ignore the option and accept the packet. For other codes, destinations instead drop the packet and (for "act" codes '1X') may return a Code 2 ICMPv6 Parameter Problem message [RFC4443]. (ICMPv6 messages may be lost on the return path and/or manufactured by an adversary, however, and therefore provide only an advisory indication.)¶
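The behavior of a destination that does not recognize the option follows the generic "act" rules of Section 4.2 of [RFC8200]. The sketch below illustrates those rules; the function name and the (accept, send ICMPv6) return convention are hypothetical.

```python
from typing import Tuple


def unrecognized_option_action(option_type: int,
                               dest_is_multicast: bool) -> Tuple[bool, bool]:
    """Return (accept_packet, send_icmp_param_problem_code_2).

    Applies the RFC 8200 Section 4.2 rules for an option type that the
    receiving node does not recognize.
    """
    act = (option_type >> 6) & 0b11  # highest-order 2 bits of Option Type
    if act == 0b00:
        return (True, False)    # skip the option and keep processing
    if act == 0b01:
        return (False, False)   # silently discard
    if act == 0b10:
        return (False, True)    # discard; ICMPv6 even for multicast
    return (False, not dest_is_multicast)  # '11': ICMPv6 only if not multicast


if __name__ == "__main__":
    for act in range(4):
        print(format(act, "02b"), unrecognized_option_action(act << 6, False))
```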
The source can then test whether destinations recognize the Extended Fragment Header by occasionally sending "probe" packets (either fragmented or unfragmented) that include the option with an "act" code other than '00'. If the source receives acknowledgments, it has assurance that some destinations recognize the option; if it receives ICMPv6 Parameter Problem messages, it has hints that some destinations do not. The source should re-probe destinations occasionally in case routing redirects a flow to a different anycast destination or in case a multicast group membership changes (see: Section 9).¶
The source can also include IPv6 Minimum Path MTU Discovery Hop-by-Hop options in packets/fragments sent to unicast, multicast or anycast destinations per [RFC9268]. If the source receives acknowledgements that include a return path MTU option, the source should regard the reported MTU as the largest potential fragment size for this destination under current path MTU conditions noting that the actual size may be smaller still for some paths.¶
When a router attempts to forward an IPv6 packet (or fragment) that exceeds the next hop link MTU but for which fragmentation is forbidden, it returns a standard ICMPv6 Packet Too Big (PTB) message to the source [RFC4443][RFC8201] and discards the packet. This always results in wasted transmissions by the source and all routers on the path toward the one with the restricting link. Moreover, the messages are subject to spoofing and loss in the network [RFC2923].¶
Since routers are not permitted to perform IPv6 fragmentation, this means that the source should perform source fragmentation (if necessary) with a maximum fragment size limited to the path MTU. If the source receives PTB messages, it should reduce the size of the packets/fragments it sends.¶
While the fragmentation and reassembly processes eliminate wasted transmissions and support significant performance gains by accommodating upper layer protocol segment sizes that exceed the path MTU, the processes sometimes represent pain points for the destination and/or network as a whole that the destination should communicate to the source. The source should then take measures to reduce the size of the packets that it sends.¶
End systems that recognize the Extended Fragment Header according to this specification also recognize a new Destination Options option type as shown in Figure 2. The destination end system includes the option in a Destination Options Extension Header of an IPv6 protocol return packet to the source when reassembly congestion and/or fragment loss occurs. Any IPv6 packet that is part of an ongoing flow can be used to carry the option, especially if it includes identifying information such as transport layer parameters and/or an authentication signature.¶
Destinations that experience reassembly congestion and/or fragment loss can return an IPv6 packet for an ongoing flow that includes the option in a Destination Options Extension Header (i.e., and not as a Per-Fragment header). The destination sets Option Type to '000[TBD2]' (see: IANA Considerations), sets Option Data Len to 16 and sets Identification to the value for the current reassembly. The destination next sets all Bitmap(i) bits (for i=0 to 63) to 1 if all fragments of the packet have arrived. If some fragments are missing, the destination instead sets Bitmap(i) to 1 for each ordinal fragment index 'i' it has received for this reassembly and sets Bitmap(i) to 0 for all others.¶
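The Bitmap construction can be sketched as follows. The function names are hypothetical, and the bit ordering is an assumption: Bitmap(0) is taken to be the most significant bit of the 8-octet Bitmap, whereas the authoritative layout is the one in Figure 2. The sketch takes the Index values as received and does not model the special handling of packets with more than 64 fragments.

```python
import struct
from typing import Iterable

ALL_ONES = (1 << 64) - 1


def build_bitmap(received_indexes: Iterable[int], all_arrived: bool) -> int:
    """Build the 64-bit Bitmap for a Fragmentation Report.

    ASSUMPTION: Bitmap(0) is the most significant bit.
    """
    if all_arrived:
        return ALL_ONES
    bitmap = 0
    for i in received_indexes:
        if not 0 <= i <= 63:
            raise ValueError("fragment index out of range")
        bitmap |= 1 << (63 - i)
    return bitmap


def report_option_data(identification: int, bitmap: int) -> bytes:
    """Pack the 16 octets of Option Data: Identification, then Bitmap."""
    return struct.pack("!QQ", identification, bitmap)


if __name__ == "__main__":
    data = report_option_data(0x0123456789ABCDEF, build_bitmap([0, 1, 3], False))
    print(len(data), data.hex())
```

The packed Option Data is 16 octets long, matching the Option Data Len value given above.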
When the source receives authentic IPv6 packets with the Fragmentation Report option, it should significantly decrease the size of its future packet transmissions to this destination to reduce congestion. If the Fragmentation Report includes a Bitmap with some Bitmap(i) bits set to 0, the source can retransmit any missing ordinal fragments if it still has them in its cache. When the source ceases to receive Fragmentation Reports, it can begin to gradually increase the size of its future packet transmissions to this destination.¶
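On the source side, the fragments to retransmit can be derived from the reported Bitmap. This is a sketch under stated assumptions: the function name is hypothetical, and Bitmap(0) is assumed to be the most significant bit of the 64-bit Bitmap value.

```python
from typing import List


def missing_indexes(bitmap: int, fragments_sent: int) -> List[int]:
    """List the fragment indexes reported as not received.

    ASSUMPTION: Bitmap(0) is the most significant bit. Only indexes that
    were actually sent are examined, capped at the 64 Bitmap positions.
    """
    last = min(fragments_sent, 64)
    return [i for i in range(last) if not (bitmap >> (63 - i)) & 1]


if __name__ == "__main__":
    # Bitmap with indexes 0, 1 and 3 set, for a packet sent as 5 fragments.
    print(missing_indexes(0xD000000000000000, 5))
```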
Note: the above source packet size adaptation based on destination reassembly feedback parallels the Additive Increase, Multiplicative Decrease (AIMD) congestion control strategy employed by TCP and other reliable transports.¶
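An AIMD-style size adaptation might look like the sketch below. All parameters are illustrative assumptions, not values taken from this specification: the halving factor, the 512-octet additive step, and the 1280 and 65535 octet bounds are placeholders, and the function name is hypothetical.

```python
MIN_SIZE = 1280    # assumed floor: IPv6 minimum path MTU
MAX_SIZE = 65535   # assumed ceiling: largest conventional packet size


def adapt_packet_size(current: int, report_received: bool,
                      step: int = 512, factor: float = 0.5) -> int:
    """Return the next packet size for this destination.

    Multiplicative decrease while Fragmentation Reports arrive,
    additive increase once they stop.
    """
    if report_received:
        return max(MIN_SIZE, int(current * factor))
    return min(MAX_SIZE, current + step)


if __name__ == "__main__":
    size = 16000
    for report in (True, True, False, False, False):
        size = adapt_packet_size(size, report)
        print(size)
```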
Note: for reassembly of fragment chains that include more than 64 fragments, the destination sets Bitmap(63) to 1 only if all ordinal fragments beginning with the 64th and beyond have been received; otherwise, it sets Bitmap(63) to 0. Note that this may cause the source to unnecessarily retransmit many trailing fragments, i.e., beginning with ordinal 63 up to and including the final fragment.¶
In addition to unicast flows, similar considerations apply for flows in which the destination is a multicast group or an anycast address. Unless the source and all candidate destinations are members of a limited domain network [RFC8799] for which all nodes recognize the IPv6 Extended Fragment Header Destination Option, some destinations may recognize the option while others drop packets containing the option and may return a Code 2 ICMPv6 Parameter Problem message [RFC4443].¶
When a source sends packets/fragments with IPv6 Extended Fragment Headers to a multicast group, the packets/fragments may be replicated in the network such that a single transmission may reach N destinations over as many as N different paths. Some destinations may then return IPv6 packets with Fragmentation Reports if they experience congestion and/or loss. (Some destinations may also return Code 2 ICMPv6 Parameter Problem messages if they do not recognize the Extended Fragment Header.)¶
While the source receives PTB messages or authentic Fragmentation Reports, it should reduce the fragment/packet sizes that it sends to the multicast group even if only one or a few paths or destinations are currently experiencing congestion. This means that transmissions to a multicast group will converge to the performance characteristics of the lowest common denominator group member destinations and/or paths. While the source receives ICMPv6 Parameter Problem messages and/or otherwise detects that some multicast group members do not recognize the Extended Fragment Header option, it must determine whether the benefits for group members that recognize the option outweigh the drawbacks of service denial for those that do not.¶
When a source sends packets/fragments with IPv6 Extended Fragment Headers to an anycast address, routing may direct initial fragments of the same packet to a first destination that configures the address while directing the remaining fragments to other destinations that configure the address. These wayward fragments will simply result in incomplete reassemblies at each such anycast destination which will soon purge the fragments from the reassembly buffer. The source will eventually retransmit, and all resulting fragments should eventually reach a single reassembly target.¶
All nodes that process an IPv6 Destination Options Header with Extended Fragment Header and/or Fragmentation Report options observe the extension header limits specified in [I-D.ietf-6man-eh-limits].¶
Intermediate systems MUST forward without dropping IPv6 packets that include a Destination Options header with an Extended Fragment Header and/or Fragmentation Report unless they detect a security policy threat through deeper inspection of the protocol data that follows.¶
Destinations that accept flows using Extended Fragment Headers MUST configure an EMTU_R of 65535 octets or larger.¶
Sources MUST include at most one IPv6 Standard or Extended Fragment Header in each IPv6 packet/fragment. Intermediate systems and destinations SHOULD silently drop packets/fragments with multiples. If the source includes an Extended Fragment Header option, it must appear as the only option in a Destination Options header that appears in the same extension header order in which the IPv6 Standard Fragment Header would normally appear.¶
Sources and Destinations that recognize the Extended Fragment Header option MUST also recognize the Fragmentation Report option as specified in Section 8.¶
Sources SHOULD maintain a cache of recently-sent fragments in case the destination requests retransmission. The destination is required to send an "ICMP Time Exceeded - Fragment Reassembly Time Exceeded" message if insufficient fragments are received to complete reassembly within 60 seconds (see: Section 4.5 of [RFC8200]), but that time may be longer than practical for the source to retain fragments in a retransmission cache. The source SHOULD therefore maintain the cache for only a small time delta beyond the round-trip time to the destination, and the destination SHOULD send Fragmentation Reports as early as practically possible upon experiencing fragment loss.¶
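A short-lived retransmission cache could be sketched as follows. The class name, the 0.1 second default delta, and the explicit `now` argument (used here instead of a system clock so the behavior is easy to exercise) are all assumptions made for illustration. Entries are keyed by Identification and Index, which is unambiguous only for packets of 64 or fewer fragments.

```python
from typing import Dict, Optional, Tuple


class FragmentCache:
    """Hypothetical cache of recently-sent fragments with a short lifetime."""

    def __init__(self, rtt: float, delta: float = 0.1) -> None:
        # Keep fragments for the round-trip time plus a small delta (seconds).
        self.lifetime = rtt + delta
        self._entries: Dict[Tuple[int, int], Tuple[float, bytes]] = {}

    def store(self, identification: int, index: int,
              fragment: bytes, now: float) -> None:
        """Remember a fragment together with its expiry time."""
        self._entries[(identification, index)] = (now + self.lifetime, fragment)

    def lookup(self, identification: int, index: int,
               now: float) -> Optional[bytes]:
        """Return the cached fragment, or None if absent or expired."""
        key = (identification, index)
        entry = self._entries.get(key)
        if entry is None or entry[0] < now:
            self._entries.pop(key, None)  # drop expired entries lazily
            return None
        return entry[1]


if __name__ == "__main__":
    cache = FragmentCache(rtt=0.05)
    cache.store(7, 2, b"fragment-data", now=0.0)
    print(cache.lookup(7, 2, now=0.1), cache.lookup(7, 2, now=0.2))
```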
During the earliest days of internetworking, researchers attributed the warning label "harmful" to IP fragmentation based on empirical observations in the ARPANET, DARPA Internet and other internetworks of the day [KENT87]. This inspired an engineering discipline known as "Path MTU Discovery" within an emerging community of interest known as the Internet Engineering Task Force (IETF).¶
In more recent times, the IETF published "IP Fragmentation Considered Fragile" [RFC8900] to characterize the current state of fragmentation in the modern Internet. The IPv6 Extended Fragment Header now introduces a more robust solution based on a properly functioning IP fragmentation and reassembly service as intended in the original architecture.¶
Although the IP fragmentation and reassembly services provide an appropriate solution for conventional packet sizes as large as 65535 octets, they cannot be applied for larger packets nor for IP parcels and Advanced Jumbos (AJs) [I-D.templin-intarea-parcels]. This means that a combined solution with robust fragmentation and reassembly applied in parallel with traditional path MTU probing provides a combination well suited for Internetworking futures. This document therefore updates [RFC8900].¶
In progress.¶
The IANA is requested to assign a first new IPv6 Destination Option type in the "Destination Options and Hop-by-Hop Options" table of the 'ipv6-parameters' registry (registration procedures IESG Approval, IETF Review or Standards Action). The option should appear in 4 consecutive table entries that set "act" to 'XX', "chg" to '0', "rest" to TBD1, "Description" to "IPv6 Extended Fragment Header" and "Reference" to this document [RFCXXXX] (i.e., with "act" set to '00' for the first entry, '01' for the second, '10' for the third, and '11' for the fourth and final entry). Each table entry also sets "Hex Value" to the 2-digit hexadecimal value corresponding to the 8-bit concatenation of act/chg/rest.¶
The IANA is requested to assign a second new IPv6 Destination Option type in the "Destination Options and Hop-by-Hop Options" table of the 'ipv6-parameters' registry (registration procedures IESG Approval, IETF Review or Standards Action). The option should appear in a table entry that sets "act" to '00', "chg" to '0', "rest" to TBD2, "Description" to "IPv6 Fragmentation Report" and "Reference" to this document [RFCXXXX]. The table entry also sets "Hex Value" to the 2-digit hexadecimal value corresponding to the 8-bit concatenation of act/chg/rest.¶
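The "Hex Value" column is the 8-bit concatenation of the 2-bit "act", 1-bit "chg" and 5-bit "rest" fields. The sketch below shows the computation; the function name is hypothetical, and the "rest" value 0x1E is only a placeholder, since TBD1 and TBD2 have not been assigned.

```python
def option_type(act: int, chg: int, rest: int) -> int:
    """Concatenate act (2 bits), chg (1 bit) and rest (5 bits) into one octet."""
    if not (0 <= act <= 3 and 0 <= chg <= 1 and 0 <= rest <= 31):
        raise ValueError("field value out of range")
    return (act << 6) | (chg << 5) | rest


if __name__ == "__main__":
    PLACEHOLDER_REST = 0x1E  # hypothetical stand-in for the unassigned TBD1
    for act in range(4):
        value = option_type(act, 0, PLACEHOLDER_REST)
        print(format(act, "02b"), format(value, "02X"))
```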
All aspects of IP security apply equally to this document, which does not introduce any new vulnerabilities. Moreover, when employed correctly the mechanisms in this document robustly address known IP reassembly integrity concerns [RFC4963] and also provide an advanced degree of packet Identification uniqueness assurance.¶
All security aspects of [RFC7739], including the algorithms for selecting fragment identification values, apply also to the IPv6 Extended Fragment Header. In particular, the source should reset its starting Identification value frequently (e.g., per the [RFC7739] algorithms) to maintain an unpredictable profile.¶
All normative security guidance on IPv6 fragmentation found in [RFC8200] (e.g., processing of tiny first fragments, overlapping fragments, etc.) applies also to the fragments generated under the Extended Fragment Header.¶
This work was inspired by continued DTN performance studies. Amanda Baber, Tom Herbert, Bob Hinden, Christian Huitema, Mark Smith and Eric Vyncke offered useful insights that helped improve the document.¶
Honoring life, liberty and the pursuit of happiness.¶
<< RFC Editor - remove prior to publication >>¶
Differences from earlier versions:¶
First draft publication.¶