Internet-Draft IPv6 Extended Fragment Header December 2023
Templin Expires 10 June 2024
Workgroup: Network Working Group
Internet-Draft: draft-templin-6man-ipid-ext-08
Updates: 8200, 8900 (if approved)
Published: December 2023
Intended Status: Standards Track
Expires: 10 June 2024
Author: F. L. Templin, Ed., Boeing Research & Technology

IPv6 Extended Fragment Header

Abstract

The Internet Protocol, version 4 (IPv4) header includes a 16-bit Identification field in all packets, but this length is too small to ensure reassembly integrity even at moderate data rates in modern networks. Even for Internet Protocol, version 6 (IPv6), the 32-bit Identification field included when a Fragment Header is present may be smaller than desired for some applications. This specification addresses these limitations by defining an IPv6 Destination Options Extended Fragment Header option that includes a 64-bit Identification; it further defines control messaging services for fragmentation and reassembly congestion management.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 10 June 2024.

1. Introduction

The Internet Protocol, version 4 (IPv4) header includes a 16-bit Identification in all packets [RFC0791], but this length is too small to ensure reassembly integrity even at moderate data rates in modern networks [RFC4963][RFC6864][RFC8900]. For Internet Protocol, version 6 (IPv6), the Identification field is only present in packets that include a Fragment Header, where its standard length is 32 bits [RFC8200], but even this length may be too small for some applications. This specification therefore defines a means to extend the IPv6 Identification length through the introduction of a new IPv6 Destination Options Extended Fragment Header option.

The Extended Fragment Header option may be useful for networks that engage fragmentation and reassembly at extreme data rates, or for cases when advanced packet Identification uniqueness assurance is critical. The specification further defines a messaging service for adaptive realtime response to congestion related to fragmentation and reassembly. Together, these extensions support robust fragmentation and reassembly services as well as packet Identification uniqueness for IPv6.

2. Terminology

This document uses the term "IP" to refer generically to either protocol version (i.e., IPv4 or IPv6), and uses the term "IP ID" to refer generically to the IP Identification field whether in simple or extended form.

The terms "Maximum Transmission Unit (MTU)", "Effective MTU to Receive (EMTU_R)", "Effective MTU to Send (EMTU_S)" and "Maximum Segment Lifetime (MSL)" are used exactly as in standard Internetworking terminology [RFC1122]. The term MSL is equivalent to the term "maximum datagram lifetime (MDL)" defined in [RFC0791][RFC6864].

The term "Packet Too Big (PTB)" refers to an IPv6 "Packet Too Big" [RFC8201][RFC4443] message.

The term "flow" refers to a sequence of packets sent from a particular source to a particular unicast, anycast or multicast destination [RFC6437].

The term "source" refers to either the original end system that produces an IP packet or an encapsulation ingress intermediate system on the path.

The term "destination" refers to either a decapsulation egress intermediate system on the path or the final end system that consumes an IP packet.

The term "intermediate system" refers to a node on the path from the (original) source to the (final) destination that forwards packets not addressed to itself. Intermediate systems that decrement the IP header TTL/Hop Limit are also known as "routers".

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119][RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Motivation

Upper layer protocols often achieve greater performance by configuring segment sizes that exceed the path Maximum Transmission Unit (MTU). When the segment size exceeds the path MTU, IP fragmentation at some layer is a natural consequence. However, the 4-octet (32-bit) IPv6 Identification field may be too small to ensure reassembly integrity at sufficiently high data rates, especially when the source resets the starting sequence number often to maintain an unpredictable profile [RFC7739]. This specification therefore proposes to fortify the IP ID by extending its length.
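
As a rough illustration of the scale involved, the following Python sketch estimates how long an Identification space of a given width lasts before values repeat. It assumes, for simplicity, that the source consumes one sequential value per packet at a constant rate; the function name and any packet rate supplied to it are illustrative assumptions and are not taken from this specification.

```python
def id_wrap_seconds(id_bits: int, packets_per_second: float) -> float:
    """Seconds until a sequentially assigned ID of id_bits bits repeats.

    Assumes one ID value is consumed per packet at a constant rate.
    """
    if id_bits <= 0 or packets_per_second <= 0:
        raise ValueError("id_bits and packets_per_second must be positive")
    return (2 ** id_bits) / packets_per_second
```

At an assumed rate of one million packets per second, for example, a 16-bit space repeats in well under a second and a 32-bit space in roughly 72 minutes, while a 64-bit space lasts for many thousands of years.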

In addition to accommodating higher data rates in the presence of fragmentation and reassembly, extending the IP ID can enable other important services. For example, an extended IP ID can support a duplicate packet detection service in which the network remembers recent IP ID values for a flow to aid detection of potential duplicates (note however that the network layer must not incorrectly flag intentional lower layer retransmissions as duplicates). An extended IP ID can also provide a packet sequence number that allows communicating peers to exclude any packets with IP ID values outside of a current sequence number window for a flow as potential spurious transmissions.
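
The sequence number window idea can be sketched as follows, assuming that the 64-bit Identification is treated as a sequence number that wraps modulo 2^64. The function name, the window representation (a start value plus a size) and the wraparound policy are assumptions made for illustration; this specification does not define them.

```python
ID_MODULUS = 1 << 64  # size of the 64-bit Identification space


def in_sequence_window(ident: int, window_start: int, window_size: int) -> bool:
    """Return True if ident lies in [window_start, window_start + window_size).

    The comparison is done modulo 2**64 so that a window spanning the
    wraparound point is handled correctly.
    """
    if not 0 < window_size <= ID_MODULUS:
        raise ValueError("window_size must be between 1 and 2**64")
    return (ident - window_start) % ID_MODULUS < window_size
```

A receiver using such a check would treat packets for which the function returns False as potential spurious transmissions for the flow.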

A robust IP fragmentation and reassembly service can provide a useful tool for performance maximization in the Internet when an extended IP ID is available. This document therefore presents a means to extend the IPv6 ID to better support these services through the introduction of an IPv6 Extended Fragment Header.

4. IPv6 Extended Fragment Header

For a standard 4-octet IPv6 Identification, the source can simply include an ordinary IPv6 Fragment Header as specified in [RFC8200] with the Fragment Offset field and M flag set either to values appropriate for a fragmented packet or the value 0 for an unfragmented packet. The source then includes a 4-octet Identification value for the packet.

For an extended Identification and/or for paths that do not recognize the standard IPv6 Fragment Header, this specification permits the source to instead include an IPv6 Extended Fragment Header in a Destination Options Header. The Extended Fragment Header must appear as the first option in a Destination Options Header that appears immediately following the Hop-by-Hop Options Header (if present) and immediately before any other Per-Fragment extension headers - see Sections 4.1 and 4.5 of [RFC8200]. The IPv6 Destination Options Header with Extended Fragment Header option is formatted as shown in Figure 1:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |  Hdr Ext Len  |  Option Type  |  Opt Data Len |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   NH-Cache    |   Index   |P|S|      Fragment Offset    |R|D|M|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-              Identification (64 bits)           -+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Next Header           encodes protocol number of header following
                         the current Destination Options header.

   Hdr Ext Len           8-bit value 1 (i.e., 2 units of 8 octets) or
                         a larger value if there are more options.

   Option Type           8-bit value 'XXX[TBD1]'.

   Opt Data Len          8-bit value 12.

   NH-Cache              a temporary copy of Next Header cached
                         when the packet is subject to fragmentation.

   Index/P/S             a control octet that identifies the components
                         of an IP Parcel [I-D.templin-intarea-parcels].

   Fragment Offset       the same as the fragmentation offset field
                         of the standard IPv6 Fragment Header.

   R/D/M                 fragmentation control flags: "(R)eserved",
                         "(D)on't Fragment" and "(M)ore Fragments".

   Identification        an 8-octet (64-bit) unsigned integer
                         Identification, in network byte order.
Figure 1: IPv6 Extended Fragment Header
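
The layout in Figure 1 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the class and field names are hypothetical, the Option Type is supplied by the caller because TBD1 is not yet assigned, and the sketch assumes that the Extended Fragment Header is the only option in the Destination Options Header (so Hdr Ext Len is 1 and Opt Data Len is 12).

```python
import struct
from dataclasses import dataclass

# Field order follows Figure 1: Next Header, Hdr Ext Len, Option Type,
# Opt Data Len, NH-Cache, Index/P/S octet, Fragment Offset + R/D/M,
# Identification. "!" selects network byte order; total size is 16 octets.
_LAYOUT = struct.Struct("!BBBBBBHQ")


@dataclass
class ExtFragHeader:
    next_header: int
    option_type: int          # act/chg bits plus TBD1 (unassigned placeholder)
    nh_cache: int = 0
    index: int = 0            # 6 bits
    p: int = 0
    s: int = 0
    frag_offset: int = 0      # 13 bits, in 8-octet units as in [RFC8200]
    r: int = 0
    d: int = 0
    m: int = 0
    identification: int = 0   # 64 bits

    def pack(self) -> bytes:
        if not 0 <= self.index < (1 << 6):
            raise ValueError("Index must fit in 6 bits")
        if not 0 <= self.frag_offset < (1 << 13):
            raise ValueError("Fragment Offset must fit in 13 bits")
        control = (self.index << 2) | ((self.p & 1) << 1) | (self.s & 1)
        frag = ((self.frag_offset << 3) | ((self.r & 1) << 2)
                | ((self.d & 1) << 1) | (self.m & 1))
        # Hdr Ext Len = 1 and Opt Data Len = 12 assume no further options.
        return _LAYOUT.pack(self.next_header, 1, self.option_type, 12,
                            self.nh_cache, control, frag, self.identification)

    @classmethod
    def unpack(cls, data: bytes) -> "ExtFragHeader":
        (next_header, _hdr_ext_len, option_type, opt_data_len,
         nh_cache, control, frag, identification) = _LAYOUT.unpack(data[:16])
        if opt_data_len != 12:
            raise ValueError("unexpected Opt Data Len")
        return cls(
            next_header=next_header, option_type=option_type,
            nh_cache=nh_cache,
            index=control >> 2, p=(control >> 1) & 1, s=control & 1,
            frag_offset=frag >> 3,
            r=(frag >> 2) & 1, d=(frag >> 1) & 1, m=frag & 1,
            identification=identification,
        )
```

For example, ExtFragHeader(next_header=17, option_type=0x5E).pack() yields a 16-octet header, where 0x5E is an arbitrary placeholder standing in for the unassigned Option Type.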

The Extended Fragment Header option is therefore identified as an Option Type with the low-order 5 bits set to TBD1 (see: IANA Considerations) and with the highest-order 3 bits set as discussed below. The Identification field is 8 octets (64 bits) in length, and the option may appear either in an unfragmented IPv6 packet or in one for which IPv6 fragmentation is applied (with a copy of the header appearing in each fragment).

When the source applies source fragmentation using the Extended Fragment Header destination option, fragmentation procedures are the same as for the standard IPv6 Fragment Header except that the Destination Option itself is included in the Per-Fragment Headers (see: Section 4.5 of [RFC8200]) and the standard IPv6 Fragment Header is omitted. The source further sets the R/D/M fragmentation control flags as discussed in Section 5.

When the source applies source fragmentation, it sets the highest-order 2 bits of the Option Type (i.e., the "act" code) to '01', '10' or '11' so that destinations that do not recognize the option will drop the packets/fragments (and possibly also return an ICMPv6 Parameter Problem message). When the source sets the fragmentation control "D" flag to '0', it also sets the third-highest-order bit of the Option Type (i.e., the "chg" flag) to '1' since the option contents may change in the path. When no fragmentation is applied by the source or permitted by the network, the source instead sets the highest-order 3 bits of the Option Type to '000'.
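
The bit settings described above can be sketched as follows. The function names are hypothetical and the 5-bit "rest" value is a caller-supplied placeholder for TBD1. The text does not state which "act" code applies when the source does not fragment but sets D to '0'; the sketch assumes a non-zero "act" code in that case as well, and that assumption should be checked against the final specification.

```python
def make_option_type(act: int, chg: int, rest: int) -> int:
    """Concatenate the 2-bit act, 1-bit chg and 5-bit rest fields."""
    if not (0 <= act <= 3 and chg in (0, 1) and 0 <= rest <= 0x1F):
        raise ValueError("act, chg or rest out of range")
    return (act << 6) | (chg << 5) | rest


def select_option_type(rest: int, source_fragments: bool, d_flag: int,
                       drop_act: int = 0b01) -> int:
    """Choose the Option Type octet for an Extended Fragment Header.

    drop_act is the non-zero act code used whenever unrecognizing
    destinations should drop the packet.
    """
    if drop_act not in (0b01, 0b10, 0b11):
        raise ValueError("drop_act must be 01, 10 or 11")
    if not source_fragments and d_flag == 1:
        # No source fragmentation and none permitted in the network.
        return make_option_type(0b00, 0, rest)
    # chg = 1 when D = 0, since the option contents may change in the path.
    chg = 1 if d_flag == 0 else 0
    # Assumption: a non-zero act code is also used when D = 0 and the
    # source itself did not fragment.
    return make_option_type(drop_act, chg, rest)
```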

For each fragment produced during fragmentation, the source also caches the Next Header value found in the final Per-Fragment extension header in the NH-Cache field. The source then resets this final Next Header field to "No Next Header" (note that the "final" Per-Fragment extension header may be the Destination Options header containing the Extended Fragment Header itself).

The destination reassembles the same as specified in Section 4.5 of [RFC8200]. Following reassembly, the destination resets the final Per-Fragment extension header Next Header field to the value cached in the NH-Cache field.

Intermediate systems that forward packets fragmented in this way will therefore interpret the data that follows the Per-Fragment headers as undefined data (by virtue of the "No Next Header" setting) unless they are configured to more deeply inspect the data content.
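
The NH-Cache handling described in this section can be sketched as follows for the simplest case, in which the Destination Options header containing the Extended Fragment Header is itself the final Per-Fragment extension header and the option is the first one in that header. The function names and byte offsets are illustrative assumptions derived from Figure 1; whether the destination clears NH-Cache after restoring is not stated here, so the sketch leaves it unchanged.

```python
NO_NEXT_HEADER = 59      # IPv6 "No Next Header" protocol number [RFC8200]
NEXT_HEADER_OFFSET = 0   # Next Header octet of the Destination Options header
NH_CACHE_OFFSET = 4      # NH-Cache octet, first octet of the option data
MIN_HEADER_LEN = 16      # header length when no other options are present


def cache_next_header(dest_opts: bytes) -> bytes:
    """Source side: save Next Header in NH-Cache, then set No Next Header."""
    if len(dest_opts) < MIN_HEADER_LEN:
        raise ValueError("Destination Options header too short")
    hdr = bytearray(dest_opts)
    hdr[NH_CACHE_OFFSET] = hdr[NEXT_HEADER_OFFSET]
    hdr[NEXT_HEADER_OFFSET] = NO_NEXT_HEADER
    return bytes(hdr)


def restore_next_header(dest_opts: bytes) -> bytes:
    """Destination side, after reassembly: restore Next Header from NH-Cache."""
    if len(dest_opts) < MIN_HEADER_LEN:
        raise ValueError("Destination Options header too short")
    hdr = bytearray(dest_opts)
    hdr[NEXT_HEADER_OFFSET] = hdr[NH_CACHE_OFFSET]
    return bytes(hdr)
```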

5. IPv6 Network Fragmentation

When an IPv6 network intermediate system forwards a packet that includes an IPv6 Destination Option with Extended Fragment Header option, it can optionally examine the R/D/M fragmentation control flags the same as for IPv4 intermediate systems. In particular:

When an intermediate system forwards a packet with D=0 that is too large to traverse the next hop toward the destination, it can apply (further) fragmentation using the same procedures as specified for the source above, except that it only rewrites the Next Header fields if both Fragment Offset and M are 0.

This specification therefore updates [RFC8200] by permitting network fragmentation for IPv6 under the above conditions.

Note: Intermediate systems that do not recognize/process the IPv6 Extended Fragment Header Destination Option drop packets that are too large to traverse the next hop toward the destination and return a standard ICMPv6 Packet Too Big (PTB) message [RFC4443][RFC8201].
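
The (further) fragmentation step can be sketched as follows. The sketch assumes the conventional rule that every piece except the last carries a multiple of 8 octets, that offsets are expressed in 8-octet units, and that the last piece inherits the M flag of the fragment being split; the function names and the tuple representation are illustrative assumptions and not part of this specification.

```python
from typing import List, Tuple


def refragment(frag_offset: int, more: int, payload: bytes,
               max_payload: int) -> List[Tuple[int, int, bytes]]:
    """Split one packet or fragment payload into (offset, M, data) pieces.

    frag_offset is in 8-octet units. max_payload is the largest number of
    payload octets that fits in one piece on the next hop.
    """
    chunk = (max_payload // 8) * 8  # all but the last piece are 8-octet aligned
    if chunk <= 0:
        raise ValueError("max_payload must allow at least 8 octets")
    pieces = []
    for start in range(0, len(payload), chunk):
        data = payload[start:start + chunk]
        is_last = start + chunk >= len(payload)
        # Intermediate pieces always set M; the last keeps the original M.
        m_flag = more if is_last else 1
        pieces.append((frag_offset + start // 8, m_flag, data))
    return pieces


def rewrites_next_header(frag_offset: int, more: int) -> bool:
    """An intermediate system rewrites the Next Header fields only when it
    fragments a previously unfragmented packet (Fragment Offset and M both 0)."""
    return frag_offset == 0 and more == 0
```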

6. Destination Qualification

Destinations that do not recognize the Extended Fragment Header Destination Option accept or drop the packet according to the Option Type "act" code. If the "act" code is '00', destinations ignore the option and accept the packet. For other codes, destinations instead drop the packet and (for "act" codes '1X') may return a Code 2 ICMPv6 Parameter Problem message [RFC4443]. (ICMPv6 messages may be lost on the return path and/or manufactured by an adversary, however, and therefore provide only an advisory indication.)

The source can then test whether destinations recognize the Extended Fragment Header by occasionally sending "probe" packets (either fragmented or unfragmented) that include the option with an "act" code other than '00'. Acknowledgments give the source assurance that destinations recognize the option, while ICMPv6 Parameter Problem messages hint that some destinations do not; both may be received for the same flow. The source should re-probe destinations occasionally in case routing redirects a flow to a different anycast destination or in case a multicast group membership changes (see: Section 8).
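
The behavior of a destination that does not recognize the option follows the "act" code rules of Section 4.2 of [RFC8200], which can be sketched as follows. The function name and the returned (accept, send ICMPv6 error) pair are illustrative choices.

```python
from typing import Tuple


def unrecognized_option_action(option_type: int,
                               dest_is_multicast: bool) -> Tuple[bool, bool]:
    """Return (accept_packet, send_parameter_problem) for an unrecognized
    option, based on the highest-order 2 bits of the Option Type."""
    act = (option_type >> 6) & 0b11
    if act == 0b00:
        return (True, False)    # skip the option and keep processing
    if act == 0b01:
        return (False, False)   # discard silently
    if act == 0b10:
        return (False, True)    # discard; Code 2 Parameter Problem
    # act == 0b11: discard; Parameter Problem only if not multicast
    return (False, not dest_is_multicast)
```

A source that probes with "act" code '10' or '11' can therefore expect an ICMPv6 Parameter Problem hint from unicast destinations that do not recognize the option, subject to the caveat above that such messages are only advisory.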

7. Packet Too Big (PTB) Extensions

When an intermediate system attempts to forward an IP packet that exceeds the next hop link MTU but for which fragmentation is forbidden, it returns an ICMPv6 Packet Too Big (PTB) message to the source [RFC4443][RFC8201] and discards the packet. This always results in wasted transmissions by the source and all intermediate systems on the path toward the one with the restricting link. Conversely, when network fragmentation is permitted, intermediate systems may perform (further) fragmentation if necessary, allowing the packet to reach the destination without loss due to a size restriction. This results in an internetwork that is more adaptive to dynamic MTU fluctuations and less subject to wasted transmissions.

While the fragmentation and reassembly processes eliminate wasted transmissions and support significant performance gains by accommodating upper layer protocol segment sizes that exceed the path MTU, the processes sometimes represent pain points that should be communicated to the source. The source should then take measures to reduce the size of the packets/fragments that it sends.

The ICMPv6 PTB format includes a Code field set to the value 0 for ordinary PTB messages. The value 0 signifies a "classic" PTB and always denotes that the subject packet was unconditionally dropped due to a size restriction.

For end systems and intermediate systems that recognize the Extended Fragment Header according to this specification, the following additional PTB unused/Code values are defined:

1 (suggested)
Sent by an intermediate system (subject to rate limiting) when it performs (further) fragmentation on a packet with an Extended Fragment Header. The intermediate system sends the PTB message while still fragmenting and forwarding the packet. The MTU field of the PTB message includes the maximum fragment size that can pass through the restricting link as an indication for the source to reduce its (source) fragment sizes. This size will often be considerably smaller than the current receive packet size advertised by the destination.
2 (suggested)
The same as for Code 1, except that the intermediate system drops the packet instead of fragmenting and forwarding. This message type represents a hard error indicating loss. In one possible strategy, the intermediate system could begin sending Code 1 PTBs then revert to sending Code 2 PTBs if the source fails to reduce its fragment sizes.
3 (suggested)
Sent by the destination (subject to rate limiting) when it performs reassembly on a packet with an Extended Fragment Header during periods of reassembly congestion and/or network fragment loss. The destination sends the PTB message while still reassembling and accepting the packet if possible. The MTU field of the PTB message includes the largest desired receive packet size (less than or equal to the EMTU_R) under current congestion constraints as an indication for the source to begin sending smaller packets if necessary. This size will still often be considerably larger than the path MTU and must be no smaller than the IPv6 minimum EMTU_R.
4 (suggested)
The same as for Code 3, except that the destination drops the packet instead of reassembling and accepting. This message type represents a hard error indicating loss due to either network fragment loss or reassembly buffer congestion. In one possible strategy, the destination could begin sending Code 3 PTBs then revert to dropping packets while sending Code 4 PTBs if the source fails to reduce its packet sizes.

Sources that receive PTB messages with Code 1/2 should immediately engage source fragmentation for future packets using a maximum fragment size no larger than the MTU advertised in the PTB messages. This not only eases the burden on intermediate system fragmentation but also ensures better performance by avoiding degenerate fragment sizes that may result.

Sources that receive PTB messages with Code 3/4 should adaptively tune the size of the packets they send for better performance and to minimize congestion both for themselves and the network as a whole.
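
One possible source reaction to these PTB Codes can be sketched as follows. The Code values are the suggested (not yet assigned) ones listed above, and the class and function names are hypothetical. The sketch also assumes that the source never shrinks fragments below the 1280-octet IPv6 minimum link MTU or packets below the 1500-octet IPv6 minimum EMTU_R [RFC8200], and it leaves classic Code 0 handling to ordinary Path MTU Discovery [RFC8201].

```python
from dataclasses import dataclass
from enum import IntEnum


class PtbCode(IntEnum):
    CLASSIC = 0
    FRAG_SOFT = 1    # suggested value, pending IANA assignment
    FRAG_HARD = 2    # suggested value, pending IANA assignment
    REASM_SOFT = 3   # suggested value, pending IANA assignment
    REASM_HARD = 4   # suggested value, pending IANA assignment


IPV6_MIN_LINK_MTU = 1280   # floor assumed for fragment sizes
IPV6_MIN_EMTU_R = 1500     # floor assumed for (reassembled) packet sizes


@dataclass
class SendState:
    max_fragment_size: int
    max_packet_size: int


def handle_ptb(state: SendState, code: int, mtu: int) -> bool:
    """Update state from a PTB message; return True if the PTB reports
    that the subject packet was dropped (Codes 0, 2 and 4)."""
    ptb = PtbCode(code)  # raises ValueError for unknown Codes
    if ptb in (PtbCode.FRAG_SOFT, PtbCode.FRAG_HARD):
        advertised = max(mtu, IPV6_MIN_LINK_MTU)
        state.max_fragment_size = min(state.max_fragment_size, advertised)
    elif ptb in (PtbCode.REASM_SOFT, PtbCode.REASM_HARD):
        advertised = max(mtu, IPV6_MIN_EMTU_R)
        state.max_packet_size = min(state.max_packet_size, advertised)
    return ptb in (PtbCode.CLASSIC, PtbCode.FRAG_HARD, PtbCode.REASM_HARD)
```

A real implementation would also validate the PTB message before acting on it, since ICMPv6 messages can be forged; that validation is omitted here.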

8. Multicast and Anycast

In addition to unicast flows, similar considerations apply for flows in which the destination is a multicast group or an anycast address. Unless the source and all candidate destinations are members of a limited domain network [RFC8799] for which all nodes recognize the IPv6 Extended Fragment Header Destination Option, some destinations may recognize the option while others drop packets containing the option and may return a Code 2 ICMPv6 Parameter Problem message [RFC4443].

When a source sends packets/fragments with IPv6 Extended Fragment Headers to a multicast group, the packets/fragments may be replicated in the network such that a single transmission may reach N destinations over as many as N different paths. Intermediate systems in each such path may return a Code 1/2 PTB message if (further) fragmentation is needed, and each such destination may return a Code 3/4 PTB message if it experiences congestion and/or loss. (Each destination may also return a Code 2 ICMPv6 Parameter Problem message if it does not recognize the option.)

While the source receives PTB messages, it should reduce the fragment/packet sizes that it sends to the multicast group even if only one or a few paths or destinations are currently experiencing congestion. This means that transmissions to a multicast group will converge to the performance characteristics of the lowest common denominator group member destinations and/or paths. While the source receives ICMPv6 Parameter Problem messages, it must determine whether the benefits for group members that recognize the Extended Fragment Header option outweigh the drawbacks of service denial for those that do not.

When a source sends packets/fragments with IPv6 Extended Fragment Headers to an anycast address, routing may direct initial fragments of the same packet to a first destination that configures the address while directing the remaining fragments to other destinations that configure the address. These wayward fragments will simply result in incomplete reassemblies at each such anycast destination which will soon purge the fragments from the reassembly buffer. The source will eventually retransmit, and all resulting fragments should eventually reach a single reassembly target.

9. Requirements

All nodes that process an IPv6 Destination Options Header with Extended Fragment Header option observe the extension header limits specified in [I-D.ietf-6man-eh-limits].

Intermediate systems MUST forward without dropping IPv6 packets that include a Destination Options header with an Extended Fragment Header unless they detect a security policy threat through deeper inspection of the protocol data that follows.

Sources MUST include at most one IPv6 Standard or Extended Fragment Header in each IPv6 packet/fragment. Intermediate systems and destinations SHOULD silently drop packets/fragments that include more than one. If the source includes an Extended Fragment Header, it must appear as the first option in a first Destination Options Header immediately following the Hop-by-Hop Options Header if present (otherwise following the IPv6 header itself) and immediately before any other Per-Fragment extension headers.

Destinations that accept flows using Extended Fragment Headers:

10. A Note on Fragmentation Considered Harmful

During the earliest days of internetworking, researchers attributed the warning label "harmful" to IP fragmentation based on empirical observations in the ARPANET, DARPA Internet and other internetworks of the day [KENT87]. This inspired an engineering discipline known as "Path MTU Discovery" within an emerging community of interest known as the Internet Engineering Task Force (IETF).

In more recent times, the IETF published "IP Fragmentation Considered Fragile" [RFC8900] to characterize the current state of fragmentation in the modern Internet. The IPv6 Extended Fragment Header now introduces a more robust solution based on a properly functioning IP fragmentation and reassembly service as intended in the original architecture.

Although the IP fragmentation and reassembly services provide an appropriate solution for conventional packet sizes as large as 65535 octets, they cannot be applied for larger packets nor for IP parcels and Advanced Jumbos (AJs) [I-D.templin-intarea-parcels]. This means that a combined solution with robust fragmentation and reassembly applied in parallel with traditional path MTU probing provides a combination well suited for Internetworking futures. This document therefore updates [RFC8900].

11. Implementation Status

In progress.

12. IANA Considerations

The IANA is requested to assign a new IPv6 Destination Option type in the "Destination Options and Hop-by-Hop Options" table of the 'ipv6-parameters' registry (registration procedures IESG Approval, IETF Review or Standards Action). The option should appear in 8 consecutive table entries that set "act" to 'XX', "chg" to 'X', "rest" to TBD1, "Description" to "IPv6 Extended Fragment Header" and "Reference" to this document [RFCXXXX] (i.e., with act/chg set to 00/0 for the first line, 00/1 for the second line, 01/0 for the third line, etc., up to and including 11/1 for the last line). Each line also sets "Hex Value" to the 2-digit hexadecimal value corresponding to the 8-bit concatenation of act/chg/rest.
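
The eight requested table entries can be enumerated with a short sketch such as the following. The function name is hypothetical and the "rest" argument is a placeholder for the unassigned TBD1 value.

```python
from typing import List, Tuple


def option_table_rows(rest: int) -> List[Tuple[str, str, str]]:
    """Return (act, chg, hex value) rows in the order 00/0, 00/1, ..., 11/1."""
    if not 0 <= rest <= 0x1F:
        raise ValueError("rest must fit in 5 bits")
    rows = []
    for act in range(4):
        for chg in range(2):
            # Hex Value is the 8-bit concatenation of act, chg and rest.
            value = (act << 6) | (chg << 5) | rest
            rows.append((f"{act:02b}", str(chg), f"0x{value:02X}"))
    return rows
```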

The IANA is further instructed to assign new Code values in the "ICMPv6 Code Fields: Type 2 - Packet Too Big" table of the 'icmpv6-parameters' registry (registration procedure is Standards Action or IESG Approval). The registry should appear as follows:

   Code                  Name                         Reference
   ---                   ----                         ---------
   0                     PTB Hard Error               [RFC4443]
   1 (suggested)         Fragmentation Needed (soft)  [RFCXXXX]
   2 (suggested)         Fragmentation Needed (hard)  [RFCXXXX]
   3 (suggested)         Reassembly Needed (soft)     [RFCXXXX]
   4 (suggested)         Reassembly Needed (hard)     [RFCXXXX]
Figure 2: ICMPv6 Code Fields: Type 2 - Packet Too Big Values

13. Security Considerations

All aspects of IP security apply equally to this document, which does not introduce any new vulnerabilities. Moreover, when employed correctly the mechanisms in this document robustly address known IP reassembly integrity concerns [RFC4963] and also provide an advanced degree of packet Identification uniqueness assurance.

All normative security guidance on IPv6 fragmentation (e.g., processing of tiny first fragments, overlapping fragments, etc.) applies also to the fragments generated under the Extended Fragment Header.

14. Acknowledgements

This work was inspired by continued DTN performance studies. Amanda Baber, Tom Herbert, Bob Hinden, Christian Huitema, Mark Smith and Eric Vyncke offered useful insights that helped improve the document.

Honoring life, liberty and the pursuit of happiness.

15. References

15.1. Normative References

[RFC0791]
Postel, J., "Internet Protocol", STD 5, RFC 791, DOI 10.17487/RFC0791, <https://www.rfc-editor.org/info/rfc791>.
[RFC1122]
Braden, R., Ed., "Requirements for Internet Hosts - Communication Layers", STD 3, RFC 1122, DOI 10.17487/RFC1122, <https://www.rfc-editor.org/info/rfc1122>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC4443]
Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", STD 89, RFC 4443, DOI 10.17487/RFC4443, <https://www.rfc-editor.org/info/rfc4443>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8200]
Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", STD 86, RFC 8200, DOI 10.17487/RFC8200, <https://www.rfc-editor.org/info/rfc8200>.
[RFC8201]
McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., "Path MTU Discovery for IP version 6", STD 87, RFC 8201, DOI 10.17487/RFC8201, <https://www.rfc-editor.org/info/rfc8201>.

15.2. Informative References

[I-D.ietf-6man-eh-limits]
Herbert, T., "Limits on Sending and Processing IPv6 Extension Headers", Work in Progress, Internet-Draft, draft-ietf-6man-eh-limits-11, <https://datatracker.ietf.org/doc/html/draft-ietf-6man-eh-limits-11>.
[I-D.templin-intarea-parcels]
Templin, F. L., "IP Parcels and Advanced Jumbos (AJs)", Work in Progress, Internet-Draft, draft-templin-intarea-parcels-91, <https://datatracker.ietf.org/api/v1/doc/document/draft-templin-intarea-parcels/>.
[KENT87]
Kent, C. and J. Mogul, "Fragmentation Considered Harmful", SIGCOMM '87: Proceedings of the ACM workshop on Frontiers in computer communications technology, DOI 10.1145/55482.55524, <http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-87-3.pdf>.
[RFC4963]
Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly Errors at High Data Rates", RFC 4963, DOI 10.17487/RFC4963, <https://www.rfc-editor.org/info/rfc4963>.
[RFC6437]
Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme, "IPv6 Flow Label Specification", RFC 6437, DOI 10.17487/RFC6437, <https://www.rfc-editor.org/info/rfc6437>.
[RFC6864]
Touch, J., "Updated Specification of the IPv4 ID Field", RFC 6864, DOI 10.17487/RFC6864, <https://www.rfc-editor.org/info/rfc6864>.
[RFC7739]
Gont, F., "Security Implications of Predictable Fragment Identification Values", RFC 7739, DOI 10.17487/RFC7739, <https://www.rfc-editor.org/info/rfc7739>.
[RFC8799]
Carpenter, B. and B. Liu, "Limited Domains and Internet Protocols", RFC 8799, DOI 10.17487/RFC8799, <https://www.rfc-editor.org/info/rfc8799>.
[RFC8900]
Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O., and F. Gont, "IP Fragmentation Considered Fragile", BCP 230, RFC 8900, DOI 10.17487/RFC8900, <https://www.rfc-editor.org/info/rfc8900>.

Appendix A. Change Log

<< RFC Editor - remove prior to publication >>

Differences from earlier versions:

Author's Address

Fred L. Templin (editor)
Boeing Research & Technology
P.O. Box 3707
Seattle, WA 98124
United States of America