Internet-Draft | IPv6 Parcels and AJs | December 2023 |
Templin | Expires 14 June 2024 | [Page] |
IPv6 packets contain a single unit of transport layer protocol data which becomes the retransmission unit in case of loss. Transport layer protocols including the Transmission Control Protocol (TCP) and reliable transport protocol users of the User Datagram Protocol (UDP) prepare data units known as segments which the network layer packages into individual IPv6 packets each containing only a single segment. This specification presents new packet constructs known as IPv6 Parcels and Advanced Jumbos (AJs) with different properties. Parcels permit a single packet to include multiple segments as a "packet-of-packets", while AJs offer significant operational advantages over basic jumbograms for transporting singleton segments of all sizes ranging from very small to very large. Parcels and AJs provide essential building blocks for improved performance, efficiency and integrity while encouraging larger Maximum Transmission Units (MTUs) in the Internet.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 14 June 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
IPv6 packets [RFC8200] contain a single unit of transport layer protocol data which becomes the retransmission unit in case of loss. Transport layer protocols such as the Transmission Control Protocol (TCP) [RFC9293] and reliable transport protocol users of the User Datagram Protocol (UDP) [RFC0768] (including QUIC [RFC9000], LTP [RFC5326] and others) prepare data units known as segments which the network layer packages into individual IPv6 packets each containing only a single segment. This document presents a new construct known as an "IPv6 Parcel" which permits a single packet to include multiple segments. The parcel is essentially a "packet-of-packets" with the full {TCP,UDP}/IPv6 headers appearing only once but with possibly multiple segments included.¶
Transport layer protocol entities form parcels by preparing a data buffer (or buffer chain) containing at most 64 consecutive transport layer protocol segments that can be broken out into individual packets or smaller sub-parcels as necessary. All segments except the final one must be equal in length and no larger than 65535 octets, while the final segment must be no larger than the others. The transport layer protocol entity then presents the buffer(s), number of segments and non-final segment size to the network layer. The network layer next appends per-segment headers and trailers, merges the segments into the parcel body, appends a single {TCP,UDP} header and finally appends a single IPv6 header plus extensions that identify this as a parcel and not an ordinary packet.¶
The network layer then forwards each parcel over consecutive parcel-capable links in a path until it arrives at a node with a next hop link that does not support parcels, a parcel-capable link with a size restriction, or an ingress Overlay Multilink Network (OMNI) Interface [I-D.templin-intarea-omni] connection to an OMNI link that spans intermediate Internetworks. In the first case, the original source or next hop router applies packetization to break the parcel into individual IPv6 packets. In the second case, the node applies network layer parcellation to form smaller sub-parcels. In the final case, the OMNI interface applies adaptation layer parcellation to form still smaller sub-parcels, then applies adaptation layer IPv6 encapsulation and fragmentation if necessary. The node then forwards the resulting packets/parcels/fragments to the next hop.¶
Following IPv6 reassembly if necessary, an egress OMNI interface applies adaptation layer reunification if necessary to merge multiple sub-parcels into a minimum number of larger (sub-)parcels then delivers them to the network layer which either processes them locally or forwards them via the next hop link toward the final destination. The final destination can then apply network layer (parcel-based) reunification or (packet-based) restoration if necessary to deliver a minimum number of larger (sub-)parcels to the transport layer. Reordering, loss or corruption of individual segments within the network is therefore possible, but most importantly the parcels delivered to the final destination's transport layer should be the largest practical size for best performance, and loss or receipt of individual segments (rather than parcel size) determines the retransmission unit.¶
This document further introduces an Advanced Jumbo (AJ) service that provides essential extensions beyond the basic IPv6 jumbogram service defined in [RFC2675]. AJs provide end systems and intermediate systems with a more robust service when transmission of singleton segments of all sizes ranging from very small to very large is necessary.¶
The following sections discuss rationale for creating and shipping parcels and AJs as well as actual protocol constructs and procedures involved. Parcels and AJs provide essential building blocks for improved performance, efficiency and integrity while encouraging larger Maximum Transmission Units (MTUs). A new Internetworking link service model for parcels and AJs further supports delay/disruption tolerance especially suited for air/land/sea/space mobility applications. These services should inspire future innovation in applications, transport protocols, operating systems, network equipment and data links in ways that promise to transform the Internet architecture.¶
The Oxford Languages dictionary defines a "parcel" as "a thing or collection of things wrapped in paper in order to be carried or sent by mail". Indeed, there are many examples of parcel delivery services worldwide that provide an essential transit backbone for efficient business and consumer transactions.¶
In this same spirit, an "IPv6 parcel" is simply a collection of at most 64 transport layer protocol segments wrapped in an efficient package for transmission and delivery as a "packet-of-packets", with each segment including its own end-to-end integrity checks. Each segment may be up to 65535 octets in length, and all non-final segments must be equal in length while the final segment may be smaller. IPv6 parcels are distinguished from ordinary packets and various jumbogram types through the constructs specified in this document.¶
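The segment sizing rules above can be illustrated with a minimal sketch. The function name `split_into_segments` and the constants are hypothetical and are not part of this specification; the sketch only shows how a buffer might be cut into at most 64 segments where all non-final segments are equal in length and the final segment may be smaller.

```python
MAX_SEGMENTS = 64          # a parcel carries at most 64 segments
MAX_SEGMENT_LEN = 65535    # each segment is at most 65535 octets


def split_into_segments(data: bytes, seg_len: int) -> list[bytes]:
    """Cut data into equal-length segments; only the last may be shorter."""
    if not 1 <= seg_len <= MAX_SEGMENT_LEN:
        raise ValueError("segment length out of range")
    if not data:
        raise ValueError("no data to send")
    segments = [data[i:i + seg_len] for i in range(0, len(data), seg_len)]
    if len(segments) > MAX_SEGMENTS:
        raise ValueError("too many segments for one parcel")
    return segments
```

For example, a 2500-octet buffer with a segment length of 1000 yields two 1000-octet non-final segments and one 500-octet final segment.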
Where the document refers to "IPv6 header length", it means only the length of the base IPv6 header (i.e., 40 octets), while the length of any extension headers is referred to separately as the "IPv6 extension header length". The term "IPv6 header plus extensions" refers generically to an IPv6 header plus all included extension headers.¶
The term "Advanced Jumbo (AJ)" refers to a new type of IPv6 jumbogram modeled from the basic IPv6 jumbogram construct defined in [RFC2675]. AJs include a 32-bit Jumbo Payload Length field and a single transport layer protocol segment the same as for basic IPv6 jumbograms, but are differentiated from parcels and other jumbogram types by including an "Advanced Jumbo Type" value in the IPv6 Payload Length field plus end-to-end segment integrity checks the same as for parcels. Unlike basic IPv6 jumbograms which are always 64KB or larger, AJs can range in size from as small as the headers plus a minimal or even null payload to as large as 2**32 octets minus headers.¶
Where the document refers to "{TCP,UDP} header length", it means the length of either the TCP header plus options (20 or more octets) or the UDP header (8 octets). It is important to note that only a single IPv6 header and a single full {TCP,UDP} header appears in each parcel regardless of the number of segments included. This distinction often provides a significant overhead savings advantage made possible only by parcels.¶
Where the document refers to checksum calculations, it means the standard Internet checksum unless otherwise specified. The same as for TCP [RFC9293] and UDP [RFC0768], the standard Internet checksum is defined as (sic) "the 16-bit one's complement of the one's complement sum of all (pseudo-)headers plus data, padded with zero octets at the end (if necessary) to make a multiple of two octets". A notional Internet checksum algorithm can be found in [RFC1071], while practical implementations require detailed attention to network byte ordering to ensure interoperability between diverse architectures.¶
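As an illustration of the definition quoted above, the following is a notional (unoptimized) sketch of the standard Internet checksum. The function name is an assumption made for this example; the sketch interprets each octet pair in network byte order and pads odd-length input with a zero octet.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                    # pad to a multiple of two octets
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # network byte order words
    while total >> 16:                     # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Applied to the eight example octets 00 01 f2 03 f4 f5 f6 f7 used in [RFC1071], the one's complement sum is 0xddf2 and the function returns its complement, 0x220d.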
The term Cyclic Redundancy Check (CRC) is used consistently with its application in widely deployed Internetworking services. Parcels use the CRC32C [RFC3385] or CRC64E [ECMA-182] standards according to non-final segment length "L" (see: Section 11). AJs include either a CRC or message digest calculated according to the MD5 [RFC1321], SHA1 [RFC3174] or US Secure Hash [RFC6234] algorithms. In all cases, the CRC or message digest is appended as a per-segment trailer arranged for transmission in network byte order per standard Internetworking conventions.¶
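For readers unfamiliar with CRC32C, a minimal bitwise sketch follows. It assumes the commonly used reflected form of the Castagnoli polynomial (0x82F63B78) with an all-ones initial value and final inversion; the function names are hypothetical, CRC64E and the message digest options are not shown, and practical implementations would normally use table-driven or hardware-assisted code.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78   # Castagnoli polynomial, bit-reflected


def crc32c(data: bytes) -> int:
    """Bitwise CRC32C over data (slow but easy to follow)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def crc32c_trailer(data: bytes) -> bytes:
    """4-octet trailer in network byte order, as a per-segment trailer."""
    return crc32c(data).to_bytes(4, "big")
```

The widely published check value for the ASCII string "123456789" is 0xE3069283, which this sketch reproduces.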
The terms "application layer (L5 and higher)", "transport layer (L4)", "network layer (L3)", "(data) link layer (L2)" and "physical layer (L1)" are used consistently with common Internetworking terminology, with the understanding that reliable delivery protocol users of UDP are considered as transport layer elements. The OMNI specification further defines an "adaptation layer" logically positioned below the network layer but above the link layer (which may include physical links and Internet- or higher-layer tunnels). The adaptation layer is not associated with a layer number itself and is simply known as "the layer below L3 but above L2". A network interface is a node's attachment to a link (via L2), and an OMNI interface is therefore a node's attachment to an OMNI link (via the adaptation layer).¶
The term "parcel/AJ-capable link/path" refers to paths that transit interfaces to adaptation layer and/or link layer media (either physical or virtual) capable of transiting {TCP,UDP}/IPv6 packets that employ the parcel/AJ constructs specified in this document. The source and each router in the path has a "next hop link" that forwards parcels/AJs toward the final destination, while each router and the final destination has a "previous hop link" that accepts en route parcels/AJs. Each next hop link must be capable of forwarding parcels/AJs (after first applying parcellation if necessary) with segment lengths no larger than can transit the link. Currently only the OMNI link satisfies these properties, while other link types that support parcels/AJs should soon follow.¶
The term "5-tuple" refers to a transport layer protocol entity identifier that includes the network layer (source address, destination address, source port, destination port, protocol number). The term "4-tuple" refers to a network layer parcel entity identifier that includes the adaptation layer (source address, destination address, Parcel ID, Identification).¶
The Internetworking term "Maximum Transmission Unit (MTU)" is widely understood to mean the largest packet size that can transit a single link ("link MTU") or an entire path ("path MTU") without requiring network layer fragmentation. If the MTU value returned during parcel path qualification is larger than 65535 (plus the length of the parcel headers), it determines the maximum-sized parcel/AJ that can transit the link/path without requiring a router to perform packetization/parcellation. If the MTU is no larger than 65535, the value instead determines the "Maximum Segment Size (MSS)" for the leading portion of the path up to a router that cannot forward the parcel further. (Note that this size may still be larger than the MSS that can transit the remainder of the path to the final destination, which can only be determined through explicit MSS probing.)¶
The terms "packetization" and "restoration" refer to a network layer process in which the original source or a router on the path breaks a parcel out into individual IPv6 packets that can transit the remainder of the path without loss due to a size restriction. The final destination then restores the combined packet contents into a parcel before delivery to the transport layer. In current practice, packetization/restoration can be considered as functional equivalents to the well-known Generic Segmentation/Receive Offload (GSO/GRO) services.¶
The terms "parcellation" and "reunification" refer to either network layer or adaptation layer processes in which the original source or a router on the path breaks a parcel into smaller sub-parcels that can transit the path without loss due to a size restriction. These sub-parcels are then reunified into larger (sub-)parcels before delivery to the transport layer. As a network layer process, the sub-parcels resulting from parcellation may only be reunified at the final destination. As an adaptation layer process, the resulting sub-parcels may first be reunified at an adaptation layer egress node then possibly further reunified by the network layer of the final destination.¶
The terms "fragmentation" and "reassembly" follow exactly from their definitions in the IPv6 [RFC8200] standard. In particular, OMNI interfaces support IPv6 encapsulation and fragmentation as an adaptation layer process that can transit packet/parcel/AJ sizes that exceed the underlying Internetwork path MTU. OMNI interface fragmentation/reassembly occurs at a lower layer of the protocol stack than restoration and/or reunification and therefore provides a complementary service. Note that IPv6 parcels and AJs are not eligible for direct fragmentation and reassembly at the network layer but become eligible for adaptation layer fragmentation and reassembly following OMNI IPv6 encapsulation.¶
"Automatic Extended Route Optimization (AERO)" [I-D.templin-intarea-aero] and the "Overlay Multilink Network Interface (OMNI)" [I-D.templin-intarea-omni] provide an adaptation layer framework for transmission of parcels/AJs over one or more concatenated Internetworks. AERO/OMNI will provide an operational environment for parcels/AJs beginning from the earliest deployment phases and extending indefinitely to accommodate continuous future growth. As more and more parcel/AJ-capable links are enabled (e.g., in data centers, wireless edge networks, space-domain optical links, etc.) AERO/OMNI will continue to provide an essential service for Internetworking performance maximization.¶
The parcel sizing variables "J", "K", "L" and "M" are cited extensively throughout this document. "J" denotes the number of non-final segments included in the parcel, "K" is the length of the final segment, "L" is the length of each non-final segment and "M" is termed the "Parcel Payload Length".¶
IPv6 parcels and AJs are derived from the basic jumbogram specification found in [RFC2675], but the specifications in this document take precedence whenever they differ from the basic requirements. Most notably, IPv6 parcels and AJs use one of either the IPv6 Minimum Path MTU [RFC9268] or basic IPv6 jumbogram [RFC2675] Hop-by-Hop option. (The former is used during path probing and initial parcel/AJ transmissions while the latter is used for more efficient transmissions following path qualification.)¶
IPv6 parcels/AJs are further permitted to encode values other than 0 in the IPv6 Payload Length field and they are not limited to packet sizes that exceed 65535 octets. (Instead, parcels can be as small as the packet headers plus a singleton segment with its integrity checks while AJs can be as small as the headers plus a NULL payload.)¶
The same as for standard jumbograms, IPv6 parcels and AJs are not eligible for direct network layer IPv6 fragmentation and reassembly although they may become eligible for adaptation layer fragmentation and reassembly following OMNI IPv6 encapsulation. IPv6 parcels and AJs therefore SHOULD NOT include IPv6 (Extended) Fragment Headers, and implementations MUST silently ignore any IPv6 (Extended) Fragment Headers in IPv6 parcels and AJs.¶
For further Hop-by-Hop option considerations, see: [I-D.ietf-6man-hbh-processing]. For IPv6 extension header limits, see: [I-D.ietf-6man-eh-limits].¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119][RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Studies have shown that applications can improve their performance by sending and receiving larger packets due to reduced numbers of system calls and interrupts as well as larger atomic data copies between kernel and user space. Larger packets also result in reduced numbers of network device interrupts and better network utilization (e.g., due to header overhead reduction) in comparison with smaller packets.¶
A first study [QUIC] involved performance enhancement of the QUIC protocol [RFC9000] using the Linux Generic Segmentation/Receive Offload (GSO/GRO) facility. GSO/GRO provides a robust service that has shown significant performance increases based on a multi-segment transfer capability between the operating system kernel and QUIC applications. GSO/GRO performs (virtual) fragmentation and reassembly at the transport layer with the transport protocol segment size limited by the path MTU (typically 1500 octets or smaller in today's Internet).¶
A second study [I-D.templin-dtn-ltpfrag] showed that GSO/GRO also improves performance for the Licklider Transmission Protocol (LTP) [RFC5326] used for the Delay Tolerant Networking (DTN) Bundle Protocol [RFC9171] for segments larger than the actual path MTU through the use of OMNI interface encapsulation and fragmentation. Historically, the NFS protocol also saw significant performance increases using larger (single-segment) UDP datagrams even when IPv6 fragmentation is invoked, and LTP still follows this profile today. Moreover, LTP shows this (single-segment) performance increase profile extending to the largest possible segment size which suggests that additional performance gains are possible using (multi-segment) parcels or AJs that approach or even exceed 65535 octets in total length.¶
TCP also benefits from larger packet sizes and efforts have investigated TCP performance using jumbograms internally with changes to the Linux GSO/GRO facilities [BIG-TCP]. The approach proposed to use the Jumbo Payload option internally and to allow GSO/GRO to use buffer sizes that exceed 65535 octets, but with the understanding that links that support jumbograms natively are not yet widely deployed and/or enabled. Hence, parcels/AJs provide a packaging that can be considered in the near term under current deployment limitations.¶
A limiting consideration for sending large packets is that they are often lost at links with MTU restrictions, and the resulting Packet Too Big (PTB) messages [RFC4443][RFC8201] may be lost somewhere in the return path to the original source. This path MTU "black hole" condition can degrade performance unless robust path probing techniques are used. In all cases, the best performance occurs when loss of packets due to size restrictions is minimized.¶
These considerations therefore motivate a design where transport protocols can employ segment sizes as large as 65535 octets (minus headers) while parcels that carry multiple segments may themselves be significantly larger. (Transport layer protocols can also use AJs to transit even larger singleton segments.) Parcels allow the receiving transport layer protocol entity to process multiple segments in parallel instead of one at a time per existing practices. Parcels therefore support improvements in performance, integrity and efficiency for the original source, final destination and networked path as a whole. This is true even if the network and lower layers need to apply packetization/restoration, parcellation/reunification and/or fragmentation/reassembly.¶
An analogy: when a consumer orders 50 small items from a major online retailer, the retailer does not ship the order in 50 separate small boxes. Instead, the retailer packs as many of the small items as possible into one or a few larger boxes (i.e., parcels) then places the parcels on a semi-truck or airplane. The parcels may then pass through one or more regional distribution centers where they may be repackaged into different parcel configurations and forwarded further until they are finally delivered to the consumer. But most often, the consumer will only find one or a few parcels at their doorstep and not 50 separate small boxes. This flexible parcel delivery service greatly reduces shipping and handling cost for all including the retailer, regional distribution centers and finally the consumer.¶
The classical Internetworking link service model requires each link in the path to apply a link-layer frame integrity check often termed a "Frame Check Sequence (FCS)". The link near-end calculates and appends an FCS trailer to each packet pending transmission, and the link far-end verifies the FCS upon packet reception. If verification fails, the link far-end unconditionally discards the packet. This process is repeated for each link in the path so that only packets that pass all link-layer checks are delivered to the final destination.¶
While this link service model has contributed to the unparalleled success of terrestrial Internetworks (including the global public Internet), new uses in which significant delays or disruptions can occur are not as well supported. For example, a path that contains multiple links with higher bit error rates may be unable to pass an acceptable percentage of packets since loss due to link errors can occur at any hop. Moreover, packets that incur errors at an intermediate link but somehow pass the link integrity check will be forwarded by all remaining links in the path leaving only the final destination's Internet checksum as a last resort integrity check. Advanced error detection and correction services not typically associated with packets are therefore necessary, especially with the advent of space-domain and wireless Internetworking where paths with long delays and significant disruptions are often intolerant of retransmissions.¶
Parcels and AJs include an end-to-end Cyclic Redundancy Check (CRC) or message digest with each segment that is calculated and inserted by the original source and verified by the final destination. For each IPv6 parcel or AJ admitted into a parcel/AJ-capable link, the link near-end applies its standard link-layer FCS upon transmission which the link far-end then verifies upon reception. Instead of unconditionally discarding frames with link errors, however, the link far-end delivers all parcel/AJ frames to upper layers. If a link error was detected at any hop, the link far-end sets a "CRC error" flag in the parcel/AJ header (see: Section 11).¶
Each link along the path simply discards any ordinary packets that have incurred link errors according to current practice. For IPv6 parcels and AJs received with link errors, however, each intermediate hop SHOULD and the final destination MUST first verify the parcel/AJ header Checksum to protect against mis-delivery. Each intermediate hop then unconditionally forwards the parcel/AJ to the next hop even though it may include link errors.¶
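The difference between the classical and the parcel/AJ link service models can be sketched as follows. This is a simplified model under stated assumptions: the `Frame` type, its field names and `link_far_end_receive` are hypothetical, and the parcel/AJ header Checksum verification that guards against mis-delivery is not modeled.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Frame:
    is_parcel_or_aj: bool     # True for IPv6 parcels and AJs
    fcs_ok: bool              # result of the link-layer FCS check on this hop
    crc_error: bool = False   # "CRC error" flag carried in the parcel/AJ header


def link_far_end_receive(frame: Frame) -> Optional[Frame]:
    """Return the frame to deliver to upper layers, or None if discarded."""
    if frame.fcs_ok:
        return frame                      # no link error on this hop
    if not frame.is_parcel_or_aj:
        return None                       # ordinary packet: discard as usual
    return replace(frame, crc_error=True)  # parcel/AJ: flag it and deliver
```

Because the flag is carried forward in the header, a parcel/AJ that incurred a link error at an earlier hop still arrives at the final destination with the flag set even if later hops pass their FCS checks.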
IPv6 Parcel/AJ segments may therefore acquire cumulative link errors along the path, but the parcel/AJ error bit plus per segment end-to-end CRCs and/or Internet checksums support final destination integrity checking. The final destination in turn delivers each segment to the local transport layer along with an error flag that is set if an end-to-end CRC or Internet checksum error was detected (otherwise the flag is cleared). The error flag is then taken under advisement by the transport layer, which should employ transport or higher-layer integrity checks to guide corrective actions.¶
The ubiquitous 1500 octet link MTU had its origins in the very earliest deployments of 10Mbps Ethernet technologies beginning in the early 1980s. Modern wired-line link data rates of 1Gbps are now typical for end user devices such as laptop computers, while much higher rates of 10Gbps, 100Gbps or even more commonly occur for data center servers. At these data rates, the serialization delay for a 1500 octet frame ranges from 1200usec at 10Mbps to only 0.12usec at 100Gbps [ETHERMTU]. This suggests that the legacy 1500 octet MTU may be too small by multiple orders of magnitude for many well-connected data centers, wide-area wired-line networked paths or even for deep space communications over optical links. For these cases, larger parcels and AJs present a performance maximization vehicle that supports larger transport layer segment sizes.¶
While data centers, Internetworking backbones and deep space networks are often connected through robust fixed link services, the Internet edge is rapidly evolving to a much more mobile environment where 4G/5G (and beyond) cellular services and WiFi radios connect a growing majority of end user systems. Although some wireless edge networks and mobile ad-hoc networks support considerable data rates, more typical rates with wireless signal disruption and link errors suggest that limiting channel contention by configuring more conservative MTU levels is often prudent. Even in such environments, a mixed link model with error-tolerant data sent in parcels/AJs and error-intolerant data sent in packets may present a more balanced profile.¶
IPv6 parcels and AJs therefore provide a revolutionary advancement for delay/disruption tolerance in air/land/sea/space mobile Internetworking applications. As the Internet continues to evolve from its more stable fixed terrestrial network origins to one where more and more nodes operate in the mobile edge, this new link service model relocates error detection and correction responsibilities from intermediate systems to the end systems that are uniquely capable of taking corrective actions.¶
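The serialization delay figures quoted above follow from simple arithmetic: the frame size in bits divided by the link rate. The short sketch below (with a hypothetical function name) reproduces them and ignores preamble, FCS and inter-packet gap overhead.

```python
def serialization_delay_usec(frame_octets: int, rate_bps: float) -> float:
    """Time in microseconds to clock frame_octets onto a link at rate_bps."""
    return frame_octets * 8 * 1e6 / rate_bps


for rate in (10e6, 1e9, 10e9, 100e9):
    delay = serialization_delay_usec(1500, rate)
    print(f"{rate / 1e6:>8.0f} Mbps: {delay:g} usec")
```

At 10Mbps a 1500-octet frame occupies the link for 1200 microseconds, while at 100Gbps the same frame takes 0.12 microseconds, a difference of four orders of magnitude.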
Note: IPv6 parcels and AJs may already be compatible with widely-deployed link types such as 1/10/100-Gbps Ethernet. Each Ethernet frame is identified by a preamble followed by a Start Frame Delimiter (SFD) followed by the frame data itself followed by the FCS and finally an Inter Packet Gap (IPG). Since no length field is included, however, the frame can theoretically extend as long as necessary for transmission of IPv6 parcels and AJs that are much larger than the typical 1500 octet Ethernet MTU as long as the time duration on the link media is properly bounded. Widely-deployed links may therefore already include all of the necessary features to natively support large parcels and AJs with no additional extensions, while operating systems may need to be modified to post larger receive buffers.¶
A transport protocol entity identified by its 5-tuple forms a parcel body by preparing a data buffer (or buffer chain) containing at most 64 transport layer protocol segments, with each TCP segment preceded by a 4-octet Sequence Number header. Each segment plus Sequence Number (for TCP) is further preceded by a 2-octet Internet Checksum header and followed by a 4- or 8-octet CRC trailer. All non-final segments MUST be equal in length while the final segment MUST NOT be larger and MAY be smaller. The number of non-final segments is represented as J; the total number of segments is therefore (J + 1).¶
The non-final segment size L is set to a 16-bit value that MUST be no smaller than 256 octets and SHOULD be no larger than 65535 octets minus the length of the {TCP,UDP} header (plus options), minus the length of the IPv6 header (plus extensions), minus 2 octets for the Checksum header, minus 4 octets for the Sequence Number (for TCP), minus 4/8 octets for the CRC trailer (see: Appendix B). The final segment length K MUST NOT be larger than L but MAY be smaller. The transport layer protocol entity then presents the buffer(s) and size L to the network layer, noting that the combined buffer length(s) may exceed 65535 octets when there are sufficient segments of a large enough size.¶
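The upper bound on L described above is a straightforward subtraction, sketched below. The function and constant names are hypothetical, and the 24-octet extension header length used in the usage note is only an illustrative assumption, not a value taken from this document.

```python
IPV6_HEADER_LEN = 40      # base IPv6 header
CHECKSUM_HDR_LEN = 2      # per-segment Internet Checksum header
TCP_SEQ_HDR_LEN = 4       # per-segment Sequence Number header (TCP only)
MIN_L = 256               # L MUST be no smaller than 256 octets


def recommended_max_l(transport_hdr_len: int, ext_hdr_len: int,
                      is_tcp: bool, crc_len: int = 4) -> int:
    """Largest L that the text says a source SHOULD use."""
    overhead = (transport_hdr_len + IPV6_HEADER_LEN + ext_hdr_len
                + CHECKSUM_HDR_LEN + crc_len)
    if is_tcp:
        overhead += TCP_SEQ_HDR_LEN
    return 65535 - overhead


def meets_must_rules(l: int, k: int) -> bool:
    """Check only the MUST-level rules: L >= 256 and K <= L."""
    return l >= MIN_L and k <= l
```

For a TCP parcel with a 20-octet TCP header, an assumed 24 octets of IPv6 extension headers and a 4-octet CRC, `recommended_max_l(20, 24, True)` evaluates to 65441.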
If the next hop link is not parcel capable, the network layer performs packetization to package each segment as an individual IPv6 packet as discussed in Section 7.1. If the next hop link is parcel capable, the network layer instead completes the parcel by appending a single full {TCP,UDP} header (plus options) and a single full IPv6 header (plus extensions). The network layer finally includes a specially-formatted Parcel Payload option as an extension to the IPv6 header of each parcel prior to transmission over a network interface.¶
The Parcel Payload option format for IPv6 appears as shown in Figure 1:¶
The network layer includes the Parcel Payload option as an IPv6 Hop-by-Hop option with Option Type set to '0x30' and Opt Data Len set to 14. The length also distinguishes this type from its use as the IPv6 Minimum Path MTU Hop-by-Hop Option [RFC9268]. The network layer then sets the IPv6 header Payload Length field to L and sets Parcel Payload Length to a 3-octet value M that encodes the length of the IPv6 extension headers plus the length of the {TCP,UDP} header plus the combined length of all concatenated segments with their Checksum and sequence number (for TCP) headers and CRC trailers.¶
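The Parcel Payload Length M described above can be sketched as a sum of its parts. This sketch assumes that L and K count only the segment data, so the per-segment Checksum header, Sequence Number header (for TCP) and CRC trailer are added separately; that reading, the function name and the parameter names are assumptions made for illustration.

```python
MAX_M = 2**24 - 1   # M is encoded as a 3-octet value


def parcel_payload_length(j: int, l: int, k: int, ext_hdr_len: int,
                          transport_hdr_len: int, is_tcp: bool,
                          crc_len: int = 4) -> int:
    """M = extension headers + {TCP,UDP} header + all framed segments."""
    per_segment = 2 + (4 if is_tcp else 0) + crc_len   # checksum, seq, CRC
    m = (ext_hdr_len + transport_hdr_len
         + j * (l + per_segment)        # J non-final segments of length L
         + (k + per_segment))           # one final segment of length K
    if m > MAX_M:
        raise ValueError("parcel too large for a 3-octet length field")
    return m
```

With J = 2, L = 1000, K = 500, an assumed 24 octets of extension headers and a 20-octet TCP header, the result is 24 + 20 + 2 * 1010 + 510 = 2574 octets.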
The network layer next sets Index to an ordinal segment "Parcel Index" value between 0 and 63, sets the "(P)arcel" flag to 1 and sets the "More (S)egments" flag to 1 for non-final sub-parcels or 0 for the final (sub-)parcel. (Note that non-zero Index values identify the initial segment index in non-first sub-parcels of a larger original parcel while the value 0 denotes the first (sub-)parcel.) The network layer finally includes an 8-octet Identification, then sets Code to 255 and sets Check to the same value that will appear in the IPv6 header Hop Limit field on transmission. These values provide hop-by-hop assurance that previous hops correctly implement the parcel protocol without applying legacy IPv6 option processing per [RFC9268].¶
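The Index and "More (S)egments" semantics can be illustrated with a small parcellation/reunification sketch. The `SubParcel` type and the `max_segs` limit (segments per sub-parcel) are hypothetical simplifications; a real node would derive the split from link size restrictions and would also carry the Identification, Code and Check fields, which are omitted here.

```python
from dataclasses import dataclass


@dataclass
class SubParcel:
    index: int             # index of this sub-parcel's first segment (0..63)
    more_segments: bool    # True for non-final sub-parcels
    segments: list


def parcellate(segments: list, max_segs: int) -> list:
    """Split a parcel's segments into sub-parcels of at most max_segs each."""
    if max_segs < 1:
        raise ValueError("max_segs must be at least 1")
    subs = []
    for start in range(0, len(segments), max_segs):
        chunk = segments[start:start + max_segs]
        more = start + len(chunk) < len(segments)
        subs.append(SubParcel(start, more, chunk))
    return subs


def reunify(subs: list) -> list:
    """Merge sub-parcels back into one segment list, ordered by Index."""
    merged = []
    for sub in sorted(subs, key=lambda s: s.index):
        merged.extend(sub.segments)
    return merged
```

Splitting six segments with `max_segs=2` yields sub-parcels with Index values 0, 2 and 4, where only the last has the "More (S)egments" flag cleared; `reunify` restores the original order even if the sub-parcels arrive reordered.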
Following this transport and network layer processing, {TCP,UDP}/IPv6 parcels therefore have the structures shown in Figure 2:¶
A TCP Parcel is an IPv6 Parcel that includes an IPv6 header plus extensions with a Parcel Payload option formed as shown in Section 6 with Parcel Payload Length encoding a value no larger than 16,777,215 (2**24 - 1) octets. The IPv6 header plus extensions is then followed by a TCP header plus options (20 or more octets) followed by (J + 1) consecutive segments that each include a 2-octet Internet Checksum plus 4-octet per-segment Sequence Number header and 4/8-octet CRC trailer. The TCP header Sequence Number is set to 0, each non-final segment is L octets in length and the final segment is K octets in length. The value L is encoded in the IPv6 header Payload Length field while the overall length of the parcel is determined by the Parcel Payload Length M.¶
The source prepares TCP Parcels in an alternative adaptation of TCP jumbograms [RFC2675]. The source calculates a checksum of the TCP header plus IPv6 pseudo-header only (see: Section 11). The source then writes the exact calculated value in the TCP header Checksum field (i.e., without converting calculated 0 values to '0xffff').¶
The source next calculates the Internet checksum for each segment independently beginning with the Sequence Number header and extending over the length of the segment, then writes the value into the 2-octet Checksum header. The source then calculates the CRC beginning with the Checksum header and extending over both the Sequence Number header and the length of the segment, then writes the value into the 4/8-octet CRC trailer.¶
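The per-segment integrity steps above can be sketched as follows. The Internet checksum follows the usual ones'-complement definition; the CRC shown here is `zlib.crc32`, used purely as a stand-in, since the actual CRC algorithm is defined elsewhere in the specification (see Section 11) and is not reproduced in this sketch. The function names are hypothetical.

```python
import struct
import zlib


def internet_checksum(data: bytes) -> int:
    """Ones'-complement Internet checksum over data (zero-padded to 16 bits)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_tcp_segment(seq: int, payload: bytes) -> bytes:
    """Return Checksum header + Sequence Number header + payload + CRC trailer."""
    body = struct.pack("!I", seq) + payload          # checksum coverage starts here
    checksum = internet_checksum(body)
    covered = struct.pack("!H", checksum) + body     # CRC coverage starts here
    crc = zlib.crc32(covered)                        # stand-in 4-octet CRC (assumption)
    return covered + struct.pack("!I", crc)
```

A receiver can verify such a segment by recomputing the CRC over everything except the trailer and by confirming that the Internet checksum taken over the Checksum header plus the remaining body evaluates to 0.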
Note: The parcel TCP header Source Port, Destination Port and (per-segment) Sequence Number fields apply to each parcel segment, while the TCP control bits and all other fields apply only to the first segment (i.e., "segment(0)"). Therefore, only parcel segment(0) may be associated with control bit settings while all other segment(i)'s must be simple data segments.¶
See Appendix A for additional TCP considerations. See Section 11 for additional integrity considerations.¶
A UDP Parcel is an IPv6 Parcel that includes an IPv6 header plus extensions with a Parcel Payload option formed as shown in Section 6 with Parcel Payload Length encoding a value no larger than 16,777,215 (2**24 - 1) octets. The IPv6 header plus extensions is then followed by an 8-octet UDP header followed by (J + 1) transport layer segments with their Checksum headers and CRC trailers. Each segment must begin with a transport-specific start delimiter (e.g., a segment identifier, a sequence number, etc.) included by the transport layer user of UDP. The length of the first segment L is encoded in the IPv6 Payload Length field while the overall length of the parcel is determined by the Parcel Payload Length M as above.¶
The source prepares UDP Parcels in an alternative adaptation of UDP jumbograms [RFC2675]. The source first sets the UDP header length field to 0, then calculates the checksum of the UDP header plus IPv6 pseudo-header (see: Section 11) and writes the exact calculated value into the UDP header Checksum field (i.e., without converting calculated 0 values to '0xffff'). If UDP checksums are enabled, the source also calculates a separate checksum for each segment while writing the values into the corresponding per-segment Checksum header with calculated 0 values converted to '0xffff' (if UDP checksums are disabled, the source instead writes the value 0). The source then calculates the CRC over each segment beginning with the segment Checksum field and writes the value into the 4/8-octet CRC trailer.¶
See: Section 11 for additional integrity considerations.¶
The parcel source unambiguously encodes the values L and M in the corresponding header fields as specified above. The values J and K are not encoded in header fields and must therefore be calculated by intermediate and final destination nodes as follows:¶
Note: from the above calculations, a minimal parcel is one that sets L to at least 256 and includes at least one segment no larger than L along with its per-segment header(s) and trailer. In addition, all parcels set L to at most 65535 and contain at most 64 segments along with their corresponding headers/trailers.¶
When the network layer of the source assembles a {TCP,UDP}/IPv6 parcel it fully populates all IPv6 header fields including the source address, destination address and Parcel Payload option as above. The source also sets IPv6 Payload Length to L (between 256 and 65535) to distinguish the parcel from other jumbogram types (see: Section 8).¶
The network layer of the source also maintains a randomly-initialized 8-octet (64-bit) Identification value for each destination. For each packet, parcel or AJ transmission, the source sets the Identification to the current cached value for this destination and increments the cached value by 1 (modulo 2**64) for each successive transmission. (The source can then reset the cached value to a new random number when necessary, e.g., to maintain an unpredictable profile.) For each parcel transmission, the source includes the Identification value in the IPv6 Parcel Payload Option.¶
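The per-destination Identification handling described above might be sketched as shown below. The class and method names are hypothetical, and keying the cache by a destination address string is an assumption made for illustration only.

```python
import secrets


class IdentificationCache:
    """Per-destination 64-bit Identification values (illustrative sketch)."""

    def __init__(self):
        self._next = {}

    def next_id(self, destination: str) -> int:
        """Return the Identification to use now and advance the cached value."""
        current = self._next.get(destination)
        if current is None:
            current = secrets.randbits(64)       # random initial value
        self._next[destination] = (current + 1) % 2**64
        return current

    def reset(self, destination: str) -> None:
        """Re-randomize, e.g., to maintain an unpredictable profile."""
        self._next[destination] = secrets.randbits(64)
```

Successive calls to `next_id()` for the same destination return consecutive values that wrap from 2**64 - 1 back to 0.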
The network layer of the source finally presents the parcel to an interface for transmission to the next hop. For ordinary interface attachments to parcel-capable links, the source simply admits each parcel into the interface the same as for any IPv6 packet where it may be forwarded by one or more routers over additional consecutive parcel-capable links possibly even traversing the entire forward path to the final destination. Note that any node in the path that does not recognize the parcel construct may either drop it and return an ICMP Parameter Problem message or (erroneously) attempt to forward it as an ordinary packet.¶
Most importantly, each parcel-capable link in the path forwards the parcel even if link errors were detected since all parcels/AJs include end-to-end CRC and Checksum integrity checks. This ensures that the vast majority of coherent data is delivered to the final destination instead of being discarded along with a minor amount of corrupted data at an intermediate hop. When the link far end receives a parcel/AJ that includes link errors, it sets a "CRC error" flag in the parcel/AJ header before forwarding to the next hop (see: Section 11).¶
When the next hop link does not support parcels at all, or when the next hop link is parcel-capable but configures an MTU that is too small to pass the entire parcel, the source breaks the parcel up into individual IPv6 packets (in the first case) or into smaller sub-parcels (in the second case). In the first case, the source can apply packetization using Generic Segment Offload (GSO), and the final destination can apply restoration using Generic Receive Offload (GRO) to deliver the largest possible parcel buffer(s) to the transport layer. In the second case, the source can apply parcellation to break the parcel into sub-parcels with each containing the same Identification value and with the S flag set appropriately. The final destination can then apply reunification to deliver the largest possible parcel buffer(s) to the transport layer. In all other ways, the source processes of breaking a parcel up into individual IPv6 packets or smaller sub-parcels entail the same considerations as for a router on the path that invokes these processes as discussed in the following subsections.¶
Parcel probes that test the forward path's ability to pass parcels set a Path MTU (PMTU) field to a non-zero value as discussed in Section 7.5. Each router in the path then rewrites PMTU in a similar fashion as for [RFC9268]. Specifically, each router compares the parcel PMTU value with the next hop link MTU in the parcel path and MUST (re)set PMTU to the minimum value. The fact that the parcel transited a previous hop link provides sufficient evidence of forward progress (since parcel path MTU determination is unidirectional in the forward path only), but nodes can also include the previous hop link MTU in their minimum PMTU calculations in case the link may have an ingress size restriction (such as a receive buffer limitation). Each parcel also includes one or more transport layer segments corresponding to the 5-tuple for the flow, which may include {TCP,UDP} segment size probes used for packetization layer path MTU discovery [RFC4821][RFC8899]. (See: Section 7.5 for further details on parcel path probing.)¶
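The PMTU rewriting rule above amounts to a running minimum along the forward path. The sketch below uses hypothetical function names and treats the previous hop link MTU as an optional input, mirroring the optional ingress consideration in the text.

```python
def rewrite_pmtu(probe_pmtu, next_hop_mtu, prev_hop_mtu=None):
    """Return the PMTU a router writes back into the probe."""
    candidates = [probe_pmtu, next_hop_mtu]
    if prev_hop_mtu is not None:       # optional ingress size restriction
        candidates.append(prev_hop_mtu)
    return min(candidates)


def probe_path(initial_pmtu, link_mtus):
    """Apply the rewrite at each successive link and return the final PMTU."""
    pmtu = initial_pmtu
    for mtu in link_mtus:
        pmtu = rewrite_pmtu(pmtu, mtu)
    return pmtu
```

For instance, a probe that starts with PMTU 65535 and crosses links with MTUs 65535, 9000 and 16000 would arrive carrying PMTU 9000.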
When a router receives a parcel it first compares Code with 255 and Check with the IPv6 header Hop Limit; if either value differs, the router drops the parcel and returns a negative Jumbo Report (see: Section 7.6) subject to rate limiting. (Note that the parcel may also have been truncated in length by a previous-hop router that does not recognize the construct.) For all other intact parcels, the router next compares the value L with the next hop link MTU. If the next hop link is parcel capable but configures an MTU too small to admit a parcel with a single segment of length L the router returns a positive Jumbo Report (subject to rate limiting) with MTU set to the next hop link MTU. If the next hop link is not parcel capable and configures an MTU too small to pass an individual IPv6 packet with a single segment of length L the router instead returns a positive Parcel Report (subject to rate limiting) with MTU set to the next hop link MTU. If the next hop link is parcel capable the router MUST reset Check to the same value that would appear in the IPv6 header Hop Limit field upon transmission to the next hop.¶
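One simplified reading of the router checks above is sketched below. All names are hypothetical; the sizes of a singleton sub-parcel and of an individual packet carrying one segment of length L are passed in as parameters because their exact header overhead depends on details outside this paragraph. The sketch also assumes the router decrements Hop Limit by 1 when forwarding, as for any IPv6 packet.

```python
PARCEL_CODE = 255


def hop_check_ok(code, check, hop_limit):
    """True when the previous hop handled the option as a parcel."""
    return code == PARCEL_CODE and check == hop_limit


def select_report(code, check, hop_limit, next_hop_mtu, parcel_capable,
                  singleton_parcel_len, single_packet_len):
    """Return the name of the report to send (rate limited), or None."""
    if not hop_check_ok(code, check, hop_limit):
        return "negative Jumbo Report"       # the parcel is also dropped
    if parcel_capable and singleton_parcel_len > next_hop_mtu:
        return "positive Jumbo Report"
    if not parcel_capable and single_packet_len > next_hop_mtu:
        return "positive Parcel Report"
    return None


def check_for_next_hop(hop_limit):
    """Value to write into Check: the Hop Limit as it will be transmitted."""
    if hop_limit <= 1:
        raise ValueError("Hop Limit exhausted; the parcel is not forwarded")
    return hop_limit - 1
```

The sketch only selects a report; whether the parcel is then forwarded whole, opened, or dropped follows the procedures in the surrounding text.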
If the router recognizes parcels but the next hop link in the path does not, or if the entire parcel would exceed the next hop link MTU, the router instead opens the parcel. The router then forwards each enclosed segment in individual IPv6 packets or in a set of smaller sub-parcels that each contain a subset of the original parcel's segments. If the next hop link is via an OMNI interface, the router instead follows OMNI Adaptation Layer procedures. These considerations are discussed in detail in the following sections.¶
For transmission of individual packets over links that do not support parcels, or for transmission of (sub-)parcels larger than the next-hop link MTU, the source or router (i.e., the node) engages GSO to perform packetization. The node first determines whether an individual packet with a segment of length L can fit within the next hop link/path MTU. If an individual packet would be too large (and if source fragmentation is not an option), the node drops the parcel and returns a positive Parcel Report message (subject to rate limiting) with MTU set to the next hop link/path MTU and with the leading portion of the parcel beginning with the IPv6 header as the "packet in error". If an individual packet can be accommodated, the node removes the Parcel Payload option and caches the per-segment Checksum header values (and for TCP also caches the Sequence Numbers). The node then verifies the CRCs of each segment(i) (for i = 0 through j) and discards any segment(i)'s with incorrect CRCs. The node then copies the {TCP,UDP}/IPv6 headers followed by segment(i) (i.e., while discarding the per-segment Checksum, Sequence Number and CRC fields) into as many as 'j' individual packets ("packet(i)"). Each such packet(i) will be subject to the independent CRC verifications of each remaining link in the path.¶
For each packet(i), the node then clears the TCP control bits in all but packet(0), and includes only those TCP options that are permitted to appear in data segments in all but packet(0) which may also include control segment options (see: Appendix A for further discussion). The node then sets IPv6 Payload Length for each packet(i) based on the length of segment(i) according to [RFC8200].¶
For each packet(i), the node includes an IPv6 Destination Options Header with an IPv6 Extended Fragment Header option per [I-D.templin-6man-ipid-ext]. The Option Type sets the "act" code to '00' so that destinations that do not recognize the option will still process each packet(i) as a standalone singleton. In the Extended Fragment Header, the node then sets the Identification field to the value found in the parcel header. The node next sets the 6-bit Index field to 'i' and interprets the 2-bit Res field as a "(P)arcel" flag followed by a "More (S)egments" flag, i.e., the same as these fields appear in the Parcel Payload Option in Figure 1. The node then sets P to 1 and finally sets S to 1 for each non-final segment or 0 for the final segment. This document therefore updates [I-D.templin-6man-ipid-ext] by defining the above format for the IPv6 Extended Fragment Header Index/Res field for packets that set Fragment Offset to 0.¶
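The Index/P/S encoding described above fits in a single octet. The sketch below assumes that the 6-bit Index occupies the high-order bits, followed by the P flag and then the S flag in the two low-order bits; this bit placement is an interpretation of "followed by" in network bit order, and the function names are hypothetical.

```python
def pack_index_flags(index, parcel, more_segments):
    """Pack a 6-bit Index, the (P)arcel flag and the More (S)egments flag."""
    if not 0 <= index <= 63:
        raise ValueError("Index is a 6-bit value (0..63)")
    return (index << 2) | (int(bool(parcel)) << 1) | int(bool(more_segments))


def unpack_index_flags(octet):
    """Return (index, parcel, more_segments) from the packed octet."""
    return octet >> 2, bool(octet & 0x02), bool(octet & 0x01)
```

Under this layout, packet(3) of a parcel with more segments to follow would carry the octet value 15 (Index 3, P = 1, S = 1).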
For each TCP/IPv6 packet, the node next sets Payload Length then calculates/sets the checksum for the packet according to [RFC9293]. For each UDP/IPv6 packet, the node instead sets the Payload Length and UDP length fields then calculates/sets the checksum according to [RFC0768]. The node reuses the cached checksum value for each segment in the checksum calculation process. The node first calculates the Internet checksum over the new packet {TCP,UDP}/IPv6 headers then adds the cached segment checksum value. For TCP, the node finally writes the cached Sequence Number value for each segment into the TCP Sequence Number field which initially encoded the value 0 (note that this permits the node to use the cached segment checksum without having to recalculate). For UDP, if a per-segment Checksum was 0 the node instead writes the value 0 in the Checksum field of the corresponding UDP/IPv6 packet.¶
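The reuse of cached per-segment checksums relies on the additive property of ones'-complement arithmetic: the sum over the headers and the sum over the segment can be computed separately and then combined. The sketch below illustrates this under two assumptions: the header bytes (pseudo-header plus transport header with a zeroed Checksum field) have even length, and the cached value is the complemented checksum as stored in the per-segment Checksum header. The function names are hypothetical.

```python
import struct


def ones_sum(data: bytes) -> int:
    """Folded 16-bit ones'-complement sum (not complemented)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def combine_checksum(header: bytes, cached_segment_checksum: int) -> int:
    """Checksum for the new packet without re-reading the segment data."""
    if len(header) % 2:
        raise ValueError("header must be 16-bit aligned")
    # Undo the complement on the cached value to recover the segment sum.
    total = ones_sum(header) + (~cached_segment_checksum & 0xFFFF)
    total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

For TCP, the cached checksum already covers the per-segment Sequence Number, so the header sum is taken while the TCP Sequence Number field still holds 0 and the cached Sequence Number is written afterward. The UDP special case of a per-segment Checksum of 0 (checksums disabled) is handled separately, as the text describes, and is not covered by this sketch.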
For each IPv6 packet, the node then sets both the Fragment Offset field and (M)ore fragments flag to 0 and forwards each packet to the next hop.¶
Note: Packets resulting from packetization may be too large to transit the remaining path to the final destination, such that a router may drop the packet(s) and possibly also return an ordinary ICMP PTB message. Since these messages cannot be authenticated or may be lost on the return path, the original source should take care when setting a segment size larger than the known path MTU, except as part of an active probing service.¶
For transmission of smaller sub-parcels over parcel-capable links, the source or intermediate system (i.e., the node) first determines whether a single segment of length L can fit within the next hop link MTU if packaged as a (singleton) sub-parcel. If a singleton sub-parcel would be too large, the node returns a positive Jumbo Report message (subject to rate limiting) with MTU set to the next hop link MTU and containing the leading portion of the parcel beginning with the IPv6 header, then performs packetization as discussed in Section 7.1. Otherwise, the node employs network layer parcellation to break the original parcel into smaller groups of segments that can traverse the path as whole (sub-)parcels. The node first determines the number of segments of length L that can fit into each sub-parcel under the size constraints. For example, if the node determines that each sub-parcel can contain 3 segments of length L, it creates sub-parcels with the first containing Segments 0-2, the second containing 3-5, the third containing 6-8, etc., and with the final containing any remaining Segments (where each segment includes its Checksum header and CRC trailer from the original (sub-)parcel).¶
If the original parcel's Parcel Payload option has S set to 0, the node then sets S to 1 in all resulting sub-parcels except the last (i.e., the one containing the final segment of length K, which may be shorter than L) for which it sets S to 0. If the original parcel has S set to 1, the node instead sets S to 1 in all resulting sub-parcels including the last. The node next sets the Index field to the value 'i' which is the ordinal number of the first segment included in each sub-parcel. (In the above example, the first sub-parcel sets Index to 0, the second sets Index to 3, the third sets Index to 6, etc.). If another router further down the path toward the final destination forwards the sub-parcel(s) over a link that configures a smaller MTU, the router may break it into even smaller sub-parcels each with Index set to the ordinal number of the first segment included.¶
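The grouping, Index and S flag rules above can be sketched as a planning function that returns one tuple per sub-parcel. The function name and tuple layout are hypothetical; the number of segments that fit in each sub-parcel is taken as an input rather than derived from the link MTU.

```python
def parcellate(num_segments, per_sub_parcel, first_index=0, original_s=0):
    """Return a list of (Index, segment_count, S) tuples, one per sub-parcel.

    first_index is the Index of the (sub-)parcel being broken up, and
    original_s is its S flag before parcellation.
    """
    if num_segments < 1 or per_sub_parcel < 1:
        raise ValueError("need at least one segment per sub-parcel")
    plan = []
    for offset in range(0, num_segments, per_sub_parcel):
        count = min(per_sub_parcel, num_segments - offset)
        is_last = offset + count == num_segments
        # Only the last sub-parcel of a parcel whose S flag was 0 clears S.
        s_flag = 0 if (is_last and original_s == 0) else 1
        plan.append((first_index + offset, count, s_flag))
    return plan
```

With 8 segments and 3 segments per sub-parcel, the plan is Index 0 (3 segments, S = 1), Index 3 (3 segments, S = 1) and Index 6 (2 segments, S = 0). Applying the same function again to a sub-parcel further down the path, with `first_index` set to that sub-parcel's Index, keeps Index values relative to the original parcel.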
The node next appends identical {TCP,UDP}/IPv6 headers (including the Parcel Payload option plus any other extensions) to each sub-parcel while resetting Index, S, {Total, Payload} Length (L) and Parcel Payload Length (M) in each as above. For TCP, the node then clears the TCP control bits in all but the first sub-parcel and includes only those TCP options that are permitted to appear in data segments in all but the first sub-parcel (which may also include control segment options). For both TCP and UDP, the node then resets the {TCP,UDP} Checksum according to ordinary parcel formation procedures (see above). The node finally sets PMTU to the next hop link MTU then forwards each (sub-)parcel to the parcel-capable next hop.¶
For transmission of original parcels or sub-parcels over OMNI interfaces, the node admits all parcels into the interface unconditionally since the OMNI interface MTU is unrestricted. The OMNI Adaptation Layer (OAL) of this First Hop Segment (FHS) OAL source node then forwards the parcel to the next OAL hop which may be either an intermediate node or a Last Hop Segment (LHS) OAL destination. OMNI interface parcellation and reunification procedures are specified in detail in the remainder of this section, while parcel encapsulation and fragmentation procedures are specified in [I-D.templin-intarea-omni].¶
When the OAL source forwards a parcel (whether generated by a local application or forwarded over a network path that transited one or more parcel-capable links), it first assigns a monotonically-incrementing (modulo 64) adaptation layer Parcel ID (note that this value differs from the (Parcel) Index encoded in the Parcel Payload option). If the parcel is larger than the OAL maximum segment size of 65535 octets, the OAL source next employs parcellation to break the parcel into sub-parcels the same as for the above network layer procedures. This includes re-setting the Index, P, S, {Total, Payload} Length (L) and Parcel Payload Length (M) fields in each sub-parcel the same as specified in Section 7.2.¶
The OAL source next assigns a different monotonically-incrementing adaptation layer Identification value for each sub-parcel of the same Parcel ID then performs adaptation layer encapsulation while writing the Parcel ID into the OAL IPv6 Extended Fragment Header. The OAL source then performs OAL fragmentation if necessary and finally forwards each fragment to the next OAL hop toward the OAL destination. (During encapsulation, the OAL source examines the Parcel Payload option S flag to determine the setting for the adaptation layer fragment header S flag according to the same rules specified in Section 7.2.)¶
When the sub-parcels arrive at the OAL destination, it retains them along with their Parcel IDs and Identifications for a short time to support reunification with peer sub-parcels of the same original (sub-)parcel identified by the 4-tuple information corresponding to the OAL source. This reunification entails the concatenation of Checksums/Segments included in sub-parcels with the same Parcel ID and with Identification values within modulo-64 of one another to create a larger sub-parcel possibly even as large as the entire original parcel. The OAL destination concatenates the segments (plus their checksums and CRCs) for each sub-parcel in ascending Identification value order, while ensuring that any sub-parcel with TCP control bits set appears as the first concatenated element in a reunified larger parcel and any sub-parcel with S flag set to 0 appears as the final concatenation. The OAL destination then sets S to 0 in the reunified (sub-)parcel if and only if one of its constituent elements also had S set to 0; otherwise, it sets S to 1.¶
The OAL destination then appends a common {TCP,UDP}/IPv6 header plus extensions to each reunified sub-parcel while resetting Index, S, Payload Length (=L) and Parcel Payload Length (=M) in the corresponding header fields of each. For TCP, if any sub-parcel has TCP control bits set the OAL destination regards it as sub-parcel(0) and uses its TCP header as the header of the reunified (sub-)parcel with the TCP options including the union of the TCP options of all reunified sub-parcels. The OAL destination then resets the {TCP,UDP}/IPv6 header checksum. If the OAL destination is also the final destination, it then delivers the sub-parcels to the network layer which processes them according to the 5-tuple information supplied by the original source. If the OAL destination is not the final destination, it instead forwards each sub-parcel toward the final destination the same as for an ordinary IPv6 packet.¶
Note: Adaptation layer parcellation over OMNI links occurs only at the OAL source while adaptation layer reunification occurs only at the OAL destination (intermediate OAL nodes do not engage in the parcellation/reunification processes). The OAL destination should retain sub-parcels in the reunification buffer only for a short time (e.g., 1 second) or until all sub-parcels of the original parcel have arrived. The OAL destination then delivers full and/or incomplete reunifications to the network layer (in cases where loss and/or delayed arrival interfere with full reunification).¶
Note: OMNI interface parcellation and reunification is an OAL process based on the adaptation layer 4-tuple and not the network layer 5-tuple. This is true even if the OAL has visibility into network layer information since some sub-parcels of the same original parcel may be forwarded over different network paths.¶
Note: Some implementations may encounter difficulty in applying adaptation layer reunification for sub-parcels that have already incurred lower layer fragmentation and reassembly (e.g., due to network kernel buffer structure limitations). In that case, the adaptation layer can either linearize each sub-parcel before applying reunification or deliver incomplete reunifications or even individual sub-parcels to upper layers.¶
When the original source or a router on the path opens a parcel and forwards its contents as individual IPv6 packets, these packets will arrive at the final destination which can hold them in a restoration buffer for a short time before restoring the original parcel using GRO. The 5-tuple information plus the Identification and Index/P/S values included by the source during packetization (see above) provide sufficient context for GRO restoration which practical implementations have proven as a robust service at high data rates.¶
When the original source or a router on the path opens a parcel and forwards its contents as smaller sub-parcels, these sub-parcels will arrive at the final destination which can hold them in a reunification buffer for a short time or until all sub-parcels have arrived. The 5-tuple information plus the Index/P/S and Identification values provide sufficient context for reunification.¶
In both the restoration and reunification cases, the final destination concatenates segments according to ascending Index and/or Identification numbers to preserve segment ordering even if a small degree of reordering and/or loss may have occurred in the networked path. When the final destination performs restoration/reunification on TCP segments, it must include the one with any TCP flag bits set as the first concatenation and with the TCP options including the union of the TCP options of all concatenated packets or sub-parcels. For both TCP and UDP, any packet or sub-parcel containing the final segment must appear as a final concatenation.¶
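The concatenation ordering above can be sketched as a sort over received pieces (individual packets or sub-parcels). The `Piece` type and its field names are hypothetical, ordering here uses Index only, and the returned S flag follows the rule given earlier for reunified sub-parcels (S is 0 only when a constituent piece had S set to 0). Merging of TCP options is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Piece:
    index: int               # ordinal of the first segment carried
    more_segments: bool      # S flag
    has_tcp_control: bool    # any TCP control bits set
    segments: List[bytes]


def concatenate(pieces: List[Piece]) -> Tuple[List[bytes], int]:
    """Return the ordered segments and the S flag of the result."""
    ordered = sorted(
        pieces,
        # The final piece (S = 0) sorts last, a piece with TCP control
        # bits sorts first, and the rest follow ascending Index.
        key=lambda p: (not p.more_segments, not p.has_tcp_control, p.index),
    )
    segments = [seg for p in ordered for seg in p.segments]
    s_flag = 0 if any(not p.more_segments for p in pieces) else 1
    return segments, s_flag
```

If the piece carrying the final segment has not arrived when the buffer is flushed, the result keeps S = 1 and the transport layer sees an incomplete but correctly ordered delivery.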
The final destination can then present the concatenated parcel contents to the transport layer with segments arranged in (nearly) the same order in which they were originally transmitted. Strict ordering is not mandatory since each segment will include a transport layer protocol specific start delimiter with positional coordinates. However, the Index field and/or Identification includes an ordinal value that preserves ordering since each sub-parcel or individual IPv6 packet contains an integral number of whole transport layer protocol segments.¶
Note: Restoration and/or reunification buffer management is based on a hold timer during which singleton packets or sub-parcels are retained until all members of the same original parcel have arrived. Implementations should maintain a short hold timer (e.g., 1 second) and advance any restorations/reunifications to upper layers when the hold timer expires even if incomplete.¶
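A hold timer of this kind might be sketched as follows. The class name, the choice of key (for example, the 5-tuple plus Identification) and the explicit `now` argument are assumptions for illustration; detection of complete parcels before the timer expires is omitted.

```python
class HoldBuffer:
    """Retain pieces per key and release them when the hold timer expires."""

    def __init__(self, hold_time=1.0):
        self.hold_time = hold_time
        self._entries = {}   # key -> (first_arrival_time, [pieces])

    def add(self, key, piece, now):
        """Store a piece; the timer starts at the first arrival for the key."""
        _, pieces = self._entries.setdefault(key, (now, []))
        pieces.append(piece)

    def expire(self, now):
        """Remove and return every entry held for at least hold_time."""
        ready = {
            key: pieces
            for key, (first, pieces) in self._entries.items()
            if now - first >= self.hold_time
        }
        for key in ready:
            del self._entries[key]
        return ready
```

Entries returned by `expire()` are advanced to upper layers even if incomplete, which matches the behavior recommended in the note above.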
Note: Since loss and/or reordering may occur in the network, the final destination may receive a packet or sub-parcel with S set to 0 before all other elements of the same original parcel have arrived. This condition does not represent an error, but in some cases may cause the network layer to deliver sub-parcels that are smaller than the original parcel to the transport layer. The transport layer simply accepts any segments received from all such deliveries and will request retransmission of any segments that were lost and/or damaged.¶
Note: Restoration and/or reunification buffer congestion may indicate that the network layer cannot sustain the service(s) at current arrival rates. The network layer should then begin to deliver incomplete restorations/reunifications or even individual segments to the receive queue (e.g., a socket buffer) instead of waiting for all segments to arrive. The network layer can manage restoration/reunification buffers, e.g., by maintaining buffer occupancy high/low watermarks.¶
Note: Some implementations may encounter difficulty in applying network layer restoration/reunification for packets/sub-parcels that have already incurred adaptation layer reassembly/reunification. In that case, the network layer can either linearize each packet/sub-parcel before applying restoration/reunification or deliver incomplete restorations/reunifications or even individual packets/sub-parcels to upper layers.¶
All parcels also serve as implicit probes and may cause either a router in the path or the final destination to return an ordinary ICMPv6 error [RFC4443] and/or Packet Too Big (PTB) message [RFC8201] concerning the parcel. A router in the path or the final destination may also return a Parcel/Jumbo Report (subject to rate limiting per [RFC4443]) as discussed in Section 7.6.¶
To determine whether parcels can transit at least an initial portion of the forward path toward the final destination, the original source can also send parcels with a Parcel Payload option PMTU field included and set to the next hop link MTU as an explicit Parcel Probe. The Parcel Probe option format is shown in Figure 4, where "Opt Data Len" is set to 18:¶
The parcel probe will cause the final destination or a router on the path to return a Parcel/Jumbo Report.¶
A Parcel Probe can be included either in an ordinary data parcel or a {TCP,UDP}/IPv6 parcel with destination port set to 9 (discard) [RFC0863]. The probe must still contain a valid {TCP,UDP} parcel header Checksum that any intermediate hops as well as the final destination can use to detect mis-delivery, while the final destination will process any parcel data in probes with correct Checksums/CRCs.¶
If the original source receives a positive Parcel/Jumbo Report, it marks the path as "parcels supported" and ignores any ordinary ICMP and/or PTB messages concerning the probe. If the original source instead receives a negative Jumbo Report or no report/reply, it marks the path as "parcels not supported" and may regard any ordinary ICMP and/or PTB messages concerning the probe (or its contents) as indications of a possible path limitation.¶
The original source can therefore send Parcel Probes in the same parcels used to carry real data. The probes will transit parcel-capable links joined by routers on the forward path possibly extending all the way to the destination. If the original source receives a positive Parcel/Jumbo Report it can continue using parcels after adjusting its segment size if necessary.¶
The original source sends Parcel Probes unidirectionally in the forward path toward the final destination to elicit a report, since it will often be the case that parcels/AJs are supported only in the forward path and not in the return path. Parcel Probes may be dropped in the forward path by any node that does not recognize parcels, but Parcel/Jumbo Reports must be packaged to reduce the risk of return path filtering. For this reason, the Parcel Payload options included in Parcel Probes are always packaged as IPv6 Hop-by-Hop options while Parcel/Jumbo Reports are returned as UDP/IPv6 encapsulated ICMPv6 PTB messages with a Parcel/Jumbo Report Code value (see: IANA Considerations).¶
Original sources send ordinary parcels or discard parcels as explicit Parcel Probes by setting the Parcel Payload PMTU to the (non-zero) next hop link MTU. The source then sets Index/P/S, Parcel Payload Length, and {Total, Payload} Length, then calculates the header Checksum and per-segment Checksums/CRCs the same as for an ordinary parcel. The source finally sends the Parcel Probe via the outbound IPv6 interface.¶
Original sources can send Parcel Probes that include a large segment size, but these may be dropped by a router on the path even if the next hop link is parcel-capable. The original source may then receive a Jumbo Report that contains only the MTU of the leading portion of the path up to the router with the restrictive link. The original source can instead send Parcel Probes with smaller segments that would be likely to transit the entire forward path to the final destination if all links are parcel-capable. For parcel-capable paths, this may allow the original source to discover both the path MTU and the MSS in a single message exchange instead of multiple.¶
According to [RFC9268], IPv6 middleboxes (i.e., routers, security gateways, firewalls, etc.) that do not observe this specification will either ignore the option altogether or notice that the option length differs from its base definition and presumably ignore the option or drop the packet. IPv6 middleboxes that observe this specification instead MUST process the option as an implicit or explicit Parcel Probe.¶
When a router that observes this specification receives a Parcel Probe it first compares Code with 255 and Check with the IPv6 header Hop Limit; if either value differs, the router drops the probe and returns a negative Jumbo Report subject to rate limiting. (Note that the Parcel Probe may also have been truncated in length by a previous-hop router that does not recognize the construct.) For all other intact Parcel Probes, if the next hop link is non-parcel-capable the router compares PMTU with the next hop link MTU and returns a positive Parcel Report subject to rate limiting with MTU set to the minimum value. The router then applies packetization to convert the probe into individual IPv6 packet(s) and forwards each packet to the next hop; otherwise, it drops the probe.¶
If the next hop link both supports parcels and configures an MTU that is large enough to pass the probe, the router instead compares the probe PMTU with the next hop link MTU. The router next MUST (re)set PMTU to the minimum value then forward the probe to the next hop (and also reset Check to the same value that will appear in the IPv6 header Hop Limit upon transmission to the next hop). If the next hop link supports parcels but configures an MTU that is too small to pass the probe, the router then applies parcellation to break the probe into multiple smaller sub-parcels that can transit the link. In the process, the router sets PMTU to the minimum link MTU value in the first sub-parcel and omits the PMTU field in all non-first sub-parcels (and also resets Check in all sub-parcels). If the next hop link supports parcels but configures an MTU that is too small to pass a singleton sub-parcel of the probe, the router instead drops the probe and returns a positive Jumbo Report subject to rate limiting with MTU set to the next hop link MTU.¶
The final destination may therefore receive individual IPv6 packets and/or (sub-)parcels including intact Parcel Probes. If the final destination receives individual packets, it performs any necessary integrity checks, applies restoration if possible then delivers the (restored) parcel contents to the transport layer. If the final destination receives a (sub-)parcel with an intact Parcel Probe, it first compares Code with 255 and Check with the IPv6 header Hop Limit; if either value differs, the final destination drops the probe and returns a negative Jumbo Report. (Note that the Parcel Probe may also have been truncated in length by a previous-hop router that does not recognize the construct.) For all other intact Parcel Probes, if the {TCP,UDP} port number is not 9 (discard) it applies reunification and delivers the (reunified) parcel contents to the transport layer. The final destination then returns a positive Jumbo Report to the original source.¶
After sending Parcel Probes (or ordinary parcels) the original source may therefore receive UDP/IPv6 encapsulated Parcel/Jumbo Reports and/or transport layer protocol probe replies. If the source receives a Parcel/Jumbo Report, it verifies the UDP Checksum then verifies that the ICMPv6 Checksum is 0. If both Checksum values are correct, the node then matches the enclosed PTB message with an original probe/parcel by examining the ICMPv6 "packet in error" containing the leading portion of the invoking packet. If the "packet in error" does not match one of its previous packets, the source discards the Parcel/Jumbo Report; otherwise, it continues to process.¶
If the source receives a negative Parcel/Jumbo Report (i.e., one with MTU set to 0), it marks the path as "parcels not supported". Otherwise, the source marks the path as "parcels supported" and also records the MTU value as the parcel path MTU (i.e., the portion of the path up to and including the node that returned the Parcel/Jumbo Report). If the MTU value is 65535 (plus headers) or larger, the MTU determines the largest whole parcel that can transit the path without packetization/parcellation while using any segment size up to and including the maximum. For Reports that include a smaller MTU, the value represents both the largest whole parcel size and a maximum segment size limitation. In that case, the maximum parcel size that can transit the initial portion of the path may be larger than the maximum segment size that can continue to transit the remaining path to the final destination.¶
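One way a source might record the outcome of a report is sketched below. The dictionary keys and the `header_len` parameter (the "plus headers" allowance mentioned above) are assumptions made for illustration; the specification does not define a data structure for this state.

```python
MAX_SEGMENT_SIZE = 65535  # largest segment size a parcel can carry

def record_report(mtu, header_len):
    """Summarize what a Parcel/Jumbo Report MTU value tells the source."""
    if mtu == 0:
        # Negative report: the path does not support parcels.
        return {"parcels_supported": False,
                "parcel_path_mtu": None,
                "segment_size_limited": None}
    return {
        "parcels_supported": True,
        "parcel_path_mtu": mtu,
        # Below 65535 (plus headers) the MTU also caps the segment size.
        "segment_size_limited": mtu < MAX_SEGMENT_SIZE + header_len,
    }
```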
Note: when a source sends a parcel probe into a new path that has not been probed previously, it should include enough padding payload so that the overall packet length is consistent with the value found in the IPv6 Payload Length field. This allows legacy routers on the path that do not recognize parcels to see a length that is consistent with the value found in the IPv6 header.¶
Note: the path MTU discovered through a Parcel Probe exchange can conceivably exceed the maximum-sized parcel, since link MTUs are represented as 32-bit values whereas the maximum-sized parcel is limited to 24 bits. For this reason, Parcel Probes can serve the dual purpose of also determining the maximum AJ size that can traverse the path.¶
For further discussion on parcel/AJ probing alternatives, see: Appendix C.¶
When a router or final destination returns a Parcel/Jumbo Report, it prepares an ICMPv6 PTB message [RFC4443] with Code set to either Parcel Report or Jumbo Report (see: IANA considerations) and with MTU set to either the minimum MTU value for a positive report or to 0 for a negative report. The node then writes its own IPv6 address as the Parcel/Jumbo Report source and writes the source address of the packet that invoked the report as the Parcel/Jumbo Report destination. The node next copies as much of the leading portion of the invoking parcel/AJ as possible (beginning with the IPv6 header) into the "packet in error" field without causing the entire Parcel/Jumbo Report (beginning with the IPv6 header) to exceed 512 octets in length. The node then sets the Checksum field to 0 instead of calculating and setting a true checksum since the UDP checksum (see below) already provides an integrity check.¶
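The ICMPv6 portion of the report can be sketched as follows. The sketch assumes the standard 8-octet ICMPv6 PTB header (Type 2, Code, Checksum, MTU) and a 40-octet IPv6 header with no extension headers, which leaves 464 octets for the "packet in error" field under the 512-octet limit. The `code` argument is a placeholder for whichever Parcel Report or Jumbo Report value IANA assigns; the function name is hypothetical.

```python
import struct

ICMPV6_PTB = 2            # ICMPv6 Packet Too Big message type
IPV6_HDR_LEN = 40         # fixed IPv6 header of the report itself
ICMPV6_PTB_HDR_LEN = 8    # Type, Code, Checksum, MTU
MAX_REPORT_LEN = 512      # limit on the whole report, IPv6 header included

def build_report_icmpv6(code, mtu, invoking_packet):
    """Build the ICMPv6 message body of a Parcel/Jumbo Report."""
    room = MAX_REPORT_LEN - IPV6_HDR_LEN - ICMPV6_PTB_HDR_LEN
    # Checksum is written as 0; the outer UDP checksum covers integrity.
    header = struct.pack("!BBHI", ICMPV6_PTB, code, 0, mtu)
    # Copy as much of the invoking parcel/AJ as fits, from its IPv6 header.
    return header + invoking_packet[:room]
```

A negative report is produced by passing `mtu=0`; a positive report passes the minimum MTU value.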
Since middleboxes often filter ICMPv6 messages, the node next wraps the Parcel/Jumbo Report in UDP/IPv6 headers with the IPv6 source and destination addresses copied from the Parcel/Jumbo Report and with UDP port numbers set to the OMNI UDP port number [I-D.templin-intarea-omni]. The node next calculates and sets the UDP Checksum, then finally sends the prepared Parcel/Jumbo Report to the original source of the probe.¶
Note: This implies that original sources that send parcels/AJs must be capable of accepting and processing these OMNI protocol UDP messages. A source that sends parcels/AJs must therefore implement enough of the OMNI interface to be able to recognize and process these messages.¶
This specification introduces an IPv6 Advanced Jumbo (AJ) service as an alternative to parcels and basic jumbograms that also includes a path probing function based on the mechanisms specified in Section 7.5. The function employs an Advanced Jumbo Option with the same option Type and Length values as for the Parcel Payload option, except that for AJs that do not require an Identification the Length is reduced by 8 octets and the Identification is omitted (for Jumbo probes, both the Identification and PMTU field must be included). The Parcel Index and Parcel Payload Length fields are also replaced by a 32-bit Jumbo Payload Length field as shown in Figure 5:¶
{TCP/UDP}/IPv6 AJs/probes are formed the same as for parcels as shown in Figure 2 except that they include only a single segment ("Segment 0") preceded by a 2-octet Internet Checksum header and followed by an N-octet message digest trailer. Unlike parcels, TCP AJs do not include a separate Sequence Number header for the (single) segment since the sequence number is coded in the TCP header the same as for an ordinary packet.¶
AJ implementations honor the message digest algorithms specified for MD5 [RFC1321], SHA1 [RFC3174] and the advanced US Secure Hash Algorithms [RFC6234] as selected by an Advanced Jumbo Type value (see below). AJs can instead employ a CRC32C/CRC64E integrity check by selecting a Type value that selects a CRC code instead of a message digest. (An Advanced Jumbo Type value is also reserved by IANA as a non-functional placeholder for a nominal CRC128J algorithm, which may be specified in future documents - see: Appendix D.)¶
The source includes a CRC or message digest according to an algorithm appropriate for the segment length while considering the error characteristics of the path. The destination verifies the digest according to the selected algorithm and uses local knowledge to determine whether the integrity check strength is sufficient to relax upper layer checking. Source implementations must therefore select a sufficiently strong integrity check to provide the destination with adequate protection.¶
AJ implementations MUST support the following integrity checking algorithms:¶
The source prepares an AJ/probe by first setting the IPv6 Payload Length field to an Advanced Jumbo Type value taken from the above table to distinguish this from a basic jumbogram or parcel. The source can begin by sending a Jumbo Probe to pre-qualify the path for AJs if necessary.¶
To prepare a Jumbo Probe that will trigger a Jumbo Report, the source can set {Protocol, Next Header} to {TCP,UDP}, set the {TCP,UDP} port to 9 (discard) and either include no octets beyond the {TCP,UDP} header or a single discard segment of the desired probe size immediately following the header. (The source can instead set the {TCP,UDP} port to the port number for a current data flow in order to receive IPv6 Jumbo Reply MTU options in return packets as discussed in Section 7.5.) The source then sets Jumbo Payload Length to the length of the {TCP,UDP} header plus the length of the segment Checksum header and message digest trailer plus the discard segment plus the length of the IPv6 extension headers.¶
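The Jumbo Payload Length sum described above can be written out as a short sketch. The function and parameter names are hypothetical, and the example values (a UDP header, an MD5 digest trailer and 16 octets of extension headers) are chosen only for illustration.

```python
CHECKSUM_HDR_LEN = 2  # per-segment Internet Checksum header

def jumbo_payload_length(transport_hdr_len, segment_len, digest_len, ext_hdrs_len):
    """Sum the components that make up the Jumbo Payload Length."""
    return (transport_hdr_len      # {TCP,UDP} header
            + CHECKSUM_HDR_LEN     # segment Checksum header
            + digest_len           # CRC or message digest trailer
            + segment_len          # discard segment (0 if none is included)
            + ext_hdrs_len)        # IPv6 extension headers

# UDP header (8), 100000-octet discard segment, MD5 digest (16),
# and an assumed 16 octets of IPv6 extension headers.
example = jumbo_payload_length(8, 100000, 16, 16)
```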
The source next sets the Identification the same as for a Parcel Probe, sets the Jumbo Probe PMTU to the next hop link MTU, then sets Code to 255 and Check to the next hop TTL/Hop Limit. The source then calculates the {TCP,UDP} Checksum based on the same pseudo header as for an ordinary parcel (see: Figure 9) but with the Parcel Index and Payload Length fields replaced with a 32-bit Jumbo Payload Length field and with the Segment Length replaced with one of the supported Advanced Jumbo Type values. The source then calculates the checksum of the segment payload, writes the value into the segment Checksum header, then calculates the CRC or message digest over the length of the (single) segment beginning with the Checksum field and writes the value into the trailer. The source then sends the Jumbo Probe via the next hop link toward the final destination.¶
At each forwarding hop, the router examines Code and Check then drops the Jumbo Probe and returns a negative Jumbo Report if either value is incorrect. (Note that the Jumbo Probe may also have been truncated in length by a previous-hop router that does not recognize the construct.) For all other intact probes, if the next hop link is jumbo-capable the router compares PMTU to the next hop link MTU, resets PMTU to the minimum value, sets Check to the next hop TTL/Hop Limit then forwards the probe to the next hop. If the next hop link is not jumbo-capable, the router instead drops the probe and returns a negative Jumbo Report.¶
If the Jumbo Probe encounters an OMNI link, the OAL source can either drop the probe and return a negative Jumbo Report or set PMTU to the minimum of itself and 65535 octets then forward the probe further toward the OAL destination using adaptation layer encapsulation/fragmentation. If the OAL source already knows a larger-sized OAL path MTU for this OAL destination, it can encapsulate and forward the Jumbo Probe with PMTU set to the minimum of itself and the known value (minus the adaptation layer header size), and without adding any padding octets.¶
If the Jumbo Probe PMTU is larger than 65535 and the OAL path MTU is unknown, the OAL source can instead encapsulate the Jumbo Probe in an adaptation layer IPv6 header with an Advanced Jumbo option and with padding octets added beyond the end of the encapsulated Jumbo Probe to form an adaptation layer jumbogram as large as the minimum of PMTU and (2**24 - 1) octets (minus the adaptation layer header size) as a form of "jumbo-in-jumbo" encapsulation.¶
The OAL source then writes this size into the Jumbo Probe PMTU field and forwards the newly-created adaptation layer jumbogram toward the OAL destination. If the jumbogram somehow transits the path, the OAL destination then removes the adaptation layer encapsulation, discards the padding, then forwards the Jumbo Probe onward toward the final destination (with each hop reducing PMTU if necessary).¶
When a router on the path forwards a Jumbo Probe, it drops the probe and returns a Jumbo Report if the next hop MTU is insufficient; otherwise, it forwards the probe to the next hop toward the final destination. When the final destination receives the Jumbo Probe, it returns a Jumbo Report with the PMTU set to the maximum-sized jumbo that can transit the path.¶
After successfully probing the path, the original source can begin sending AJs by setting the IPv6 Payload Length field to one of the supported Advanced Jumbo Type values, omitting the PMTU field and calculating the {TCP,UDP}/IPv6 header checksum and per-segment Checksum header and CRC or message digest trailer the same as described for probes above. When the network layer of the final destination receives an AJ, it first verifies the integrity checks then delivers the data (along with a CRC/Checksum error flag) to the transport layer without returning a Jumbo Report. The source can continue to send AJs into the path with the possibility that the path may change. In that case, a router in the network may return an ICMP error, an ICMPv6 PTB, or a Jumbo Report if the path MTU decreases.¶
Note: when a source sends a jumbo probe into a new path that has not been probed previously, it should include enough padding payload so that the overall packet length is consistent with the value found in the IPv6 Payload Length field. This allows legacy routers on the path that do not recognize jumbos to see a length that is consistent with the value found in the IPv6 header.¶
Note: If an OAL source can in some way determine that a very large packet is likely to transit the OAL path, it can encapsulate a Jumbo Probe to form an adaptation layer jumbogram even larger than (2**24 - 1) octets with the understanding that the time required to transit the path plus the receive buffer size determine acceptable sizes.¶
Note: The Jumbo Report message types returned in response to both Parcel and Jumbo Probes are one and the same, and signify that both parcels and AJs at least as large as the reported MTU can transit the path. However, only a Parcel Probe (i.e., and not a Jumbo Probe) may elicit a Parcel Report. This may indicate a preference to use Parcel Probes instead of Jumbo Probes for general-purpose path probing.¶
Note: unlike basic jumbograms, AJs may encode a Jumbo Payload Length value smaller than 65536. This means that AJs can range in size from as small as the headers plus a minimal or even null payload to as large as 2**32 octets minus headers. This allows smaller AJs to operate within the traditional realms of ordinary packets or singleton parcels, according to the new link service model.¶
Note: When the source has assurance that the path will pass AJs smaller than the measured path MTU, it can suspend explicit transmission of the Identification values for these smaller AJs to reduce overhead. However, each packet/parcel/AJ transmission still increments the source's internal Identification counter whether or not the current Identification value is explicitly transmitted.¶
The basic IPv6 parcel/AJ constructs specified in the previous sections use the IPv6 Minimum Path MTU Hop-by-Hop option [RFC9268] initially to allow each hop to participate in path qualification. Once a path has been qualified to accept the basic constructs, however, the source can begin sending minimal IPv6 parcels/AJs that instead use the IPv6 Jumbo Payload Hop-by-Hop Option [RFC2675] to benefit from a per parcel/AJ overhead savings as shown in Figure 7:¶
In this format, the network layer includes the IPv6 minimal Parcel/Jumbo Option as an IPv6 Hop-by-Hop option with Option Type set to '0xC2' and Opt Data Len set to 4 or 12 depending on whether an identification is included (see: Section 8). For parcels, the first four octets of the Option Data are formatted exactly as shown in Figure 1 while for AJs the first four octets are exactly as shown in Figure 5. The network layer prepares all other aspects of IPv6 minimal parcels/AJs exactly the same as for the basic specifications found in previous sections except the option type/length are different and the Code/Check fields are omitted.¶
This implies that implementations that honor the basic IPv6 parcel/AJ formats and processing specified in the previous sections MUST also honor the IPv6 Minimal Parcel/Jumbo Option format specified above as an equivalent construct. Therefore, the Parcel/Jumbo probe results received for the basic formats also serve as probe results for the minimal format.¶
Since the minimal format does not include Code and Check fields, intermediate and end systems must monitor the lengths of minimal parcels/AJs they receive in case the path changes and a previous hop begins truncating them. In that case, the node MUST drop the parcel/AJ and return a negative Jumbo Report to the source which must then re-initiate parcel/jumbo path probing.¶
Network intermediate systems often drop packets that contain unrecognized IPv6 extension headers unconditionally. This presents an obstacle to deploying new Internet extensions. Rather than wait for network systems to catch up, the source could instead employ an alternative more likely to provide service by concealing IPv6 extension headers within the body of a protocol data unit such as UDP.¶
End systems and intermediate systems that recognize the OMNI protocol [I-D.templin-intarea-omni] can use the parcel, AJ and minimal parcel/jumbo formats specified in this document as native protocol extension headers coded within the body of the OMNI protocol data unit.¶
The section titled "OMNI L2 Extension Header Encapsulation" in [I-D.templin-intarea-omni] depicts protocol layering for encapsulation of IPv6 Extension Headers in IPv6 packets as shown in Figure 8:¶
In this encapsulation format, the IPv6 parcel, AJ and minimal parcel/jumbo extension headers specified in previous sections as well as the IPv6 Extended Fragment Header appear as IPv6 Extension Headers following the OMNI protocol UDP, IPv6 or Ethernet header. The OMNI protocol requires each node to honor and implement the parcel/AJ constructs specified in this document with reference to [I-D.templin-intarea-omni]. This includes the setting of the IPv6 Payload Length fields as well as the settings of the parcel/AJ options themselves.¶
Intermediate systems that do not recognize the OMNI protocol are likely to drop any OMNI packets that include parcel/AJ options, but they may instead forward the packet without updating the Code/Check values and/or while truncating the overall packet length. Intermediate systems and end systems that recognize OMNI therefore perform the checks specified in this document to determine whether previous path hops correctly process parcels/AJs.¶
Since parcel/AJ options are coded within the OMNI protocol data unit itself instead of as an IPv6 header extension, network intermediate systems must also reset the OMNI protocol checksum if necessary when they alter the contents of an option (such as when resetting Path MTU or Check). For this reason, sources MAY disable the OMNI protocol checksum in path probes and SHOULD advance to using minimal parcels/AJs soon after probing the path to minimize intermediate system checksum interactions.¶
See: [I-D.templin-intarea-omni] for the full specification of OMNI L2 Extension Header encapsulation and processing. All parcel/AJ implementations that recognize the OMNI protocol are required to implement those portions of the OMNI specification.¶
Note: OMNI-encapsulated parcels/AJs appear as ordinary IP packets to lower layers where they are subject to the legacy link model in which errored frames are dropped and not forwarded to the next hop. The new link model is therefore engaged only for "native" (unencapsulated) parcels/AJs.¶
IPv6 parcel/AJ integrity assurance responsibility is shared between lower layers of the protocol stack and the transport layer where more discrete compensations for lost or corrupted data recovery can be applied. In particular, intermediate system lower layers forward parcels/AJs with correct headers to the final destination transport layer even if cumulative link errors were incurred at intermediate hops. The destination is then responsible for its own integrity assurance.¶
The {TCP,UDP}/IPv6 header plus each segment of a (multi-segment) parcel or AJ includes its own integrity checks. This means that parcels/AJs offer stronger and more discrete integrity checks for the same amount of transport layer protocol data compared to an ordinary IPv6 packet or jumbogram. The {TCP,UDP} Checksum header integrity check SHOULD be verified at each hop for which a link error is encountered to ensure that parcels/AJs with errored addressing information are detected. The per-segment Checksums and CRCs are set by the source and verified by the destination. Note that each segment includes both checks since there will be many instances when errors missed by the CRC are detected by the Checksum [STONE].¶
IPv6 parcels can range in length from as small as only the {TCP,UDP}/IPv6 headers plus a single segment to as large as the headers plus (64 * 65535) octets, while AJs include only a single segment that can be as small as the headers plus a small or even null segment to as large as 2**32 octets (minus headers). Due to parcellation/packetization in the path, the segment contents of a received parcel may arrive in an incomplete and/or rearranged order with respect to their original packaging.¶
IPv6 parcels and AJs include a separate 2-octet Internet Checksum header for each segment. The original source calculates the checksum for each segment beginning with the first octet of the per-segment Sequence Number (for TCP) then continuing with the first segment octet (noting that per-segment Checksum values of 0 indicate that the segment checksum is disabled). The source extends the checksum calculation over the entire length of the segment but does not extend the calculation into the trailing CRC field.¶
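A minimal sketch of the per-segment checksum follows, using the standard 16-bit one's complement Internet checksum. The function names are hypothetical. How a computed result of 0 is kept distinct from the "checksum disabled" value is not addressed by this sketch.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length input with a zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def segment_checksum(payload: bytes, tcp_seq=None) -> int:
    """Checksum one segment; for TCP the 4-octet Sequence Number is covered first."""
    covered = payload if tcp_seq is None else tcp_seq.to_bytes(4, "big") + payload
    return internet_checksum(covered)
```

The calculation stops at the last segment octet, so the trailing CRC field is never passed in as part of `payload`.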
IPv6 parcels employ two different CRC types according to the non-final segment length "L". For values of L smaller than 9216 octets (9KB), the original source uses the CRC32C specification [RFC3385] and encodes the CRC in a 4 octet trailer. For larger L values, the source uses the CRC64E specification [ECMA-182] and encodes the CRC in an 8 octet trailer. For AJs, the source instead includes either a 4/8 octet CRC or an N-octet message digest trailer calculated per [RFC1321], [RFC3174] or [RFC6234] where N is determined according to the hash algorithm assigned to the Advanced Jumbo Type (see: IANA Considerations).¶
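The CRC selection rule for parcels can be sketched as below, together with a bitwise CRC32C routine. Two assumptions are made: a segment length of exactly 9216 octets is treated as "larger" and therefore uses CRC64E, and the CRC32C routine uses the commonly deployed reflected Castagnoli parameters (initial value and final XOR of all ones), which should be confirmed against the referenced specification. CRC64E is not implemented here, and the function names are hypothetical.

```python
CRC32C_LIMIT = 9216  # segment lengths below this use CRC32C

def parcel_crc_kind(segment_len):
    """Return (algorithm name, trailer length in octets) for segment length L."""
    if segment_len < CRC32C_LIMIT:
        return ("CRC32C", 4)
    return ("CRC64E", 8)

def crc32c(data: bytes) -> int:
    """Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

A table-driven or hardware-assisted implementation would normally replace the bitwise loop in practice.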
When link errors are detected, the network layer of the link far end SHOULD verify the parcel/AJ {TCP,UDP}/IPv6 header Checksum at its layer, since an errored header could result in mis-delivery. If the network layer of the link far end detects an incorrect {TCP,UDP}/IP header Checksum it should discard the entire parcel/AJ unless the header(s) can somehow first be repaired. If the {TCP,UDP}/IPv6 header Checksum was correct, but the link far end detected CRC errors, the network layer sets a "CRC error" flag in the parcel/AJ option.¶
The CRC error flag entails clearing/setting the IPv6 Hop-by-Hop Option Type third-highest-order bit as "0 - Option Data does not change en route" or "1 - Option Data may change en route" [RFC8200]. Therefore, nodes must recognize the Option Type '0x10' as "IPv6 Parcel/AJ with errors" and Option Type '0xE2' as "Minimal IPv6 Parcel/AJ with errors" (see: IANA Considerations).¶
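The bit manipulation can be illustrated with a short sketch. It assumes that the basic parcel/AJ option uses the [RFC9268] Option Type 0x30 and that the minimal option uses the [RFC2675] Option Type 0xC2; inverting the third-highest-order bit (0x20) of each yields the "with errors" values. The function names are hypothetical.

```python
CHG_BIT = 0x20  # third-highest-order bit of the IPv6 Option Type

def split_option_type(option_type):
    """Split an Option Type into its (act, chg, rest) bit fields."""
    return (option_type >> 6, (option_type >> 5) & 1, option_type & 0x1F)

def toggle_crc_error_flag(option_type):
    """Invert the third-highest-order bit to mark or unmark a CRC error."""
    return option_type ^ CHG_BIT
```

Under these assumptions, `toggle_crc_error_flag(0x30)` gives 0x10 and `toggle_crc_error_flag(0xC2)` gives 0xE2.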
To support the parcel/AJ header checksum calculation, the network layer uses a modified version of the {TCP,UDP}/IPv6 pseudo-header found in Section 8.1 of [RFC8200] as shown in Figure 9. This allows for maximum reuse of widely deployed code while ensuring interoperability.¶
where the following fields appear:¶
Source Address is the 16-octet IPv6 source address of the prepared parcel/AJ.¶
Destination Address is the 16-octet IPv6 destination address of the prepared parcel/AJ.¶
For parcels, Index/P/S is the combined 1-octet field and Parcel Payload Length is the 3-octet field that appear in the Parcel Payload Option fields of the same name. For AJs, these two fields are replaced by a single 4-octet Jumbo Payload Length field.¶
Segment Length is the value that appears in the IPv6 Payload Length field of the prepared parcel/AJ.¶
zero encodes the constant value 0.¶
Next Header is the IP protocol number corresponding to the transport layer protocol, i.e., TCP or UDP.¶
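A sketch of assembling the parcel pseudo-header from the fields listed above follows. The field order and widths after the addresses are assumptions, since they are read from the field descriptions and not from the figure itself: a 1-octet Index/P/S, a 3-octet Parcel Payload Length, a 2-octet Segment Length, a 1-octet zero and a 1-octet Next Header, for 40 octets in total. The function name is hypothetical.

```python
import ipaddress
import struct

def parcel_pseudo_header(src, dst, index_ps, parcel_payload_len,
                         segment_len, next_header):
    """Assemble an assumed 40-octet parcel pseudo-header."""
    return (
        ipaddress.IPv6Address(src).packed             # 16-octet Source Address
        + ipaddress.IPv6Address(dst).packed           # 16-octet Destination Address
        + bytes([index_ps])                           # combined Index/P/S octet
        + parcel_payload_len.to_bytes(3, "big")       # Parcel Payload Length
        + struct.pack("!HBB", segment_len, 0, next_header)  # length, zero, NH
    )
```

For an AJ, the Index/P/S and Parcel Payload Length fields would instead be written as a single 4-octet Jumbo Payload Length.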
When the transport layer protocol entity of the source delivers a parcel body to the network layer, it presents the values L and J along with the (J + 1) segments in canonical order as a list of data buffers and with each TCP segment preceded by a 4-octet Sequence Number field. (For AJs, the transport layer instead delivers the singleton AJ segment along with the Jumbo Payload Length.) When the network layer of the source accepts the parcel/AJ body from the transport layer protocol entity, it calculates the Internet checksum for each segment and writes the value in the per-segment Checksum header (or writes the value 0 when UDP checksums are disabled). The network layer then calculates the CRC/message digest for each segment beginning with the Checksum field, inserts the result as a segment trailer in network byte order, then concatenates all segments and appends the necessary {TCP,UDP}/IPv6 headers and extensions to form a parcel. The network layer then calculates the {TCP,UDP}/IPv6 header checksum over the length of only the {TCP,UDP} headers plus IPv6 pseudo header then forwards the parcel to the next hop without further processing.¶
When the network layer of the destination accepts an AJ or reunifies a parcel from one or more sub-parcels received from the source it first verifies the {TCP,UDP}/IPv6 header checksum then verifies first the CRC/digest and next the Checksum (except when UDP checksums are disabled) for each segment and marks any with incorrect integrity check values as errors. When the network layer restores a parcel from one or more individual {TCP,UDP}/IPv6 packets received from the source, it instead marks the CRCs of each segment as correct since the individual packets were subject to CRC checks at each hop along the path. The network layer then verifies the Internet checksum of each individual packet (except when UDP checksums are disabled), restores the parcel, and delivers each parcel/AJ segment along with a CRC/Checksum error flag to the transport layer.¶
When the transport layer of the destination processes parcel or AJ segments, it can accept any with correct CRCs and Checksums while optionally applying additional higher-layer integrity checks. The transport layer can instead process any segments with incorrect CRC/Checksum by either discarding the entire segment or applying higher-layer integrity checks on the component elements of the segment to accept as many non-errored elements as possible. The transport layer can then either reconstruct from local information or request retransmission for any segment elements that may have been damaged in transit as necessary.¶
Note: when the destination network layer receives a parcel with an IPv6 Option Type with third-highest-order bit set to indicate that a link CRC error was detected, it still engages its per-segment CRC and Checksum tests to accept as many error-free segments as possible. When the destination receives an AJ with a CRC error setting, it need not engage its (single segment) integrity checks since the segment is already known to include link errors.¶
Note: when the destination network layer detects a per-segment CRC error, it immediately posts the segment plus an error code for delivery to the transport instead of continuing to verify the segment Checksum. Performing a second integrity check on a segment already determined to contain errors by a first check would serve no useful purpose.¶
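The verification order described in the preceding paragraphs and notes can be sketched as follows. The two checks are passed in as zero-argument callables so that the Checksum test is visibly skipped once the CRC/digest test fails. The function name and result strings are hypothetical.

```python
def verify_segment(crc_ok, checksum_ok, checksum_field):
    """Verify one segment: CRC/digest first, then the Internet checksum.

    crc_ok and checksum_ok are zero-argument callables returning bool.
    """
    if not crc_ok():
        # Post the segment with an error code; no second check is run.
        return "crc_error"
    if checksum_field == 0:
        # A per-segment Checksum of 0 means the checksum is disabled.
        return "ok"
    return "ok" if checksum_ok() else "checksum_error"
```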
Note: the source and destination network layers can often engage hardware functions to greatly improve CRC/Checksum calculation performance.¶
Common widely-deployed implementations include services such as TCP Segmentation Offload (TSO) and Generic Segmentation/Receive Offload (GSO/GRO). These services provide a robust offload capability that has been shown to improve performance in many instances.¶
An early prototype of UDP/IPv4 parcels (draft version -15) has been implemented relative to the linux-5.10.67 kernel and ION-DTN ion-open-source-4.1.0 source distributions. The patch distribution is found at: "https://github.com/fltemplin/ip-parcels.git".¶
Performance analysis with a single-threaded receiver has shown that including increasing numbers of segments in a single parcel produces measurable performance gains over fewer numbers of segments due to more efficient packaging and reduced system calls/interrupts. For example, sending parcels with 30 2000-octet segments shows a 48% performance increase in comparison with ordinary packets with a single 2000-octet segment.¶
Since performance is strongly bounded by single-segment receiver processing time (with larger segments producing dramatic performance increases), it is expected that parcels with increasing numbers of segments will provide a performance multiplier on multi-threaded receivers in parallel processing environments.¶
The IANA is instructed to add a reference to this document ([RFCXXXX]) in the "Minimum Path MTU Hop-by-Hop Option" entry in the "Destination Options and Hop-by-Hop Options" table of the 'ipv6-parameters' registry.¶
The IANA is instructed to assign new Code values in the "ICMPv6 Code Fields: Type 2 - Packet Too Big" table in the 'icmpv6-parameters' registry (registration procedure is Standards Action or IESG Approval). The registry entries should appear as follows:¶
The IANA is requested to assign two new entries in the 'ipv6-parameters' registry "Destination Options and Hop-by-Hop Options" table (registration procedures IESG Approval, IETF Review or Standards Action). The first entry sets "Hex Value" to '0xE2', "act" to '11', "chg" to '1', "rest" to '00010' and Description to "Minimal Parcel/AJ With Errors". The second entry sets "Hex Value" to '0x10', "act" to '00', "chg" to '0', "rest" to '10000' and Description to "Parcel/AJ With Errors". Both entries set "Reference" to this document [RFCXXXX].¶
The IANA is instructed to create and maintain a new registry titled "IPv6 Parcel and Advanced Jumbo Formats and Types".¶
For IPv6 parcels and Advanced Jumbos, the value in the 'Opt Data Len' field of the IPv6 Minimum Path MTU Hop-by-Hop Option [RFC9268] serves as an "Option Format" code that distinguishes the various IPv6 option formats specified in this document. Initial values are given below:¶
For minimal IPv6 parcels and Advanced Jumbos, the value in the 'Opt Data Len' field of the IPv6 Jumbo Payload Hop-by-Hop Option [RFC2675] serves as an "Option Format" code that distinguishes the minimal formats specified in this document. Initial values are given below:¶
For all IPv6 Parcels/Advanced Jumbos and their corresponding probes, the IPv6 Payload Length field encodes an "Advanced Jumbo Type" value instead of an ordinary total/payload length. Initial values are given below:¶
In the control plane, original sources match the Identification (and/or other identifying information) received in Parcel/Jumbo Reports with their corresponding parcels/AJs. If the identifying information matches, the report is likely authentic. When stronger authentication is needed, nodes that send Parcel and/or Jumbo Reports can apply the message authentication services specified for AERO/OMNI.¶
In the data plane, multi-layer security solutions may be needed to ensure confidentiality, integrity and availability. Since parcels and AJs are defined only for TCP and UDP, IPsec-AH/ESP [RFC4301] cannot be applied in transport mode although they can certainly be used in tunnel mode at lower layers such as for transmission of parcels/AJs over OMNI link secured spanning trees, VPNs, etc. Since the network layer does not manipulate transport layer segments, parcels/AJs do not interfere with transport or higher-layer security services such as (D)TLS/SSL [RFC8446] which may provide greater flexibility in some environments.¶
IPv4 fragment reassembly is known to be dangerous at high data rates where undetected reassembly buffer corruptions can result from fragment misassociations [RFC4963]. IPv6 is less subject to these concerns when the 32-bit Identification field is managed responsibly but this may be less true if the starting sequence number is changed frequently. However, IPv6 can robustly sustain high data rate restoration/reunification and uniqueness verification using the 64-bit Identifications included in Parcels/AJs.¶
IPv6 parcels and AJs are processed according to a new link service model for the Internet in which intermediate systems may forward parcels/AJs that incurred link errors and end systems are responsible for detecting any link errors incurred along the path. The destination end system in particular is uniquely positioned to verify and/or correct the integrity of any transport layer segments received. For this reason, transport layer protocols that use parcels/AJs should include higher layer error detection and/or forward error correction codes in addition to the per-segment link error integrity checks.¶
The message digests included with AJs provide integrity checks only and must not be considered as authentication codes in the absence of additional security services. Further security considerations related to IPv6 parcels and Advanced Jumbos are found in the AERO/OMNI specifications.¶
This work was inspired by ongoing AERO/OMNI/DTN investigations. The concepts were further motivated through discussions with colleagues.¶
A considerable body of work over recent years has produced useful segmentation offload facilities available in widely-deployed implementations.¶
With the advent of networked storage, big data, streaming media and other high data rate uses, Internetworking has evolved since its early days to accommodate the need for improved performance. This need fostered a concerted industry effort to pursue performance optimizations at all layers, and that effort continues in the modern era. All who supported and continue to support advances in Internetworking performance are acknowledged.¶
This work has been presented at working group sessions of the Internet Engineering Task Force (IETF). The following individuals are acknowledged for their contributions: Roland Bless, Scott Burleigh, Madhuri Madhava Badgandi, Joel Halpern, Tom Herbert, Andy Malis, Herbie Robinson, Bhargava Raman Sai Prakash.¶
Honoring life, liberty and the pursuit of happiness.¶
TCP Extensions for High Performance are specified in [RFC7323], which updates earlier work that began in the late 1980s and early 1990s. These efforts determined that the TCP 16-bit Window was too small to sustain transmissions at high data rates, and a TCP Window Scale option allowing window sizes up to 2^30 was specified. The work also defined a Timestamp option used for round-trip time measurements and for Protection Against Wrapped Sequences (PAWS) at high data rates. TCP users of IPv6 parcels are strongly encouraged to adopt these mechanisms.¶
Since TCP/IPv6 parcels only include control bits for the first segment ("segment(0)"), nodes must regard all other segments of the same parcel as data segments. When a node breaks a TCP/IPv6 parcel out into individual packets or sub-parcels, only the first packet or sub-parcel contains the original segment(0) and therefore only its TCP header retains the control bit settings from the original parcel TCP header. If the original TCP header included TCP options such as Maximum Segment Size (MSS), Window Scale (WS) and/or Timestamp, the node copies those same options into the options section of the new TCP header.¶
For all other packets/sub-parcels, the node sets all TCP header control bits to 0 to mark them as data segment(s). Then, if the original parcel contained a Timestamp option, the node copies the Timestamp option into the options section of the new TCP header. Appendix A of [RFC7323] provides implementation guidelines for the Timestamp option layout.¶
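The header handling described above can be illustrated with a minimal sketch. It is not a normative algorithm: the `TcpHeader` type, the option keys ("MSS", "WS", "TS") and the `breakout_headers` function are hypothetical names introduced only for this example, and sequence number handling and checksums are deliberately omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TcpHeader:
    """Illustrative stand-in for a TCP header (hypothetical type)."""
    flags: int                                   # TCP control bits
    options: dict = field(default_factory=dict)  # e.g. {"MSS": 1460, "TS": (1, 2)}


def breakout_headers(original: TcpHeader, pieces: int) -> list:
    """Return one TCP header for each packet/sub-parcel broken out of a parcel.

    The first piece carries segment(0), so it keeps the original control
    bits and all original options. Every other piece is treated as data
    only: control bits are 0 and only the Timestamp option (if the
    original had one) is copied.
    """
    if pieces < 1:
        raise ValueError("a parcel breaks out into at least one piece")

    # First packet/sub-parcel: original control bits and options.
    headers = [TcpHeader(original.flags, dict(original.options))]

    # Remaining pieces: control bits cleared, Timestamp option only.
    ts_only = {k: v for k, v in original.options.items() if k == "TS"}
    for _ in range(pieces - 1):
        headers.append(TcpHeader(0, dict(ts_only)))
    return headers


if __name__ == "__main__":
    parcel_hdr = TcpHeader(0x18, {"MSS": 1460, "WS": 7, "TS": (100, 200)})
    for hdr in breakout_headers(parcel_hdr, 3):
        print(hdr)
```

Copying the option dictionary for each piece keeps the new headers independent of one another, so a later change to one header does not affect the others.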
Appendix A of [RFC7323] also discusses Interactions with the TCP Urgent Pointer as follows: "if the Urgent Pointer points beyond the end of the TCP data in the current segment, then the user will remain in urgent mode until the next TCP segment arrives. That segment will update the Urgent Pointer to a new offset, and the user will never have left urgent mode". In the case of IPv6 parcels, however, it will often be the case that the next TCP segment is included in the same (sub-)parcel as the segment that contained the urgent pointer such that the urgent pointer can be updated immediately.¶
Finally, if a parcel/AJ contains more than 65535 octets of data (i.e., spread across multiple segments), then the Urgent Pointer can be regarded in the same manner as for jumbograms as described in Section 5.2 of [RFC2675].¶
For each parcel, the transport layer can specify any L value between 256 and 65535 octets. Transport protocols that send isolated control and/or data segments smaller than 256 octets should package them as ordinary packets, AJs, singleton parcels or as the final segment of a larger parcel. It is also important to note that segments smaller than 256 octets are likely to include control information for which timely delivery rather than bulk packaging is desired. Transport protocol streams therefore often include a mix of (larger) parcels and (smaller) ordinary packets, AJs or singleton parcels.¶
The transport layer should also specify an L value no larger than can accommodate the maximum-sized transport and network layer headers that the source will include without causing a single segment plus headers to exceed 65535 octets. For example, if the source will include a 28 octet TCP header plus a 40 octet IPv6 header with 24 extension header octets (plus 6/10 octets for the per-segment Checksum/CRC) the transport should specify an L value no larger than (65535 - 28 - 40 - 24 - 10) = 65433 octets.¶
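The arithmetic in the example above can be expressed as a short sketch. The function name `recommended_max_l` and its parameter names are assumptions made for illustration; the 10 octet default for the per-segment integrity check simply takes the larger of the two values quoted in the text.

```python
L_MIN = 256     # smallest L value the transport layer can specify
L_MAX = 65535   # largest L value the transport layer can specify


def recommended_max_l(transport_hdr: int, ipv6_hdr: int = 40,
                      ext_hdrs: int = 0, integrity: int = 10) -> int:
    """Largest L such that one segment plus headers fits in 65535 octets.

    All arguments are octet counts: the transport header, the IPv6
    header, any IPv6 extension headers, and the per-segment
    Checksum/CRC allowance.
    """
    overhead = transport_hdr + ipv6_hdr + ext_hdrs + integrity
    limit = L_MAX - overhead
    if limit < L_MIN:
        raise ValueError("headers leave no room for a valid L value")
    return limit


if __name__ == "__main__":
    # 28 octet TCP header, 40 octet IPv6 header, 24 extension header octets.
    print(recommended_max_l(28, ipv6_hdr=40, ext_hdrs=24, integrity=10))  # 65433
```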
The transport can specify still larger "extreme" L values up to 65535 octets, but the resulting parcels might be lost along some paths with unpredictable results. For example, a parcel with an extreme L value set as large as 65535 might be able to transit paths that can pass jumbograms natively but might not be able to transit a path that includes non-jumbo links. The transport layer should therefore carefully consider the benefits of constructing parcels with extreme L values larger than the recommended maximum due to high risk of loss compared with only minor potential performance benefits.¶
Parcels that include extreme L values larger than the recommended maximum and with a maximum number of included segments could also cause a parcel to exceed 16,777,215 (2**24 - 1) octets in total length. Since the Parcel Payload Length field is limited to 24 bits, however, the largest possible parcel is also limited by this size. See also the above risk/benefit analysis for parcels that include extreme L values larger than the recommended maximum.¶
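As a rough illustration of the 24-bit limit, the sketch below checks whether a candidate parcel would fit in the Parcel Payload Length field. It assumes, for simplicity, that every segment is a full L octets and that the caller supplies any per-segment and per-parcel overhead; the function and parameter names are hypothetical and the exact accounting of headers is left to the main body of this specification.

```python
MAX_PARCEL_PAYLOAD = 2**24 - 1  # 16,777,215 octets (24-bit length field)


def parcel_payload_fits(l: int, nsegs: int,
                        per_segment_overhead: int = 0,
                        header_octets: int = 0) -> bool:
    """Return True if nsegs full-sized segments fit in a 24-bit length.

    l                    -- segment length L in octets
    nsegs                -- number of segments in the parcel
    per_segment_overhead -- assumed extra octets per segment (e.g. Checksum/CRC)
    header_octets        -- assumed octets counted once per parcel
    """
    total = header_octets + nsegs * (l + per_segment_overhead)
    return total <= MAX_PARCEL_PAYLOAD


if __name__ == "__main__":
    print(parcel_payload_fits(65535, 256))                           # True
    print(parcel_payload_fits(65535, 256, per_segment_overhead=10))  # False
    print(parcel_payload_fits(65433, 256, per_segment_overhead=10))  # True
```

In this simplified accounting, an extreme L value of 65535 leaves only 255 octets of headroom across 256 segments, so even a small per-segment allowance pushes the total past the limit.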
After sending a Parcel/Jumbo Probe, the source may receive a Parcel/Jumbo Report from either a router on the path or from the final destination itself. If a router or final destination receives a Parcel/Jumbo Probe but does not recognize the parcel/AJ constructs, it will likely drop the probe without further processing and may return an ICMP error. The original source will then consider the probe as lost, but may attempt to probe again later, e.g., in case the path may have changed.¶
When the source examines the "packet in error" portion of a Parcel/Jumbo Report, it can easily match the Report against its recent transmissions if the Identification value is available. For each "packet in error" that does not include an Identification, the source can attempt to match based on any other identifying information; otherwise, it should discard the message.¶
If the source receives multiple Parcel/Jumbo Reports for a single parcel/jumbo sent into a given path, it should prefer any information reported by the final destination over information reported by a router. For example, if a router returns a negative report while the destination returns a positive report, the latter should be considered more authoritative. For this reason, the source should provide a configuration knob allowing it to accept or ignore reports that originate from routers, e.g., according to the network trust model.¶
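One way a source might apply the matching and preference rules above is sketched below. The `Report` type, its field names and the `accept_router_reports` flag are hypothetical and stand in for whatever an implementation actually parses from the "packet in error". For brevity, the sketch discards reports that carry no Identification rather than attempting a match on other identifying information.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Report:
    """Illustrative summary of a received Parcel/Jumbo Report."""
    identification: Optional[int]  # from the "packet in error", None if absent
    from_destination: bool         # True if sent by the final destination
    positive: bool                 # True for a positive report


def select_report(reports: Iterable[Report], identification: int,
                  accept_router_reports: bool = True) -> Optional[Report]:
    """Pick the most authoritative report for one probe.

    Reports that do not match the probe's Identification are ignored.
    A report from the final destination is preferred over any router
    report; router reports are used only if the configuration knob
    allows them.
    """
    matched = [r for r in reports if r.identification == identification]

    for report in matched:
        if report.from_destination:
            return report

    if accept_router_reports:
        for report in matched:
            return report  # only router reports remain at this point
    return None


if __name__ == "__main__":
    received = [
        Report(identification=7, from_destination=False, positive=False),
        Report(identification=7, from_destination=True, positive=True),
        Report(identification=None, from_destination=True, positive=False),
    ]
    print(select_report(received, 7))
```

Setting `accept_router_reports` to False models a trust policy in which only the final destination is believed.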
When a destination returns a Parcel/Jumbo Report, it can optionally attach the report to an ordinary data packet, parcel or AJ that it returns to the original source. For example, the OMNI specification includes a "super-packet" service that allows multiple independent IPv6 packets to be encapsulated as attachments to a single adaptation layer packet. This is distinct from an IP parcel in that each packet member of the super-packet includes its own IPv6 (and possibly other upper layer) header.¶
This section postulates a 128-bit Cyclic Redundancy Check (CRC) algorithm for AJs termed "CRC128J". An Advanced Jumbo Type value is reserved for CRC128J, but at the time of this writing no algorithm exists. Future specifications may update this document and provide an algorithm for handling Advanced Jumbos with Type CRC128J.¶
<< RFC Editor - remove prior to publication >>¶
Changes from earlier versions:¶
Submit for review.¶