Internet-Draft | IPsec anti-replay subspaces | March 2023 |
Ponchon, et al. | Expires 14 September 2023 | [Page] |
This document discusses the challenges of running IPsec with anti-replay in multi-core environments where packets may be re-ordered (e.g., when sent over multiple IP paths, traffic-engineered paths and/or using different QoS classes). A new solution based on splitting the anti-replay sequence number space into multiple different sequencing subspaces is proposed. Since this solution requires support on both parties, an IKE extension is proposed in order to negotiate the use of the anti-replay sequence number subspaces.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 14 September 2023.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The IPsec and IKE protocol suite is very commonly used in secure overlay networks, often interconnecting thousands or tens of thousands of sites. Leveraging the high core-counts and multi-uplinks (e.g., multiple fiber/cable, cellular or MPLS uplinks) capabilities of modern systems is important to bring greater throughput, availability and quality of service.¶
Such scale and multi-paths requirements conflict with how anti-replay currently works. This document first describes the problems related to running IPsec with anti-replay in conjunction with traffic-engineered paths or multi-core systems, and how existing solutions are not sufficient to address these challenges. An IPsec extension is then defined. It divides the IPsec sequence number space into multiple subspaces. Finally, an IKE extension is defined in order to enable this option only when both tunnel endpoints support it.¶
While the problem is explored in more detail in [I-D.mrossberg-ipsecme-multiple-sequence-counters], this section will highlight the key issues associated with running IPsec with anti-replay in multi-core systems and environments where traffic-engineering is used, as well as the limitations of current solutions.¶
Scaling the current anti-replay mechanism to run on multiple cores concurrently shows performance limitations: - When receiving a packet, preventing the same IPsec packet from being accepted by two different cores in parrallel requires constant synchronization between the cores. - When transmitting a packet, sequence numbers must be allocated efficiently, and packets must be transmitted without too much re-ordering, as to not exceed the receiver's anti-replay window size. This also ends-up requiring locks and synchronization between cores.¶
A commonly used alternative is to assign each Child SA to a given core, but that limits the throughput that is achievable by a single tunnel and adds a performance overhead associated with passing packets across cores.¶
These restrictions are discussed in [I-D.pwouters-ipsecme-multi-sa-performance], which mainly focuses on high-throughput IPsec tunnels, but the problem also arises with small tunnels since multiple inner flows processed by multiple threads often need to be transmitted on the same tunnel (causing multiple threads to need to access shared resources).¶
A possible solution to leverage the multi-core capability of the IPsec peers for a given tunnel would be to allocate one Child SA per core. However, combined with QoS classes and multi-path capabilities, this approach shows scalability issues with both the IKE and IPsec implementations:¶
Finally, in situations where packet reordering is present, such as with QoS or multiple uplinks, slower or lower priority packets may fall outside of the anti-replay window and be dropped. Using an extraordinarily large window size causes both performance and scalability limitations.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Processing packets associated with a single Child SA on multiple cores and using a single Child SA on multiple paths or with multile QoS classes suffer from limitations due to the anti-replay mechanism.¶
As a result, this section describes a solution which modifies the anti-replay mechanism by allowing to split the 64 bits (with Extended Sequence Number, ESN) anti-replay sequence number space into multiple subspaces. Each core, path, or QoS class, or any combination of those, can then use their own unique anti-replay sequence number subspace. The changes needed to the ESP header and IPsec protocol are described in Section 4.1, Section 4.2 and Section 4.3.¶
To avoid potential issues with non-standard extensions of IPsec ESP, this solution modifies only the field related to the anti-replay mechanism (i.e., the sequence number) and not the SPI field, which is intended to identify the Child SA. An IKE extension is presented in Section 4.5 to coordinate the use, or not, of this extension, which requires both IPsec peers to implement it.¶
This document extends the 32-bit field of the sequence number in the ESP header to a 64-bit field, which is in turn divided into two sub-fields:¶
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Security Parameters Index (SPI) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Subspace ID | Sequence Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Sequence Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Rest of the ESP payload¶
While the reduced usage of subspaces due to the restriction of the extended sequence number to 48 bits is a consideration, it is estimated that a 1 Tbps would exhaust a subspace in over 938 hours. This is for ethernet frames of 1500 bytes, T = 2^48 (pkts) * 1500 (B/pkt) * 8 / 10^12 (bps) = ~3.4 * 10^6 seconds = ~938 hours.¶
The sender MAY set the sequence number subspace ID to any value. For example, the sender would use different values per path or per processing core.¶
The sender MUST maintain one sequence number counter per sequence number subspace that it makes use of. But the sender MAY use only some (and as few as a single one) of the available 65,536 subspaces.¶
When transmitting a packet, the sender MUST use the sequence number counter associated with the sequence number subspace in use for that packet. The lower-order 48 bits of the sequence number counter are placed in the sequence number field, as specified in [RFC4303].¶
The receiver MUST maintain one anti-replay window and counter for each sequence number subspace being used.¶
When receiving a packet, the receiver MUST use the anti-replay window and counter associated with the sequence number subspace identified with the subspace ID field.¶
Note: Since the sender may decide to only use a subset of the available subspace values, receivers SHOULD NOT allocate all the possible anti-replay windows per peer by default. Two mitigation mechanisms may be used to reduce the number of anti-replay windows:¶
Due to the reduction of the sequence number space by using the 16 higher order bits of the field, using a 32-bit sequence number field is not a possibility. For instance, on a 1Gbps with 1500B ethernet frames, it would take less than one second for the sequence number to loop. Such a small periodicity would make it impractical to keep the peers of the IPsec tunnel in sync.¶
As such, the peers MUST use an explicit Extended Sequence Number (ESN) as a sub-second period for a resync operation (as defined in appendix A3 of [RFC4303]) would not be possible.¶
To negotiate the use of sequence number subspaces for use with IPsec ESP, a new anti-replay subspaces transform (Section 4.5.1) is defined with two attributes:¶
If both attributes are set to 0, the sender does not support sequence number subspaces. The requested value MUST be lower than the supported value.¶
During the CREATE_CHILD_SA exchange, the sender and receiver negotiate the use of this transform. The sender indicates the number of subspaces it supports and prefers to use, while the receiver decides on the number of subspaces to use based on the sender's capabilities. This negotiation mechanism allows for flexibility in the number of subspaces used and can help optimize the performance of IPsec in different environments.¶
With a single Child SA negotiated between the two IPsec peers, the failure model is clean, as all requested subspaces are either available or none of them.¶
1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Last Substruc | RESERVED | Transform Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Transform Type | RESERVED | Transform ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | ~ Transform Attributes ~ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+¶
1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |1| Attribute Type | AF=1 Attribute Value | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+¶
1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |1| Attribute Type | AF=1 Attribute Value | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+¶
As described in Section 2, anti-replay comes with implementation and scalability challenges when running in environments where IPsec peers may leverage multiple QoS policies to send packets or multiple cores to process them.¶
Since the anti-replay mechanism seems to be the source cause of these observed challenges, this document provides a solution which relies on a small and optional change at the anti-replay level.¶
By using sequence number subspaces, IPsec peers may:¶
The effectiveness of the subspace mechanism can be further improved with smart NICs or multiple paths to efficiently steer packets to different cores on the receiver side. However, even without these capabilities, sequence number subspaces still provide benefits for IPsec tunnels. Without subspaces, IPsec tunnels are often restricted to a single core due to the need for locking mechanisms, which can cause significant overhead. With subspaces, it is still possible to distribute the subspaces between cores by resteering packets to increase performances.¶
In scenarios where NATs are used to modify IP addresses or ports, the use of multiple uplinks on a single IPsec tunnel may not be feasible without additional IKE negotiation to perform NAT traversal. As a result, using multiple uplinks is recommended only in scenarios where NATs are not present.¶
The sequence number is used by the anti-replay mechanism to ensure a packet could not be accepted twice by the receiver. This prevents an attacker from trying to replay one or multiple packets from an IPsec tunnel.¶
In this proposal, a single Child SA is associated with multiple anti-replay windows and counters. If a packet is replayed, the sequence number subspace ID remains the same since the Subspace ID field is authenticated. As a result, the receiver will use the same anti-replay state when processing the replayed packet as the one used when the first packet was initially received. This ensures that a replayed packet will be detected and dropped by the receiver.¶
The use of a subspace ID as part of the 64-bit sequence number ensures that the usage limit of cryptographic materials is evenly distributed among the subspaces without the need for an additional mechanism. This means each of the 2^16 subspaces can encrypt 2^48 packets, fully utilizing the 2^64 usage limits of the cryptographic keys.¶
When a single sequence number space is used within a given Child SA, encryption and decryption operations must always happen on the same core (locking anti-replay structures or using contended atomic operations has a dramatic performance hit).¶
To avoid the performance degradation caused by packet resteering, each thread may use its own sequence number subspace:¶
Similarly, when multiple paths are used:¶
If a combination of both multi-path and multi-core load-balancing is needed, the subspace field could be used partly to encode a path ID, partly to encode a core ID. But this is purely implementation specific and does not require coordination between the peers.¶
Depending on the cryptographic mode of operations, the Initialization Vector (IV) comes with specific requirements.¶
Some modes (e.g., CBC) make use of random IV values. When implementing this specification, each thread independently generates its independent stream of random values, ensuring the IV randomness property. Care must be taken as to limit the global number of transmitted packets using the same Child SA in order to avoid birthday paradox attacks. A lockless counter, or batched token bucket mechanism, may be used to efficiently implement this process without performance degradation.¶
Other cryptographic modes (e.g., GCM) do not have randomness requirements over the IV, but the IV values must only be used once. RFC4106 Section 3.1 states that "The most natural way to implement this is with a counter, but anything that guarantees uniqueness can be used, such as a linear feedback shift register (LFSR). Note that the encrypter can use any IV generation method that meets the uniqueness requirement, without coordinating with the decrypter." . One simple way to implement this specification is to divide the IV into a subspace field, which reuses the ESP sequence number subspace value, and a variable IV part, which is simply incremented for each encrypted packet. To ensure compatibility with implicit IVs from [RFC8750], only the 48-bit sequence number field must be initialized to zero, while the 16-bit subspace ID can be used for its intended purpose.¶
Author's note: Are there other cryptographic modes with different requirements over the IV ?¶