Internet-Draft | Predictable Transient Numeric IDs | December 2022 |
Gont & Arce | Expires 14 June 2023 |
This document analyzes the timeline of the specification and implementation of different types of "transient numeric identifiers" used in IETF protocols, and how the security and privacy properties of such protocols have been affected as a result of it. It provides empirical evidence that advice in this area is warranted. This document is a product of the Privacy Enhancement and Assessment Research Group (PEARG) in the IRTF.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 14 June 2023.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.¶
Networking protocols employ a variety of transient numeric identifiers for different protocol objects, such as IPv4 and IPv6 Fragment Identifiers [RFC0791] [RFC8200], IPv6 Interface Identifiers (IIDs) [RFC4291], transport protocol ephemeral port numbers [RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC0793], NTP Reference IDs (REFIDs) [RFC5905], and DNS Query IDs [RFC1035]. These identifiers typically have specific interoperability requirements (e.g. uniqueness during a specified period of time), and associated failure severities when such requirements are not met [I-D.irtf-pearg-numeric-ids-generation].¶
For more than 30 years, a large number of implementations of the IETF protocols have been subject to a variety of attacks, with effects ranging from Denial of Service (DoS) or data injection, to information leakages that could be exploited for pervasive monitoring [RFC7258]. The root cause of these issues has been, in many cases, poor selection of transient numeric identifiers, usually as a result of insufficient or misleading specifications.¶
For example, implementations have been subject to security or privacy issues resulting from:¶
These examples indicate that when new protocols are standardized or implemented, the security and privacy properties of the associated transient numeric identifiers tend to be overlooked, and inappropriate algorithms to generate such identifiers (i.e., algorithms that negatively affect the security or privacy properties of the protocol) are either suggested in the specification or selected by implementers.¶
This document contains a non-exhaustive timeline of the specification and vulnerability disclosures related to some sample transient numeric identifiers, including other work that has led to advances in this area. This analysis indicates that:¶
While it is generally possible to identify an algorithm that can satisfy the interoperability requirements for a given transient numeric identifier, this document provides empirical evidence that doing so without negatively affecting the security or privacy properties of the aforementioned protocols is non-trivial. Other related documents ([I-D.irtf-pearg-numeric-ids-generation] and [I-D.gont-numeric-ids-sec-considerations]) provide guidance in this area, as motivated by the present document.¶
This document represents the consensus of the Privacy Enhancement and Assessment Research Group (PEARG).¶
The terms "constant IID", "stable IID", and "temporary IID" are to be interpreted as defined in [RFC7721].¶
Throughout this document, we do not consider on-path attacks. That is, we assume the attacker does not have physical or logical access to the system(s) being attacked, and that the attacker can only observe traffic explicitly directed to the attacker. Similarly, an attacker cannot observe traffic transferred between a sender and the receiver(s) of a target protocol, but may be able to interact with any of these entities, e.g., by sending traffic to them in order to sample the transient numeric identifiers that the target systems employ when communicating with the attacker.¶
For example, when analyzing vulnerabilities associated with TCP Initial Sequence Numbers (ISNs), we assume that the attacker is unable to capture network traffic corresponding to a TCP connection between two other hosts. However, we assume that the attacker is able to communicate with any of these hosts (e.g., establish a TCP connection with any of them) in order to sample the TCP ISNs employed by these systems when communicating with the attacker.¶
Similarly, when considering host-tracking attacks based on IPv6 interface identifiers, we consider that an attacker may learn the IPv6 address employed by a victim node if, e.g., the address becomes exposed as a result of the victim node communicating with an attacker-operated server. Subsequently, the attacker may perform host tracking by probing a set of target addresses composed of a set of target prefixes and the IPv6 interface identifier originally learned by the attacker. Alternatively, an attacker may perform host tracking if, e.g., the victim node communicates with an attacker-operated server as it moves from one location to another, thus exposing its configured addresses. We note that none of these scenarios requires the attacker to observe traffic not explicitly directed to the attacker.¶
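While assessing IETF protocol specifications regarding the use of transient numeric identifiers, we have found that most of the issues discussed in this document arise as a result of one of the following conditions: protocol specifications that overlook the security and privacy implications of their transient numeric identifiers, protocol specifications that over-specify their transient numeric identifiers, and implementations that fail to comply with the corresponding specifications or recommendations.¶
A number of IETF protocol specifications have simply overlooked the security and privacy implications of transient numeric identifiers. Examples include the specification of TCP ephemeral ports in [RFC0793], the specification of TCP sequence numbers in [RFC0793], and the specification of the DNS TxID in [RFC1035].¶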
On the other hand, there are a number of IETF protocol specifications that over-specify some of their associated transient numeric identifiers. For example, [RFC4291] essentially overloads the semantics of IPv6 Interface Identifiers (IIDs) by embedding link-layer addresses in the IPv6 IIDs, when the interoperability requirement of uniqueness could be achieved in other ways that do not result in negative security and privacy implications [RFC7721]. Similarly, [RFC2460] suggested the use of a global counter for the generation of Fragment Identification values, when the interoperability properties of uniqueness per {Src IP, Dst IP} could be achieved with other algorithms that do not result in negative security and privacy implications [RFC7739].¶
Finally, there are implementations that simply fail to comply with the corresponding IETF protocol specifications or recommendations. For example, some popular operating systems (notably Microsoft Windows) still fail to implement transport protocol ephemeral port randomization, as recommended in [RFC6056].¶
The following subsections document the timelines for a number of sample transient numeric identifiers that illustrate how the problem discussed in this document has affected protocols from different layers over time. These sample transient numeric identifiers have different interoperability requirements and failure severities (see Section 6 of [I-D.irtf-pearg-numeric-ids-generation]), and thus are considered to be representative of the problem being analyzed in this document.¶
This section presents the timeline of the Identification field employed by IPv4 (in the base header) and IPv6 (in Fragment Headers). The reason for presenting both cases in the same section is to make it evident that while the Identification value serves the same purpose in both IPv4 and IPv6, the work and research done for the IPv4 case did not affect IPv6 specifications or implementations.¶
The IPv4 Identification is specified in [RFC0791], which specifies the interoperability requirements for the Identification field: the sender must choose the Identification field to be unique for a given source address, destination address, and protocol, for the time the datagram (or any fragment of it) could be alive in the internet. It suggests that a node may keep "a table of Identifiers, one entry for each destination it has communicated with in the last maximum packet lifetime for the internet", and suggests that "since the Identifier field allows 65,536 different values, hosts may be able to simply use unique identifiers independent of destination". The above has been interpreted numerous times as a suggestion to employ per-destination or global counters for the generation of Identification values. While [RFC0791] does not suggest any flawed algorithm for the generation of Identification values, the specification omits a discussion of the security and privacy implications of predictable Identification values. This has resulted in many IPv4 implementations generating predictable fragment Identification values by means of a global counter, at least at some point in time.¶
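The effect of a global counter can be illustrated with a minimal sketch. The class and function names below are hypothetical and do not correspond to any particular implementation; the sketch only assumes a sender that draws every Identification value from a single 16-bit counter, regardless of destination.

```python
class GlobalCounterIPID:
    """Hypothetical sender that uses one 16-bit counter for all destinations."""

    def __init__(self, start=0):
        self._next = start & 0xFFFF

    def next_id(self, dst):
        # The destination is ignored: every flow shares the same counter.
        ident = self._next
        self._next = (self._next + 1) & 0xFFFF
        return ident


def packets_sent_between(sample_a, sample_b):
    """Packets the sender emitted to other hosts between two samples."""
    return ((sample_b - sample_a) & 0xFFFF) - 1


sender = GlobalCounterIPID(start=65534)
first = sender.next_id("attacker")      # attacker elicits one packet
for _ in range(5):
    sender.next_id("victim-peer")       # traffic the attacker cannot observe
second = sender.next_id("attacker")     # attacker elicits another packet
leaked = packets_sent_between(first, second)
```

An off-path attacker that only receives packets addressed to itself can thus infer how many packets the sender transmitted to other destinations, and can predict the Identification value of the next packet.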
The IPv6 Identification was originally specified in [RFC1883]. It serves the same purpose as its IPv4 counterpart. The differences are the length of the field and its location: while the IPv4 Identification field is part of the base IPv4 header, in the IPv6 case it is part of the Fragment header (which may or may not be present in an IPv6 packet). [RFC1883] states, in Section 4.5, that the Identification must be different than that of any other fragmented packet sent recently (within the maximum likely lifetime of a packet) with the same Source Address and Destination Address. Subsequently, it notes that this requirement can be met by means of a wrap-around 32-bit counter that is incremented each time a packet must be fragmented, and that it is an implementation choice whether to use a global or a per-destination counter. Thus, the specification of the IPv6 Identification is similar to that of the IPv4 case, with the only difference being that in the IPv6 case the suggestion to use simple counters is more explicit. [RFC2460] was the first revision of the core IPv6 specification, and maintained the same text for the specification of the IPv6 Identification field. [RFC8200], the second revision of the core IPv6 specification, removes the suggestion from [RFC2460] to use a counter for the generation of IPv6 Identification values, and points to [RFC7739] for sample algorithms for their generation.¶
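As a rough illustration of how uniqueness per {Src IP, Dst IP} might be achieved without a predictable global sequence, the following sketch combines a per-destination counter with a keyed-hash offset. It is written in the spirit of the hash-based approach discussed in [RFC7739] but is not a transcription of it: the class name, the use of HMAC-SHA-256, and the unbounded dictionary of counters are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

MASK32 = 0xFFFFFFFF


class PerDestinationFragID:
    """32-bit Identification: keyed-hash offset plus per-{src, dst} counter."""

    def __init__(self, secret=None):
        # Assumption: a random secret chosen at boot time.
        self._secret = secret if secret is not None else os.urandom(16)
        self._counters = {}

    def _offset(self, src, dst):
        mac = hmac.new(self._secret, f"{src}|{dst}".encode(), hashlib.sha256)
        return int.from_bytes(mac.digest()[:4], "big")

    def next_id(self, src, dst):
        count = self._counters.get((src, dst), 0)
        self._counters[(src, dst)] = (count + 1) & MASK32
        return (self._offset(src, dst) + count) & MASK32


gen = PerDestinationFragID(secret=b"k" * 16)
a1 = gen.next_id("2001:db8::1", "2001:db8::a")
b1 = gen.next_id("2001:db8::1", "2001:db8::b")   # does not advance flow "a"
a2 = gen.next_id("2001:db8::1", "2001:db8::a")
```

Values remain sequential (and therefore unique) within each source/destination pair, while a node that only sees its own flow learns nothing about the number of fragmented packets sent to other destinations.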
[RFC0793] suggests that the choice of the ISN of a connection is not arbitrary, but aims to reduce the chances of a stale segment being accepted by a new incarnation of a previous connection. [RFC0793] suggests the use of a global 32-bit ISN generator that is incremented by 1 roughly every 4 microseconds. In practice, however, protection against stale segments from a previous incarnation of the connection is enforced by preventing the creation of a new incarnation of a previous connection before 2*MSL have passed since a segment corresponding to the old incarnation was last seen (where "MSL" is the "Maximum Segment Lifetime" [RFC0793]). This is accomplished by the TIME-WAIT state and TCP's "quiet time" concept (see Appendix B of [RFC1323]). Based on the assumption that ISNs are monotonically increasing across connections, many stacks (e.g., 4.2BSD-derived) use the ISN of an incoming SYN segment to perform "heuristics" that enable the creation of a new incarnation of a connection while the previous incarnation is still in the TIME-WAIT state (see p. 945 of [Wright1994]). This avoids an interoperability problem that may arise when a node establishes connections to a specific TCP end-point at a high rate [Silbersack2005].¶
The interoperability requirements for TCP ISNs are probably not as clearly spelled out as one would expect. Furthermore, the suggestion of employing a global counter in [RFC0793] negatively affects the security and privacy properties of the protocol.¶
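The following sketch contrasts the timer-driven global generator suggested in [RFC0793] with a generator that adds a keyed hash of the connection identifiers to the same timer, along the lines of [RFC6528]. Function names, the microsecond clock passed in as an argument, and the choice of HMAC-SHA-256 as the hash function are assumptions made for illustration only.

```python
import hashlib
import hmac

MASK32 = 0xFFFFFFFF


def rfc793_isn(now_us):
    """Global generator: a 32-bit clock that ticks once every 4 microseconds."""
    return (now_us // 4) & MASK32


def predict_isn(sampled_isn, elapsed_us):
    """Attacker's estimate of a later ISN from one sample and the elapsed time."""
    return (sampled_isn + elapsed_us // 4) & MASK32


def hashed_offset_isn(now_us, local_ip, local_port, remote_ip, remote_port, secret):
    """Timer plus a secret, per-connection-identifier offset."""
    conn = f"{local_ip}|{local_port}|{remote_ip}|{remote_port}".encode()
    digest = hmac.new(secret, conn, hashlib.sha256).digest()
    return (rfc793_isn(now_us) + int.from_bytes(digest[:4], "big")) & MASK32


t0, elapsed = 1_000_000, 2_000_000
sampled = rfc793_isn(t0)                    # ISN observed by the attacker
predicted = predict_isn(sampled, elapsed)   # guess for a connection to a third party
actual = rfc793_isn(t0 + elapsed)

secret = bytes(16)                          # placeholder; a real secret would be random
isn_a = hashed_offset_isn(t0, "192.0.2.1", 80, "198.51.100.7", 40000, secret)
isn_b = hashed_offset_isn(t0 + elapsed, "192.0.2.1", 80, "198.51.100.7", 40000, secret)
```

With the global generator, a single sample taken over the attacker's own connection is enough to estimate the ISN of a connection between two other hosts. With the hashed offset, ISNs still increase monotonically for a given connection identifier (which preserves the TIME-WAIT heuristics mentioned above), but a sample from one connection identifier does not reveal the offset used for another.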
IPv6 Interface Identifiers can be generated as a result of different mechanisms, including SLAAC [RFC4862], DHCPv6 [RFC8415], and manual configuration. This section focuses on Interface Identifiers resulting from SLAAC.¶
Stable (traditional) IPv6 addresses resulting from SLAAC have traditionally embedded the underlying link-layer address in the Interface Identifier (IID). At the time, employing the underlying link-layer address for the IID was seen as a convenient way to obtain a unique address. However, recent awareness about the security and privacy properties of this approach [RFC7707] [RFC7721] has led to the replacement of this flawed scheme with an alternative one [RFC7217] [RFC8064] that does not negatively affect the security and privacy properties of the protocol.¶
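A minimal sketch of the traditional scheme, assuming a 48-bit IEEE 802 link-layer address and the Modified EUI-64 format, shows why the resulting IID is constant across networks. The helper names and the documentation prefixes are chosen for illustration.

```python
import ipaddress


def modified_eui64_iid(mac):
    """Derive a 64-bit IID from a 48-bit MAC address (Modified EUI-64 format)."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02                      # invert the universal/local bit
    return bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])


def slaac_address(prefix, iid):
    """Combine a /64 prefix with an IID into a full IPv6 address."""
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(net.network_address) | int.from_bytes(iid, "big"))


iid = modified_eui64_iid("00:11:22:33:44:55")
addresses = [slaac_address(p, iid) for p in ("2001:db8:1::/64", "2001:db8:2::/64")]
```

The low-order 64 bits are identical in both addresses, so a node can be correlated as it moves between networks, and the embedded link-layer address is disclosed to every correspondent.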
The NTP [RFC5905] Reference ID is a 32-bit code identifying the particular server or reference clock. Above stratum 1 (secondary servers and clients), this value can be employed to avoid degree-one timing loops; that is, scenarios where two NTP peers are (mutually) the time source of each other. If using the IPv4 address family, the identifier is the four-octet IPv4 address. If using the IPv6 address family, it is the first four octets of the MD5 hash of the IPv6 address.¶
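The derivation described above can be sketched as follows. The function name is hypothetical, and hashing the 16-octet binary form of the IPv6 address is an assumption, since the text only says "the MD5 hash of the IPv6 address".

```python
import hashlib
import ipaddress


def ntp_refid(time_source_address):
    """Reference ID of a node synchronized to the given time source."""
    addr = ipaddress.ip_address(time_source_address)
    if addr.version == 4:
        return addr.packed                          # the four-octet IPv4 address itself
    # Assumption: the hash input is the packed (binary) IPv6 address.
    return hashlib.md5(addr.packed).digest()[:4]


refid_v4 = ntp_refid("192.0.2.1")
refid_v6 = ntp_refid("2001:db8::1")
```

In the IPv4 case the Reference ID is literally the address of the time source, so any party that receives NTP packets from the node learns which server the node is synchronized to.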
Most (if not all) transport protocols employ "port numbers" to demultiplex packets to the corresponding transport protocol instances.¶
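For reference, the kind of ephemeral port randomization recommended in [RFC6056] can be sketched along the lines of that document's simple port randomization algorithm. The set of in-use ports stands in for the real check against existing transport protocol instances, and the function and constant names are assumptions.

```python
import random

MIN_EPHEMERAL = 1024
MAX_EPHEMERAL = 65535


def select_ephemeral_port(in_use, rng=None):
    """Pick a random starting port, then search linearly for a free one."""
    rng = rng or random.SystemRandom()
    num_ephemeral = MAX_EPHEMERAL - MIN_EPHEMERAL + 1
    port = MIN_EPHEMERAL + rng.randrange(num_ephemeral)
    for _ in range(num_ephemeral):
        if port not in in_use:
            return port
        # Wrap around at the top of the ephemeral range.
        port = MIN_EPHEMERAL if port == MAX_EPHEMERAL else port + 1
    raise RuntimeError("no ephemeral port available")


busy = set(range(MIN_EPHEMERAL, MAX_EPHEMERAL + 1)) - {40000}
only_free = select_ephemeral_port(busy)
any_port = select_ephemeral_port(set())
```

Compared with a simple incrementing counter, an off-path attacker that observes the port used for one connection gains no information about the port that will be used for the next one.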
The DNS Query ID [RFC1035] can be employed to match DNS replies to outstanding DNS queries.¶
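The matching role of the Query ID can be sketched as a small table of outstanding queries. The class below is hypothetical, and drawing the identifier from a cryptographically secure random source is an assumption about how a careful implementation might behave, not something required by [RFC1035].

```python
import secrets


class QueryTable:
    """Tracks outstanding DNS queries keyed by their 16-bit Query ID."""

    def __init__(self):
        self._outstanding = {}

    def new_query(self, name):
        if len(self._outstanding) >= 1 << 16:
            raise RuntimeError("no free Query ID")
        while True:
            txid = secrets.randbelow(1 << 16)   # unpredictable 16-bit value
            if txid not in self._outstanding:
                self._outstanding[txid] = name
                return txid

    def match_reply(self, txid, name):
        """Accept a reply only if it matches an outstanding query."""
        if self._outstanding.get(txid) != name:
            return False
        del self._outstanding[txid]
        return True


table = QueryTable()
txid = table.new_query("example.com")
forged = table.match_reply((txid + 1) & 0xFFFF, "example.com")
genuine = table.match_reply(txid, "example.com")
replayed = table.match_reply(txid, "example.com")
```

If the Query ID were predictable (e.g., drawn from a counter), an off-path attacker could forge a reply carrying the expected value before the legitimate reply arrives.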
For more than 30 years, a large number of implementations of the IETF protocols have been subject to a variety of attacks, with effects ranging from Denial of Service (DoS) or data injection, to information leakages that could be exploited for pervasive monitoring [RFC7258]. The root cause of these issues has been, in many cases, poor selection of transient numeric identifiers, usually as a result of insufficient or misleading specifications.¶
While it is generally possible to identify an algorithm that can satisfy the interoperability requirements for a given transient numeric identifier, this document provides empirical evidence that doing so without negatively affecting the security or privacy properties of the aforementioned protocols is non-trivial. It is thus evident that advice in this area is warranted.¶
[I-D.gont-numeric-ids-sec-considerations] aims at requiring future IETF protocol specifications to contain analysis of the security and privacy properties of any transient numeric identifiers specified by the protocol, and to recommend an algorithm for the generation of such transient numeric identifiers. [I-D.irtf-pearg-numeric-ids-generation] specifies a number of sample algorithms for generating transient numeric identifiers with specific interoperability requirements and failure severities.¶
There are no IANA registries within this document.¶
This document analyzes the timeline of the specification and implementation of the transient numeric identifiers of some sample IETF protocols, and how the security and privacy properties of such protocols have been affected as a result of it. It provides concrete evidence that advice in this area is warranted.¶
[I-D.gont-numeric-ids-sec-considerations] formally requires IETF protocol specifications to specify the interoperability requirements for their transient numeric identifiers, to perform a vulnerability assessment of such transient numeric identifiers, and to recommend possible algorithms for their generation, such that the interoperability requirements are complied with, while any negative security and privacy properties of these transient numeric identifiers are mitigated.¶
[I-D.irtf-pearg-numeric-ids-generation] analyzes and categorizes transient numeric identifiers based on their interoperability requirements and their associated failure severities, and recommends possible algorithms that can comply with those requirements without negatively affecting the security and privacy properties of the corresponding protocols.¶
The authors would like to thank (in alphabetical order) Bernard Aboba, Dave Crocker, Spencer Dawkins, Theo de Raadt, Sara Dickinson, Guillermo Gont, Christian Huitema, Colin Perkins, Vincent Roca, Kris Shrishak, Joe Touch, Brian Trammell, and Christopher Wood, for providing valuable comments on earlier versions of this document.¶
The authors would like to thank (in alphabetical order) Steven Bellovin, Joseph Lorenzo Hall, Gre Norcie, and Martin Thomson, for providing valuable comments on [I-D.gont-predictable-numeric-ids], on which this document is based.¶
Section 4.2 of this document borrows text from [RFC6528], authored by Fernando Gont and Steven Bellovin.¶
The authors would like to thank Sara Dickinson and Christopher Wood, for their guidance during the publication process of this document.¶
The authors would like to thank Diego Armando Maradona for his magic and inspiration.¶