rfc9415.original   rfc9415.txt 
Internet Research Task Force (IRTF) F. Gont Internet Research Task Force (IRTF) F. Gont
Internet-Draft SI6 Networks Request for Comments: 9415 SI6 Networks
Intended status: Informational I. Arce Category: Informational I. Arce
Expires: 14 June 2023 Quarkslab ISSN: 2070-1721 Quarkslab
11 December 2022 July 2023
On the Generation of Transient Numeric Identifiers On the Generation of Transient Numeric Identifiers
draft-irtf-pearg-numeric-ids-generation-12
Abstract Abstract
This document performs an analysis of the security and privacy This document performs an analysis of the security and privacy
implications of different types of "transient numeric identifiers" implications of different types of "transient numeric identifiers"
used in IETF protocols, and tries to categorize them based on their used in IETF protocols and tries to categorize them based on their
interoperability requirements and their associated failure severity interoperability requirements and their associated failure severity
when such requirements are not met. Subsequently, it provides advice when such requirements are not met. Subsequently, it provides advice
on possible algorithms that could be employed to satisfy the on possible algorithms that could be employed to satisfy the
interoperability requirements of each identifier category, while interoperability requirements of each identifier category while
minimizing the negative security and privacy implications, thus minimizing the negative security and privacy implications, thus
providing guidance to protocol designers and protocol implementers. providing guidance to protocol designers and protocol implementers.
Finally, it describes a number of algorithms that have been employed Finally, it describes a number of algorithms that have been employed
in real implementations to generate transient numeric identifiers, in real implementations to generate transient numeric identifiers and
and analyzes their security and privacy properties. This document is analyzes their security and privacy properties. This document is a
a product of the Privacy Enhancement and Assessment Research Group product of the Privacy Enhancements and Assessments Research Group
(PEARG) in the IRTF. (PEARG) in the IRTF.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This document is not an Internet Standards Track specification; it is
provisions of BCP 78 and BCP 79. published for informational purposes.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Research Task Force
and may be updated, replaced, or obsoleted by other documents at any (IRTF). The IRTF publishes the results of Internet-related research
time. It is inappropriate to use Internet-Drafts as reference and development activities. These results might not be suitable for
material or to cite them other than as "work in progress." deployment. This RFC represents the consensus of the Privacy
Enhancements and Assessments Research Group of the Internet Research
Task Force (IRTF). Documents approved for publication by the IRSG
are not candidates for any level of Internet Standard; see Section 2
of RFC 7841.
This Internet-Draft will expire on 14 June 2023. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9415.
Copyright Notice Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents
license-info) in effect on the date of publication of this document. (https://trustee.ietf.org/license-info) in effect on the date of
Please review these documents carefully, as they describe your rights publication of this document. Please review these documents
and restrictions with respect to this document. carefully, as they describe your rights and restrictions with respect
to this document.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology
3. Threat Model . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Threat Model
4. Issues with the Specification of Transient Numeric 4. Issues with the Specification of Transient Numeric Identifiers
Identifiers . . . . . . . . . . . . . . . . . . . . . . . 6 5. Protocol Failure Severity
5. Protocol Failure Severity . . . . . . . . . . . . . . . . . . 7 6. Categorizing Transient Numeric Identifiers
6. Categorizing Transient Numeric Identifiers . . . . . . . . . 7 7. Common Algorithms for Transient Numeric Identifier Generation
7. Common Algorithms for Transient Numeric Identifier 7.1. Category #1: Uniqueness (Soft Failure)
Generation . . . . . . . . . . . . . . . . . . . . . . . 10 7.2. Category #2: Uniqueness (Hard Failure)
7.1. Category #1: Uniqueness (soft failure) . . . . . . . . . 10 7.3. Category #3: Uniqueness, Stable within Context (Soft
7.2. Category #2: Uniqueness (hard failure) . . . . . . . . . 14 Failure)
7.3. Category #3: Uniqueness, stable within context (soft 7.4. Category #4: Uniqueness, Monotonically Increasing within
failure) . . . . . . . . . . . . . . . . . . . . . . . . 15 Context (Hard Failure)
7.4. Category #4: Uniqueness, monotonically increasing within
context (hard failure) . . . . . . . . . . . . . . . . . 17
8. Common Vulnerabilities Associated with Transient Numeric 8. Common Vulnerabilities Associated with Transient Numeric
Identifiers . . . . . . . . . . . . . . . . . . . . . . . 23 Identifiers
8.1. Network Activity Correlation . . . . . . . . . . . . . . 23 8.1. Network Activity Correlation
8.2. Information Leakage . . . . . . . . . . . . . . . . . . . 24 8.2. Information Leakage
8.3. Fingerprinting . . . . . . . . . . . . . . . . . . . . . 25 8.3. Fingerprinting
8.4. Exploitation of the Semantics of Transient Numeric 8.4. Exploitation of the Semantics of Transient Numeric
Identifiers . . . . . . . . . . . . . . . . . . . . . . . 26 Identifiers
8.5. Exploitation of Collisions of Transient Numeric 8.5. Exploitation of Collisions of Transient Numeric Identifiers
Identifiers . . . . . . . . . . . . . . . . . . . . . . . 27
8.6. Exploitation of Predictable Transient Numeric Identifiers 8.6. Exploitation of Predictable Transient Numeric Identifiers
for Injection Attacks . . . . . . . . . . . . . . . . . . 27 for Injection Attacks
8.7. Cryptanalysis . . . . . . . . . . . . . . . . . . . . . . 28 8.7. Cryptanalysis
9. Vulnerability Assessment of Transient Numeric Identifiers . . 28 9. Vulnerability Assessment of Transient Numeric Identifiers
9.1. Category #1: Uniqueness (soft failure) . . . . . . . . . 28 9.1. Category #1: Uniqueness (Soft Failure)
9.2. Category #2: Uniqueness (hard failure) . . . . . . . . . 29 9.2. Category #2: Uniqueness (Hard Failure)
9.3. Category #3: Uniqueness, stable within context (soft 9.3. Category #3: Uniqueness, Stable within Context (Soft
failure) . . . . . . . . . . . . . . . . . . . . . . . . 29 Failure)
9.4. Category #4: Uniqueness, monotonically increasing within 9.4. Category #4: Uniqueness, Monotonically Increasing within
context (hard failure) . . . . . . . . . . . . . . . . . 30 Context (Hard Failure)
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 32 10. IANA Considerations
11. Security Considerations . . . . . . . . . . . . . . . . . . . 32 11. Security Considerations
12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 33 12. References
13. References . . . . . . . . . . . . . . . . . . . . . . . . . 33 12.1. Normative References
13.1. Normative References . . . . . . . . . . . . . . . . . . 33 12.2. Informative References
13.2. Informative References . . . . . . . . . . . . . . . . . 35 Appendix A. Algorithms and Techniques with Known Issues
A.1. Predictable Linear Identifiers Algorithm
Appendix A. Algorithms and Techniques with Known Issues . . . . 41 A.2. Random-Increments Algorithm
A.1. Predictable Linear Identifiers Algorithm . . . . . . . . 41 A.3. Reusing Identifiers Across Different Contexts
A.2. Random-Increments Algorithm . . . . . . . . . . . . . . . 43 Acknowledgements
A.3. Re-using Identifiers Across Different Contexts . . . . . 44 Authors' Addresses
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 44
1. Introduction 1. Introduction
Networking protocols employ a variety of transient numeric Networking protocols employ a variety of transient numeric
identifiers for different protocol objects, such as IPv4 and IPv6 identifiers for different protocol objects, such as IPv4 and IPv6
Fragment Identifiers [RFC0791] [RFC8200], IPv6 Interface Identifiers Identification values [RFC0791] [RFC8200], IPv6 Interface Identifiers
(IIDs) [RFC4291], transport protocol ephemeral port numbers (IIDs) [RFC4291], transport-protocol ephemeral port numbers
[RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC0793], and DNS [RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC9293], NTP
Query IDs [RFC1035].These identifiers usually have specific Reference IDs (REFIDs) [RFC5905], and DNS IDs [RFC1035]. These
interoperability requirements (e.g. uniqueness during a specified identifiers typically have specific requirements (e.g., uniqueness
period of time) that must be satisfied such that they do not result during a specified period of time) that must be satisfied such that
in negative interoperability implications, and an associated failure they do not result in negative interoperability implications and an
severity when such requirements are not met, ranging from soft to associated failure severity when such requirements are not met.
hard failures.
| NOTE: Some documents refer to the DNS ID as the DNS "Query ID"
| or "TxID".
For more than 30 years, a large number of implementations of IETF For more than 30 years, a large number of implementations of IETF
protocols have been subject to a variety of attacks, with effects protocols have been subject to a variety of attacks, with effects
ranging from Denial of Service (DoS) or data injection, to ranging from Denial of Service (DoS) or data injection to information
information leakages that could be exploited for pervasive monitoring leakages that could be exploited for pervasive monitoring [RFC7258].
[RFC7258]. The root cause of these issues has been, in many cases, The root cause of these issues has been, in many cases, the poor
the poor selection of transient numeric identifiers in such selection of transient numeric identifiers in such protocols, usually
protocols, usually as a result of insufficient or misleading as a result of insufficient or misleading specifications. While it
specifications. While it is generally trivial to identify an is generally trivial to identify an algorithm that can satisfy the
algorithm that can satisfy the interoperability requirements of a interoperability requirements of a given transient numeric
given transient numeric identifier, empirical evidence exists that identifier, empirical evidence exists that doing so without
doing so without negatively affecting the security and/or privacy negatively affecting the security and/or privacy properties of the
properties of the aforementioned protocols is prone to error aforementioned protocols is prone to error [RFC9414].
[I-D.irtf-pearg-numeric-ids-history].
For example, implementations have been subject to security and/or For example, implementations have been subject to security and/or
privacy issues resulting from: privacy issues resulting from:
* Predictable IPv4 or IPv6 Fragment Identifiers (see e.g. * predictable IPv4 or IPv6 Identification values (e.g., see
[Sanfilippo1998a], [RFC6274], and [RFC7739]) [Sanfilippo1998a], [RFC6274], and [RFC7739]),
* Predictable IPv6 IIDs (see e.g. [RFC7721], [RFC7707], and * predictable IPv6 IIDs (e.g., see [RFC7217], [RFC7707], and
[RFC7217]) [RFC7721]),
* Predictable transport protocol ephemeral port numbers (see e.g. * predictable transport-protocol ephemeral port numbers (e.g., see
[RFC6056] and [Silbersack2005]) [RFC6056] and [Silbersack2005]),
* Predictable TCP Initial Sequence Numbers (ISNs) (see e.g. * predictable TCP Initial Sequence Numbers (ISNs) (e.g., see
[Morris1985], [Bellovin1989], and [RFC6528]) [Morris1985], [Bellovin1989], and [RFC6528]),
* Predictable initial timestamps in TCP timestamps Options (see e.g. * predictable initial timestamps in TCP timestamps options (e.g.,
[TCPT-uptime] and [RFC7323]) see [TCPT-uptime] and [RFC7323]), and
* Predictable DNS Query IDs (see e.g. [Schuba1993] and [Klein2007]) * predictable DNS IDs (see, e.g., [Schuba1993] and [Klein2007]).
Recent history indicates that when new protocols are standardized or Recent history indicates that, when new protocols are standardized or
new protocol implementations are produced, the security and privacy new protocol implementations are produced, the security and privacy
properties of the associated transient numeric identifiers tend to be properties of the associated transient numeric identifiers tend to be
overlooked, and inappropriate algorithms to generate transient overlooked, and inappropriate algorithms to generate such identifiers
numeric identifiers are either suggested in the specifications or are either suggested in the specifications or selected by
selected by implementers. As a result, it should be evident that implementers. As a result, advice in this area is warranted.
advice in this area is warranted.
We note that the use of cryptographic techniques may readily mitigate We note that the use of cryptographic techniques may readily mitigate
some of the issues arising from predictable transient numeric some of the issues arising from predictable transient numeric
identifiers. For example, cryptographic integrity and authentication identifiers. For example, cryptographic authentication can readily
can readily mitigate data injection attacks even in the presence of mitigate data injection attacks even in the presence of predictable
predictable transient numeric identifiers (such as "sequence transient numeric identifiers (such as "sequence numbers"). However,
numbers"). However, use of flawed algorithms (such as global use of flawed algorithms (such as global counters) for generating
counters) for generating transient numeric identifiers could still transient numeric identifiers could still result in information
result in information leakages even when cryptographic techniques are leakages even when cryptographic techniques are employed.
employed.
This document contains a non-exhaustive survey of transient numeric This document contains a non-exhaustive survey of transient numeric
identifiers employed in various IETF protocols, and aims to identifiers employed in various IETF protocols and aims to categorize
categorize such identifiers based on their interoperability such identifiers based on their interoperability requirements and the
requirements, and the associated failure severity when such associated failure severity when such requirements are not met.
requirements are not met. Subsequently, it provides advice on Subsequently, it provides advice on possible algorithms that could be
possible algorithms that could be employed to satisfy the employed to satisfy the interoperability requirements of each
interoperability requirements of each category, while minimizing category while minimizing negative security and privacy implications.
negative security and privacy implications. Finally, it analyzes Finally, it analyzes several algorithms that have been employed in
several algorithms that have been employed in real implementations to real implementations to meet such requirements and analyzes their
meet such requirements, and analyzes their security and privacy security and privacy properties.
properties.
This document represents the consensus of the Privacy Enhancement and This document represents the consensus of the Privacy Enhancements
Assessment Research Group (PEARG). and Assessments Research Group (PEARG).
2. Terminology 2. Terminology
Transient Numeric Identifier: Transient Numeric Identifier:
A data object in a protocol specification that can be used to A data object in a protocol specification that can be used to
definitely distinguish a protocol object (a datagram, network definitely distinguish a protocol object (a datagram, network
interface, transport protocol endpoint, session, etc.) from all interface, transport-protocol endpoint, session, etc.) from all
other objects of the same type, in a given context. Transient other objects of the same type, in a given context. Transient
numeric identifiers are usually defined as a series of bits, and numeric identifiers are usually defined as a series of bits and
represented using integer values. These identifiers are typically represented using integer values. These identifiers are typically
dynamically selected, as opposed to statically-assigned numeric dynamically selected, as opposed to statically assigned numeric
identifiers (see e.g. [IANA-PROT]). We note that different identifiers (see, e.g., [IANA-PROT]). We note that different
transient numeric identifiers may have additional requirements or transient numeric identifiers may have additional requirements or
properties depending on their specific use in a protocol. We use properties depending on their specific use in a protocol. We use
the term "transient numeric identifier" (or simply "numeric the term "transient numeric identifier" (or simply "numeric
identifier" or "identifier" as short forms) as a generic term to identifier" or "identifier" as short forms) as a generic term to
refer to any data object in a protocol specification that refer to any data object in a protocol specification that
satisfies the identification property stated above. satisfies the identification property stated above.
Failure Severity: Failure Severity:
The interoperability consequences of a failure to comply with the The interoperability consequences of a failure to comply with the
interoperability requirements of a given identifier. Severity interoperability requirements of a given identifier. Severity
considers the worst potential consequence of a failure, determined considers the worst potential consequence of a failure, determined
by the system damage and/or time lost to repair the failure. In by the system damage and/or time lost to repair the failure. In
this document we define two types of failure severity: "soft this document, we define two types of failure severity: "soft
failure" and "hard failure". failure" and "hard failure".
Soft Failure: Soft Failure:
A soft failure is a recoverable condition in which a protocol does A recoverable condition in which a protocol does not operate in
not operate in the prescribed manner but normal operation can be the prescribed manner but normal operation can be resumed
resumed automatically in a short period of time. For example, a automatically in a short period of time. For example, a simple
simple packet-loss event that is subsequently recovered with a packet-loss event that is subsequently recovered with a packet
packet-retransmission can be considered a soft failure. retransmission can be considered a soft failure.
Hard Failure: Hard Failure:
A hard failure is a non-recoverable condition in which a protocol A non-recoverable condition in which a protocol does not operate
does not operate in the prescribed manner or it operates with in the prescribed manner or it operates with excessive degradation
excessive degradation of service. For example, an established TCP of service. For example, an established TCP connection that is
connection that is aborted due to an error condition constitutes, aborted due to an error condition constitutes, from the point of
from the point of view of the transport protocol, a hard failure, view of the transport protocol, a hard failure, since it enters a
since it enters a state from which normal operation cannot be state from which normal operation cannot be resumed.
resumed.
3. Threat Model 3. Threat Model
Throughout this document, we do not consider on-path attacks. That Throughout this document, we do not consider on-path attacks. That
is, we assume an attacker does not have physical or logical access to is, we assume the attacker does not have physical or logical access
the system(s) being attacked, and that the attacker can only observe to the system(s) being attacked and that the attacker can only
traffic explicitly directed to the attacker. Similarly, an attacker observe traffic explicitly directed to the attacker. Similarly, an
cannot observe traffic transferred between a sender and the attacker cannot observe traffic transferred between the sender and
receiver(s) of a target protocol, but may be able to interact with the receiver(s) of a target protocol but may be able to interact with
any of these entities, including by e.g. sending traffic to them to any of these entities, including by, e.g., sending any traffic to
sample transient numeric identifiers employed by the target systems them to sample transient numeric identifiers employed by the target
when communicating with the attacker. hosts when communicating with the attacker.
For example, when analyzing vulnerabilities associated with TCP For example, when analyzing vulnerabilities associated with TCP
Initial Sequence Numbers (ISNs), we consider the attacker is unable Initial Sequence Numbers (ISNs), we consider the attacker is unable
to capture network traffic corresponding to a TCP connection between to capture network traffic corresponding to a TCP connection between
two other hosts. However, we consider the attacker is able to two other hosts. However, we consider the attacker is able to
communicate with any of these hosts (e.g., establish a TCP connection communicate with any of these hosts (e.g., establish a TCP connection
with any of them), to e.g. sample the TCP ISNs employed by these with any of them) to, e.g., sample the TCP ISNs employed by these
systems when communicating with the attacker. hosts when communicating with the attacker.
Similarly, when considering host-tracking attacks based on IPv6 Similarly, when considering host-tracking attacks based on IPv6
interface identifiers, we consider an attacker may learn the IPv6 Interface Identifiers, we consider an attacker may learn the IPv6
address employed by a victim node if e.g. the address becomes exposed address employed by a victim host if, e.g., the address becomes
as a result of the victim node communicating with an attacker- exposed as a result of the victim host communicating with an
operated server. Subsequently, an attacker may perform host-tracking attacker-operated server. Subsequently, an attacker may perform
by probing a set of target addresses composed by a set of target host-tracking by probing a set of target addresses composed by a set
prefixes and the IPv6 interface identifier originally learned by the of target prefixes and the IPv6 Interface Identifier originally
attacker. Alternatively, an attacker may perform host tracking if learned by the attacker. Alternatively, an attacker may perform
e.g. the victim node communicates with an attacker-operated server as host-tracking if, e.g., the victim host communicates with an
it moves from one location to another, those exposing its configured attacker-operated server as it moves from one location to another,
addresses. We note that none of these scenarios requires the thereby exposing its configured addresses. We note that none of
attacker observe traffic not explicitly directed to the attacker. these scenarios require the attacker observe traffic not explicitly
directed to the attacker.
4. Issues with the Specification of Transient Numeric Identifiers 4. Issues with the Specification of Transient Numeric Identifiers
While assessing protocol specifications regarding the use of While assessing IETF protocol specifications regarding the use of
transient numeric identifiers, we have found that most of the issues transient numeric identifiers, we have found that most of the issues
discussed in this document arise as a result of one of the following discussed in this document arise as a result of one of the following
conditions: conditions:
* Protocol specifications that under-specify the requirements for * protocol specifications that under specify their transient numeric
their transient numeric identifiers identifiers
* Protocol specifications that over-specify their transient numeric * protocol specifications that over specify their transient numeric
identifiers identifiers
* Protocol implementations that simply fail to comply with the * protocol implementations that simply fail to comply with the
specified requirements specified requirements
A number of protocol specifications (too many of them) have simply A number of IETF protocol specifications under specified their
overlooked the security and privacy implications of transient numeric transient numeric identifiers, thus leading to implementations that
identifiers [I-D.irtf-pearg-numeric-ids-history]. Examples of them were vulnerable to numerous off-path attacks. Examples of them are
are the specification of TCP ephemeral ports in [RFC0793], the the specification of TCP local ports in [RFC0793] or the
specification of TCP sequence numbers in [RFC0793], or the specification of the DNS ID in [RFC1035].
specification of the DNS Query ID in [RFC1035].
On the other hand, there are a number of protocol specifications that | NOTE: The TCP local port in an active OPEN request is commonly
over-specify some of their associated transient numeric identifiers. | known as the "ephemeral port" of the corresponding TCP
For example, [RFC4291] essentially overloads the semantics of IPv6 | connection [RFC6056].
Interface Identifiers (IIDs) by embedding link-layer addresses in the
IPv6 IIDs, when the interoperability requirement of uniqueness could On the other hand, there are a number of IETF protocol specifications
be achieved in other ways that do not result in negative security and that over specify some of their associated transient numeric
privacy implications [RFC7721]. Similarly, [RFC2460] suggested the identifiers. For example, [RFC4291] essentially overloads the
use of a global counter for the generation of Fragment Identification semantics of IPv6 Interface Identifiers (IIDs) by embedding link-
values, when the interoperability properties of uniqueness per {IPv6 layer addresses in the IPv6 IIDs when the interoperability
Source Address, IPv6 Destination Address} could be achieved with requirement of uniqueness could be achieved in other ways that do not
other algorithms that do not result in negative security and privacy result in negative security and privacy implications [RFC7721].
implications [RFC7739]. Similarly, [RFC2460] suggests the use of a global counter for the
generation of Identification values when the interoperability
requirement of uniqueness per {IPv6 Source Address, IPv6 Destination
Address} could be achieved with other algorithms that do not result
in negative security and privacy implications [RFC7739].
Finally, there are protocol implementations that simply fail to Finally, there are protocol implementations that simply fail to
comply with existing protocol specifications. For example, some comply with existing protocol specifications. For example, some
popular operating systems (notably Microsoft Windows) still fail to popular operating systems still fail to implement transport-protocol
implement transport protocol ephemeral port randomization, as ephemeral port randomization, as recommended in [RFC6056], or TCP
recommended in [RFC6056]. Initial Sequence Number randomization, as recommended in [RFC9293].
5. Protocol Failure Severity 5. Protocol Failure Severity
Section 2 defines the concept of "Failure Severity", along with two Section 2 defines the concept of "failure severity", along with two
types of failure severities that we employ throughout this document: types of failure severities that we employ throughout this document:
soft and hard. soft and hard.
Our analysis of the severity of a failure is performed from the point Our analysis of the severity of a failure is performed from the point
of view of the protocol in question. However, the corresponding of view of the protocol in question. However, the corresponding
severity on the upper protocol (or application) might not be the same severity on the upper protocol (or application) might not be the same
as that of the protocol in question. For example, a TCP connection as that of the protocol in question. For example, a TCP connection
that is aborted might or might not result in a hard failure of the that is aborted might or might not result in a hard failure of the
upper application: if the upper application can establish a new TCP upper application, i.e., if the upper application can establish a new
connection without any impact on the application, a hard failure at TCP connection without any impact on the application, a hard failure
the TCP protocol may have no severity at the application level. On at the TCP protocol may have no severity at the application layer.
the other hand, if a hard failure of a TCP connection results in On the other hand, if a hard failure of a TCP connection results in
excessive degradation of service at the application layer, it will excessive degradation of service at the application layer, it will
also result in a hard failure at the application. also result in a hard failure at the application.
6. Categorizing Transient Numeric Identifiers 6. Categorizing Transient Numeric Identifiers
This section includes a non-exhaustive survey of transient numeric This section includes a non-exhaustive survey of transient numeric
identifiers, which are representative of all the possible identifiers, which are representative of all the possible
combinations of interoperability requirements and failure severities combinations of interoperability requirements and failure severities
found in popular protocols from different layers. Additionally, it found in popular protocols of different layers. Additionally, it
proposes a number of categories that can accommodate these proposes a number of categories that can accommodate these
identifiers based on their interoperability requirements and their identifiers based on their interoperability requirements and their
associated failure severity (soft or hard). associated failure severity (soft or hard).
NOTE: | NOTE: All other transient numeric identifiers that were
All other transient numeric identifiers that were analyzed as part | analyzed as part of this effort could be accommodated into one
of this effort could be accommodated into one of the existing | of the existing categories from Table 1.
categories from Table 1.
+==============+===============================+==================+ +===============+===============================+==================+
| Identifier | Interoperability Requirements | Failure Severity | | Identifier | Interoperability Requirements | Failure Severity |
+==============+===============================+==================+ +===============+===============================+==================+
| IPv6 Frag ID | Uniqueness (for IP address | Soft/Hard (1) | | IPv6 ID | Uniqueness (for IPv6 address | Soft/Hard (1) |
| | pair) | | | | pair) | |
+--------------+-------------------------------+------------------+ +---------------+-------------------------------+------------------+
| IPv6 IID | Uniqueness (and stable within | Soft (3) | | IPv6 IID | Uniqueness (and stable within | Soft (3) |
| | IPv6 prefix) (2) | | | | IPv6 prefix) (2) | |
+--------------+-------------------------------+------------------+ +---------------+-------------------------------+------------------+
| TCP ISN | Monotonically-increasing (4) | Hard (4) | | TCP ISN | Monotonically increasing (4) | Hard (4) |
+--------------+-------------------------------+------------------+ +---------------+-------------------------------+------------------+
| TCP initial | Monotonically-increasing (5) | Hard (5) | | TCP initial | Monotonically increasing (5) | Hard (5) |
| timestamp | | | | timestamp | | |
+--------------+-------------------------------+------------------+ +---------------+-------------------------------+------------------+
| TCP eph. | Uniqueness (for connection | Hard | | TCP ephemeral | Uniqueness (for connection | Hard |
| port | ID) | | | port | ID) | |
+--------------+-------------------------------+------------------+ +---------------+-------------------------------+------------------+
| IPv6 Flow | Uniqueness | None (6) | | IPv6 Flow | Uniqueness | None (6) |
| Label | | | | Label | | |
+--------------+-------------------------------+------------------+ +---------------+-------------------------------+------------------+
| DNS Query ID | Uniqueness | None (7) | | DNS ID | Uniqueness | None (7) |
+--------------+-------------------------------+------------------+ +---------------+-------------------------------+------------------+
Table 1: Survey of Transient Numeric Identifiers Table 1: Survey of Transient Numeric Identifiers
NOTE: NOTE:
(1) (1) While a single collision of IPv6 Identification (ID) values
While a single collision of Fragment ID values would simply lead would simply lead to a single packet drop (and hence, a "soft"
to a single packet drop (and hence a "soft" failure), repeated failure), repeated collisions at high data rates might result in
collisions at high data rates might result in self-propagating self-propagating collisions of IPv6 IDs, thus possibly leading
collisions of Fragment IDs, thus possibly leading to a hard to a hard failure [RFC4963].
failure [RFC4963].
(2) (2) While the interoperability requirements are simply that the
While the interoperability requirements are simply that the Interface Identifier results in a unique IPv6 address, for
Interface ID results in a unique IPv6 address, for operational operational reasons, it is typically desirable that the
reasons it is typically desirable that the resulting IPv6 address resulting IPv6 address (and hence, the corresponding Interface
(and hence the corresponding Interface ID) be stable within each Identifier) be stable within each network [RFC7217] [RFC8064].
network [RFC7217] [RFC8064].
(3) (3) While IPv6 Interface Identifiers must result in unique IPv6
While IPv6 Interface IDs must result in unique IPv6 addresses, addresses, IPv6 Duplicate Address Detection (DAD) [RFC4862]
IPv6 Duplicate Address Detection (DAD) [RFC4862] allows for the allows for the detection of duplicate addresses, and hence, such
detection of duplicate addresses, and hence such Interface ID Interface Identifier collisions can be recovered.
collisions can be recovered.
(4) (4) In theory, there are no interoperability requirements for TCP
In theory, there are no interoperability requirements for TCP Initial Sequence Numbers (ISNs), since the TIME-WAIT state and
Initial Sequence Numbers (ISNs), since the TIME-WAIT state and TCP's "quiet time" concept take care of old segments from
TCP's "quiet time" concept take care of old segments from previous previous incarnations of a connection. However, a widespread
incarnations of a connection. However, a widespread optimization optimization allows for a new incarnation of a previous
allows for a new incarnation of a previous connection to be connection to be created if the ISN of the incoming SYN is
created if the ISN of the incoming SYN is larger than the last larger than the last sequence number seen in that direction for
sequence number seen in that direction for the previous the previous incarnation of the connection. Thus, monotonically
incarnation of the connection. Thus, monotonically-increasing TCP increasing TCP ISNs allow for such optimization to work as
ISNs allow for such optimization to work as expected [RFC6528], expected [RFC6528] and can help avoid connection-establishment
and can help avoid connection-establishment failures. failures.
(5) (5) Strictly speaking, there are no interoperability requirements
Strictly speaking, there are no interoperability requirements for for the *initial* TCP timestamp employed by a TCP instance
the *initial* TCP timestamp employed by a TCP instance (i.e., the (i.e., the TS Value (TSval) in a segment with the SYN bit set).
TS Value (TSval) in a segment with the SYN bit set). However, However, some TCP implementations allow a new incarnation of a
some TCP implementations allow a new incarnation of a previous previous connection to be created if the TSval of the incoming
connection to be created if the TSval of the incoming SYN is SYN is larger than the last TSval seen in that direction for the
larger than the last TSval seen in that direction for the previous previous incarnation of the connection (please see [RFC6191]).
incarnation of the connection (please see [RFC6191]). Thus, Thus, monotonically increasing TCP initial timestamps (across
monotonically-increasing TCP initial timestamps (across connections to the same endpoint) allow for such optimization to
connections to the same endpoint) allow for such optimization to work as expected [RFC6191] and can help avoid connection-
work as expected [RFC6191], and can help avoid connection- establishment failures.
establishment failures.
(6) (6) The IPv6 Flow Label [RFC6437], along with the IPv6 Source
The IPv6 Flow Label [RFC6437], along with the Source and Address and the IPv6 Destination Address, is typically employed
Destination IPv6 addresses, is typically employed for load sharing for load sharing [RFC7098]. Reuse of a Flow Label value for the
[RFC7098]. Reuse of a Flow Label value for the same set {Source same set {Source Address, Destination Address} would typically
Address, Destination Address} would typically cause both flows to cause both flows to be multiplexed onto the same link. However,
be multiplexed onto the same link. However, as long as this does as long as this does not occur deterministically, it will not
not occur deterministically, it will not result in any negative result in any negative implications.
implications.
(7) (7) DNS IDs are employed, together with the IP Source Address, the
DNS Query IDs are employed, together with the Source Address, IP Destination Address, the transport-protocol Source Port, and
Destination Address, Source Port, and Destination Port, to match the transport-protocol Destination Port, to match DNS requests
DNS requests and responses. However, since an implementation and responses. However, since an implementation knows which DNS
knows which DNS requests were sent for that set of {Source requests were sent for that set of {IP Source Address, IP
Address, Destination Address, Source Port, and Destination Port, Destination Address, transport-protocol Source Port, transport-
Query ID}, a collision of Query IDs would result, if anything, in protocol Destination Port, DNS ID}, a collision of DNS IDs would
a small performance penalty (the response would nevertheless be result, if anything, in a small performance penalty (the
discarded when it is found that it does not answer the query sent response would nevertheless be discarded when it is found that
in the corresponding DNS query). it does not answer the query sent in the corresponding DNS
query).
Based on the survey above, we can categorize identifiers as follows: Based on the survey above, we can categorize identifiers as follows:
+=======+======================================+===================+ +=======+======================================+====================+
| Cat # | Category | Sample Proto IDs | | Cat # | Category | Sample Numeric |
+=======+======================================+===================+ | | | IDs |
| 1 | Uniqueness (soft failure) | IPv6 Flow L., DNS | +=======+======================================+====================+
| | | Query ID | | 1 | Uniqueness (soft failure) | IPv6 Flow L., DNS |
+-------+--------------------------------------+-------------------+ | | | ID |
| 2 | Uniqueness (hard failure) | IPv6 Frag ID, TCP | +-------+--------------------------------------+--------------------+
| | | ephemeral port | | 2 | Uniqueness (hard failure) | IPv6 ID, TCP |
+-------+--------------------------------------+-------------------+ | | | ephemeral port |
| 3 | Uniqueness, stable within context | IPv6 IID | +-------+--------------------------------------+--------------------+
| | (soft failure) | | | 3 | Uniqueness, stable within context | IPv6 IID |
+-------+--------------------------------------+-------------------+ | | (soft failure) | |
| 4 | Uniqueness, monotonically increasing | TCP ISN, TCP | +-------+--------------------------------------+--------------------+
| | within context (hard failure) | initial timestamp | | 4 | Uniqueness, monotonically increasing | TCP ISN, TCP |
+-------+--------------------------------------+-------------------+ | | within context (hard failure) | initial timestamp |
+-------+--------------------------------------+--------------------+
Table 2: Identifier Categories Table 2: Identifier Categories
We note that Category #4 could be considered a generalized case of We note that Category #4 could be considered a generalized case of
category #3, in which a monotonically increasing element is added to Category #3, in which a monotonically increasing element is added to
a stable (within context) element, such that the resulting a stable (within context) element, such that the resulting
identifiers are monotonically increasing within a specified context. identifiers are monotonically increasing within a specified context.
That is, the same algorithm could be employed for both #3 and #4, That is, the same algorithm could be employed for both #3 and #4,
given appropriate parameters. given appropriate parameters.
7. Common Algorithms for Transient Numeric Identifier Generation 7. Common Algorithms for Transient Numeric Identifier Generation
The following subsections describe some sample algorithms that can be The following subsections describe some sample algorithms that can be
employed for generating transient numeric identifiers for each of the employed for generating transient numeric identifiers for each of the
categories above, while mitigating the vulnerabilities analyzed in categories above while mitigating the vulnerabilities analyzed in
Section 8 of this document. Section 8 of this document.
All of the variables employed in the algorithms of the following All of the variables employed in the algorithms of the following
subsections are of "unsigned integer" type, except for the "retry" subsections are of "unsigned integer" type, except for the "retry"
variable, that is of (signed) "integer" type. variable, which is of (signed) "integer" type.
7.1. Category #1: Uniqueness (soft failure) 7.1. Category #1: Uniqueness (Soft Failure)
The requirement of uniqueness with a soft failure severity can be The requirement of uniqueness with a soft failure severity can be
complied with a Pseudo-Random Number Generator (PRNG). complied with a Pseudorandom Number Generator (PRNG).
NOTE: | NOTE: Please see [RFC4086] regarding randomness requirements
Please see [RFC4086] regarding randomness requirements for | for security.
security.
While most systems provide access to a PRNG, many of such PRNG While most systems provide access to a PRNG, many of such PRNG
implementations are not cryptographically secure, and therefore might implementations are not cryptographically secure and therefore might
be statistically biased or subject to adversarial influence. For be statistically biased or subject to adversarial influence. For
example, ISO C [C11] rand(3) implementations are not example, ISO C [C11] rand(3) implementations are not
cryptographically secure. cryptographically secure.
NOTE: | NOTE: Section 7.1 ("Uniform Deviates") of [Press1992] discusses
Section 7.1 ("Uniform Deviates") of [Press1992] discusses the | the underlying issues affecting ISO C [C11] rand(3)
underlying issues affecting ISO C [C11] rand(3) implementations. | implementations.
On the other hand, a number of systems provide an interface to a On the other hand, a number of systems provide an interface to a
Cryptographically Secure PRNG (CSPRNG) [RFC8937] [RFC4086], which Cryptographically Secure PRNG (CSPRNG) [RFC4086] [RFC8937], which
guarantees high entropy, unpredictability, and good statistical guarantees high entropy, unpredictability, and good statistical
distribution of the random values generated. For example, GNU/ distribution of the random values generated. For example, GNU/
Linux's CSPRNG implementation is available via the getentropy(3) Linux's CSPRNG implementation is available via the getentropy(3)
interface [GETENTROPY], while OpenBSD's CSPRNG implementation is interface [GETENTROPY], while OpenBSD's CSPRNG implementation is
available via the arc4random(3) and arc4random_uniform(3) interfaces available via the arc4random(3) and arc4random_uniform(3) interfaces
[ARC4RANDOM]. Where available, these CSPRNGs should be preferred [ARC4RANDOM]. Where available, these CSPRNGs should be preferred
over e.g. POSIX [POSIX] random(3) or ISO C [C11] rand(3) over, e.g., POSIX [POSIX] random(3) or ISO C [C11] rand(3)
implementations. implementations.
In scenarios where a CSPRNG is not readily available to select In scenarios where a CSPRNG is not readily available to select
transient numeric identifiers of Category #1, a security and privacy transient numeric identifiers of Category #1, a security and privacy
assessment of employing a regular PRNG should be performed, assessment of employing a regular PRNG should be performed,
supporting the implementation decision. supporting the implementation decision.
NOTE: | NOTE: [Aumasson2018], [Press1992], and [Knuth1983] discuss
[Aumasson2018], [Press1992], and [Knuth1983], discuss theoretical | theoretical and practical aspects of pseudorandom number
and practical aspects of pseudorandom numbers generation, and | generation and provide guidance on how to evaluate PRNGs.
provide guidance on how to evaluate PRNGs.
We note that since the premise is that collisions of transient We note that, since the premise is that collisions of transient
numeric identifiers of this category only leads to soft failures, in numeric identifiers of this category only lead to soft failures, in
many cases, the algorithm might not need to check the suitability of many cases, the algorithm might not need to check the suitability of
a selected identifier (i.e., the suitable_id() function, described a selected identifier (i.e., the suitable_id() function, described
below, could always return "true"). below, could always return "true").
In scenarios where e.g. simultaneous use of a given numeric ID is In scenarios where, e.g., simultaneous use of a given numeric
undesirable and the implementation detects such condition, an identifier is undesirable and an implementation detects such
implementation may opt to select the next available identifier in the condition, the implementation may opt to select the next available
same sequence, or select another random number. Section 7.1.1 is an identifier in the same sequence or select another random number.
implementation of the former strategy, while Section 7.1.2 is an Section 7.1.1 is an implementation of the former strategy, while
implementation of the later. Typically, the algorithm in Section 7.1.2 is an implementation of the latter. Typically, the
Section 7.1.2 results in a more uniform distribution of the generated algorithm in Section 7.1.2 results in a more uniform distribution of
transient numeric identifiers. However, for transient numeric the generated transient numeric identifiers. However, for transient
identifiers where an implementation typically keeps local state about numeric identifiers where an implementation typically keeps local
unsuitable/used identifiers, the algorithm in Section 7.1.2 may state about unsuitable/used identifiers, the algorithm in
require many more iterations than the algorithm in Section 7.1.1 to Section 7.1.2 may require many more iterations than the algorithm in
generate a suitable transient numeric identifier. This will usually Section 7.1.1 to generate a suitable transient numeric identifier.
be affected by the current usage ratio of transient numeric This will usually be affected by the current usage ratio of transient
identifiers (i.e., number of numeric identifiers considered suitable numeric identifiers (i.e., the number of numeric identifiers
/ total number of numeric identifiers) and other parameters. considered suitable / total number of numeric identifiers) and other
Therefore, in such cases many implementations tend to prefer the parameters. Therefore, in such cases, many implementations tend to
algorithm in Section 7.1.1 over the algorithm in Section 7.1.2. prefer the algorithm in Section 7.1.1 over the algorithm in
Section 7.1.2.
7.1.1. Simple Randomization Algorithm 7.1.1. Simple Randomization Algorithm
/* Transient Numeric ID selection function */ /* Transient Numeric ID selection function */
id_range = max_id - min_id + 1; id_range = max_id - min_id + 1;
next_id = min_id + (random() % id_range); next_id = min_id + (random() % id_range);
retry = id_range; retry = id_range;
do { do {
skipping to change at page 12, line 37 skipping to change at line 549
next_id++; next_id++;
} }
retry--; retry--;
} while (retry > 0); } while (retry > 0);
return ERROR; return ERROR;
NOTE: NOTE:
random() is a PRNG that returns a pseudo-random unsigned integer
random() is a PRNG that returns a pseudorandom unsigned integer
number of appropriate size. Beware that "adapting" the length of number of appropriate size. Beware that "adapting" the length of
the output of random() with a modulo operator (e.g., C-language's the output of random() with a modulo operator (e.g., C language's
"%") may change the distribution of the PRNG. To preserve a "%") may change the distribution of the PRNG. To preserve a
uniform distribution, the rejection sampling technique uniform distribution, the rejection sampling technique
[Romailler2020] can be used. [Romailler2020] can be used.
The function suitable_id() can check, when possible and desirable, suitable_id() is a function that checks, if possible and
whether a selected transient numeric identifier is suitable (e.g. desirable, whether a candidate numeric identifier is suitable
it is not already in use). Depending on how/where the numeric (e.g., whether it is in use or has been recently employed).
identifier is used, it may or may not be possible (or even Depending on how/where the numeric identifier is used, it may or
desirable) to check whether the numeric identifier is in use (or may not be possible (or even desirable) to check whether the
whether it has been recently employed). When an identifier is numeric identifier is suitable.
found to be unsuitable, this algorithm selects the next available
numeric identifier in sequence.
Even when this algorithm selects numeric IDs randomly, it is All the variables (in this algorithm and all the others algorithms
biased towards the first available numeric ID after a sequence of discussed in this document) are unsigned integers.
unavailable numeric IDs. For example, if this algorithm is
employed for transport protocol ephemeral port randomization
[RFC6056] and the local list of unsuitable port numbers (e.g.,
registered port numbers that should not be used for ephemeral
ports) is significant, an attacker may actually have a
significantly better chance of guessing a port number.
All the variables (in this and all the algorithms discussed in When an identifier is found to be unsuitable, this algorithm selects
this document) are unsigned integers. the next available numeric identifier in sequence. Thus, even when
this algorithm selects numeric identifiers randomly, it is biased
towards the first available numeric identifier after a sequence of
unavailable numeric identifiers. For example, if this algorithm is
employed for transport-protocol ephemeral port randomization
[RFC6056] and the local list of unsuitable port numbers (e.g.,
registered port numbers that should not be used for ephemeral ports)
is significant, an attacker may actually have a significantly better
chance of guessing an ephemeral port number.
Assuming the randomness requirements for the PRNG are met (see Assuming the randomness requirements for the PRNG are met (see
[RFC4086]), this algorithm does not suffer from any of the issues [RFC4086]), this algorithm does not suffer from any of the issues
discussed in Section 8. discussed in Section 8.
7.1.2. Another Simple Randomization Algorithm 7.1.2. Another Simple Randomization Algorithm
The following pseudo-code illustrates another algorithm for selecting The following pseudocode illustrates another algorithm for selecting
a random transient numeric identifier which, in the event a selected a random transient numeric identifier where, in the event a selected
identifier is found to be unsuitable (e.g., already in use), another identifier is found to be unsuitable (e.g., already in use), another
identifier is randomly selected: identifier is randomly selected:
/* Transient Numeric ID selection function */ /* Transient Numeric ID selection function */
id_range = max_id - min_id + 1; id_range = max_id - min_id + 1;
retry = id_range; retry = id_range;
do { do {
next_id = min_id + (random() % id_range); next_id = min_id + (random() % id_range);
skipping to change at page 14, line 23 skipping to change at line 607
if (suitable_id(next_id)) { if (suitable_id(next_id)) {
return next_id; return next_id;
} }
retry--; retry--;
} while (retry > 0); } while (retry > 0);
return ERROR; return ERROR;
This algorithm might be unable to select a transient numeric NOTE:
identifier (i.e., return "ERROR") even if there are suitable
identifiers available, in cases where a large number of identifiers
are found to be unsuitable (e.g. "in use").
The same considerations from Section 7.1.1 with respect to the random() is a PRNG that returns a pseudorandom unsigned integer
properties of random() and the adaptation of its output length apply number of appropriate size. Beware that "adapting" the length of
to this algorithm. the output of random() with a modulo operator (e.g., C language's
"%") may change the distribution of the PRNG. To preserve a
uniform distribution, the rejection sampling technique
[Romailler2020] can be used.
suitable_id() is a function that checks, if possible and
desirable, whether a candidate numeric identifier is suitable
(e.g., if it is not already in use). Depending on how/where the
numeric identifier is used, it may or may not be possible (or even
desirable) to check whether the numeric identifier is in use (or
whether it has been recently employed).
When an identifier is found to be unsuitable, this algorithm selects
another random numeric identifier. Thus, this algorithm might be
unable to select a transient numeric identifier (i.e., return
"ERROR"), even if there are suitable identifiers available, in cases
where a large number of identifiers are found to be unsuitable (e.g.,
"in use").
Assuming the randomness requirements for the PRNG are met (see Assuming the randomness requirements for the PRNG are met (see
[RFC4086]), this algorithm does not suffer from any of the issues [RFC4086]), this algorithm does not suffer from any of the issues
discussed in Section 8. discussed in Section 8.
7.2. Category #2: Uniqueness (hard failure) 7.2. Category #2: Uniqueness (Hard Failure)
One of the most trivial approaches for generating unique transient One of the most trivial approaches for generating a unique transient
numeric identifier (with a hard failure severity) is to reduce the numeric identifier (with a hard failure severity) is to reduce the
identifier reuse frequency by generating the numeric identifiers with identifier reuse frequency by generating the numeric identifiers with
a monotonically-increasing function (e.g. linear). As a result, any a monotonically increasing function (e.g., linear). As a result, any
of the algorithms described in Section 7.4 ("Category #4: Uniqueness, of the algorithms described in Section 7.4 ("Category #4: Uniqueness,
monotonically increasing within context (hard failure)") can be Monotonically Increasing within Context (Hard Failure)") can be
readily employed for complying with the requirements of this readily employed for complying with the requirements of this
transient numeric identifier category. transient numeric identifier category.
In cases where suitability (e.g. uniqueness) of the selected In cases where suitability (e.g., uniqueness) of the selected
identifiers can be definitely assessed by the local system, any of identifiers can be definitely assessed by the local system, any of
the algorithms described in Section 7.1 ("Category #1: Uniqueness the algorithms described in Section 7.1 ("Category #1: Uniqueness
(soft failure)") can be readily employed for complying with the (Soft Failure)") can be readily employed for complying with the
requirements of this numeric identifier category. requirements of this numeric identifier category.
NOTE: | NOTE: In the case of, e.g., TCP ephemeral ports or TCP ISNs, a
In the case of e.g. TCP ephemeral ports or TCP ISNs, a transient | transient numeric identifier that might seem suitable from the
numeric identifier that might seem suitable from the perspective | perspective of the local system might actually be unsuitable
of the local system, might actually be unsuitable from the | from the perspective of the remote system (e.g., because there
perspective of the remote system (e.g., because there is state | is state associated with the selected identifier at the remote
associated with the selected identifier at the remote system). | system). Therefore, in such cases, it is not possible to
Therefore, in such cases it is not possible to employ the | employ the algorithms from Section 7.1 ("Category #1:
algorithms from Section 7.1 ("Category #1: Uniqueness (soft | Uniqueness (Soft Failure)").
failure)").
7.3. Category #3: Uniqueness, stable within context (soft failure) 7.3. Category #3: Uniqueness, Stable within Context (Soft Failure)
The goal of the following algorithm is to produce identifiers that The goal of the following algorithm is to produce identifiers that
are stable for a given context (identified by "CONTEXT"), but that are stable for a given context (identified by "CONTEXT") but that
change when the aforementioned context changes. change when the aforementioned context changes.
In order to avoid storing in memory the transient numeric identifiers In order to avoid storing the transient numeric identifiers computed
computed for each CONTEXT, the following algorithm employs a for each CONTEXT in memory, the following algorithm employs a
calculated technique (as opposed to keeping state in memory) to calculated technique (as opposed to keeping state in memory) to
generate a stable transient numeric identifier for each given generate a stable transient numeric identifier for each given
context. context.
/* Transient Numeric ID selection function */ /* Transient Numeric ID selection function */
id_range = max_id - min_id + 1; id_range = max_id - min_id + 1;
retry = 0; retry = 0;
skipping to change at page 15, line 47 skipping to change at line 692
if (suitable_id(next_id)) { if (suitable_id(next_id)) {
return next_id; return next_id;
} }
retry++; retry++;
} while (retry <= MAX_RETRIES); } while (retry <= MAX_RETRIES);
return ERROR; return ERROR;
NOTE:
CONTEXT is the concatenation of all the elements that define a
given context.
F() is a pseudorandom function (PRF). It must not be computable
from the outside (without knowledge of the secret key). F() must
also be difficult to reverse, such that it resists attempts to
obtain the secret key, even when given samples of the output of
F() and knowledge or control of the other input parameters. F()
should produce an output of at least as many bits as required for
the transient numeric identifier. SipHash-2-4 (128-bit key,
64-bit output) [SipHash] and BLAKE3 (256-bit key, arbitrary-length
output) [BLAKE3] are two possible options for F(). Alternatively,
F() could be implemented with a keyed hash message authentication
code (HMAC) [RFC2104]. HMAC-SHA-256 [FIPS-SHS] would be one
possible option for such implementation alternative. Note: Use of
HMAC-MD5 [RFC1321] or HMAC-SHA1 [FIPS-SHS] are not recommended for
F() [RFC6151] [RFC6194]. The result of F() is no more secure than
the secret key, and therefore, "secret_key" must be unknown to the
attacker and must be of a reasonable length. "secret_key" must
remain stable for a given CONTEXT, since otherwise, the numeric
identifiers generated by this algorithm would not have the desired
stability properties (i.e., stable for a given CONTEXT). In most
cases, "secret_key" should be selected with a PRNG (see [RFC4086]
for recommendations on choosing secrets) at an appropriate time
and stored in stable or volatile storage (as necessary) for future
use.
suitable_id() checks whether a candidate numeric identifier has
suitable uniqueness properties.
In this algorithm, the function F() provides a stateless and stable In this algorithm, the function F() provides a stateless and stable
per-CONTEXT offset, where CONTEXT is the concatenation of all the per-CONTEXT offset, where CONTEXT is the concatenation of all the
elements that define the given context. elements that define the given context.
For example, if this algorithm is expected to produce IPv6 IIDs For example, if this algorithm is expected to produce IPv6 IIDs that
that are unique per network interface and SLAAC autoconfiguration are unique per network interface and Stateless Address
prefix, the CONTEXT should be the concatenation of e.g. the Autoconfiguration (SLAAC) prefix, CONTEXT should be the concatenation
network interface index and the SLAAC autoconfiguration prefix of, e.g., the network interface index and the SLAAC autoconfiguration
(please see [RFC7217] for an implementation of this algorithm for prefix (please see [RFC7217] for an implementation of this algorithm
generation of stable IPv6 IIDs). for generation of stable IPv6 addresses).
F() is a pseudorandom function (PRF). It must not be computable from
the outside (without knowledge of the secret key). F() must also be
difficult to reverse, such that it resists attempts to obtain the
secret_key, even when given samples of the output of F() and
knowledge or control of the other input parameters. F() should
produce an output of at least as many bits as required for the
transient numeric identifier. SipHash-2-4 (128-bit key, 64-bit
output) [SipHash] and BLAKE3 (256-bit key, arbitrary-length output)
[BLAKE3] are two possible options for F(). Alternatively, F() could
be implemented with a keyed-hash message authentication code (HMAC)
[RFC2104]. HMAC-SHA-256 [FIPS-SHS] would be one possible option for
such implementation alternative. Note: Use of HMAC-MD5 [RFC1321] or
HMAC-SHA1 [FIPS-SHS] are not recommended for F() [RFC6151] [RFC6194].
The result of F() is no more secure than the secret key, and
therefore 'secret_key' must be unknown to the attacker, and must be
of a reasonable length. 'secret_key' must remain stable for a given
CONTEXT, since otherwise the numeric identifiers generated by this
algorithm would not have the desired stability properties (i.e.,
stable for a given CONTEXT). In most cases, 'secret_key' should be
selected with a PRNG (see [RFC4086] for recommendations on choosing
secrets) at an appropriate time, and stored in stable or volatile
storage (as necessary) for future use.
The result of F() is stored in the variable 'offset', which may take The result of F() is stored in the variable "offset", which may take
any value within the storage type range, since we are restricting the any value within the storage type range, since we are restricting the
resulting identifier to be in the range [min_id, max_id] in a similar resulting identifier to be in the range [min_id, max_id] in a similar
way as in the algorithm described in Section 7.1.1. way as in the algorithm described in Section 7.1.1.
suitable_id() checks whether the candidate identifier has suitable As noted above, suitable_id() checks whether a candidate numeric
uniqueness properties. Collisions (i.e., an identifier that is not identifier has suitable uniqueness properties. Collisions (i.e., an
unique) are recovered by incrementing the 'retry' variable and identifier that is not unique) are recovered by incrementing the
recomputing F(), up to a maximum of MAX_RETRIES times. However, "retry" variable and recomputing F(), up to a maximum of MAX_RETRIES
recovering from collisions will usually result in identifiers that times. However, recovering from collisions will usually result in
fail to remain constant for the specified context. This is normally identifiers that fail to remain constant for the specified context.
acceptable when the probability of collisions is small, as in the This is normally acceptable when the probability of collisions is
case of e.g. IPv6 IIDs resulting from SLAAC [RFC7217] [RFC8981]. small, as in the case of, e.g., IPv6 IIDs resulting from SLAAC
[RFC7217] [RFC8981].
For obvious reasons, the transient numeric identifiers generated with For obvious reasons, the transient numeric identifiers generated with
this algorithm allow for network activity correlation and this algorithm allow for network activity correlation and
fingerprinting within "CONTEXT". However, this is essentially a fingerprinting within "CONTEXT". However, this is essentially a
design goal of this category of transient numeric identifiers. design goal of this category of transient numeric identifiers.
7.4. Category #4: Uniqueness, monotonically increasing within context 7.4. Category #4: Uniqueness, Monotonically Increasing within Context
(hard failure) (Hard Failure)
7.4.1. Per-context Counter Algorithm 7.4.1. Per-Context Counter Algorithm
One possible way of selecting unique monotonically-increasing One possible way of selecting unique monotonically increasing
identifiers (per context) is to employ a per-context counter. Such identifiers (per context) is to employ a per-context counter. Such
an algorithm could be described as follows: an algorithm could be described as follows:
/* Transient Numeric ID selection function */ /* Transient Numeric ID selection function */
id_range = max_id - min_id + 1; id_range = max_id - min_id + 1;
retry = id_range; retry = id_range;
id_inc = increment() % id_range; id_inc = increment() % id_range;
if( (next_id = lookup_counter(CONTEXT)) == ERROR){ if( (next_id = lookup_counter(CONTEXT)) == ERROR){
skipping to change at page 17, line 43 skipping to change at line 793
store_counter(CONTEXT, next_id); store_counter(CONTEXT, next_id);
return next_id; return next_id;
} }
retry = retry - id_inc; retry = retry - id_inc;
} while (retry > 0); } while (retry > 0);
return ERROR; return ERROR;
NOTES: NOTE:
CONTEXT is the concatenation of all the elements that define a
given context.
increment() returns a small integer that is employed to increment increment() returns a small integer that is employed to increment
the current counter value to obtain the next transient numeric the current counter value to obtain the next transient numeric
identifier. This value must be much smaller than the number of identifier. This value must be larger than or equal to 1, and
possible values for the numeric IDs (i.e., "id_range"). Most much smaller than the number of possible values for the numeric
implementations of this algorithm employ a constant increment of identifiers (i.e., "id_range"). Most implementations of this
1. Using a value other than 1 can help mitigate some information algorithm employ a constant increment of 1. Using a value other
leakages (please see below), at the expense of a possible increase than 1 can help mitigate some information leakages (please see
in the numeric ID reuse frequency. below) at the expense of a possible increase in the numeric
identifier reuse frequency. The code above makes sure that the
The code above makes sure that the increment employed in the increment employed in the algorithm (id_inc) is always smaller
algorithm (id_inc) is always smaller than the number of possible than the number of possible values for the numeric identifiers
values for the numeric IDs (i.e., "max_id - min_d + 1"). However, (i.e., "max_id - min_d + 1"). However, as noted above, this value
as noted above, this value must also be much smaller than the must also be much smaller than the number of possible values for
number of possible values for the numeric IDs. the numeric identifiers.
lookup_counter() is a function that returns the current counter lookup_counter() is a function that returns the current counter
for a given context, or an error condition if that counter does for a given context or an error condition if that counter does not
not exist. exist.
random() is a PRNG that returns a pseudorandom unsigned integer
number of appropriate size. Beware that "adapting" the length of
the output of random() with a modulo operator (e.g., C language's
"%") may change the distribution of the PRNG. To preserve a
uniform distribution, the rejection sampling technique
[Romailler2020] can be used.
store_counter() is a function that saves a counter value for a store_counter() is a function that saves a counter value for a
given context. given context.
suitable_id() is a function that checks whether the resulting suitable_id() checks whether a candidate numeric identifier has
identifier is acceptable (e.g., whether it is not already in use, suitable uniqueness properties.
etc.).
Essentially, whenever a new identifier is to be selected, the Essentially, whenever a new identifier is to be selected, the
algorithm checks whether a counter for the corresponding context algorithm checks whether a counter for the corresponding context
exists. If does, the value of such counter is incremented to obtain exists. If it does, the value of such counter is incremented to
the new transient numeric identifier, and the counter is updated. If obtain the new transient numeric identifier, and the counter is
no counter exists for such context, a new counter is created and updated. If no counter exists for such context, a new counter is
initialized to a random value, and used as the selected transient created and initialized to a random value and used as the selected
numeric identifier. This algorithm produces a per-context counter, transient numeric identifier. This algorithm produces a per-context
which results in one monotonically-increasing function for each counter, which results in one monotonically increasing function for
context. Since each counter is initialized to a random value, the each context. Since each counter is initialized to a random value,
resulting values are unpredictable by an off-path attacker. the resulting values are unpredictable by an off-path attacker.
The choice of id_inc has implications on both the security and The choice of id_inc has implications on both the security and
privacy properties of the resulting identifiers, but also on the privacy properties of the resulting identifiers and also on the
corresponding interoperability properties. On one hand, minimizing corresponding interoperability properties. On one hand, minimizing
the increments generally minimizes the identifier reuse frequency, the increments generally minimizes the identifier reuse frequency,
albeit at increased predictability. On the other hand, if the albeit at increased predictability. On the other hand, if the
increments are randomized, predictability of the resulting increments are randomized, predictability of the resulting
identifiers is reduced, and the information leakage produced by identifiers is reduced, and the information leakage produced by
global constant increments is mitigated. However, using larger global constant increments is mitigated. However, using larger
increments than necessary can result in higher numeric ID reuse increments than necessary can result in higher numeric identifier
frequency. reuse frequency.
This algorithm has the following drawbacks: This algorithm has the following drawbacks:
* It requires an implementation to store each per-CONTEXT counter in * It requires an implementation to store each per-context counter in
memory. If, as a result of resource management, the counter for a memory. If, as a result of resource management, the counter for a
given context must be removed, the last transient numeric given context must be removed, the last transient numeric
identifier value used for that context will be lost. Thus, if identifier value used for that context will be lost. Thus, if an
subsequently an identifier needs to be generated for the same identifier subsequently needs to be generated for the same
context, the corresponding counter will need to be recreated and context, the corresponding counter will need to be recreated and
reinitialized to a random value, thus possibly leading to reuse/ reinitialized to a random value, thus possibly leading to reuse/
collision of numeric identifiers. collision of numeric identifiers.
* Keeping one counter for each possible "context" may in some cases * Keeping one counter for each possible "context" may in some cases
be considered too onerous in terms of memory requirements. be considered too onerous in terms of memory requirements.
Otherwise, the identifiers produced by this algorithm do not suffer Otherwise, the identifiers produced by this algorithm do not suffer
from the other issues discussed in Section 8. from the other issues discussed in Section 8.
7.4.2. Simple PRF-Based Algorithm 7.4.2. Simple PRF-Based Algorithm
The goal of this algorithm is to produce monotonically-increasing The goal of this algorithm is to produce monotonically increasing
transient numeric identifiers (for each given context), with a transient numeric identifiers (for each given context) with a
randomized initial value. For example, if the identifiers being randomized initial value. For example, if the identifiers being
generated must be monotonically-increasing for each {IP Source generated must be monotonically increasing for each {Source Address,
Address, IP Destination Address} set, then each possible combination Destination Address} set, then each possible combination of {Source
of {IP Source Address, IP Destination Address} should have a separate Address, Destination Address} should have a separate monotonically
monotonically-increasing sequence, that starts at a different random increasing sequence that starts at a different random value.
value.
Instead of maintaining a per-context counter (as in the algorithm Instead of maintaining a per-context counter (as in the algorithm
from Section 7.4.1), the following algorithm employs a calculated from Section 7.4.1), the following algorithm employs a calculated
technique to maintain a random offset for each possible context. technique to maintain a random offset for each possible context.
/* Initialization code */ /* Initialization code */
counter = 0; counter = 0;
/* Transient Numeric ID selection function */ /* Transient Numeric ID selection function */
skipping to change at page 20, line 29 skipping to change at line 907
if (suitable_id(next_id)) { if (suitable_id(next_id)) {
return next_id; return next_id;
} }
retry = retry - id_inc; retry = retry - id_inc;
} while (retry > 0); } while (retry > 0);
return ERROR; return ERROR;
In the algorithm above, the function F() provides a (stateless) NOTE:
unpredictable offset for each given context (as identified by
'CONTEXT').
F() is a PRF, with the same properties as those specified for F() in CONTEXT is the concatenation of all the elements that define a
Section 7.3. given context. For example, if this algorithm is expected to
produce identifiers that are monotonically increasing for each set
{Source Address, Destination Address}, CONTEXT should be the
concatenation of Source Address and Destination Address.
CONTEXT is the concatenation of all the elements that define a given increment() has the same properties and requirements as those
context. For example, if this algorithm is expected to produce specified for increment() in Section 7.4.1.
identifiers that are monotonically-increasing for each set (Source IP
Address, Destination IP Address), CONTEXT should be the concatenation
of these two IP addresses.
The function F() provides a "per-CONTEXT" fixed offset within the F() is a PRF, with the same properties as those specified for F()
numeric identifier "space". Both the 'offset' and 'counter' in Section 7.3.
variables may take any value within the storage type range since we
are restricting the resulting identifier to be in the range [min_id, suitable_id() checks whether a candidate numeric identifier has
suitable uniqueness properties.
In the algorithm above, the function F() provides a stateless,
stable, and unpredictable offset for each given context (as
identified by "CONTEXT"). Both the "offset" and "counter" variables
may take any value within the storage type range since we are
restricting the resulting identifier to be in the range [min_id,
max_id] in a similar way as in the algorithm described in max_id] in a similar way as in the algorithm described in
Section 7.1.1. This allows us to simply increment the 'counter' Section 7.1.1. This allows us to simply increment the "counter"
variable and rely on the unsigned integer to wrap around. variable and rely on the unsigned integer to wrap around.
The result of F() is no more secure than the secret key, and The result of F() is no more secure than the secret key, and
therefore 'secret_key' must be unknown to the attacker, and must be therefore, "secret_key" must be unknown to the attacker and must be
of a reasonable length. 'secret_key' must remain stable for a given of a reasonable length. "secret_key" must remain stable for a given
CONTEXT, since otherwise the numeric identifiers generated by this CONTEXT, since otherwise, the numeric identifiers generated by this
algorithm would not have the desired stability properties (i.e., algorithm would not have the desired properties (i.e., monotonically
monotonically-increasing for a given CONTEXT). In most cases, increasing for a given CONTEXT). In most cases, "secret_key" should
'secret_key' should be selected with a PRNG (see [RFC4086] for be selected with a PRNG (see [RFC4086] for recommendations on
recommendations on choosing secrets) at an appropriate time, and choosing secrets) at an appropriate time and stored in stable or
stored in stable or volatile storage (as necessary) for future use. volatile storage (as necessary) for future use.
It should be noted that, since this algorithm uses a global counter It should be noted that, since this algorithm uses a global counter
("counter") for selecting identifiers (i.e., all counters share the ("counter") for selecting identifiers (i.e., all counters share the
same increments space), this algorithm results in an information same increment space), this algorithm results in an information
leakage (as described in Section 8.2). For example, if this leakage (as described in Section 8.2). For example, if this
algorithm were used for selecting TCP ephemeral ports, and an algorithm was used for selecting TCP ephemeral ports and an attacker
attacker could force a client to periodically establish a new TCP could force a client to periodically establish a new TCP connection
connection to an attacker-controlled system (or through an attacker- to an attacker-controlled system (or through an attacker-observable
observable routing path), the attacker could subtract consecutive routing path), the attacker could subtract consecutive Source Port
source port values to obtain the number of outgoing TCP connections values to obtain the number of outgoing TCP connections established
established globally by the victim host within that time period (up globally by the victim host within that time period (up to wrap-
to wrap-around issues and five-tuple collisions, of course). This around issues and five-tuple collisions, of course). This
information leakage could be partially mitigated by employing small information leakage could be partially mitigated by employing small
random values for the increments (i.e., increment() function), random values for the increments (i.e., increment() function),
instead of having increment() return the constant "1". instead of having increment() return the constant "1".
We nevertheless note that an improved mitigation of this information We nevertheless note that an improved mitigation of this information
leakage could be more successfully achieved by employing the leakage could be more successfully achieved by employing the
algorithm from Section 7.4.3, instead. algorithm from Section 7.4.3, instead.
7.4.3. Double-PRF Algorithm 7.4.3. Double-PRF Algorithm
A trade-off between maintaining a single global 'counter' variable A trade-off between maintaining a single global "counter" variable
and maintaining 2**N 'counter' variables (where N is the width of the and maintaining 2**N "counter" variables (where N is the width of the
result of F()), could be achieved as follows. The system would keep result of F()) could be achieved as follows. The system would keep
an array of TABLE_LENGTH values, which would provide a separation of an array of TABLE_LENGTH values, which would provide a separation of
the increment space into multiple buckets. This improvement could be the increment space into multiple buckets. This improvement could be
incorporated into the algorithm from Section 7.4.2 as follows: incorporated into the algorithm from Section 7.4.2 as follows:
/* Initialization code */ /* Initialization code */
for(i = 0; i < TABLE_LENGTH; i++) { for(i = 0; i < TABLE_LENGTH; i++) {
table[i] = random(); table[i] = random();
} }
skipping to change at page 22, line 33 skipping to change at line 999
if (suitable_id(next_id)) { if (suitable_id(next_id)) {
return next_id; return next_id;
} }
retry = retry - id_inc; retry = retry - id_inc;
} while (retry > 0); } while (retry > 0);
return ERROR; return ERROR;
'table[]' could be initialized with random values, as indicated by NOTE:
the initialization code in the pseudo-code above.
Both F() and G() are PRFs, with the same properties as those required increment() has the same properties and requirements as those
for F() in Section 7.3. specified for increment() in Section 7.4.1.
The results of F() and G() are no more secure than their respective Both F() and G() are PRFs, with the same properties as those
secret keys ('secret_key1' and 'secret_key2', respectively), and required for F() in Section 7.3. The results of F() and G() are
therefore both secret keys must be unknown to the attacker, and must no more secure than their respective secret keys ("secret_key1"
be of a reasonable length. Both secret keys must remain stable for and "secret_key2", respectively), and therefore, both secret keys
the given CONTEXT, since otherwise the transient numeric identifiers must be unknown to the attacker and must be of a reasonable
generated by this algorithm would not have the desired stability length. Both secret keys must remain stable for the given
properties (i.e., monotonically-increasing for a given CONTEXT). In CONTEXT, since otherwise, the transient numeric identifiers
most cases, both secret keys should be selected with a PRNG (see generated by this algorithm would not have the desired properties
[RFC4086] for recommendations on choosing secrets) at an appropriate (i.e., monotonically increasing for a given CONTEXT). In most
time, and stored in stable or volatile storage (as necessary) for cases, both secret keys should be selected with a PRNG (see
future use. [RFC4086] for recommendations on choosing secrets) at an
appropriate time and stored in stable or volatile storage (as
necessary) for future use.
The 'table[]' array assures that successive transient numeric "table[]" could be initialized with random values, as indicated by
identifiers for a given context will be monotonically-increasing. the initialization code in the pseudocode above.
Since the increments space is separated into TABLE_LENGTH different
The "table[]" array assures that successive transient numeric
identifiers for a given context will be monotonically increasing.
Since the increment space is separated into TABLE_LENGTH different
spaces, the identifier reuse frequency will be (probabilistically) spaces, the identifier reuse frequency will be (probabilistically)
lower than that of the algorithm in Section 7.4.2. That is, the lower than that of the algorithm in Section 7.4.2. That is, the
generation of an identifier for one given context will not generation of an identifier for one given context will not
necessarily result in increments in the identifier sequence of other necessarily result in increments in the identifier sequence of other
contexts. It is interesting to note that the size of 'table[]' does contexts. It is interesting to note that the size of "table[]" does
not limit the number of different identifier sequences, but rather not limit the number of different identifier sequences but rather
separates the *increment space* into TABLE_LENGTH different spaces. separates the *increment space* into TABLE_LENGTH different spaces.
The selected transient numeric identifier sequence will be obtained The selected transient numeric identifier sequence will be obtained
by adding the corresponding entry from 'table[]' to the value in the by adding the corresponding entry from "table[]" to the value in the
'offset' variable, which selects the actual identifier sequence space "offset" variable, which selects the actual identifier sequence space
(as in the algorithm from Section 7.4.2). (as in the algorithm from Section 7.4.2).
An attacker can perform traffic analysis for any "increment space" An attacker can perform traffic analysis for any "increment space"
(i.e., context) into which the attacker has "visibility" -- namely, (i.e., context) into which the attacker has "visibility" -- namely,
the attacker can force a system to generate identifiers for the attacker can force a system to generate identifiers for
G(CONTEXT, secret_key2), where the result of G() identifies the G(CONTEXT, secret_key2), where the result of G() identifies the
target "increment space". However, the attacker's ability to perform target "increment space". However, the attacker's ability to perform
traffic analysis is very reduced when compared to the simple PRF- traffic analysis is very reduced when compared to the simple PRF-
based identifiers (described in Section 7.4.2) and the predictable based identifiers (described in Section 7.4.2) and the predictable
linear identifiers (described in Appendix A.1). Additionally, an linear identifiers (described in Appendix A.1). Additionally, an
skipping to change at page 23, line 46 skipping to change at line 1062
8. Common Vulnerabilities Associated with Transient Numeric Identifiers 8. Common Vulnerabilities Associated with Transient Numeric Identifiers
8.1. Network Activity Correlation 8.1. Network Activity Correlation
An identifier that is predictable within a given context allows for An identifier that is predictable within a given context allows for
network activity correlation within that context. network activity correlation within that context.
For example, a stable IPv6 Interface Identifier allows for network For example, a stable IPv6 Interface Identifier allows for network
activity to be correlated within the context in which the Interface activity to be correlated within the context in which the Interface
Identifier is stable [RFC7721]. A stable-per-network IPv6 Interface Identifier is stable [RFC7721]. A stable per-network IPv6 Interface
Identifier (as in [RFC7217]) allows for network activity correlation Identifier (as in [RFC7217]) allows for network activity correlation
within a network, whereas a constant IPv6 Interface Identifier (that within a network, whereas a constant IPv6 Interface Identifier (which
remains constant across networks) allows not only network activity remains constant across networks) allows not only network activity
correlation within the same network, but also across networks ("host correlation within the same network but also across networks ("host-
tracking"). tracking").
Similarly, an implementation that generates TCP ISNs with a global Similarly, an implementation that generates TCP ISNs with a global
counter could allow for fingerprinting and network activity counter could allow for fingerprinting and network activity
correlation across networks, since an attacker could passively infer correlation across networks, since an attacker could passively infer
the identity of the victim based on the TCP ISNs employed for the identity of the victim based on the TCP ISNs employed for
subsequent communication instances. Similarly, an implementation subsequent communication instances. Similarly, an implementation
that generates predictable IPv6 Fragment Identification values could that generates predictable IPv6 Identification values could be
be subject to fingerprinting attacks (see e.g. [Bellovin2002]). subject to fingerprinting attacks (see, e.g., [Bellovin2002]).
8.2. Information Leakage 8.2. Information Leakage
Transient numeric identifiers that result in specific patterns can Transient numeric identifiers that result in specific patterns can
produce an information leakage to other communicating entities. For produce an information leakage to other communicating entities. For
example, it is common to generate transient numeric identifiers with example, it is common to generate transient numeric identifiers with
an algorithm such as: an algorithm such as:
ID = offset(CONTEXT) + mono(CONTEXT); ID = offset(CONTEXT) + mono(CONTEXT);
This generic expression generates identifiers by adding a This generic expression generates identifiers by adding a
monotonically-increasing function (e.g. linear) to a randomized monotonically increasing function (e.g., linear) to a randomized
offset. offset() is constant within a given context, whereas mono() offset. offset() is constant within a given context, whereas mono()
produces a monotonically-increasing sequence for the given context. produces a monotonically increasing sequence for the given context.
Identifiers generated with this expression will generally be Identifiers generated with this expression will generally be
predictable within CONTEXT. predictable within CONTEXT.
The predictability of mono(), irrespective of the predictability of The predictability of mono(), irrespective of the predictability of
offset(), can leak information that may be of use to attackers. For offset(), can leak information that may be of use to attackers. For
example, a node that selects ephemeral port numbers as in: example, a node that selects transport-protocol ephemeral port
numbers, as in:
ephemeral_port = offset(Dest_IP) + mono() ephemeral_port = offset(IP_Dst_Addr) + mono()
that is, with a per-destination offset, but a global mono() function that is, with a per-destination offset but a global mono() function
(e.g., a global counter), will leak information about total number of (e.g., a global counter), will leak information about the total
outgoing connections that have been issued by the vulnerable number of outgoing connections that have been issued by the
implementation. vulnerable implementation.
Similarly, a node that generates Fragment Identification values as Similarly, a node that generates IPv6 Identification values as in:
in:
Frag_ID = offset(IP_src_addr, IP_dst_addr) + mono() ID = offset(IP_Src_Addr, IP_Dst_Addr) + mono()
will leak out information about the total number of fragmented will leak out information about the total number of fragmented
packets that have been transmitted by the vulnerable implementation. packets that have been transmitted by the vulnerable implementation.
The vulnerabilities described in The vulnerabilities described in [Sanfilippo1998a],
[Sanfilippo1998b], and [Sanfilippo1999] are all associated with the
[Sanfilippo1998a], [Sanfilippo1998b], and [Sanfilippo1999] are all use of a global mono() function (i.e., with a global and constant
associated with the use of a global mono() function (i.e., with a "CONTEXT") -- particularly when it is a linear function (constant
global and constant "context") -- particularly when it is a linear increments of 1).
function (constant increments of 1).
Predicting transient numeric identifiers can be of help for other Predicting transient numeric identifiers can be of help for other
types of attacks. For example, predictable TCP ISNs can open the types of attacks. For example, predictable TCP ISNs can open the
door to trivial connection-reset and data injection attacks (see door to trivial connection-reset and data injection attacks (see
Section 8.6). Section 8.6).
8.3. Fingerprinting 8.3. Fingerprinting
Fingerprinting is the capability of an attacker to identify or re- Fingerprinting is the capability of an attacker to identify or
identify a visiting user, user agent or device via configuration reidentify a visiting user, user agent, or device via configuration
settings or other observable characteristics. Observable protocol settings or other observable characteristics. Observable protocol
objects and characteristics can be employed to identify/re-identify a objects and characteristics can be employed to identify/reidentify
variety of entities, ranging from the underlying hardware or various entities. These entities can range from the underlying
Operating System (vendor, type and version), to the user itself (i.e. hardware or operating system (OS) (vendor, type, and version) to the
his/her identity). [EFF] illustrates web browser-based user. [EFF] illustrates web-browser-based fingerprinting, but
fingerprinting, but similar techniques can be applied at other layers similar techniques can be applied at other layers and protocols,
and protocols, whether alternatively or in conjunction with it. whether alternatively or in conjunction with it.
Transient numeric identifiers are one of the observable protocol Transient numeric identifiers are one of the observable protocol
components that could be leveraged for fingerprinting purposes. That components that could be leveraged for fingerprinting purposes. That
is, an attacker could sample transient numeric identifiers to infer is, an attacker could sample transient numeric identifiers to infer
the algorithm (and its associated parameters, if any) for generating the algorithm (and its associated parameters, if any) for generating
such identifiers, possibly revealing the underlying Operating System such identifiers, possibly revealing the underlying OS vendor, type,
(OS) vendor, type, and version. This information could possibly be and version. This information could possibly be further leveraged in
further leveraged in conjunction with other fingerprinting techniques conjunction with other fingerprinting techniques and sources.
and sources.
Evasion of protocol-stack fingerprinting can prove to be a very Evasion of protocol-stack fingerprinting can prove to be a very
difficult task: most systems make use of a wide variety of protocols, difficult task, i.e., most systems make use of a wide variety of
each of which have a large number of parameters that can be set to protocols, each of which have a large number of parameters that can
arbitrary values or generated with a variety of algorithms with be set to arbitrary values or generated with a variety of algorithms
multiple parameters. with multiple parameters.
NOTE: | NOTE: General protocol-based fingerprinting is discussed in
General protocol-based fingerprinting is discussed in [RFC6973], | [RFC6973], along with guidelines to mitigate the associated
along with guidelines to mitigate the associated vulnerability. | vulnerability. [Fyodor1998] and [Fyodor2006] are classic
[Fyodor1998] and [Fyodor2006] are classic references on Operating | references on OS detection via TCP/IP stack fingerprinting.
System detection via TCP/IP stack fingerprinting. Nmap [nmap] is | Network Mapper [nmap] is probably the most popular tool for
probably the most popular tool for remote OS identification via | remote OS identification via active TCP/IP stack
active TCP/IP stack fingerprinting. p0f [Zalewski2012], on the | fingerprinting. p0f [Zalewski2012], on the other hand, is a
other hand, is a tool for performing remote OS detection via | tool for performing remote OS detection via passive TCP/IP
passive TCP/IP stack fingerprinting. Finally, [TBIT] is a TCP | stack fingerprinting. Finally, [TBIT] is a TCP fingerprinting
fingerprinting tool that aims at characterizing the behaviour of a | tool that aims at characterizing the behavior of a remote TCP
remote TCP peer based on active probes, and which has been widely | peer based on active probes, which has been widely used in the
used in the research community. | research community.
Algorithms that, from the perspective of an observer (e.g., the Algorithms that, from the perspective of an observer (e.g., the
legitimate communicating peer), result in specific values or legitimate communicating peer), result in specific values or patterns
patterns, will allow for at least some level of fingerprinting. For will allow for at least some level of fingerprinting. For example,
example, the algorithm from Section 7.3 will typically allow the algorithm from Section 7.3 will typically allow fingerprinting
fingerprinting within the context where the resulting identifiers are within the context where the resulting identifiers are stable.
stable. Similarly, the algorithms from Section 7.4 will result in a Similarly, the algorithms from Section 7.4 will result in
monotonically-increasing sequences within a given context, thus monotonically increasing sequences within a given context, thus
allowing for at least some level of fingerprinting (when the other allowing for at least some level of fingerprinting (when the other
communicating entity can correlate different sampled identifiers as communicating entity can correlate different sampled identifiers as
belonging to the same monotonically-increasing sequence). belonging to the same monotonically increasing sequence).
Thus, where possible, algorithms from Section 7.1 should be preferred Thus, where possible, algorithms from Section 7.1 should be preferred
over algorithms that result in specific values or patterns. over algorithms that result in specific values or patterns.
8.4. Exploitation of the Semantics of Transient Numeric Identifiers 8.4. Exploitation of the Semantics of Transient Numeric Identifiers
Identifiers that are not semantically opaque tend to be more Identifiers that are not semantically opaque tend to be more
predictable than semantically-opaque identifiers. For example, a MAC predictable than semantically opaque identifiers. For example, a
address contains an OUI (Organizationally-Unique Identifier) which Media Access Control (MAC) address contains an Organizationally
may identify the vendor that manufactured the corresponding network Unique Identifier (OUI), which may identify the vendor that
interface card. This can be leveraged by an attacker trying to manufactured the corresponding network interface card. This can be
"guess" MAC addresses, who has some knowledge about the possible leveraged by an attacker trying to "guess" MAC addresses, who has
Network Interface Card (NIC) vendor. some knowledge about the possible Network Interface Card (NIC)
vendor.
[RFC7707] discusses a number of techniques to reduce the search space [RFC7707] discusses a number of techniques to reduce the search space
when performing IPv6 address-scanning attacks by leveraging the when performing IPv6 address-scanning attacks by leveraging the
semantics of the IIDs produced by traditional SLAAC algorithms semantics of IPv6 IIDs.
(eventually replaced by [RFC7217]) that embed MAC addresses in the
IID of IPv6 addresses.
8.5. Exploitation of Collisions of Transient Numeric Identifiers 8.5. Exploitation of Collisions of Transient Numeric Identifiers
In many cases, the collision of transient network identifiers can In many cases, the collision of transient network identifiers can
have a hard failure severity (or result in a hard failure severity if have a hard failure severity (or result in a hard failure severity if
an attacker can cause multiple collisions deterministically, one an attacker can cause multiple collisions deterministically, one
after another). For example, predictable Fragment Identification after another). For example, predictable IP Identification values
values open the door to Denial of Service (DoS) attacks (see e.g. open the door to Denial of Service (DoS) attacks (see, e.g.,
[RFC5722].). [RFC5722].).
8.6. Exploitation of Predictable Transient Numeric Identifiers for 8.6. Exploitation of Predictable Transient Numeric Identifiers for
Injection Attacks Injection Attacks
Some protocols rely on "sequence numbers" for the validation of Some protocols rely on "sequence numbers" for the validation of
incoming packets. For example, TCP employs sequence numbers for incoming packets. For example, TCP employs sequence numbers for
reassembling TCP segments, while IPv4 and IPv6 employ Fragment reassembling TCP segments, while IPv4 and IPv6 employ Identification
Identification values for reassembling IPv4 and IPv6 fragments values for reassembling IPv4 and IPv6 fragments (respectively).
(respectively). Lacking built-in cryptographic mechanisms for Lacking built-in cryptographic mechanisms for validating packets,
validating packets, these protocols are therefore vulnerable to on- these protocols are therefore vulnerable to on-path data (see, e.g.,
path data (see e.g. [Joncheray1995]) and/or control-information (see [Joncheray1995]) and/or control-information (see, e.g., [RFC4953] and
e.g. [RFC4953] and [RFC5927]) injection attacks. The extent to [RFC5927]) injection attacks. The extent to which these protocols
which these protocols may resist off-path (i.e. "blind") injection may resist off-path (i.e., "blind") injection attacks depends on
attacks depends on whether the associated "sequence numbers" are whether the associated "sequence numbers" are predictable and the
predictable, and effort required to successfully predict a valid effort required to successfully predict a valid "sequence number"
"sequence number" (see e.g. [RFC4953] and [RFC5927]). (see, e.g., [RFC4953] and [RFC5927]).
We note that the use of unpredictable "sequence numbers" is a We note that the use of unpredictable "sequence numbers" is a
completely-ineffective mitigation for on-path injection attacks, and completely ineffective mitigation for on-path injection attacks and
also a mostly-ineffective mitigation for off-path (i.e. "blind") also a mostly ineffective mitigation for off-path (i.e., "blind")
injection attacks. However, many legacy protocols (such as TCP) do injection attacks. However, many legacy protocols (such as TCP) do
not natively incorporate cryptographic mitigations, but rather only not incorporate cryptographic mitigations as part of the core
as optional features (see e.g. [RFC5925]), if at all available. protocol but rather as optional features (see, e.g., [RFC5925]), if
Additionally, ad-hoc use of cryptographic mitigations might not be available at all. Additionally, ad hoc use of cryptographic
sufficient to relieve a protocol implementation of generating mitigations might not be sufficient to relieve a protocol
appropriate transient numeric identifiers. For example, use of the implementation of generating appropriate transient numeric
Transport Layer Security (TLS) protocol [RFC8446] with TCP will identifiers. For example, use of the Transport Layer Security (TLS)
protect the application protocol, but will not help to mitigate e.g. protocol [RFC8446] with TCP will protect the application protocol but
TCP-based connection-reset attacks (see e.g. [RFC4953]). Similarly, will not help to mitigate, e.g., TCP-based connection-reset attacks
use of SEcure Neighbor Discovery (SEND) [RFC3971] will still imply (see, e.g., [RFC4953]). Similarly, use of SEcure Neighbor Discovery
reliance on the successful reassembly of IPv6 fragments in those (SEND) [RFC3971] will still imply reliance on the successful
cases where SEND packets do not fit into the link Maximum reassembly of IPv6 fragments in those cases where SEND packets do not
Transmission Unit (MTU) (see [RFC6980]). fit into the link Maximum Transmission Unit (MTU) (see [RFC6980]).
8.7. Cryptanalysis 8.7. Cryptanalysis
A number of algorithms discussed in this document (such as those A number of algorithms discussed in this document (such as those
described in Section 7.4.2 and Section 7.4.3) rely on PRFs. described in Sections 7.4.2 and 7.4.3) rely on PRFs. Implementations
Implementations that employ weak PRFs or keys of inappropriate size that employ weak PRFs or keys of inappropriate size can be subject to
can be subject to cryptanalysis, where an attacker can obtain the cryptanalysis, where an attacker can obtain the secret key employed
secret key employed for the PRF, predict numeric identifiers, etc. for the PRF, predict numeric identifiers, etc.
Furthermore, an implementation that overloads the semantics of the Furthermore, an implementation that overloads the semantics of the
secret key can result in more trivial cryptanalysis, possibly secret key can result in more trivial cryptanalysis, possibly
resulting in the leakage of the value employed for the secret key. resulting in the leakage of the value employed for the secret key.
NOTE: | NOTE: [IPID-DEV] describes two vulnerable transient numeric
[IPID-DEV] describes two vulnerable transient numeric ID | identifier generators that employ cryptographically weak hash
generators that employ cryptographically-weak hash functions. | functions. Additionally, one of such implementations employs
Additionally, one of such implementations employs 32-bits of a | 32 bits of a kernel address as the secret key for a hash
kernel address as the secret key for a hash function, and | function, and therefore, successful cryptanalysis leaks the
therefore successful cryptanalysis leaks the aforementioned kernel | aforementioned kernel address, allowing for Kernel Address
address, allowing for Kernel Address Space Layout Randomization | Space Layout Randomization (KASLR) [KASLR] bypass.
(KASLR) [KASLR] bypass.
9. Vulnerability Assessment of Transient Numeric Identifiers 9. Vulnerability Assessment of Transient Numeric Identifiers
The following subsections analyze possible vulnerabilities associated The following subsections analyze possible vulnerabilities associated
with the algorithms described in Section 7. with the algorithms described in Section 7.
9.1. Category #1: Uniqueness (soft failure) 9.1. Category #1: Uniqueness (Soft Failure)
Possible vulnerabilities associated with the algorithms from Possible vulnerabilities associated with the algorithms from
Section 7.1 include: Section 7.1 include the following:
* Use of flawed PRNGs (please see e.g. [Zalewski2001], * use of flawed PRNGs (please see, e.g., [Zalewski2001],
[Zalewski2002], [Klein2007] and [CVEs]). [Zalewski2002], [Klein2007], and [CVEs])
* Inadvertently affecting the distribution of an otherwise suitable * inadvertently affecting the distribution of an otherwise suitable
PRNG (please see e.g. [Romailler2020]). PRNG (please see, e.g., [Romailler2020])
Where available, CSPRNGs should be preferred over regular PRNGs such Where available, CSPRNGs should be preferred over regular PRNGs, such
as e.g. POSIX random(3) implementations. In scenarios where a as, e.g., POSIX random(3) implementations. In scenarios where a
CSPRNG is not readily available, a security and privacy assessment of CSPRNG is not readily available, a security and privacy assessment of
employing a regular PRNG should be performed, supporting the employing a regular PRNG should be performed, supporting the
implementation decision. implementation decision.
NOTE: | NOTE: Please see [RFC4086] regarding randomness requirements
Please see [RFC4086] regarding randomness requirements for | for security. [Aumasson2018], [Press1992], and [Knuth1983]
security. [Aumasson2018], [Press1992], and [Knuth1983], discuss | discuss theoretical and practical aspects of random number
theoretical and practical aspects of random numbers generation, | generation and provide guidance on how to evaluate PRNGs.
and provide guidance on how to evaluate PRNGs.
When employing a PRNG, many implementations "adapt" the length of its When employing a PRNG, many implementations "adapt" the length of its
output with a modulo operator (e.g., C language's "%"), possibly output with a modulo operator (e.g., C language's "%"), possibly
changing the distribution of the output of the PRNG. changing the distribution of the output of the PRNG.
For example, consider an implementation that employs the following For example, consider an implementation that employs the following
code: code:
id = random() % 50000; id = random() % 50000;
This example implementation means to obtain a transient numeric This example implementation means to obtain a transient numeric
identifier in the range 0-49999. If random() produces e.g. a identifier in the range 0-49999. If random() produces, e.g., a
pseudorandom number of 16 bits (with uniform distribution), the pseudorandom number of 16 bits (with uniform distribution), the
selected transient numeric identifier will have a non-uniform selected transient numeric identifier will have a nonuniform
distribution with the numbers in the range 0-15535 having double- distribution with the numbers in the range 0-15535 having double
frequency than the numbers in the range 15536-49999. frequency than the numbers in the range 15536-49999.
NOTE: | NOTE: For example, in our sample code, both an output of 10 and
For example, in our sample code both an output of 10 and output of | output of 50010 from the random() function will result in an
50010 from the random() function will result in an 'id' value of | "id" value of 10.
10.
This effect is reduced if the PRNG produces an output that is much This effect is reduced if the PRNG produces an output that is much
longer than the length implied by the modulo operation. We note that longer than the length implied by the modulo operation. We note that
to preserve a uniform distribution, the rejection sampling technique to preserve a uniform distribution, the rejection sampling technique
[Romailler2020] can be used. [Romailler2020] can be used.
Use of algorithms other than PRNGs for generating identifiers of this Use of algorithms other than PRNGs for generating identifiers of this
category is discouraged. category is discouraged.
9.2. Category #2: Uniqueness (hard failure) 9.2. Category #2: Uniqueness (Hard Failure)
As noted in Section 7.2, this category can employ the same algorithms As noted in Section 7.2, this category can employ the same algorithms
as Category #4, since a monotonically-increasing sequence tends to as Category #4, since a monotonically increasing sequence tends to
minimize the transient numeric identifier reuse frequency. minimize the transient numeric identifier reuse frequency.
Therefore, the vulnerability analysis in Section 9.4 also applies to Therefore, the vulnerability analysis in Section 9.4 also applies to
this category. this category.
Additionally, as noted in Section 7.2, some transient numeric Additionally, as noted in Section 7.2, some transient numeric
identifiers of this category might be able to use the algorithms from identifiers of this category might be able to use the algorithms from
Section 7.1, in which case the same considerations as in Section 9.1 Section 7.1, in which case the same considerations as in Section 9.1
would apply. would apply.
9.3. Category #3: Uniqueness, stable within context (soft failure) 9.3. Category #3: Uniqueness, Stable within Context (Soft Failure)
Possible vulnerabilities associated with the algorithms from Possible vulnerabilities associated with the algorithms from
Section 7.3 are: Section 7.3 are the following:
1. Use of weak PRFs, or inappropriate secret keys (whether * Use of weak PRFs or inappropriate secret keys (whether
inappropriate selection or inappropriate size) could allow for inappropriate selection or inappropriate size) could allow for
cryptanalysis, which could eventually be exploited by an attacker cryptanalysis, which could eventually be exploited by an attacker
to predict future transient numeric identifiers. to predict future transient numeric identifiers.
2. Since the algorithm generates a unique and stable identifier * Since the algorithm generates a unique and stable identifier
within a specified context, it may allow for network activity within a specified context, it may allow for network activity
correlation and fingerprinting within the specified context. correlation and fingerprinting within the specified context.
9.4. Category #4: Uniqueness, monotonically increasing within context 9.4. Category #4: Uniqueness, Monotonically Increasing within Context
(hard failure) (Hard Failure)
The algorithm described in Section 7.4.1 for generating identifiers The algorithm described in Section 7.4.1 for generating identifiers
of Category #4 will result in an identifiable pattern (i.e. a of Category #4 will result in an identifiable pattern (i.e., a
monotonically-increasing sequence) for the transient numeric monotonically increasing sequence) for the transient numeric
identifiers generated for each CONTEXT, and thus will allow for identifiers generated for each CONTEXT, and thus will allow for
fingerprinting and network activity correlation within each CONTEXT. fingerprinting and network activity correlation within each CONTEXT.
On the other hand, a simple way to generalize and analyze the On the other hand, a simple way to generalize and analyze the
algorithms described in Section 7.4.2 and Section 7.4.3 for algorithms described in Sections 7.4.2 and 7.4.3 for generating
generating identifiers of Category #4, is as follows: identifiers of Category #4 is as follows:
/* Transient Numeric ID selection function */ /* Transient Numeric ID selection function */
id_range = max_id - min_id + 1; id_range = max_id - min_id + 1;
retry = id_range; retry = id_range;
id_inc = increment() % id_range; id_inc = increment() % id_range;
do { do {
update_mono(CONTEXT, id_inc); update_mono(CONTEXT, id_inc);
next_id = min_id + (offset(CONTEXT) + \ next_id = min_id + (offset(CONTEXT) + \
skipping to change at page 31, line 6 skipping to change at line 1369
return next_id; return next_id;
} }
retry = retry - id_inc; retry = retry - id_inc;
} while (retry > 0); } while (retry > 0);
return ERROR; return ERROR;
NOTE: NOTE:
increment() returns a small integer that is employed to generate a increment() returns a small integer that is employed to generate a
monotonically-increasing function. Most implementations employ a monotonically increasing function. Most implementations employ a
constant value for "increment()" (usually 1). The value returned constant value for "increment()" (usually 1). The value returned
by increment() must be much smaller than the value computed for by increment() must be much smaller than the value computed for
"id_range". "id_range".
update_mono(CONTEXT, id_inc) increments the counter corresponding update_mono(CONTEXT, id_inc) increments the counter corresponding
to CONTEXT by "id_inc". to CONTEXT by "id_inc".
mono(CONTEXT) reads the counter corresponding to CONTEXT. mono(CONTEXT) reads the counter corresponding to CONTEXT.
Essentially, an identifier (next_id) is generated by adding a Essentially, an identifier (next_id) is generated by adding a
monotonically-increasing function (mono()) to an offset value, monotonically increasing function (mono()) to an offset value, which
unknown to the attacker and stable for given context (CONTEXT). is unknown to the attacker and stable for given context (CONTEXT).
The following aspects of the algorithm should be considered: The following aspects of the algorithm should be considered:
* For the most part, it is the offset() function that results in * For the most part, it is the offset() function that results in
identifiers that are unpredictable by an off-patch attacker. identifiers that are unpredictable by an off-patch attacker.
While the resulting sequence is known to be monotonically- While the resulting sequence is known to be monotonically
increasing, the use of a randomized offset value makes the increasing, the use of a randomized offset value makes the
resulting values unknown to the attacker. resulting values unknown to the attacker.
* The most straightforward "stateless" implementation of offset() is * The most straightforward "stateless" implementation of offset() is
with a PRF that takes the values that identify the context and a with a PRF that takes the values that identify the context and a
"secret_key" (not shown in the figure above) as arguments. secret key (not shown in the figure above) as arguments.
* One possible implementation of mono() would be to have mono() * One possible implementation of mono() would be to have mono()
internally employ a single counter (as in the algorithm from internally employ a single counter (as in the algorithm from
Section 7.4.2), or map the increments for different contexts into Section 7.4.2) or map the increments for different contexts into a
a number of counters/buckets, such that the number of counters number of counters/buckets, such that the number of counters that
that need to be maintained in memory is reduced (as in the need to be maintained in memory is reduced (as in the "Double-PRF
algorithm from algorithm in Section 7.4.3). Algorithm" from Section 7.4.3).
* In all cases, a monotonically increasing function is implemented * In all cases, a monotonically increasing function is implemented
by incrementing the previous value of a counter by increment() by incrementing the previous value of a counter by increment()
units. In the most trivial case, increment() could return the units. In the most trivial case, increment() could return the
constant "1". But increment() could also be implemented to return constant "1". But increment() could also be implemented to return
small random integers such that the increments are unpredictable small random integers such that the increments are unpredictable
(see Appendix A of [RFC7739]). This represents a trade-off (see Appendix A.2 of this document). This represents a trade-off
between the unpredictability of the resulting transient numeric between the unpredictability of the resulting transient numeric
IDs and the transient numeric ID reuse frequency. identifiers and the transient numeric identifier reuse frequency.
Considering the generic algorithm illustrated above, we can identify Considering the generic algorithm illustrated above, we can identify
the following possible vulnerabilities: the following possible vulnerabilities:
* Since the algorithms for this category are similar to those of * Since the algorithms for this category are similar to those of
Section 9.3, with the addition of a monotonically-increasing Section 9.3, with the addition of a monotonically increasing
function, all the issues discussed in Section 9.3 ("Category #3: function, all the issues discussed in Section 9.3 ("Category #3:
Uniqueness, stable within context (soft failure)") also apply to Uniqueness, Stable within Context (Soft Failure)") also apply to
this case. this case.
* mono() can be correlated to the number of identifiers generated * mono() can be correlated to the number of identifiers generated
for a given context (CONTEXT). Thus, if mono() spans more than for a given context (CONTEXT). Thus, if mono() spans more than
the necessary context, the "increments" could be leaked to other the necessary context, the "increments" could be leaked to other
parties, thus disclosing information about the number of parties, thus disclosing information about the number of
identifiers that have been generated by the algorithm for all identifiers that have been generated by the algorithm for all
contexts. This is information disclosure becomes more evident contexts. This information disclosure becomes more evident when
when an implementation employs a constant increment of 1. For an implementation employs a constant increment of 1. For example,
example, an implementation where mono() is actually a single an implementation where mono() is actually a single global counter
global counter, will unnecessarily leak information the number of will unnecessarily leak information about the number of
identifiers that have been generated by the algorithm (globally, identifiers that have been generated by the algorithm (globally,
for all contexts). [Fyodor2003] is one example of how such for all contexts). [Fyodor2003] describes one example of how such
information leakages can be exploited. We note that limiting the information leakages can be exploited. We note that limiting the
span of the increments space will require a larger number of span of the increment space will require a larger number of
counters to be stored in memory (i.e., a larger value for the counters to be stored in memory (i.e., a larger value for the
TABLE_LENGTH parameter of the algorithm in Section 7.4.3). TABLE_LENGTH parameter of the algorithm in Section 7.4.3).
* Transient numeric identifiers generated with the algorithms * Transient numeric identifiers generated with the algorithms
described in Section 7.4.2 and Section 7.4.3 will normally allow described in Sections 7.4.2 and 7.4.3 will normally allow for
for fingerprinting within CONTEXT since, for such context, the fingerprinting within CONTEXT since, for such context, the
resulting identifiers will have an identifiable pattern (i.e. a resulting identifiers will have an identifiable pattern (i.e., a
monotonically-increasing sequence). monotonically increasing sequence).
10. IANA Considerations 10. IANA Considerations
This document has no IANA actions. This document has no IANA actions.
11. Security Considerations 11. Security Considerations
The entire document is about the security and privacy implications of This entire document is about the security and privacy implications
transient numeric identifiers. of transient numeric identifiers. [RFC9416] recommends that protocol
[I-D.gont-numeric-ids-sec-considerations] recommends that protocol
specifications specify the interoperability requirements of their specifications specify the interoperability requirements of their
transient numeric identifiers, perform a vulnerability assessment of transient numeric identifiers, perform a vulnerability assessment of
their transient numeric identifiers, and suggest an algorithm for their transient numeric identifiers, and recommend an algorithm for
generating each of their transient numeric identifiers. This generating each of their transient numeric identifiers. This
document analyzes possible algorithms (and their implications) that document analyzes possible algorithms (and their implications) that
could be employed to comply with the interoperability properties of could be employed to comply with the interoperability requirements of
most common categories of transient numeric identifiers, while the most common categories of transient numeric identifiers while
minimizing the associated negative security and privacy implications. minimizing the associated negative security and privacy implications.
12. Acknowledgements 12. References
The authors would like to thank (in alphabetical order) Bernard
Aboba, Jean-Philippe Aumasson, Steven Bellovin, Luis Leon Cardenas
Graide, Spencer Dawkins, Theo de Raadt, Guillermo Gont, Joseph
Lorenzo Hall, Gre Norcie, Colin Perkins, Vincent Roca, Shivan Sahib,
Rich Salz, Martin Thomson, and Michael Tuexen, for providing valuable
comments on earlier versions of this document.
The authors would like to thank Shivan Sahib and Christopher Wood,
for their guidance during the publication process of this document.
The authors would like to thank Jean-Philippe Aumasson and Mathew D.
Green (John Hopkins University) for kindly answering a number of
questions.
The authors would like to thank Diego Armando Maradona for his magic
and inspiration.
13. References 12.1. Normative References
13.1. Normative References [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
DOI 10.17487/RFC0791, September 1981,
<https://www.rfc-editor.org/info/rfc791>.
[RFC0793] Postel, J., "Transmission Control Protocol", RFC 793, [RFC0793] Postel, J., "Transmission Control Protocol", RFC 793,
DOI 10.17487/RFC0793, September 1981, DOI 10.17487/RFC0793, September 1981,
<https://www.rfc-editor.org/info/rfc793>. <https://www.rfc-editor.org/info/rfc793>.
[RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence [RFC1035] Mockapetris, P., "Domain names - implementation and
Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
2012, <https://www.rfc-editor.org/info/rfc6528>. November 1987, <https://www.rfc-editor.org/info/rfc1035>.
[RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport-
Protocol Port Randomization", BCP 156, RFC 6056,
DOI 10.17487/RFC6056, January 2011,
<https://www.rfc-editor.org/info/rfc6056>.
[RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
June 2010, <https://www.rfc-editor.org/info/rfc5925>.
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, [RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
DOI 10.17487/RFC0791, September 1981, DOI 10.17487/RFC1321, April 1992,
<https://www.rfc-editor.org/info/rfc791>. <https://www.rfc-editor.org/info/rfc1321>.
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
December 1998, <https://www.rfc-editor.org/info/rfc2460>. December 1998, <https://www.rfc-editor.org/info/rfc2460>.
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", STD 86, RFC 8200,
DOI 10.17487/RFC8200, July 2017,
<https://www.rfc-editor.org/info/rfc8200>.
[RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker,
"Randomness Requirements for Security", BCP 106, RFC 4086, "Randomness Requirements for Security", BCP 106, RFC 4086,
DOI 10.17487/RFC4086, June 2005, DOI 10.17487/RFC4086, June 2005,
<https://www.rfc-editor.org/info/rfc4086>. <https://www.rfc-editor.org/info/rfc4086>.
[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 4291, DOI 10.17487/RFC4291, February Architecture", RFC 4291, DOI 10.17487/RFC4291, February
2006, <https://www.rfc-editor.org/info/rfc4291>. 2006, <https://www.rfc-editor.org/info/rfc4291>.
[RFC8981] Gont, F., Krishnan, S., Narten, T., and R. Draves,
"Temporary Address Extensions for Stateless Address
Autoconfiguration in IPv6", RFC 8981,
DOI 10.17487/RFC8981, February 2021,
<https://www.rfc-editor.org/info/rfc8981>.
[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless [RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
Address Autoconfiguration", RFC 4862, Address Autoconfiguration", RFC 4862,
DOI 10.17487/RFC4862, September 2007, DOI 10.17487/RFC4862, September 2007,
<https://www.rfc-editor.org/info/rfc4862>. <https://www.rfc-editor.org/info/rfc4862>.
[RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments", [RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments",
RFC 5722, DOI 10.17487/RFC5722, December 2009, RFC 5722, DOI 10.17487/RFC5722, December 2009,
<https://www.rfc-editor.org/info/rfc5722>. <https://www.rfc-editor.org/info/rfc5722>.
[RFC7217] Gont, F., "A Method for Generating Semantically Opaque [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
Interface Identifiers with IPv6 Stateless Address "Network Time Protocol Version 4: Protocol and Algorithms
Autoconfiguration (SLAAC)", RFC 7217, Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010,
DOI 10.17487/RFC7217, April 2014, <https://www.rfc-editor.org/info/rfc5905>.
<https://www.rfc-editor.org/info/rfc7217>.
[RFC8064] Gont, F., Cooper, A., Thaler, D., and W. Liu, [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
"Recommendation on Stable IPv6 Interface Identifiers", Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
RFC 8064, DOI 10.17487/RFC8064, February 2017, June 2010, <https://www.rfc-editor.org/info/rfc5925>.
<https://www.rfc-editor.org/info/rfc8064>.
[RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport-
Protocol Port Randomization", BCP 156, RFC 6056,
DOI 10.17487/RFC6056, January 2011,
<https://www.rfc-editor.org/info/rfc6056>.
[RFC6151] Turner, S. and L. Chen, "Updated Security Considerations
for the MD5 Message-Digest and the HMAC-MD5 Algorithms",
RFC 6151, DOI 10.17487/RFC6151, March 2011,
<https://www.rfc-editor.org/info/rfc6151>.
[RFC6191] Gont, F., "Reducing the TIME-WAIT State Using TCP
Timestamps", BCP 159, RFC 6191, DOI 10.17487/RFC6191,
April 2011, <https://www.rfc-editor.org/info/rfc6191>.
[RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme, [RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme,
"IPv6 Flow Label Specification", RFC 6437, "IPv6 Flow Label Specification", RFC 6437,
DOI 10.17487/RFC6437, November 2011, DOI 10.17487/RFC6437, November 2011,
<https://www.rfc-editor.org/info/rfc6437>. <https://www.rfc-editor.org/info/rfc6437>.
[RFC6191] Gont, F., "Reducing the TIME-WAIT State Using TCP [RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence
Timestamps", BCP 159, RFC 6191, DOI 10.17487/RFC6191, Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February
April 2011, <https://www.rfc-editor.org/info/rfc6191>. 2012, <https://www.rfc-editor.org/info/rfc6528>.
[RFC7217] Gont, F., "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address
Autoconfiguration (SLAAC)", RFC 7217,
DOI 10.17487/RFC7217, April 2014,
<https://www.rfc-editor.org/info/rfc7217>.
[RFC7323] Borman, D., Braden, B., Jacobson, V., and R. [RFC7323] Borman, D., Braden, B., Jacobson, V., and R.
Scheffenegger, Ed., "TCP Extensions for High Performance", Scheffenegger, Ed., "TCP Extensions for High Performance",
RFC 7323, DOI 10.17487/RFC7323, September 2014, RFC 7323, DOI 10.17487/RFC7323, September 2014,
<https://www.rfc-editor.org/info/rfc7323>. <https://www.rfc-editor.org/info/rfc7323>.
[RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, [RFC8064] Gont, F., Cooper, A., Thaler, D., and W. Liu,
DOI 10.17487/RFC1321, April 1992, "Recommendation on Stable IPv6 Interface Identifiers",
<https://www.rfc-editor.org/info/rfc1321>. RFC 8064, DOI 10.17487/RFC8064, February 2017,
<https://www.rfc-editor.org/info/rfc8064>.
[RFC6151] Turner, S. and L. Chen, "Updated Security Considerations
for the MD5 Message-Digest and the HMAC-MD5 Algorithms",
RFC 6151, DOI 10.17487/RFC6151, March 2011,
<https://www.rfc-editor.org/info/rfc6151>.
[RFC1035] Mockapetris, P., "Domain names - implementation and [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6
specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, (IPv6) Specification", STD 86, RFC 8200,
November 1987, <https://www.rfc-editor.org/info/rfc1035>. DOI 10.17487/RFC8200, July 2017,
<https://www.rfc-editor.org/info/rfc8200>.
13.2. Informative References [RFC8981] Gont, F., Krishnan, S., Narten, T., and R. Draves,
"Temporary Address Extensions for Stateless Address
Autoconfiguration in IPv6", RFC 8981,
DOI 10.17487/RFC8981, February 2021,
<https://www.rfc-editor.org/info/rfc8981>.
[KASLR] PaX Team, "Address Space Layout Randomization", [RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)",
<https://pax.grsecurity.net/docs/aslr.txt>. STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
<https://www.rfc-editor.org/info/rfc9293>.
[IANA-PROT] 12.2. Informative References
IANA, "Protocol Registries",
<https://www.iana.org/protocols>.
[RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., [ARC4RANDOM]
Morris, J., Hansen, M., and R. Smith, "Privacy OpenBSD, "arc4random(3)", Library Functions Manual,
Considerations for Internet Protocols", RFC 6973, September 2019, <https://man.openbsd.org/arc4random>.
DOI 10.17487/RFC6973, July 2013,
<https://www.rfc-editor.org/info/rfc6973>.
[Fyodor1998] [Aumasson2018]
Fyodor, "Remote OS Detection via TCP/IP Stack Aumasson, J-P., "Serious Cryptography: A Practical
Fingerprinting", Phrack Magazine, Volume 9, Issue 54, Introduction to Modern Encryption", No Starch Press, Inc.,
1998, <http://www.phrack.org/archives/issues/54/9.txt>. ISBN-10 1-59327-826-8, November 2017.
[Fyodor2006] [Bellovin1989]
Lyon, G., "Chapter 8. Remote OS Detection", 2006, Bellovin, S., "Security Problems in the TCP/IP Protocol
<https://nmap.org/book/osdetect.html>. Suite", Computer Communications Review, Vol. 19, No. 2,
pp. 32-48, April 1989,
<https://www.cs.columbia.edu/~smb/papers/ipext.pdf>.
[nmap] nmap, "Nmap: Free Security Scanner For Network Exploration [Bellovin2002]
and Audit", 2020, <https://www.insecure.org/nmap>. Bellovin, S., "A Technique for Counting NATted Hosts",
IMW'02, Marseille, France, ISBN 1-58113-603-X/02/0011,
November 2002,
<https://www.cs.columbia.edu/~smb/papers/fnat.pdf>.
[EFF] EFF, "Cover your tracks: See how trackers view your [BLAKE3] "BLAKE3: one function, fast everywhere", September 2022,
browser", 2020, <https://coveryourtracks.eff.org/>. <https://blake3.io/>.
[Schuba1993] [C11] ISO/IEC, "Information technology - Programming languages -
Schuba, C., "ADDRESSING WEAKNESSES IN THE DOMAIN NAME C", ISO/IEC 9899:2018, June 2018.
SYSTEM PROTOCOL", 1993,
<http://ftp.cerias.purdue.edu/pub/papers/christoph-schuba/
schuba-DNS-msthesis.pdf>.
[TBIT] TBIT, "TBIT, the TCP Behavior Inference Tool", 2001, [CPNI-TCP] Centre for the Protection of National Infrastructure
<https://www.icir.org/tbit/>. (CPNI), "Security Assessment of the Transmission Control
Protocol (TCP)", CPNI Technical Note 3/2009, February
2009, <https://www.si6networks.com/files/publications/tn-
03-09-security-assessment-TCP.pdf>.
[C11] ISO/IEC, "Information technology - Programming languages - [CVEs] NVD, "Vulnerability Advisories for PRNGs",
C", ISO/IEC 9899:2011, 2011. <https://www.gont.com.ar/miscellanea/prng-cves/>.
[POSIX] IEEE, "IEEE Standard for Information Technology -- [EFF] EFF, "Cover your tracks: See how trackers view your
Portable Operating System Interface (POSIX)", IEEE Std browser", <https://coveryourtracks.eff.org/>.
1003.1-2017, 2017.
[ARC4RANDOM] [FIPS-SHS] NIST, "Secure Hash Standard (SHS)", FIPS PUB 180-4,
OpenBSD, "arc4random(3)", Library Functions Manual, 2022, DOI 10.6028/NIST.FIPS.180-4, August 2015,
<https://man.openbsd.org/arc4random>. <https://nvlpubs.nist.gov/nistpubs/FIPS/
NIST.FIPS.180-4.pdf>.
[GETENTROPY] [Fyodor1998]
Linux, "getentropy(3)", Linux Programmer's Manual, 2022, Fyodor, "Remote OS detection via TCP/IP Stack
<https://man7.org/linux/man-pages/man3/getentropy.3.html>. FingerPrinting", Phrack Magazine, Volume 8, Issue 54,
December 1998,
<http://www.phrack.org/archives/issues/54/9.txt>.
[CVEs] NVD, "Vulnerability Advisories for Pseudo Random Number [Fyodor2003]
Generators", 2022, Fyodor, "Idle Scanning and related IPID games", 2003,
<https://www.gont.com.ar/miscellanea/prng-cves/>. <https://nmap.org/presentations/CanSecWest03/CD_Content/
idlescan_paper/idlescan.html>.
[Zalewski2012] [Fyodor2006]
Zalewski, M., "p0f v3 (version 3.09b)", 2012, Lyon, G., "Chapter 8. Remote OS Detection", January 2009,
<https://lcamtuf.coredump.cx/p0f.shtml>. <https://nmap.org/book/osdetect.html>.
[RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- [GETENTROPY]
Hashing for Message Authentication", RFC 2104, Linux, "getentropy(3)", Linux Programmer's Manual, March
DOI 10.17487/RFC2104, February 1997, 2021,
<https://www.rfc-editor.org/info/rfc2104>. <https://man7.org/linux/man-pages/man3/getentropy.3.html>.
[RFC7098] Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6 [IANA-PROT]
Flow Label for Load Balancing in Server Farms", RFC 7098, IANA, "Protocol Registries",
DOI 10.17487/RFC7098, January 2014, <https://www.iana.org/protocols>.
<https://www.rfc-editor.org/info/rfc7098>.
[RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an [IPID-DEV] Klein, A. and B. Pinkas, "From IP ID to Device ID and
Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May KASLR Bypass (Extended Version)",
2014, <https://www.rfc-editor.org/info/rfc7258>. DOI 10.48550/arXiv.1906.10478, October 2019,
<https://arxiv.org/pdf/1906.10478.pdf>.
[CPNI-TCP] Gont, F., "Security Assessment of the Transmission Control [Joncheray1995]
Protocol (TCP)", United Kingdom's Centre for the Joncheray, L., "Simple Active Attack Against TCP",
Protection of National Infrastructure (CPNI) Technical Proceedings of the Fifth USENIX UNIX Security Symposium,
Report, 2009, <https://www.gont.com.ar/papers/tn-03-09- June 1995, <https://www.usenix.org/legacy/publications/lib
security-assessment-TCP.pdf>. rary/proceedings/security95/full_papers/joncheray.pdf>.
[Zalewski2001] [KASLR] PaX Team, "Address Space Layout Randomization",
Zalewski, M., "Strange Attractors and TCP/IP Sequence <https://pax.grsecurity.net/docs/aslr.txt>.
Number Analysis", 2001,
<https://lcamtuf.coredump.cx/oldtcp/tcpseq.html>.
[Zalewski2002] [Klein2007]
Zalewski, M., "Strange Attractors and TCP/IP Sequence Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S
Number Analysis - One Year Later", 2001, Predictable IP ID Vulnerability", November 2007,
<https://lcamtuf.coredump.cx/newtcp/>. <https://dl.packetstormsecurity.net/papers/attack/OpenBSD_
DNS_Cache_Poisoning_and_Multiple_OS_Predictable_IP_ID_Vuln
erability.pdf>.
[Joncheray1995] [Knuth1983]
Joncheray, L., "A Simple Active Attack Against TCP", Proc. Knuth, D., "The Art of Computer Programming", Volume 2
Fifth Usenix UNIX Security Symposium, 1995, <https://www.u (Seminumerical Algorithms), 2nd Ed., Reading,
senix.org/legacy/publications/library/proceedings/ Massachusetts, Addison-Wesley Publishing Company, January
security95/full_papers/joncheray.pdf>. 1981.
[Morris1985] [Morris1985]
Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP
Software", CSTR 117, AT&T Bell Laboratories, Murray Hill, Software", CSTR 117, AT&T Bell Laboratories, Murray Hill,
NJ, 1985, NJ, February 1985,
<https://pdos.csail.mit.edu/~rtm/papers/117.pdf>. <https://pdos.csail.mit.edu/~rtm/papers/117.pdf>.
[Shimomura1995] [nmap] nmap, "Nmap: Free Security Scanner For Network Exploration
Shimomura, T., "Technical details of the attack described and Audit", 2020, <https://nmap.org/>.
by Markoff in NYT", Message posted in USENET's
comp.security.misc newsgroup Message-ID:
<3g5gkl$5j1@ariel.sdsc.edu>, 1995,
<https://www.gont.com.ar/docs/post-shimomura-usenet.txt>.
[RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, [POSIX] IEEE, "IEEE Standard for Information Technology --
DOI 10.17487/RFC5927, July 2010, Portable Operating System Interface (POSIX(TM)) Base
<https://www.rfc-editor.org/info/rfc5927>. Specifications, Issue 7", IEEE Std 1003.1-2017,
DOI 10.1109/IEEESTD.2018.8277153, January 2018,
<https://doi.org/10.1109/IEEESTD.2018.8277153>.
[RFC4953] Touch, J., "Defending TCP Against Spoofing Attacks", [Press1992]
RFC 4953, DOI 10.17487/RFC4953, July 2007, Press, W., Teukolsky, S., Vetterling, W., and B. Flannery,
<https://www.rfc-editor.org/info/rfc4953>. "Numerical Recipes in C: The Art of Scientific Computing",
2nd Ed., Cambridge University Press, ISBN 0-521-43108-5,
December 1992.
[RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, Hashing for Message Authentication", RFC 2104,
<https://www.rfc-editor.org/info/rfc8446>. DOI 10.17487/RFC2104, February 1997,
<https://www.rfc-editor.org/info/rfc2104>.
[RFC3971] Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander, [RFC3971] Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander,
"SEcure Neighbor Discovery (SEND)", RFC 3971, "SEcure Neighbor Discovery (SEND)", RFC 3971,
DOI 10.17487/RFC3971, March 2005, DOI 10.17487/RFC3971, March 2005,
<https://www.rfc-editor.org/info/rfc3971>. <https://www.rfc-editor.org/info/rfc3971>.
[RFC6980] Gont, F., "Security Implications of IPv6 Fragmentation [RFC4953] Touch, J., "Defending TCP Against Spoofing Attacks",
with IPv6 Neighbor Discovery", RFC 6980, RFC 4953, DOI 10.17487/RFC4953, July 2007,
DOI 10.17487/RFC6980, August 2013, <https://www.rfc-editor.org/info/rfc4953>.
<https://www.rfc-editor.org/info/rfc6980>.
[RFC7739] Gont, F., "Security Implications of Predictable Fragment
Identification Values", RFC 7739, DOI 10.17487/RFC7739,
February 2016, <https://www.rfc-editor.org/info/rfc7739>.
[RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly
Errors at High Data Rates", RFC 4963, Errors at High Data Rates", RFC 4963,
DOI 10.17487/RFC4963, July 2007, DOI 10.17487/RFC4963, July 2007,
<https://www.rfc-editor.org/info/rfc4963>. <https://www.rfc-editor.org/info/rfc4963>.
[RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927,
DOI 10.17487/RFC5927, July 2010,
<https://www.rfc-editor.org/info/rfc5927>.
[RFC6194] Polk, T., Chen, L., Turner, S., and P. Hoffman, "Security
Considerations for the SHA-0 and SHA-1 Message-Digest
Algorithms", RFC 6194, DOI 10.17487/RFC6194, March 2011,
<https://www.rfc-editor.org/info/rfc6194>.
[RFC6274] Gont, F., "Security Assessment of the Internet Protocol [RFC6274] Gont, F., "Security Assessment of the Internet Protocol
Version 4", RFC 6274, DOI 10.17487/RFC6274, July 2011, Version 4", RFC 6274, DOI 10.17487/RFC6274, July 2011,
<https://www.rfc-editor.org/info/rfc6274>. <https://www.rfc-editor.org/info/rfc6274>.
[Press1992] [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
Press, W. H., Teukolsky, S. A., Vetterling, W. T., and B. Morris, J., Hansen, M., and R. Smith, "Privacy
P. Flannery, "Numerical Recipes in C: The Art of Considerations for Internet Protocols", RFC 6973,
Scientific Computing", 2nd ed. ISBN 0-521-43108-5. DOI 10.17487/RFC6973, July 2013,
Cambridge University Press, 1992. <https://www.rfc-editor.org/info/rfc6973>.
[Romailler2020] [RFC6980] Gont, F., "Security Implications of IPv6 Fragmentation
Romailler, Y., "THE DEFINITIVE GUIDE TO "MODULO BIAS AND with IPv6 Neighbor Discovery", RFC 6980,
HOW TO AVOID IT"!", Kudelski Security Research, 2020, DOI 10.17487/RFC6980, August 2013,
<https://research.kudelskisecurity.com/2020/07/28/the- <https://www.rfc-editor.org/info/rfc6980>.
definitive-guide-to-modulo-bias-and-how-to-avoid-it/>.
[Aumasson2018] [RFC7098] Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6
Aumasson, J.P., "Serious Cryptography: A Practical Flow Label for Load Balancing in Server Farms", RFC 7098,
Introduction to Modern Encryption", ISBN-10: DOI 10.17487/RFC7098, January 2014,
1-59327-826-8, No Starch Press, Inc., 2018. <https://www.rfc-editor.org/info/rfc7098>.
[Knuth1983] [RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
Knuth, D., "The Art of Computer Programming", volume 2 Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
(Seminumerical Algorithms), 2nd ed. Reading, 2014, <https://www.rfc-editor.org/info/rfc7258>.
Massachusetts: Addison-Wesley Publishing Company, 1981.
[Bellovin1989] [RFC7707] Gont, F. and T. Chown, "Network Reconnaissance in IPv6
Bellovin, S., "Security Problems in the TCP/IP Protocol Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016,
Suite", Computer Communications Review, vol. 19, no. 2, <https://www.rfc-editor.org/info/rfc7707>.
pp. 32-48, 1989,
<https://www.cs.columbia.edu/~smb/papers/ipext.pdf>.
[Bellovin2002] [RFC7721] Cooper, A., Gont, F., and D. Thaler, "Security and Privacy
Bellovin, S. M., "A Technique for Counting NATted Hosts", Considerations for IPv6 Address Generation Mechanisms",
IMW'02 Nov. 6-8, 2002, Marseille, France, 2002, RFC 7721, DOI 10.17487/RFC7721, March 2016,
<https://www.cs.columbia.edu/~smb/papers/fnat.pdf>. <https://www.rfc-editor.org/info/rfc7721>.
[Fyodor2003] [RFC7739] Gont, F., "Security Implications of Predictable Fragment
Fyodor, "Idle scanning and related IP ID games", 2003, Identification Values", RFC 7739, DOI 10.17487/RFC7739,
<https://nmap.org/presentations/CanSecWest03/CD_Content/ February 2016, <https://www.rfc-editor.org/info/rfc7739>.
idlescan_paper/idlescan.html>.
[RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol
Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
<https://www.rfc-editor.org/info/rfc8446>.
[RFC8937] Cremers, C., Garratt, L., Smyshlyaev, S., Sullivan, N.,
and C. Wood, "Randomness Improvements for Security
Protocols", RFC 8937, DOI 10.17487/RFC8937, October 2020,
<https://www.rfc-editor.org/info/rfc8937>.
[RFC9414] Gont, F. and I. Arce, "Unfortunate History of Transient
Numeric Identifiers", RFC 9414, DOI 10.17487/RFC9414, July
2023, <https://www.rfc-editor.org/info/rfc9414>.
[RFC9416] Gont, F. and I. Arce, "Security Considerations for
Transient Numeric Identifiers Employed in Network
Protocols", BCP 72, RFC 9416, DOI 10.17487/RFC9416, July
2023, <https://www.rfc-editor.org/info/rfc9416>.
[Romailler2020]
Romailler, Y., "The Definitive Guide to "Modulo Bias and
How to Avoid It"!", Kudelski Security Research, July 2020,
<https://research.kudelskisecurity.com/2020/07/28/the-
definitive-guide-to-modulo-bias-and-how-to-avoid-it/>.
[Sanfilippo1998a] [Sanfilippo1998a]
Sanfilippo, S., "about the ip header id", Post to Bugtraq Sanfilippo, S., "about the ip header id", message to the
mailing-list, Mon Dec 14 1998, Bugtraq mailing list, December 1998,
<http://seclists.org/bugtraq/1998/Dec/48>. <http://seclists.org/bugtraq/1998/Dec/48>.
[Sanfilippo1998b] [Sanfilippo1998b]
Sanfilippo, S., "Idle scan", Post to Bugtraq mailing-list, Sanfilippo, S., "new tcp scan method", message to the
1998, <https://github.com/antirez/hping/raw/master/docs/ Bugtraq mailing list, 18 December 1998,
SPOOFED_SCAN.txt>. <https://seclists.org/bugtraq/1998/Dec/79>.
[Sanfilippo1999] [Sanfilippo1999]
Sanfilippo, S., "more ip id", Post to Bugtraq mailing- Sanfilippo, S., "more ip id", message to the Bugtraq
list, 1999, mailing list, November 1999,
<https://github.com/antirez/hping/raw/master/docs/MORE- <https://github.com/antirez/hping/raw/master/docs/MORE-
FUN-WITH-IPID>. FUN-WITH-IPID>.
[Silbersack2005] [Schuba1993]
Silbersack, M.J., "Improving TCP/IP security through Schuba, C., "Addressing Weakness in the Domain Name System
randomization without sacrificing interoperability", Protocol", August 1993,
EuroBSDCon 2005 Conference, 2005, <http://ftp.cerias.purdue.edu/pub/papers/christoph-schuba/
<https://citeseerx.ist.psu.edu/viewdoc/ schuba-DNS-msthesis.pdf>.
download?doi=10.1.1.91.4542&rep=rep1&type=pdf>.
[Klein2007]
Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S
Predictable IP ID Vulnerability", 2007,
<https://dl.packetstormsecurity.net/papers/attack/OpenBSD_
DNS_Cache_Poisoning_and_Multiple_OS_Predictable_IP_ID_Vuln
erability.pdf>.
[IPID-DEV] Klein, A. and B. Pinkas, "From IP ID to Device ID and
KASLR Bypass (Extended Version)", June 2019,
<https://arxiv.org/pdf/1906.10478.pdf>.
[I-D.irtf-pearg-numeric-ids-history]
Gont, F. and I. Arce, "Unfortunate History of Transient
Numeric Identifiers", Work in Progress, Internet-Draft,
draft-irtf-pearg-numeric-ids-history-10, 11 July 2022,
<https://www.ietf.org/archive/id/draft-irtf-pearg-numeric-
ids-history-10.txt>.
[I-D.gont-numeric-ids-sec-considerations] [Shimomura1995]
Gont, F. and I. Arce, "Security Considerations for Shimomura, T., "Technical details of the attack described
Transient Numeric Identifiers Employed in Network by Markoff in NYT", message to the USENET
Protocols", Work in Progress, Internet-Draft, draft-gont- comp.security.misc newsgroup, 25 January 1995,
numeric-ids-sec-considerations-08, 10 December 2022, <https://www.gont.com.ar/files/post-shimomura-usenet.txt>.
<https://datatracker.ietf.org/api/v1/doc/document/draft-
gont-numeric-ids-sec-considerations/>.
[RFC7721] Cooper, A., Gont, F., and D. Thaler, "Security and Privacy [Silbersack2005]
Considerations for IPv6 Address Generation Mechanisms", Silbersack, M., "Improving TCP/IP security through
RFC 7721, DOI 10.17487/RFC7721, March 2016, randomization without sacrificing interoperability",
<https://www.rfc-editor.org/info/rfc7721>. EuroBSDCon 2005 Conference,
<https://www.silby.com/eurobsdcon05/
eurobsdcon_silbersack.pdf>.
[RFC7707] Gont, F. and T. Chown, "Network Reconnaissance in IPv6 [SipHash] "SipHash: a fast short-input PRF", February 2023,
Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016, <https://github.com/veorq/SipHash>.
<https://www.rfc-editor.org/info/rfc7707>.
[RFC8937] Cremers, C., Garratt, L., Smyshlyaev, S., Sullivan, N., [TBIT] TBIT, "TBIT, the TCP Behavior Inference Tool", 2001,
and C. Wood, "Randomness Improvements for Security <https://www.icir.org/tbit/>.
Protocols", RFC 8937, DOI 10.17487/RFC8937, October 2020,
<https://www.rfc-editor.org/info/rfc8937>.
[TCPT-uptime] [TCPT-uptime]
McDanel, B., "TCP Timestamping - Obtaining System Uptime McDanel, B., "TCP Timestamping - Obtaining System Uptime
Remotely", 14 March 2001, Remotely", message to the Bugtraq mailing list, March
<https://securiteam.com/securitynews/5np0c153pi/>. 2001, <https://seclists.org/bugtraq/2001/Mar/182>.
[SipHash] Aumasson, J. P. and D. J. Bernstein, "SipHash: a fast
short-input PRF", 2012,
<https://github.com/veorq/SipHash>.
[BLAKE3] O'Connor, J., Aumasson, J. P., Neves, S., and Z. Wilcox- [Zalewski2001]
O'Hearn, "BLAKE3: one function, fast everywhere", 2020, Zalewski, M., "Strange Attractors and TCP/IP Sequence
<https://blake3.io/>. Number Analysis", April 2001,
<https://lcamtuf.coredump.cx/oldtcp/tcpseq.html>.
[FIPS-SHS] NIST, "Secure Hash Standard (SHS)", Federal Information [Zalewski2002]
Processing Standards Publication 180-4, August 2015, Zalewski, M., "Strange Attractors and TCP/IP Sequence
<https://nvlpubs.nist.gov/nistpubs/FIPS/ Number Analysis - One Year Later (2002)",
NIST.FIPS.180-4.pdf>. <https://lcamtuf.coredump.cx/newtcp/>.
[RFC6194] Polk, T., Chen, L., Turner, S., and P. Hoffman, "Security [Zalewski2012]
Considerations for the SHA-0 and SHA-1 Message-Digest Zalewski, M., "p0f v3 (3.09b)",
Algorithms", RFC 6194, DOI 10.17487/RFC6194, March 2011, <https://lcamtuf.coredump.cx/p0f.shtml>.
<https://www.rfc-editor.org/info/rfc6194>.
Appendix A. Algorithms and Techniques with Known Issues Appendix A. Algorithms and Techniques with Known Issues
The following subsections discuss algorithms and techniques with The following subsections discuss algorithms and techniques with
known negative security and privacy implications. known negative security and privacy implications.
NOTE: | NOTE: As discussed in Section 1, the use of cryptographic
As discussed in Section 1, the use of cryptographic techniques | techniques might allow for the safe use of some of these
might allow for the safe use of some of these algorithms and | algorithms and techniques. However, this should be evaluated
techniques. However, this should be evaluated on a case by case | on a case-by-case basis.
basis.
A.1. Predictable Linear Identifiers Algorithm A.1. Predictable Linear Identifiers Algorithm
One of the most trivial ways to achieve uniqueness with a low One of the most trivial ways to achieve uniqueness with a low
identifier reuse frequency is to produce a linear sequence. This identifier reuse frequency is to produce a linear sequence. This
type of algorithm has been employed in the past to generate type of algorithm has been employed in the past to generate
identifiers of Categories #1, #2, and #4 (please see Section 6 for an identifiers of Categories #1, #2, and #4 (please see Section 6 for an
analysis of these categories). analysis of these categories).
For example, the following algorithm has been employed (see e.g. For example, the following algorithm has been employed (see, e.g.,
[Morris1985], [Shimomura1995], [Silbersack2005] and [CPNI-TCP]) in a [Morris1985], [Shimomura1995], [Silbersack2005], and [CPNI-TCP]) in a
number of operating systems for selecting IP fragment IDs, TCP number of operating systems for selecting IP IDs, TCP ephemeral port
ephemeral ports, etc.: numbers, etc.:
/* Initialization code */ /* Initialization code */
next_id = min_id; next_id = min_id;
id_inc= 1; id_inc= 1;
/* Transient Numeric ID selection function */ /* Transient Numeric ID selection function */
id_range = max_id - min_id + 1; id_range = max_id - min_id + 1;
retry = id_range; retry = id_range;
skipping to change at page 42, line 34 skipping to change at line 1885
return next_id; return next_id;
} }
retry--; retry--;
} while (retry > 0); } while (retry > 0);
return ERROR; return ERROR;
NOTE: NOTE:
suitable_id() is a function that checks whether the resulting
identifier is acceptable (e.g., whether it's in use, etc.). suitable_id() checks whether a candidate numeric identifier is
suitable (e.g., whether it is unique or not).
For obvious reasons, this algorithm results in predictable sequences. For obvious reasons, this algorithm results in predictable sequences.
Since a global counter is used to generate the transient numeric Since a global counter is used to generate the transient numeric
identifiers ("next_id" in the example above), an entity that learns identifiers ("next_id" in the example above), an entity that learns
one numeric identifier can infer past numeric identifiers and predict one numeric identifier can infer past numeric identifiers and predict
future values to be generated by the same algorithm. Since the value future values to be generated by the same algorithm. Since the value
employed for the increments is known (such as "1" in this case), an employed for the increments is known (such as "1" in this case), an
attacker can sample two values, and learn the number of identifiers attacker can sample two values and learn the number of identifiers
that have been were generated in-between the two sampled values. that were generated in between the two sampled values. Furthermore,
Furthermore, if the counter is initialized e.g. when the system its if the counter is initialized, to some known value (e.g., when the
bootstrapped to some known value, the algorithm will leak additional system is bootstrapped), the algorithm will leak additional
information, such as the number of transmitted fragmented datagrams information, such as the number of transmitted fragmented datagrams
in the case of an IP ID generator [Sanfilippo1998a], or the system in the case of an IP ID generator [Sanfilippo1998a] or the system
uptime in the case of TCP timestamps [TCPT-uptime]. uptime in the case of TCP timestamps [TCPT-uptime].
A.2. Random-Increments Algorithm A.2. Random-Increments Algorithm
This algorithm offers a middle ground between the algorithms that This algorithm offers a middle ground between the algorithms that
generate randomized transient numeric identifiers (such as those generate randomized transient numeric identifiers (such as those
described in Section 7.1.1 and Section 7.1.2), and those that described in Sections 7.1.1 and 7.1.2) and those that generate
generate identifiers with a predictable monotonically-increasing identifiers with a predictable monotonically increasing function (see
function (see Appendix A.1). Appendix A.1).
/* Initialization code */ /* Initialization code */
next_id = random(); /* Initialization value */ next_id = random(); /* Initialization value */
id_rinc = 500; /* Determines the trade-off */ id_rinc = 500; /* Determines the trade-off */
/* Transient Numeric ID selection function */ /* Transient Numeric ID selection function */
id_range = max_id - min_id + 1; id_range = max_id - min_id + 1;
retry = id_range; retry = id_range;
skipping to change at page 43, line 44 skipping to change at line 1942
if (suitable_id(next_id)) { if (suitable_id(next_id)) {
return next_id; return next_id;
} }
retry = retry - id_inc; retry = retry - id_inc;
} while (retry > 0); } while (retry > 0);
return ERROR; return ERROR;
This algorithm aims at producing a global monotonically-increasing NOTE:
sequence of transient numeric identifiers, while avoiding the use of
random() is a PRNG that returns a pseudorandom unsigned integer
number of appropriate size. Beware that "adapting" the length of
the output of random() with a modulo operator (e.g., C language's
"%") may change the distribution of the PRNG. To preserve a
uniform distribution, the rejection sampling technique
[Romailler2020] can be used.
suitable_id() is a function that checks whether a candidate
identifier is suitable (e.g., whether it is unique or not).
This algorithm aims at producing a global monotonically increasing
sequence of transient numeric identifiers while avoiding the use of
fixed increments, which would lead to trivially predictable fixed increments, which would lead to trivially predictable
sequences. The value "id_inc" allows for direct control of the sequences. The value "id_rinc" allows for direct control of the
trade-off between unpredictability and identifier reuse frequency. trade-off between unpredictability and identifier reuse frequency.
The smaller the value of "id_inc", the more similar this algorithm is The smaller the value of "id_rinc", the more similar this algorithm
to a predicable, global linear ID generation algorithm (as the one in is to a predicable, global linear identifier generation algorithm (as
Appendix A.1). The larger the value of "id_inc", the more similar the one in Appendix A.1). The larger the value of "id_rinc", the
this algorithm is to the algorithm described in Section 7.1.1 of this more similar this algorithm is to the algorithm described in
document. Section 7.1.1 of this document.
When the identifiers wrap, there is a risk of collisions of transient When the identifiers wrap, there is a risk of collisions of transient
numeric identifiers (i.e., identifier reuse). Therefore, "id_inc" numeric identifiers (i.e., identifier reuse). Therefore, "id_rinc"
should be selected according to the following criteria: should be selected according to the following criteria:
* It should maximize the wrapping time of the identifier space. * It should maximize the wrapping time of the identifier space.
* It should minimize identifier reuse frequency. * It should minimize identifier reuse frequency.
* It should maximize unpredictability. * It should maximize unpredictability.
Clearly, these are competing goals, and the decision of which value Clearly, these are competing goals, and the decision of which value
of "id_inc" to use is a trade-off. Therefore, the value of "id_inc" of "id_rinc" to use is a trade-off. Therefore, the value of
is at times a configurable parameter so that system administrators "id_rinc" is at times a configurable parameter so that system
can make the trade-off for themselves. We note that the alternative administrators can make the trade-off for themselves. We note that
algorithms discussed throughout this document offer better the alternative algorithms discussed throughout this document offer
interoperability, security and privacy properties than this better interoperability, security, and privacy properties than this
algorithm, and hence implementation of this algorithm is discouraged. algorithm, and hence, implementation of this algorithm is
discouraged.
A.3. Re-using Identifiers Across Different Contexts A.3. Reusing Identifiers Across Different Contexts
Employing the same identifier across contexts in which stability is Employing the same identifier across contexts in which stability is
not required (i.e. overloading the semantics of transient numeric not required (i.e., overloading the semantics of transient numeric
identifier) usually has negative security and privacy implications. identifiers) usually has negative security and privacy implications.
For example, in order to generate transient numeric identifiers of For example, in order to generate transient numeric identifiers of
Category #2 or Category #3, an implementation or specification might Category #2 or #3, an implementation or specification might be
be tempted to employ a source for the numeric identifiers which is tempted to employ a source for the numeric identifiers that is known
known to provide unique values, but that may also be predictable or to provide unique values but that may also be predictable or leak
leak information related to the entity generating the identifier. information related to the entity generating the identifier. This
This technique has been employed in the past for e.g. generating IPv6 technique has been employed in the past for, e.g., generating IPv6
IIDs by re-using the MAC address of the underlying network interface. IIDs by reusing the MAC address of the underlying network interface
However, as noted in [RFC7721] and [RFC7707], embedding link-layer card. However, as noted in [RFC7721] and [RFC7707], embedding link-
addresses in IPv6 IIDs not only results in predictable values, but layer addresses in IPv6 IIDs not only results in predictable values
also leaks information about the manufacturer of the underlying but also leaks information about the manufacturer of the underlying
network interface card, allows for network activity correlation, and network interface card, allows for network activity correlation, and
makes address-based scanning attacks feasible. makes address-based scanning attacks feasible.
Acknowledgements
The authors would like to thank (in alphabetical order) Bernard
Aboba, Jean-Philippe Aumasson, Steven Bellovin, Luis León Cárdenas
Graide, Spencer Dawkins, Theo de Raadt, Guillermo Gont, Joseph
Lorenzo Hall, Gre Norcie, Colin Perkins, Vincent Roca, Shivan Sahib,
Rich Salz, Martin Thomson, and Michael Tüxen for providing valuable
comments on earlier draft versions of this document.
The authors would like to thank Shivan Sahib and Christopher Wood for
their guidance during the publication process of this document.
The authors would like to thank Jean-Philippe Aumasson and Mathew
D. Green (John Hopkins University) for kindly answering a number of
questions.
The authors would like to thank Diego Armando Maradona for his magic
and inspiration.
Authors' Addresses Authors' Addresses
Fernando Gont Fernando Gont
SI6 Networks SI6 Networks
Segurola y Habana 4310 7mo piso Segurola y Habana 4310 7mo piso
Ciudad Autonoma de Buenos Aires Ciudad Autonoma de Buenos Aires
Buenos Aires
Argentina Argentina
Email: fgont@si6networks.com Email: fgont@si6networks.com
URI: https://www.si6networks.com URI: https://www.si6networks.com
Ivan Arce Ivan Arce
Quarkslab Quarkslab
Segurola y Habana 4310 7mo piso Segurola y Habana 4310 7mo piso
Ciudad Autonoma de Buenos Aires Ciudad Autonoma de Buenos Aires
Buenos Aires
Argentina Argentina
Email: iarce@quarkslab.com Email: iarce@quarkslab.com
URI: https://www.quarkslab.com URI: https://www.quarkslab.com
 End of changes. 265 change blocks. 
1004 lines changed or deleted 1050 lines changed or added

This html diff was produced by rfcdiff 1.48.