| rfc9268.original | rfc9268.txt | |||
|---|---|---|---|---|
| Network Working Group R. Hinden | Internet Engineering Task Force (IETF) R. Hinden | |||
| Internet-Draft Check Point Software | Request for Comments: 9268 Check Point Software | |||
| Intended status: Experimental G. Fairhurst | Category: Experimental G. Fairhurst | |||
| Expires: 11 November 2022 University of Aberdeen | ISSN: 2070-1721 University of Aberdeen | |||
| 10 May 2022 | August 2022 | |||
| IPv6 Minimum Path MTU Hop-by-Hop Option | IPv6 Minimum Path MTU Hop-by-Hop Option | |||
| draft-ietf-6man-mtu-option-15 | | |||
| Abstract | Abstract | |||
| This document specifies a new IPv6 Hop-by-Hop option that is used to | This document specifies a new IPv6 Hop-by-Hop Option that is used to | |||
| record the minimum Path MTU along the forward path between a source | record the Minimum Path MTU (PMTU) along the forward path between a | |||
| host and a destination host. The recorded value can then be | source host and a destination host. The recorded value can then be | |||
| communicated back to the source using the return Path MTU field in | communicated back to the source using the return Path MTU field in | |||
| the option. | the Option. | |||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This document is not an Internet Standards Track specification; it is | |||
| provisions of BCP 78 and BCP 79. | published for examination, experimental implementation, and | |||
| | evaluation. | |||
| Internet-Drafts are working documents of the Internet Engineering | | |||
| Task Force (IETF). Note that other groups may also distribute | | |||
| working documents as Internet-Drafts. The list of current Internet- | | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | | |||
| Internet-Drafts are draft documents valid for a maximum of six months | This document defines an Experimental Protocol for the Internet | |||
| and may be updated, replaced, or obsoleted by other documents at any | community. This document is a product of the Internet Engineering | |||
| time. It is inappropriate to use Internet-Drafts as reference | Task Force (IETF). It represents the consensus of the IETF | |||
| material or to cite them other than as "work in progress." | community. It has received public review and has been approved for | |||
| | publication by the Internet Engineering Steering Group (IESG). Not | |||
| | all documents approved by the IESG are candidates for any level of | |||
| | Internet Standard; see Section 2 of RFC 7841. | |||
| This Internet-Draft will expire on 11 November 2022. | Information about the current status of this document, any errata, | |||
| | and how to provide feedback on it may be obtained at | |||
| | https://www.rfc-editor.org/info/rfc9268. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
| license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
| and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
| extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
| described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
| provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
| | in the Revised BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Example Operation | 1.1. Example Operation | |||
| 1.2. Use of the IPv6 Hop-by-Hop Options Header | 1.2. Use of the IPv6 Hop-by-Hop Options Header | |||
| 2. Motivation and Problem Solved | 2. Motivation and Problem Solved | |||
| 3. Requirements Language | 3. Requirements Language | |||
| 4. Applicability Statements | 4. Applicability Statements | |||
| 5. IPv6 Minimum Path MTU Hop-by-Hop Option | 5. IPv6 Minimum Path MTU Hop-by-Hop Option | |||
| 6. Router, Host, and Transport Layer Behaviors | 6. Router, Host, and Transport Layer Behaviors | |||
| 6.1. Router Behavior | 6.1. Router Behavior | |||
| 6.2. Host Operating System Behavior | 6.2. Host Operating System Behavior | |||
| 6.3. Transport Layer Behavior | 6.3. Transport Layer Behavior | |||
| 6.3.1. Including the Option in an Outgoing Packet | 6.3.1. Including the Option in an Outgoing Packet | |||
| 6.3.2. Validation of the Packet that includes the Option | 6.3.2. Validation of the Packet that Includes the Option | |||
| 6.3.3. Receiving the Option | 6.3.3. Receiving the Option | |||
| 6.3.4. Using the Rtn-PMTU Field | 6.3.4. Using the Rtn-PMTU Field | |||
| 6.3.5. Detecting Path Changes | 6.3.5. Detecting Path Changes | |||
| 6.3.6. Detection of Dropping Packets that include the | 6.3.6. Detection of Dropping Packets that Include the Option | |||
| Option | 7. IANA Considerations | |||
| 7. IANA Considerations | 8. Security Considerations | |||
| 8. Security Considerations | 8.1. Router Option Processing | |||
| 8.1. Router Option Processing | 8.2. Network-Layer Host Processing | |||
| 8.2. Network Layer Host Processing | 8.3. Validating Use of the Option Data | |||
| 8.3. Validating use of the Option Data | 8.4. Direct Use of the Rtn-PMTU Value | |||
| 8.4. Direct use of the Rtn-PMTU Value | 8.5. Using the Rtn-PMTU Value as a Hint for Probing | |||
| 8.5. Using the Rtn-PMTU Value as a Hint for Probing | 8.6. Impact of Middleboxes | |||
| 8.6. Impact of Middleboxes | 9. Experiment Goals | |||
| 9. Experiment Goals | 10. Implementation Status | |||
| 10. Implementation Status | 11. References | |||
| 11. Acknowledgments | 11.1. Normative References | |||
| 12. Change log [RFC Editor: Please remove] | 11.2. Informative References | |||
| 13. References | Appendix A. Examples of Usage | |||
| 13.1. Normative References | Acknowledgments | |||
| 13.2. Informative References | Authors' Addresses | |||
| Appendix A. Examples of Usage | | |||
| Authors' Addresses | | |||
| 1. Introduction | 1. Introduction | |||
| This document specifies a new IPv6 Hop-by-Hop (HBH) Option to record | This document specifies a new IPv6 Hop-by-Hop (HBH) Option to record | |||
| the minimum Maximum Transmission Unit (MTU) along the forward path | the minimum Maximum Transmission Unit (MTU) along the forward path | |||
| between a source and a destination host. The source host creates a | between a source and a destination host. The source host creates a | |||
| packet with this option and initializes the Min-PMTU field with the | packet with this Option and initializes the Min-PMTU field with the | |||
| value of the MTU for the outbound link that will be used to forward | value of the MTU for the outbound link that will be used to forward | |||
| the packet towards the destination host. | the packet towards the destination host. | |||
| At each subsequent hop where the option is processed, the router | At each subsequent hop where the Option is processed, the router | |||
| compares the value of the Min-PMTU Field in the option and the MTU of | compares the value of the Min-PMTU field in the Option and the MTU of | |||
| its outgoing link. If the MTU of the link is less than the Min-PMTU, | its outgoing link. If the MTU of the link is less than the Min-PMTU, | |||
| it rewrites the value in the option data with the smaller value. | it rewrites the value in the Option Data with the smaller value. | |||
| When the packet arrives at the destination host, the host can send | When the packet arrives at the destination host, the host can send | |||
| the value of the minimum reported MTU for the path back to the source | the value of the minimum Reported MTU for the path back to the source | |||
| host using the Rtn-PMTU field in the option. The source host can | host using the Rtn-PMTU field in the Option. The source host can | |||
| then use this value as input to the method that sets the Path MTU | then use this value as input to the method that sets the Path MTU | |||
| (PMTU) used by upper layer protocols. | (PMTU) used by upper-layer protocols. | |||
| The IPv6 Minimum Path MTU Hop-by-Hop (MinPMTU HBH) Option is designed | The IPv6 Minimum Path MTU Hop-by-Hop (MinPMTU HBH) Option is designed | |||
| to work with packet sizes that can be specified in the IPv6 header. | to work with packet sizes that can be specified in the IPv6 header. | |||
| The maximum packet size that can be specified in an IPv6 header is | The maximum packet size that can be specified in an IPv6 header is | |||
| 65,535 octets (2^^16 - 1). | 65,535 octets (2^16 - 1). | |||
| This method has the potential to complete Path MTU discovery in a | This method has the potential to complete Path MTU Discovery (PMTUD) | |||
| single round trip time, even over paths that have successive links | in a single round-trip time, even over paths that have successive | |||
| each with a lower MTU. | links, each with a lower MTU. | |||
| The mechanism defined in this document is focused on Unicast, it does | The mechanism defined in this document is focused on unicast; it does | |||
| not describe Multicast. That is left for future work. | not describe multicast. That is left for future work. | |||
| 1.1. Example Operation | 1.1. Example Operation | |||
| The figure below illustrates the operation of the method. In this | The figure below illustrates the operation of the method. In this | |||
| case, the path between the source host and the destination host | case, the path between the source host and the destination host | |||
| comprises three links, the source has a link MTU of size MTU-S, the | comprises three links: the source has a link MTU of size MTU-S, the | |||
| link between routers R1 and R2 has an MTU of size 9000 bytes, and the | link between routers R1 and R2 has an MTU of size 9000 bytes, and the | |||
| final link to the destination has an MTU of size MTU-D. | final link to the destination has an MTU of size MTU-D. | |||
| Sender ---(MTU-S)--- R1 ---(9000B)--- R2 ---(MTU-D)--- Dest. | Sender ---(MTU-S)--- R1 ---(9000B)--- R2 ---(MTU-D)--- Dest. | |||
| Figure 1 | Figure 1: An Example Path between the Source Host and the | |||
| | Destination Host | |||
| Three scenarios are described: | Three scenarios are described: | |||
| * Scenario 1, considers all links to have an 9000 byte MTU and the | * Scenario 1 considers all links to have a 9000 byte MTU, and the | |||
| method is supported by both routers. The initial Min-PMTU is not | method is supported by both routers. The initial Min-PMTU is not | |||
| modified along the path, and therefore the PMTU is 9000 bytes. | modified along the path. Therefore, the PMTU is 9000 bytes. | |||
| * Scenario 2, considers the link between R2 and destination host | * Scenario 2 considers the link between R2 and the destination host | |||
| (MTU-D) to have an MTU of 1500 bytes. This is the smallest MTU, | (MTU-D) to have an MTU of 1500 bytes. This is the smallest MTU. | |||
| router R2 updates the Min-PMTU to 1500 bytes and the method | Router R2 updates the Min-PMTU to 1500 bytes, and the method | |||
| correctly updates the PMTU to 1500 bytes. Had there been another | correctly updates the PMTU to 1500 bytes. Had there been another | |||
| smaller MTU at a link further along the path that also supports | smaller MTU at a link further along the path that also supports | |||
| the method, the lower MTU would also have been detected. | the method, the lower MTU would also have been detected. | |||
| * Scenario 3, considers the case where the router preceding the | * Scenario 3 considers the case where the router preceding the | |||
| smallest link (R2) does not support the method, and the link to | smallest link (R2) does not support the method, and the link to | |||
| the destination host (MTU-D) has an MTU of 1500 bytes. Therefore, | the destination host (MTU-D) has an MTU of 1500 bytes. Therefore, | |||
| router R2 does not update the Min-PMTU to 1500 bytes. The method | router R2 does not update the Min-PMTU to 1500 bytes. The method | |||
| then fails to detect the actual PMTU. | then fails to detect the actual PMTU. | |||
| In Scenarios 2 and 3, a lower PMTU would also fail to be detected in | In Scenarios 2 and 3, a lower PMTU would also fail to be detected in | |||
| the case where PMTUD had been used and an ICMPv6 Packet Too Big (PTB) | the case where PMTUD had been used and an ICMPv6 Packet Too Big (PTB) | |||
| message had not been delivered to the sender [RFC8201]. | message had not been delivered to the sender [RFC8201]. | |||
| These scenarios are summarized in the table below. "H" in R1 and/or | These scenarios are summarized in the table below. "H" in R1 and/or | |||
| R2 columns means the router understands the MinPMTU HBH option. | R2 columns means the router understands the MinPMTU HBH Option. | |||
| +-+-----+-----+----+----+----------+-----------------------+ | +===+========+========+====+====+==========+=======================+ | |||
| \| \|MTU-S\|MTU-D\| R1 \| R2 \| Rec PMTU \| Note \| | \| \| MTU-S \| MTU-D \| R1 \| R2 \| Rec PMTU \| Note \| | |||
| +-+-----+-----+----+----+----------+-----------------------+ | +===+========+========+====+====+==========+=======================+ | |||
| \|1\|9000B\|9000B\| H \| H \| 9000 B \| Endpoints attempt to \| | \| 1 \| 9000 B \| 9000 B \| H \| H \| 9000 B \| Endpoints attempt to \| | |||
| \| \| \| \| \| \| \| use a 9000 B PMTU. \| | \| \| \| \| \| \| \| use a 9000 B PMTU. \| | |||
| +-+-----+-----+----+----+----------+-----------------------+ | +---+--------+--------+----+----+----------+-----------------------+ | |||
| \|2\|9000B\|1500B\| H \| H \| 1500 B \| Endpoints attempt to \| | \| 2 \| 9000 B \| 1500 B \| H \| H \| 1500 B \| Endpoints attempt to \| | |||
| \| \| \| \| \| \| \| use a 1500 B PMTU. \| | \| \| \| \| \| \| \| use a 1500 B PMTU. \| | |||
| +-+-----+-----+----+----+----------+-----------------------+ | +---+--------+--------+----+----+----------+-----------------------+ | |||
| \|3\|9000B\|1500B\| H \| - \| 9000 B \| Endpoints attempt to \| | \| 3 \| 9000 B \| 1500 B \| H \| - \| 9000 B \| Endpoints attempt to \| | |||
| \| \| \| \| \| \| \| use a 9000 B PMTU, \| | \| \| \| \| \| \| \| use a 9000 B PMTU but \| | |||
| \| \| \| \| \| \| \| but need to implement \| | \| \| \| \| \| \| \| need to implement a \| | |||
| \| \| \| \| \| \| \| a method to fall back \| | \| \| \| \| \| \| \| method to fall back \| | |||
| \| \| \| \| \| \| \| to discover and use a \| | \| \| \| \| \| \| \| to discover and use a \| | |||
| \| \| \| \| \| \| \| 1500 B PMTU. \| | \| \| \| \| \| \| \| 1500 B PMTU. \| | |||
| +-+-----+-----+----+----+----------+-----------------------+ | +---+--------+--------+----+----+----------+-----------------------+ | |||
| Figure 2 | Table 1: Three Scenarios That Arise from Using the Path Shown in | |||
| | Figure 1 | |||
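
The three scenarios above can be reproduced with a short simulation. This is a minimal sketch, not part of either document version: the `Hop` class and `record_min_pmtu` function are hypothetical names, and each router is modeled only by the MTU of its outgoing link and whether it processes the Option.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    """A router on the forward path (hypothetical model for illustration)."""
    out_mtu: int    # MTU of the router's outgoing link, in octets
    supports: bool  # True if the router processes the MinPMTU HBH Option ("H")


def record_min_pmtu(source_mtu: int, routers: list[Hop]) -> int:
    """Return the Min-PMTU value that arrives at the destination host."""
    # The source initializes Min-PMTU with the MTU of its outbound link.
    min_pmtu = source_mtu
    for hop in routers:
        # Only routers that support the Option compare and rewrite the field.
        if hop.supports and hop.out_mtu < min_pmtu:
            min_pmtu = hop.out_mtu
    return min_pmtu


# Path: Sender --MTU-S--> R1 --9000B--> R2 --MTU-D--> Dest.
scenario_1 = record_min_pmtu(9000, [Hop(9000, True), Hop(9000, True)])
scenario_2 = record_min_pmtu(9000, [Hop(9000, True), Hop(1500, True)])
scenario_3 = record_min_pmtu(9000, [Hop(9000, True), Hop(1500, False)])
print(scenario_1, scenario_2, scenario_3)  # 9000 1500 9000
```

In Scenario 3 the recorded value stays at 9000 because R2 never inspects the Option, which is why the endpoints still need a fallback method.

| rfc9268.original | rfc9268.txt | |||
|---|---|---|---|---|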
| 1.2. Use of the IPv6 Hop-by-Hop Options Header | 1.2. Use of the IPv6 Hop-by-Hop Options Header | |||
| IPv6 as specified in [RFC8200] allows nodes to optionally process the | As specified in [RFC8200], IPv6 allows nodes to optionally process | |||
| Hop-by-Hop header. Specifically, from Section 4: | the Hop-by-Hop header. Specifically, from Section 4 of [RFC8200]: | |||
| * The Hop-by-Hop Options header is not inserted or deleted, but may | | |||
| be examined or processed by any node along a packet's delivery | | |||
| path, until the packet reaches the node (or each of the set of | | |||
| nodes, in the case of multicast) identified in the Destination | | |||
| Address field of the IPv6 header. The Hop-by-Hop Options header, | | |||
| when present, must immediately follow the IPv6 header. Its | | |||
| presence is indicated by the value zero in the Next Header field | | |||
| of the IPv6 header. | | |||
| * NOTE: While [RFC2460] required that all nodes must examine and | \| The Hop-by-Hop Options header is not inserted or deleted, but may | |||
| process the Hop-by-Hop Options header, it is now expected that | \| be examined or processed by any node along a packet's delivery | |||
| nodes along a packet's delivery path only examine and process the | \| path, until the packet reaches the node (or each of the set of | |||
| Hop-by-Hop Options header if explicitly configured to do so. | \| nodes, in the case of multicast) identified in the Destination | |||
| | \| Address field of the IPv6 header. The Hop-by-Hop Options header, | |||
| | \| when present, must immediately follow the IPv6 header. Its | |||
| | \| presence is indicated by the value zero in the Next Header field | |||
| | \| of the IPv6 header. | |||
| | \| | |||
| | \| NOTE: While [RFC2460] required that all nodes must examine and | |||
| | \| process the Hop-by-Hop Options header, it is now expected that | |||
| | \| nodes along a packet's delivery path only examine and process the | |||
| | \| Hop-by-Hop Options header if explicitly configured to do so. | |||
| The Hop-by-Hop Option defined in this document is designed to take | The Hop-by-Hop Option defined in this document is designed to take | |||
| advantage of this property of how Hop-by-Hop options are processed. | advantage of this property of how Hop-by-Hop Options are processed. | |||
| Nodes that do not support this Option SHOULD ignore it. This can | Nodes that do not support this Option SHOULD ignore it. This can | |||
| mean that the Min-PMTU value does not account for all links along a | mean that the Min-PMTU value does not account for all links along a | |||
| path. | path. | |||
| 2. Motivation and Problem Solved | 2. Motivation and Problem Solved | |||
| The current state of Path MTU Discovery on the Internet is | The current state of Path MTU Discovery on the Internet is | |||
| problematic. The mechanisms defined in [RFC8201] are known to not | problematic. The mechanisms defined in [RFC8201] are known to not | |||
| work well in all environments. They fail to work in various cases, | work well in all environments. They fail to work in various cases, | |||
| including when nodes in the middle of the network do not send ICMPv6 | including when nodes in the middle of the network do not send ICMPv6 | |||
| PTB messages, or rate-limited ICMPv6 messages, or do not have a | PTB messages or rate-limited ICMPv6 messages or do not have a return | |||
| return path to the source host. | path to the source host. This results in many transport-layer | |||
| | connections being configured to use smaller packets (e.g., 1280 | |||
| This results in many transport layer connections being configured to | bytes) by default and makes it difficult to take advantage of paths | |||
| use smaller packets (e.g., 1280 bytes) by default and makes it | with a larger PMTU where they do exist. Applications that send large | |||
| difficult to take advantage of paths with a larger PMTU where they do | packets are forced to use IPv6 fragmentation [RFC8200], which can | |||
| exist. Applications that send large packets are forced to use IPv6 | reduce the reliability of Internet communication [RFC8900]. | |||
| Fragmentation [RFC8200], which can reduce the reliability of Internet | | |||
| communication [RFC8900]. | | |||
| Encapsulations and network-layer tunnels further reduce the payload | Encapsulations and network-layer tunnels further reduce the payload | |||
| size available for a transport protocol to use. Also, some use-cases | size available for a transport protocol to use. Also, some use cases | |||
| increase packet overhead, for example, Network Virtualization Using | increase packet overhead, for example, Network Virtualization Using | |||
| Generic Routing Encapsulation (NVGRE) [RFC7637] encapsulates L2 | Generic Routing Encapsulation (NVGRE) [RFC7637] encapsulates Layer 2 | |||
| packets in an outer IP header and does not allow IP Fragmentation. | (L2) packets in an outer IP header and does not allow IP | |||
| | fragmentation. | |||
| Sending larger packets can improve host performance, e.g., avoiding | Sending larger packets can improve host performance, e.g., avoiding | |||
| limits to packet processing by the packet rate. For example, the | limits to packet processing by the packet rate. An example of this | |||
| packet per second rate required to reach wire speed on a 10G link | is how the packet-per-second rate required to reach wire speed on a | |||
| with 1280 byte packets is about 977K packets per second (pps), vs. | 10G link with 1280 byte packets is about 977K packets per second | |||
| 139K pps for 9000 byte packets. | (pps) vs. 139K pps for 9000 byte packets. | |||
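
The packet-rate figures quoted above follow from dividing the link rate by the packet size in bits. The sketch below assumes a nominal 10 Gbit/s line rate and ignores link-layer framing overhead; the function name is hypothetical.

```python
LINK_RATE_BPS = 10_000_000_000  # nominal 10G link, framing overhead ignored


def packets_per_second(packet_bytes: int) -> float:
    """Packets per second needed to fill the link with packets of this size."""
    return LINK_RATE_BPS / (packet_bytes * 8)


pps_1280 = packets_per_second(1280)  # 976562.5, about 977K pps
pps_9000 = packets_per_second(9000)  # about 138889, i.e. about 139K pps
print(round(pps_1280 / 1000), round(pps_9000 / 1000))  # 977 139
```

| rfc9268.original | rfc9268.txt | |||
|---|---|---|---|---|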
| The purpose of this document is to improve the situation by defining | The purpose of this document is to improve the situation by defining | |||
| a mechanism that does not rely on reception of ICMPv6 Packet Too Big | a mechanism that does not rely on reception of ICMPv6 PTB messages | |||
| messages from nodes in the middle of the network. Instead, this | from nodes in the middle of the network. Instead, this provides | |||
| provides information to the destination host about the minimum Path | information to the destination host about the Minimum Path MTU and | |||
| MTU, and sends this information back to the source host. This is | sends this information back to the source host. This is expected to | |||
| expected to work better than the current RFC8201-based mechanisms. | work better than the current mechanisms based on [RFC8201]. | |||
| A similar mechanism was proposed in 1988 for IPv4 in [RFC1063] by | A similar mechanism was proposed in 1988 for IPv4 in [RFC1063] by | |||
| Jeff Mogul, C. Kent, Craig Partridge, and Keith McCloghrie. It was | Jeff Mogul, C. Kent, Craig Partridge, and Keith McCloghrie. It was | |||
| later obsoleted in 1990 by [RFC1191], the current deployed approach | later obsoleted in 1990 by [RFC1191], which is the current deployed | |||
| to Path MTU Discovery. In contrast, the method described in this | approach to Path MTU Discovery. In contrast, the method described in | |||
| document uses the Hop-by-Hop option of IPv6. It does not replace | this document uses the Hop-by-Hop Option of IPv6. It does not | |||
| PMTUD [RFC8201], PLPPMTUD [RFC4821] or Datagram PLPMTUD [RFC8899], | replace PMTUD [RFC8201], Packetization Layer Path MTU Discovery | |||
| but rather is designed to complement these methods. | (PLPMTUD) [RFC4821], or Datagram Packetization Layer PMTU Discovery | |||
| | (DPLPMTUD) [RFC8899] but rather is designed to complement these | |||
| | methods. | |||
| 3. Requirements Language | 3. Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| "OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in | |||
| 14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
| capitals, as shown here. | capitals, as shown here. | |||
| 4. Applicability Statements | 4. Applicability Statements | |||
| The Path MTU option is designed for environments where there is | The Path MTU Option is designed for environments where there is | |||
| control over the hosts and nodes that connect them, and where there | control over the hosts and nodes that connect them and where there is | |||
| is more than one MTU size in use. For example, in Data Centers and | more than one MTU size in use, for example, in data centers and on | |||
| on paths between Data Centers, to allow hosts to better take | paths between data centers to allow hosts to better take advantage of | |||
| advantage of a path that is able to support a large PMTU. | a path that is able to support a large PMTU. | |||
| The design of the option is sufficiently simple that it can be | The design of the Option is so sufficiently simple that it can be | |||
| executed on a router's fast path. A successful experiment depends on | executed on a router's fast path. A successful experiment depends on | |||
| both implementation by host and router vendors and deployment by | both implementation by host and router vendors and deployment by | |||
| operators. The contained use-case of connections within and between | operators. The contained use case of connections within and between | |||
| Data Centers could be a driver for deployment. | data centers could be a driver for deployment. | |||
| The method could also be useful in other environments, including the | The method could also be useful in other environments, including the | |||
| general Internet, and offers advantage when this Hop-by-Hop Option is | general Internet, and offers an advantage when this Hop-by-Hop Option | |||
| supported on all paths. The method is more robust when used to probe | is supported on all paths. The method is more robust when used to | |||
| the path using packets that do not carry application data and when | probe the path using packets that do not carry application data and | |||
| also paired with a method such as Packetization Layer PMTUD [RFC4821] | when also paired with a method like Packetization Layer PMTUD | |||
| or Datagram PLPMTUD [RFC8899]. | [RFC4821] or Datagram Packetization Layer PMTU Discovery (DPLPMTUD) | |||
| | [RFC8899]. | |||
| 5. IPv6 Minimum Path MTU Hop-by-Hop Option | 5. IPv6 Minimum Path MTU Hop-by-Hop Option | |||
| The Minimum Path MTU Hop-by-Hop Option has the following format: | The Minimum Path MTU Hop-by-Hop Option has the following format: | |||
| Option Option Option | Option Option Option | |||
| Type Data Len Data | Type Data Len Data | |||
| +--------+--------+--------+--------+---------+-------+-+ | +--------+--------+--------+--------+---------+-------+-+ | |||
| \|BBCTTTTT\|00000100\| Min-PMTU \| Rtn-PMTU \|R\| | \|BBCTTTTT\|00000100\| Min-PMTU \| Rtn-PMTU \|R\| | |||
| +--------+--------+--------+--------+---------+-------+-+ | +--------+--------+--------+--------+---------+-------+-+ | |||
| Option Type (see Section 4.2 of [RFC8200]): | Figure 2: Format of the Minimum Path MTU Hop-by-Hop Option | |||
| BB 00 Skip over this option and continue processing. | Option Type (see Section 4.2 of [RFC8200]): | |||
| C 1 Option data can change en route to the packet's final | BB 00 Skip over this Option and continue processing. | |||
| | C 1 Option Data can change en route to the packet's final | |||
| destination. | destination. | |||
| TTTTT 10000 Option Type assigned from IANA [IANA-HBH]. | TTTTT 10000 Option Type assigned from IANA [IANA-HBH]. | |||
| Length: 4 The size of the value field in Option Data | Length: 4 The size of the value field in Option Data | |||
| field supports PMTU values from 0 to 65,534 octets, the | field supports PMTU values from 0 to 65,534 | |||
| maximum size represented by the Path MTU option. | octets, the maximum size represented by the | |||
| | Path MTU Option. | |||
| Min-PMTU: n 16-bits. The minimum MTU recorded along the path | Min-PMTU: n 16-bits. The minimum MTU recorded along the path | |||
| in octets, reflecting the smallest link MTU that | in octets, reflecting the smallest link MTU that | |||
| the packet experienced along the path. | the packet experienced along the path. | |||
| A value less than the IPv6 minimum link | A value less than the IPv6 minimum link | |||
| MTU [RFC8200] MUST be ignored. | MTU [RFC8200] MUST be ignored. | |||
| Rtn-PMTU: n 15-bits. The returned Path MTU field, carrying the 15 | Rtn-PMTU: n 15-bits. The returned Path MTU field, carrying the 15 | |||
| most significant bits of the latest received Min-PMTU | most significant bits of the latest received Min-PMTU | |||
| field for the forward path. The value zero means that | field for the forward path. The value zero means that | |||
| no Reported MTU is being returned. | no Reported MTU is being returned. | |||
| R n 1-bit. R-Flag. Set by the source to signal that | R n 1-bit. R-Flag. Set by the source to signal that | |||
| the destination host should include the received | the destination host should include the received | |||
| Rtn-PMTU field updated by the reported Min-PMTU value | Rtn-PMTU field updated by the reported Min-PMTU value | |||
| when the destination host is to send a PMTU Option back | when the destination host is to send a PMTU Option back | |||
| to the source host. | to the source host. | |||
| Figure 3 | | |||
| NOTE: The encoding of the final two octets (Rtn-PMTU and R-Flag) | NOTE: The encoding of the final two octets (Rtn-PMTU and R-Flag) | |||
| could be implemented by a mask of the latest received Min-PMTU value | could be implemented by a mask of the latest received Min-PMTU value | |||
| with 0xFFFE, discarding the right-most bit and then performing a | with 0xFFFE, discarding the right-most bit and then performing a | |||
| logical 'OR' with the R-Flag value of the sender. This encoding fits | logical 'OR' with the R-Flag value of the sender. This encoding fits | |||
| in the minimum-sized Hop-by-Hop Option header. | in the minimum-sized Hop-by-Hop Option header. | |||
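
The masking described in the NOTE can be sketched as follows. This is an illustrative example under stated assumptions: the helper names are hypothetical, the fields are assumed to be packed in network byte order, and the Option Type octet is derived from the bit fields listed above (BB = 00, C = 1, TTTTT = 10000).

```python
import struct

# Option Type octet assembled from the bit fields in the format description.
OPTION_TYPE = (0b00 << 6) | (0b1 << 5) | 0b10000  # evaluates to 0x30
OPTION_DATA_LEN = 4                               # four octets of Option Data


def encode_rtn_field(latest_min_pmtu: int, r_flag: bool) -> int:
    """Keep the 15 most significant bits of Min-PMTU, then OR in the R-Flag."""
    if not 0 <= latest_min_pmtu <= 0xFFFF:
        raise ValueError("Min-PMTU must fit in 16 bits")
    return (latest_min_pmtu & 0xFFFE) | int(r_flag)


def decode_rtn_field(field: int) -> tuple[int, bool]:
    """Split the final two octets back into (Rtn-PMTU, R-Flag)."""
    return field & 0xFFFE, bool(field & 0x0001)


def build_option(min_pmtu: int, latest_min_pmtu: int, r_flag: bool) -> bytes:
    """Pack the six octets: Type, Data Len, Min-PMTU, Rtn-PMTU plus R-Flag."""
    return struct.pack("!BBHH", OPTION_TYPE, OPTION_DATA_LEN,
                       min_pmtu, encode_rtn_field(latest_min_pmtu, r_flag))


option = build_option(min_pmtu=9000, latest_min_pmtu=1500, r_flag=True)
rtn_pmtu, r_flag = decode_rtn_field(struct.unpack("!BBHH", option)[3])
print(option.hex(), rtn_pmtu, r_flag)  # 3004232805dd 1500 True
```

Because the least significant bit is reused for the R-Flag, an odd Min-PMTU value is returned rounded down to the next even value.

| rfc9268.original | rfc9268.txt | |||
|---|---|---|---|---|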
| 6. Router, Host, and Transport Layer Behaviors | 6. Router, Host, and Transport Layer Behaviors | |||
| 6.1. Router Behavior | 6.1. Router Behavior | |||
| Routers that are not configured to support Hop-by-Hop Options are not | Routers that are not configured to support Hop-by-Hop Options are not | |||
| expected to examine or process the contents of this option [RFC8200]. | expected to examine or process the contents of this Option [RFC8200]. | |||
| Routers that support Hop-by-Hop Options, but are not configured to | Routers that support Hop-by-Hop Options but are not configured to | |||
| support this option SHOULD skip over this option and continue to | support this Option SHOULD skip over this Option and continue to | |||
| processing the header [RFC8200]. | process the header [RFC8200]. | |||
| Routers that support this option MUST compare the value of the Min- | Routers that support this Option MUST compare the value of the Min- | |||
| PMTU field with the MTU configured for the outgoing link. If the MTU | PMTU field with the MTU configured for the outgoing link. If the MTU | |||
| of the outgoing link is less than the Min-PMTU, the router rewrites | of the outgoing link is less than the Min-PMTU, the router rewrites | |||
| the Min-PMTU in the Option to use the smaller value. (The router | the Min-PMTU in the Option to use the smaller value. (The router | |||
| processing is performed without checking the valid range of the Min- | processing is performed without checking the valid range of the Min- | |||
| PMTU or the Rtn-PMTU fields.) | PMTU or the Rtn-PMTU fields.) | |||
| A router MUST ignore and MUST NOT change the Rtn-PMTU field or the | A router MUST ignore and MUST NOT change the Rtn-PMTU field or the | |||
| R-Flag in the option. | R-Flag in the Option. | |||
| 6.2. Host Operating System Behavior | 6.2. Host Operating System Behavior | |||
| The PMTU entry associated with the destination in the host's | The PMTU entry associated with the destination in the host's | |||
| destination cache [RFC4861] SHOULD be updated after detecting a | destination cache [RFC4861] SHOULD be updated after detecting a | |||
| change using the IPv6 Minimum Path MTU Hop-by-Hop Option. This | change using the IPv6 Minimum Path MTU Hop-by-Hop Option. This | |||
| cached value can be used by other flows that share the host's | cached value can be used by other flows that share the host's | |||
| destination cache. | destination cache. | |||
| The value in the host destination cache SHOULD be used by PLPMTUD to | The value in the host destination cache SHOULD be used by PLPMTUD to | |||
| select an initial PMTU for a flow. The cached PMTU is only increased | select an initial PMTU for a flow. The cached PMTU is only increased | |||
| by PLPMTUD when the Packetization Layer determines the path actually | by PLPMTUD when the Packetization Layer determines the path actually | |||
| supports a larger PMTU [RFC4821] [RFC8899]. | supports a larger PMTU [RFC4821] [RFC8899]. | |||
| When requested to send an IPv6 packet with the MinPMTU HBH option, | When requested to send an IPv6 packet with the MinPMTU HBH Option, | |||
| the source host includes the option in an outgoing packet. The | the source host includes the Option in an outgoing packet. The | |||
| source host MUST fill the Min-PMTU field with the MTU configured for | source host MUST fill the Min-PMTU field with the MTU configured for | |||
| the link over which it will send the packet on the next hop towards | the link over which it will send the packet on the next hop towards | |||
| the destination host. | the destination host. | |||
| When a host includes the option in a packet it sends, the host SHOULD | When a host includes the Option in a packet it sends, the host SHOULD | |||
| set the Rtn-PMTU field to the previously cached value of the received | set the Rtn-PMTU field to the previously cached value of the received | |||
| Minimum Path MTU for the flow (see | Minimum Path MTU for the flow (see | |||
| Section 6.3.3). If this value is not set (for example, because there | Section 6.3.3). If this value is not set (for example, because there | |||
| is no cached reported Min-PMTU value), the Rtn-PMTU field value MUST | is no cached reported Min-PMTU value), the Rtn-PMTU field value MUST | |||
| be set to zero. | be set to zero. | |||
| The source host MAY request the destination host to return the | The source host MAY request the destination host to return the | |||
| reported Min-PMTU value by setting the R-Flag in the option of an | reported Min-PMTU value by setting the R-Flag in the Option of an | |||
| outgoing packet. The R-Flag SHOULD NOT be set when the MinPMTU HBH | outgoing packet. The R-Flag SHOULD NOT be set when the MinPMTU HBH | |||
| Option was sent solely to provide requested feedback on the return | Option was sent solely to provide requested feedback on the return | |||
| Path MTU to avoid each response generating another response. | Path MTU to avoid each response generating another response. | |||
| The destination host controls when to send a packet with this option | The destination host controls when to send a packet with this Option | |||
| in response to an R-flag, as well as which packets to include it in. | in response to an R-Flag, as well as which packets to include it in. | |||
| The destination host MAY limit the rate at which it sends these | The destination host MAY limit the rate at which it sends these | |||
| packets. | packets. | |||
| A destination host only sets the R Flag if it wishes the source host | A destination host only sets the R-Flag if it wishes the source host | |||
| to also return the discovered PMTU value for the path from the | to also return the discovered PMTU value for the path from the | |||
| destination to the source. | destination to the source. | |||
| The normal sequence of operation of the R-Flag using the terminology | The normal sequence of operation of the R-Flag using the terminology | |||
| from the diagram in Figure 1 is: | from the diagram in Figure 1 is: | |||
| 1. The source sends a probe to the destination. The sender sets the | 1. The source sends a probe to the destination. The sender sets the | |||
| R-Flag. | R-Flag. | |||
| 2. The destination responds by sending a probe including the | 2. The destination responds by sending a probe including the | |||
| received Min-PMTU as the Rtn-PMTU. A destination that does not | received Min-PMTU as the Rtn-PMTU. A destination that does not | |||
| wish to probe the return path sets the R-Flag to 0. | wish to probe the return path sets the R-Flag to 0. | |||
| 6.3. Transport Layer Behavior | 6.3. Transport Layer Behavior | |||
| This Hop-by-Hop option is intended to be used with a path MTU | This Hop-by-Hop Option is intended to be used with a Path MTU | |||
| discovery method. | Discovery method. | |||
| PLPMTUD [RFC9000] uses probe packets for two distinct functions: | PLPMTUD [RFC8899] uses probe packets for two distinct functions: | |||
| * Probe packets are used to confirm connectivity. Such probes can | * Probe packets are used to confirm connectivity. Such probes can | |||
| be of any size up to the PLPMTU. These probe packets are sent to | be of any size up to the Packetization Layer Path MTU (PLPMTU). | |||
| solicit a response use the path to the remote node. These probe | These probe packets are sent to solicit a response using the path | |||
| packets can carry the Hop-by-Hop PMTU option, providing the final | to the remote node. These probe packets can carry the Hop-by-Hop | |||
| size of the packet does not exceed the current PLPMTU. After | PMTU Option, providing the final size of the packet does not | |||
| validating that the packet originates from the path (section | exceed the current PLPMTU. After validating that the packet | |||
| 4.6.1), the PLPMTUD method can use the reported size from the Hop- | originates from the path (Section 4.6.1 of [RFC8899]), the PLPMTUD | |||
| by-Hop option as the next search point when it resumes the search | method can use the reported size from the Hop-by-Hop Option as the | |||
| algorithm. (This use resembles the use of the PTB_SIZE | next search point when it resumes the search algorithm. (This use | |||
| information in section 4.6.2 of [RFC8899] | resembles the use of the PTB_SIZE information in Section 4.6.2 of | |||
| [RFC8899].) | ||||
| * A second use of probe packets is to explore if a path supports a | * A second use of probe packets is to explore if a path supports a | |||
| packet size greater than the current PLPMTU. If this probe packet | packet size greater than the current PLPMTU. If this probe packet | |||
| is successfully delivered (as determined by the source host), then | is successfully delivered (as determined by the source host), then | |||
| the PLPMTU is raised to the size of the successful probe. These | the PLPMTU is raised to the size of the successful probe. These | |||
| probe packets do not usually set the Path MTU Hop-by-Hop option. | probe packets do not usually set the Path MTU Hop-by-Hop Option. | |||
| See Section 1.2 of [RFC8899]. Section 4.1 of [RFC8899] also | ||||
| See section 1.2 of [RFC8899]. Section 4.1 of [RFC8899] also | describes ways that a probe packet can be constructed, depending | |||
| describes ways that a Probe Packet can be constructed, depending | ||||
| on whether the probe packets carry application data. | on whether the probe packets carry application data. | |||
| * The PMTU Hop-by-Hop Option Probe can be sent on packets that | The PMTU Hop-by-Hop Option probe can be sent on packets that include | |||
| include application data, but needs to be robust to potential loss | application data but needs to be robust to potential loss of the | |||
| of the packet (i.e., with the possibility that retransmission | packet (i.e., with the possibility that retransmission might be | |||
| might be needed if the packet is lost). | needed if the packet is lost). | |||
| * Using a PMTU Probe on packets that do not carry application data | Using a PMTU probe on packets that do not carry application data will | |||
| will avoid the need for loss recovery if a router on the path | avoid the need for loss recovery if a router on the path drops | |||
| drops packets that set this option. (This avoids the transport | packets that set this Option. (This avoids the transport needing to | |||
| needing to retransmit a lost packet that includes this option.) | retransmit a lost packet that includes this Option.) This is the | |||
| This is the normal default format for both uses of probes. | normal default format for both uses of probes. | |||
| 6.3.1. Including the Option in an Outgoing Packet | 6.3.1. Including the Option in an Outgoing Packet | |||
| The upper layer protocol can request the MinPMTU HBH option to be | The upper-layer protocol can request the MinPMTU HBH Option to be | |||
| included in an outgoing IPv6 packet. A transport protocol (or upper | included in an outgoing IPv6 packet. A transport protocol (or upper- | |||
| layer protocol) can include this option only on specific packets used | layer protocol) can include this Option only on specific packets used | |||
| to test the path. This option does not need to be included in all | to test the path. This Option does not need to be included in all | |||
| packets belonging to a flow. | packets belonging to a flow. | |||
| NOTE: Including this option in a large packet (e.g., one larger than | NOTE: Including this Option in a large packet (e.g., one larger than | |||
| the present PMTU) is not likely to be useful, since the large packet | the present PMTU) is not likely to be useful, since the large packet | |||
| would itself be dropped by any link along the path with a smaller | would itself be dropped by any link along the path with a smaller | |||
| MTU, preventing the Min-PMTU information from reaching the | MTU, preventing the Min-PMTU information from reaching the | |||
| destination host. | destination host. | |||
| Discussion: | Discussion: | |||
| * In the case of TCP, the option could be included in a packet that | * In the case of TCP, the Option could be included in a packet that | |||
| carries a TCP segment sent after the connection is established. A | carries a TCP segment sent after the connection is established. A | |||
| segment without data could be used, to avoid the need to | segment without data could be used to avoid the need to retransmit | |||
| retransmit this data if the probe packet is lost. The discovered | this data if the probe packet is lost. The discovered value can | |||
| value can be used to inform PLPMTUD [RFC4821]. | be used to inform PLPMTUD [RFC4821]. | |||
| NOTE: A TCP SYN can also negotiate the Maximum Segment Size (MSS), | NOTE: A TCP SYN can also negotiate the Maximum Segment Size (MSS), | |||
| which acts as an upper limit to the packet size that can be sent | which acts as an upper limit to the packet size that can be sent | |||
| by a TCP sender. If this option were to be included in a TCP SYN, | by a TCP sender. If this Option were to be included in a TCP SYN, | |||
| it could increase the probability that the SYN segment is lost | it could increase the probability that the SYN segment is lost | |||
| when routers on the path drop packets with this option (see | when routers on the path drop packets with this Option (see | |||
| Section 6.3.6), which could have an unwanted impact on the result | Section 6.3.6), which could have an unwanted impact on the result | |||
| of racing options [I-D.ietf-taps-arch] or feature negotiation. | of racing Options [TAPS-ARCH] or feature negotiation. | |||
| * The use with datagram transport protocols (e.g., UDP) is harder to | * The use with datagram transport protocols (e.g., UDP) is harder to | |||
| characterize because applications using datagram transports range | characterize because applications using datagram transports range | |||
| from very short-lived (low data-volume applications) exchanges, to | from very short-lived (low data-volume applications) exchanges to | |||
| longer (bulk) exchanges of packets between the source and | longer (bulk) exchanges of packets between the source and | |||
| destination hosts [RFC8085]. | destination hosts [RFC8085]. | |||
| * Simple-exchange protocols (i.e., low data-volume applications | * Simple-exchange protocols (i.e., low data-volume applications | |||
| [RFC8085] that only send one or a few packets per transaction), | [RFC8085] that only send one or a few packets per transaction) | |||
| might assume that the PMTU is symmetrical. That is, the PMTU is | might assume that the PMTU is symmetrical. That is, the PMTU is | |||
| the same in both directions, or at least not smaller for the | the same in both directions or at least not smaller for the return | |||
| return path. This optimization does not hold when the paths are | path. This optimization does not hold when the paths are not | |||
| not symmetric. | symmetric. | |||
| * The MinPMTU HBH option can be used with ICMPv6 [RFC4443]. This | * The MinPMTU HBH Option can be used with ICMPv6 [RFC4443]. This | |||
| requires a response from the remote node and therefore is | requires a response from the remote node and therefore is | |||
| restricted to use with ICMPv6 echo messages. The MinPMTU HBH | restricted to use with ICMPv6 echo messages. The MinPMTU HBH | |||
| option could provide additional information about the PMTU that | Option could provide additional information about the PMTU that | |||
| might be supported by a path. This could be use as a diagnostic | might be supported by a path. This could be used as a diagnostic | |||
| tool to measure the PMTU of a path. As with other uses, the | tool to measure the PMTU of a path. As with other uses, the | |||
| actual supported PMTU is only confirmed after receiving a response | actual supported PMTU is only confirmed after receiving a response | |||
| to a subsequent probe of the PMTU size. | to a subsequent probe of the PMTU size. | |||
| * A datagram transport can utilise DPLPMTUD [RFC8899]. For example, | * A datagram transport can utilize DPLPMTUD [RFC8899]. For example, | |||
| QUIC (see section 14.3 of [RFC9000]), can use DPLPMTUD to | QUIC (see Section 14.3 of [RFC9000]) can use DPLPMTUD to determine | |||
| determine whether the path to a destination will support a desired | whether the path to a destination will support a desired maximum | |||
| maximum datagram size. When using the IPv6 MinPMTU HBH option, | datagram size. When using the IPv6 MinPMTU HBH Option, the Option | |||
| the option could be added to an additional QUIC PMTU Probe that is | could be added to an additional QUIC PMTU probe that is of minimal | |||
| of minimal size (or one no larger than the currently supported | size (or one no larger than the currently supported PMTU size). | |||
| PMTU size). Once the return Path MTU value in the MinPMTU HBH | Once the return Path MTU value in the MinPMTU HBH Option has been | |||
| option has been learned, DPLPMTUD can be triggered to test for a | learned, DPLPMTUD can be triggered to test for a larger PLPMTU | |||
| larger PLPMTU using an appropriately sized PLPMTU Probe Packet | using an appropriately sized PLPMTU probe packet (see | |||
| (see section 5.3.1 of [RFC8899]). | Section 5.3.1 of [RFC8899]). | |||
| * The use of this option with DNS and DNSSEC over UDP is expected to | * The use of this Option with DNS and DNSSEC over UDP is expected to | |||
| work for paths where the PMTU is symmetric. The DNS server will | work for paths where the PMTU is symmetric. The DNS server will | |||
| learn the PMTU from the DNS query messages. If the Rtn-PMTU value | learn the PMTU from the DNS query messages. If the Rtn-PMTU value | |||
| is smaller, then a large DNSSEC response might be dropped and the | is smaller, then a large DNSSEC response might be dropped and the | |||
| known problems with PMTUD will then occur. DNS and DNSSEC over | known problems with PMTUD will then occur. DNS and DNSSEC over | |||
| transport protocols that can carry the PMTU ought to work. | transport protocols that can carry the PMTU ought to work. | |||
| * This method also can be used with Anycast to discover the PMTU of | * This method also can be used with anycast to discover the PMTU of | |||
| the path, but the user needs to be aware that the Anycast binding | the path, but the user needs to be aware that the anycast binding | |||
| might change. | might change. | |||
| 6.3.2. Validation of the Packet that includes the Option | 6.3.2. Validation of the Packet that Includes the Option | |||
| An upper layer protocol (e.g., transport endpoint) using this option | An upper-layer protocol (e.g., transport endpoint) using this Option | |||
| needs to provide protection from data injection attacks by off-path | needs to provide protection from data injection attacks by off-path | |||
| devices [RFC8085]. This requires a method to assure that the | devices [RFC8085]. This requires a method to assure that the | |||
| information in the Option Data is provided by a node on the path. | information in the Option Data is provided by a node on the path. | |||
| This validates that the packet forms a part of an existing flow, | This validates that the packet forms a part of an existing flow, | |||
| using context available at the upper layer. For example, a TCP | using context available at the upper layer. For example, a TCP | |||
| connection or UDP application that maintains the related state and | connection or UDP application that maintains the related state and | |||
| uses a randomized ephemeral port would provide this basic validation | uses a randomized ephemeral port would provide this basic validation | |||
| to protect from off-path data injection, see Section 5.1 of | to protect from off-path data injection; see Section 5.1 of | |||
| [RFC8085]. IPsec [RFC4301] and TLS [RFC8446] provide greater | [RFC8085]. IPsec [RFC4301] and TLS [RFC8446] provide greater | |||
| assurance. | assurance. | |||
| The upper layer discards any received packet when the packet | The upper layer discards any received packet when the packet | |||
| validation fails. When packet validation fails, the upper layer MUST | validation fails. When packet validation fails, the upper layer MUST | |||
| also discard the associated Option Data from the MinPMTU HBH option | also discard the associated Option Data from the MinPMTU HBH Option | |||
| without further processing. | without further processing. | |||
| 6.3.3. Receiving the Option | 6.3.3. Receiving the Option | |||
| For a connection-oriented upper layer protocol, caching of the | For a connection-oriented upper-layer protocol, caching of the | |||
| received Min-PMTU could be implemented by saving the value in the | received Min-PMTU could be implemented by saving the value in the | |||
| connection context at the transport layer. A connection-less upper | connection context at the transport layer. A connectionless upper | |||
| layer (e.g., one using UDP), requires the upper layer protocol to | layer (e.g., one using UDP) requires the upper-layer protocol to | |||
| cache the value for each flow it uses. | cache the value for each flow it uses. | |||
| A destination host that receives a MinPMTU HBH Option with the R-Flag | A destination host that receives a MinPMTU HBH Option with the R-Flag | |||
| SHOULD include the MinPMTU HBH option in the next outgoing IPv6 | SHOULD include the MinPMTU HBH Option in the next outgoing IPv6 | |||
| packet for the corresponding flow. | packet for the corresponding flow. | |||
| A simple mechanism could only include this option (with the Rtn-PMTU | A simple mechanism could only include this Option (with the Rtn-PMTU | |||
| field set) the first time this option is received or when it notifies | field set) the first time this Option is received or when it notifies | |||
| a change in the Minimum Path MTU. This limits the number of packets | a change in the Minimum Path MTU. This limits the number of packets, | |||
| including the option packets that are sent. However, this does not | including the Option packets, that are sent. However, this does not | |||
| provide robustness to packet loss or recovery after a sender loses | provide robustness to packet loss or recovery after a sender loses | |||
| state. | state. | |||
| Discussion: | Discussion: | |||
| * Some upper layer protocols send packets less frequently than the | * Some upper-layer protocols send packets less frequently than the | |||
| rate at which the host receives packets. This provides less | rate at which the host receives packets. This provides less | |||
| frequent feedback of the received Rtn-PMTU value. However, a host | frequent feedback of the received Rtn-PMTU value. However, a host | |||
| always sends the most recent Rtn-PMTU value. | always sends the most recent Rtn-PMTU value. | |||
| 6.3.4. Using the Rtn-PMTU Field | 6.3.4. Using the Rtn-PMTU Field | |||
| The Rtn-PMTU field provides an indication of the PMTU from on-path | The Rtn-PMTU field provides an indication of the PMTU from on-path | |||
| routers. It does not necessarily reflect the actual PMTU between the | routers. It does not necessarily reflect the actual PMTU between the | |||
| source and destination hosts. Care therefore needs to be exercised | source and destination hosts. Care therefore needs to be exercised | |||
| in using the Rtn-PMTU value. Specifically: | in using the Rtn-PMTU value. Specifically: | |||
| * The actual PMTU can be lower than the Rtn-PMTU value because the | * The actual PMTU can be lower than the Rtn-PMTU value because the | |||
| Min-PMTU field was not updated by a router on the path that did | Min-PMTU field was not updated by a router on the path that did | |||
| not process the option. | not process the Option. | |||
| * The actual PMTU may be lower than the Rtn-PMTU value because there | * The actual PMTU may be lower than the Rtn-PMTU value because there | |||
| is a layer-2 device with a lower MTU. | is a Layer 2 device with a lower MTU. | |||
| * The actual PMTU may be larger than the Rtn-PMTU value because of a | * The actual PMTU may be larger than the Rtn-PMTU value because of a | |||
| corrupted, delayed or mis-ordered response. A source host MUST | corrupted, delayed, or misordered response. A source host MUST | |||
| ignore a Rtn-PMTU value larger than the MTU configured for the | ignore a Rtn-PMTU value larger than the MTU configured for the | |||
| outgoing link. | outgoing link. | |||
| * The path might have changed between the time when the probe was | * The path might have changed between the time when the probe was | |||
| sent and when the Rtn-PMTU value was received. | sent and when the Rtn-PMTU value was received. | |||
| IPv6 requires that every link in the Internet have an MTU of 1280 | IPv6 requires that every link in the Internet have an MTU of 1280 | |||
| octets or greater. A node MUST ignore a Rtn-PMTU value less than | octets or greater. A node MUST ignore a Rtn-PMTU value less than | |||
| 1280 octets [RFC8200]. | 1280 octets [RFC8200]. | |||
| To avoid unintentional dropping of packets that exceed the actual | To avoid unintentional dropping of packets that exceed the actual | |||
| PMTU (e.g., Scenario 3 in Section 1.1), the source host can delay | PMTU (e.g., Scenario 3 in Section 1.1), the source host can delay | |||
| increasing the PMTU until a probe packet with the size of the Rtn- | increasing the PMTU until a probe packet with the size of the Rtn- | |||
| PMTU value has been successfully acknowledged by the upper layer, | PMTU value has been successfully acknowledged by the upper layer, | |||
| confirming that the path supports the larger PMTU. This probing | confirming that the path supports the larger PMTU. This probing | |||
| increases robustness, but adds one additional path round trip time | increases robustness but adds one additional path round-trip time | |||
| before the PMTU is updated. This use resembles that of PTB messages | before the PMTU is updated. This use resembles that of PTB messages | |||
| in section 4.6 of DPLPMTUD [RFC8899] (with the important difference | in Section 4.6 of DPLPMTUD [RFC8899] (with the important difference | |||
| that a PTB message can only seek to lower the PMTU, whereas this | being that a PTB message can only seek to lower the PMTU, whereas | |||
| option could trigger a probe packet to seek to increase the PMTU.) | this Option could trigger a probe packet to seek to increase the | |||
| PMTU). | ||||
| Section 5.2 of [RFC8201] provides guidance on the caching of PMTU | Section 5.2 of [RFC8201] provides guidance on the caching of PMTU | |||
| information and also the relation to IPv6 flow labels. | information and also the relation to IPv6 flow labels. | |||
| Implementations should consider the impact of Equal Cost Multipath | Implementations should consider the impact of Equal-Cost Multipath | |||
| (ECMP) [RFC6438]. Specifically, whether a PMTU ought to be | (ECMP) [RFC6438], specifically, whether a PMTU ought to be maintained | |||
| maintained for each transport endpoint, or for each network address. | for each transport endpoint or for each network address. | |||
| 6.3.5. Detecting Path Changes | 6.3.5. Detecting Path Changes | |||
| Path characteristics can change and the actual PMTU could increase or | Path characteristics can change, and the actual PMTU could increase | |||
| decrease over time. For instance, following a path change when | or decrease over time, for instance, following a path change when | |||
| packets are forwarded over a link with a different MTU than that | packets are forwarded over a link with a different MTU than that | |||
| previously used. To bound the delay in discovering an increase in | previously used. To bound the delay in discovering an increase in | |||
| the actual PMTU, a host with a link MTU larger than the current PMTU | the actual PMTU, a host with a link MTU larger than the current PMTU | |||
| SHOULD periodically send the MinPMTU HBH Option with the R-Flag set. | SHOULD periodically send the MinPMTU HBH Option with the R-Flag set. | |||
| DPLPMTUD provides recommendations concerning how this could be | DPLPMTUD provides recommendations concerning how this could be | |||
| implemented (see Section 5.3 of [RFC8899]). Since the option | implemented (see Section 5.3 of [RFC8899]). Since the Option | |||
| consumes less capacity than a full-sized probe packet, there can be | consumes less capacity than a full-sized probe packet, there can be | |||
| advantage in using this to detect a change in the path | an advantage in using this to detect a change in the path | |||
| characteristics. | characteristics. | |||
| 6.3.6. Detection of Dropping Packets that include the Option | 6.3.6. Detection of Dropping Packets that Include the Option | |||
| There is evidence that some middleboxes drop packets that include | There is evidence that some middleboxes drop packets that include | |||
| Hop-by-Hop options. For example, a firewall might drop a packet that | Hop-by-Hop Options. For example, a firewall might drop a packet that | |||
| carries an unknown extension header or option. This practice is | carries an unknown extension header or Option. This practice is | |||
| expected to decrease as an option becomes more widely used. It could | expected to decrease as an Option becomes more widely used. It could | |||
| result in generation of an ICMPv6 message indicating the problem. | result in the generation of an ICMPv6 message that indicates the | |||
| This could be used to (temporarily) suspend use of this option. | problem. This could be used to (temporarily) suspend use of this | |||
| Option. | ||||
| A middlebox that silently discards a packet with this option results | A middlebox that silently discards a packet with this Option results | |||
| in dropping of any packet using the option. This dropping can be | in the dropping of any packet using the Option. This dropping can be | |||
| avoided by appropriate configuration in a controlled environment, | avoided by appropriate configuration in a controlled environment, | |||
| such as within a data centre, but needs to be considered for Internet | such as within a data center, but it needs to be considered for | |||
| usage. Section 6.2 recommends that this option is not used on | Internet usage. Section 6.2 recommends that this Option is not used | |||
| packets where loss might adversely impact performance. | on packets where loss might adversely impact performance. | |||
| 7. IANA Considerations | 7. IANA Considerations | |||
| IANA has assigned and registered an IPv6 Hop-by-Hop Option type with | IANA has registered an IPv6 Hop-by-Hop Option type in the | |||
| Temporary status from the "Destination Options and Hop-by-Hop | "Destination Options and Hop-by-Hop Options" registry within the | |||
| Options" registry [IANA-HBH]. This assignment is shown in Section 5. | "Internet Protocol Version 6 (IPv6) Parameters" registry group | |||
| [IANA-HBH]. This assignment is shown in Section 5. | ||||
| IANA is requested to update this registry to point to this document | ||||
| and remove the Temporary status. | ||||
| 8. Security Considerations | 8. Security Considerations | |||
| This section discusses the security considerations. It first reviews | This section discusses the security considerations. It first reviews | |||
| router option processing. It then reviews host processing when | router Option processing. It then reviews host processing when | |||
| receiving this option at the network layer. It then considers two | receiving this Option at the network layer. It then considers two | |||
| ways in which the Option Data can be processed, followed by two | ways in which the Option Data can be processed, followed by two | |||
| approaches for using the Option Data. Finally, it discusses | approaches for using the Option Data. Finally, it discusses | |||
| middlebox implications related to use in the general Internet. | middlebox implications related to use in the general Internet. | |||
| 8.1. Router Option Processing | 8.1. Router Option Processing | |||
| This option shares the characteristics of all other IPv6 Hop-by-Hop | This Option shares the characteristics of all other IPv6 Hop-by-Hop | |||
| Options, in that if not supported at line rate it could be used to | Options, in that, if not supported at line rate, it could be used to | |||
| degrade the performance of a router. This option, while simple, is | degrade the performance of a router. This Option, while simple, is | |||
| no different to other uses of IPv6 Hop-by-Hop options. | no different than other uses of IPv6 Hop-by-Hop Options. | |||
| It is common for routers to ignore the Hop-by-Hop Option header or | It is common for routers to ignore the Hop-by-Hop Option header or to | |||
| drop packets containing a Hop-by-Hop Option header. Routers | drop packets containing a Hop-by-Hop Option header. Routers | |||
| implementing IPv6 according to [RFC8200] only examine and process the | implementing IPv6 according to [RFC8200] only examine and process the | |||
| Hop-by-Hop Options header if explicitly configured to do so. | Hop-by-Hop Options header if explicitly configured to do so. | |||
| 8.2. Network Layer Host Processing | 8.2. Network-Layer Host Processing | |||
| A malicious attacker can forge a packet directed at a host that | A malicious attacker can forge a packet directed at a host that | |||
| carries the MinPMTU HBH option. By design, the fields of this IP | carries the MinPMTU HBH Option. By design, the fields of this IP | |||
| option can be modified by the network. | Option can be modified by the network. | |||
| In contrast to the ICMPv6 Packet Too Big message used in [RFC8201] | In contrast to the ICMPv6 PTB message used in Path MTU Discovery | |||
| Path MTU Discovery, the source host has an inherent trust | [RFC8201], the source host has an inherent trust relationship with | |||
| relationship with the destination host that includes this option. This | the destination host that includes this Option. This trust | |||
| trust relationship can be used to help verify the option. ICMPv6 | relationship can be used to help verify the Option. ICMPv6 PTB | |||
| Packet Too Big messages are sent from any router on the path to the | messages are sent from any router on the path to the destination | |||
| destination host, the source host has no prior knowledge of these | host. The source host has no prior knowledge of these routers | |||
| routers (except for the first hop router). | (except for the first hop router). | |||
| Reception of this packet will require processing as the network stack | Reception of this packet will require processing as the network stack | |||
| parses the packet before the packet is delivered to the upper layer | parses the packet before the packet is delivered to the upper-layer | |||
| protocol. This network layer option processing is normally completed | protocol. This network-layer Option processing is normally completed | |||
| before any upper layer protocol delivery checks are performed. | before any upper-layer protocol delivery checks are performed. | |||
| The network layer does not normally have sufficient information to | The network layer does not normally have sufficient information to | |||
| validate that the packet carrying an option originated from the | validate that the packet carrying an Option originated from the | |||
| destination (or an on-path node). It also does not typically have | destination (or an on-path node). It also does not typically have | |||
| sufficient context to demultiplex the packet to identify the related | sufficient context to demultiplex the packet to identify the related | |||
| transport flow. This can mean that any changes resulting from | transport flow. This can mean that any changes resulting from | |||
| reception of the option apply to all flows between a pair of | reception of the Option apply to all flows between a pair of | |||
| endpoints. | endpoints. | |||
| These considerations are no different to other uses of Hop-by-Hop | These considerations are no different than other uses of Hop-by-Hop | |||
| options, and this is the use case for PMTUD. The following section | Options, and this is the use case for PMTUD. The following section | |||
| describes a mitigation for this attack. | describes a mitigation for this attack. | |||
| 8.3. Validating use of the Option Data | 8.3. Validating Use of the Option Data | |||
| Transport protocols should be designed to provide protection from | Transport protocols should be designed to provide protection from | |||
| data injection attacks by off-path devices and mechanisms should be | data injection attacks by off-path devices, and mechanisms should be | |||
| described in the Security Considerations for each transport | described in the Security Considerations section for each transport | |||
| specification (see Section 5.1 of the UDP Guidelines [RFC8085]). For | specification (see Section 5.1 of "UDP Usage Guidelines" [RFC8085]). | |||
| example, a TCP or UDP application that maintains the related state | For example, a TCP or UDP application that maintains the related | |||
| and uses a randomized ephemeral port would provide basic protection. | state and uses a randomized ephemeral port would provide basic | |||
| TLS [RFC8446] or IPsec [RFC4301] provide cryptographic | protection. TLS [RFC8446] or IPsec [RFC4301] provide cryptographic | |||
| authentication. An upper layer protocol that validates each received | authentication. An upper-layer protocol that validates each received | |||
| packet discards any packet when this validation fails. In this case, | packet discards any packet when this validation fails. In this case, | |||
| the host MUST also discard the associated Option Data from the | the host MUST also discard the associated Option Data from the | |||
| MinPMTU HBH option without further processing (Section 6.3). | MinPMTU HBH Option without further processing (Section 6.3). | |||
| A network node on the path has visibility of all packets it forwards. | A network node on the path has visibility of all packets it forwards. | |||
| By observing the network packet payload, the node might be able to | By observing the network packet payload, the node might be able to | |||
| construct a packet that might be validated by the destination host. | construct a packet that might be validated by the destination host. | |||
| Such a node would also be able to drop or limit the flow in other | Such a node would also be able to drop or limit the flow in other | |||
| ways that could be potentially more disruptive. Authenticating the | ways that could be potentially more disruptive. Authenticating the | |||
| packet, for example, using IPsec [RFC4301] or TLS [RFC8446] mitigates | packet, for example, using IPsec [RFC4301] or TLS [RFC8446] mitigates | |||
| this attack. Note that AH style authentication [RFC4302] while | this attack. Note that the authentication style of the | |||
| authenticating the payload and outer IPv6 header, does not check Hop- | Authentication Header (AH) [RFC4302], while authenticating the | |||
| by-Hop options that change on route. | payload and outer IPv6 header, does not check Hop-by-Hop Options that | |||
| change on route. | ||||
| 8.4. Direct use of the Rtn-PMTU Value | 8.4. Direct Use of the Rtn-PMTU Value | |||
| The simplest way to utilize the Rtn-PMTU value is to directly use | The simplest way to utilize the Rtn-PMTU value is to directly use | |||
| this to update the PMTU. This approach results in a set of security | this to update the PMTU. This approach results in a set of security | |||
| issues when the option carries malicious data: | issues when the Option carries malicious data: | |||
| * A direct update of the PMTU using the Rtn-PMTU value could result | * A direct update of the PMTU using the Rtn-PMTU value could result | |||
| in an attacker inflating or reducing the size of the host PMTU for | in an attacker inflating or reducing the size of the host PMTU for | |||
| the destination. Forcing a reduction in the PMTU can decrease the | the destination. Forcing a reduction in the PMTU can decrease the | |||
| efficiency of network use, might increase the number of packets/ | efficiency of network use, might increase the number of packets/ | |||
| fragments required to send the same volume of payload data, and | fragments required to send the same volume of payload data, and | |||
| prevents sending an unfragmented datagram larger than the PMTU. | can prevent sending an unfragmented datagram larger than the PMTU. | |||
| Increasing the PMTU can result in black-holing (see Section 1.1 of | Increasing the PMTU can result in a path silently dropping packets | |||
| [RFC8899]) when the source host sends packets larger than the | (described as a black hole in [RFC8899]) when the source host | |||
| actual PMTU. This persists until the PMTU is next updated. | sends packets larger than the actual PMTU. This persists until | |||
| the PMTU is next updated. | ||||
| * The method can be used to solicit a response from the destination | * The method can be used to solicit a response from the destination | |||
| host. A malicious attacker could forge a packet that causes the | host. A malicious attacker could forge a packet that causes the | |||
| destination to add the option to a packet sent to the source host. | destination to add the Option to a packet sent to the source host. | |||
| A forged value of Rtn-PMTU in the Option Data might also impact | A forged value of Rtn-PMTU in the Option Data might also impact | |||
| the remote endpoint, as described in the previous bullet. This | the remote endpoint, as described in the previous bullet. This | |||
| persists until a valid MinPMTU HBH option is received. This | persists until a valid MinPMTU HBH Option is received. This | |||
| attack could be mitigated by limiting the sending of the MinPMTU | attack could be mitigated by limiting the sending of the MinPMTU | |||
| HBH option in reply to incoming packets that carry the option. | HBH Option in reply to incoming packets that carry the Option. | |||
| 8.5. Using the Rtn-PMTU Value as a Hint for Probing | 8.5. Using the Rtn-PMTU Value as a Hint for Probing | |||
| Another way to utilize the Rtn-PMTU value is to indirectly trigger a | Another way to utilize the Rtn-PMTU value is to indirectly trigger a | |||
| probe to determine if the path supports a PMTU of size Rtn-PMTU. | probe to determine if the path supports a PMTU of size Rtn-PMTU. | |||
| This approach needs context for the flow, and hence assumes an upper | This approach needs context for the flow and hence assumes an upper- | |||
| layer protocol that validates the packet that carries the option (see | layer protocol that validates the packet that carries the Option (see | |||
| Section 8.3). This is the case when used in combination with | Section 8.3). This is the case when used in combination with | |||
| DPLPMTUD [RFC8899]. A set of security considerations results when an | DPLPMTUD [RFC8899]. A set of security considerations results when an | |||
| option carries malicious data: | Option carries malicious data: | |||
| * If the forged packet carries a validated option with a non-zero | * If the forged packet carries a validated Option with a non-zero | |||
| Rtn-PMTU field, the upper layer protocol could utilize the | Rtn-PMTU field, the upper-layer protocol could utilize the | |||
| information in the Rtn-PMTU field. A Rtn-PMTU larger than the | information in the Rtn-PMTU field. A Rtn-PMTU larger than the | |||
| current PMTU can trigger a probe for a new size. | current PMTU can trigger a probe for a new size. | |||
| * If the forged packet carries a non-zero Min-PMTU field, the upper | * If the forged packet carries a non-zero Min-PMTU field, the upper- | |||
| layer protocol would change the cached information about the path | layer protocol would change the cached information about the path | |||
| from the source. The cached information at the destination host | from the source. The cached information at the destination host | |||
| will be overwritten when the host receives another packet that | will be overwritten when the host receives another packet that | |||
| includes a MinPMTU HBH option corresponding to the flow. | includes a MinPMTU HBH Option corresponding to the flow. | |||
| * Processing of the option could cause a destination host to add the | * Processing of the Option could cause a destination host to add the | |||
| MinPMTU HBH option to a packet sent to the source host. This | MinPMTU HBH Option to a packet sent to the source host. This | |||
| option will carry a Rtn-PMTU value that could have been updated by | Option will carry a Rtn-PMTU value that could have been updated by | |||
| the forged packet. The impact of the source host receiving this | the forged packet. The impact of the source host receiving this | |||
| resembles that discussed previously. | resembles that discussed previously. | |||
| 8.6. Impact of Middleboxes | 8.6. Impact of Middleboxes | |||
| There is evidence that some middleboxes drop packets that include | There is evidence that some middleboxes drop packets that include | |||
| Hop-by-Hop options. For example, a firewall might drop a packet that | Hop-by-Hop Options. For example, a firewall might drop a packet that | |||
| carries an unknown extension header or option. This practice is | carries an unknown extension header or Option. This practice is | |||
| expected to decrease as the option becomes more widely used. Methods | expected to decrease as the Option becomes more widely used. Methods | |||
| to address this are discussed in Section 6.3.6. | to address this are discussed in Section 6.3.6. | |||
| When a forged packet causes a packet to be sent including the MinPMTU | When a forged packet causes a packet that includes the MinPMTU HBH | |||
| HBH option, and the return path does not forward packets with this | Option to be sent and the return path does not forward packets with | |||
| option, the packet will be dropped Section 6.3.6. This attack is | this Option, the packet will be dropped (see Section 6.3.6). This | |||
| mitigated by validating the option data before use and by limiting | attack is mitigated by validating the Option Data before use and by | |||
| the rate of responses generated. An upper layer could further | limiting the rate of responses generated. An upper layer could | |||
| mitigate the impact by responding to an R-Flag by including the | further mitigate the impact by responding to an R-Flag by including | |||
| option in a packet that does not carry application data. | the Option in a packet that does not carry application data. | |||
| 9. Experiment Goals | 9. Experiment Goals | |||
| This section describes the experimental goals of this specification. | This section describes the experimental goals of this specification. | |||
| A successful deployment of the method depends upon several components | A successful deployment of the method depends upon several components | |||
| being implemented and deployed: | being implemented and deployed: | |||
| * Support in the sending node (see Section 6.2). This also requires | * Support in the sending node (see Section 6.2). This also requires | |||
| corresponding support in upper layer protocols (see Section 6.3). | corresponding support in upper-layer protocols (see Section 6.3). | |||
| * Router support in nodes (see Section 6.1). The IETF continues to | * Router support in nodes (see Section 6.1). The IETF continues to | |||
| provide recommendations on the use of IPv6 Hop-by-Hop options, for | provide recommendations on the use of IPv6 Hop-by-Hop Options, for | |||
| example Section 2.2.2 of [RFC9099]. This document does not update | example, see Section 2.2.2 of [RFC9099]. This document does not | |||
| the way router implementations configure support for Hop-by-Hop | update the way router implementations configure support for Hop- | |||
| options. | by-Hop Options. | |||
| * Support in the receiving node (see Section 6.3.3). | * Support in the receiving node (see Section 6.3.3). | |||
| Experience from deployment is an expected input to any decision to | Experience from deployment is an expected input to any decision to | |||
| progress this specification from Experimental to IETF Standards | progress this specification from Experimental to IETF Standards | |||
| Track. Appropriate inputs might include: | Track. Appropriate inputs might include: | |||
| * Reports of implementation experience; | * reports of implementation experience, | |||
| * Measurements of the number of paths where the method can be used; | * measurements of the number of paths where the method can be used, or | |||
| * Measurements showing the benefit realized or the implications of | * measurements showing the benefit realized or the implications of | |||
| using specific methods over specific paths. | using specific methods over specific paths. | |||
| 10. Implementation Status | 10. Implementation Status | |||
| At the time this document was published there were two known | At the time this document was published, there were two known | |||
| implementations of the Path MTU Hop-by-Hop option. These are: | implementations of the Path MTU Hop-by-Hop Option. These are: | |||
| * Wireshark dissector. This is shipping in production in Wireshark | * Wireshark dissector. This is shipping in production in Wireshark | |||
| version 3.2 [WIRESHARK]. | version 3.2 [WIRESHARK]. | |||
| * A prototype in the open source version of the FD.io Vector Packet | * A prototype in the open source version of the FD.io Vector Packet | |||
| Processing (VPP) technology [VPP]. At the time this document was | Processing (VPP) technology [VPP]. At the time this document was | |||
| published, the source code could be found at [VPP_SRC]. | published, the source code could be found at [VPP_SRC]. | |||
| 11. Acknowledgments | 11. References | |||
| Helpful comments were received from Tom Herbert, Tom Jones, Fred | ||||
| Templin, Ole Troan, Tianran Zhou, Jen Linkova, Brian Carpenter, Peng | ||||
| Shuping, Mark Smith, Fernando Gont, Michael Dougherty, Erik Kline, | ||||
| and other members of the 6MAN working group. | ||||
| 12. Change log [RFC Editor: Please remove] | ||||
| draft-ietf-6man-mtu-option-15, 2022-May-10 | ||||
| * Correcting an editing mistake in Appendix A. | ||||
| * Editorial Change. | ||||
| draft-ietf-6man-mtu-option-14, 2022-April-15 | ||||
| * Area Director Reviews: | ||||
| - Lars Eggert's Review: Fixed "nits". | ||||
| - Eric Vyncke's Review: Added that this work is focused on | ||||
| Unicast, removed Discussion from Section 6.1, revised text on | ||||
| PLPMTUD probing, changed SHOULD to MUST in Section 6.3.4, and | ||||
| fixed several NITs. | ||||
| - Alvaro Retana's Review: Changed SHOULD language to more general | ||||
| text in Section 6.1 | ||||
| - ARTART Review: Added new Appendix "Examples of Usage" with | ||||
| diagrams showing examples of use. | ||||
| - Zaheduzzaman Sarker's Review: Fixed some editorial issues, and | ||||
| updated SHOULD language. | ||||
| * Editorial Changes. | ||||
| draft-ietf-6man-mtu-option-13, 2022-February-28 | ||||
| * Area Directorate Reviews: | ||||
| - SECDIR Review: Fixed "nit". | ||||
| - TSVART Review: Restructured Section 6 including making | ||||
| Transport Behavior more prominent, added text about ICMPv6 to | ||||
| Section 6.3.1, moved the text about prior work in RFC1063 to | ||||
| Section 2. | ||||
| - GENART Review: Added text to Section 1 that this option was | ||||
| designed to work with packet sizes that can be specified in the | ||||
| IPv6 Header. | ||||
| * Editorial Changes. | ||||
| draft-ietf-6man-mtu-option-12, 2022-January-26 | ||||
| * Clarified a few issues raised by AD review by Erik Kline AD | ||||
| review. | ||||
| draft-ietf-6man-mtu-option-11, 2021-September-30 | ||||
| * Clarifications and editorial changes to the Security | ||||
| Considerations section based on early AD review by Erik Kline. | ||||
| draft-ietf-6man-mtu-option-10, 2021-September-27 | ||||
| * Clarifications and editorial changes based on second chair review | ||||
| by Ole Troan. | ||||
| * Editorial changes. | ||||
| draft-ietf-6man-mtu-option-09, 2021-September-23 | ||||
| * Clarifications and editorial changes based on review by Michael | ||||
| Dougherty. | ||||
| draft-ietf-6man-mtu-option-08, 2021-September-7 | ||||
| * Clarifications and editorial changes based on chair review by Ole | ||||
| Troan. | ||||
| * Correction and clarifications based on review by Fernando Gont. | ||||
| draft-ietf-6man-mtu-option-07, 2021-August-31 | ||||
| * Added Experiment Goals section. | ||||
| * Added Implementation Status section. | ||||
| * Updated the IANA Considerations section to point to this document | ||||
| and remove Temporary status. | ||||
| * Clarifications and editorial changes based on review by Mark | ||||
| Smith. | ||||
| draft-ietf-6man-mtu-option-06, 2021-August-7 | ||||
| * Transport usage of the mechanism clarified in response to feedback | ||||
| and suggestions from Jen Linkova. | ||||
| * Restructured Section 6 to improve readability. | ||||
| * Editorial changes. | ||||
| draft-ietf-6man-mtu-option-05, 2021-April-28 | ||||
| * Editorial changes. | ||||
| draft-ietf-6man-mtu-option-04, 2020-Oct-23 | ||||
| * Fixes for typos. | ||||
| draft-ietf-6man-mtu-option-03, 2020-Sept-14 | ||||
| * Rewrite to make text and terminology more consistent. | ||||
| * Added the notion of validating the packet before use of the HBH | ||||
| option data. | ||||
| * Method aligned with the way common APIs send/receive HBH option | ||||
| data. | ||||
| * Added reference to DPLPMTUD and clarified upper layer usage. | ||||
| * Completed security considerations section. | ||||
| draft-ietf-6man-mtu-option-02, 2020-March-9 | ||||
| * Editorial changes to make text and terminology more consistent. | ||||
| * Added reference to DPLPMTUD. | ||||
| draft-ietf-6man-mtu-option-01, 2019-September-13 | ||||
| * Changes to show IANA assigned code point. | ||||
| * Editorial changes to make text and terminology more consistent. | ||||
| * Added a reference to RFC8200 in Section 2 and a reference to | ||||
| RFC6438 in Section 6.3. | ||||
| draft-ietf-6man-mtu-option-00, 2019-August-9 | ||||
| * First 6man w.g. draft version. | ||||
| * Changes to request IANA allocation of code point. | ||||
| * Editorial changes. | ||||
| draft-hinden-6man-mtu-option-02, 2019-July-5 | ||||
| * Changed option format to also include the Returned PMTU value and | ||||
| Return flag and made related text changes in Section 6.2 to | ||||
| describe this behavior. | ||||
| * ICMPv6 Packet Too Big messages are no longer used for feedback to | ||||
| the source host. | ||||
| * Added to Acknowledgements Section that a similar mechanism was | ||||
| proposed for IPv4 in 1988 in [RFC1063]. | ||||
| * Editorial changes. | ||||
| draft-hinden-6man-mtu-option-01, 2019-March-05 | ||||
| * Changed requested status from Standards Track to Experimental to | ||||
| allow use of experimental option type (11110) to allow for | ||||
| experimentation. Removed request for IANA Option assignment. | ||||
| * Added Section 2 "Motivation and Problem Solved" section to better | ||||
| describe what the purpose of this document is. | ||||
| * Added appendix describing planned experiments and how the results | ||||
| will be measured. | ||||
| * Editorial changes. | ||||
| draft-hinden-6man-mtu-option-00, 2018-Oct-16 | ||||
| * Initial draft. | ||||
| 13. References | ||||
| 13.1. Normative References | 11.1. Normative References | |||
| [IANA-HBH] "Destination Options and Hop-by-Hop Options", | [IANA-HBH] IANA, "Destination Options and Hop-by-Hop Options", | |||
| <https://www.iana.org/assignments/ipv6-parameters/ | <https://www.iana.org/assignments/ipv6-parameters/>. | |||
| ipv6-parameters.xhtml#ipv6-parameters-2>. | ||||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
| DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
| <https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
| (IPv6) Specification", STD 86, RFC 8200, | (IPv6) Specification", STD 86, RFC 8200, | |||
| DOI 10.17487/RFC8200, July 2017, | DOI 10.17487/RFC8200, July 2017, | |||
| <https://www.rfc-editor.org/info/rfc8200>. | <https://www.rfc-editor.org/info/rfc8200>. | |||
| [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | |||
| "Path MTU Discovery for IP version 6", STD 87, RFC 8201, | "Path MTU Discovery for IP version 6", STD 87, RFC 8201, | |||
| DOI 10.17487/RFC8201, July 2017, | DOI 10.17487/RFC8201, July 2017, | |||
| <https://www.rfc-editor.org/info/rfc8201>. | <https://www.rfc-editor.org/info/rfc8201>. | |||
| 13.2. Informative References | 11.2. Informative References | |||
| [I-D.ietf-taps-arch] | ||||
| Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G., and | ||||
| C. Perkins, "An Architecture for Transport Services", Work | ||||
| in Progress, Internet-Draft, draft-ietf-taps-arch-12, 3 | ||||
| January 2022, <https://datatracker.ietf.org/doc/html/ | ||||
| draft-ietf-taps-arch-12>. | ||||
| [RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP | [RFC1063] Mogul, J C., Kent, C A., Partridge, C., and K. McCloghrie, | |||
| MTU discovery options", RFC 1063, DOI 10.17487/RFC1063, | "IP MTU discovery options", RFC 1063, | |||
| July 1988, <https://www.rfc-editor.org/info/rfc1063>. | DOI 10.17487/RFC1063, July 1988, | |||
| <https://www.rfc-editor.org/info/rfc1063>. | ||||
| [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | [RFC1191] Mogul, J C. and S E. Deering, "Path MTU discovery", | |||
| DOI 10.17487/RFC1191, November 1990, | RFC 1191, DOI 10.17487/RFC1191, November 1990, | |||
| <https://www.rfc-editor.org/info/rfc1191>. | <https://www.rfc-editor.org/info/rfc1191>. | |||
| [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
| (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | |||
| December 1998, <https://www.rfc-editor.org/info/rfc2460>. | December 1998, <https://www.rfc-editor.org/info/rfc2460>. | |||
| [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | |||
| Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, | Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, | |||
| December 2005, <https://www.rfc-editor.org/info/rfc4301>. | December 2005, <https://www.rfc-editor.org/info/rfc4301>. | |||
| skipping to change at page 24, line 10 | skipping to change at line 930 | |||
| [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | |||
| Multiplexed and Secure Transport", RFC 9000, | Multiplexed and Secure Transport", RFC 9000, | |||
| DOI 10.17487/RFC9000, May 2021, | DOI 10.17487/RFC9000, May 2021, | |||
| <https://www.rfc-editor.org/info/rfc9000>. | <https://www.rfc-editor.org/info/rfc9000>. | |||
| [RFC9099] Vyncke, É., Chittimaneni, K., Kaeo, M., and E. Rey, | [RFC9099] Vyncke, É., Chittimaneni, K., Kaeo, M., and E. Rey, | |||
| "Operational Security Considerations for IPv6 Networks", | "Operational Security Considerations for IPv6 Networks", | |||
| RFC 9099, DOI 10.17487/RFC9099, August 2021, | RFC 9099, DOI 10.17487/RFC9099, August 2021, | |||
| <https://www.rfc-editor.org/info/rfc9099>. | <https://www.rfc-editor.org/info/rfc9099>. | |||
| [VPP] "VPP/What is VPP?", | [TAPS-ARCH] | |||
| Pauly, T., Ed., Trammell, B., Ed., Brunstrom, A., | ||||
| Fairhurst, G., and C. Perkins, "An Architecture for | ||||
| Transport Services", Work in Progress, Internet-Draft, | ||||
| draft-ietf-taps-arch-12, June 2022, | ||||
| <https://datatracker.ietf.org/doc/bibxml3/draft-ietf-taps- | ||||
| arch.xml>. | ||||
| [VPP] FD.io, "VPP/What is VPP?", | ||||
| <https://wiki.fd.io/view/VPP/What_is_VPP%3F>. | <https://wiki.fd.io/view/VPP/What_is_VPP%3F>. | |||
| [VPP_SRC] "VPP Source", <https://gerrit.fd.io/r/c/vpp/+/21948>. | [VPP_SRC] "vpp", commit 21948, ip: HBH MTU recording for IPv6, | |||
| <https://gerrit.fd.io/r/c/vpp/+/21948>. | ||||
| [WIRESHARK] | [WIRESHARK] | |||
| "Wireshark Network Protocol Analyzer", | "Wireshark Network Protocol Analyzer", | |||
| <https://www.wireshark.org>. | <https://www.wireshark.org>. | |||
| Appendix A. Examples of Usage | Appendix A. Examples of Usage | |||
| This section provides examples that illustrate a use of the MinPMTU | This section provides examples that illustrate a use of the MinPMTU | |||
| HBH option by a source using DPLPMTUD to discover the PLPMTU | HBH Option by a source using DPLPMTUD to discover the PLPMTU | |||
| supported by a path. They consider a path where the on-path router | supported by a path. They consider a path where the on-path router | |||
| has been configured with an outgoing MTU of d'. The source starts by | has been configured with an outgoing MTU of d'. The source starts by | |||
| transmission of packets of size a, and then uses DPLPMTUD to seek to | transmission of packets of size a and then uses DPLPMTUD to seek to | |||
| increase the size in steps resulting in sizes of b,c,d,e, etc., | increase the size in steps resulting in sizes of b, c, d, e, etc. | |||
| (chosen by the search algorithm used by DPLPMTUD). The search | (chosen by the search algorithm used by DPLPMTUD). The search | |||
| algorithm terminates with a PLPMTU that is at least d and is less | algorithm terminates with a PLPMTU that is at least d and is less | |||
| than or equal to d'. | than or equal to d'. | |||
| The first example considers DPLPMTUD without using the MinPMTU HBH | The first example considers DPLPMTUD without using the MinPMTU HBH | |||
| option. In this case, DPLPMTUD searches using an increasing size of | Option. In this case, DPLPMTUD searches using a probe packet that | |||
| probe packet. Probe packets of size (e) are sent, which are larger | increases in size. Probe packets of size e are sent, which are | |||
| than the actual PMTU. In this example, PTB messages are not received | larger than the actual PMTU. In this example, PTB messages are not | |||
| from the routers and repeated unsuccessful probes result in the | received from the routers, and repeated unsuccessful probes result in | |||
| search phase completing. Packets of data are never sent with a size | the search phase completing. Packets of data are never sent with a | |||
| larger than the size of the last confirmed probe packet. ACKs of | size larger than the size of the last confirmed probe packet. | |||
| data packets are not shown. | Acknowledgments (ACKs) of data packets are not shown. | |||
| ----Packets of data size (a) ----------------------------> | ----Packets of data size a ------------------------------> | |||
| ----Probe size (b) --------------------------------------> | ----Probe size b ----------------------------------------> | |||
| <---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
| ----Packets of data size (b) ----------------------------> | ----Packets of data size b ------------------------------> | |||
| ----Probe size (c) --------------------------------------> | ----Probe size c ----------------------------------------> | |||
| <---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
| ----Packets of data size (c) ----------------------------> | ----Packets of data size c ------------------------------> | |||
| ----Probe size (d) --------------------------------------> | ----Probe size d ----------------------------------------> | |||
| <---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
| ----Packets of data size (d) ----------------------------> | ----Packets of data size d ------------------------------> | |||
| <---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
| ... | ... | |||
| ----Probe size (e) ------------X | ----Probe size e --------------X | |||
| X----ICMPv6 PTB (d') --| | X----ICMPv6 PTB d' ----| | |||
| ----Packets of data size (d) ----------------------------> | ----Packets of data size d ------------------------------> | |||
| ----Probe size (e) ------------X (again) | ----Probe size e --------------X (again) | |||
| X----ICMPv6 PTB (d') --| | X----ICMPv6 PTB d' ----| | |||
| ----Packets of data size (d) ----------------------------- | ----Packets of data size d ------------------------------> | |||
| ... | ... | |||
| etc, until MaxProbes are unsuccessful and search phase completes. | etc. until MaxProbes are unsuccessful and search phase completes. | |||
| ----Packets of data size (d) ----------------------------> | ----Packets of data size d ------------------------------> | |||
| Figure 4 | Figure 3 | |||
| The second example considers DPLPMTUD with the MinPMTU HBH option set | The second example considers DPLPMTUD with the MinPMTU HBH Option set | |||
| on a connectivity probe packet. | on a connectivity probe packet. | |||
| The IPv6 option is sent end-to-end, and the Min-PMTU is updated by a | The IPv6 Option is sent end to end, and the Min-PMTU is updated by a | |||
| router on the path to d', which is returned in a response that also | router on the path to d', which is returned in a response that also | |||
| sets the MinPMTU HBH option. Upon receiving Rtn-PMTU value is | sets the MinPMTU HBH Option. Upon receiving the Rtn-PMTU value, | |||
| received, DPLPMTUD immediately sends a probe packet of the target | DPLPMTUD immediately sends a probe packet of the target size d'. If | |||
| size (d'). If the probe packet is confirmed for the path, the PLPMTU | the probe packet is confirmed for the path, the PLPMTU is updated, | |||
| is updated, allowing the source to use data packets up to size d'. | allowing the source to use data packets up to size d'. (The search | |||
| (The search algorithm is allowed to continue to probe to see if the | algorithm is allowed to continue to probe to see if the path supports | |||
| path supports a larger size.) Packets of data are never sent with a | a larger size.) Packets of data are never sent with a size larger | |||
| size larger than the last confirmed probe size, d'. | than the last confirmed probe size d'. | |||
| ----Packets of data size (a) ----------------------------> | ----Packets of data size a ------------------------------> | |||
| ----Connectivity probe with MinPMTU- | ----Connectivity probe with MinPMTU- | |||
| +--updated to minPMTU=d'-----> | +--updated to minPMTU=d'-----> | |||
| <-----------------ACK with Rtn-PMTU=d'-------------------- | <-----------------ACK with Rtn-PMTU=d'-------------------- | |||
| ----Packets of data size (a) ----------------------------> | ----Packets of data size a ------------------------------> | |||
| ----Probe size (d') -------------------------------------> | ----Probe size d' ---------------------------------------> | |||
| <---------------------------------- ACK of probe --------- | <---------------------------------- ACK of probe --------- | |||
| -----Packets of data size (d') --------------------------> | -----Packets of data size d' ----------------------------> | |||
| Search phase completes. | Search phase completes. | |||
| -----Packets of data size (d') --------------------------> | -----Packets of data size d' ----------------------------> | |||
| Figure 5 | Figure 4 | |||
| The final example considers DPLPMTUD with the MinPMTU HBH option set | The final example considers DPLPMTUD with the MinPMTU HBH Option set | |||
| on a connectivity probe packet, but shows the effect when this | on a connectivity probe packet but shows the effect when this | |||
| connectivity probe packet is dropped. | connectivity probe packet is dropped. | |||
| In this case, the packet with the MinPMTU HBH option is not received. | In this case, the packet with the MinPMTU HBH Option is not received. | |||
| DPLPMTUD searches using probe packets of increasing size, increasing | DPLPMTUD searches using probe packets of increasing size, increasing | |||
| the PLPMTU when the probes are confirmed. An ICMPv6 PTB message is | the PLPMTU when the probes are confirmed. An ICMPv6 PTB message is | |||
| received when the probed size exceeds the actual PMTU, indicating a | received when the probed size exceeds the actual PMTU, indicating a | |||
| PTB_SIZE of d'. DPLPMTUD immediately sends a probe packet of the | PTB_SIZE of d'. DPLPMTUD immediately sends a probe packet of the | |||
| target size (d'). If the probe packet is confirmed for the path, the | target size d'. If the probe packet is confirmed for the path, the | |||
| PLPMTU is updated, allowing the source to use data packets up to size | PLPMTU is updated, allowing the source to use data packets up to size | |||
| d'. If the ICMPv6 PTB message is not received, the DPLPMTU will be | d'. If the ICMPv6 PTB message is not received, the DPLPMTU will be | |||
| the last confirmed probe size, d. | the last confirmed probe size, which is d. | |||
| ----Packets of data size (a) -----------------------------> | ----Packets of data size a -------------------------------> | |||
| ----Connectivity probe with MinPMTU --------X | ----Connectivity probe with MinPMTU --------X | |||
| ----Packets of data size (a) -----------------------------> | ----Packets of data size a -------------------------------> | |||
| ----Probe size (b) ---------------------------------------> | ----Probe size b -----------------------------------------> | |||
| <---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
| ----Packets of data size (b) -----------------------------> | ----Packets of data size b -------------------------------> | |||
| ----Probe size (c) ---------------------------------------> | ----Probe size c -----------------------------------------> | |||
| <---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
| ----Packets of data size (c) -----------------------------> | ----Packets of data size c -------------------------------> | |||
| ----Probe size (d) ---------------------------------------> | ----Probe size d -----------------------------------------> | |||
| <---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
| ----Packets of data size (d) -----------------------------> | ----Packets of data size d -------------------------------> | |||
| ----Probe size (e) ----------X | ----Probe size e ------------X | |||
| <--ICMPv6 PTB PTB_SIZE(d') -| | <--ICMPv6 PTB PTB_SIZE d' --| | |||
| ----Packets of data size (d) -----------------------------> | ----Packets of data size d -------------------------------> | |||
| ----Probe size (d') using target set by PTB_SIZE ---------> | ----Probe size d' using target set by PTB_SIZE -----------> | |||
| <---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
| Search phase completes. | Search phase completes. | |||
| ----Packets of data size (d') ----------------------------> | ----Packets of data size d' ------------------------------> | |||
| Figure 6 | Figure 5 | |||
| The number of probe rounds depends on the number of steps needed by | The number of probe rounds depends on the number of steps needed by | |||
| the search algorithm, and is typically larger for a larger PMTU. | the search algorithm and is typically larger for a larger PMTU. | |||
| Acknowledgments | ||||
| Helpful comments were received from Tom Herbert, Tom Jones, Fred | ||||
| Templin, Ole Troan, Tianran Zhou, Jen Linkova, Brian Carpenter, Peng | ||||
| Shuping, Mark Smith, Fernando Gont, Michael Dougherty, Erik Kline, | ||||
| and other members of the 6MAN Working Group. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Robert M. Hinden | Robert M. Hinden | |||
| Check Point Software | Check Point Software | |||
| 959 Skyway Road | 959 Skyway Road | |||
| San Carlos, CA 94070 | San Carlos, CA 94070 | |||
| United States of America | United States of America | |||
| Email: bob.hinden@gmail.com | Email: bob.hinden@gmail.com | |||
| End of changes. 171 change blocks. | ||||
| 596 lines changed or deleted | 464 lines changed or added | |||
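The three DPLPMTUD exchanges shown in the figures above (search without the MinPMTU HBH Option, search seeded by a returned Rtn-PMTU, and search corrected by an ICMPv6 PTB message) can be compared with a small simulation. The following is a minimal sketch under stated assumptions, not the algorithm as specified: the function name `search`, the parameter names, the byte sizes, and the value chosen for MaxProbes are all hypothetical, and a probe is modeled as confirmed exactly when its size does not exceed the actual PMTU.

```python
MAX_PROBES = 3  # assumed value; the text only names the MaxProbes limit


def search(pmtu, start, steps, rtn_pmtu=None, ptb_received=False):
    """Return (PLPMTU, probes sent) for a simplified DPLPMTUD search.

    pmtu         -- actual path MTU (d' in the figures); unknown to a real sender
    start        -- initially confirmed packet size (a in the figures)
    steps        -- increasing candidate probe sizes (b, c, d, e)
    rtn_pmtu     -- value learned from a returned MinPMTU HBH Option, if any
    ptb_received -- whether an ICMPv6 PTB message reaches the sender
    """
    plpmtu = start
    probes_sent = 0

    def probe(size):
        # A probe is confirmed only if it fits the actual path MTU.
        nonlocal probes_sent
        probes_sent += 1
        return size <= pmtu

    # Second example: probe the Rtn-PMTU value immediately.
    # Simplification: the sketch stops here, although the search
    # is allowed to continue probing for a larger size.
    if rtn_pmtu is not None and rtn_pmtu > plpmtu and probe(rtn_pmtu):
        return rtn_pmtu, probes_sent

    for size in steps:
        if size <= plpmtu:
            continue
        if probe(size):
            plpmtu = size  # confirmed: raise the PLPMTU
            continue
        if ptb_received:
            # Third example: the PTB message reports PTB_SIZE, which is
            # assumed here to equal the actual path MTU.
            ptb_size = pmtu
            if plpmtu < ptb_size < size and probe(ptb_size):
                plpmtu = ptb_size
        else:
            # First example: repeat the failed probe until MaxProbes
            # attempts have been unsuccessful.
            for _ in range(MAX_PROBES - 1):
                probe(size)
        break  # search phase completes
    return plpmtu, probes_sent


if __name__ == "__main__":
    sizes = [1400, 1500, 1600, 1800]  # hypothetical b, c, d, e
    print(search(1700, 1280, sizes))                     # no option, no PTB
    print(search(1700, 1280, sizes, rtn_pmtu=1700))      # Rtn-PMTU received
    print(search(1700, 1280, sizes, ptb_received=True))  # option lost, PTB received
```

With these hypothetical sizes, the first case settles on d (1600) after six probes, while the other two reach d' (1700) after one and five probes respectively. This agrees with the closing remark that the number of probe rounds depends on the number of steps the search algorithm needs.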