| rfc9627v1.txt | rfc9627.txt | |||
|---|---|---|---|---|
| Internet Engineering Task Force (IETF) J. Lennox | Internet Engineering Task Force (IETF) J. Lennox | |||
| Request for Comments: 9627 D. Hong | Request for Comments: 9627 8x8 / Jitsi | |||
| Category: Standards Track Vidyo | Category: Standards Track D. Hong | |||
| ISSN: 2070-1721 J. Uberti | ISSN: 2070-1721 Google | |||
| | J. Uberti | |||
| | OpenAI | |||
| S. Holmer | S. Holmer | |||
| M. Flodman | M. Flodman | |||
| August 2024 | March 2025 | |||
| The Layer Refresh Request (LRR) RTCP Feedback Message | The Layer Refresh Request (LRR) RTCP Feedback Message | |||
| Abstract | Abstract | |||
| This memo describes the RTCP Payload-Specific Feedback Message Layer | This memo describes the RTCP Payload-Specific Feedback Message Layer | |||
| Refresh Request (LRR), which can be used to request a state refresh | Refresh Request (LRR), which can be used to request a state refresh | |||
| of one or more substreams of a layered media stream. It also defines | of one or more substreams of a layered media stream. This document | |||
| its use with several RTP payloads for scalable media formats. | also defines its use with several RTP payloads for scalable media | |||
| | formats. | |||
| Status of This Memo | Status of This Memo | |||
| This is an Internet Standards Track document. | This is an Internet Standards Track document. | |||
| This document is a product of the Internet Engineering Task Force | This document is a product of the Internet Engineering Task Force | |||
| (IETF). It represents the consensus of the IETF community. It has | (IETF). It represents the consensus of the IETF community. It has | |||
| received public review and has been approved for publication by the | received public review and has been approved for publication by the | |||
| Internet Engineering Steering Group (IESG). Further information on | Internet Engineering Steering Group (IESG). Further information on | |||
| Internet Standards is available in Section 2 of RFC 7841. | Internet Standards is available in Section 2 of RFC 7841. | |||
| Information about the current status of this document, any errata, | Information about the current status of this document, any errata, | |||
| and how to provide feedback on it may be obtained at | and how to provide feedback on it may be obtained at | |||
| https://www.rfc-editor.org/info/rfc9627. | https://www.rfc-editor.org/info/rfc9627. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2024 IETF Trust and the persons identified as the | Copyright (c) 2025 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Revised BSD License text as described in Section 4.e of the | include Revised BSD License text as described in Section 4.e of the | |||
| Trust Legal Provisions and are provided without warranty as described | Trust Legal Provisions and are provided without warranty as described | |||
| skipping to change at line 58 | skipping to change at line 61 | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction | 1. Introduction | |||
| 2. Conventions and Terminology | 2. Conventions and Terminology | |||
| 2.1. Terminology | 2.1. Terminology | |||
| 3. Layer Refresh Request | 3. Layer Refresh Request | |||
| 3.1. Message Format | 3.1. Message Format | |||
| 3.2. Semantics | 3.2. Semantics | |||
| 4. Usage with Specific Codecs | 4. Usage with Specific Codecs | |||
| 4.1. H264 SVC | 4.1. H.264 SVC | |||
| 4.2. VP8 | 4.2. VP8 | |||
| 4.3. H265 | 4.3. H.265 | |||
| 5. Usage with Different Scalability Transmission Mechanisms | 5. Usage with Different Scalability Transmission Mechanisms | |||
| 6. SDP Definitions | 6. SDP Definitions | |||
| 7. Security Considerations | 7. Security Considerations | |||
| 8. IANA Considerations | 8. IANA Considerations | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| 9.2. Informative References | 9.2. Informative References | |||
| Authors' Addresses | Authors' Addresses | |||
| 1. Introduction | 1. Introduction | |||
| This memo describes an RTCP [RFC3550] Payload-Specific Feedback | This memo describes an RTCP [RFC3550] Payload-Specific Feedback | |||
| Message [RFC4585] Layer Refresh Request (LRR). It is designed to | Message [RFC4585] "Layer Refresh Request" (LRR), which is designed to | |||
| allow a receiver of a layered media stream to request that one or | allow a receiver of a layered media stream to request that one or | |||
| more of its substreams be refreshed such that it can then be decoded | more of its substreams be refreshed. The stream can then be decoded | |||
| by an endpoint that previously was not receiving those layers, | by an endpoint that previously was not receiving those layers, | |||
| without requiring that the entire stream be refreshed (as it would be | without requiring that the entire stream be refreshed (as it would be | |||
| if the receiver sent a Full Intra Request (FIR) [RFC5104]; see also | if the receiver sent a Full Intra Request (FIR) [RFC5104]; see also | |||
| [RFC8082]). | [RFC8082]). | |||
| The feedback message is applicable to both temporally and spatially | The feedback message is applicable to both temporally and spatially | |||
| scaled streams and to both single-stream and multi-stream scalability | scaled streams and to both single-stream and multi-stream scalability | |||
| modes. | modes. | |||
| 2. Conventions and Terminology | 2. Conventions and Terminology | |||
| skipping to change at line 103 | skipping to change at line 106 | |||
| 2.1. Terminology | 2.1. Terminology | |||
| A "layer refresh point" is a point in a scalable stream after which a | A "layer refresh point" is a point in a scalable stream after which a | |||
| decoder, which previously had been able to decode only some (possibly | decoder, which previously had been able to decode only some (possibly | |||
| none) of the available layers of a stream, is able to decode a greater | none) of the available layers of a stream, is able to decode a greater | |||
| number of the layers. | number of the layers. | |||
| For spatial (or quality) layers, in normal encoding, a subpicture can | For spatial (or quality) layers, in normal encoding, a subpicture can | |||
| depend both on earlier pictures of that spatial layer and also on | depend both on earlier pictures of that spatial layer and also on | |||
| lower-layer pictures of the current picture. However, a layer | lower-layer pictures of the current picture. However, a layer | |||
| refresh typically requires that a spatial layer picture be encoded in | refresh typically requires that a spatial-layer picture be encoded in | |||
| a way that references only the lower-layer subpictures of the current | a way that references only the lower-layer subpictures of the current | |||
| picture, not any earlier pictures of that spatial layer. | picture, not any earlier pictures of that spatial layer. | |||
| Additionally, the encoder must promise that no earlier pictures of | Additionally, the encoder must promise that no earlier pictures of | |||
| that spatial layer will be used as reference in the future. | that spatial layer will be used as reference in the future. | |||
| However, even in a layer refresh, layers other than the ones being | However, even in a layer refresh, layers other than the ones being | |||
| refreshed may still maintain dependency on earlier content of the | refreshed may still maintain dependency on earlier content of the | |||
| stream. This is the difference between a layer refresh and a FIR | stream. This is the difference between a layer refresh and a FIR | |||
| [RFC5104]. This minimizes the coding overhead of refresh to only | [RFC5104]. This minimizes the coding overhead of refresh to only | |||
| those parts of the stream that actually need to be refreshed at any | those parts of the stream that actually need to be refreshed at any | |||
| given time. | given time. | |||
| The spatial layer refresh of an enhancement layer is shown below. | The spatial-layer refresh of an enhancement layer is shown below. | |||
| The "<--" indicates a coding dependency. | The "<--" indicates a coding dependency. | |||
| ... <-- S1 <-- S1 S1 <-- S1 <-- ... | ... <-- S1 <-- S1 S1 <-- S1 <-- ... | |||
| | | | | | | | | | | |||
| \/ \/ \/ \/ | \/ \/ \/ \/ | |||
| ... <-- S0 <-- S0 <-- S0 <-- S0 <-- ... | ... <-- S0 <-- S0 <-- S0 <-- S0 <-- ... | |||
| 1 2 3 4 | 1 2 3 4 | |||
| Figure 1 | Figure 1: Refresh of a Spatial Enhancement Layer | |||
| In Figure 1, frame 3 is a layer refresh point for spatial layer S1; a | In Figure 1, frame 3 is a layer refresh point for spatial layer S1; a | |||
| decoder that had previously only been decoding spatial layer S0 would | decoder that had previously only been decoding spatial layer S0 would | |||
| be able to decode layer S1 starting at frame 3. | be able to decode layer S1 starting at frame 3. | |||
| The spatial layer refresh of a base layer is shown below. The "<--" | The spatial-layer refresh of a base layer is shown below. The "<--" | |||
| indicates a coding dependency. | indicates a coding dependency. | |||
| ... <-- S1 <-- S1 <-- S1 <-- S1 <-- ... | ... <-- S1 <-- S1 <-- S1 <-- S1 <-- ... | |||
| | | | | | | | | | | |||
| \/ \/ \/ \/ | \/ \/ \/ \/ | |||
| ... <-- S0 <-- S0 S0 <-- S0 <-- ... | ... <-- S0 <-- S0 S0 <-- S0 <-- ... | |||
| 1 2 3 4 | 1 2 3 4 | |||
| Figure 2 | Figure 2: Refresh of a Spatial Base Layer | |||
| In Figure 2, frame 3 is a layer refresh point for spatial layer S0; a | In Figure 2, frame 3 is a layer refresh point for spatial layer S0; a | |||
| decoder that had previously not been decoding the stream at all could | decoder that had previously not been decoding the stream at all could | |||
| decode layer S0 starting at frame 3. | decode layer S0 starting at frame 3. | |||
| For temporal layers, while normal encoding allows frames to depend on | For temporal layers, while normal encoding allows frames to depend on | |||
| earlier frames of the same temporal layer, layer refresh requires | earlier frames of the same temporal layer, layer refresh requires | |||
| that the layer be "temporally nested", i.e., use as reference only | that the layer be "temporally nested", i.e., use as reference only | |||
| earlier frames of a lower temporal layer, not any earlier frames of | earlier frames of a lower temporal layer, not any earlier frames of | |||
| this temporal layer and promise that no future frames of this | this temporal layer and promise that no future frames of this | |||
| skipping to change at line 168 | skipping to change at line 171 | |||
| The temporal layer refresh is shown below. The "<--" indicates a | The temporal layer refresh is shown below. The "<--" indicates a | |||
| coding dependency. | coding dependency. | |||
| ... <----- T1 <------ T1 T1 <------ ... | ... <----- T1 <------ T1 T1 <------ ... | |||
| / / / | / / / | |||
| |_ |_ |_ | |_ |_ |_ | |||
| ... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... | ... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... | |||
| 1 2 3 4 5 6 7 | 1 2 3 4 5 6 7 | |||
| Figure 3 | Figure 3: Refresh of a Temporal Layer | |||
| In Figure 3, frame 6 is a layer refresh point for temporal layer T1; | In Figure 3, frame 6 is a layer refresh point for temporal layer T1; | |||
| a decoder that had previously only been decoding temporal layer T0 | a decoder that had previously only been decoding temporal layer T0 | |||
| would be able to decode layer T1 starting at frame 6. | would be able to decode layer T1 starting at frame 6. | |||
| An inherently temporally nested stream is shown below. The "<--" | An inherently temporally nested stream is shown below. The "<--" | |||
| indicates a coding dependency. | indicates a coding dependency. | |||
| T1 T1 T1 | T1 T1 T1 | |||
| / / / | / / / | |||
| |_ |_ |_ | |_ |_ |_ | |||
| ... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... | ... <-- T0 <------ T0 <------ T0 <------ T0 <--- ... | |||
| 1 2 3 4 5 6 7 | 1 2 3 4 5 6 7 | |||
| Figure 4 | Figure 4: An Inherently Temporally Nested Stream | |||
| In Figure 4, the stream is temporally nested in its ordinary | In Figure 4, the stream is temporally nested in its ordinary | |||
| structure; a decoder receiving layer T0 can begin decoding layer T1 | structure; a decoder receiving layer T0 can begin decoding layer T1 | |||
| at any point. | at any point. | |||
| A "layer index" is a numeric label for a specific spatial and | A "layer index" is a numeric label for a specific spatial and | |||
| temporal layer of a scalable stream. It consists of both a "temporal | temporal layer of a scalable stream. It consists of both a | |||
| ID" identifying the temporal layer and a "layer ID" identifying the | "temporal-layer ID" identifying the temporal layer and a "layer ID" | |||
| spatial or quality layer. The details of how layers of a scalable | identifying the spatial or quality layer. The details of how layers | |||
| stream are labeled are codec specific. Details for several codecs | of a scalable stream are labeled are codec specific. Details for | |||
| are defined in Section 4. | several codecs are defined in Section 4. | |||
| 3. Layer Refresh Request | 3. Layer Refresh Request | |||
| A layer refresh frame can be requested by sending a Layer Refresh | A layer refresh frame can be requested by sending a Layer Refresh | |||
| Request (LRR), which is an RTCP [RFC3550] payload-specific feedback | Request (LRR), which is an RTCP [RFC3550] payload-specific feedback | |||
| message [RFC4585] asking the encoder to encode a frame that makes it | message [RFC4585] asking the encoder to encode a frame that makes it | |||
| possible to upgrade to a higher layer. The LRR contains one or two | possible to upgrade to a higher layer. The LRR contains one or two | |||
| tuples, indicating the temporal and spatial layer the decoder wants | tuples, indicating the temporal and spatial layer the decoder wants | |||
| to upgrade to and (optionally) the currently highest temporal and | to upgrade to and (optionally) the currently highest temporal and | |||
| spatial layer the decoder can decode. | spatial layer the decoder can decode. | |||
| The specific format of the tuples, and the mechanism by which a | The specific format of the tuples, and the mechanism by which a | |||
| receiver recognizes a refresh frame, is codec dependent. Usage for | receiver recognizes a refresh frame, is codec dependent. Usage for | |||
| several codecs is discussed in Section 4. | several codecs is discussed in Section 4. | |||
| An LRR follows the FIR model (Section 3.5.1 of [RFC5104]) for its | The design of LRR follows the FIR model (Section 3.5.1 of [RFC5104]) | |||
| retransmission, reliability, and use in multipoint conferences. | for its retransmission, reliability, and use in multipoint | |||
| | conferences. | |||
| The LRR message is identified by RTCP packet type value PT=PSFB and | The LRR message is identified by RTCP packet type value PT=PSFB and | |||
| FMT=10. The Feedback Control Information (FCI) field MUST contain | FMT=10. The Feedback Control Information (FCI) field MUST contain | |||
| one or more LRR entries. Each entry applies to a different media | one or more LRR entries. Each entry applies to a different media | |||
| sender, identified by its Synchronization Source (SSRC). | sender, identified by its Synchronization Source (SSRC). | |||
| 3.1. Message Format | 3.1. Message Format | |||
| The FCI for the Layer Refresh Request consists of one or more FCI | The FCI for the Layer Refresh Request consists of one or more FCI | |||
| entries, the content of which is depicted in Figure 5. The length of | entries, the content of which is depicted in Figure 5. The length of | |||
| skipping to change at line 236 | skipping to change at line 240 | |||
| 0 1 2 3 | 0 1 2 3 | |||
| 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | SSRC | | | SSRC | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | Seq nr. |C| Payload Type| Reserved | | | Seq nr. |C| Payload Type| Reserved | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | RES | TTID| TLID | RES | CTID| CLID | | | RES | TTID| TLID | RES | CTID| CLID | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Figure 5 | Figure 5: Layer Refresh Request FCI Format | |||
| Synchronization Source (SSRC) (32 bits): | Synchronization Source (SSRC) (32 bits): | |||
| The SSRC value of the media sender that is requested to send a | The SSRC value of the media sender that is requested to send a | |||
| layer refresh point. | layer refresh point. | |||
| Seq nr. (8 bits): | Seq nr. (8 bits): | |||
| The command sequence number. The sequence number space is unique | The command sequence number. The sequence number space is unique | |||
| for each pairing of the SSRC of command source and the SSRC of the | for each pairing of the SSRC of command source and the SSRC of the | |||
| command target. The sequence number SHALL be increased by 1 for | command target. The sequence number SHALL be increased by 1 for | |||
| each new command (modulo 256, so the value after 255 is 0). A | each new command (modulo 256, so the value after 255 is 0). A | |||
| repetition SHALL NOT increase the sequence number. The initial | repetition SHALL NOT increase the sequence number. The initial | |||
| value is arbitrary. | value is arbitrary. | |||
| C (1 bit): | C (1 bit): | |||
| A flag bit indicating whether the Current Temporal Layer ID (CTID) | A flag bit indicating whether the Current Temporal-layer ID (CTID) | |||
| and Current Layer ID (CLID) fields are present in the FCI. If | and Current Layer ID (CLID) fields are present in the FCI. If | |||
| this bit is 0, the sender of the LRR message is requesting refresh | this bit is 0, the sender of the LRR message is requesting refresh | |||
| of all layers up to and including the target layer. | of all layers up to and including the target layer. | |||
| Payload Type (7 bits): | Payload Type (7 bits): | |||
| The RTP payload type for which the LRR is being requested. This | The RTP payload type for which the LRR is being requested. This | |||
| gives the context in which the target layer index is to be | gives the context in which the target layer index is to be | |||
| interpreted. | interpreted. | |||
| Reserved (RES) (three separate fields of 16 bits / 5 bits / 5 | Reserved (RES) (three separate fields of 16 bits / 5 bits / 5 | |||
| bits): | bits): | |||
| All bits SHALL be set to 0 by the sender and SHALL be ignored on | All bits SHALL be set to zero by the sender and SHALL be ignored | |||
| reception. | on reception. | |||
| Target Temporal Layer ID (TTID) (3 bits): | Target Temporal-layer ID (TTID) (3 bits): | |||
| The temporal ID of the target layer for which the receiver wishes | The temporal-layer ID of the target layer for which the receiver | |||
| a refresh point. | wishes a refresh point. | |||
| Target Layer ID (TLID) (8 bits): | Target Layer ID (TLID) (8 bits): | |||
| The layer ID of the target spatial or quality layer for which the | The layer ID of the target spatial or quality layer for which the | |||
| receiver wishes a refresh point. Its format is dependent on the | receiver wishes a refresh point. Its format is dependent on the | |||
| payload type field. | payload type field. | |||
| Current Temporal Layer ID (CTID) (3 bits): | Current Temporal-layer ID (CTID) (3 bits): | |||
| If C is 1, the ID of the current temporal layer being decoded by | If C is 1, the ID of the current temporal layer being decoded by | |||
| the receiver. This message is not requesting refresh of layers at | the receiver. This message is not requesting refresh of layers at | |||
| or below this layer. If C is 0, this field SHALL be set to 0 by | or below this layer. If C is 0, this field SHALL be set to zero | |||
| the sender and SHALL be ignored on reception. | by the sender and SHALL be ignored on reception. | |||
| Current Layer ID (CLID) (8 bits): | Current Layer ID (CLID) (8 bits): | |||
| If C is 1, the layer ID of the current spatial or quality layer | If C is 1, the layer ID of the current spatial or quality layer | |||
| being decoded by the receiver. This message is not requesting | being decoded by the receiver. This message is not requesting | |||
| refresh of layers at or below this layer. If C is 0, this field | refresh of layers at or below this layer. If C is 0, this field | |||
| SHALL be set to 0 by the sender and SHALL be ignored on reception. | SHALL be set to zero by the sender and SHALL be ignored on | |||
| | reception. | |||
| When C is 1, TTID MUST NOT be less than CTID, and TLID MUST NOT be | When C is 1, TTID MUST NOT be less than CTID, and TLID MUST NOT be | |||
| less than CLID; at least one of either TTID or TLID MUST be greater | less than CLID; at least one of either TTID or TLID MUST be greater | |||
| than CTID or CLID, respectively. That is to say, the target layer | than CTID or CLID, respectively. That is to say, the target layer | |||
| index <TTID, TLID> MUST be a layer upgrade from the current layer | index <TTID, TLID> MUST be a layer upgrade from the current layer | |||
| index <CTID, CLID>. A sender MAY request an upgrade in both temporal | index <CTID, CLID>. A sender MAY request an upgrade in both temporal | |||
| and spatial/quality layers simultaneously. | and spatial or quality layers simultaneously. | |||
| A receiver receiving an LRR feedback packet that does not satisfy the | A receiver receiving an LRR feedback packet that does not satisfy the | |||
| requirements of the previous paragraph, i.e., one where the C bit is | requirements of the previous paragraph, i.e., one where the C bit is | |||
| set but the TTID is less than the CTID or the TLID is less than | set but the TTID is less than the CTID or the TLID is less than | |||
| the CLID, MUST discard the request. | the CLID, MUST discard the request. | |||
| Note: the syntax of the TTID, TLID, CTID, and CLID fields matches, by | \| Note: The syntax of the TTID, TLID, CTID, and CLID fields | |||
| design, the TID and LID fields in [RFC9626]. | \| matches, by design, the TID and LID fields in [RFC9626]. | |||
| 3.2. Semantics | 3.2. Semantics | |||
| Within the common packet header for feedback messages (as defined in | Within the common packet header for feedback messages (as defined in | |||
| Section 6.1 of [RFC4585]), the "SSRC of packet sender" field | Section 6.1 of [RFC4585]), the "SSRC of packet sender" field | |||
| indicates the source of the request, and the "SSRC of media source" | indicates the source of the request, and the "SSRC of media source" | |||
| is not used and SHALL be set to 0. The SSRCs of the media senders to | is not used and SHALL be set to zero. The SSRCs of the media senders | |||
| which the LRR command applies are in the corresponding FCI entries. | to which the LRR command applies are in the corresponding FCI | |||
| An LRR message MAY contain requests to multiple media senders, using | entries. An LRR message MAY contain requests to multiple media | |||
| one FCI entry per target media sender. | senders, using one FCI entry per target media sender. | |||
| Upon reception of an LRR, the encoder MUST send a decoder refresh | Upon reception of an LRR, the encoder MUST send a decoder refresh | |||
| point (see Section 2.1) as soon as possible. | point (see Section 2.1) as soon as possible. | |||
| The sender MUST respect bandwidth limits provided by the application | The sender MUST respect bandwidth limits provided by the application | |||
| of congestion control, as described in Section 5 of [RFC5104]. As | of congestion control, as described in Section 5 of [RFC5104]. As | |||
| layer refresh points will often be larger than non-refreshing frames, | layer refresh points will often be larger than non-refreshing frames, | |||
| this may restrict a sender's ability to send a layer refresh point | this may restrict a sender's ability to send a layer refresh point | |||
| quickly. | quickly. | |||
| An LRR MUST NOT be sent as a reaction to picture losses due to packet | An LRR MUST NOT be sent as a reaction to picture losses due to packet | |||
| loss or corruption; it is RECOMMENDED to use a PLI (Picture Loss | loss or corruption; it is RECOMMENDED to use a PLI (Picture Loss | |||
| Indication) [RFC4585] instead. An LRR SHOULD be used only in | Indication) [RFC4585] instead. An LRR SHOULD be used only in | |||
| situations where there is an explicit change in a decoders' behavior: | situations where there is an explicit change in a decoder's behavior: | |||
| for example, when a receiver will start decoding a layer that it | for example, when a receiver will start decoding a layer that it | |||
| previously had been discarding. | previously had been discarding. | |||
| 4. Usage with Specific Codecs | 4. Usage with Specific Codecs | |||
| In order for an LRR to be used with a scalable codec, the format of | In order for an LRR to be used with a scalable codec, the format of | |||
| the temporal and layer ID fields (for both the target and current | the temporal and layer ID fields (for both the target and current | |||
| layer indices) needs to be specified for that codec's RTP | layer indices) needs to be specified for that codec's RTP | |||
| packetization. New RTP packetization specifications for scalable | packetization. New RTP packetization specifications for scalable | |||
| codecs SHOULD define how this is done. (The VP9 payload [RFC9628], | codecs SHOULD define how this is done. (The VP9 payload [RFC9628], | |||
| for instance, has done so.) If the payload also specifies how it is | for instance, has done so.) If the payload also specifies how it is | |||
| used with the Frame Marking RTP Header Extension [RFC9626], the | used with the Video Frame Marking RTP Header Extension described in | |||
| syntax MUST be defined in the same manner as the TID and LID fields | [RFC9626], the syntax MUST be defined in the same manner as the TID | |||
| in that header. | and LID fields in that header. | |||
| 4.1. H264 SVC | 4.1. H.264 SVC | |||
| H.264 SVC [RFC6190] defines temporal, dependency (spatial), and | H.264 SVC [RFC6190] defines temporal, dependency (spatial), and | |||
| quality scalability modes. | quality scalability modes. | |||
| +---------------+---------------+ | +---------------+---------------+ | |||
| |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | RES | TID |R| DID | QID | | | RES | TID |R| DID | QID | | |||
| +---------------+---------------+ | +---------------+---------------+ | |||
| Figure 6 | Figure 6: H.264 SVC Layer Index Fields Format | |||
| Figure 6 shows the format of the layer index fields for H.264 SVC | Figure 6 shows the format of the layer index fields for H.264 SVC | |||
| streams. The "R" and "RES" fields MUST be set to 0 on transmission | streams. The "R" and "RES" fields MUST be set to zero on | |||
| and ignored on reception. See Section 1.1.3 of [RFC6190] for details | transmission and ignored on reception. See Section 1.1.3 of | |||
| on the dependency_id (DID), quality_id (QID), and temporal_id (TID) | [RFC6190] for details on the dependency_id (DID), quality_id (QID), | |||
| fields. | and temporal_id (TID) fields. | |||
| A dependency or quality layer refresh of a given layer in H.264 SVC | A dependency or quality layer refresh of a given layer in H.264 SVC | |||
| can be identified by the "I" bit (idr_flag) in the extended Network | can be identified by the I bit (idr_flag) in the extended Network | |||
| Abstraction Layer (NAL) unit header, present in NAL unit types 14 | Abstraction Layer (NAL) unit header, present in NAL unit types 14 | |||
| (prefix NAL unit) and 20 (coded scalable slice). Layer refresh of | (prefix NAL unit) and 20 (coded scalable slice). Layer refresh of | |||
| the base layer can also be identified by the NAL unit type of its | the base layer can also be identified by the NAL unit type of its | |||
| coded slices, which is "5" rather than "1". A dependency or quality | coded slices, which is "5" rather than "1". A dependency or quality | |||
| layer refresh is complete once this bit has been seen on all the | layer refresh is complete once this bit has been seen on all the | |||
| appropriate layers (in decoding order) above the current layer index | appropriate layers (in decoding order) above the current layer index | |||
| (if any, or beginning from the base layer if not) through the target | (if any, or beginning from the base layer if not) through the target | |||
| layer index. | layer index. | |||
| Note that the "I" bit in a Payload Content Scalability Information | Note that the I bit in a Payload Content Scalability Information | |||
| (PACSI) header is set if the corresponding bit is set in any of the | (PACSI) header is set if the corresponding bit is set in any of the | |||
| aggregated NAL units it describes; thus, it is not sufficient to | aggregated NAL units it describes; thus, it is not sufficient to | |||
| identify layer refresh when NAL units of multiple dependency or | identify layer refresh when NAL units of multiple dependency or | |||
| quality layers are aggregated. | quality layers are aggregated. | |||
| In H.264 SVC, temporal layer refresh information can be determined | In H.264 SVC, temporal layer refresh information can be determined | |||
| from various Supplemental Enhancement Information (SEI) messages in the | from various Supplemental Enhancement Information (SEI) messages in the | |||
| bitstream. | bitstream. | |||
| Whether an H.264 SVC stream is scalably nested can be determined from | Whether an H.264 SVC stream is scalably nested can be determined from | |||
| skipping to change at line 411 | skipping to change at line 416 | |||
| The VP8 RTP payload format [RFC7741] defines temporal scalability | The VP8 RTP payload format [RFC7741] defines temporal scalability | |||
| modes. It does not support spatial scalability. | modes. It does not support spatial scalability. | |||
| +---------------+---------------+ | +---------------+---------------+ | |||
| |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | RES | TID | RES | | | RES | TID | RES | | |||
| +---------------+---------------+ | +---------------+---------------+ | |||
| Figure 7 | Figure 7: VP8 Layer Index Field Format | |||
| Figure 7 shows the format of the layer index field for VP8 streams. | Figure 7 shows the format of the layer index field for VP8 streams. | |||
| The "RES" fields MUST be set to 0 on transmission and be ignored on | The "RES" fields MUST be set to zero on transmission and be ignored | |||
| reception. See Section 4.2 of [RFC7741] for details on the TID | on reception. See Section 4.2 of [RFC7741] for details on the TID | |||
| field. | field. | |||
| A VP8 layer refresh point can be identified by the presence of the | A VP8 layer refresh point can be identified by the presence of the Y | |||
| "Y" bit in the VP8 payload header. When this bit is set, this and | bit (see [RFC7741]) in the VP8 payload header. When this bit is set, | |||
| all subsequent frames depend only on the current base temporal layer. | this and all subsequent frames depend only on the current base | |||
| On receipt of an LRR for a VP8 stream, a sender that supports LRRs | temporal layer. On receipt of an LRR for a VP8 stream, a sender that | |||
| MUST encode the stream so it can set the Y bit in a packet whose | supports LRRs MUST encode the stream so it can set the Y bit in a | |||
| temporal layer is at or below the target layer index. | packet whose temporal layer is at or below the target layer index. | |||
| Note that in VP8, not every layer switch point can be identified by | Note that in VP8, not every layer switch point can be identified by | |||
| the Y bit since the Y bit implies layer switch of all layers, not | the Y bit since the Y bit implies layer switch of all layers, not | |||
| just the layer in which it is sent. Thus, the use of an LRR with VP8 | just the layer in which it is sent. Thus, the use of an LRR with VP8 | |||
| can result in some inefficiency in transmission. However, this is | can result in some inefficiency in transmission. However, this is | |||
| not expected to be a major issue for temporal structures in normal | not expected to be a major issue for temporal structures in normal | |||
| use. | use. | |||
| 4.3. H265 | 4.3. H.265 | |||
| The initial version of the H.265 payload format [RFC7798] defines | The initial version of the H.265 payload format [RFC7798] defines | |||
| temporal scalability, with protocol elements reserved for spatial or | temporal scalability, with protocol elements reserved for spatial or | |||
| other scalability modes (which are expected to be defined in a future | other scalability modes (which are expected to be defined in a future | |||
| version of the specification). | version of the specification). | |||
| +---------------+---------------+ | +---------------+---------------+ | |||
| |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7| | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | RES | TID |RES| LayerId | | | RES | TID |RES| layer ID | | |||
| +---------------+---------------+ | +---------------+---------------+ | |||
| Figure 8 | Figure 8: H.265 Layer Index Fields Format | |||
| Figure 8 shows the format of the layer index field for H.265 streams. | Figure 8 shows the format of the layer index field for H.265 streams. | |||
| The "RES" fields MUST be set to 0 on transmission and ignored on | The "RES" fields MUST be set to zero on transmission and ignored on | |||
| reception. See Section 1.1.4 of [RFC7798] for details on the LayerId | reception. See Section 1.1.4 of [RFC7798] for details on the layer | |||
| and TID fields. | ID and TID fields. | |||
| H.265 streams signal whether they are temporally nested by using the | H.265 streams signal whether they are temporally nested by using the | |||
| vps_temporal_id_nesting_flag in the Video Parameter Set (VPS) and the | vps_temporal_id_nesting_flag in the Video Parameter Set (VPS) and the | |||
| sps_temporal_id_nesting_flag in the Sequence Parameter Set (SPS). If | sps_temporal_id_nesting_flag in the Sequence Parameter Set (SPS). If | |||
| this flag is set in a stream's currently applicable VPS or SPS, | this flag is set in a stream's currently applicable VPS or SPS, | |||
| receivers SHOULD NOT send temporal LRR messages for that stream, as | receivers SHOULD NOT send temporal LRR messages for that stream, as | |||
| every frame is implicitly a temporal layer refresh point. | every frame is implicitly a temporal layer refresh point. | |||
| If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit | If a stream's sps_temporal_id_nesting_flag is not set, the NAL unit | |||
| types 2 to 5 inclusively identify temporal layer switching points. A | types 2 to 5 inclusively identify temporal layer switching points. A | |||
| layer refresh to any higher target temporal layer is satisfied when a | layer refresh to any higher target temporal layer is satisfied when a | |||
| NAL unit type of 4 or 5 with TID equal to 1 more than current TID is | NAL unit type of 4 or 5 with TID equal to 1 more than current TID is | |||
| seen. Alternatively, layer refresh to a target temporal layer can be | seen. Alternatively, layer refresh to a target temporal layer can be | |||
| incrementally satisfied with a NAL unit type of 2 or 3. In this | incrementally satisfied with a NAL unit type of 2 or 3. In this | |||
| case, given current TID = TO and target TID = TN, layer refresh to TN | case, given current TID = TO and target TID = TN, layer refresh to TN | |||
| is satisfied when a NAL unit type of 2 or 3 is seen for TID = T1, | is satisfied when a NAL unit type of 2 or 3 is seen for TID = T1, | |||
| then TID = T2, all the way up to TID = TN. During this incremental | then TID = T2, all the way up to TID = TN (note that TN and TO refer | |||
| to nonce variables in this instance). During this incremental | ||||
| process, layer refresh to TN can be completely satisfied as soon as a | process, layer refresh to TN can be completely satisfied as soon as a | |||
| NAL unit type of 2 or 3 is seen. | NAL unit type of 2 or 3 is seen. | |||
| Of course, temporal layer refresh can also be satisfied whenever any | Of course, temporal layer refresh can also be satisfied whenever any | |||
| Intra-Random Access Point (IRAP) NAL unit type (with values 16-23, | Intra-Random Access Point (IRAP) NAL unit type (with values 16-23, | |||
| inclusively) is seen. An IRAP picture is similar to an IDR picture | inclusively) is seen. An IRAP picture is similar to an IDR picture | |||
| in H.264 (NAL unit type of 5 in H.264) where decoding of the picture | in H.264 (NAL unit type of 5 in H.264) where decoding of the picture | |||
| can start without any older pictures. | can start without any older pictures. | |||
| In the (future) H.265 payloads that support spatial scalability, a | In the (future) H.265 payloads that support spatial scalability, a | |||
| spatial layer refresh of a specific layer can be identified by NAL | spatial-layer refresh of a specific layer can be identified by NAL | |||
| units with the requested layer ID and NAL unit types between 16 and | units with the requested layer ID and NAL unit types between 16 and | |||
| 21, inclusive. A dependency or quality layer refresh is complete | 21, inclusive. A dependency or quality layer refresh is complete | |||
| once NAL units of this type have been seen on all the appropriate | once NAL units of this type have been seen on all the appropriate | |||
| layers (in decoding order) above the current layer index (if any, or | layers (in decoding order) above the current layer index (if any, or | |||
| beginning from the base layer if not) through the target layer index. | beginning from the base layer if not) through the target layer index. | |||
| 5. Usage with Different Scalability Transmission Mechanisms | 5. Usage with Different Scalability Transmission Mechanisms | |||
| Several different mechanisms are defined for how scalable streams can | Several different mechanisms are defined for how scalable streams can | |||
| be transmitted in RTP. The RTP Taxonomy (Section 3.7 of [RFC7656]) | be transmitted in RTP. Section 3.7 of "A Taxonomy of Semantics and | |||
| Mechanisms for Real-Time Transport Protocol (RTP) Sources" [RFC7656] | ||||
| defines three mechanisms: Single RTP stream on a Single media | defines three mechanisms: Single RTP stream on a Single media | |||
| Transport (SRST), Multiple RTP streams on a Single media Transport | Transport (SRST), Multiple RTP streams on a Single media Transport | |||
| (MRST), and Multiple RTP streams on Multiple media Transports (MRMT). | (MRST), and Multiple RTP streams on Multiple media Transports (MRMT). | |||
| The LRR message is applicable to all these mechanisms. For MRST and | The LRR message is applicable to all these mechanisms. For MRST and | |||
| MRMT mechanisms, the "media source" field of the LRR FCI is set to | MRMT mechanisms, the "media source" field of the LRR FCI is set to | |||
| the SSRC of the RTP stream containing the layer indicated by the | the SSRC of the RTP stream containing the layer indicated by the | |||
| Current Layer Index (if "C" is 1) or the stream containing the base | Current Layer Index (if "C" is 1) or the stream containing the base | |||
| encoded stream (if "C" is 0). For MRMT, it is sent on the RTP | encoded stream (if "C" is 0). For MRMT, the LRR message is sent on | |||
| session on which this stream is sent. On receipt, the sender MUST | the RTP session on which this stream is sent. On receipt, the sender | |||
| refresh all the layers requested in the stream, simultaneously in | MUST refresh all the layers requested in the stream, simultaneously | |||
| decode order. | in decode order. | |||
| 6. SDP Definitions | 6. SDP Definitions | |||
| Section 7 of [RFC5104] defines Session Description Protocol (SDP) | Section 7 of [RFC5104] defines Session Description Protocol (SDP) | |||
| procedures for indicating and negotiating support for Codec Control | procedures for indicating and negotiating support for Codec Control | |||
| Messages (CCM) in SDP. This document extends this with a new codec | Messages (CCM) in SDP. This document extends this with a new codec | |||
| control command, "lrr", which indicates support of the LRR. | control command, "lrr", which indicates support of the LRR. | |||
| Figure 9 gives a formal Augmented Backus-Naur Form (ABNF) [RFC5234] | Figure 9 gives a formal Augmented Backus-Naur Form (ABNF) [RFC5234] | |||
| showing this grammar extension, extending the grammar defined in | showing this grammar extension, extending the grammar defined in | |||
| skipping to change at line 531 ¶ | skipping to change at line 538 ¶ | |||
| All the security considerations of FIR feedback packets [RFC5104] | All the security considerations of FIR feedback packets [RFC5104] | |||
| apply to LRR feedback packets as well. Additionally, media senders | apply to LRR feedback packets as well. Additionally, media senders | |||
| receiving LRR feedback packets MUST validate that the payload types | receiving LRR feedback packets MUST validate that the payload types | |||
| and layer indices they are receiving are valid for the stream they | and layer indices they are receiving are valid for the stream they | |||
| are currently sending, and discard the requests if not. | are currently sending, and discard the requests if not. | |||
| 8. IANA Considerations | 8. IANA Considerations | |||
| This document defines a new entry to the "Codec Control Messages" | This document defines a new entry to the "Codec Control Messages" | |||
| subregistry of the "Session Description Protocol (SDP) Parameters" | registry of the "Session Description Protocol (SDP) Parameters" | |||
| registry, according to the following data: | registry group, according to the following data: | |||
| Value Name: lrr | Value Name: lrr | |||
| Long Name: Layer Refresh Request Command | Long Name: Layer Refresh Request Command | |||
| Usable with: ccm | Usable with: ccm | |||
| Mux: IDENTICAL-PER-PT | Mux: IDENTICAL-PER-PT | |||
| Reference: RFC 9627 | Reference: RFC 9627 | |||
| This document also defines a new entry to the "FMT Values for PSFB | This document also defines a new entry to the "FMT Values for PSFB | |||
| Payload Types" subregistry of the "Real-Time Transport Protocol (RTP) | Payload Types" registry of the "Real-Time Transport Protocol (RTP) | |||
| Parameters" registry, according to the following data: | Parameters" registry group, according to the following data: | |||
| Name: LRR | Name: LRR | |||
| Long Name: Layer Refresh Request Command | Long Name: Layer Refresh Request Command | |||
| Value: 10 | Value: 10 | |||
| Reference: RFC 9627 | Reference: RFC 9627 | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| skipping to change at line 599 ¶ | skipping to change at line 606 ¶ | |||
| [RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M. | [RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M. | |||
| M. Hannuksela, "RTP Payload Format for High Efficiency | M. Hannuksela, "RTP Payload Format for High Efficiency | |||
| Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798, | Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798, | |||
| March 2016, <https://www.rfc-editor.org/info/rfc7798>. | March 2016, <https://www.rfc-editor.org/info/rfc7798>. | |||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
| 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
| May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
| [RFC9626] Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame | [RFC9626] Zanaty, M., Berger, E., and S. Nandakumar, "Video Frame | |||
| Marking RTP Header Extension", RFC 9621, | Marking RTP Header Extension", RFC 9626, | |||
| DOI 10.17487/RFC9621, August 2024, | DOI 10.17487/RFC9626, March 2025, | |||
| <https://www.rfc-editor.org/info/rfc9626>. | <https://www.rfc-editor.org/info/rfc9626>. | |||
| 9.2. Informative References | 9.2. Informative References | |||
| [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and | [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and | |||
| B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms | B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms | |||
| for Real-Time Transport Protocol (RTP) Sources", RFC 7656, | for Real-Time Transport Protocol (RTP) Sources", RFC 7656, | |||
| DOI 10.17487/RFC7656, November 2015, | DOI 10.17487/RFC7656, November 2015, | |||
| <https://www.rfc-editor.org/info/rfc7656>. | <https://www.rfc-editor.org/info/rfc7656>. | |||
| [RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund, | [RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund, | |||
| "Using Codec Control Messages in the RTP Audio-Visual | "Using Codec Control Messages in the RTP Audio-Visual | |||
| Profile with Feedback with Layered Codecs", RFC 8082, | Profile with Feedback with Layered Codecs", RFC 8082, | |||
| DOI 10.17487/RFC8082, March 2017, | DOI 10.17487/RFC8082, March 2017, | |||
| <https://www.rfc-editor.org/info/rfc8082>. | <https://www.rfc-editor.org/info/rfc8082>. | |||
| [RFC9628] Lennox, J., Hong, D., Uberti, J., Holmer, S., and M. | [RFC9628] Uberti, J., Holmer, S., Flodman, M., Hong, D., and J. | |||
| Flodman, "The Layer Refresh Request (LRR) RTCP Feedback | Lennox, "RTP Payload Format for VP9 Video", RFC 9628, | |||
| Message", RFC 9628, DOI 10.17487/RFC9628, August 2024, | DOI 10.17487/RFC9628, March 2025, | |||
| <https://www.rfc-editor.org/info/rfc9628>. | <https://www.rfc-editor.org/info/rfc9628>. | |||
| Authors' Addresses | Authors' Addresses | |||
| Jonathan Lennox | Jonathan Lennox | |||
| Vidyo, Inc. | 8x8, Inc. / Jitsi | |||
| 433 Hackensack Avenue | Jersey City, NJ 07302 | |||
| Seventh Floor | ||||
| Hackensack, NJ 07601 | ||||
| United States of America | United States of America | |||
| Email: jonathan@vidyo.com | Email: jonathan.lennox@8x8.com | |||
| Danny Hong | Danny Hong | |||
| Vidyo, Inc. | Google, Inc. | |||
| 433 Hackensack Avenue | 315 Hudson St. | |||
| Seventh Floor | New York, NY 10013 | |||
| Hackensack, NJ 07601 | ||||
| United States of America | United States of America | |||
| Email: danny@vidyo.com | Email: dannyhong@google.com | |||
| Justin Uberti | Justin Uberti | |||
| Google, Inc. | OpenAI | |||
| 747 6th Street South | 1455 3rd St | |||
| Kirkland, WA 98033 | San Francisco, CA 94158 | |||
| United States of America | United States of America | |||
| Email: justin@uberti.name | Email: justin@uberti.name | |||
| Stefan Holmer | Stefan Holmer | |||
| Google, Inc. | Google, Inc. | |||
| Kungsbron 2 | Kungsbron 2 | |||
| SE-111 22 Stockholm | SE-111 22 Stockholm | |||
| Sweden | Sweden | |||
| Email: holmer@google.com | Email: holmer@google.com | |||
| End of changes. 54 change blocks. | ||||
| 99 lines changed or deleted | 103 lines changed or added | |||
| This html diff was produced by rfcdiff 1.48. | ||||
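
The layer index fields in Figures 7 and 8 can be illustrated with a short Python sketch. It is not part of either RFC revision. The column widths of the figures are collapsed in this rendering, so the bit widths used here are assumptions: a 5-bit RES and 3-bit TID in the first octet for both codecs, then an 8-bit RES for VP8 or a 2-bit RES and 6-bit layer ID for H.265. The function names are hypothetical. Reserved bits are written as zero and ignored when parsing, as the text requires.

```python
# Sketch only: assumed layouts of the 16-bit layer index field (bit 0 = MSB).
#   VP8   : RES(5) | TID(3) | RES(8)
#   H.265 : RES(5) | TID(3) | RES(2) | layer ID(6)
# These widths are assumptions; the figures in this diff have lost their spacing.

TID_MASK = 0x07       # 3-bit temporal ID
LAYER_ID_MASK = 0x3F  # 6-bit H.265 layer ID


def pack_vp8_layer_index(tid: int) -> bytes:
    """Build the VP8 layer index field with all RES bits set to zero."""
    if not 0 <= tid <= TID_MASK:
        raise ValueError("TID must fit in 3 bits")
    return (tid << 8).to_bytes(2, "big")


def parse_vp8_layer_index(field: bytes) -> int:
    """Return the TID, ignoring whatever the RES bits contain."""
    value = int.from_bytes(field[:2], "big")
    return (value >> 8) & TID_MASK


def pack_h265_layer_index(tid: int, layer_id: int) -> bytes:
    """Build the H.265 layer index field with all RES bits set to zero."""
    if not 0 <= tid <= TID_MASK:
        raise ValueError("TID must fit in 3 bits")
    if not 0 <= layer_id <= LAYER_ID_MASK:
        raise ValueError("layer ID must fit in 6 bits")
    return ((tid << 8) | layer_id).to_bytes(2, "big")


def parse_h265_layer_index(field: bytes) -> tuple[int, int]:
    """Return (TID, layer ID), ignoring whatever the RES bits contain."""
    value = int.from_bytes(field[:2], "big")
    return (value >> 8) & TID_MASK, value & LAYER_ID_MASK
```

A sender that follows the Security Considerations text would additionally check the parsed values against the layers it is actually sending and discard the request when they do not match; that check depends on encoder state and is not shown here.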