DTLS-SRTP Handling in Session Initiation Protocol (SIP) Back-to-Back User Agents (B2BUAs)

Cisco, Cessna Business Park, Sarjapur-Marathahalli Outer Ring Road, Bangalore, Karnataka 560103, India. rmohanr@cisco.com
Cisco, Cessna Business Park, Varthur Hobli, Sarjapur Marathalli Outer Ring Road, Bangalore, Karnataka 560103, India. tireddy@cisco.com
Cisco Systems, Inc., 7200-12 Kit Creek Road, Research Triangle Park, NC 27709, US. gsalguei@cisco.com
Quobis, Spain. victor.pascual@quobis.com

Real-time Applications and Infrastructure (RAI)
STRAW

Session Initiation Protocol (SIP) Back-to-Back User Agents
(B2BUAs) often function on the media plane, rather than just on
the signaling path. This document describes the behavior B2BUAs
should follow when acting on the media plane of sessions that use a Secure
Real-time Transport Protocol (SRTP) security context set up with the
Datagram Transport Layer Security (DTLS) protocol.

[RFC5763] describes how the Session Initiation
Protocol (SIP) can be used to establish
a Secure Real-time Transport Protocol (SRTP) security context with the Datagram Transport
Layer Security (DTLS) protocol. It
describes a mechanism for transporting a certificate fingerprint
in the Session Description Protocol (SDP), which identifies the certificate that will
be presented during the DTLS handshake. DTLS-SRTP is defined for
point-to-point media sessions, in which there are exactly two
participants. Each DTLS-SRTP session contains a single DTLS
association, and either two SRTP contexts (if media traffic is
flowing in both directions on the same host/port quartet) or one
SRTP context (if media traffic is only flowing in one
direction).

In many SIP deployments, SIP entities exist in the SIP
signaling path between the originating and final terminating
endpoints. These SIP entities, described in the SIP B2BUA taxonomy, modify SIP and SDP bodies and are also
likely to be on the media path. Such entities, when present in
the signaling/media path, are likely to do several things. For
example, some B2BUAs modify parts of the SDP body (like IP
address, port) and subsequently modify the RTP headers as well.
There are other types of B2BUAs that completely modify the RTP
packet, including the payload (e.g., a transcoder). In all these
cases a DTLS association would break unless the B2BUA
participates in the DTLS setup and ensures the contexts are
set up properly. A B2BUA that is in the media path MUST support the DTLS
stack and the SRTP extensions for DTLS described in [RFC5764] so that it can function as a DTLS proxy.

The SIP B2BUA taxonomy describes three different categories
of such B2BUAs, according to the level of activities performed
on the media plane:

1.  A B2BUA that acts as a simple media relay, effectively
    unaware of anything that is transported, and only modifies
    the UDP/IP header of the packets.

2.  A B2BUA that performs a media-aware role. It inspects and
    potentially modifies RTP or RTP Control Protocol (RTCP)
    headers, but it does not modify the payload of RTP/RTCP.

3.  A B2BUA that performs a media-termination role and
    operates at the media payload layer, such as RTP/RTCP
    payload (e.g., a transcoder).

The following sections describe the behavior B2BUAs
should follow in order to avoid any impact on end-to-end
DTLS-SRTP streams.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in [RFC2119].

The following generalized terms are defined in [RFC3261], Section 6.

B2BUA: a SIP Back-to-Back User Agent, which is the
logical combination of a User Agent Server (UAS) and User
Agent Client (UAC).

UAS: a SIP User Agent Server.

UAC: a SIP User Agent Client.

All of the pertinent B2BUA terminology used in
this document is based on the SIP B2BUA taxonomy.

It is assumed the reader is already familiar with the
fundamental concepts of the RTP protocol and its taxonomy, as well as
those of SRTP and DTLS.

A media relay, as identified in Section 3.2.1 of the SIP B2BUA taxonomy, basically just forwards, from an
application layer point-of-view, all packets it receives on a
negotiated UDP connection, without either inspecting or
modifying them. It forwards the UDP payload as-is,
changing only the UDP/IP header.

A media relay B2BUA MUST forward the certificate
fingerprint and setup attribute it receives in the SDP from
the originating endpoint as-is to the remote side and
vice versa. The example below shows an "INVITE with SDP" SIP
call flow with both SIP user agents doing DTLS-SRTP with a
media relay B2BUA that changes the UDP/IP address/port.

NOTE: For the sake of brevity the entire fingerprint
attribute is not shown.

For each RTP or RTCP flow the peers do a DTLS handshake on
the same source and destination port pair to establish a DTLS
association. In this case, Bob triggers a DTLS connection after
he receives the INVITE. Note that the DTLS handshake and the
response to the INVITE may happen in parallel; thus, the B2BUA
SHOULD be prepared to receive media on the ports it advertised
to Bob in the offer. Since a media relay B2BUA does not
differentiate between DTLS, RTP, or any other packet sent, it just
changes the UDP/IP addresses and forwards the packet on either
leg.

[[TODO: ICE handling w.r.t. media relay B2BUA will be
discussed in STUN passthrough STRAW WG item and the reference
will be added in this section]]

A media-aware relay, unlike the media relay discussed
in the previous section, is actually aware of the media
traffic it is handling. A media-aware relay inspects SRTP and
SRTCP packets flowing through it, and may even be able to
modify the headers in any of them before forwarding them. A
B2BUA performing such a media-aware role decrypts the
payload and re-encrypts it, but it does not modify the contents
of the payload itself. Note that when such a media-aware B2BUA
modifies SRTP headers it MUST act as a DTLS intermediary and
terminate the DTLS connection so it can decrypt/re-encrypt in
order to properly update the compound SRTCP packets and keep
them consistent. This DTLS proxy functionality of media-aware
B2BUAs is discussed in greater detail in Section X of the STRAW RTCP document.

[[TODO: Update reference to STRAW RTCP document once this
new section appears in the next version (in progress).]]

In addition to modifying the headers, a B2BUA performing a
media termination role can modify parts of the payload as
well. For example, a transcoder is a type of media terminator
that modifies the payload before it forwards the packet. These
B2BUAs SHOULD be able to distinguish between
DTLS, SRTP, SRTCP, and other packets (e.g., STUN) received on
the same UDP port by using the algorithm described in Section
5.1.2 of [RFC5764], and handle
them separately.

The example below shows how a DTLS-SRTP session is set up for
these B2BUAs.

NOTE: For the sake of brevity the entire fingerprint
attribute is not shown.

NOTE: The same call flow would be applicable to "INVITE
without SDP" Offer calls.

NOTE: Steps 5 and 6 may occur in parallel, so the B2BUA MAY
receive a ClientHello before it sees a 200 OK. Steps 7 and 8 can
happen in any order. Steps 9, 10, and 11 may also occur in parallel.
The B2BUA should be prepared to handle these responses on each leg
independently.

A media termination B2BUA MUST replace the certificate
fingerprint received from each endpoint with its
own certificate fingerprint in the SDP. This allows the B2BUA
to act as a DTLS-SRTP proxy and modify the payload.

It is possible that the DTLS exchange and the offer/answer exchange
happen in parallel. If a NAT exists between the B2BUA and
the UA, the DTLS ClientHello message will be lost if the
answer has not yet been received by the UA. To overcome this issue,
the ClientHello retransmission described in Section
4.2.4.1 of the DTLS specification SHALL be followed, or
the ClientHello MAY be sent only after the offer/answer exchange is
complete.

B2BUAs may receive multiple answers for an outbound INVITE
due to a downstream proxy forking the INVITE to multiple
targets. Each of these responses may carry a
different certificate fingerprint. The B2BUA SHOULD
set up a separate DTLS-SRTP association with each of the
forked targets.

This document simply describes the behavior B2BUAs should
follow when acting on the media plane of sessions that use an SRTP security
context set up with the DTLS protocol. It does not introduce any
specific security considerations beyond those detailed in the DTLS-SRTP specifications. The B2BUA behaviors outlined here also do
not impact the security and integrity of the DTLS-SRTP session
or the data exchanged over it.

This document makes no request of IANA.

Special thanks to Lorenzo Miniero, Ranjit Avarsala, Hadriel
Kaplan, Muthu Arul Mozhi, Paul Kyzivat, Peter Dawes and Brett
Tate for their constructive comments, suggestions, and early
reviews that were critical to the formulation and refinement of
this document.

Rajeev Seth provided substantial contributions to this
document.
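
The SDP handling described earlier for media relay and media termination B2BUAs can be illustrated with a minimal Python sketch. It is not part of the specification. The function name rewrite_sdp, its parameters, the IP addresses, and the abbreviated fingerprint values are all hypothetical and chosen only for illustration. The sketch assumes a single IPv4 connection line and leaves the setup attribute untouched in both modes; how a media termination B2BUA chooses its own DTLS role is not covered.

```python
from typing import Optional


def rewrite_sdp(sdp: str, b2bua_ip: str, b2bua_port: int,
                own_fingerprint: Optional[str] = None) -> str:
    """Rewrite an SDP body the way a B2BUA on the media path might.

    Media relay mode (own_fingerprint is None): only the connection
    address and media port change; a=fingerprint and a=setup pass
    through as-is.

    Media termination mode (own_fingerprint given): the endpoint's
    fingerprint is additionally replaced with the B2BUA's own.
    """
    out = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            # Point the connection address at the B2BUA.
            line = f"c=IN IP4 {b2bua_ip}"
        elif line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...: replace the port only.
            fields = line.split(" ")
            fields[1] = str(b2bua_port)
            line = " ".join(fields)
        elif line.startswith("a=fingerprint:") and own_fingerprint is not None:
            # Media termination: signal the B2BUA's own certificate.
            line = f"a=fingerprint:{own_fingerprint}"
        out.append(line)
    return "\r\n".join(out) + "\r\n"


# Hypothetical offer; the fingerprint is abbreviated for brevity.
offer = (
    "v=0\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "m=audio 49170 UDP/TLS/RTP/SAVP 0\r\n"
    "a=setup:actpass\r\n"
    "a=fingerprint:SHA-1 4A:AD:B9:B1\r\n"
)

relayed = rewrite_sdp(offer, "198.51.100.5", 30000)
terminated = rewrite_sdp(offer, "198.51.100.5", 30000, "SHA-1 FF:EE:DD:CC")
```

In relay mode the endpoints still authenticate each other's certificates end to end, because the fingerprints are untouched. In termination mode each endpoint instead validates the B2BUA's certificate on its own leg.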
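
The packet demultiplexing that a media termination B2BUA needs on a shared UDP port can be sketched as follows. This is a minimal illustration, not a definitive implementation. The first-byte ranges follow the demultiplexing rule in Section 5.1.2 of RFC 5764 (0 or 1 for STUN, 20 to 63 for DTLS, 128 to 191 for RTP/RTCP). The additional second-byte check that separates SRTCP from SRTP is an assumption that is not stated in this document; it relies on the RTP/RTCP multiplexing convention that RTCP packet types fall in the range 192 to 223. The function name classify_packet is hypothetical.

```python
def classify_packet(datagram: bytes) -> str:
    """Classify a UDP payload received on a DTLS-SRTP media port."""
    if not datagram:
        return "unknown"
    first = datagram[0]
    if first in (0, 1):
        return "stun"
    if 20 <= first <= 63:
        return "dtls"
    if 128 <= first <= 191:
        # Assumption: with RTP/RTCP multiplexing, an RTCP packet type
        # occupies the second byte and lies in 192..223.
        if len(datagram) > 1 and 192 <= datagram[1] <= 223:
            return "srtcp"
        return "srtp"
    return "unknown"
```

A B2BUA could use the returned label to hand DTLS records to its DTLS stack, SRTP and SRTCP packets to the matching SRTP context, and STUN messages to its connectivity-check handling, each on a per-leg basis.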