RTP
The Real-time Transport Protocol (RTP) application is a block process with a variable block size. The process block size must match the block size of the codec. The block size can be changed after initialization.
RTP provides end-to-end transport functions suitable for applications transmitting interactive real-time data, such as audio. It is commonly used in Internet telephony applications.
RTP does not in itself guarantee real-time delivery of multimedia data. It is often encapsulated within the User Datagram Protocol (UDP). In general, UDP is an unreliable transport mechanism: packet delivery is not guaranteed, messages may be duplicated, and sequencing is not preserved. The RTP header therefore includes timestamps and sequence numbers, which allow the receiver to detect lost, duplicated, or out-of-sequence packets. The same information can be used to compensate for packet delay jitter.
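As an illustration of how a receiver might use the 16-bit sequence number, the following Python sketch classifies each arriving packet as in order, following a gap, late, or duplicated. It is a minimal example and not the DSP implementation; the names `seq_delta`, `SequenceTracker`, and `window` are hypothetical.

```python
def seq_delta(seq, expected):
    """Signed distance from `expected` to `seq`, modulo 2**16."""
    d = (seq - expected) & 0xFFFF
    return d - 0x10000 if d >= 0x8000 else d


class SequenceTracker:
    """Classifies packets by RTP sequence number (hypothetical helper)."""

    def __init__(self, window=64):
        self.expected = None   # next sequence number we hope to see
        self.recent = set()    # recently seen numbers, for duplicate checks
        self.window = window   # how far back duplicates are remembered

    def classify(self, seq):
        if seq in self.recent:
            return "duplicate"
        if self.expected is None:
            # First packet: the initial sequence number is random.
            result = "in-order"
            self.expected = (seq + 1) & 0xFFFF
        else:
            d = seq_delta(seq, self.expected)
            if d == 0:
                result = "in-order"
            elif d > 0:
                result = "gap"    # d packets are missing or still in flight
            else:
                result = "late"   # arrived after a newer packet
            if d >= 0:
                self.expected = (seq + 1) & 0xFFFF
        self.recent.add(seq)
        # Forget numbers that fall outside the duplicate-detection window.
        self.recent = {
            s for s in self.recent
            if -self.window <= seq_delta(s, self.expected) < 0
        }
        return result
```

The modulo arithmetic in `seq_delta` lets the classification keep working when the sequence number wraps from 65535 back to 0.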
Bits | 0-1 | 2 | 3 | 4-7 | 8 | 9-15 | 16-31
Word 0 | V | P | X | CCount | M | Payload type | Sequence number
Word 1 | Timestamp
Word 2 | Synchronization source (SSRC) identifier
Word 3 onward | Contributing source (CSRC) identifier
Figure 1: RTP Header Format
Figure 1 shows the RTP header fields. The first twelve bytes are present in every RTP packet. Each field and its use are described briefly below; for more details, refer to [4].
Version (V): This field identifies the version of the RTP packet. This application supports version 2.
Padding (P): Indicates whether there are additional padding octets at the end of the packet that are not part of the payload.
Extension (X): Indicates whether the fixed header is followed by exactly one header extension.
CSRC Count (CCount): Contains the number of CSRC identifiers that follow the fixed header.
Marker (M): The marker bit is defined by the profile. For audio applications that send either no packet or comfort-noise, the marker bit is used to identify the first packet of a talk spurt.
Payload type (PT): This field identifies the format of the RTP payload and determines its interpretation by the application. It can be changed to adapt to a variation in bandwidth.
Sequence Number: The receiver can use this to restore the packet sequence or to detect lost packets. The initial value of the sequence number is a random value. It increments by one for each RTP packet transmitted.
Timestamp: The timestamp is incremented monotonically and linearly in time. The initial value of the timestamp is a random value. The resolution of the clock depends on the format of the payload. The timestamp can be used to detect delay jitter within a single stream and to compensate for it.
SSRC: This identifier is chosen randomly, with the intent that no two sources within the same RTP session will have the same SSRC identifier. This identifies the originator of the frame.
CSRC: The CSRC list identifies the contributing sources for the payload contained in the RTP packet. The number of identifiers is given by CCount.
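The field layout described above can be sketched as a small parser. The following Python example is illustrative only and is not the DSP code; `RtpHeader` and `parse_rtp_header` are hypothetical names. It unpacks the twelve-byte fixed header and the CSRC list that follows it.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed RTP header and CSRC list (network byte order)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    version = b0 >> 6                 # V: top two bits
    if version != 2:
        raise ValueError("unsupported RTP version %d" % version)
    cc = b0 & 0x0F                    # CCount: low four bits
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrc = struct.unpack("!%dI" % cc, packet[12:end])
    return RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),      # P
        extension=bool(b0 & 0x10),    # X
        csrc_count=cc,
        marker=bool(b1 & 0x80),       # M: top bit of the second byte
        payload_type=b1 & 0x7F,       # PT: low seven bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc=csrc,
    )
```

For example, a packet beginning with the bytes `80 88` carries version 2, no padding, no extension, no CSRC identifiers, the marker bit set, and payload type 8.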
The RTP process interacts with other processes such as a codec, a DTMF receiver, and a tone generator. The currently supported codecs are G.711 (PCMA and PCMU) and G.726 (all rates). The RTP process interacts with the DTMF receiver and the tone generator to control out-of-band DTMF signaling. Out-of-band signaling can be carried either in RTP packets as described in RFC 2833 [2] or via a TCP/IP-based protocol managed by the host via the RTP process.
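To make the RFC 2833 option concrete, the sketch below packs and unpacks the four-byte telephone-event payload defined in that RFC: an 8-bit event code, an end bit, a reserved bit, a 6-bit volume, and a 16-bit duration in timestamp units. It is a Python illustration rather than the application's code; the function names and the default volume are assumptions.

```python
import struct

# RFC 2833 event codes 0-15 map to the DTMF digits in this order.
DTMF_EVENTS = "0123456789*#ABCD"


def pack_dtmf_event(digit, duration, volume=10, end=False):
    """Build a telephone-event payload for one DTMF digit.

    duration is in RTP timestamp units (800 units = 100 ms at 8000 Hz);
    volume is the tone level in -dBm0, 0..63.
    """
    if len(digit) != 1 or digit not in DTMF_EVENTS:
        raise ValueError("not a DTMF digit: %r" % (digit,))
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    event = DTMF_EVENTS.index(digit)
    flags = (0x80 if end else 0) | volume   # E bit, R bit = 0, volume
    return struct.pack("!BBH", event, flags, duration)


def unpack_dtmf_event(payload):
    """Return (digit, end, volume, duration); digit is None for non-DTMF events."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    digit = DTMF_EVENTS[event] if event < len(DTMF_EVENTS) else None
    return digit, bool(flags & 0x80), flags & 0x3F, duration
```

Events outside the DTMF range decode with `digit` set to `None`, which is one way such events could be handed on unchanged.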
The RTP application has two modes of operation: transmitting and receiving. The transmitter creates RTP packets and the receiver terminates RTP packets. These packets may be either audio codec packets or tone packets as described in RFC 2833 [2]. An RTP process works either as a transmitter or as a receiver, not both. In addition, the RTP process supplies the data that the RTP Control Protocol (RTCP) needs to monitor data delivery: the transmit process provides the information required to send RTCP sender reports, and the receive process provides the information required to generate receiver reports.
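One statistic carried in RTCP receiver reports is the interarrival jitter estimate defined in the RTP specification [4], J = J + (|D| - J)/16, where D is the change in transit time between consecutive packets. The source does not state which statistics the receive process collects, so the Python sketch below is only an example of the kind of data involved; `JitterEstimator` is a hypothetical name, and 32-bit timestamp wraparound is ignored for brevity.

```python
class JitterEstimator:
    """Running interarrival jitter, in RTP timestamp units."""

    def __init__(self):
        self.prev_transit = None   # transit time of the previous packet
        self.jitter = 0.0

    def update(self, arrival, rtp_timestamp):
        """arrival: local clock at reception, in the same units as the
        RTP timestamp. Returns the updated jitter estimate."""
        transit = arrival - rtp_timestamp
        if self.prev_transit is not None:
            d = abs(transit - self.prev_transit)
            # Exponential smoothing with gain 1/16, as in the RTP specification.
            self.jitter += (d - self.jitter) / 16.0
        self.prev_transit = transit
        return self.jitter
```

With packets sent 160 units apart, a packet that arrives 20 units later than its predecessor's spacing implies raises the estimate from 0 to 20/16 = 1.25 units.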
The RTP receiver supports two modes of jitter buffer management (see Note 1):
- Fixed delay:
The delay is specified by the host and can change during the call. This can be used in conjunction with the host implementing its own adaptive jitter buffering algorithm.
- Adaptive:
The latency introduced by the jitter buffer changes dynamically with the network conditions. This is accomplished by inserting or skipping a number of frames as needed. Overall, this mechanism helps keep packet loss at the desired level while keeping the delay introduced by the jitter buffer to a minimum.
In cases where packets are lost over the network or arrive late, the RTP receiver designates the corresponding frames in the jitter buffer as invalid and leaves it up to the voice decoder to recover them.
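The fixed-delay behaviour and the handling of missing frames can be sketched as follows. This Python example is a simplified model, not the application's algorithm: it assumes one codec frame per RTP packet, and the class and parameter names are hypothetical. Playout starts once an initial number of frames has been buffered, and a frame that is lost or arrives after its slot is reported as invalid so that a decoder could conceal it.

```python
class FixedDelayJitterBuffer:
    def __init__(self, initial_latency_frames):
        self.initial_latency = initial_latency_frames
        self.frames = {}       # sequence number -> frame payload
        self.next_seq = None   # sequence number due at the next playout tick
        self.started = False

    def push(self, seq, frame):
        """Store an arriving frame. Returns False if it arrived too late."""
        if self.next_seq is None:
            self.next_seq = seq
        behind = ((seq - self.next_seq) & 0xFFFF) >= 0x8000
        if behind:
            if self.started:
                return False          # its playout slot has already passed
            self.next_seq = seq       # reordered during pre-fill: start earlier
        self.frames[seq] = frame
        return True

    def pop(self):
        """Call once per frame period. Returns (valid, frame)."""
        if not self.started:
            if self.next_seq is None or len(self.frames) < self.initial_latency:
                return (False, None)  # still pre-filling
            self.started = True
        frame = self.frames.pop(self.next_seq, None)
        self.next_seq = (self.next_seq + 1) & 0xFFFF
        # A missing frame is flagged invalid; recovery is left to the decoder.
        return (frame is not None, frame)
```

A larger `initial_latency_frames` gives late packets more time to arrive before their slot, at the cost of added delay.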
The RTP application is designed in accordance with the specifications described in [2], [3], [4], and [5], with the following exceptions:
- This application does not transmit or receive RTCP packets.
- Out-of-band signaling can only be used to send DTMF tones. Most other tones, such as call progress tones, are not designed for machine detection and can therefore be sent effectively in-band. Unrecognized Named Tone Events can be passed to the host if desired.
Platform Support
- PIKA MonteCarlo 6.2
- VPOS, PIKA Technologies’ proprietary voice processing operating system, which supports the PIKA AllOnBoard architecture
- Written primarily in C with some Motorola assembly for the DSP563xx
- This DSP application uses a 16-bit arithmetic model
Features
- Supports G.711 (PCMA and PCMU), G.729ab, G.726 (40, 32, 24 and 16 Kbps) and G.723.1
- Supports 1 ms frame resolution for G.711 and G.726 reception
- Supports dynamic payload types
- Supports out-of-band DTMF signaling as specified in RFC 2833
- Supports fixed delay and adaptive jitter buffer management
- Creates data for RTCP reports
- Disables DTMF detection while generating a digit on the receiver side in order to prevent false detection of echoed DTMF signals
- Flexible architecture. A number of parameters are visible to the user and if necessary, they can be altered to adjust the performance of the application:
– Jitter buffer size
– Packetization rate: The number of encoder frames per RTP packet
– Initial Latency: This parameter indicates the number of frames to be placed in the jitter buffer before starting the decoder
– Enable/disable out-of-band DTMF signaling
– Play tone delay: This is used to delay the generation of a DTMF tone when the RFC 2833 protocol is being used. Delaying the start of a tone ensures that tone generation does not end prematurely when an RFC 2833 RTP packet is late or lost.
Specifications
Codec supported | Min. RTP packet size, transmit (ms) | Min. RTP packet size, receive (ms) | Max. RTP packet size, transmit (ms) | Max. RTP packet size, receive (ms) | Resolution, transmit (ms) | Resolution, receive (ms)
G.711 | 10 | 0 | 200 | 200 | 10 | 1
G.726 | 10 | 0 | 200 | 200 | 10 | 1
Jitter buffer size = 4 seconds maximum (limited by the 16-bit DSP word)
Resource Requirements
Memory requirements
Application memory (in DSP words) | Process memory (per process, in DSP words) | Host-allocated memory (per process, in DSP words)
4419 | 228 | Transmitter = 41 + (12 + M*80)/2; Receiver = 806 + (5*N)
Where:
M = CodecFramesPerRTPPacket
N = The size of the Jitter Buffer in milliseconds.
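As a worked example of the host-allocated memory formulas, the short Python sketch below evaluates them for a given configuration. The function names are hypothetical; the constants are those in the table above.

```python
def transmitter_words(frames_per_packet):
    """Host-allocated DSP words for a transmitter; M = frames_per_packet."""
    # 12 + M*80 is always even, so integer division is exact.
    return 41 + (12 + frames_per_packet * 80) // 2


def receiver_words(jitter_buffer_ms):
    """Host-allocated DSP words for a receiver; N = jitter_buffer_ms."""
    return 806 + 5 * jitter_buffer_ms
```

With M = 2 frames per RTP packet the transmitter needs 41 + (12 + 160)/2 = 127 words, and with a jitter buffer of N = 200 ms the receiver needs 806 + 1000 = 1806 words.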
Note: If G.711 will not be supported, refer to the Interface Specification documents of the supported codecs to minimize the host-allocated memory.
Sample MIPS Requirements
Operating mode | G.711 (MIPS) | G.726 (MIPS)
Transmit | 0.24 | 0.22
Receive | 0.63 | 0.72
Idle | |
MIPS figures are for a 10 ms frame size and 2 frames per RTP packet (30 ms and 1 frame, respectively, for G.723.1).
Notes
Note 1: The jitter buffer removes the jitter in packet arrival times caused by IP networks, at the cost of an increase in the overall delay. There is a trade-off between the delay caused by the jitter buffer and packet loss: a large jitter buffer increases the delay and decreases packet loss, while a small jitter buffer decreases the delay but increases packet loss. The appropriate size of the jitter buffer depends on the condition of the network.
Reference Documents
[1] H.225.0, “Call Signaling Protocols and Media Stream Packetization for Packet-Based Multimedia Communication Systems”, ITU-T, Nov. 2000.
[2] RFC 2833 draft, “RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals”, IETF May 2000.
[3] RFC 1890 draft, “RTP Profile for Audio and Video Conferences with Minimal Control”, IETF: draft-ietf-profile-nex-10.txt, Mar 2001. See http://www.ietf.org/rfc.html for most recent versions of this document. A more recent version may also exist in draft form, see http://www.ietf.org/ids.by.wg/avt.html (draft-ietf-avt-profile-new-xx).
[4] RFC 1889 draft, “RTP A Transport Protocol for Real-Time Applications”, IETF: draft-ietf-avt-rtp-new-09.txt, Mar 2001. See http://www.ietf.org/rfc.html for most recent versions of this document. A more recent version may also exist in draft form, see http://www.ietf.org/ids.by.wg/avt.html (draft-ietf-avt-rtp-new-xx).
[5] RFC 2833 draft, “RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals”, IETF May 2000.
The PIKA Plus Advantage