RealMedia

From MultimediaWiki

Multimedia container format developed by Real and used almost exclusively for codecs developed by Real.

The old .ra files are just for audio. The newer RealMedia (.rm) files are for audio and video.

RA Format

This is the old audio-only RealAudio file format. A very similar structure is also used to describe audio streams in RM files.

The audio data part is just a stream of bytes with no structure. There is no index in .ra files, but seeking is possible because the codecs are CBR.
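
As an illustration of seeking in a constant-bitrate stream, the sketch below maps a time to a byte offset. The function and all its parameter names are hypothetical, and the alignment to a frame boundary is an assumption rather than something the format description requires.

```python
def cbr_seek_offset(header_len, bit_rate, seconds, frame_size=1):
    """Return the byte offset of the audio frame that contains `seconds`.

    header_len: bytes before the audio data (hypothetical parameter)
    bit_rate:   constant bit rate in bits per second
    frame_size: codec frame size in bytes (assumed alignment unit)
    """
    byte_pos = int(seconds * bit_rate / 8)  # bytes of audio before that time
    byte_pos -= byte_pos % frame_size       # snap back to a frame boundary
    return header_len + byte_pos
```

For an 8 kbps stream every second of audio occupies 1000 bytes, so a seek to 2 seconds lands 2000 bytes after the header.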

RealAudio 1.0 file (.ra version 3)

This is from the very first version of RealAudio (1995). These files can only contain 8 kbps VSELP audio data. A FourCC (lpcJ) may be present, but it is ignored. Byte order is big-endian.

byte[4]  Header signature ('.', 'r', 'a', 0xfd)
word     Version (always 3)
word     Header size, not including first 8 bytes
byte[10] Unknown
dword    Data size
byte     Title string length
byte[]   Title string
byte     Author string length
byte[]   Author string
byte     Copyright string length
byte[]   Copyright string
byte     Comment string length
byte[]   Comment string
byte     Unknown *
byte     Fourcc string length (always 4) *
byte[]   Fourcc string (always "lpcJ") *
 Audio data

Notes:

  • Fields marked with * may be missing. Based on the only known sample with no FourCC, it's assumed that these fields are either all present or all missing. To determine whether they are missing, check the header size (bytes 6-7) against the end of the comment string.
  • The informative fields (title, author, copyright and comment) can have zero length.
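
A minimal Python sketch of reading this header, following the field table above. The names `read_pstring` and `parse_ra3` are made up for illustration. The strings are returned as raw bytes because their encoding is not stated. Taking the audio data to start at 8 + header size is an assumption drawn from the "not including first 8 bytes" remark.

```python
import struct

def read_pstring(buf, pos):
    """Read a string prefixed by a one-byte length; return (bytes, new_pos)."""
    n = buf[pos]
    return buf[pos + 1:pos + 1 + n], pos + 1 + n

def parse_ra3(buf):
    if buf[:4] != b".ra\xfd":
        raise ValueError("missing .ra signature")
    version, header_size = struct.unpack_from(">HH", buf, 4)
    if version != 3:
        raise ValueError("not a .ra version 3 header")
    # bytes 8..17 are unknown, the data size follows at offset 18
    (data_size,) = struct.unpack_from(">I", buf, 18)
    pos = 22
    info = {"data_size": data_size}
    for key in ("title", "author", "copyright", "comment"):
        info[key], pos = read_pstring(buf, pos)
    header_end = 8 + header_size  # assumed start of the audio data
    info["fourcc"] = None
    if pos < header_end:          # the optional fields are present
        pos += 1                  # unknown byte
        info["fourcc"], pos = read_pstring(buf, pos)
    info["data_offset"] = header_end
    return info
```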

RealAudio 2.0 file (.ra version 4)

This is the second version of the RealAudio file format. It is distinguished from the above by the value in byte 5 (=0x04). This type of file must contain a valid FourCC to identify the audio codec.

Possible FourCC values are 28_8, dnet and sipr.

byte[4]  Header signature ('.', 'r', 'a', 0xfd)
word     Version (always 4)
word     Unused (always 0)
byte[4]  ra4 signature (always ".ra4")
dword    Data size - 0x27
word     Version2 (always equal to version)
dword    Header size - 16
word     Codec flavor
dword    Coded frame size
byte[12] Unknown
word     Sub packet h
word     Frame size
word     Subpacket size
word     Unknown
word     Samplerate
word     Unknown
word     Sample size
word     Channels
byte     Interleaver ID string length (always 4)
byte[]   Interleaver ID string
byte     FourCC string length (always 4)
byte[]   FourCC string
byte[3]  Unknown
byte     Title string length
byte[]   Title string
byte     Author string length
byte[]   Author string
byte     Copyright string length
byte[]   Copyright string
byte     Comment string length
byte[]   Comment string
 Audio Data

Notes:

  • The 0x27 in data size is the size of the fixed-length part of the header (up to channels).
  • The informative fields (title, author, copyright and comment) can have zero length.
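
The version 4 layout can be read in the same way. The sketch below is only an illustration of the table above: `RA4_FIXED`, `read_pstring` and `parse_ra4` are hypothetical names, the unknown fields are skipped, and the two size fields are returned exactly as stored, without applying the offsets mentioned in the table.

```python
import struct

# Fixed-length fields from the signature up to and including "Channels".
RA4_FIXED = struct.Struct(">4sHH4sIHIHI12sHHHHHHHH")

def read_pstring(buf, pos):
    """Read a string prefixed by a one-byte length; return (bytes, new_pos)."""
    n = buf[pos]
    return buf[pos + 1:pos + 1 + n], pos + 1 + n

def parse_ra4(buf):
    (sig, version, _unused, ra4_sig, data_size_field, version2,
     header_size_field, flavor, coded_frame_size, _unknown12,
     sub_packet_h, frame_size, subpacket_size, _unknown_a,
     sample_rate, _unknown_b, sample_size, channels) = RA4_FIXED.unpack_from(buf, 0)
    if sig != b".ra\xfd" or version != 4 or ra4_sig != b".ra4":
        raise ValueError("not a .ra version 4 header")
    info = {
        "data_size_field": data_size_field,
        "header_size_field": header_size_field,
        "flavor": flavor, "coded_frame_size": coded_frame_size,
        "sub_packet_h": sub_packet_h, "frame_size": frame_size,
        "subpacket_size": subpacket_size, "sample_rate": sample_rate,
        "sample_size": sample_size, "channels": channels,
    }
    pos = RA4_FIXED.size
    info["interleaver"], pos = read_pstring(buf, pos)
    info["fourcc"], pos = read_pstring(buf, pos)
    pos += 3  # three unknown bytes
    for key in ("title", "author", "copyright", "comment"):
        info[key], pos = read_pstring(buf, pos)
    info["header_end"] = pos  # the audio data follows the comment string
    return info
```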

.ra version 5

While the .ra header can contain version 5, there are no known RealAudio files with this format, and it's not known if they really exist.

RealMedia Format

This is the newer format which stores both audio and video. All multi-byte numbers are stored in big-endian format.

A RealMedia file consists of a series of chunks. Each chunk has the following format:

dword  chunk type (FOURCC)
dword  chunk size, including 8-byte preamble
word   chunk version
byte[] chunk payload

Real chunk types:

  • .RMF: RealMedia file header (only one per file, must be the first chunk)
  • PROP: File properties (only one per file)
  • MDPR: Stream properties (one for each stream)
  • CONT: Content description/metadata (typically one per file)
  • DATA: File data
  • INDX: File index (typically one per stream)


RealMedia file header (.RMF)

This must be the first chunk in a RealMedia file. Only one .RMF can be present in a file. The only useful information carried by .RMF is the number of headers.

A .RMF chunk has the following format

dword chunk type ('.RMF')
dword chunk size (typically 0x12)
word  chunk version (always 0, for every known file)
dword file version
dword number of headers

Notes:

  • All known sample files have version equal to 0.
  • There is a sample with chunk size = 0x10; in that case the file version is a word instead of a dword. Note that the sample has chunk version = 0 like all other files.
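
A sketch of a reader that handles both the usual 0x12-byte chunk and the 0x10-byte variant from the note above. `parse_rmf` is a hypothetical name, and switching on the chunk size is simply one way to tell the two layouts apart.

```python
import struct

def parse_rmf(chunk):
    """Return (file_version, number_of_headers) from a .RMF chunk."""
    fourcc, size, _chunk_version = struct.unpack_from(">4sIH", chunk, 0)
    if fourcc != b".RMF":
        raise ValueError("not a .RMF chunk")
    if size == 0x10:
        # short variant: the file version is stored as a word
        return struct.unpack_from(">HI", chunk, 10)
    return struct.unpack_from(">II", chunk, 10)
```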


File properties header (PROP)

This chunk contains some information about the general properties of a RealMedia file. Only one PROP chunk can be present in a file.

A PROP chunk has the following format

dword  Chunk type ('PROP')
dword  Chunk size (typically 0x32)
word   Chunk version (always 0, for every known file)
dword  Maximum bit rate
dword  Average bit rate
dword  Size of largest data packet
dword  Average size of data packet
dword  Number of data packets in the file
dword  File duration in ms
dword  Suggested number of ms to buffer before starting playback
dword  Offset of the first INDX chunk from the start of the file
dword  Offset of the first DATA chunk from the start of the file
word   Number of streams in the file
word   Flags (bitfield, see below)

Flags:

  • bit 0: file can be saved on disk
  • bit 1: PerfectPlay can be used (extra buffering)
  • bit 2: the file is a live broadcast
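
The fixed layout above adds up to 50 bytes, which matches the typical chunk size of 0x32. A sketch of unpacking it and decoding the flags follows. `PROP`, `parse_prop` and the dictionary keys are illustrative names only.

```python
import struct

PROP = struct.Struct(">4sIH9I2H")  # type, size, version, 9 dwords, 2 words

FIELD_NAMES = (
    "max_bit_rate", "avg_bit_rate", "max_packet_size", "avg_packet_size",
    "num_packets", "duration_ms", "preroll_ms", "index_offset",
    "data_offset", "num_streams", "flags",
)

def parse_prop(chunk):
    fields = PROP.unpack_from(chunk, 0)
    if fields[0] != b"PROP":
        raise ValueError("not a PROP chunk")
    props = dict(zip(FIELD_NAMES, fields[3:]))
    flags = props["flags"]
    props["can_save"] = bool(flags & 0x1)      # bit 0
    props["perfect_play"] = bool(flags & 0x2)  # bit 1
    props["live"] = bool(flags & 0x4)          # bit 2
    return props
```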


Media properties header (MDPR)

This chunk contains information about the properties of a RealMedia stream. This header defines the type of a stream and the codec used. All codec-related data is in the type specific part of this header.

Many fields share the same meanings as the ones in PROP chunk, but in this case they are specific for one stream.

There is one MDPR chunk for every stream in the file.

An MDPR chunk has the following format

dword   Chunk type ('MDPR')
dword   Chunk size
word    Chunk version (always 0, for every known file)
word    Stream number
dword   Maximum bit rate
dword   Average bit rate
dword   Size of largest data packet
dword   Average size of data packet
dword   Stream start offset in ms
dword   Preroll in ms (to be subtracted from timestamps?)
dword   Stream duration in ms
byte    Size of stream description string
byte[]  Stream description string
byte    Size of stream mime type string
byte[]  Mime type string
dword   Size of type specific part of the header
byte[]  Type specific data, meaning and format depends on mime type
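
A demuxer typically reads the fixed fields, then the two length-prefixed strings, and finally hands the type-specific bytes to a handler chosen by mime type. The sketch below shows only the parsing part. `MDPR_FIXED` and `parse_mdpr` are hypothetical names.

```python
import struct

MDPR_FIXED = struct.Struct(">4sIHH7I")  # everything before the description

def parse_mdpr(chunk):
    (fourcc, _size, _version, stream, max_bit_rate, avg_bit_rate,
     max_packet_size, avg_packet_size, start_ms, preroll_ms,
     duration_ms) = MDPR_FIXED.unpack_from(chunk, 0)
    if fourcc != b"MDPR":
        raise ValueError("not an MDPR chunk")
    pos = MDPR_FIXED.size
    n = chunk[pos]                          # description, one-byte length
    description = chunk[pos + 1:pos + 1 + n]
    pos += 1 + n
    n = chunk[pos]                          # mime type, one-byte length
    mime = chunk[pos + 1:pos + 1 + n]
    pos += 1 + n
    (n,) = struct.unpack_from(">I", chunk, pos)  # type-specific, dword length
    pos += 4
    return {
        "stream": stream, "max_bit_rate": max_bit_rate,
        "avg_bit_rate": avg_bit_rate, "max_packet_size": max_packet_size,
        "avg_packet_size": avg_packet_size, "start_ms": start_ms,
        "preroll_ms": preroll_ms, "duration_ms": duration_ms,
        "description": description, "mime": mime,
        "type_specific": chunk[pos:pos + n],
    }
```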


Audio (audio/)

audio/x-pn-realaudio and audio/x-pn-multirate-realaudio

These mimetypes are used to specify streams with RealAudio codecs. There are 3 known versions of this datablock: ra3, ra4 and ra5. ra3 is used only with the old 14_4 codec; ra4 and ra5 can be used with all the other codecs.

The audio block has this format

 byte[4]  Header signature ('.', 'r', 'a', 0xfd)
 word     Version (3, 4 or 5)
#if version == 3
 word     Header size, not including first 8 bytes
 byte[10] Unknown
 dword    Data size
 byte     Title string length
 byte[]   Title string
 byte     Author string length
 byte[]   Author string
 byte     Copyright string length
 byte[]   Copyright string
 byte     Comment string length
 byte[]   Comment string
 byte     Unknown *
 byte     Fourcc string length (always 4) *
 byte[]   Fourcc string (always "lpcJ") *
#elseif version == 4 or version == 5
 word     Unused (always 0)
 byte[4]  ra signature (".ra4" or ".ra5", depending on version)
 dword    Unknown (maybe data size)
 word     Version2 (always equal to version)
 dword    Header size
 word     Codec flavor
 dword    Coded frame size
 byte[12] Unknown
 word     Sub packet h
 word     Frame size
 word     Subpacket size
 word     Unknown
#if version == 5
 byte[6]  Unknown
#endif
 word     Samplerate
 word     Unknown
 word     Sample size
 word     Channels
#if version == 4
 byte     Interleaver ID string length (always 4)
 byte[]   Interleaver ID string
 byte     FourCC string length (always 4)
 byte[]   FourCC string
#endif
#if version == 5
 dword    Interleaver ID
 dword    FourCC
#endif
 byte[3]  Unknown
#if version == 5
 byte     Unknown
#endif
 dword    Codec extradata length
 byte[]   Codec extradata
#endif
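
The conditional layout for versions 4 and 5 can be followed with a running offset, as in this sketch. `parse_audio_block` is a hypothetical name, version 3 is left out for brevity, and the unknown fields are skipped without interpretation.

```python
import struct

def parse_audio_block(buf):
    sig, version = struct.unpack_from(">4sH", buf, 0)
    if sig != b".ra\xfd":
        raise ValueError("missing .ra signature")
    if version not in (4, 5):
        raise ValueError("this sketch only handles versions 4 and 5")
    flavor, coded_frame_size = struct.unpack_from(">HI", buf, 22)
    sub_packet_h, frame_size, subpacket_size = struct.unpack_from(">HHH", buf, 40)
    pos = 48                       # just past the unknown word at offset 46
    if version == 5:
        pos += 6                   # six extra unknown bytes
    sample_rate, _unknown, sample_size, channels = struct.unpack_from(">HHHH", buf, pos)
    pos += 8
    if version == 4:               # length-prefixed strings
        n = buf[pos]
        interleaver = buf[pos + 1:pos + 1 + n]
        pos += 1 + n
        n = buf[pos]
        fourcc = buf[pos + 1:pos + 1 + n]
        pos += 1 + n
    else:                          # plain 4-byte codes
        interleaver, fourcc = struct.unpack_from(">4s4s", buf, pos)
        pos += 8
    pos += 3 if version == 4 else 4  # unknown bytes before the extradata
    (extra_len,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    return {
        "version": version, "flavor": flavor,
        "coded_frame_size": coded_frame_size, "sub_packet_h": sub_packet_h,
        "frame_size": frame_size, "subpacket_size": subpacket_size,
        "sample_rate": sample_rate, "sample_size": sample_size,
        "channels": channels, "interleaver": interleaver, "fourcc": fourcc,
        "extradata": buf[pos:pos + extra_len],
    }
```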


audio/X-MP3-draft-00

This is used to store MP3 audio in the RM container. When this mimetype is used, the type-specific part of the MDPR header is not used and its length is set to 0.

The MP3 frames are stored in ADU format (see RFC 3119 for details) with no interleaving (at least this is true in the only known sample).

audio/x-ralf-mpeg4

This is used to store ralf lossless audio. This is the only known RealAudio codec that does not use the x-pn-realaudio mimetype.

The format of this type-specific data is not known.

Content description header (CONT)

This chunk contains some text information (like title, author, ...) about the content of the file. This header has an informative purpose only and it's not needed to demux the file.

A CONT chunk has the following format

dword   Chunk type ('CONT')
dword   Chunk size
word    Chunk version (always 0, for every known file)
word    Title string length
byte[]  Title string
word    Author string length
byte[]  Author string
word    Copyright string length
byte[]  Copyright string
word    Comment string length
byte[]  Comment string
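
Unlike the .ra headers, the strings here carry a 16-bit length. A short sketch, with `parse_cont` as a hypothetical name and the strings returned as raw bytes since their encoding is not specified:

```python
import struct

def parse_cont(chunk):
    fourcc, _size, _version = struct.unpack_from(">4sIH", chunk, 0)
    if fourcc != b"CONT":
        raise ValueError("not a CONT chunk")
    pos = 10
    info = {}
    for key in ("title", "author", "copyright", "comment"):
        (n,) = struct.unpack_from(">H", chunk, pos)  # word-sized length
        info[key] = chunk[pos + 2:pos + 2 + n]
        pos += 2 + n
    return info
```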

Data header (DATA)

This chunk contains a group of data packets. Packets from each stream are interleaved, except for multirate files.

A DATA chunk has the following format

dword   Chunk type ('DATA')
dword   Chunk size
word    Chunk version (always 0, for every known file)
dword   Number of data packets in this chunk
dword   Offset of the next DATA chunk (from the start of the file)
byte[]  Data packets

Each data packet has this format

 word   Packet version (0 or 1 in available samples)
 word   Packet size
 word   Stream number
 dword  Timestamp (in ms)
 byte   Unknown
 byte   Flags (bitfield, see below)
#if version == 1
 byte   Unknown
#endif
 byte[]  Stream-specific data

Flags:

  • bit 0: reliable packet (refers to network transmission method)
  • bit 1: keyframe
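
A sketch of reading one packet according to the layout above. `parse_packet` is a hypothetical name. Treating the packet size as including the packet header is an assumption not stated in the table, so the payload slice should be checked against real samples.

```python
import struct

def parse_packet(buf, pos=0):
    version, size, stream, timestamp = struct.unpack_from(">HHHI", buf, pos)
    _unknown, flags = struct.unpack_from(">BB", buf, pos + 10)
    header_len = 12
    if version == 1:
        header_len += 1  # one more unknown byte in version 1 packets
    return {
        "version": version, "size": size, "stream": stream,
        "timestamp_ms": timestamp,
        "reliable": bool(flags & 0x1),  # bit 0
        "keyframe": bool(flags & 0x2),  # bit 1
        # assumption: size counts the header, so the payload is the rest
        "payload": buf[pos + header_len:pos + size],
    }
```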

Note: The previous description of the data packet comes from working demuxer code. The description in the official Real docs (somewhere on the Helix site) is a bit different:

 word   Packet version
 word   Packet size
 word   Stream number
 dword  Timestamp
#if version == 0
 byte   Packet group
 byte   Flags
#endif
#if version == 1
 word   ASM rule
 byte   ASM flags
#endif
 byte[]  Stream-specific data

where packet group is "The packet group to which the packet belongs. If packet grouping is not used, set this field to 0 (zero)", asm rule is "The ASM rule assigned to this packet" and asm flags "Contains HX_ flags that dictate stream switching points".

Index header (INDX)

This chunk contains index entries. It comes after all the DATA chunks. An index chunk contains data for a single stream, so a file can have more than one INDX chunk.

An INDX chunk has the following format

dword   Chunk type ('INDX')
dword   Chunk size
word    Chunk version (always 0, for every known file)
dword   Number of entries in this chunk
word    Stream number
dword   Offset of the next INDX chunk (from the start of the file)
byte[]  Index entries

Each index entry has this format

 word   Entry version (always 0, for every known file)
 dword  Timestamp (in ms)
 dword  Packet offset in file (from the start of the file)
 dword  Packet number
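
The index can be loaded into a list and searched for the last entry at or before a target time, as sketched below. `parse_indx` and `find_entry` are hypothetical names, and the binary search assumes the entries are stored in increasing timestamp order, which the description above does not guarantee.

```python
import bisect
import struct

INDX_HEADER = struct.Struct(">4sIHIHI")  # type, size, version, count, stream, next
INDEX_ENTRY = struct.Struct(">HIII")     # version, timestamp, offset, packet number

def parse_indx(chunk):
    """Return (stream, next_indx_offset, [(timestamp_ms, offset, packet_no)])."""
    fourcc, _size, _version, count, stream, next_indx = INDX_HEADER.unpack_from(chunk, 0)
    if fourcc != b"INDX":
        raise ValueError("not an INDX chunk")
    entries = []
    pos = INDX_HEADER.size
    for _ in range(count):
        _ver, timestamp, offset, packet_no = INDEX_ENTRY.unpack_from(chunk, pos)
        entries.append((timestamp, offset, packet_no))
        pos += INDEX_ENTRY.size
    return stream, next_indx, entries

def find_entry(entries, target_ms):
    """Return the last entry at or before target_ms (first entry if none)."""
    if not entries:
        return None
    timestamps = [e[0] for e in entries]  # assumed sorted
    i = bisect.bisect_right(timestamps, target_ms) - 1
    return entries[max(i, 0)]
```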


Codecs

Codecs in RealMedia are identified by the following four character codes:

Audio
Video
  • CLV1 - ClearVideo (from helix spec)
  • RV10 - H.263
  • RV13 - H.263
  • RV20 - H.263+
  • RV30 - H.264 precursor
  • RV40 - H.264 precursor
  • RVTR - H.263+ (RV20)
  • RVT2 - RV30 ? (from helix spec hxmtypes.h)