RealMedia
- Extensions: rm, ra, rmvb, rmhd
- Company: Real
- Specifications: https://common.helixcommunity.org/2003/HCS_SDK_r5/htmfiles/rmff.htm
Multimedia container format developed by Real and used almost exclusively for codecs developed by Real.
The old .ra files are just for audio. The newer RealMedia (.rm) and RealMedia HD (.rmhd) files are for audio and video.
RA Format
This is the old audio-only RealAudio file format. A very similar structure is also used to describe audio streams in RM files.
The audio data part is just a stream of bytes with no structure. There is no index in .ra files, but seeking is possible because the codecs are CBR.
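Because the bit rate is constant, a seek target can be computed arithmetically instead of being looked up. The following is only a minimal sketch: the function name and all four parameters are hypothetical, it assumes the caller already knows where the audio data starts and the size of one coded frame, and it ignores any interleaving.

```python
def cbr_seek_offset(audio_start: int, seconds: float,
                    bit_rate: int, frame_size: int) -> int:
    """Return the file offset of the frame that contains the given time."""
    # Bytes of audio that precede the target time at a constant bit rate.
    byte_pos = int(seconds * bit_rate / 8)
    # Round down to a frame boundary so decoding starts on a whole frame.
    byte_pos -= byte_pos % frame_size
    return audio_start + byte_pos
```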
RealAudio 1.0 file (.ra version 3)
This is from the very first version of RealAudio (1995). These files can only contain 8kbps VSELP audio data. A FourCC (lpcJ) may be present, but it is ignored. Byte order is big-endian.
  byte[4]  Header signature ('.', 'r', 'a', 0xfd)
  word     Version (always 3)
  word     Header size, not including first 8 bytes
  byte[10] Unknown
  dword    Data size
  byte     Title string length
  byte[]   Title string
  byte     Author string length
  byte[]   Author string
  byte     Copyright string length
  byte[]   Copyright string
  byte     Comment string length
  byte[]   Comment string
  byte     Unknown *
  byte     Fourcc string length (always 4) *
  byte[]   Fourcc string (always "lpcJ") *
  Audio data
Notes:
- Fields marked with * may be missing. Based on the only known sample with no FourCC, it's assumed that these fields are either all present or all missing. To determine whether they are missing, check the header size (bytes 6-7).
- The informative fields (title, author, copyright and comment) can have zero length.
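The layout and notes above can be turned into a small parser. This is a sketch in Python, not a reference implementation: the function name and the returned keys are made up for illustration, and it assumes that the audio data starts right after the header, i.e. at 8 plus the header size field.

```python
import struct

def parse_ra3_header(buf: bytes) -> dict:
    """Parse a .ra version 3 header located at the start of buf."""
    if buf[:4] != b".ra\xfd":
        raise ValueError("missing .ra signature")
    version, header_size = struct.unpack_from(">HH", buf, 4)
    if version != 3:
        raise ValueError(f"expected version 3, got {version}")
    header_end = 8 + header_size  # header size excludes the first 8 bytes
    # Bytes 8-17 are unknown; the data size follows at offset 18.
    (data_size,) = struct.unpack_from(">I", buf, 18)
    info = {"data_size": data_size, "fourcc": None}
    pos = 22
    for key in ("title", "author", "copyright", "comment"):
        length = buf[pos]  # one-byte length prefix, may be zero
        info[key] = buf[pos + 1 : pos + 1 + length]
        pos += 1 + length
    # Optional tail (fields marked with *), detected via the header size.
    if pos < header_end:
        pos += 1  # unknown byte
        length = buf[pos]
        info["fourcc"] = buf[pos + 1 : pos + 1 + length]
    info["audio_offset"] = header_end  # assumption: audio follows the header
    return info
```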
RealAudio 2.0 file (.ra version 4)
This is the second version of the RealAudio file format. It is distinguished from the above by the value in byte 5 (=0x04). This type of file must contain a valid FourCC to identify the audio codec.
Possible FourCC values are 28_8, dnet and sipr.
  byte[4]  Header signature ('.', 'r', 'a', 0xfd)
  word     Version (always 4)
  word     Unused (always 0)
  byte[4]  ra4 signature (always ".ra4")
  dword    Data size - 0x27
  word     Version2 (always equal to version)
  dword    Header size - 16
  word     Codec flavor
  dword    Coded frame size
  byte[12] Unknown
  word     Sub packet h
  word     Frame size
  word     Subpacket size
  word     Unknown
  word     Samplerate
  word     Unknown
  word     Sample size
  word     Channels
  byte     Interleaver ID string length (always 4)
  byte[]   Interleaver ID string
  byte     FourCC string length (always 4)
  byte[]   FourCC string
  byte[3]  Unknown
  byte     Title string length
  byte[]   Title string
  byte     Author string length
  byte[]   Author string
  byte     Copyright string length
  byte[]   Copyright string
  byte     Comment string length
  byte[]   Comment string
  Audio Data
Notes:
- The 0x27 in data size is the size of the fixed-length part of the header (up to channels).
- The informative fields (title, author, copyright and comment) can have zero length.
.ra version 5
While the .ra header can contain version 5, there are no known RealAudio files with this format, and it's not known if they really exist.
RealMedia Format
This is the newer format which stores both audio and video. All multi-byte numbers are stored in big-endian format.
A RealMedia file consists of a series of chunks. Each chunk has the following format:
  dword    chunk type (FOURCC)
  dword    chunk size, including 8-byte preamble
  word     chunk version
  byte[]   chunk payload
Real chunk types:
- .RMF: RealMedia file header (only one per file, must be the first chunk)
- .RMP: RealMedia HD file header (only one per file, must be the first chunk)
- PROP: File properties (only one per file)
- MDPR: Stream properties (one for each stream)
- CONT: Content description/metadata (typically one per file)
- DATA: File data
- INDX: File index (typically one per stream)
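The chunk structure makes it possible to walk a file without understanding every chunk type. A minimal sketch, assuming the whole file is in memory and that every chunk size field can be trusted (the function name is hypothetical):

```python
import struct

def iter_chunks(buf: bytes):
    """Yield (fourcc, version, payload) for each top-level chunk in buf."""
    pos = 0
    # Every chunk starts with a 10-byte header: type, size, version.
    while pos + 10 <= len(buf):
        fourcc, size, version = struct.unpack_from(">4sIH", buf, pos)
        if size < 10 or pos + size > len(buf):
            break  # truncated or corrupt chunk: stop rather than guess
        yield fourcc, version, buf[pos + 10 : pos + size]
        pos += size  # the size field counts the chunk's own header
```

A real demuxer would more likely read headers first and follow the DATA and INDX offsets stored in PROP rather than load everything into memory.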
RealMedia file header (.RMF or .RMP)
This must be the first chunk in a RealMedia file. Only one .RMF can be present in a file. The only useful information carried by .RMF is the number of headers.
A .RMF chunk has the following format:
  dword    chunk type ('.RMF')
  dword    chunk size (typically 0x12)
  word     chunk version (always 0, for every known file)
  dword    file version
  dword    number of headers
Notes:
- All known sample files have file version equal to 0, or equal to 2 in the case of rmhd.
- There is a sample with chunk size = 0x10, in that case file version is a word. Note that the sample has chunk version = 0 like all other files.
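The 0x10 variant mentioned in the notes can be handled by switching on the chunk size. The sketch below is one possible way to do it; the function name and the returned keys are hypothetical.

```python
import struct

def parse_rmf(chunk: bytes) -> dict:
    """Parse a .RMF or .RMP chunk, including its 10-byte chunk header."""
    fourcc, size, chunk_version = struct.unpack_from(">4sIH", chunk, 0)
    if fourcc not in (b".RMF", b".RMP"):
        raise ValueError("not a RealMedia file header")
    if size == 0x10:
        # Rare variant: the file version is stored as a word.
        file_version, num_headers = struct.unpack_from(">HI", chunk, 10)
    else:
        # Usual layout (chunk size 0x12): the file version is a dword.
        file_version, num_headers = struct.unpack_from(">II", chunk, 10)
    return {"chunk_version": chunk_version,
            "file_version": file_version,
            "num_headers": num_headers}
```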
File properties header (PROP)
This chunk contains some information about the general properties of a RealMedia file. Only one PROP chunk can be present in a file.
A PROP chunk has the following format:
  dword    Chunk type ('PROP')
  dword    Chunk size (typically 0x32)
  word     Chunk version (always 0, for every known file)
  dword    Maximum bit rate
  dword    Average bit rate
  dword    Size of largest data packet
  dword    Average size of data packet
  dword    Number of data packets in the file
  dword    File duration in ms
  dword    Suggested number of ms to buffer before starting playback
  dword    Offset of the first INDX chunk from the start of the file
  dword    Offset of the first DATA chunk from the start of the file
  word     Number of streams in the file
  word     Flags (bitfield, see below)
Flags:
- bit 0: file can be saved on disk
- bit 1: PerfectPlay can be used (extra buffering)
- bit 2: the file is a live broadcast
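Since PROP has a fixed layout, it maps directly onto a single `struct` format. The following sketch uses hypothetical field and flag names chosen for readability; only the field order and sizes come from the table above.

```python
import struct
from typing import NamedTuple

_PROP = struct.Struct(">4sIH9IHH")  # 50 bytes, matching the typical size 0x32

class Prop(NamedTuple):
    max_bit_rate: int
    avg_bit_rate: int
    max_packet_size: int
    avg_packet_size: int
    num_packets: int
    duration_ms: int
    preroll_ms: int
    index_offset: int
    data_offset: int
    num_streams: int
    flags: int

def parse_prop(chunk: bytes) -> Prop:
    """Parse a PROP chunk, including its 10-byte chunk header."""
    fourcc, _size, _version, *fields = _PROP.unpack_from(chunk, 0)
    if fourcc != b"PROP":
        raise ValueError("not a PROP chunk")
    return Prop(*fields)

def describe_flags(flags: int) -> list:
    """Translate the PROP flags bitfield into a list of names."""
    names = []
    if flags & 0x1:
        names.append("save_allowed")      # bit 0
    if flags & 0x2:
        names.append("perfect_play")      # bit 1
    if flags & 0x4:
        names.append("live_broadcast")    # bit 2
    return names
```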
Media properties header (MDPR)
This chunk contains information about the properties of a RealMedia stream. This header defines the type of a stream and the codec used. All codec-related data is in the type specific part of this header.
Many fields have the same meaning as the corresponding fields in the PROP chunk, but here they apply to a single stream.
There is one MDPR chunk for every stream in the file.
An MDPR chunk has the following format:
  dword    Chunk type ('MDPR')
  dword    Chunk size
  word     Chunk version (always 0, for every known file)
  word     Stream number
  dword    Maximum bit rate
  dword    Average bit rate
  dword    Size of largest data packet
  dword    Average size of data packet
  dword    Stream start offset in ms
  dword    Preroll in ms (to be subtracted from timestamps?)
  dword    Stream duration in ms
  byte     Size of stream description string
  byte[]   Stream description string
  byte     Size of stream mime type string
  byte[]   Mime type string
  dword    Size of type specific part of the header
  byte[]   Type specific data, meaning and format depends on mime type
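A demuxer usually needs the stream number, the mime type, and the type-specific blob from this chunk. The sketch below extracts them; the function name and dictionary keys are hypothetical, and the type-specific data is returned undecoded because its meaning depends on the mime type.

```python
import struct

_MDPR_FIXED = struct.Struct(">4sIHH7I")  # 40 bytes up to "Stream duration"

def parse_mdpr(chunk: bytes) -> dict:
    """Parse an MDPR chunk, including its 10-byte chunk header."""
    (fourcc, _size, _version, stream_number, max_bit_rate, avg_bit_rate,
     max_packet_size, avg_packet_size, start_ms, preroll_ms,
     duration_ms) = _MDPR_FIXED.unpack_from(chunk, 0)
    if fourcc != b"MDPR":
        raise ValueError("not an MDPR chunk")
    pos = _MDPR_FIXED.size
    desc_len = chunk[pos]  # one-byte length prefix
    description = chunk[pos + 1 : pos + 1 + desc_len]
    pos += 1 + desc_len
    mime_len = chunk[pos]  # one-byte length prefix
    mime_type = chunk[pos + 1 : pos + 1 + mime_len]
    pos += 1 + mime_len
    (type_len,) = struct.unpack_from(">I", chunk, pos)
    type_specific = chunk[pos + 4 : pos + 4 + type_len]
    return {
        "stream_number": stream_number,
        "max_bit_rate": max_bit_rate, "avg_bit_rate": avg_bit_rate,
        "max_packet_size": max_packet_size, "avg_packet_size": avg_packet_size,
        "start_ms": start_ms, "preroll_ms": preroll_ms,
        "duration_ms": duration_ms,
        "description": description, "mime_type": mime_type,
        "type_specific": type_specific,
    }
```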
Audio (audio/)
audio/x-pn-realaudio and audio/x-pn-multirate-realaudio
These mimetypes are used to specify streams with RealAudio codecs. There are 3 known versions of this datablock: ra3, ra4 and ra5. ra3 is used only with the old 14_4 codec; ra4 and ra5 can be used with all the other codecs.
The audio block has this format:
  byte[4]  Header signature ('.', 'r', 'a', 0xfd)
  word     Version (3, 4 or 5)
  #if version == 3
    word     Header size, not including first 8 bytes
    byte[10] Unknown
    dword    Data size
    byte     Title string length
    byte[]   Title string
    byte     Author string length
    byte[]   Author string
    byte     Copyright string length
    byte[]   Copyright string
    byte     Comment string length
    byte[]   Comment string
    byte     Unknown *
    byte     Fourcc string length (always 4) *
    byte[]   Fourcc string (always "lpcJ") *
  #elseif version == 4 or version == 5
    word     Unused (always 0)
    byte[4]  ra signature (".ra4" or ".ra5", depending on version)
    dword    Unknown (maybe data size)
    word     Version2 (always equal to version)
    dword    Header size
    word     Codec flavor
    dword    Coded frame size
    byte[12] Unknown
    word     Sub packet h
    word     Frame size
    word     Subpacket size
    word     Unknown
    #if version == 5
      byte[6]  Unknown
    #endif
    word     Samplerate
    word     Unknown
    word     Sample size
    word     Channels
    #if version == 4
      byte     Interleaver ID string length (always 4)
      byte[]   Interleaver ID string
      byte     FourCC string length (always 4)
      byte[]   FourCC string
    #endif
    #if version == 5
      dword    Interleaver ID
      dword    FourCC
    #endif
    byte[3]  Unknown
    #if version == 5
      byte     Unknown
    #endif
    dword    Codec extradata length
    byte[]   Codec extradata
  #endif
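To show how the version 4 and version 5 branches differ in practice, here is a sketch that pulls the sample rate, channel count, FourCC and codec extradata out of the type-specific block. It skips the fields before "Samplerate" by offset (48 bytes, or 54 for version 5, counted from the layout above) and does not handle version 3. The function name and returned keys are hypothetical.

```python
import struct

def parse_ra45_block(buf: bytes) -> dict:
    """Parse an ra4 or ra5 audio block taken from an MDPR chunk."""
    if buf[:4] != b".ra\xfd":
        raise ValueError("missing .ra signature")
    (version,) = struct.unpack_from(">H", buf, 4)
    if version not in (4, 5):
        raise ValueError(f"unsupported version {version}")
    # Offset of "Samplerate": 48 bytes of earlier fields, plus 6 unknown
    # bytes that exist only in version 5.
    pos = 48 + (6 if version == 5 else 0)
    sample_rate, _unknown, sample_size, channels = struct.unpack_from(">HHHH", buf, pos)
    pos += 8
    if version == 4:
        # Length-prefixed strings.
        n = buf[pos]
        interleaver = buf[pos + 1 : pos + 1 + n]
        pos += 1 + n
        n = buf[pos]
        fourcc = buf[pos + 1 : pos + 1 + n]
        pos += 1 + n
    else:
        # Plain 4-byte codes without a length prefix.
        interleaver, fourcc = struct.unpack_from(">4s4s", buf, pos)
        pos += 8
    pos += 3 + (1 if version == 5 else 0)  # unknown bytes
    (extra_len,) = struct.unpack_from(">I", buf, pos)
    extradata = buf[pos + 4 : pos + 4 + extra_len]
    return {"version": version, "sample_rate": sample_rate,
            "sample_size": sample_size, "channels": channels,
            "interleaver": interleaver, "fourcc": fourcc,
            "extradata": extradata}
```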
audio/X-MP3-draft-00
This is used to store MP3 audio in an RM container. When this mimetype is used, the type-specific part of the MDPR header is not used and its length is set to 0.
The MP3 frames are stored in ADU format (see RFC 3119 for details) with no interleaving (at least this is true in the only known sample).
audio/x-ralf-mpeg4
This is used to store ralf lossless audio. This is the only known RealAudio codec that does not use the x-pn-realaudio mimetype.
The format of this type-specific data is not known.
Content description header (CONT)
This chunk contains some text information (like title, author, ...) about the content of the file. This header has an informative purpose only and it's not needed to demux the file.
A CONT chunk has the following format:
  dword    Chunk type ('CONT')
  dword    Chunk size
  word     Chunk version (always 0, for every known file)
  word     Title string length
  byte[]   Title string
  word     Author string length
  byte[]   Author string
  word     Copyright string length
  byte[]   Copyright string
  word     Comment string length
  byte[]   Comment string
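Unlike the .ra headers, the strings in CONT use a 16-bit length prefix. A short sketch of reading them (hypothetical function name; the strings are returned as raw bytes because the text encoding is not specified here):

```python
import struct

def parse_cont(chunk: bytes) -> dict:
    """Parse a CONT chunk, including its 10-byte chunk header."""
    fourcc, _size, _version = struct.unpack_from(">4sIH", chunk, 0)
    if fourcc != b"CONT":
        raise ValueError("not a CONT chunk")
    pos = 10
    info = {}
    for key in ("title", "author", "copyright", "comment"):
        (length,) = struct.unpack_from(">H", chunk, pos)  # word length prefix
        info[key] = chunk[pos + 2 : pos + 2 + length]
        pos += 2 + length
    return info
```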
Data header (DATA)
This chunk contains a group of data packets. Packets from each stream are interleaved, except for multirate files.
A DATA chunk has the following format:
  dword    Chunk type ('DATA')
  dword    Chunk size
  word     Chunk version (always 0, for every known file)
  dword    Number of data packets in this chunk
  dword    Offset of the next DATA chunk (from the start of the file)
  byte[]   Data packets
Each data packet has this format:
  word     Packet version (0 or 1 in available samples)
  word     Packet size
  word     Stream number
  dword    Timestamp (in ms)
  byte     Unknown
  byte     Flags (bitfield, see below)
  #if version == 1
    byte     Unknown
  #endif
  byte[]   Stream-specific data
Flags:
- bit 0: reliable packet (refers to network transmission method)
- bit 1: keyframe
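The packet layout and flags above can be decoded as follows. This is a sketch with hypothetical names, and it relies on one assumption that the text does not state: that the packet size field counts the packet header as well as the stream-specific data.

```python
import struct

def parse_packet(buf: bytes, pos: int = 0) -> dict:
    """Parse one data packet that starts at buf[pos]."""
    version, size, stream, timestamp, _unknown, flags = struct.unpack_from(
        ">HHHIBB", buf, pos)
    header_len = 12
    if version == 1:
        header_len += 1  # version 1 carries one extra unknown byte
    elif version != 0:
        raise ValueError(f"unknown packet version {version}")
    return {
        "stream": stream,
        "timestamp_ms": timestamp,
        "reliable": bool(flags & 0x1),  # bit 0
        "keyframe": bool(flags & 0x2),  # bit 1
        # Assumption: the size field includes the packet header.
        "payload": buf[pos + header_len : pos + size],
        "next_pos": pos + size,
    }
```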
Note: The description above comes from working demuxer code. The description in the official Real docs (somewhere on the Helix site) is slightly different:
  word     Packet version
  word     Packet size
  word     Stream number
  dword    Timestamp
  #if version == 0
    byte     Packet group
    byte     Flags
  #endif
  #if version == 1
    word     ASM rule
    byte     ASM flags
  #endif
  byte[]   Stream-specific data
where packet group is "The packet group to which the packet belongs. If packet grouping is not used, set this field to 0 (zero)", asm rule is "The ASM rule assigned to this packet" and asm flags "Contains HX_ flags that dictate stream switching points".
Index header (INDX)
This chunk contains index entries. It comes after all the DATA chunks. An index chunk contains data for a single stream, so a file can have more than one INDX chunk.
An INDX chunk has the following format:
  dword    Chunk type ('INDX')
  dword    Chunk size
  word     Chunk version (always 0, for every known file)
  dword    Number of entries in this chunk
  word     Stream number
  dword    Offset of the next INDX chunk (from the start of the file)
  byte[]   Index entries
Each index entry has this format:
  word     Entry version (always 0, for every known file)
  dword    Timestamp (in ms)
  dword    Packet offset in file (from the start of the file)
  dword    Packet number
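An index is mainly useful for seeking. The sketch below parses one INDX chunk and picks the entry to seek to for a given time. The names are hypothetical, and the lookup assumes that the entries are stored in ascending timestamp order, which the text above does not guarantee.

```python
import bisect
import struct

_INDX_HEADER = struct.Struct(">4sIHIHI")  # 20 bytes
_INDX_ENTRY = struct.Struct(">HIII")      # 14 bytes per entry

def parse_indx(chunk: bytes):
    """Return (stream_number, next_indx_offset, entries) for an INDX chunk.

    Each entry is a (timestamp_ms, packet_offset, packet_number) tuple.
    """
    fourcc, _size, _version, count, stream, next_indx = _INDX_HEADER.unpack_from(chunk, 0)
    if fourcc != b"INDX":
        raise ValueError("not an INDX chunk")
    entries = []
    pos = _INDX_HEADER.size
    for _ in range(count):
        _entry_version, ts, offset, packet_no = _INDX_ENTRY.unpack_from(chunk, pos)
        entries.append((ts, offset, packet_no))
        pos += _INDX_ENTRY.size
    return stream, next_indx, entries

def find_seek_entry(entries, target_ms: int):
    """Return the last entry at or before target_ms, or None if there is none."""
    # Assumption: entries are sorted by timestamp.
    timestamps = [e[0] for e in entries]
    i = bisect.bisect_right(timestamps, target_ms) - 1
    return entries[i] if i >= 0 else None
```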
Codecs
Codecs in RealMedia are identified by the following four character codes:
Audio
- lpcJ - RealAudio 1.0 (VSELP)
- 28_8 - RealAudio 2.0 (LD-CELP)
- dnet - AC3
- sipr - Sipro
- cook - Cook
- atrc - ATRAC3
- ralf - RealAudio Lossless Format
- raac - LC-AAC
- racp - HE-AAC