RealAudio cook

RealAudio cook (a.k.a. Cooker, a.k.a. gecko) is an audio codec created by RealNetworks. It supports several coding modes: mono, stereo (independently coded or joint), and multichannel.

Cook Flavors

00  8 Kbps Music - RealAudio
01  11 Kbps Music - RealAudio
02  16 Kbps Music - RealAudio
03  20 Kbps Music - RealAudio
04  32 Kbps Music - RealAudio
05  44 Kbps Music - RealAudio
06  64 Kbps Music - RealAudio
07  32 Kbps - RealAudio
08  6 Kbps Music - RealAudio
09  20 Kbps Stereo Music
10  32 Kbps Stereo Music
11  44 Kbps Stereo Music
12  64 Kbps Stereo Music
13  96 Kbps Stereo Music
14  64 Kbps - RealAudio
15  20 Kbps Music High Response - RealAudio
16  32 Kbps Music High Response - RealAudio
17  16 Kbps Stereo Music - RealAudio
18  20 Kbps Stereo Music - RealAudio
19  20 Kbps Stereo Music High Response - RealAudio
20  32 Kbps Stereo Music - RealAudio
21  32 Kbps Stereo Music High Response - RealAudio
22  44 Kbps Stereo Music - RealAudio
23  44 Kbps Stereo Music High Response - RealAudio
24  64 Kbps Stereo Music - RealAudio
25  96 Kbps Stereo Music - RealAudio
26  12 Kbps Stereo Music - RealAudio
27  64 kbps Stereo Surround - RealAudio
28  96 kbps Stereo Surround - RealAudio
29  44 kbps Stereo Surround - RealAudio
30  96 Kbps 5.1 Multichannel - RealAudio 10
31  132 Kbps 5.1 Multichannel - RealAudio 10
32  184 Kbps 5.1 Multichannel - RealAudio 10
33  268 Kbps 5.1 Multichannel - RealAudio 10

Type Specific Data

Cook data is carried in RealMedia files, which also transport the type specific data that each codec needs. Cook requires 8 bytes of type specific data for mono audio and 16 bytes for stereo. Multi-byte numbers are big-endian:

Mono and stereo data:

 bytes 0-3    Cook version
 bytes 4-5    samples per frame per channel
 bytes 6-7    number of subbands used in the frequency domain

Stereo data requires 8 more bytes:

 bytes 8-11   unused
 bytes 12-13  joint stereo subband start
 bytes 14-15  joint stereo VLC bits

Multichannel data may carry four additional bytes:

 bytes 16-19  channel mask
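
The layout above can be read with a few big-endian loads. The following is a minimal sketch, not taken from any particular decoder: the `CookExtradata` struct, its field names and `cook_parse_extradata()` are hypothetical, and it assumes the presence of the stereo and multichannel fields can be inferred from the buffer size alone (16 and 20 bytes respectively).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields listed above; all names are illustrative. */
typedef struct {
    uint32_t version;            /* bytes 0-3 */
    uint16_t samples_per_frame;  /* bytes 4-5, per channel */
    uint16_t subbands;           /* bytes 6-7 */
    uint16_t js_subband_start;   /* bytes 12-13, stereo only */
    uint16_t js_vlc_bits;        /* bytes 14-15, stereo only */
    uint32_t channel_mask;       /* bytes 16-19, multichannel only */
    int      has_stereo;         /* assumption: set when at least 16 bytes exist */
    int      has_channel_mask;   /* assumption: set when at least 20 bytes exist */
} CookExtradata;

/* Big-endian 16-bit read. */
static uint16_t rb16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Big-endian 32-bit read. */
static uint32_t rb32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if fewer than the mandatory 8 bytes are available. */
static int cook_parse_extradata(const uint8_t *buf, size_t size, CookExtradata *out)
{
    if (size < 8)
        return -1;

    out->version           = rb32(buf);
    out->samples_per_frame = rb16(buf + 4);
    out->subbands          = rb16(buf + 6);

    /* Bytes 8-11 are unused and skipped. */
    out->has_stereo        = size >= 16;
    out->js_subband_start  = out->has_stereo ? rb16(buf + 12) : 0;
    out->js_vlc_bits       = out->has_stereo ? rb16(buf + 14) : 0;

    out->has_channel_mask  = size >= 20;
    out->channel_mask      = out->has_channel_mask ? rb32(buf + 16) : 0;
    return 0;
}
```

A caller would pass the codec's type specific data blob and its length, then check `has_stereo` before using the joint stereo fields.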

Frame organisation

For multichannel audio, frame data may consist of several subpackets (usually 4 for 5.1). In that case the sizes of subpackets 1-N, in 16-bit words, are stored as bytes at the end of the frame data.
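
One possible reading of this layout is sketched below. It rests on assumptions that the text does not spell out: subpackets are numbered from 0, the trailer holds one size byte for each subpacket after the first (in order), and subpacket 0 takes whatever remains once the other subpackets and the trailer are subtracted. The function name `cook_subpacket_sizes()` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fill sizes[0..num_subpackets-1] with subpacket sizes in bytes.
 * Assumed layout: the last (num_subpackets - 1) bytes of the frame hold
 * the sizes of subpackets 1.. in 16-bit words; subpacket 0 gets the rest.
 * Returns 0 on success, -1 if the sizes do not fit into the frame.
 */
static int cook_subpacket_sizes(const uint8_t *frame, size_t frame_size,
                                int num_subpackets, size_t *sizes)
{
    if (num_subpackets < 1)
        return -1;

    size_t trailer = (size_t)num_subpackets - 1;  /* one size byte per extra subpacket */
    if (trailer > frame_size)
        return -1;

    size_t used = trailer;
    for (int i = 1; i < num_subpackets; i++) {
        /* Size is stored in 16-bit words, so multiply by 2 to get bytes. */
        sizes[i] = 2 * (size_t)frame[frame_size - trailer + (size_t)(i - 1)];
        used += sizes[i];
    }
    if (used > frame_size)
        return -1;

    sizes[0] = frame_size - used;
    return 0;
}
```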

Subpacket data is XORed with the repeating 4-byte key 0x37 0xC5 0x11 0xF2. If the subpacket codes dual mono data, each half of it is XORed with that key separately.
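
The descrambling step can be sketched as follows. This assumes the key starts over at the first byte of each XORed span (so for dual mono the second half begins again at 0x37) and that the halves are split at `size / 2`; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static const uint8_t cook_xor_key[4] = { 0x37, 0xC5, 0x11, 0xF2 };

/* XOR a span in place with the repeating key, starting at key byte 0. */
static void cook_xor_span(uint8_t *data, size_t size)
{
    for (size_t i = 0; i < size; i++)
        data[i] ^= cook_xor_key[i & 3];
}

/*
 * Descramble one subpacket in place.
 * For dual mono the two halves are treated as independent spans
 * (assumption: the split point is size / 2).
 */
static void cook_descramble(uint8_t *data, size_t size, int dual_mono)
{
    if (dual_mono) {
        size_t half = size / 2;
        cook_xor_span(data, half);
        cook_xor_span(data + half, size - half);
    } else {
        cook_xor_span(data, size);
    }
}
```

Since XOR is its own inverse, the same routine would also scramble plain data.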

Subpacket structure

 gain
 if (joint stereo coding mode)
   decoupling information
 single channel data
 if (channels == 2 && !joint) {
   gain
   single channel data
 }

Gain information

 num_sections = get_unary();
 j = 0;
 for (i = 0; i < num_sections; i++) {
     idx = get_bits(3);
     if (get_bit())
         val = get_bits(4) - 7;
     else
         val = -1;
     for (; j < idx; j++)
         gains[j] = val;
 }
 for (; j < 8; j++)
     gains[j] = 0;
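
For experimentation, the pseudocode above can be made runnable with a small bit reader. The sketch below transcribes the loop as written; the `BitReader` type and its helpers are hypothetical, bits are assumed to be read MSB first, and `get_unary()` is assumed to count 1 bits up to a terminating 0 bit (the page does not define it).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader; reading past the end yields zero bits. */
typedef struct {
    const uint8_t *buf;
    size_t size_bits;
    size_t pos;
} BitReader;

static unsigned get_bit(BitReader *br)
{
    if (br->pos >= br->size_bits)
        return 0;
    unsigned bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u;
    br->pos++;
    return bit;
}

static unsigned get_bits(BitReader *br, int n)
{
    unsigned v = 0;
    while (n-- > 0)
        v = (v << 1) | get_bit(br);
    return v;
}

/* Assumption: unary code = number of 1 bits before the first 0 bit. */
static int get_unary(BitReader *br)
{
    int n = 0;
    while (get_bit(br))
        n++;
    return n;
}

/* Direct transcription of the gain pseudocode; gains[] has 8 entries. */
static void cook_decode_gains(BitReader *br, int gains[8])
{
    int num_sections = get_unary(br);
    int j = 0;

    for (int i = 0; i < num_sections; i++) {
        int idx = (int)get_bits(br, 3);                       /* end of this section */
        int val = get_bit(br) ? (int)get_bits(br, 4) - 7 : -1;
        for (; j < idx; j++)
            gains[j] = val;
    }
    for (; j < 8; j++)                                        /* remaining entries */
        gains[j] = 0;
}
```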

Decoupling information

Decoupling information is an array of properties for all coupled bands (from joint stereo subband start to the last possible band).

The first bit of the data is a flag signalling whether the array is coded with VLCs (which ones depends on joint stereo VLC bits) or stored as raw values, each joint stereo VLC bits wide.
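
A rough sketch of reading this array follows. Several things here are assumptions rather than facts from this page: a set flag is taken to mean VLC coding, the number of coupled bands is supplied by the caller, and since the VLC tables are not listed here the VLC path is delegated to a caller-provided callback. The `BitReader` helpers and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader; reading past the end yields zero bits. */
typedef struct {
    const uint8_t *buf;
    size_t size_bits;
    size_t pos;
} BitReader;

static unsigned get_bit(BitReader *br)
{
    if (br->pos >= br->size_bits)
        return 0;
    unsigned bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u;
    br->pos++;
    return bit;
}

static unsigned get_bits(BitReader *br, int n)
{
    unsigned v = 0;
    while (n-- > 0)
        v = (v << 1) | get_bit(br);
    return v;
}

/* Caller-supplied VLC decoder (the tables are outside the scope of this sketch). */
typedef int (*cook_vlc_reader)(BitReader *br, void *opaque);

/*
 * Read `count` decoupling entries into tab[].
 * Assumption: flag == 1 selects VLC coding, flag == 0 selects raw values
 * of js_vlc_bits bits each.
 */
static void cook_decode_decoupling(BitReader *br, int *tab, int count,
                                   int js_vlc_bits,
                                   cook_vlc_reader read_vlc, void *opaque)
{
    int use_vlc = (int)get_bit(br);

    for (int i = 0; i < count; i++) {
        if (use_vlc)
            tab[i] = read_vlc(br, opaque);
        else
            tab[i] = (int)get_bits(br, js_vlc_bits);
    }
}
```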

Channel data

The channel data format is close to G.722.1, but the bitstream format differs (different field sizes, probably different VLCs).

 envelope
 num_vectors
 vector data

Envelope:

 quant[0] = get_bits(6) - 6;
 for (i = 1; i < subbands; i++) {
     if (i >= js_subband_start * 2)
         vlc_index = i - js_subband_start;
     else
         vlc_index = i / 2;
     if (vlc_index < 1)
         vlc_index = 1;
     if (vlc_index > 13)
         vlc_index = 13;
     quant[i] = quant[i - 1] + get_vlc(quant_vlc[vlc_index - 1]) - 12;
 }
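
The two parts of the envelope loop that do not depend on the VLC tables, namely the table selection and the differential reconstruction, can be isolated as in the sketch below. The function names are hypothetical, and `cook_envelope_from_symbols()` assumes the 6-bit start value and the VLC symbols have already been read from the bitstream.

```c
/* Select which envelope VLC table (1..13) applies to subband i. */
static int cook_envelope_vlc_index(int i, int js_subband_start)
{
    int vlc_index;

    if (i >= js_subband_start * 2)
        vlc_index = i - js_subband_start;
    else
        vlc_index = i / 2;

    if (vlc_index < 1)
        vlc_index = 1;
    if (vlc_index > 13)
        vlc_index = 13;
    return vlc_index;
}

/*
 * Rebuild quant[0..subbands-1] from already decoded values:
 * first      - the raw 6-bit value read for subband 0
 * symbols[k] - the decoded VLC symbol for subband k + 1
 */
static void cook_envelope_from_symbols(int first, const int *symbols,
                                       int subbands, int *quant)
{
    quant[0] = first - 6;
    for (int i = 1; i < subbands; i++)
        quant[i] = quant[i - 1] + symbols[i - 1] - 12;  /* differential coding */
}
```

With `js_subband_start` equal to 4, for instance, subbands 1 to 3 all map to table 1, subband 5 maps to table 2, and subband 8 (the first one at or above twice the start) maps to table 4.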