Apple QuickTime IMA ADPCM

From MultimediaWiki

QuickTime files can store either mono or stereo IMA ADPCM data. Files with IMA data contain the codec fourcc "ima4" in the audio stsd atom. The files store the data in blocks of nibbles. The individual IMA samples are never interleaved; one block of IMA nibbles represents either all left-channel or all right-channel PCM samples.

In any given IMA-encoded QuickTime file, the size of an individual block of IMA nibbles is stored in the bytes/packet field of the extended audio information portion of the audio stsd atom. However, this size always seems to be 34 bytes/block. Some IMA-encoded QuickTime files are missing the extended audio information; in this case, assume that each IMA block is 34 bytes.
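The block-size rule above can be sketched as follows. This is only an illustration: the names ima_block_size, split_blocks and samples_per_block are hypothetical, a value of 0 is treated the same as a missing field, and a trailing partial block is silently dropped. Both of those behaviors are assumptions, not something the format description specifies.

```python
from typing import List, Optional

DEFAULT_BLOCK_SIZE = 34  # fallback when the bytes/packet field is unavailable


def ima_block_size(bytes_per_packet: Optional[int]) -> int:
    """Return the IMA block size, falling back to 34 bytes.

    Assumption: None or 0 both mean "field not present".
    """
    if not bytes_per_packet:
        return DEFAULT_BLOCK_SIZE
    return bytes_per_packet


def split_blocks(data: bytes, block_size: int = DEFAULT_BLOCK_SIZE) -> List[bytes]:
    """Cut raw audio data into whole blocks (assumption: drop a partial tail)."""
    usable = len(data) - len(data) % block_size
    return [data[i:i + block_size] for i in range(0, usable, block_size)]


def samples_per_block(block_size: int = DEFAULT_BLOCK_SIZE) -> int:
    """2 preamble bytes, then 2 nibbles (one sample each) per remaining byte."""
    return (block_size - 2) * 2
```

With the usual 34-byte block this gives 64 samples per block per channel.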

The first 2 bytes of a block specify a preamble with the initial predictor and step index. The 2 bytes are read from the stream as a big-endian 16-bit number which has the following bit structure:

pppppppp piiiiiii 

Bits 15-7 of the preamble are the top 9 bits of the initial signed 16-bit predictor; bits 6-0 of the initial predictor are always 0. Bits 6-0 of the preamble specify the initial step index. Note that this 7-bit field has a range of 0..127, which should be clamped to 0..88 (the valid index range of the 89-entry IMA step table) for good measure.
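A minimal sketch of unpacking the preamble as described above. The function name parse_preamble is hypothetical; the masks follow directly from the bit layout shown.

```python
import struct
from typing import Tuple


def parse_preamble(block: bytes) -> Tuple[int, int]:
    """Return (initial predictor, initial step index) from a block's first 2 bytes."""
    (raw,) = struct.unpack(">H", block[:2])  # big-endian unsigned 16-bit

    # Top 9 bits are the predictor; its low 7 bits are implicitly zero.
    predictor = raw & 0xFF80
    if predictor >= 0x8000:  # reinterpret as a signed 16-bit value
        predictor -= 0x10000

    # Low 7 bits are the step index, clamped to the valid range 0..88.
    step_index = min(raw & 0x7F, 88)
    return predictor, step_index
```

For example, the preamble bytes 0x12 0x85 give a predictor of 0x1280 (4736) and a step index of 5.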

The remaining bytes in the IMA block (of which there are usually 32) are the ADPCM nibbles. In QuickTime IMA data, the bottom nibble of a byte is decoded first, then the top nibble:

 byte0 byte1 byte2 byte3 ...
  n1n0  n3n2  n5n4  n7n6 ... 
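The nibble order in the diagram above can be sketched as a small generator. The name iter_nibbles is hypothetical, and the payload is assumed to be the block with its 2-byte preamble already removed.

```python
from typing import Iterator


def iter_nibbles(payload: bytes) -> Iterator[int]:
    """Yield 4-bit ADPCM codes in decode order: low nibble first, then high."""
    for byte in payload:
        yield byte & 0x0F  # bottom nibble is decoded first
        yield byte >> 4    # top nibble is decoded second
```

Each yielded value would then be fed to a standard IMA ADPCM decoding step, which is not shown here.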

If a file is encoded as mono IMA, all of the blocks encode that one channel. However, if the file is encoded as stereo IMA, the first block is left audio data, the second block is right audio data, and the stereo interleaving continues on the block level for the duration of the file.
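As an illustration of the block-level stereo layout, the sketch below separates a list of blocks into channels and re-interleaves decoded PCM. The names split_stereo_blocks and interleave_pcm are hypothetical, and the per-block decoding itself is assumed to happen elsewhere.

```python
from typing import List, Sequence, Tuple


def split_stereo_blocks(blocks: Sequence[bytes]) -> Tuple[List[bytes], List[bytes]]:
    """Even-numbered blocks carry the left channel, odd-numbered the right."""
    return list(blocks[0::2]), list(blocks[1::2])


def interleave_pcm(left: Sequence[int], right: Sequence[int]) -> List[int]:
    """Merge decoded per-channel samples into L, R, L, R, ... order."""
    out: List[int] = []
    for l_sample, r_sample in zip(left, right):
        out.append(l_sample)
        out.append(r_sample)
    return out
```

For a mono file no splitting is needed; every block belongs to the single channel.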