Microsoft ADPCM

From MultimediaWiki
  • Audio ID: 0x0002
  • FOURCC: 'm','s',0x00,0x02
  • Company: Microsoft
  • wFormatTag ID: WAVE_FORMAT_ADPCM
  • 'short' ID: MS_ADPCM
  • ACM Codec: 'msadp32.acm' (included in Windows 95+)

This format is Microsoft's own custom variation of the ADPCM concept.

MS ADPCM is organized in blocks. Each block has a preamble and a series of coded ADPCM nibbles. The total number of bytes in an individual ADPCM block is obtained through the nBlockAlign field of a media file's WAVEFORMATEX data structure.

A monaural block begins with the following preamble:

byte 0       block predictor (built-in predictors are in the range [0..6], but others can be defined manually)
bytes 1-2    initial delta
bytes 3-4    sample 1
bytes 5-6    sample 2 

The initial delta and both samples are signed 16-bit little-endian numbers (so take sign extension into account). The block predictor value is used as an index into the two adaptation coefficient tables, AdaptCoeff1 and AdaptCoeff2, to initialize two coefficients, coeff1 and coeff2.
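The preamble layout above can be sketched in C as follows. This is a minimal illustration, not a reference implementation: the `MsAdpcmState` struct and the function names are hypothetical, the 16-bit fields are assumed to be little-endian (as is usual for WAVE data), and only the seven built-in predictors are accepted.

```c
#include <stddef.h>
#include <stdint.h>

/* Built-in coefficient pairs, as listed in the tables of this article. */
static const int AdaptCoeff1[7] = { 256, 512, 0, 192, 240, 460, 392 };
static const int AdaptCoeff2[7] = { 0, -256, 0, 64, 0, -208, -232 };

/* Hypothetical per-channel decoder state (names are illustrative). */
typedef struct {
    int coeff1, coeff2; /* looked up from the block predictor */
    int delta;          /* initial step size */
    int sample1;        /* most recent sample */
    int sample2;        /* the sample before sample1 */
} MsAdpcmState;

/* Read a signed 16-bit value, assuming little-endian byte order.
 * Values with bit 15 set are sign-extended to negative ints. */
static int read_s16le(const uint8_t *p)
{
    int v = p[0] | (p[1] << 8);
    return (v & 0x8000) ? v - 0x10000 : v;
}

/* Parse the 7-byte monaural preamble.
 * Returns 0 on success, or -1 if the block is too short or the predictor
 * index is outside the built-in range [0..6]. */
static int ms_adpcm_read_mono_preamble(const uint8_t *block, size_t len,
                                       MsAdpcmState *st)
{
    if (len < 7 || block[0] > 6)
        return -1;
    st->coeff1  = AdaptCoeff1[block[0]];
    st->coeff2  = AdaptCoeff2[block[0]];
    st->delta   = read_s16le(block + 1);
    st->sample1 = read_s16le(block + 3);
    st->sample2 = read_s16le(block + 5);
    return 0;
}
```

A decoder that supports manually defined predictors would need to extend the two tables before indexing them; that case is left out of this sketch.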

The two initial samples from the block preamble are sent directly to the output: sample 2 first, then sample 1. The remaining samples are decoded from the ADPCM nibbles, which make up the rest of the bytes in the block. Each byte is decoded upper nibble (bits 7-4) first, then lower nibble (bits 3-0). For each nibble:

  • predictor = ((sample1 * coeff1) + (sample2 * coeff2)) / 256
  • predictor += (signed)nibble * delta (the nibble is interpreted as a 4-bit two's complement value in the range [-8..7])
  • clamp predictor within signed 16-bit range
  • PCM sample = predictor
  • send PCM sample to the output
  • shuffle samples: sample 2 = sample 1, sample 1 = calculated PCM sample
  • compute next adaptive scale factor: delta = (AdaptationTable[nibble] * delta) / 256 (here the nibble is used as an unsigned index in the range [0..15])
  • clamp delta to a lower bound of 16
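The per-nibble steps above can be sketched as a single C function. The struct and function names are hypothetical, the division by 256 uses C's truncating integer division, and no upper bound is applied to delta because the text does not describe one, so the sketch assumes well-formed input.

```c
#include <stdint.h>

/* Step size multipliers (scaled by 256), as listed in this article. */
static const int AdaptationTable[16] = {
    230, 230, 230, 230, 307, 409, 512, 614,
    768, 614, 512, 409, 307, 230, 230, 230
};

/* Hypothetical per-channel decoder state (names are illustrative). */
typedef struct {
    int coeff1, coeff2; /* selected by the block predictor */
    int delta;          /* current step size */
    int sample1;        /* most recent output sample */
    int sample2;        /* output sample before sample1 */
} MsAdpcmChannel;

/* Decode one 4-bit code and update the channel state. */
static int16_t ms_adpcm_decode_nibble(MsAdpcmChannel *c, unsigned nibble)
{
    nibble &= 0x0F;

    /* Two's complement interpretation: codes 8..15 mean -8..-1. */
    int signed_nibble = (nibble & 0x08) ? (int)nibble - 16 : (int)nibble;

    /* Linear prediction from the two previous samples, plus the
     * scaled correction carried by the nibble. */
    int predictor = (c->sample1 * c->coeff1 + c->sample2 * c->coeff2) / 256;
    predictor += signed_nibble * c->delta;

    /* Clamp to the signed 16-bit range. */
    if (predictor > 32767)  predictor = 32767;
    if (predictor < -32768) predictor = -32768;

    /* Shuffle the sample history. */
    c->sample2 = c->sample1;
    c->sample1 = predictor;

    /* Adapt the step size; the table is indexed by the unsigned nibble. */
    c->delta = (AdaptationTable[nibble] * c->delta) / 256;
    if (c->delta < 16)
        c->delta = 16;

    return (int16_t)predictor;
}
```

For a monaural block, a caller would invoke this twice per data byte: first with `byte >> 4`, then with `byte & 0x0F`.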

For stereo data, the block preamble stores interleaved initialization values for the left and right channels:

byte 0        left channel block predictor (should be [0..6])
byte 1        right channel block predictor (should be [0..6])
bytes 2-3     left channel initial delta
bytes 4-5     right channel initial delta
bytes 6-7     left channel sample 1
bytes 8-9     right channel sample 1
bytes 10-11   left channel sample 2
bytes 12-13   right channel sample 2 

Following the preamble, the left and right ADPCM samples are interleaved within each byte. The upper nibble (bits 7-4) contains the left channel ADPCM code and the lower nibble contains the right channel ADPCM code.
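The stereo layout can be illustrated with a few small C helpers. All names here are hypothetical, and the 16-bit fields are assumed to be little-endian. The last helper derives the number of samples per channel in a block from the block size: each channel contributes 2 samples from the preamble, and every remaining byte carries 2 nibbles shared among the channels. It follows directly from the layouts described in this article and only covers the mono and stereo cases.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for one channel's preamble fields. */
typedef struct {
    int predictor, delta, sample1, sample2;
} MsAdpcmPreamble;

/* Read a signed 16-bit value, assuming little-endian byte order. */
static int read_s16le(const uint8_t *p)
{
    int v = p[0] | (p[1] << 8);
    return (v & 0x8000) ? v - 0x10000 : v;
}

/* Read the 14-byte stereo preamble; out[0] is left, out[1] is right.
 * Each field is stored once per channel, left first, so channel ch is
 * found ch bytes (predictor) or 2*ch bytes (16-bit fields) past the
 * start of the field pair. Returns -1 if the block is too short. */
static int ms_adpcm_read_stereo_preamble(const uint8_t *block, size_t len,
                                         MsAdpcmPreamble out[2])
{
    if (len < 14)
        return -1;
    for (int ch = 0; ch < 2; ch++) {
        out[ch].predictor = block[ch];
        out[ch].delta     = read_s16le(block + 2 + 2 * ch);
        out[ch].sample1   = read_s16le(block + 6 + 2 * ch);
        out[ch].sample2   = read_s16le(block + 10 + 2 * ch);
    }
    return 0;
}

/* Split one stereo data byte: upper nibble is left, lower is right. */
static void ms_adpcm_split_stereo_byte(uint8_t byte,
                                       unsigned *left, unsigned *right)
{
    *left  = byte >> 4;
    *right = byte & 0x0F;
}

/* Samples per channel in one block of block_align bytes.
 * Returns 0 for unsupported channel counts or undersized blocks. */
static size_t ms_adpcm_samples_per_block(size_t block_align, unsigned channels)
{
    if (channels < 1 || channels > 2)
        return 0;
    size_t preamble = 7 * (size_t)channels;
    if (block_align < preamble)
        return 0;
    return 2 + (block_align - preamble) * 2 / channels;
}
```

With these helpers, a 256-byte monaural block and a 512-byte stereo block both yield 500 samples per channel.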

The following tables define the values used to decode MS ADPCM data:

int AdaptationTable [] = { 
  230, 230, 230, 230, 307, 409, 512, 614, 
  768, 614, 512, 409, 307, 230, 230, 230 
} ;
// These are the 7 built-in predictor coefficient pairs; a file can supply additional pairs
// in the coefficient list (wNumCoef/aCoef of ADPCMWAVEFORMAT) that follows WAVEFORMATEX in the WAVE format header
int AdaptCoeff1 [] = { 256, 512, 0, 192, 240, 460, 392 } ;
int AdaptCoeff2 [] = { 0, -256, 0, 64, 0, -208, -232 } ;

The adaptation table comes from Jayant's 1973 paper [1] (Table VIII, 'DPCM' column, B=4). This paper, along with its companion paper [2], originally coined the term "ADPCM".

The relevant values from the paper are:

{ 0.9, 0.9, 0.9, 0.9, 1.2, 1.6, 2.0, 2.4 }

and Microsoft added an implicit '3.0' value as a 9th entry, for a final table of:

{ 0.9, 0.9, 0.9, 0.9, 1.2, 1.6, 2.0, 2.4, 3.0 }

To obtain the table used by Microsoft ADPCM, multiply the values from Jayant's table by 256 and round down (for example, 0.9 * 256 = 230.4, which becomes 230). Each entry is the step size multiplier for the magnitude of a signed input nibble, and the entries are stored for the signed nibble values

{ 0, 1, 2, 3, 4, 5, 6, 7, -8, -7, -6, -5, -4, -3, -2, -1 }

in that order, so take the unsigned value of the (signed) nibble and use that as an index into the table.
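The derivation can be checked with a short C sketch that rebuilds the 16-entry table from the nine multipliers. The names are hypothetical, and the multipliers are stored in tenths so that the multiply-by-256-and-round-down step can be done with integer arithmetic only.

```c
/* Jayant's multipliers for nibble magnitudes 0..7, plus the 3.0 entry
 * for magnitude 8, expressed in tenths (0.9 -> 9, 3.0 -> 30). */
static const int MultiplierTenths[9] = { 9, 9, 9, 9, 12, 16, 20, 24, 30 };

/* Interpret a 4-bit code as two's complement: 8..15 become -8..-1. */
static int sign_extend_nibble(unsigned nibble)
{
    nibble &= 0x0F;
    return (nibble & 0x08) ? (int)nibble - 16 : (int)nibble;
}

/* Fill table[] so that it can be indexed by the unsigned nibble value. */
static void build_adaptation_table(int table[16])
{
    for (unsigned code = 0; code < 16; code++) {
        int value = sign_extend_nibble(code);
        int magnitude = value < 0 ? -value : value;
        /* Multiply by 256 and round down; integer division truncates. */
        table[code] = MultiplierTenths[magnitude] * 256 / 10;
    }
}
```

Code 8 sign-extends to -8, the only value with magnitude 8, which is why the 3.0 entry appears exactly once in the resulting table.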