Bink Audio

From MultimediaWiki

Bink Audio is a custom perceptual audio codec used in Bink files (and later Smacker files). A Bink container file can contain multiple Bink audio streams. Each stream can be monophonic or stereo. The coding algorithm uses one of two transforms, either a discrete cosine transform (DCT) or a real discrete Fourier transform (RDFT). Both the transform and the number of channels are defined in the stream's header in the main Bink container.

Notes

  • compresses audio in chunks of varying sizes depending on sample rate:
    • if sample rate < 22050, frame size is 2048 samples
    • else if sample rate < 44100, frame size is 4096 samples
    • else, frame size is 8192 samples
  • a frame is windowed with the previous frame; the size of the window is frame size / 16
  • compute half the sample rate as (sample rate + 1) / 2; then build an array of band boundaries from the table of 25 critical frequencies (apparently the same as WMA), using only the critical frequencies that are less than half the sample rate
    • bands calculation: bands[0] = 1; foreach (i in 1..num_bands-1): bands[i] = crit_freq[i-1] * (frame length / 2) / (sample rate / 2); bands[num_bands] = frame length / 2
  • Bink audio packs 29-bit floating point numbers into the bitstream; get_float() reads the fields in this order:
    • exponent = next 5 bits
    • mantissa = next 23 bits
    • sign = 1 bit
  • decode process for an individual frame:
    • for each channel:
      • fetch 2 floats from the bitstream as the first 2 coefficients
      • unpack quantizers; for each band:
        • value = next 8 bits in the bitstream
        • exponent = min(value, 95) * 0.0664
        • quantizer corresponding to band = 10^exponent
      • locate the initial band
      • unpack and dequantize the transform coefficients while updating the current band
      • apply the inverse transform, either discrete cosine transform or real discrete Fourier transform
      • convert samples to appropriate output format and interleave as necessary
      • window the output with the previous frame
        • after decoding a frame, copy its last window_len samples (for each channel) into a window buffer; when the next frame is decoded, use that buffer to window the first window_len samples of the new output
        • window function: foreach (i in 0..window_len-1): final_output_sample[i] = (window_buffer[i] * (window_len - i) + output_sample[i] * i) / window_len
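
The smaller steps in the notes above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference decoder, and all function names are hypothetical. The frame size, quantizer and window formulas are taken directly from the notes. The float scaling (mantissa times 2^(exponent - 23)) is an assumption, since the notes give only the field widths, and the sketch takes the three fields as already-extracted integers so that it does not have to guess the bit order of the stream.

```python
import math

def frame_size(sample_rate):
    """Frame size in samples, using the sample-rate thresholds from the notes."""
    if sample_rate < 22050:
        return 2048
    if sample_rate < 44100:
        return 4096
    return 8192

def window_size(sample_rate):
    """The overlap window is frame size / 16."""
    return frame_size(sample_rate) // 16

def decode_float(exponent, mantissa, sign):
    """Combine the 5-bit exponent, 23-bit mantissa and 1-bit sign fields.

    ASSUMPTION: value = mantissa * 2**(exponent - 23). The notes list only
    the field widths, not how the fields are combined.
    """
    value = math.ldexp(mantissa, exponent - 23)
    return -value if sign else value

def band_quantizer(value):
    """Quantizer for one band from its 8-bit code: 10^(min(value, 95) * 0.0664)."""
    return 10.0 ** (min(value, 95) * 0.0664)

def crossfade(window_buffer, output, window_len):
    """Linearly blend the saved tail of the previous frame into the new frame."""
    mixed = list(output)
    for i in range(window_len):
        mixed[i] = (window_buffer[i] * (window_len - i) + output[i] * i) / window_len
    return mixed
```

For example, `crossfade([100] * 4, [0] * 4, 4)` fades from the previous frame's level down toward the new frame's level across the four window samples. Only the first `window_len` samples are touched, and the rest of the frame passes through unchanged.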

Tables

Critical frequencies, 25 entries:

  100,   200,   300,   400,
  510,   630,   770,   920,
 1080,  1270,  1480,  1720,
 2000,  2320,  2700,  3150,
 3700,  4400,  5300,  6400,
 7700,  9500, 12000, 15500,
24500
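
As an illustration of how this table feeds the band calculation described in the notes, here is a minimal Python sketch. The names `CRIT_FREQS` and `band_boundaries` are hypothetical. It makes two assumptions that the notes leave open: the number of bands is taken as one more than the count of critical frequencies below half the sample rate (capped at 25), and the rounded half sample rate, (sample rate + 1) / 2, is used as the divisor, with integer division throughout.

```python
CRIT_FREQS = [
    100, 200, 300, 400, 510, 630, 770, 920,
    1080, 1270, 1480, 1720, 2000, 2320, 2700, 3150,
    3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500,
    24500,
]

def band_boundaries(sample_rate, frame_len):
    """Return the list of band boundaries (num_bands + 1 entries)."""
    half_rate = (sample_rate + 1) // 2

    # ASSUMPTION: one band per critical frequency below half the sample
    # rate, plus a final band, with at most 25 bands in total.
    num_bands = 1
    while num_bands < 25 and CRIT_FREQS[num_bands - 1] < half_rate:
        num_bands += 1

    half_frame = frame_len // 2
    bands = [1]  # bands[0] = 1
    for i in range(1, num_bands):
        # Scale each critical frequency to a coefficient index.
        bands.append(CRIT_FREQS[i - 1] * half_frame // half_rate)
    bands.append(half_frame)  # bands[num_bands] = frame length / 2
    return bands
```

Under these assumptions, a 44100 Hz stream with an 8192-sample frame yields 25 bands (26 boundaries), while a 22050 Hz stream with a 4096-sample frame yields 23 bands, because the critical frequencies from 12000 Hz upward are above half the sample rate.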

RLE length table, 16 entries:

 2,  3,  4,  5,
 6,  8,  9, 10,
11, 12, 13, 14,
15, 16, 32, 64