Bink Audio

From MultimediaWiki

Revision as of 18:12, 22 September 2008

This is a custom perceptual audio codec used in Bink and later Smacker files.

Notes

  • mono or stereo
  • a given audio stream (of which there can be multiple) in a Bink file might use a DCT or DFT depending on a byte in the stream header at the start of the file
  • compresses audio in chunks of varying sizes depending on sample rate:
    • if sample rate < 22050, frame size is 2048 samples
    • otherwise, if sample rate < 44100, frame size is 4096 samples
    • else, frame size is 8192 samples
  • a frame is windowed with the previous frame; the size of the window is frame size / 16
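The frame-size and window-size rules above can be sketched as two small functions. The function names are hypothetical and chosen only for illustration; the thresholds and sizes are taken directly from the notes.

```python
def frame_size(sample_rate: int) -> int:
    """Frame size in samples, using the sample-rate thresholds from the notes."""
    if sample_rate < 22050:
        return 2048
    if sample_rate < 44100:
        return 4096
    return 8192


def window_size(sample_rate: int) -> int:
    """Length of the overlap window: frame size / 16."""
    return frame_size(sample_rate) // 16
```

For example, a 44100 Hz stream would use 8192-sample frames and a 512-sample window under these rules.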
  • compute half the sample rate as (sample rate + 1) / 2; initialize an array of band boundaries from a table of 25 critical frequencies (apparently the same as WMA), using only those critical frequencies that are less than half the sample rate
    • critical frequencies: 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500, 24500
    • bands calculation: bands[0] = 1; foreach (i in 1..# of bands-1): bands[i] = crit_freq[i-1] * (frame length / 2) / (sample rate / 2); bands[# of bands] = frame length / 2
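A minimal sketch of the band table setup follows. Two points are assumptions that the notes do not state outright: the number of bands is taken to be one more than the count of critical frequencies below half the sample rate (capped at 25), so that every qualifying frequency becomes a boundary, and all divisions are treated as truncating integer divisions. The names `CRIT_FREQS` and `compute_bands` are hypothetical.

```python
CRIT_FREQS = [
    100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720, 2000,
    2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500, 24500,
]


def compute_bands(sample_rate: int, frame_len: int) -> list:
    """Return the band boundary array (num_bands + 1 entries)."""
    half_rate = (sample_rate + 1) // 2
    # Assumption: band count = qualifying critical frequencies + 1, at most 25.
    below = sum(1 for f in CRIT_FREQS if f < half_rate)
    num_bands = min(below + 1, len(CRIT_FREQS))

    bands = [1]  # bands[0] = 1
    for i in range(1, num_bands):
        # bands[i] = crit_freq[i-1] * (frame length / 2) / (sample rate / 2)
        bands.append(CRIT_FREQS[i - 1] * (frame_len // 2) // (sample_rate // 2))
    bands.append(frame_len // 2)  # bands[# of bands] = frame length / 2
    return bands
```

Under these assumptions, `compute_bands(44100, 8192)` yields 26 boundaries, starting at 1 and ending at 4096.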
  • Bink audio packs 29-bit floating point numbers in the bitstream; get_float() reads the fields in this order:
    • exponent = next 5 bits
    • mantissa = next 23 bits
    • sign = 1 bit
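The field layout above can be illustrated with a short sketch. The notes give only the field widths and order, so two things here are assumptions: bits are read least-significant-bit first from each byte, and the value is reconstructed as mantissa * 2^(exponent - 23), negated when the sign bit is set. `BitReader` is a hypothetical helper written for this example, not part of any library.

```python
import math


class BitReader:
    """Hypothetical bit reader; assumes LSB-first bit order within each byte."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read(self, nbits: int) -> int:
        value = 0
        for i in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value


def get_float(reader: BitReader) -> float:
    exponent = reader.read(5)   # next 5 bits
    mantissa = reader.read(23)  # next 23 bits
    sign = reader.read(1)       # 1 bit
    # Assumption: value = mantissa * 2**(exponent - 23)
    value = math.ldexp(mantissa, exponent - 23)
    return -value if sign else value
```

A reader positioned on a field with exponent 23, mantissa 3 and the sign bit set would return -3.0 under this interpretation.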
  • decode process for an individual frame:
    • for each channel:
      • fetch 2 floats from the bitstream as the first 2 coefficients
      • unpack quantizers; for each band:
        • value = next 8 bits in the bitstream
        • exponent = min(value, 95) * 0.0664
        • quantizer corresponding to band = 10^exponent
      • locate the initial band
      • unpack and dequantize the transform coefficients while updating the current band
      • apply reverse transform
      • convert samples to appropriate output format and interleave as necessary
      • window the output with the previous frame
        • after decoding a frame, copy its last window_len samples (for each channel) into a window buffer; when the next frame is decoded, use that buffer to window its first window_len samples (window_len is the window size, frame size / 16)
        • window function: foreach (i in window_len): final_output_sample[i] = (window_buffer[i] * (window_len - i) + output_sample[i] * i) / window_len
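Two of the decode steps above, quantizer unpacking and windowing against the previous frame, are simple enough to sketch directly from the formulas in the notes. The function names are hypothetical, samples are treated as plain Python numbers for a single channel, and reading the 8-bit value from the bitstream is left to the caller.

```python
def unpack_quantizer(value: int) -> float:
    """Quantizer for one band from its 8-bit value: 10 ** (min(value, 95) * 0.0664)."""
    exponent = min(value, 95) * 0.0664
    return 10.0 ** exponent


def window_frame(window_buffer, output, window_len):
    """Cross-fade the first window_len samples of output with the saved buffer."""
    mixed = list(output)
    for i in range(window_len):
        mixed[i] = (window_buffer[i] * (window_len - i) + output[i] * i) / window_len
    return mixed


def save_window(output, window_len):
    """Copy the last window_len samples of a decoded frame for the next frame."""
    return list(output[-window_len:])
```

The weighting is a linear cross-fade: at i = 0 the result is entirely the saved sample from the previous frame, and the weight shifts toward the current frame as i approaches window_len.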