Bink Video
- FourCC: BIKf, BIKg, BIKh, BIKi (note that some FourCC lists assert that BINK is a FourCC for general-purpose multimedia containers; however, Bink data is always known to be encapsulated in a custom container format)
- Company: RAD Game Tools
- Samples: http://samples.mplayerhq.hu/game-formats/bink/, countless video games
Bink is a video codec that uses Vector Quantization along with DCT and Motion Compensation). This codec is used in a large number of computer and console games. The video is packaged in custom Bink files.
Each plane is coded separately. First plane is luma or alpha plane followed by luma plane, since version 'i' there is 32-bit word containing plane data size in bytes before alpha and luma plane, so they may be skipped or decoded in parallel with chroma.
Format
Bitstream is read LSB from 32-bit little-endian words.
Frame is coded as three planes.
Each plane starts with Huffman codebooks: two 4-bit ones, one 8-bit (which really consists of seventeen (16 plus one special) 4-bit codebooks) and four 4-bit ones.
Huffman trees
There are 16 predefined Huffman trees, Bink frame only stores tree number and sophisticated coding of leaves values.
- 4 bits - huffman tree type.
First tree is actually raw nibbles, so no additional data for it. For other trees leaves symbols should be coded.
- 1 bits - symbols are coded explicitly (by selection)
If that flag is one, read 3 additional bits which determine number of symbols out of order, then read those symbols (4 bits per each symbol), the rest of symbols should go in normal order.
Example: read {5} [1] (3) <9> <1> <7> tree type: 5 symbols order: 9 1 7 0 2 3 4 5 6 8 A B C D E F
In other case symbols are coded by shuffling:
- 2 bits - shuffle depth
First divide all symbols into pairs and read 1 bit for each pair to decide whether swap this pair or not.
Example: original data: (0 1) (2 3) (4 5) (6 7) ... read bits: 0 1 1 0 ... output: (0 1) (3 2) (5 4) (6 7) ...
if shuffle depth is zero, end operation; otherwise perform merging.
void merge(uint8_t *dst, uint8_t *src1, uint8_t *src2, int size) { int size1 = size, size2 = size; do{ if(getbit()){ *dst++ = *src2++; size2--; }else{ *dst++ = *src1++; size1--; } }while(size1 && size2); while(size1--) *dst++ = *src1++; while(size2--) *dst++ = *src2++; }
If depth is one, divide symbols into pairs and merge neighbouring pairs (i.e. [0 1] and [2 3], [4 5] and [6 7], etc.) If depth is two, divide result into quartets and merge quartet 0 with quartet 1 and quartet 2 with quartet 3. For depth = 3 additionally divide result into two halves and merge them too.
Or, more clearly:
depth = getbits(2); for (i = 0; i <= depth; i++) { int size = 1 << i; int skip = size * 2; for (j = 0; j < 16; j += skip) merge(tmp + j, symbols + j, symbols + j + size, size); memcpy(symbols, tmp, 16); }
Bundle reading
Bink uses basic Huffman trees for decoding 4-bit values to create values of different types from them.
4 bit bundle
The simplest of them all:
t = getbits(bundle->runsize); if(!t) return 0; //no data to decode if(getbit()) memset(dst, getbits(4), t); else{ for(i = 0; i < t; i++) dst[t] = gethuff(bundle->tree[0]); }
4 bit signed bundle
Virtually the same as 4 bit bundle but after each Huffman-coded value is decoded an additional sign bit is read; if that bit is set then decoded value is negative.
4 bit RLE bundle
Another RLE-oriented variation - if decoded value is less than 12, output it, otherwise find corresponding run length and repeat last decoded value that number of times:
int runvalues[4] = { 4, 8, 12, 32}; t = getbits(bundle->runsize); if(!t) return 0; for(i = 0; i < t; i++){ v = gethuff(bundle->tree[0]); if(v < 0xC) *dst++ = v; else{ memset(dst, dst[-1], runvalues[v - 12]); dst += runvalues[v - 12]; i += runvalues[v - 12] - 1; } }
4 bit pair bundle
This decodes several 8-bit values composed from two nibbles:
t = getbits(bundle->runsize); if(!t) return 0; for(i = 0; i < t; i++){ n0 = gethuff(bundle->tree[0]); n1 = gethuff(bundle->tree[0]); dst[i] = n0 | (n1 << 4); }
8 bit bundle
There are two slightly different ways to decode this bundle depending on Bink version. This decoding method is context-dependent, i.e. high nibble is decoded with the tree selected on last decoded high nibble. The only difference between decoding methods is that value was signed in old decode.
Old version:
t = getbits(bundle->runsize); if(!t) return 0; if(getbit()){ for(i = 0; i < t; i++){ n0 = gethuff(bundle->tree[1 + bundle->lastval]); n1 = gethuff(bundle->tree[0]); bundle->lastval = n0; v = (n0 << 4) | n1; if (v & 0x80) v = -(v & 0x7F); dst[i] = v; } }else{ v = ??? //decoded here or from previous run memset(dst, v, t); }
New version:
t = getbits(bundle->runsize); if(!t) return 0; if(getbit()){ for(i = 0; i < t; i++){ n0 = gethuff(bundle->tree[1 + bundle->lastval]); n1 = gethuff(bundle->tree[0]); bundle->lastval = n0; v = (n0 << 4) | n1; dst[i] = v; } }else{ //unsure whether dst[-1] is decoded here or not memset(dst, dst[-1], t); }
16 bit delta
This one does not employ Huffman coding and uses delta coding instead.
t = getbits(bundle->runsize); if(!t) return 0; start = getbits(bundle->startsize); if(bundle->is_signed) start = (start & 1) ? -(start >> 1) : (start >> 1); *dst++ = start; for(i = 0; i < t; i++){ t2 = getbits(4); if(t2){ for(j = 0; j < min(t, 8); j++){ v = getbits(t2); if(v && getbit()) v = -v; start += v; *dst++ = start; } }else{ for(;i < t; i++) *dst++ = t; } }
Coefficients reading
'lossy' block (residue)
TODO
'lossless' block (DCT coeffs)
TODO
Decoding
Frame
for each block a lot of values should be decoded (value may be taken from cache if previous decode returned run of values):
- block type - 4 bit RLE bundle
- sub block type - 4 bit RLE bundle
- colors data (for filling blocks) - 8 bit bundle
- pattern - 4 bit pair bundle
- x motion offset - 4 bit signed bundle
- y motion offset - 4 bit signed bundle
- DC for intrablock - 16 bit delta
- DC for interblock - 16 bit delta
- pattern colors run length - 4 bit bundle
Block types
0 - skip block
Copy 8x8 block from previous frame to current frame
1 - scaled 16x16 block
This is 16x16 block which has its own subtype, which corresponds to other block types (range 3-9). Blocks should be decoded in the same way and then scaled twice.
2 - motion block
Copy 8x8 block from previous frame with some offset (x and y values decoded at the beginning of the block)
3 - run fill
Fill 8x8 block with pattern 4 bits - pattern index
first 63 or 64 coefficients in block are decoded in this way:
run = next value from Pattern Colors Run Length; if(getbit()){ memset(block, Colors[0], run); Colors++; }else{ memcpy(block, Colors, run); Colors += run; } block += run;
Last block element (if decoded 63 coefficients) is taken from Colors
values
4 - motion+residue
motion compensation + so-called "lossy" coding (see [1]), 7 bits before coded block contain number of masks to decode.
5 - intra DCT
intrablock with DCT coefficients and 4 bits quantizer
6 - fill
Fill 8x8 block with one value from Colors
7 - inter DCT
interblock with motion vector (x and y values decoded at the beginning), DCT coefficients and 4 bit quantizer
8 - pattern fill
Fill block with two colours taken from Colors
values using 8 values from Patterns
9 - raw data
Fill 8x8 block with 64 values from Colors