MultimediaWiki - User contributions [en] (https://wiki.multimedia.cx/api.php?action=feedcontributions&user=Kostya&feedformat=atom), updated 2024-03-29T10:31:34Z, MediaWiki 1.39.5<br />
<br />
Talisman ANI, revision of 2024-03-22T14:50:32Z (https://wiki.multimedia.cx/index.php?title=Talisman_ANI&diff=15765)<br />
<p>Kostya: fill format description</p>
<hr />
<div>* Company: Software 2000<br />
* Extension: ani<br />
* Game: [https://www.mobygames.com/game/32166/talisman/ Talisman]<br />
<br />
This is a rather peculiar video game format that uses frame data compression in addition to the frame coding.<br />
<br />
The file consists of chunks of various types, each starting with a 32-bit little-endian chunk type and a 32-bit payload size (a compressed payload starts with an additional 32-bit unpacked size).<br />
<br />
== Known chunks ==<br />
* <code>0x1234</code> -- ANI header, should be 20 bytes long (always unpacked);<br />
* <code>0x4321</code> -- sync chunk, should be 16 bytes long (always unpacked);<br />
* <code>0x1111</code> -- Huffman codebooks data, always unpacked;<br />
* <code>0x2001</code> -- intra frame;<br />
* <code>0x2110</code> -- inter frame;<br />
* <code>0x2332</code> -- seems to always contain a 32-bit value equal to one (obviously unpacked), probably a skip frame signal;<br />
* <code>0x2553</code> -- probably the same, not encountered in game files;<br />
* <code>0x3456</code> -- unknown, might be related to audio data;<br />
* <code>0x5544</code> -- palette (unpacked), always 768 bytes;<br />
* <code>0xABCD</code> -- acknowledged by the decoder as unpacked but not encountered in game files.<br />
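<br />
The chunk layout above can be walked with a few lines of Python. This is only a sketch: the function name <code>iter_chunks</code> is hypothetical, the payload size is assumed to count the bytes that follow the 8-byte chunk header, and no unpacking of compressed payloads is attempted.<br />
<br />
```python
import struct

# Chunk types that the list above describes as always stored unpacked.
ALWAYS_UNPACKED = {0x1234, 0x4321, 0x1111, 0x2332, 0x5544, 0xABCD}

def iter_chunks(data):
    """Yield (chunk_type, payload) pairs from the raw bytes of an ANI file.

    Assumption: the 32-bit payload size counts only the bytes after the
    8-byte chunk header.
    """
    pos = 0
    while pos + 8 <= len(data):
        chunk_type, size = struct.unpack_from("<II", data, pos)
        pos += 8
        yield chunk_type, data[pos:pos + size]
        pos += size
```
<br />
For chunk types outside <code>ALWAYS_UNPACKED</code> the first four payload bytes would then be the unpacked size.<br />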
<br />
=== Header chunk ===<br />
The header chunk should be the first chunk in the file and should occur only once. It contains the following information:<br />
* 32-bit image width<br />
* 32-bit image height<br />
* 32-bit value that is always zero<br />
* 32-bit value that is always 300000<br />
* 32-bit number of frames<br />
<br />
=== Sync chunk ===<br />
Sync chunks mark a sequence aka group of frames (an intra frame plus some inter frames).<br />
<br />
* 32-bit offset to the next sync chunk<br />
* 32-bit value that is always zero<br />
* 32-bit value that looks like some suggested buffer size<br />
* 32-bit number of video frames until next sync chunk<br />
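<br />
Both fixed-size chunks can be read with <code>struct</code>. The sketch below assumes the fields are little-endian like the chunk header; the tuple and field names are made up for illustration.<br />
<br />
```python
import struct
from collections import namedtuple

# Field names are illustrative, not taken from the original decoder.
AniHeader = namedtuple("AniHeader", "width height zero const_300000 num_frames")
SyncChunk = namedtuple("SyncChunk", "next_sync_offset zero buffer_size_hint num_frames")

def parse_header_chunk(payload):
    """Parse the 20-byte payload of chunk 0x1234 (assumed little-endian)."""
    if len(payload) != 20:
        raise ValueError("ANI header payload should be 20 bytes long")
    return AniHeader(*struct.unpack("<5I", payload))

def parse_sync_chunk(payload):
    """Parse the 16-byte payload of chunk 0x4321 (assumed little-endian)."""
    if len(payload) != 16:
        raise ValueError("sync chunk payload should be 16 bytes long")
    return SyncChunk(*struct.unpack("<4I", payload))
```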
<br />
=== Codebooks data chunk ===<br />
This chunk contains Huffman codebook definition.<br />
<br />
First there may be a sequence of <code>0xFF</code> bytes that should be ignored.<br />
<br />
The first byte with another value tells which Huffman tree should be used for the sequence.<br />
<br />
The following data contains symbol values for the Huffman trees. It starts with an opcode telling what to do with the entries: <code>0xFF</code> means skip definition set, <code>0x00..0x7F</code> mean the next byte is the number of new symbols and the following N bytes are new symbol values, other values mean previously decoded definition with number in the low seven bits should be re-used (skipped ones included but should not be re-used).<br />
<br />
See the next section on how that data is used.<br />
<br />
=== Chunks unpacking ===<br />
Chunks are packed using order-1 static Huffman codes with some predefined tables and per-sequence symbol values (defined in chunk <code>0x1111</code>). That means when the data is decoded, it reads the same code set but the symbol value depends on the definitions transmitted in the chunk so e.g. if symbol <code>011</code> meant 0 in one sequence, it may mean 4 in another.<br />
<br />
First symbol is transmitted as is, then actual Huffman bitstream follows (packed in 16-bit words, MSB first). The first byte is transmitted as is since the symbol table is defined for the previously decoded symbol and newly decoded value from the stream. Chunk <code>0x1111</code> stores those context-dependent symbol definitions for each previously decoded value and all possible input values (i.e. first set is used when previous symbol value is 0, next set is used when previous symbol value is 1 and so on).<br />
<br />
== Video decoding ==<br />
Video frames are split into 2x2 blocks and use motion value in byte form (low nibble - X offset, high nibble - Y offset) with nibble values being signed (e.g. nibble value <code>0x3</code> means offset 3 and nibble value <code>0xD</code> means offset -3).<br />
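<br />
A small sketch of the nibble decoding, assuming the nibbles are plain 4-bit two's complement values (which matches the <code>0x3</code> and <code>0xD</code> examples). The X/Y assignment follows the sentence above (low nibble is X); the function names are hypothetical and the component order has not been verified against game files.<br />
<br />
```python
def signed_nibble(n):
    """Interpret a 4-bit value as a two's complement number in -8..7."""
    return n - 16 if n & 8 else n

def split_motion_byte(b):
    """Return (x, y): X from the low nibble, Y from the high nibble."""
    return signed_nibble(b & 0x0F), signed_nibble(b >> 4)
```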
<br />
Frame data is split into two parts, namely the motion vectors data (and run values) and pixel values. 32-bit value at the beginning tells where the pixel data starts (motion data always starts at offset 4).<br />
<br />
Codes <code>0x00..0xF6</code> copy 2x2 block using the corresponding motion vector (<code>0x00</code> - do nothing, <code>0x3D</code> - copy with 3,-3 offset).<br />
<br />
Codes <code>0xF7..0xFA</code> read a motion vector byte and a pixel byte, copy block from the specified location and replace pixel 3..0 with the new pixel value.<br />
<br />
Codes <code>0xFB..0xFD</code> read a motion vector byte and copy the source block rotated by 90/180/270 degrees counter-clockwise correspondingly.<br />
<br />
Code <code>0xFE</code> reads a run value (that may be prefixed with several <code>0xFF</code> to add 255 to its value) from motion data section and skips the signalled number of blocks.<br />
<br />
Code <code>0xFF</code> reads four pixel values for the block.<br />
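<br />
The run value used by code <code>0xFE</code> can be read as sketched below. The helper name <code>read_run</code> is hypothetical, and the sketch assumes that each <code>0xFF</code> prefix byte adds 255 and the first other byte terminates the value.<br />
<br />
```python
def read_run(data, pos):
    """Read a run value at data[pos]; return (run, new_pos).

    Each 0xFF prefix byte adds 255, the first non-0xFF byte is added
    and ends the value.
    """
    run = 0
    while data[pos] == 0xFF:
        run += 255
        pos += 1
    run += data[pos]
    return run, pos + 1
```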
<br />
Inter frame seems to be the same but uses the intra frame as the reference.<br />
<br />
[[Category:Game Formats]]<br />
[[Category:Video Codecs]]</div>
FVF, revision of 2024-03-19T15:31:02Z by Kostya (https://wiki.multimedia.cx/index.php?title=FVF&diff=15764)<br />
<p>Kostya: /* Video frame part */ clarify video coding a bit</p>
<hr />
<div>FVF is a format used for cutscenes in [https://www.mobygames.com/game/957/star-trek-the-next-generation-a-final-unity/ Star Trek: The Next Generation - "A Final Unity"] game.<br />
<br />
== File structure ==<br />
All data is little-endian. Data is grouped into frames that are grouped into blocks aligned to at least 2048 bytes. Audio is stored as unpacked PCM, video is compressed with its own method.<br />
<br />
Header:<br />
4 bytes - "FVF "<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
4 bytes - first block offset<br />
4 bytes - last block offset<br />
4 bytes - image header (should be 0x60)<br />
4 bytes - audio header (usually 0xB5)<br />
64 bytes - unknown<br />
<br />
Image header:<br />
2 bytes - header size? (usually 40)<br />
2 bytes - always 1?<br />
2 bytes - always 16?<br />
2 bytes - width<br />
2 bytes - height<br />
4 bytes - delay in milliseconds<br />
4 bytes - unknown<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
2 bytes - unknown<br />
1 byte - palette something<br />
1 byte - number of palette entries (usually 15)<br />
4 bytes - palette offset (usually 0x88)<br />
6 bytes - unknown<br />
<br />
Audio header:<br />
2 bytes - compression? (usually it's 1 and raw PCM)<br />
2 bytes - number of channels?<br />
2 bytes - bits per sample (usually 8)<br />
2 bytes - sampling rate<br />
8 bytes - unknown<br />
<br />
Block header:<br />
2 bytes - header size (should be 16)<br />
2 bytes - flags<br />
4 bytes - previous block size<br />
4 bytes - current block size<br />
4 bytes - next block size<br />
<br />
Frame header:<br />
2 bytes - header size (should be 24)<br />
4 bytes - full size<br />
4 bytes - always 24?<br />
4 bytes - video part size<br />
4 bytes - audio part size<br />
6 bytes - unknown<br />
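<br />
The block and frame headers map directly onto <code>struct</code> format strings. This is a sketch only; the tuple and field names are invented for illustration, and the 6 unknown bytes are kept as raw bytes.<br />
<br />
```python
import struct
from collections import namedtuple

BLOCK_FMT = "<HHIII"    # 16 bytes, little-endian
FRAME_FMT = "<HIIII6s"  # 24 bytes, little-endian

# Field names are illustrative only.
BlockHeader = namedtuple("BlockHeader", "header_size flags prev_size cur_size next_size")
FrameHeader = namedtuple("FrameHeader", "header_size full_size always_24 video_size audio_size unknown")

def parse_block_header(buf, pos=0):
    """Parse a 16-byte FVF block header starting at buf[pos]."""
    return BlockHeader(*struct.unpack_from(BLOCK_FMT, buf, pos))

def parse_frame_header(buf, pos=0):
    """Parse a 24-byte FVF frame header starting at buf[pos]."""
    return FrameHeader(*struct.unpack_from(FRAME_FMT, buf, pos))
```
<br />
The format sizes agree with the expected header sizes of 16 and 24 bytes.<br />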
<br />
== Video frame part ==<br />
Video frame part starts with 32-bit size and two 16-bit fields (one of those is used to signal palette change, another one is for motion vector table size). Compression method seems to work by painting tiles using RGB555 colour combinations generated from the palette colours. Bitstream is little-endian, low three bits signal tile opcode and it's aligned to the byte boundary before each next opcode:<br />
* code 0 -- get 7-bit value for palette LUT offset, get 14-bit value for source data offset, copy 4x4 block from the source applying palette LUT;<br />
* code 1 -- the same as above but tile is flipped horizontally;<br />
* code 2 -- same as code 0 but tile is flipped vertically;<br />
* code 3 -- same as above but tile is flipped horizontally as well;<br />
* code 4 -- get 5-bit run value, copy the provided amount of 4x4 tiles from the previous frame (presumably);<br />
* code 5 -- get 1-bit run flag and then 4-bit MV table index. A set run flag means reading an 8-bit run value and performing the operation that many additional times; a zero flag means doing motion compensation on the 4x4 tile just once (the motion vector table is stored at the beginning of the frame);<br />
* code 6 -- skip 5 bits (to byte align) and copy 16 bytes from the input to the tile data;<br />
* code 7 -- this is large tile mode, the next 4 bits signal the operation:<br />
** case 0 -- skip 1 bit, get 7-bit palette LUT index, get 1-bit flag, get 14-bit source offset index, paint 8x8 tile;<br />
** cases 1-7 -- same but with various flipping modes;<br />
** cases 8-11, 13-14 -- should not be present;<br />
** case 12 -- special 4x4 tile copy run with long values (either 8-bit run value + 32 or 16-bit run value + 288 depending on flag);<br />
** case 14 -- raw 2x2 tile.<br />
<br />
[[Category:Game Formats]]<br />
[[Category:Video Codecs]]<br />
[[Category:Incomplete Video Codecs]]</div>
Intelligent Games MOV, revision of 2024-03-14T16:21:24Z by Kostya (https://wiki.multimedia.cx/index.php?title=Intelligent_Games_MOV&diff=15763)<br />
<p>Kostya: document yet another game codec</p>
<hr />
<div>* Company: Intelligent Games Ltd.<br />
* Extension: MOV<br />
* Game used: [https://www.mobygames.com/game/1852/azraels-tear/ Azrael's Tear]<br />
<br />
This is a rather simple container for paletted video cutscenes.<br />
<br />
The file starts with the following header (all values are little-endian):<br />
4 bytes - always 90 EF 12 AB<br />
2 bytes - total number of frames in the file<br />
2 bytes - frames per second? (in reality it seems to be 10 instead of declared 12)<br />
2 bytes - video width<br />
2 bytes - video height<br />
18 bytes - seems to be random garbage<br />
<br />
Each frame starts with 16-byte header:<br />
4 bytes - always CF 01 99 EE<br />
2 bytes - frame type? (seems to be 3 for the first frame and 2 for the following ones)<br />
4 bytes - unpacked frame size<br />
4 bytes - packed frame size (or 0 for raw frames)<br />
2 bytes - padding?<br />
<br />
Frame consists of one or more chunks:<br />
4 bytes - always DC FE 23 01<br />
2 bytes - chunk type<br />
4 bytes - chunk size<br />
6 bytes - junk<br />
<br />
Chunk types:<br />
* 1 - raw frame data<br />
* 2 and 3 - empty/skip frame<br />
* 4 - VGA palette<br />
* 5 - inter frame<br />
* 6 - audio data (8-bit PCM, 22050Hz mono)<br />
<br />
=== Frame data unpacking ===<br />
Frame data may be compressed with Huffman codes. In that case it consists of four blocks: 32-bit unpacked data size (the same as in frame header), 32-bit Huffman tree root index, Huffman tree nodes (an array of 1024 16-bit numbers), packed data.<br />
<br />
Data decoding is trivial:<br />
* set <code>index</code> to the tree root index<br />
* while <code>index</code> &ge; 256<br />
** read bit (LSB first) from packed data part<br />
** <code>index = nodes[index * 2 + bit]</code><br />
* output <code>index</code><br />
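<br />
The decoding loop above, as a runnable Python sketch. It assumes the root index and the node values are little-endian and that bits are consumed LSB first within each packed byte; the function name <code>unpack_frame</code> is hypothetical.<br />
<br />
```python
import struct

def unpack_frame(data):
    """Decode a Huffman-packed frame.

    Layout: 32-bit unpacked size, 32-bit tree root index,
    1024 16-bit tree nodes, then the packed bits.
    """
    unpacked_size, root = struct.unpack_from("<II", data, 0)
    nodes = struct.unpack_from("<1024H", data, 8)
    packed = data[8 + 2048:]
    out = bytearray()
    bit_pos = 0
    while len(out) < unpacked_size:
        index = root
        while index >= 256:
            # Take the next bit, LSB first within each byte.
            bit = (packed[bit_pos >> 3] >> (bit_pos & 7)) & 1
            bit_pos += 1
            index = nodes[index * 2 + bit]
        out.append(index)
    return bytes(out)
```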
<br />
=== Inter frame compression ===<br />
Inter frames are split into 4x4 blocks, each coded using one of four modes. The 2-bit modes for four consecutive blocks are packed into one byte (LSB first), e.g. modes 0, 1, 2, 3 will be transmitted as <code>0xE4</code>.<br />
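<br />
A minimal sketch of the mode unpacking (the helper name is hypothetical):<br />
<br />
```python
def unpack_modes(b):
    """Split one mode byte into four 2-bit modes, lowest bits first."""
    return [(b >> shift) & 3 for shift in (0, 2, 4, 6)]
```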
<br />
* mode 0 is long motion compensation. It requires two additional bytes: the first one is the X coordinate displacement plus 127 (i.e. value 0 means -127 and value 255 means +128), the second one holds the Y coordinate displacement plus 63 in the low 7 bits, and its top bit signals the data source (not set - copy data from the previous frame, set - copy data from the current frame).<br />
* mode 1 is a skip run. Next byte in the stream tells how many blocks need to be left unchanged from the previous frame (value 0 signals 256 blocks to copy).<br />
* mode 2 is short motion compensation. Next byte contains displacement for the previous frame. Top nibble - X coordinate plus 7, low nibble - Y coordinate plus 7.<br />
* mode 3 is for raw block. Following 16 bytes are the new block contents.<br />
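<br />
The byte layouts of modes 0, 1 and 2 can be sketched as follows. The function names are hypothetical and only the arithmetic stated in the list above is implemented.<br />
<br />
```python
def decode_long_mv(bx, by):
    """Mode 0: return (dx, dy, from_current_frame)."""
    dx = bx - 127                    # 0 -> -127, 255 -> +128
    dy = (by & 0x7F) - 63            # low 7 bits hold Y plus 63
    from_current = bool(by & 0x80)   # top bit selects the data source
    return dx, dy, from_current

def decode_skip_run(b):
    """Mode 1: number of blocks left unchanged (0 means 256)."""
    return b if b else 256

def decode_short_mv(b):
    """Mode 2: return (dx, dy) relative to the previous frame."""
    return (b >> 4) - 7, (b & 0x0F) - 7
```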
<br />
[[Category:Game Formats]]<br />
[[Category:Video Codecs]]</div>
CNM, revision of 2023-11-09T16:00:04Z by Kostya (https://wiki.multimedia.cx/index.php?title=CNM&diff=15738)<br />
<p>Kostya: fill information about CI2</p>
<hr />
<div>* Company: [[Arxel Tribe]]<br />
* Extension: cnm, ci2<br />
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/]<br />
<br />
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen]. The CI2 is the next iteration of CNM with slightly different compression that is used in [https://www.mobygames.com/game/seven-games-of-the-soul Faust: The Seven Games of the Soul].<br />
<br />
== Container format ==<br />
<br />
Container has the following structure:<br />
* magic <code>CNM UNR\0</code><br />
* header<br />
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present; completely zero in CI2)<br />
* frames<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
1 byte - unknown<br />
4 bytes - image width<br />
4 bytes - image height<br />
2 bytes - unknown<br />
1 byte - number of audio tracks<br />
4 bytes - number of video frames?<br />
4 bytes - number of frames repeated?<br />
4 bytes - size of offsets table (v1 only)<br />
152 bytes - always zero?<br />
when audio is present for each track:<br />
1 byte - number of channels<br />
1 byte - bits per sample<br />
4 bytes - audio rate<br />
10 bytes - unused?<br />
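<br />
A sketch of reading the magic, the header and the per-track audio records with <code>struct</code>. The tuple and field names are invented for illustration, and the sketch assumes one 16-byte record per audio track directly after the 152 zero bytes.<br />
<br />
```python
import struct
from collections import namedtuple

MAGIC = b"CNM UNR\x00"
HEADER_FMT = "<IIBIIHBIII152x"  # 184 bytes including the zero padding
TRACK_FMT = "<BBI10x"           # 16 bytes per audio track

# Field names are illustrative only.
CnmHeader = namedtuple(
    "CnmHeader",
    "num_frames unknown1 unknown2 width height unknown3 "
    "num_audio_tracks num_video_frames num_repeated_frames offsets_table_size")
AudioTrack = namedtuple("AudioTrack", "channels bits_per_sample rate")

def parse_cnm_header(data):
    """Return (header, tracks, offset of the frame offsets table)."""
    if data[:8] != MAGIC:
        raise ValueError("not a CNM file")
    pos = 8
    header = CnmHeader(*struct.unpack_from(HEADER_FMT, data, pos))
    pos += struct.calcsize(HEADER_FMT)
    tracks = []
    for _ in range(header.num_audio_tracks):
        tracks.append(AudioTrack(*struct.unpack_from(TRACK_FMT, data, pos)))
        pos += struct.calcsize(TRACK_FMT)
    return header, tracks, pos
```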
<br />
Each frame is prefixed by a byte containing its type. Known frame types:<br />
* 0x41 - audio data<br />
* 0x42 - audio data<br />
* 0x53 - image<br />
* 0x54 - tile data<br />
* 0x55 - image (v2)<br />
* 0x5A - audio data<br />
<br />
Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below.<br />
<br />
== Video compression for version 1 ==<br />
<br />
Each frame is an independently compressed image (in bottom-up format) split into tiles.<br />
Frame header:<br />
4 bytes - payload size (not counting the header)<br />
4 bytes - offset to the colour data<br />
2 bytes - number of tiles<br />
2 bytes - tile data size<br />
4 bytes - width<br />
4 bytes - height<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
3 bytes - unused?<br />
<br />
Colour data may contain either raw tile pixels (32-bit BGR0) or it may be packed. In that case tile data size is set to 4 or 2 and the deltas are stored right after it. The overall tile restoration algorithm is the following:<br />
<br />
copy 16 bytes (4x1 tile) from the stream<br />
for (tile = 1; tile < num_tiles; tile++) {<br />
tile_data[tile] = tile_data[tile - 1];<br />
bits = get_bits(3) + 1; //the same bit reading as below, bits=8 should not happen<br />
for (i = 0; i < 16; i++) {<br />
delta = get_bits(bits);<br />
if (delta && get_bit())<br />
delta = -delta;<br />
tile_data[tile][i] += delta;<br />
}<br />
}<br />
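<br />
The same restoration loop as a runnable Python sketch. Several details are assumptions rather than documented facts: bits are read MSB first byte by byte, the bitstream starts right after the 16 raw bytes of the first tile, and component values wrap around modulo 256. The names <code>MsbBitReader</code> and <code>restore_tiles</code> are hypothetical.<br />
<br />
```python
class MsbBitReader:
    """Reads bits MSB first, byte by byte (an assumption about the stream)."""

    def __init__(self, data, byte_pos=0):
        self.data = data
        self.bit_pos = byte_pos * 8

    def get_bit(self):
        byte = self.data[self.bit_pos >> 3]
        bit = (byte >> (7 - (self.bit_pos & 7))) & 1
        self.bit_pos += 1
        return bit

    def get_bits(self, n):
        value = 0
        for _ in range(n):
            value = (value << 1) | self.get_bit()
        return value

def restore_tiles(data, num_tiles):
    """Rebuild 16-byte tiles: tile 0 is raw, the rest are deltas."""
    tiles = [bytearray(data[:16])]
    reader = MsbBitReader(data, 16)
    for _ in range(1, num_tiles):
        tile = bytearray(tiles[-1])       # start from the previous tile
        bits = reader.get_bits(3) + 1     # delta width for this tile
        for i in range(16):
            delta = reader.get_bits(bits)
            if delta and reader.get_bit():
                delta = -delta
            tile[i] = (tile[i] + delta) & 0xFF  # assumed wrap-around
        tiles.append(tile)
    return tiles
```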
<br />
<br />
Tile control data is compressed using variable amount of bits, bits are stored MSB first. Tile index is read depending on the number of tiles: if it can fit into 10 bits then it's ten bits, if it can fit into 11 bits then it's 11 bits, otherwise it's 12 bits.<br />
<br />
Single tile decoding flow:<br />
<br />
if (!get_bit()) {<br />
offset = get_bits(tile_index_bits);<br />
copy tile data from the colour data using offset*16<br />
} else { // copy existing tile<br />
decode motion vector, copy the tile it points to<br />
(e.g. -1,0 means previous tile and 0,-1 means top tile)<br />
}<br />
<br />
Motion vector codebook:<br />
1 - 0,-1<br />
0100 - -1, 0<br />
0101 - -1,-1<br />
0110 - 1,-1<br />
0111 - 0,-2<br />
000000 - -2,-3<br />
000001 - 2,-3<br />
000010 - -1,-4<br />
000011 - 1,-4<br />
000100 - -1,-2<br />
000101 - 1,-2<br />
000110 - 0,-3<br />
000111 - 0,-4<br />
001000 - -2, 0<br />
001001 - -2,-1<br />
001010 - 2,-1<br />
001011 - -2,-2<br />
001100 - 2,-2<br />
001101 - -1,-3<br />
001110 - 1,-3<br />
001111 - 0,-5<br />
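<br />
Since the codebook is a prefix code, it can be decoded by reading one bit at a time until a known codeword is matched. The sketch below takes any callable returning the next bit; the names <code>MV_CODES</code> and <code>read_motion_vector</code> are hypothetical.<br />
<br />
```python
# Codeword -> (x, y), transcribed from the table above.
MV_CODES = {
    "1": (0, -1),
    "0100": (-1, 0), "0101": (-1, -1), "0110": (1, -1), "0111": (0, -2),
    "000000": (-2, -3), "000001": (2, -3), "000010": (-1, -4), "000011": (1, -4),
    "000100": (-1, -2), "000101": (1, -2), "000110": (0, -3), "000111": (0, -4),
    "001000": (-2, 0), "001001": (-2, -1), "001010": (2, -1), "001011": (-2, -2),
    "001100": (2, -2), "001101": (-1, -3), "001110": (1, -3), "001111": (0, -5),
}

def read_motion_vector(next_bit):
    """Read bits via next_bit() until they form a codeword; return (x, y)."""
    code = ""
    while code not in MV_CODES:
        code += str(next_bit())
        if len(code) > 6:
            raise ValueError("invalid motion vector code")
    return MV_CODES[code]
```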
<br />
Actual image may be interlaced, i.e. only half of the lines are decoded.<br />
<br />
== Video compression for version 2 ==<br />
In this version frames are coded in small groups (usually by four) with the common tile data (chunk <code>0x54</code>) preceding keyframe (chunk <code>0x55</code>) and inter frames (chunk <code>0x53</code>).<br />
<br />
Also note that in this version bitstream format is little-endian LSB first.<br />
<br />
=== Tile format ===<br />
Chunk type <code>0x54</code> starts with the usual header: 32-bit data size, 16-bit number of tiles and 16-bit tile size. Tile data is packed almost but not exactly like in version 1:<br />
<br />
read raw data for tile 0<br />
for each following tile {<br />
copy previous tile data<br />
for each component of tile { // i.e. all Rs, Gs, Bs and As<br />
bits = get_bits(3);<br />
if (bits < 7) {<br />
for (i = 0; i < tile_size; i++) {<br />
delta = get_bits(bits); // get_bits(0)=0<br />
if (delta && get_bit())<br />
delta = -delta;<br />
tile[component][i] += delta;<br />
}<br />
} else {<br />
for (i = 0; i < tile_size; i++) {<br />
tile[component][i] = get_bits(8);<br />
}<br />
}<br />
}<br />
}<br />
<br />
=== Frame format ===<br />
Frame is now packed using various methods of prediction operating on tile indices. In inter frame tile index 0 means unchanged area.<br />
<br />
Frame data is split into regions of eight tiles, for each a bit is transmitted. Bit 1 means the whole region should be copied from above, bit 0 means that each individual tile index needs to be treated separately.<br />
<br />
Individual tile indices have the following mode codewords:<br />
* <code>&nbsp;&nbsp;1</code> -- copy index from the top line<br />
* <code>000</code> -- get <code>ceil(log2(tile_size))</code> bits for a new tile index, add it to context list (see below)<br />
* <code>100</code> -- get 4-bit delta value, a sign bit, add/subtract <code>delta+1</code> to/from top index value, output and add it to the context list<br />
* <code>010</code> -- form a list of 1-4 unique neighbour values (see below), select one using 0-2 bits, output and add it to the context list<br />
* <code>110</code> -- get 4-bit index in the corresponding context list and output it (without updating the list)<br />
<br />
==== Context list ====<br />
Decoder keeps context-dependent (i.e. one list for each possible tile index) cyclic list of last 16 values that had it as a top neighbour value. Initially it contains all zeroes.<br />
<br />
For all but one single-index operations the list should be updated:<br />
<br />
if (y > 0) { // not the first line<br />
top_idx = frame[cur_pos - stride];<br />
contexts[top_idx].list[contexts[top_idx].pos] = cur_idx;<br />
contexts[top_idx].pos = (contexts[top_idx].pos + 1) & 15;<br />
}<br />
<br />
==== Context-dependent list ====<br />
For one of the modes such list is formed and then used as the pixel source:<br />
<br />
// list forming<br />
list = (empty);<br />
top = y > 0 ? top tile index : NONE;<br />
for left, top-left, top-right and top-top positions {<br />
idx = tile index at the search position<br />
if (!contains(list, idx) && (top == NONE || top != idx)) {<br />
push(list, idx)<br />
}<br />
}<br />
//decoding<br />
if (length(list) < 2) {<br />
new_idx = list[0]; // it should not be empty<br />
} else if (length(list) == 2) {<br />
new_idx = list[get_bit()];<br />
} else {<br />
new_idx = list[get_bits(2)];<br />
}<br />
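<br />
The list forming and selection above, as a Python sketch. The neighbour values are passed in explicitly (in the order left, top-left, top-right, top-top), <code>None</code> stands for a missing top index, and <code>get_bits</code> is any callable that reads the given number of bits; all names are hypothetical.<br />
<br />
```python
def build_candidates(neighbours, top):
    """Collect unique neighbour indices that differ from the top index.

    neighbours: indices at the left, top-left, top-right and top-top
    positions, in that order. top: the top tile index, or None on the
    first line.
    """
    candidates = []
    for idx in neighbours:
        if idx not in candidates and (top is None or idx != top):
            candidates.append(idx)
    return candidates

def select_candidate(candidates, get_bits):
    """Pick an entry using 0, 1 or 2 bits depending on the list length."""
    if len(candidates) < 2:
        return candidates[0]          # the list should not be empty
    if len(candidates) == 2:
        return candidates[get_bits(1)]
    return candidates[get_bits(2)]
```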
<br />
[[Category:Game Formats]]</div>
CNM, revision of 2023-11-06T17:42:31Z by Kostya (https://wiki.multimedia.cx/index.php?title=CNM&diff=15737)<br />
<p>Kostya: /* Video compression for version 1 */ fix motion vector value</p>
<hr />
<div>* Company: [[Arxel Tribe]]<br />
* Extension: cnm, ci2<br />
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/]<br />
<br />
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen]. The CI2 is the next iteration of CNM with slightly different compression that is used in [https://www.mobygames.com/game/seven-games-of-the-soul Faust: The Seven Games of the Soul].<br />
<br />
== Container format ==<br />
<br />
Container has the following structure:<br />
* magic <code>CNM UNR\0</code><br />
* header<br />
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present)<br />
* frames<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
1 byte - unknown<br />
4 bytes - image width<br />
4 bytes - image height<br />
2 bytes - unknown<br />
1 byte - number of audio tracks<br />
4 bytes - number of video frames?<br />
4 bytes - number of frames repeated?<br />
4 bytes - size of offsets table (v1 only)<br />
152 bytes - always zero?<br />
when audio is present for each track:<br />
1 byte - number of channels<br />
1 byte - bits per sample<br />
4 bytes - audio rate<br />
10 bytes - unused?<br />
<br />
Each frame is prefixed by a byte containing its type. Known frame types:<br />
* 0x41 - audio data<br />
* 0x42 - audio data<br />
* 0x53 - image<br />
* 0x54 - tile data<br />
* 0x55 - image (v2)<br />
* 0x5A - audio data<br />
<br />
Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below.<br />
<br />
== Video compression for version 1 ==<br />
<br />
Each frame is an independently compressed image (in bottom-up format) split into tiles.<br />
Frame header:<br />
4 bytes - payload size (not counting the header)<br />
4 bytes - offset to the colour data<br />
2 bytes - number of tiles<br />
2 bytes - tile data size<br />
4 bytes - width<br />
4 bytes - height<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
3 bytes - unused?<br />
<br />
Colour data may contain either raw tile pixels (32-bit BGR0) or it may be packed. In that case tile data size is set to 4 or 2 and the deltas are stored right after it. The overall tile restoration algorithm is the following:<br />
<br />
copy 16 bytes (4x1 tile) from the stream<br />
for (tile = 1; tile < num_tiles; tile++) {<br />
tile_data[tile] = tile_data[tile - 1];<br />
bits = get_bits(3) + 1; //the same bit reading as below, bits=8 should not happen<br />
for (i = 0; i < 16; i++) {<br />
delta = get_bits(bits);<br />
if (delta && get_bit())<br />
delta = -delta;<br />
tile_data[tile][i] += delta;<br />
}<br />
}<br />
<br />
<br />
Tile control data is compressed using variable amount of bits, bits are stored MSB first. Tile index is read depending on the number of tiles: if it can fit into 10 bits then it's ten bits, if it can fit into 11 bits then it's 11 bits, otherwise it's 12 bits.<br />
<br />
Single tile decoding flow:<br />
<br />
if (!get_bit()) {<br />
offset = get_bits(tile_index_bits);<br />
copy tile data from the colour data using offset*16<br />
} else { // copy existing tile<br />
decode motion vector, copy the tile it points to<br />
(e.g. -1,0 means previous tile and 0,-1 means top tile)<br />
}<br />
<br />
Motion vector codebook:<br />
1 - 0,-1<br />
0100 - -1, 0<br />
0101 - -1,-1<br />
0110 - 1,-1<br />
0111 - 0,-2<br />
000000 - -2,-3<br />
000001 - 2,-3<br />
000010 - -1,-4<br />
000011 - 1,-4<br />
000100 - -1,-2<br />
000101 - 1,-2<br />
000110 - 0,-3<br />
000111 - 0,-4<br />
001000 - -2, 0<br />
001001 - -2,-1<br />
001010 - 2,-1<br />
001011 - -2,-2<br />
001100 - 2,-2<br />
001101 - -1,-3<br />
001110 - 1,-3<br />
001111 - 0,-5<br />
<br />
Actual image may be interlaced, i.e. only half of the lines are decoded.<br />
<br />
== Video compression for version 2 ==<br />
In this version tile data is usually stored separately, in chunk type 0x54. Also bitstream format has changed to LSB first little-endian.<br />
<br />
=== Tile format ===<br />
Chunk type <code>0x54</code> starts with the usual header: 32-bit data size, 16-bit number of tiles and 16-bit tile size. Tile data is packed almost but not exactly like in version 1:<br />
<br />
read raw data for tile 0<br />
for each following tile {<br />
copy previous tile data<br />
for each component of tile { // i.e. all Rs, Gs, Bs and As<br />
bits = get_bits(3);<br />
if (bits < 7) {<br />
for (i = 0; i < tile_size; i++) {<br />
delta = get_bits(bits); // get_bits(0)=0<br />
if (delta && get_bit())<br />
delta = -delta;<br />
tile[component][i] += delta;<br />
}<br />
} else {<br />
for (i = 0; i < tile_size; i++) {<br />
tile[component][i] = get_bits(8);<br />
}<br />
}<br />
}<br />
}<br />
<br />
=== Frame format ===<br />
Frame is now packed using a lot of various LRUs and first tile indices are restored and afterwards they are replaced with actual tile data. Frame data is coded in groups of 8 tiles using a bit prefix: 1 - copy 8 tile indices from the previous line, 0 - switch to individual tile index decoding. Individual tile indices are coded in several ways (depending on code):<br />
* <code>&nbsp;&nbsp;1</code> -- copy index from the top line<br />
* <code>000</code> -- get <code>ceil(log2(tile_size))</code> bits for a new tile index, add it to LRU list (see below)<br />
* <code>100</code> -- get 4-bit delta value, a sign bit, add that to top index value, output and add it to LRU list<br />
* <code>010</code> -- form a list of 0-4 context-dependent values (see below), select one using 0-2 bits, output and add it to LRU list<br />
* <code>110</code> -- get 4-bit index, output value retrieved from LRU list using that index<br />
<br />
==== LRU list ====<br />
Decoder keeps context-dependent (i.e. one list for each possible tile index) cyclic list of last 15 values. The actual buffer is selected using the top tile index (so it is not in use for the first line). Initially it contains all zeroes.<br />
<br />
==== Context-dependent list ====<br />
For one of the modes such list is formed and then used as the pixel source:<br />
<br />
// list forming<br />
list = (empty);<br />
top = y > 0 ? top tile index : NONE;<br />
for left, top-left, top-right and top-top positions {<br />
idx = tile index at the search position<br />
if (!contains(list, idx) && (top == NONE || top != idx)) {<br />
push(list, idx)<br />
}<br />
}<br />
//decoding<br />
if (length(list) < 2) {<br />
new_idx = list[0]; // it should not be empty<br />
} else if (length(list) == 2) {<br />
new_idx = list[get_bit()];<br />
} else {<br />
new_idx = list[get_bits(2)];<br />
}<br />
<br />
[[Category:Game Formats]]</div>
Winnow Video, revision of 2023-11-03T15:21:59Z by Kostya (https://wiki.multimedia.cx/index.php?title=Winnow_Video&diff=15736)<br />
<p>Kostya: /* Winnov Video 2 (FOURCC: WINX) */ Fill WINX information</p>
<hr />
<div>* FourCC: WINX, WNV1<br />
* Samples: <br />
** WINX: http://samples.mplayerhq.hu/V-codecs/WINX/<br />
** WNV1: http://samples.mplayerhq.hu/V-codecs/WNV1/<br />
<br />
== Winnov Video 1 (FOURCC: WNV1) ==<br />
<br />
Another hardware codec like [[ATI VCR1]], [[Indeo 2]] or [[Video XL]]. It uses YUYV format and stores deltas with static code.<br />
<br />
The codes are simple unary codes followed by a sign bit, with the code 11111111 used as an escape.<br />
<br />
Each component may be decoded this way:<br />
<br />
code = get_code();<br />
if(code == ESCAPE)<br />
newval = code;<br />
else<br />
newval = oldval + (code << SHIFT);<br />
<br />
SHIFT = 6 (may be another, but no samples with another value are known), ESCAPE = 15 (code for ESCAPE is mentioned above)<br />
<br />
== Winnov Video 2 (FOURCC: WINX) ==<br />
<br />
This is a slightly more advanced codec that codes YUY2 data in 8x8 tiles.<br />
<br />
The bitstream is LSB first and has no special header. Each tile is prefixed by a coded flag (0 means that the tile is skipped) and a 4-bit mode (for coded tiles only).<br />
<br />
Supported tile modes:<br />
* 1 - 5-bit raw deltas and codebook deltas shifted by 4 bits;<br />
* 3 - 6-bit raw deltas and codebook deltas shifted by 5 bits;<br />
* 15 - end of stream.<br />
<br />
Each tile is coded as a sequence of deltas to the previous component value. If the delta value is not "escape" then it should be shifted by the amount of bits defined by the tile mode and added to the previous value, otherwise raw delta value should be read and added instead.<br />
<br />
=== Codebook ===<br />
The codes are represented here in reversed form for clarity (in reality e.g. -3 is stored as <code>0x17</code>).<br />
<br />
111111100 - 7<br />
11111100 - 6<br />
1111100 - 5<br />
111100 - 4<br />
11100 - 3<br />
1100 - 2<br />
100 - 1<br />
0 - 0<br />
101 - -1<br />
1101 - -2<br />
11101 - -3<br />
111101 - -4<br />
1111101 - -5<br />
11111101 - -6<br />
111111101 - -7<br />
11111111 - escape<br />
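<br />
The table has a regular shape: a run of ones, a terminating zero and a sign bit, with eight ones acting as the escape. A decoder can therefore be sketched without a lookup table. The function takes any callable returning the next bit in stream order; the names <code>ESCAPE</code> and <code>read_winx_code</code> are hypothetical.<br />
<br />
```python
ESCAPE = "escape"

def read_winx_code(next_bit):
    """Decode one codebook entry; return an int in -7..7 or ESCAPE."""
    ones = 0
    # Count leading ones; the loop also consumes the terminating zero.
    while ones < 8 and next_bit() == 1:
        ones += 1
    if ones == 8:
        return ESCAPE
    if ones == 0:
        return 0
    # A sign bit follows the terminating zero: 1 means negative.
    return -ones if next_bit() else ones
```
<br />
Feeding the five low bits of <code>0x17</code> LSB first gives -3, in agreement with the note above.<br />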
<br />
[[Category:Video Codecs]]</div>
Radius Studio Video, revision of 2023-10-30T14:00:46Z by Kostya (https://wiki.multimedia.cx/index.php?title=Radius_Studio_Video&diff=15735)<br />
<p>Kostya: fill codec information</p>
<hr />
<div>* FourCC: PGVV<br />
* Company: [[Radius]]<br />
* Samples: [http://samples.mplayerhq.hu/V-codecs/PGVV-RadiusStudio/ http://samples.mplayerhq.hu/V-codecs/PGVV-RadiusStudio/]<br />
<br />
PGVV is apparently a video codec generated by either the Video Vision hardware grabber card, or with the Radius Studio software. All the brochures and reviews say that it is using an "adaptive JPEG". Probably it means adaptive rate control as described in their patent 5,621,820. It is a slight variation of JPEG.<br />
<br />
[http://web.archive.org/web/20001008224911/http://radius.com/Products/VideoVisionPCI.html Original product page] on archive.org.<br />
<br />
Two versions of the codec are known; they can be distinguished by the 16-bit (big-endian) field at offset 8 of the frame data.<br />
<br />
Version 0 frame header (all values are big-endian):<br />
4 bytes - first field size<br />
4 bytes - second field size (0 for progressive images)<br />
2 bytes - version (0)<br />
1 byte - unknown<br />
1 byte - quality<br />
4 bytes - unknown (frame flags?)<br />
<br />
Version 1 frame header:<br />
4 bytes - zero<br />
4 bytes - second field size (0 for progressive images)<br />
2 bytes - version (1)<br />
1 byte - unknown<br />
1 byte - quality<br />
4 bytes - unknown (frame flags?)<br />
4 bytes - unknown<br />
4 bytes - first field size<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
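<br />
Both header layouts can be read with <code>struct</code> once the version field at offset 8 is known. This is a sketch under the assumption that the field holds 0 for the first layout and 1 for the second; the function name and the returned dictionary keys are hypothetical.<br />
<br />
```python
import struct

def parse_pgvv_header(frame):
    """Return field sizes, quality and header size for a PGVV frame."""
    version = struct.unpack_from(">H", frame, 8)[0]
    if version == 0:
        first, second, _, _, quality, _ = struct.unpack_from(">IIHBBI", frame, 0)
        header_size = 16
    elif version == 1:
        # Assumed: the 32-byte layout is signalled by version 1.
        (_, second, _, _, quality, _,
         _, first, _, _) = struct.unpack_from(">IIHBBIIIII", frame, 0)
        header_size = 32
    else:
        raise ValueError("unknown PGVV header version")
    return {"version": version, "first_field_size": first,
            "second_field_size": second, "quality": quality,
            "header_size": header_size}
```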
<br />
After the header the first field data follows, then padding to multiple of 2048 and the second field data.<br />
<br />
Field data is almost standard JPEG data organised per macroblock, with two luma and two chroma blocks per macroblock. The coefficients are coded almost the same way; only chroma DC values are interpreted slightly differently (DC difference bits are stored as sign+value, not as a two's complement value).<br />
<br />
[[Category:Video Codecs]]</div>
FVF, revision of 2023-09-14T13:44:11Z by Kostya (https://wiki.multimedia.cx/index.php?title=FVF&diff=15729)<br />
<p>Kostya: fill some information</p>
<hr />
<div>FVF is a format used for cutscenes in [https://www.mobygames.com/game/957/star-trek-the-next-generation-a-final-unity/ Star Trek: The Next Generation - "A Final Unity"] game.<br />
<br />
== File structure ==<br />
All data is little-endian. Data is grouped into frames that are grouped into blocks aligned to at least 2048 bytes. Audio is stored as unpacked PCM, video is compressed with its own method.<br />
<br />
Header:<br />
4 bytes - "FVF "<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
4 bytes - first block offset<br />
4 bytes - last block offset<br />
4 bytes - image header offset (should be 0x60)<br />
4 bytes - audio header offset (usually 0xB5)<br />
64 bytes - unknown<br />
<br />
Image header:<br />
2 bytes - header size? (usually 40)<br />
2 bytes - always 1?<br />
2 bytes - always 16?<br />
2 bytes - width<br />
2 bytes - height<br />
4 bytes - delay in milliseconds<br />
4 bytes - unknown<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
2 bytes - unknown<br />
1 byte - palette something<br />
1 byte - number of palette entries (usually 15)<br />
4 bytes - palette offset (usually 0x88)<br />
6 bytes - unknown<br />
<br />
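The field sizes listed above are self-consistent: the main header is 96 (0x60) bytes, the image header is 40 bytes, so a palette right after it would start at 0x88, and 15 palette entries of 3 bytes each would end at 0xB5. The Python sketch below checks that arithmetic with the standard <code>struct</code> module. The names <code>MAIN_HEADER</code>, <code>IMAGE_HEADER</code> and <code>parse_fvf_headers</code> are hypothetical, and the 3-bytes-per-entry palette size is an assumption that merely fits the numbers.<br />

```python
import struct

# Layouts transcribed from the field lists above (little-endian, no padding).
MAIN_HEADER = struct.Struct("<4s3I4I64x")    # 96 bytes
IMAGE_HEADER = struct.Struct("<5H4IHBBI6x")  # 40 bytes

def parse_fvf_headers(data: bytes) -> dict:
    """Read the main header and the image header it points to."""
    (magic, _u0, _u1, _u2, first_block, last_block,
     image_off, audio_off) = MAIN_HEADER.unpack_from(data, 0)
    if magic != b"FVF ":
        raise ValueError("not an FVF file")
    (_hdr_size, _one, _sixteen, width, height, delay_ms, _u3, nframes,
     _u4, _u5, _pal_flag, pal_entries, pal_off) = IMAGE_HEADER.unpack_from(data, image_off)
    return {
        "first_block": first_block,
        "last_block": last_block,
        "image_header_offset": image_off,
        "audio_header_offset": audio_off,
        "width": width,
        "height": height,
        "delay_ms": delay_ms,
        "frames": nframes,
        "palette_entries": pal_entries,
        "palette_offset": pal_off,
    }
```
<br />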
Audio header:<br />
2 bytes - compression? (usually it's 1 and raw PCM)<br />
2 bytes - number of channels?<br />
2 bytes - bits per sample (usually 8)<br />
2 bytes - sampling rate<br />
8 bytes - unknown<br />
<br />
Block header:<br />
2 bytes - header size (should be 16)<br />
2 bytes - flags<br />
4 bytes - previous block size<br />
4 bytes - current block size<br />
4 bytes - next block size<br />
<br />
Frame header:<br />
2 bytes - header size (should be 24)<br />
4 bytes - full size<br />
4 bytes - always 24?<br />
4 bytes - video part size<br />
4 bytes - audio part size<br />
6 bytes - unknown<br />
<br />
== Video frame part ==<br />
Video frame part starts with a 32-bit size and two 16-bit fields (one of them signals a palette change, the other one is the motion vector table size). The compression method seems to work by painting tiles using colour combinations generated from the palette colours. The bitstream is little-endian; the low three bits signal the tile opcode, and the bitstream is aligned to a byte boundary before each opcode:<br />
* code 0 -- get 7-bit value, get 14-bit value, paint 4x4 tile using them;<br />
* code 1 -- the same as above but tile is painted flipped horizontally;<br />
* code 2 -- same as code 0 but tile is flipped vertically;<br />
* code 3 -- same as above but tile is flipped horizontally as well;<br />
* code 4 -- get 5-bit run value, copy the provided amount of 4x4 tiles from the previous frame (presumably);<br />
* code 5 -- get 1-bit run flag and then 4-bit MV table index. A set run flag means reading an 8-bit run value and performing the operation that many additional times; a zero flag means doing motion compensation on a 4x4 tile just once (the motion vector table is stored at the beginning of the frame);<br />
* code 6 -- skip 5 bits (to byte align) and copy 16 bytes from the input to the tile data;<br />
* code 7 -- this is large tile mode, the next 4 bits signal the operation:<br />
** case 0 -- skip 1 bit, get 7-bit index, get 1-bit flag, get 14-bit index, paint 8x8 tile;<br />
** cases 1-7 -- same but with various flipping modes;<br />
** cases 8-11, 13-14 -- should not be present;<br />
** case 12 -- special 4x4 tile copy run with long values;<br />
** case 14 -- raw 2x2 tile.<br />
<br />
[[Category:Game Formats]]<br />
[[Category:Video Codecs]]<br />
[[Category:Incomplete Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Interplay_ACMP&diff=15728Interplay ACMP2023-08-28T14:56:37Z<p>Kostya: fill at least some details</p>
<hr />
<div>* Company: [[Interplay Entertainment]]<br />
* Samples: http://samples.mplayerhq.hu/game-formats/interplay-acmp/<br />
<br />
Interplay ACMP is an audio compression file format used in some games from Interplay. The name is a shortened form of "AUDCOMP", and the format employs [[DPCM]] with variable-length coding for compressing 8-bit PCM.<br />
<br />
== Interplay ACMP Format ==<br />
Multi-byte numbers are stored in big endian format.<br />
<br />
bytes 0-19 Signature string: 'Interplay ACMP Data\x1A'<br />
bytes 20-21 sample rate<br />
bytes 22-23 unknown, always seems to be 0 and might be part of sample rate<br />
byte 24 unknown, always seems to be 0x28<br />
byte 25 unknown, always seems to be 0<br />
bytes 26-29 possibly the size of decompressed audio<br />
bytes 30.. suspected to be the encoded audio data<br />
<br />
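The fixed 30-byte header above can be read with a single <code>struct</code> layout, as in this minimal Python sketch. The names <code>ACMP_HEADER</code> and <code>parse_acmp_header</code> are hypothetical, and the meaning of bytes 22-29 is only as certain as the field list above says.<br />

```python
import struct

SIGNATURE = b"Interplay ACMP Data\x1a"     # 20 bytes
ACMP_HEADER = struct.Struct(">20sHHBBI")   # big-endian, 30 bytes in total

def parse_acmp_header(data: bytes) -> dict:
    """Split the header fields; audio data is assumed to start at byte 30."""
    signature, sample_rate, unk22, unk24, unk25, unpacked_size = \
        ACMP_HEADER.unpack_from(data, 0)
    if signature != SIGNATURE:
        raise ValueError("not an Interplay ACMP file")
    return {
        "sample_rate": sample_rate,
        "unknown_22": unk22,            # always seems to be 0
        "unknown_24": unk24,            # always seems to be 0x28
        "unknown_25": unk25,            # always seems to be 0
        "unpacked_size": unpacked_size, # possibly the decompressed size
        "data_offset": ACMP_HEADER.size,
    }
```
<br />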
== Audio data coding ==<br />
Audio seems to be coded in blocks of 256 samples (or less for the last block). Each block starts with 8-bit header:<br />
* 1 bit - audio is transformed and only half of the samples are transmitted;<br />
* 2 bits - quantisation step<br />
* 5 bits - coding mode (0-6)<br />
<br />
In general, block data is compressed by first taking differences (initial value is 0x80), quantising them by discarding low bits, and coding the result in one of several ways. Some coding modes use a so-called tally code, which is a unary code for <code>min(value, 10)</code> with an optional 6-bit full value written afterwards (for values 10 and above). Deltas are also often coded as absolute values followed by a sign bit.<br />
<br />
Coding modes are:<br />
* 0 - constant block, first sample is coded as is;<br />
* 1 - deltas are coded as <code>tally(ilog2(|delta|)) |delta| sign</code> (or <code>tally(1)</code> for zero delta);<br />
* 2 - deltas are coded as 4-bit fields, those that do not fit coded as 0 nibble plus <code>8 - quant_bits</code> bits for the actual delta value;<br />
* 3 - deltas are coded as <code>tally((|diff| >> 2) + 1) |diff|&3 sign</code>;<br />
* 4 - the same as previous but with 3 low bits coded raw;<br />
* 5 - delta 0 is coded as <code>00</code>, delta 1 is coded as <code>01</code>, other deltas are coded as <code>10 tally(ilog2(|diff| - 1)) (|diff| - 1) sign</code>;<br />
* 6 - simply write raw quantised differences using <code>8 - quant_bits</code> bits for each.<br />
<br />
== Games Using Interplay ACMP Files ==<br />
* [http://www.mobygames.com/game/star-trek-judgment-rites Star Trek: Judgment Rites]<br />
<br />
[[Category:Audio Codecs]]<br />
[[Category:Game Formats]]<br />
[[Category:Incomplete Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Muzip&diff=15727Muzip2023-08-28T12:58:11Z<p>Kostya: promote from undiscovered codecs as some information is known</p>
<hr />
<div>* Extensions: mzp, dat, tlp<br />
* Company: [[XVD|Alaris (XVD Corporation)]]<br />
* Samples: http://www.ila-ila.com/xvd-hist/sites/lab1454/eng/products/NCitySng12.mzp and various VGM/VG2 files with Muzip audio stream inside.<br />
<br />
Muzip is a family of FFT-based audio codecs with confusing versioning. Internally codec versions can be distinguished by the text ID that starts at byte 18 of audio header. Known versions:<br />
* <code>CTP03</code> - probably Muzip1 and codec ID 3<br />
* <code>CTP04</code> - probably Muzip1 and codec ID 3<br />
* <code>CTP05</code> - probably Muzip1 and codec ID 5<br />
* <code>CTP06</code> - Muzip1 (codec ID 5)<br />
* <code>CTP07</code> - Muzip1 (codec ID 5), probably the same as <code>CTPJAVA</code> (codec ID 8)<br />
* <code>MZIP13R</code> - Muzip 4 (codec ID 9), the ID probably stands for 1.3<br />
<br />
Decoder for version CTP06 is here: http://www.ila-ila.com/xvd-hist/sites/lab1454/eng/products/muzip1.jar<br />
Decoder for version CTP07 can be found on the same site as part of Java VGM player.<br />
<br />
==Header for Muzip sample in WAV==<br />
0..3 "RIFF"<br />
4..7 unknown (file size? seems not)<br />
8..11 "WAVE"<br />
12..15 "fmt\0" (note: not space, it's null)<br />
16..19 unknown (fmt\0 size? seems not)<br />
20..21 unknown (format id? "0301". seems not)<br />
22 number of channels<br />
23 unknown (rest of number of channels?)<br />
24..27 sampling rate<br />
28..31 (byte per sec?)<br />
32..33 (block size?)<br />
34..35 (bit per sample?)<br />
36..(\0) codec name ("CTP03", "CTP04", "CTP05", "CTP06")<br />
42..43 unknown (rest of codec name or not?)<br />
44 coefficients(compression ratio) (8-13 -> [8,12,15,16,24,10])<br />
45 key frame<br />
46 variable bitrate<br />
47 reserved<br />
48..51 unknown<br />
52..55 "data"<br />
56..59 total size<br />
<br />
==Muzip CTP03==<br />
<br />
No information.<br />
<br />
==Muzip CTP04==<br />
<br />
No information.<br />
<br />
==Muzip CTP05==<br />
<br />
No information.<br />
<br />
==Muzip CTP06==<br />
<br />
This is a codec that employs [[enumerative coding]] for coding various values, FFT and also wavelet transform for high-frequency part manipulation. Frequencies are grouped into sub-bands depending on sampling rate.<br />
<br />
After the main part is decoded and FFT is applied, the audio is passed through wavelet-based filter to split it into high- and low-frequency band, an additional trellis-based high-frequency data is added to the high band and the bands are recombined again.<br />
<br />
==Muzip CTP07==<br />
<br />
This format improves on <code>CTP06</code> by adding two different kinds of correction data that is added after main data reconstruction.<br />
<br />
==Muzip MZIP13R==<br />
<br />
This is the version of codec used in VGM2 container. Unlike previous versions, it does not perform wavelet-based high band correction and uses arithmetic coding for coefficients. Additionally now it has several frame sizes depending on bitrate: 64, 128, 256 and 320 samples.<br />
<br />
Sample coding is done by coding bit allocation parameters in parametric way, symbol categories, arithmetic coded scalefactor indices (coded using static model selected depending on bit allocation), and arithmetic coded quantised samples coded using static model selected depending on bit allocation and category.<br />
<br />
[[Category:Audio Codecs]] <br />
[[Category:Incomplete Audio Codecs]]<br />
[[Category:Formats missing in FFmpeg]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=OptimFROG&diff=15726OptimFROG2023-08-28T12:57:17Z<p>Kostya: promote from undiscovered codecs as some information about it is known</p>
<hr />
<div>* Extension: ofr<br />
* Website: http://www.losslessaudio.org/<br />
* Samples: http://samples.mplayerhq.hu/A-codecs/lossless/ (luckynight.ofr)<br />
<br />
OptimFROG is a lossless audio coding algorithm employing multi-layer adaptive filter and range coding. There are two formats known: 4.2alpha and the current one.<br />
<br />
Adaptive filter differs from the conventional LMS filters by the fact that after certain amount of samples is decoded the filter is re-calculated.<br />
<br />
Coefficient coding may use adaptive models. The models are usually selected using the exponent of the weighted energy of decoded coefficients (e.g. for the old format it is <code>energy_new = energy_old * 0.91700404 + coef * coef * 0.08299596</code>; the new format can set custom weights).<br />
<br />
=== Old format ===<br />
<br />
Old format starts with a 44-byte RIFF WAV header with first four bytes replaced with <code>*RIF</code>, then 32-bit number of coded samples and actual coded data follow.<br />
<br />
This codec employs single adaptive filter with order up to 64. Residue is coded using a set of 32 adaptive models initialised with pre-defined frequencies.<br />
<br />
=== New format ===<br />
New format has a chunked format and supports a correction stream.<br />
Supported chunks:<br />
* <code>OFR </code> or <code>OFRX</code> - header<br />
* <code>HEAD</code> - WAV file header<br />
* <code>COMP</code> - compressed audio data<br />
* <code>CORR</code> - correction data for hybrid streams<br />
* <code>TAIL</code> - WAV file trailer<br />
<br />
==== Header ====<br />
All values are in little-endian format.<br />
<br />
* 4 bytes - header size (12 bytes for 4.5alpha, 15 bytes for older versions, 17 bytes for newer versions)<br />
* 6 bytes - number of samples (for all channels)<br />
* 1 byte - format ID (u8/s8, u16/s16, u24/s24, u32/s32, f32 in -1.0..1.0 range, f32 for 16-bit integers, f32 for 24-bit integers)<br />
* 1 byte - channel configuration (0 - mono, 1 - stereo)<br />
* 4 bytes - sample rate<br />
* 2 bytes - some packed information <code>((version - 4200) * 16 + something)</code><br />
* 1 byte - packed information <code>(8 * method + speed)</code>. Known methods are fast, normal, high, extra, best, ultra, insane, highnew, extranew, bestnew, ultranew, extrafast, turbonew, fastnew, normalnew. Known speeds are 1x/2x/4x.<br />
* 2 bytes - <code>(version - 4500)</code><br />
<br />
==== Compressed audio chunk ====<br />
Compressed data begins with 4-byte unknown value (most probably CRC), 4-byte number of samples in the block, 1-byte format ID (same meaning as in the header), 1-byte channel ID (the same) and 2-byte block packing methods stored as <code>reader_id << 11 | filter_id << 6 | output_mode_id</code>.<br />
<br />
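The 2-byte block packing field can be split back into its three IDs as in the Python sketch below. The field widths (5, 5 and 6 bits) are an assumption inferred from the shift amounts in the formula above, and the function names are hypothetical.<br />

```python
def pack_block_methods(reader_id: int, filter_id: int, output_mode_id: int) -> int:
    """Combine the three IDs exactly as in the formula above."""
    return reader_id << 11 | filter_id << 6 | output_mode_id

def unpack_block_methods(packed: int) -> tuple:
    """Split the 16-bit field; widths are inferred from the shifts (assumption)."""
    reader_id = (packed >> 11) & 0x1F
    filter_id = (packed >> 6) & 0x1F
    output_mode_id = packed & 0x3F
    return reader_id, filter_id, output_mode_id
```
<br />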
Known <code>reader_id</code> values:<br />
* 1 - use a set of adaptive models to determine how many bits of value to read<br />
* 2 - use single adaptive model to determine how many bits of value to read<br />
* 3 - use adaptive models to decode which model to use next for decoding the number of bits of the value<br />
<br />
Known <code>filter_id</code> values are 1-4; they tell the decoder which filters should be used for reconstruction. There seem to be several layers (minimum of two) of adaptive filters, where the output of each layer is fed to the next one. Remarkably, the filters use floating-point calculations instead of integer arithmetic.<br />
<br />
Known <code>output_mode_id</code> values:<br />
* 1 - multiply output by a constant, add another constant, clip to constant range (all those parameters are transmitted)<br />
* 2 - remap input values to the output floating-point values<br />
<br />
For all of these stages there are range-coded parameters that are transmitted before the actual data start. First it's the reader data, then filter data, and finally output mode data.<br />
<br />
[[Category:Container Formats]]<br />
[[Category:Audio Codecs]]<br />
[[Category:Lossless Audio Codecs]]<br />
[[Category:Incomplete Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=VocPack&diff=15725VocPack2023-08-28T12:56:12Z<p>Kostya: </p>
<hr />
<div>*Description: [http://www.rarewares.org/rrw/vocpack.php VocPack]<br />
<br />
From the description, this appears to be one of the earliest lossless codecs: it dates from 1993.<br />
Two versions of the encoder/decoder are available on the site listed above.<br />
<br />
File structure:<br />
4 bytes - NFVP (probably Nicola Ferioli and VocPack)<br />
1 byte - compression method (0 - unsigned 8-bit audio, 1 - signed 8-bit audio, 0x20 - version 2.0)<br />
4 bytes - number of coded samples<br />
[v2 only] 1 byte - packed parameters (bit 1 - stereo, bit 2 - 16-bit input, bits 3-5 - number of initial raw bytes)<br />
[v2 only] 1-33 bytes - zero-terminated original filename<br />
packed audio data<br />
<br />
In both versions audio data is first predicted using adaptive order-2 predictor and then the difference is compressed using context adaptive models and binary arithmetic coder.<br />
<br />
=== VocPack 1.0 ===<br />
This version compresses only 8-bit mono audio.<br />
<br />
Prediction process:<br />
<br />
pred = clip((last * lastcoef + last2 * last2coef) >> 10, -128, 127);<br />
<br />
Predictor update:<br />
<br />
if last > 0 {<br />
lastcoef += 5 * diff;<br />
} else {<br />
lastcoef -= 5 * diff;<br />
}<br />
if last2 > 0 {<br />
last2coef += 2 * diff;<br />
} else {<br />
last2coef -= 2 * diff;<br />
}<br />
last2 = last;<br />
last = sample;<br />
<br />
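The prediction and update steps above can be combined into a small state object, as in this Python sketch. The class name <code>VocPack1Predictor</code> is hypothetical, and the initial coefficient and history values are placeholders because the description does not give them; <code>diff</code> is whatever difference the caller decoded.<br />

```python
def clip(value: int, lo: int, hi: int) -> int:
    return max(lo, min(hi, value))

class VocPack1Predictor:
    """Order-2 adaptive predictor following the VocPack 1.0 pseudocode."""

    def __init__(self, lastcoef: int = 0, last2coef: int = 0):
        # Initial values are NOT documented above; zeros are placeholders.
        self.last = 0
        self.last2 = 0
        self.lastcoef = lastcoef
        self.last2coef = last2coef

    def predict(self) -> int:
        acc = self.last * self.lastcoef + self.last2 * self.last2coef
        return clip(acc >> 10, -128, 127)

    def update(self, diff: int, sample: int) -> None:
        # Each coefficient moves in the direction given by the sign of its input.
        if self.last > 0:
            self.lastcoef += 5 * diff
        else:
            self.lastcoef -= 5 * diff
        if self.last2 > 0:
            self.last2coef += 2 * diff
        else:
            self.last2coef -= 2 * diff
        self.last2 = self.last
        self.last = sample
```
<br />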
Difference coding uses a set of 64 adaptive models with 256 elements each, with all frequencies initialised to 1. The next model index is <code>((uint8_t)prev_decoded_val) >> 2</code>. The frequency of each decoded element is updated by adding 256, and if the total sum of frequencies reaches 64000 the frequencies are halved as <code>(freq >> 1) + 1</code>.<br />
<br />
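The frequency bookkeeping described above is sketched below in Python. The names <code>AdaptiveModel</code> and <code>next_model_index</code> are hypothetical, and "reaches 64000" is read as "greater than or equal to 64000", which is an assumption. The arithmetic coder itself is not shown.<br />

```python
class AdaptiveModel:
    """256-symbol frequency table with the update rule described above."""
    LIMIT = 64000
    INCREMENT = 256

    def __init__(self):
        self.freqs = [1] * 256
        self.total = 256

    def update(self, symbol: int) -> None:
        self.freqs[symbol] += self.INCREMENT
        self.total += self.INCREMENT
        if self.total >= self.LIMIT:  # assumption: ">=" for "reaches"
            self.freqs = [(f >> 1) + 1 for f in self.freqs]
            self.total = sum(self.freqs)

def next_model_index(prev_decoded_val: int) -> int:
    """Select one of the 64 models from the previously decoded value."""
    return (prev_decoded_val & 0xFF) >> 2
```
<br />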
=== VocPack 2.0 ===<br />
This version supports 16-bit and stereo audio. In case of stereo channels are coded independently. In case of 16-bit audio low byte of input sample is stored raw and only top byte is coded in a method very similar to version 1.0.<br />
<br />
Version 2 predictor uses shift by 9 and the following update method:<br />
<br />
if last > 0 {<br />
lastcoef += diff * 2;<br />
} else {<br />
lastcoef -= diff * 2;<br />
}<br />
lastcoef = clip(lastcoef, 500, 1500);<br />
if last2 > 0 {<br />
last2coef += diff;<br />
} else {<br />
last2coef -= diff;<br />
}<br />
last2coef = clip(last2coef, -750, -250);<br />
<br />
Difference is coded using a different model based on similar approaches.<br />
<br />
[[Category: Lossless Audio Codecs]]<br />
[[Category: Incomplete Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=VocPack&diff=15724VocPack2023-08-28T12:55:54Z<p>Kostya: promote from undiscovered codecs as it has algorithm description</p>
<hr />
<div>*Description: [http://www.rarewares.org/rrw/vocpack.php VocPack]<br />
<br />
From the description, this appears to be one of the earliest lossless codecs: it dates from 1993.<br />
Two versions of the encoder/decoder are available on the site listed above<br />
<br />
File structure:<br />
4 bytes - NFVP (probably Nicola Ferioli and VocPack)<br />
1 byte - compression method (0 - unsigned 8-bit audio, 1 - signed 8-bit audio, 0x20 - version 2.0)<br />
4 bytes - number of coded samples<br />
[v2 only] 1 byte - packed parameters (bit 1 - stereo, bit 2 - 16-bit input, bits 3-5 - number of initial raw bytes)<br />
[v2 only] 1-33 bytes - zero-terminated original filename<br />
packed audio data<br />
<br />
In both versions audio data is first predicted using adaptive order-2 predictor and then the difference is compressed using context adaptive models and binary arithmetic coder.<br />
<br />
=== VocPack 1.0 ===<br />
This version compresses only 8-bit mono audio.<br />
<br />
Prediction process:<br />
<br />
pred = clip((last * lastcoef + last2 * last2coef) >> 10, -128, 127);<br />
<br />
Predictor update:<br />
<br />
if last > 0 {<br />
lastcoef += 5 * diff;<br />
} else {<br />
lastcoef -= 5 * diff;<br />
}<br />
if last2 > 0 {<br />
last2coef += 2 * diff;<br />
} else {<br />
last2coef -= 2 * diff;<br />
}<br />
last2 = last;<br />
last = sample;<br />
<br />
Difference coding uses a set of 64 adaptive models with 256 elements each initially initialised to 1. The next model index is <code>((uint8_t)prev_decoded_val) >> 2</code>. Each decoded element frequency is updated by adding 256 and if the total sum of frequency reaches 64000 the frequencies are halved as <code>(freq >> 1) + 1</code>.<br />
<br />
=== VocPack 2.0 ===<br />
This version supports 16-bit and stereo audio. In case of stereo channels are coded independently. In case of 16-bit audio low byte of input sample is stored raw and only top byte is coded in a method very similar to version 1.0.<br />
<br />
Version 2 predictor uses shift by 9 and the following update method:<br />
<br />
if last > 0 {<br />
lastcoef += diff * 2;<br />
} else {<br />
lastcoef -= diff * 2;<br />
}<br />
lastcoef = clip(lastcoef, 500, 1500);<br />
if last2 > 0 {<br />
last2coef += diff;<br />
} else {<br />
last2coef -= diff;<br />
}<br />
last2coef = clip(last2coef, -750, -250);<br />
<br />
Difference is coded using a different model based on similar approaches.<br />
<br />
[[Category: Lossless Audio Codecs]]<br />
[[Category: Incomplete Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=RK_Audio&diff=15723RK Audio2023-08-28T12:54:16Z<p>Kostya: promote from undiscovered as there's even an opensource decoder</p>
<hr />
<div>* Extension: rka<br />
* Website: [http://www.msoftware.co.nz/downloads_page.php http://www.msoftware.co.nz/downloads_page.php]<br />
* Description: http://www.rarewares.org/rrw/rkau.php<br />
* Company: [[M Software]]<br />
* Samples: http://samples.mplayerhq.hu/A-codecs/lossless/ (luckynight.rka)<br />
* Implementation: [https://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavcodec/rka.c;hb=HEAD libavcodec]<br />
<br />
RK Audio is a lossless audio coding algorithm.<br />
<br />
=== Container format ===<br />
All values are little-endian.<br />
<br />
3 bytes - "RKA"<br />
1 byte - version ('5'-'7')<br />
4 bytes - raw audio size<br />
4 bytes - sampling rate<br />
1 byte - number of channels<br />
1 byte - bits per sample<br />
1 byte - method * 16 + cmode (known methods: fast/normal/max, cmode 0 - lossless, 0-3 - simple lossy, 4-7 - VRQ lossy)<br />
1 byte - flags (bit 0 - true stereo, bit 1 - seek table present)<br />
<br />
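The 16-byte container header above maps directly onto a <code>struct</code> layout, as in this Python sketch. The names <code>RKA_HEADER</code> and <code>parse_rka_header</code> are hypothetical; the split of the packed byte follows the <code>method * 16 + cmode</code> formula.<br />

```python
import struct

RKA_HEADER = struct.Struct("<3scIIBBBB")  # little-endian, 16 bytes

def parse_rka_header(data: bytes) -> dict:
    """Parse the fixed RKA header fields listed above."""
    (magic, version, raw_size, sample_rate,
     channels, bits, packed, flags) = RKA_HEADER.unpack_from(data, 0)
    if magic != b"RKA":
        raise ValueError("not an RKA file")
    return {
        "version": version.decode("ascii"),  # '5'..'7'
        "raw_size": raw_size,
        "sample_rate": sample_rate,
        "channels": channels,
        "bits_per_sample": bits,
        "method": packed >> 4,
        "cmode": packed & 0x0F,
        "true_stereo": bool(flags & 1),
        "has_seek_table": bool(flags & 2),
    }
```
<br />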
Without a seek table, audio data coded as a single block follows immediately; otherwise there is a 4-byte word with an offset to the frame sizes. Frame sizes are stored as 24-bit integers. Each frame codes 131072 bytes of audio data.<br />
<br />
=== Compression details ===<br />
Compression algorithm works by applying LPC filter with order up to 257 to the residues, all data is coded with adaptive models and arithmetic coder.<br />
<br />
Frame data consists of interleaved channel data blocks each coding 2560 samples. Depending on coded mode it can be coded in up to three segments. Each segment contains filter order, filter coefficients, and residues. Filter coefficients and residues are decoded using context-dependent adaptive models, the rest is coded using adaptive models.<br />
<br />
Stereo reconstruction:<br />
<br />
new_l = (l * 2 + r + 1) >> 1;<br />
new_r = (l * 2 - r + 1) >> 1;<br />
<br />
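A direct Python transcription of the stereo reconstruction above is given below as a sketch. The function name is hypothetical; Python's <code>>></code> on negative integers is an arithmetic (flooring) shift, which is assumed to match the original decoder's behaviour.<br />

```python
def reconstruct_stereo(l: int, r: int) -> tuple:
    """Apply the stereo reconstruction formulas to one sample pair."""
    new_l = (l * 2 + r + 1) >> 1
    new_r = (l * 2 - r + 1) >> 1
    return new_l, new_r
```
<br />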
[[Category:Container Formats]]<br />
[[Category:Audio Codecs]]<br />
[[Category:Lossless Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CRH&diff=15722CRH2023-08-09T14:22:09Z<p>Kostya: document the format</p>
<hr />
<div>* extension: .crh<br />
* company: Kalisto Entertainment SA<br />
<br />
CRH is an FMV format used at least in two games by Kalisto Entertainment. It features 15-bit RGB video with LZ-like compression and block correlation and [[DPCM]]-coded audio.<br />
<br />
== Container format ==<br />
CRH files start with 16-bit little-endian version field (should be always set to one) followed by 48-byte audio header and 4-byte video header (16-bit number of frames and 16-bit frames per second which should always be 15).<br />
<br />
Audio header:<br />
4 bytes - sample rate<br />
4 bytes - probably full audio stream size<br />
2 bytes - unknown<br />
2 bytes - unknown<br />
2 bytes - unknown<br />
2 bytes - number of channels<br />
4 bytes - unknown<br />
4 bytes - probably raw audio bytes per second<br />
2 bytes - probably raw audio bits per sample<br />
22 bytes - unknown<br />
<br />
Frames consist of fixed-size audio part and variable-size video part (it starts with 32-bit size).<br />
The size of the audio part of a frame is calculated as <code>sample_rate * channels / fps + 2</code>.<br />
<br />
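The fixed part of the container (2-byte version, 48-byte audio header, 4-byte video header) and the audio part size can be sketched in Python as follows. The names <code>CRH_HEADER</code>, <code>parse_crh_header</code> and <code>audio_part_size</code> are hypothetical, and integer division is assumed in the size formula.<br />

```python
import struct

# version, then the audio header fields in listed order, then frames and fps.
CRH_HEADER = struct.Struct("<HIIHHHHIIH22xHH")  # 2 + 48 + 4 = 54 bytes

def audio_part_size(sample_rate: int, channels: int, fps: int) -> int:
    # Assumption: the division is an integer division.
    return sample_rate * channels // fps + 2

def parse_crh_header(data: bytes) -> dict:
    (version, sample_rate, audio_size, _u0, _u1, _u2, channels,
     _u3, bytes_per_sec, bits, nframes, fps) = CRH_HEADER.unpack_from(data, 0)
    return {
        "version": version,
        "sample_rate": sample_rate,
        "audio_stream_size": audio_size,
        "channels": channels,
        "bytes_per_second": bytes_per_sec,
        "bits_per_sample": bits,
        "frames": nframes,
        "fps": fps,
        "audio_part_size": audio_part_size(sample_rate, channels, fps),
    }
```

For example, 22050 Hz mono audio at 15 frames per second gives 22050 / 15 + 2 = 1472 bytes of audio per frame.<br />
<br />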
== Video coding ==<br />
Video frames start with four 32-bit numbers: full frame size, component 1 size, component 2 size, component 3 size. Then the data for the components follows. Finally there's tile data prefixed with 32-bit size.<br />
<br />
Video compression works by splitting frames into tiles, decorrelating tile components, interleaving them and coding each component separately using LZ-like coding.<br />
<br />
=== Component plane reconstruction ===<br />
Each component plane has fixed size (320x200 or 320x240 depending on the game) and uses the following bitstream (LSB first):<br />
<br />
if !get_bit() { // RLE<br />
value = get_bits(5);<br />
length = get_bits(8);<br />
output value x length to dst<br />
} else { // LZ<br />
source_id = get_bit();<br />
offset = get_bits(16 or 17); // depends on game<br />
length = get_bits(8);<br />
source = (!source_id ? dst : prev_plane) + offset;<br />
copy length bytes from source to dst<br />
}<br />
<br />
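The bitstream above can be decoded with a small LSB-first bit reader, as in this Python sketch. The names <code>BitReader</code> and <code>decode_plane</code> are hypothetical. The sketch assumes that multi-bit fields are assembled with the first bit read as the lowest bit, that offsets are absolute positions in the chosen buffer, and that decoding stops once the plane is full; none of these details are confirmed by the description.<br />

```python
class BitReader:
    """Reads bits starting from the least significant bit of each byte."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits

    def get_bit(self) -> int:
        bit = (self.data[self.pos >> 3] >> (self.pos & 7)) & 1
        self.pos += 1
        return bit

    def get_bits(self, count: int) -> int:
        value = 0
        for i in range(count):
            value |= self.get_bit() << i  # assumption: first bit is the lowest
        return value

def decode_plane(data: bytes, prev_plane: bytes, size: int, offset_bits: int = 16) -> bytes:
    """Decode one component plane; offset_bits is 16 or 17 depending on the game."""
    br = BitReader(data)
    dst = bytearray()
    while len(dst) < size:
        if not br.get_bit():  # RLE
            value = br.get_bits(5)
            length = br.get_bits(8)
            dst.extend([value] * length)
        else:                 # LZ
            source_id = br.get_bit()
            offset = br.get_bits(offset_bits)
            length = br.get_bits(8)
            src = dst if source_id == 0 else prev_plane
            # Copy byte by byte so overlapping copies within dst behave as in LZ77.
            for i in range(length):
                dst.append(src[offset + i])
    return bytes(dst[:size])
```
<br />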
=== Deinterleaving ===<br />
For some reason (maybe in order to reconstruct downscaled video) data is stored in interleaved order, i.e. for each pair of lines the component plane stores first line data at even positions and second line data at odd positions. Since the plane buffers may be used as the reference during the next plane decoding, this deinterleaving process is done during data read in tile reconstruction.<br />
<br />
=== Tile reconstruction ===<br />
After component plane reconstruction is done, it is split into tiles and reconstructed. Tile data is stored as bitstream LSB first and read bit-by-bit MSB first (e.g. eight-bit values in the tile header are stored bit-reversed).<br />
<br />
tile_width = get_bits(8); // should be 20<br />
tile_height = get_bits(8); // should also be 20<br />
for each tile {<br />
tile_mode = get_bits(2);<br />
step1 = get_bits(4) * 4;<br />
step2 = get_bits(4) * 4;<br />
offset1 = get_bits(4) * 2;<br />
offset2 = get_bits(4) * 2;<br />
}<br />
<br />
For each tile first two translation tables are generated using the corresponding <code>offset</code> and <code>step</code> values using the following formula:<br />
<br />
for (i = 0; i < 32; i++)<br />
tab[i] = i * step / 32 + offset - 16;<br />
<br />
And finally depending on the <code>tile_mode</code> the RGB components are restored from <code>c0</code>, <code>c1</code> and <code>c2</code> (components are listed in the order they're coded, <code>clip5()</code> is a function that clips the output value to <code>[0; 31]</code> range):<br />
<br />
switch tile_mode {<br />
case 0:<br />
r = clip5(c0 + tab2[c2]);<br />
g = clip5(c1 + tab1[c2]);<br />
b = c2;<br />
break;<br />
case 1:<br />
r = clip5(c1 + tab1[c2]);<br />
g = c2;<br />
b = clip5(c0 + tab2[c2]);<br />
break;<br />
case 2:<br />
r = c2;<br />
g = clip5(c0 + tab2[c2]);<br />
b = clip5(c1 + tab1[c2]);<br />
break;<br />
} // mode 3 seems to be not handled<br />
<br />
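The table generation and the three tile modes above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that <code>tab1</code> is built from <code>step1</code>/<code>offset1</code> and <code>tab2</code> from <code>step2</code>/<code>offset2</code>, which the description implies but does not spell out.<br />

```python
def make_table(step: int, offset: int) -> list:
    """Build one 32-entry translation table (integer division, as in the formula)."""
    return [i * step // 32 + offset - 16 for i in range(32)]

def clip5(value: int) -> int:
    """Clip to the [0; 31] range of a 5-bit component."""
    return max(0, min(31, value))

def reconstruct_rgb(tile_mode: int, c0: int, c1: int, c2: int, tab1: list, tab2: list) -> tuple:
    """Restore (r, g, b) from the three coded components of one pixel."""
    if tile_mode == 0:
        return clip5(c0 + tab2[c2]), clip5(c1 + tab1[c2]), c2
    if tile_mode == 1:
        return clip5(c1 + tab1[c2]), c2, clip5(c0 + tab2[c2])
    if tile_mode == 2:
        return c2, clip5(c0 + tab2[c2]), clip5(c1 + tab1[c2])
    raise ValueError("tile mode 3 does not seem to be handled")
```

With <code>step = 32</code> and <code>offset = 16</code> the table is the identity mapping, which makes the mode formulas easy to check by hand.<br />
<br />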
== Audio coding ==<br />
The format uses 16-bit audio with 8-bit deltas. In the beginning of the audio block there is full 16-bit sample (or two in case of stereo mode) followed by 8-bit signed deltas. If low bit of the delta is set, it should be multiplied by 512, otherwise just by 16 and added to the previous sample. In case of stereo mode deltas for different channels are interleaved.<br />
<br />
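A mono decoder for the scheme above might look like the following Python sketch. The function name is hypothetical. The sketch assumes that the initial sample is a signed little-endian value, that the low bit is tested on the delta as stored and is not cleared before scaling, and that no clipping is applied; the description does not confirm any of these points.<br />

```python
def decode_crh_audio_mono(block: bytes) -> list:
    """Decode one mono audio block into a list of sample values."""
    # Assumption: the first sample is signed 16-bit little-endian.
    sample = int.from_bytes(block[:2], "little", signed=True)
    samples = [sample]
    for byte in block[2:]:
        delta = byte - 256 if byte >= 128 else byte  # signed 8-bit delta
        scale = 512 if delta & 1 else 16             # low bit selects the multiplier
        sample += delta * scale                      # no clipping (assumption)
        samples.append(sample)
    return samples
```
<br />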
== Games using this format ==<br />
* [https://www.mobygames.com/game/2874/dark-earth/ Dark Earth] (320x200 and 16-bit offsets)<br />
* [https://www.mobygames.com/game/4186/nightmare-creatures/ Nightmare Creatures] (320x240 and 17-bit offsets)<br />
<br />
[[Category:Game Formats]]<br />
[[Category:Audio Codecs]]<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Reaper&diff=15721Reaper2023-07-31T10:52:39Z<p>Kostya: fill details</p>
<hr />
<div>* Extension: fmv<br />
<br />
Psygnosis' 1999 title [http://www.mobygames.com/game/windows/rollcage Rollcage] included a directory full of videos with the .fmv extension. The first 8 bytes of each file contain the signature "!Reaper!".<br />
<br />
The string "Reaper '95 Version 1.30, (c) Paul Hughes" can also be found in either of the game's binaries (Glide or Direct3D).<br />
<br />
The game (its PSX version) takes advantage of an [[MDEC]] chip, therefore the video encoding is an [[MPEG]] variant and audio is [[IMA ADPCM]] packaged in a custom container.<br />
<br />
== File Format ==<br />
All numbers are little-endian. File consist of a main header:<br />
<br />
s8 signature[8] -- "!Reaper!" (but only first 5 characters actually checked)<br />
s32 headersize -- size of the header and coefficients table (i.e. it's an offset of the first data chunk)<br />
s16 width -- video width<br />
s16 height -- video height<br />
s16 nframes -- number of frames<br />
s16 audiotype -- 0 - no sound, 1 - IMA sound<br />
s16 audiorate -- audio frequency<br />
s16 audiobits -- sound sample bits (always 16?)<br />
s16 audiochnl -- 1 - mono, 2 - stereo<br />
s16 unknown<br />
<br />
then followed by [[RLE]]-packed frame quantisers (size = headersize - sizeof(header)) and then followed by frame chunks.<br />
Chunk format:<br />
u32 signature -- chunk type<br />
u32 size -- chunk size, including this header<br />
u8 payload[] -- chunk payload<br />
<br />
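The main header and the chunk list above can be walked with the Python sketch below. The names <code>REAPER_HEADER</code>, <code>parse_reaper_header</code> and <code>iter_chunks</code> are hypothetical; the signature check looks only at the first five characters, as noted in the field list.<br />

```python
import struct

REAPER_HEADER = struct.Struct("<8si8h")  # 28 bytes, little-endian
CHUNK_HEADER = struct.Struct("<II")      # signature, size (including this header)

def parse_reaper_header(data: bytes) -> dict:
    (signature, headersize, width, height, nframes, audiotype,
     audiorate, audiobits, audiochnl, _unknown) = REAPER_HEADER.unpack_from(data, 0)
    if signature[:5] != b"!Reap":
        raise ValueError("not a Reaper FMV file")
    return {
        "headersize": headersize,
        "width": width,
        "height": height,
        "frames": nframes,
        "audiotype": audiotype,
        "audiorate": audiorate,
        "audiobits": audiobits,
        "channels": audiochnl,
        # Size of the RLE-packed frame quantisers that follow the header.
        "packed_quant_size": headersize - REAPER_HEADER.size,
    }

def iter_chunks(data: bytes, start: int):
    """Yield (signature, payload) pairs, starting at the headersize offset."""
    pos = start
    while pos + CHUNK_HEADER.size <= len(data):
        signature, size = CHUNK_HEADER.unpack_from(data, pos)
        if size < CHUNK_HEADER.size:
            raise ValueError("invalid chunk size")
        yield signature, data[pos + CHUNK_HEADER.size:pos + size]
        pos += size
```
<br />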
Observed chunk types:<br />
* 0x0B85120F - video chunk<br />
* 0x90FC0302 - audio chunk<br />
* 0xD6BD0106 - raw audio chunk?<br />
<br />
Run coding of frame quantisers is triggered by two consecutive values being the same (or by the previous run having size 255). E.g. <code>64 64 06</code> decodes to the value 0x64 repeated eight times and <code>62 62 FF 62 02</code> decodes to 259 repeats of the value 0x62; a run of two values is coded as <code>5F 5F 00</code>.<br />
<br />
== Video Decompression ==<br />
Video is DCT-based sequence of macroblocks that may be either motion-compensated, filled or fully decoded. It uses 7-bit YUV colourspace internally with the following conversion formulae:<br />
* first expand all YUV component values as <code>val * 2 + 1</code>;<br />
* <code>R = Y + (Cr * 91881 >> 16) + 128;</code><br />
* <code>G = Y + (Cb * -22050 >> 16) + (Cr * -46799 >> 16) + 128;</code><br />
* <code>B = Y + (Cb * 116129 >> 16) + 128;</code><br />
<br />
=== Data decoding ===<br />
<br />
Each macroblock is prefixed by the value XORed with <code>0xB6</code> and may have the following values (not coincidentally it's the same as the number of bytes following the MB type):<br />
* 1 -- motion block, next byte high nibble is signed vertical motion component and low nibble is signed horizontal motion component;<br />
* 3 -- the next three bytes are YUV values to fill the whole macroblock with;<br />
* 6 -- the next six bytes are four Y plus UV values to fill the blocks of the whole macroblock<br />
* 12 -- the same as the previous but every other byte is ignored (i.e. Y0 0 Y1 0 Y2 0 Y3 0 U 0 V 0)<br />
* 2, 4, 5, 7-11 -- should not happen<br />
* other values denote a full coded macroblock<br />
<br />
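The macroblock type byte and the motion byte from the list above can be decoded as in this Python sketch. The function names are hypothetical, and the nibbles are assumed to be 4-bit two's complement values, since the description only calls them "signed".<br />

```python
def decode_mb_type(prefix: int) -> int:
    """Undo the XOR applied to the macroblock type byte."""
    return prefix ^ 0xB6

def signed_nibble(nibble: int) -> int:
    """Interpret a 4-bit value as two's complement (assumption)."""
    return nibble - 16 if nibble & 8 else nibble

def decode_motion_byte(byte: int) -> tuple:
    """Return (vertical, horizontal) motion components for a type 1 block."""
    return signed_nibble(byte >> 4), signed_nibble(byte & 0x0F)
```
<br />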
For a full-coded macroblock data is decoded per block. First byte is the (signed) DC value, then the data for AC follows (LSB first, the loop starts from coefficient 63 and goes back to zero):<br />
<br />
* 3 bits - RL type:<br />
** 0 -- zero<br />
** 1 -- 6-bit zero run<br />
** 2 -- one<br />
** 3 and 7 -- 6-bit signed value (0x3F means escape and reading full 8-bit value)<br />
** 4 -- two zero coefficients<br />
** 5 -- 6-bit zero run plus another zero value<br />
** 6 -- minus one<br />
<br />
Scan order:<br />
0x00, 0x3f, 0x37, 0x3e, 0x3d, 0x36, 0x2f, 0x27,<br />
0x2e, 0x35, 0x3c, 0x3b, 0x34, 0x2d, 0x26, 0x1f,<br />
0x17, 0x1e, 0x25, 0x2c, 0x33, 0x3a, 0x39, 0x32,<br />
0x2b, 0x24, 0x1d, 0x16, 0x0f, 0x07, 0x0e, 0x15,<br />
0x1c, 0x23, 0x2a, 0x31, 0x38, 0x30, 0x29, 0x22,<br />
0x1b, 0x14, 0x0d, 0x06, 0x05, 0x0c, 0x13, 0x1a,<br />
0x21, 0x28, 0x20, 0x19, 0x12, 0x0b, 0x04, 0x03,<br />
0x0a, 0x11, 0x18, 0x10, 0x09, 0x02, 0x01, 0x08<br />
<br />
=== Quantiser ===<br />
Actual frame quantisation matrix is obtained from the frame quality stored in the header as<br />
quant = (100 - quality) / 2 - (100 - quality) * 22 / 100 + 2;<br />
multiplied by the default quantisation matrix.<br />
<br />
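The scalar part of the formula above is sketched below in Python. The function name is hypothetical and integer division is assumed, as in C-style pseudocode; how the result is scaled when multiplied by the default matrix is not described above and is not shown.<br />

```python
def frame_quant(quality: int) -> int:
    """Compute the per-frame quantiser scalar from the frame quality."""
    d = 100 - quality
    # Assumption: both divisions are integer divisions.
    return d // 2 - d * 22 // 100 + 2
```

Under that assumption, quality 100 gives 2, quality 50 gives 16 and quality 0 gives 30.<br />
<br />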
Default quantisation matrix:<br />
8192, 5906, 6270, 6967,<br />
8192, 10426, 15137, 29692,<br />
5906, 4258, 4520, 5023,<br />
5906, 7517, 10913, 21407,<br />
6270, 4520, 4799, 5332,<br />
6270, 7980, 11585, 22725,<br />
6967, 5023, 5332, 5925,<br />
6967, 8867, 12873, 25251,<br />
8192, 5906, 6270, 6967,<br />
8192, 10426, 15137, 29692,<br />
10426, 7517, 7980, 8867,<br />
10426, 13270, 19266, 37791,<br />
15137, 10913, 11585, 12873,<br />
15137, 19266, 27969, 54864,<br />
29692, 21407, 22725, 25251,<br />
29692, 37791, 54864, 107619<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Murder_FILM&diff=15720Murder FILM2023-07-29T16:18:44Z<p>Kostya: fill details</p>
<hr />
<div>* Extension: film<br />
* Samples: http://samples.mplayerhq.hu/game-formats/murder-film/<br />
<br />
FILM files are used in a shareware Amiga game called Murder. The format appears to encapsulate frames consisting of [[IFF]] data. The files are created using a program called [http://aminet.net/package/gfx/show/AGMSFilm2 AGMSMakeFilm].<br />
<br />
From the original description:<br />
<br />
The file format is taken from the IFF standard. Unfortunately, it is a<br />
part of the standard that few people use because of its complexity.<br />
Basically, the film is a LIST containing the frames (cells) of the<br />
animation. The LIST specifies shared properties for all the contained<br />
objects (namely, the screen size and sound speed). Each frame within<br />
the LIST is implemented as a CAT object (concatenation). The CAT holds<br />
a FORM ILBM for the image part of the frame and a FORM 8SVX for the<br />
sound track portion that goes with the image. So, programs like<br />
AGMSPlaySound that can handle fancy IFF structures work but most others<br />
will guru because they recognize it as IFF and then use buggy (probably<br />
untested) LIST and CAT reading code. One final note, I'm using a<br />
special type of LIST that requires that all the frames (CAT chunks) are<br />
all exactly the same size and have the audio and video parts in the same<br />
relative positions.<br />
<br />
The root chunk is a <code>FILM</code> list containing some <code>PROP</code> chunks and <code>CAT </code> chunks. The <code>PROP</code> chunks contain file metadata (such as program version, OS and machine) and the ILBM and 8SVX headers for the following frames. Each <code>CAT </code> chunk contains nested <code>FORM</code> chunks holding the ILBM and 8SVX <code>BODY</code> chunks.<br />
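<br />
The nesting can be explored with a generic IFF walker. The sketch below relies only on the standard IFF chunk layout (4-byte ID, 32-bit big-endian size, payload padded to an even length); the function names are hypothetical and nothing in it is specific to AGMSMakeFilm output.<br />
<br />
```python
import struct

# Group chunks start their payload with a 4-byte contents type.
GROUP_IDS = {b"FORM", b"LIST", b"CAT ", b"PROP"}


def iter_chunks(data, start=0, end=None):
    """Yield (chunk_id, payload_offset, payload_size) for chunks in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from(">4sI", data, pos)
        yield chunk_id, pos + 8, size
        pos += 8 + size + (size & 1)  # payloads are padded to an even length


def walk(data, start=0, end=None, depth=0, out=None):
    """Return a flat list of (depth, chunk_id, contents_type_or_size) entries."""
    out = [] if out is None else out
    for chunk_id, offset, size in iter_chunks(data, start, end):
        if chunk_id in GROUP_IDS:
            out.append((depth, chunk_id, data[offset:offset + 4]))
            walk(data, offset + 4, offset + size, depth + 1, out)
        else:
            out.append((depth, chunk_id, size))
    return out
```
<br />
Calling <code>walk()</code> on the contents of a FILM file should list the <code>LIST</code>, <code>PROP</code>, <code>CAT </code> and <code>FORM</code> levels in order, which makes it easy to check that every frame has the same size.<br />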
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=On2_VP3&diff=15719On2 VP32023-07-22T14:02:10Z<p>Kostya: all variants have been discovered</p>
<hr />
<div>* FOURCCs: VP30, VP31, VP32<br />
* Company: [[On2]]<br />
* Technical Descriptions: [http://multimedia.cx/vp3-format.txt http://multimedia.cx/vp3-format.txt]<br />
* Samples: http://samples.mplayerhq.hu/V-codecs/VP3/<br />
<br />
VP3 is the third video codec produced by On2. In 2002, On2 released the source code as open source and it eventually evolved into the [[Theora]] project.<br />
<br />
VP30 and VP31 are not bitstream-compatible codecs. While VP31 is open source, VP30 is not decodable via the same process. Additionally, VP32 was in the works but it was later released as [[On2 VP4]] instead.<br />
<br />
An open-source implementation of all those VP3 codec variations can be found in NihAV [https://git.nihav.org/?p=nihav.git;a=blob;f=nihav-duck/src/codecs/vp3.rs].<br />
<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=On2_VP6&diff=15718On2 VP62023-07-22T13:57:53Z<p>Kostya: now that the specification has been uncovered, the codec is no longer incomplete</p>
<hr />
<div>* FOURCCs: VP60, VP61, VP62<br />
* Company: [[On2]]<br />
* Whitepaper: http://www.on2.com/cms-data/pdf/1125607149174329.pdf ([[Mirrored Files|mirrored]])<br />
* Specification: http://multimedia.cx/mirror/vp6_format.pdf<br />
* Samples: [http://samples.mplayerhq.hu/V-codecs/VP6/ http://samples.mplayerhq.hu/V-codecs/VP6/]<br />
* Nice article from the author of the codec (Paul Wilkins): http://www.dspdesignline.com/211100053?printableArticle=true<br />
<br />
== Implementations ==<br />
<br />
An early open source implementation could be found at http://libvp62.sourceforge.net/, but<br />
was driven underground by On2 on copyright infringement claims.<br />
<br />
This specification is said to be incomplete with regard to the On2 VP6 specification.<br />
<br />
A decoder implementation may be found in the FFmpeg source file [http://svn.mplayerhq.hu/ffmpeg/trunk/libavcodec/vp6.c?view=markup vp6.c].<br />
<br />
== Format ==<br />
<br />
The aim here is to open this standard with a full description of the bitstream format and decoding process. Contributors from On2 are especially encouraged here, but it is anticipated that this section will be completed through reverse engineering and by people who saw the libvp62 source code before it was censored. <br />
<br />
Please do not submit any copyrighted text or code here.<br />
<br />
=== Introduction ===<br />
<br />
VP6 uses unidirectional ("P-frame") and intra-frame (within the current frame) prediction. Entropy coding is performed using arithmetic (range?) coding and an 8x8 iDCT is used. The format supports dynamic adjustment of encoded video resolution. There are three variants of the VP6 codec: VP60 (Simple Profile), VP61 (Advanced Profile) and VP62 (Heightened Sharpness Profile).<br />
<br />
<br />
=== Macroblocks ===<br />
<br />
Each video frame is composed of an array of 16x16 macroblocks, just like [[MPEG-2]], [[MPEG-4]] parts 2 and 10. Each [[MB]] (macroblock) takes one of the following modes ("[[MV]]" means "motion vector"):<br />
<br />
* Intra MB<br />
* Inter MB, null MV, previous frame reference<br />
* Inter MB, differential MV, previous frame reference<br />
* Inter MB, four MVs, previous frame reference<br />
* Inter MB, MV 1, previous frame reference<br />
* Inter MB, MV 2, previous frame reference<br />
* Inter MB, null MV, bookmarked frame reference<br />
* Inter MB, differential MV, bookmarked frame reference<br />
* Inter MB, MV 1, bookmarked frame reference<br />
* Inter MB, MV 2, bookmarked frame reference<br />
<br />
=== Frame Header ===<br />
<br />
The frame header commences with a section that is encoded using conventional big-endian bit packing.<br />
<br />
{| border="1"<br />
! Syntax !! Number of bits !! Type !! Semantics<br />
|- <br />
| frame_mode || 1 || Enum || 0x0 signifies an intra frame<br />
|-<br />
| qp || 6 || Unsigned || Quantization parameter valid range 0..63<br />
|-<br />
| marker || 1 || Constant || 0=VP61/62, 1=VP60<br />
|-<br />
| if (frame_mode == 0) { || || ||0 equals to INTRA_FRAME<br />
|-<br />
| version || 5 || Constant || 6=VP60/61, 7=VP60(Electronic Arts), 8=VP62<br />
|-<br />
| version2 || 2 || Constant || 0=VP60, 3=VP61/62<br />
|-<br />
| interlace || 1 || Boolean || true (1) means interlace will be used<br />
|-<br />
| if (marker==1 or version2==0) {<br />
|-<br />
| offset || 16 || Unsigned || secondary buffer offset (bytes relative to start of buffer)<br />
|-<br />
| }<br />
|-<br />
| dim_y || 8 || Unsigned || Macroblock height of video <br />
|-<br />
| dim_x || 8 || Unsigned || Macroblock width of video <br />
|-<br />
| render_y || 8 || Unsigned || Display height of video <br />
|-<br />
| render_x || 8 || Unsigned || Display width of video<br />
|-<br />
| }else{<br />
|-<br />
| if (marker==1 or version2==0) {<br />
|-<br />
| offset || 16 || Unsigned || secondary buffer offset (bytes relative to start of buffer)<br />
|-<br />
| }<br />
|-<br />
| }<br />
|}<br />
<br />
If dim_x or dim_y have values different from the previous intra frame, then the resolution of the encoded image has changed.<br />
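<br />
The fixed-width part of the header above can be parsed as in the following Python sketch. <code>BitReader</code> and <code>parse_vp6_frame_header</code> are hypothetical names. Because inter frames do not transmit version2, the sketch assumes the value is remembered from the last intra frame and passed in by the caller.<br />
<br />
```python
class BitReader:
    """Reads bits most-significant-bit first (big-endian bit packing)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, n):
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


def parse_vp6_frame_header(data, last_version2=None):
    br = BitReader(data)
    hdr = {"frame_mode": br.read(1), "qp": br.read(6), "marker": br.read(1)}
    if hdr["frame_mode"] == 0:  # intra frame
        hdr["version"] = br.read(5)
        hdr["version2"] = br.read(2)
        hdr["interlace"] = br.read(1)
        if hdr["marker"] == 1 or hdr["version2"] == 0:
            hdr["offset"] = br.read(16)
        for name in ("dim_y", "dim_x", "render_y", "render_x"):
            hdr[name] = br.read(8)
    else:
        # Assumption: version2 is carried over from the previous intra frame.
        if hdr["marker"] == 1 or last_version2 == 0:
            hdr["offset"] = br.read(16)
    hdr["header_bits"] = br.pos  # arithmetic-coded data starts after this many bits
    return hdr
```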
<br />
Arithmetic coding commences at the next bit (which should be on a byte boundary):<br />
<br />
{| border="1"<br />
! Syntax !! Type !! Semantics<br />
|-<br />
| if (frame_mode == 0) {<br />
|-<br />
| marker1 || Equiprobable 2-bit || Ignored<br />
|-<br />
| } else {<br />
|-<br />
| bookmark || Equiprobable 1-bit || bookmark == 0x1 means this frame will be the next bookmark frame<br />
|-<br />
| filter1 || Equiprobable 1-bit || <br />
|-<br />
| if (filter1 == 0x1) {<br />
|-<br />
| filter2 || Equiprobable 1-bit || <br />
|-<br />
| } <br />
|-<br />
| filter_info || Equiprobable 1-bit ||<br />
|-<br />
| }<br />
|-<br />
| if (frame_mode == 0 <nowiki>||</nowiki> filter_info == 0x1) {<br />
|-<br />
| filter_mode1 || Equiprobable 1-bit ||<br />
|-<br />
| if (filter_mode1 == 0x1) {<br />
|-<br />
| filter_threshold1 || Equiprobable 5-bit ||<br />
|-<br />
| filter_motion_param || Equiprobable 3-bit ||<br />
|-<br />
| } else {<br />
|-<br />
| filter_mode2 || Equiprobable 1-bit ||<br />
|-<br />
| }<br />
|-<br />
| filter_mode3 || Equiprobable 4-bit ||<br />
|-<br />
| } <br />
|-<br />
| marker2 || Equiprobable 1-bit || Secondary buffer encoding algorithm. 0=Range coding, 1=Huffman coding.<br />
|}<br />
<br />
If the secondary buffer is present, coefficient symbols are read from the secondary buffer using the algorithm indicated by marker2. VP60 has "the ability to switch to a faster entropy encoding strategy to ensure smooth playback" and "the ability to decode different parts of the bitstream on different sub-processors (for instance the vlx and the core), to ensure better overall system utilization" [http://www.on2.com/cms-data/pdf/1125607149174329.pdf].<br />
<br />
=== Entropy Coding ===<br />
<br />
Described here is the decoding process for the arithmetically-coded (AC) parts of the bitstream. VP6 uses a 16-bit [http://en.wikipedia.org/wiki/Range_encoding range coding] scheme to code binary symbols.<br />
<br />
The AC decoder maintains three state variables: ''code'', ''mask'' and ''high''. <br />
<br />
====Initialization====<br />
<br />
At initialization, the first two bytes of the AC bitstream are shifted into ''code''. The variable ''high'' is set to 0xff00. The variable ''mask'' is set to 0xffff.<br />
<br />
====Decoding a Binary Value====<br />
<br />
Each binary symbol has an associated probability ''p'' in the range 0 to 0xff. <br />
<br />
A threshold, ''t'', is computed thus:<br />
<br />
: ''t'' = 0x100 + ( 0xff00 & ( ( (''high''-0x100) * ''p'' ) >> 8 ) )<br />
<br />
Equiprobable binary symbols are treated somewhat differently:<br />
<br />
: ''t'' = 0xff00 & ( (''high''+0x100) >> 1 )<br />
<br />
The binary value may then be decoded by comparing ''code'' and ''t''. If ''code'' is less than ''t'', the binary value is decoded as 0. If ''code'' is equal to or greater than ''t'', the binary value is decoded as 1.<br />
<br />
If a 1 was decoded, then<br />
<br />
: ''high'' = ''high'' - ''t''<br />
: ''code'' = ''code'' - ''t''<br />
<br />
If a 0 was decoded, then<br />
<br />
: ''high'' = ''t''<br />
<br />
The following renormalization is now repeated while (''high'' & 0x8000) is zero, i.e. until the top bit of ''high'' is set again.<br />
<br />
: ''high'' = 2 * ''high''<br />
: ''code'' = 2 * ''code''<br />
: ''mask'' = 2 * ''mask''<br />
: if ((''mask'' & 0xff) == 0x00) {<br />
:: ''code'' = ''code'' | <i>next byte from bitstream</i><br />
:: ''mask'' = ''mask'' | 0xff<br />
:}<br />
<br />
====Decoding an Equiprobable n-bit Integer Value====<br />
<br />
Integer values are coded as a big-endian sequence of equiprobable binary values. To decode an n-bit equiprobable integer value, n equiprobable binary values should be decoded using the sequence above and left-shifted into an integral result variable.<br />
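<br />
The whole decoding procedure can be sketched in Python as follows. This is an illustration rather than a reference decoder: the class and method names are hypothetical, all state variables are assumed to be truncated to 16 bits, reads past the end of the data are assumed to return zero, and renormalization runs while the top bit of ''high'' is clear, since that is the condition that keeps ''high'' within 16 bits.<br />
<br />
```python
class RangeDecoder:
    def __init__(self, data):
        self.data = data
        self.pos = 2
        self.code = (data[0] << 8) | data[1]  # first two bytes of the stream
        self.high = 0xFF00
        self.mask = 0xFFFF

    def _next_byte(self):
        byte = self.data[self.pos] if self.pos < len(self.data) else 0
        self.pos += 1
        return byte

    def _decode(self, t):
        if self.code >= t:
            bit = 1
            self.high -= t
            self.code -= t
        else:
            bit = 0
            self.high = t
        # Renormalize until the top bit of high is set again.
        while (self.high & 0x8000) == 0:
            self.high = (self.high << 1) & 0xFFFF
            self.code = (self.code << 1) & 0xFFFF
            self.mask = (self.mask << 1) & 0xFFFF
            if (self.mask & 0xFF) == 0:
                self.code |= self._next_byte()
                self.mask |= 0xFF
        return bit

    def decode_bit(self, p):
        """Decode a binary value with probability p (0..0xff)."""
        t = 0x100 + (0xFF00 & (((self.high - 0x100) * p) >> 8))
        return self._decode(t)

    def decode_equiprobable_bit(self):
        t = 0xFF00 & ((self.high + 0x100) >> 1)
        return self._decode(t)

    def decode_bits(self, n):
        """Decode an n-bit equiprobable integer, most significant bit first."""
        value = 0
        for _ in range(n):
            value = (value << 1) | self.decode_equiprobable_bit()
        return value
```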
<br />
=== Inverse DCT ===<br />
<br />
Inverse DCT is performed on 8x8 blocks of pixels. The algorithm used is the same (or a small variation) of the one used for the [[On2 VP3|VP3]] decoder in [[FFmpeg]] [http://www1.mplayerhq.hu/cgi-bin/cvsweb.cgi/ffmpeg/libavcodec/vp3dsp.c?rev=1.9&content-type=text/x-cvsweb-markup&cvsroot=FFMpeg], the original vp3 iDCT code is here [http://svn.xiph.org/trunk/vp32/CoreLibs/CDXV/Vp31/dx/generic/IDctPart.c].<br />
<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=YLC&diff=15717YLC2023-07-22T13:56:11Z<p>Kostya: fill missing details</p>
<hr />
<div>* Website: http://spring-fragrance.mints.ne.jp/aviutl/ (Japanese)<br />
* FourCC: YLC0<br />
* Sample: http://samples.libav.org/V-codecs/ylc0.avi<br />
<br />
The website names it as "YUY2 Lossless Codec".<br />
<br />
=== Format ===<br />
<br />
bytes 0- 3 'YLC0'<br />
bytes 4- 7 zero<br />
bytes 8-11 offset to the Huffman table descriptors (should be 16)<br />
bytes 12-15 offset to the compressed YUY2 data<br />
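<br />
A small Python sketch of reading this header. It assumes the offsets are stored as 32-bit little-endian values (the layout above does not state the byte order), and <code>parse_ylc_header</code> is a hypothetical helper name.<br />
<br />
```python
import struct


def parse_ylc_header(frame):
    """Split the 16-byte YLC0 frame header into its fields."""
    if len(frame) < 16:
        raise ValueError("frame too short for a YLC0 header")
    magic, reserved, table_offset, data_offset = struct.unpack_from("<4sIII", frame, 0)
    if magic != b"YLC0":
        raise ValueError("not a YLC0 frame")
    return {
        "reserved": reserved,          # expected to be zero
        "table_offset": table_offset,  # expected to be 16
        "data_offset": data_offset,    # start of the compressed YUY2 data
    }
```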
<br />
The bitstream is read as 32-bit little-endian words, MSB first.<br />
<br />
Table descriptors contain 256 gamma'-coded weights for each of the four Huffman tables (special, Y delta, U delta, V delta).<br />
<br />
Delta values are decoded in the following way:<br />
<br />
if (get_bit()) {<br />
val = decode_sym(tree1);<br />
if (val < 0xE1) {<br />
output predefined YUYV quad from the constant table;<br />
} else {<br />
for (i = 0; i < (val - 0xDF); i++)<br />
output {0, 0, 0, 0}<br />
}<br />
} else {<br />
u0 = decode_sym(tree2);<br />
v0 = decode_sym(tree3);<br />
u1 = decode_sym(tree2);<br />
v1 = decode_sym(tree4);<br />
<br />
output { u0, v0, u0 + u1, v1 }<br />
}<br />
<br />
Then the reconstruction stage follows: first left-predict the top line (each component independently), then use <code>left + top - topleft</code> prediction on the rest of the pixel components. Unlike many other codecs, prediction here wraps around (i.e. the left predictor for the first pixel is the last pixel on the previous line) and the initial left and top-left predictors are set to 128 for all components.<br />
<br />
[[Category:Video Codecs]]<br />
[[Category:Lossless Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=DTS-HD&diff=15716DTS-HD2023-07-22T13:39:20Z<p>Kostya: the specification is available, so it's not really incomplete</p>
<hr />
<div>* Company: [[DTS Inc.]]<br />
* Whitepaper: http://www.dtsonline.com/media/DTS-HD_WhitePaper.pdf<br />
* [http://www.etsi.org/deliver/etsi_ts/102100_102199/102114/01.04.01_60/ts_102114v010401p.pdf Specification (PDF)] (v1.4.1, 2012-09-28); [[Mirrored Files|mirrored locally]].<br />
<br />
DTS-HD is an audio coding technology developed by DTS and targeted for the HD generation of optical discs (namely [[Blu-Ray]] and [[HD-DVD]]). The technology specification embodies various coding modes and extensions to the [[DTS]] core.<br />
<br />
DTS-HD frames may either extend DTS core (then they will follow each DTS core frame) or form an independent stream of DTS-HD frames only. In the former case DTS-HD frame may contain some extensions to the core frame (e.g. XBR for the additional resolution, XXCh for the additional channels or even XLL for reconstructing the original audio bit-perfectly). Independent DTS-HD frames may contain several streams internally, for example core audio (with possible extensions signalled explicitly), DTS-HD Master Audio (lossless audio) or DTS Express (a low-bitrate codec related to [[QDesign Music Codec]]).<br />
<br />
The DTS-HD packages begin with the text "dX %", which is in hex<br />
"64 58 20 25". There's a length field in the DTS-HD package which<br />
tells you how long the DTS-HD package is exactly. If you skip this<br />
length, you should end up on the next DTS core package. The<br />
length of the DTS-HD package is stored in bytes 6-8. The lowest<br />
four bits of the sixth byte are the most significant bits of the length<br />
field. All 8 bits of the seventh byte are used for the length field. And<br />
the 3 most significant bits of the eighth byte are the least significant<br />
bits of the length field. Finally you need to add 1 to the length field.<br />
So the length is calculated as "((sixthByte & 0xf) << 11) + (seventhByte<br />
<< 3) + ((eighthByte >> 5) & 7) + 1".<br />
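<br />
The length calculation can be written out with explicit parentheses as in the Python sketch below. The function takes the three byte values directly, so it makes no assumption about whether "sixth" is counted from zero or one; <code>dts_hd_package_length</code> is a hypothetical name.<br />
<br />
```python
DTS_HD_SYNC = bytes.fromhex("64582025")  # the text "dX %"


def dts_hd_package_length(sixth, seventh, eighth):
    """Combine the 4 + 8 + 3 length bits described above and add one."""
    return ((sixth & 0x0F) << 11) + (seventh << 3) + ((eighth >> 5) & 0x07) + 1
```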
<br />
[[Category:Audio Codecs]]<br />
[[Category:Lossless Audio Codecs]]<br />
[[Category:Multichannel Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Siren&diff=15715Siren2023-07-22T13:30:39Z<p>Kostya: fill missing details</p>
<hr />
<div>* Format: 0x28E<br />
* Company: [[Microsoft]]<br />
* Decoder: https://github.com/kakaroto/libsiren<br />
<br />
This audio codec is used by MSN Messenger for sending/receiving voice clips. It is also one of the available codecs for the 'Computer Call' feature (audio conference). <br />
It is based on the G.722.1 codec and differs from it in only a few ways. For example, it stores the encoded data as 2-byte big-endian words instead of little-endian ones, adds a few bits (mostly 2) as a header to each frame to specify the sampling rate, and appends a 4-bit footer containing a checksum of the frame (while keeping the block size at 40 bytes per frame).<br />
<br />
The codec has been reverse engineered and is available for download at: https://amsn.svn.sourceforge.net/svnroot/amsn/trunk/amsn/utils/tcl_siren/src/ <br />
(excluding the tcl_siren* files).<br />
<br />
This codec is the same as [[Vivo Siren]] except for a slightly different bitstream format (e.g. no sample rate ID in the beginning or four-bit CRC at the end and frame data is padded) and some parameters related to sample scaling.<br />
<br />
[[Category:Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Indeo_Audio&diff=15714Indeo Audio2023-07-22T13:21:15Z<p>Kostya: fill missing details</p>
<hr />
<div>* Format: 0x402<br />
* Company: [[Intel]], now [[Ligos]]<br />
* Samples: http://samples.mplayerhq.hu/A-codecs/Indeo_audio/<br />
<br />
This is the next version of [[Intel Music Coder]] with slightly different coding parameters<br />
(all algorithms are the same) and a joint stereo mode. Also, unlike IMC, frequency bands are calculated based on the actual sample rate.<br />
<br />
Joint stereo mode is implemented as a simple <code>(L+R, L-R)</code> transform after both channels are reconstructed.<br />
<br />
=== Header format ===<br />
<br />
bit 0 - should be zero<br />
bit 1 - should be zero<br />
bit 2 - stereo flag<br />
bits 3-8 - block size code<br />
bit 9 - codebook set selector (0 - no escapes, 1 - with escapes)<br />
bit 10 - codebook selector<br />
bit 11 - use raw coding<br />
<br />
bit 12 - SNR offset (-30 or -10)<br />
<br />
<br />
Block size code:<br />
<br />
0, 34-50 - (block_size + 14) 16-bit words<br />
33 - 32 words<br />
1-32 - (block_size + 15) words<br />
51-63 - invalid value<br />
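<br />
The mapping above can be expressed as a small Python function. This is a sketch in which <code>block_size</code> in the table is taken to mean the 6-bit code itself, and <code>block_size_words</code> is a hypothetical name.<br />
<br />
```python
def block_size_words(code):
    """Translate the 6-bit block size code into a size in 16-bit words."""
    if code == 33:
        return 32
    if code == 0 or 34 <= code <= 50:
        return code + 14
    if 1 <= code <= 32:
        return code + 15
    raise ValueError("invalid block size code: %d" % code)
```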
<br />
=== Levels ===<br />
<br />
There are 32 levels, coded either with one of the codebooks or in raw form.<br />
<br />
Decoding Huffman-coded levels is the same as in [[Intel Music Coder]] but with different tables.<br />
<br />
Raw form:<br />
<br />
the largest coefficient position - 5 bits<br />
quantised largest coefficient - 7 bits<br />
other coefficients - 4 bits per coefficient<br />
<br />
Level reconstruction is also the same as in [[Intel Music Coder]] except for raw form.<br />
<br />
Maximum raw coefficient is reconstructed as <code>20000 / pow(10.0, coef * 0.057031251)</code>. Other coefficients are reconstructed as <code>pow(10.0, -idx * 0.4375)</code>.<br />
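<br />
The two formulas translate directly into Python. The sketch below only evaluates them as written; whether the other levels are subsequently scaled by the maximum is not stated here, so that step is not shown. The function names are hypothetical.<br />
<br />
```python
def raw_max_level(coef):
    """Reconstruct the largest level from its 7-bit quantised value."""
    return 20000.0 / 10.0 ** (coef * 0.057031251)


def raw_other_level(idx):
    """Reconstruct one of the remaining levels from its 4-bit index."""
    return 10.0 ** (-idx * 0.4375)
```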
<br />
[[Category:Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Duck_TrueMotion_2&diff=15713Duck TrueMotion 22023-07-21T16:48:44Z<p>Kostya: fill mi</p>
<hr />
<div>* FOURCCs: TM20<br />
* Company: [[On2|On2 (formerly Duck)]]<br />
* Patents: U.S. # 6,327,304, "Apparatus and method to digitally compress video signals"<br />
* Samples: http://samples.mplayerhq.hu/V-codecs/TM20/ or http://ghostarchive.org/samples/V-codecs/TM20/<br />
<br />
Duck TrueMotion 2 relies on differential coding of samples in a YUV colorspace and coding those deltas using Huffman codes.<br />
<br />
== Codec principles ==<br />
This codec operates on 4x4 blocks and employs data separation, so packed frame data is composed from these segments:<br />
# luma deltas for hi-res blocks<br />
# luma deltas for low-res blocks<br />
# chroma deltas for hi-res blocks<br />
# chroma deltas for med-res and low-res blocks<br />
# values for updating whole block<br />
# motion vectors<br />
# block types<br />
<br />
Each segment is compressed with its own Huffman codes (the Huffman tree is stored in the segment header), thus gaining compression from grouping similar data.<br />
<br />
Blocks are unpacked in this way:<br />
<br />
LAST 4 ELEMENTS<br />
| | | |<br />
V V V V<br />
D0 -> +d00 +d10 +d20 +d30 -> D0<br />
D1 -> +d01 +d11 +d21 +d31 -> D1<br />
D2 -> +d02 +d12 +d22 +d32 -> D2<br />
D3 -> +d03 +d13 +d23 +d33 -> D3<br />
| | | |<br />
V V V V<br />
LAST 4 ELEMENTS<br />
<br />
When the current block is a low-res block, the average of each pair of the last elements is calculated and used instead of them. When another type of block occurs (update, motion, still), the deltas need to be recalculated after the operation. The same applies to chroma blocks, which are just 2x2 in size.<br />
<br />
Block types:<br />
# Hi-resolution block: 16 luma deltas, 8 chroma deltas<br />
# Medium-resolution block: 16 luma deltas, 2 chroma deltas<br />
# Low-resolution block: 4 luma deltas, 2 chroma deltas<br />
# Null block: no deltas, interpolate it from last 4 elements and deltas<br />
# Still block: copy data from previous frame (and update Dx)<br />
# Update block: 16 luma deltas, 8 chroma deltas, all applied independently, so recalculation is needed<br />
# Motion block: copy block from previous frame with offset provided by motion vector.<br />
<br />
Frame decoding is done this way:<br />
<br />
read header<br />
unpack all segments<br />
for each block in frame {<br />
get block type from 'block types' segment<br />
decode corresponding block type<br />
}<br />
<br />
=== Bitstream format ===<br />
The frame header consists of a 32-bit codec version (0x100 or 0x101) followed by 36 bytes whose values can be ignored. All values are little-endian, LSB first.<br />
<br />
Stream data has the following format:<br />
* 32-bit stream size in 32-bit words<br />
* 31-bit number of tokens coded<br />
* 1-bit delta table present flag<br />
** (optional) delta table:<br />
** 9-bit number of deltas<br />
** 5-bit delta bits<br />
** NxM bits - delta values<br />
** padding to 32-bit word<br />
* 32-bit tree length<br />
* 32-bit algorithm<br />
* Huffman tree data<br />
* padding to 32-bit word<br />
* 32-bit token data length (when zero, stream tokens all have the same value)<br />
* Huffman-coded token data<br />
<br />
Huffman tree descriptor:<br />
5 bits - symbol bits<br />
5 bits - maximum codeword length<br />
5 bits - minimum codeword length<br />
17 bits - number of leaves<br />
tree data (recursive, 1 bit to signal leaf/node; leaf contains symbol value)<br />
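<br />
The recursive tree data can be read as in the following Python sketch. Several details are assumptions, since the descriptor above does not spell them out: a 0 bit is taken to mark a leaf and a 1 bit an internal node, symbol values are assembled most significant bit first, and the bit source is an abstract iterator so that the word packing is left to the caller. <code>read_tree</code> is a hypothetical name.<br />
<br />
```python
def read_tree(bits, symbol_bits, prefix=""):
    """Return a {codeword: symbol} mapping read from an iterator of 0/1 bits."""
    if next(bits) == 0:  # assumed: 0 marks a leaf
        symbol = 0
        for _ in range(symbol_bits):
            symbol = (symbol << 1) | next(bits)
        return {prefix: symbol}
    # Internal node: the left subtree extends the codeword with 0, the right one with 1.
    codes = read_tree(bits, symbol_bits, prefix + "0")
    codes.update(read_tree(bits, symbol_bits, prefix + "1"))
    return codes
```
<br />
A real decoder would also check the resulting codeword lengths against the minimum and maximum from the descriptor and the leaf count against the number of leaves.<br />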
<br />
== Games Using Duck TrueMotion 2 ==<br />
These software titles are known to use the Duck TrueMotion 2 video codec to encode full motion video:<br />
<br />
[http://www.mobygames.com/game/windows/final-fantasy-vii Final Fantasy VII (Windows)]<br />
<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Duck_TrueMotion_2_Realtime&diff=15712Duck TrueMotion 2 Realtime2023-07-21T16:13:10Z<p>Kostya: the description is pretty complete</p>
<hr />
<div>* FOURCCs: TR20<br />
* Company: [[On2|On2 (formerly Duck)]]<br />
<br />
Duck TrueMotion 2 Realtime relies on differential coding of samples in a YUV colorspace. For speed reasons there is no interframe coding and no Huffman coding.<br />
<br />
Internally frame is coded as YUV410:<br />
R = Y + Cr * 1.370705<br />
G = Y + Cr * -0.698 + Cb * -0.337633<br />
B = Y + Cb * 1.732446<br />
<br />
=== Header ===<br />
<br />
The frame header is obfuscated: the header size is obtained from the first byte by cyclically rotating it right and masking, <code>size = rcr(src[0], 5) & 0x7F;</code><br />
<br />
Header data is expected to be at least 8 bytes.<br />
<br />
byte 0 --- (meaningless)<br />
byte 1 --- should be 0x11<br />
byte 2 --- delta mode<br />
byte 3 --- should be 0x05<br />
byte 4 --- horizontal flag (should be either <code>0x00</code> - no scaling or <code>0x01</code> - 2x scale)<br />
byte 5 --- unused<br />
bytes 6-7 --- frame height<br />
bytes 8-9 --- frame width<br />
<br />
Delta mode selects which delta table will be used by signalling number of bits it uses (only 2-, 3- and 4-bit delta tables are known). Data should be treated as 32-bit little endian words and should be read LSB first.<br />
<br />
Right after the header comes a 4-byte packed data size, followed by the compressed data for the Y, U and V planes.<br />
<br />
Each plane is coded as the cumulative difference from the top neighbour (i.e. you update delta value for the line and add it to the top neighbour value), delta value is kept internally but is clipped on output. <br />
<br />
==== Luma plane ====<br />
<br />
for (y = 0; y < height; y++) {<br />
diff = 0;<br />
for (x = 0; x < width; x++) { // or maybe width / 2 when scaling flag is 1<br />
diff += delta_tab[delta_mode][get_bits(delta_size)];<br />
dst[x + y * stride] = clip_uint8((y ? dst[x + (y - 1) * stride] : 0) + diff);<br />
}<br />
}<br />
<br />
==== Chroma planes ====<br />
<br />
for (y = 0; y < height / 4; y++) {<br />
diff = 0;<br />
for (x = 0; x < width / 4; x++) {<br />
diff += delta_tab[delta_mode][get_bits(delta_size)];<br />
dst[x + y * stride] = clip_uint8((y ? dst[x + (y - 1) * stride] : 0x80) + diff);<br />
}<br />
}<br />
<br />
=== Delta tables ===<br />
<br />
4-bit:<br />
<br />
1, -1, 2, -3, 8, -8, 0x12, -0x12, 0x24, -0x24, 0x36, -0x36, 0x60, -0x60, 0x90, -0x90<br />
<br />
3-bit:<br />
<br />
2, -3, 8, -8, 0x12, -0x12, 0x24, -0x24<br />
<br />
2-bit:<br />
<br />
5, -7, 0x24, -0x24 <br />
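<br />
Plane reconstruction with these tables can be sketched in Python as below. Bit reading is left out: the function takes an iterable of already extracted delta indices, and <code>first_row_top</code> is the value used in place of the missing top neighbour on the first line (0 for luma, 0x80 for chroma, as in the loops above). The names are hypothetical.<br />
<br />
```python
DELTA_TABLES = {
    2: [5, -7, 0x24, -0x24],
    3: [2, -3, 8, -8, 0x12, -0x12, 0x24, -0x24],
    4: [1, -1, 2, -3, 8, -8, 0x12, -0x12,
        0x24, -0x24, 0x36, -0x36, 0x60, -0x60, 0x90, -0x90],
}


def clip_uint8(value):
    return max(0, min(255, value))


def decode_plane(indices, delta_bits, width, height, first_row_top):
    """Rebuild one plane from delta indices; returns a list of rows."""
    table = DELTA_TABLES[delta_bits]
    indices = iter(indices)
    rows = []
    for y in range(height):
        diff = 0  # the running delta is reset at the start of every line
        row = []
        for x in range(width):
            diff += table[next(indices)]
            top = rows[y - 1][x] if y else first_row_top
            row.append(clip_uint8(top + diff))  # only the output is clipped
        rows.append(row)
    return rows
```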
<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Duck_TrueMotion_1&diff=15711Duck TrueMotion 12023-07-21T16:10:06Z<p>Kostya: fill the missing bits</p>
<hr />
<div>* FOURCCs: DUCK,PVEZ<br />
* Company: [[On2|On2 (formerly Duck)]]<br />
* Patents: US #6,181,822, "Data compression apparatus and method"<br />
* Samples: http://samples.mplayerhq.hu/V-codecs/DUCK/ and http://ghostarchive.org/samples/V-codecs/DUCK/<br />
<br />
<br />
Duck TrueMotion 1 (TM1) is the first video codec developed by The Duck Corporation. It uses [[Differential Coding|differential pulse code modulation]] and interframe differencing. The primary application for which Duck TrueMotion 1 was developed is gaming software, typically on PC or Sega Saturn titles.<br />
<br />
== Underlying Concepts ==<br />
<br />
TM1 operates on bidirectional delta quantization. The mathematical premise of delta coding is simple addition. For example, take the following sequence of numbers:<br />
<br />
82 84 81 80 86 88 85<br />
<br />
All of the numbers are reasonably large (on a scale of 0..255 in this example). However, they are quite similar to each other. Using delta coding, only the differences between successive numbers are stored:<br />
<br />
82 +2 -3 -1 +6 +2 -3<br />
<br />
The first number is still large but since the remaining numbers are clustered close to each other, the deltas are relatively small. Thus, the deltas require less information to encode. <br />
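<br />
The example above can be reproduced with a few lines of Python. This is a plain illustration of one-dimensional delta coding, not TM1 itself, and the function names are hypothetical.<br />
<br />
```python
def delta_encode(values):
    """Keep the first value and store each later value as a difference."""
    out = []
    prev = 0
    for value in values:
        out.append(value - prev)
        prev = value
    return out


def delta_decode(deltas):
    """Undo delta_encode by accumulating the differences."""
    out = []
    current = 0
    for delta in deltas:
        current += delta
        out.append(current)
    return out
```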
<br />
TM1 takes this concept and extends it to 2 dimensions. A particular pixel is assumed to have a pixel directly above it and a pixel directly to the left (if neither of these pixels exist in the frame, e.g., for the top-left corner pixel, 0 is used in place of the non-existent pixels). The current pixel is decoded as the sum of the up and left pixels, plus a delta from the encoded video bitstream.<br />
<br />
encoded video bitstream: ...D5 D6 D7 D8...<br />
<br />
decoded video frame:<br />
A B C D<br />
E F G H<br />
<br />
In this example, the encoded video bitstream is sitting at delta D5 when it is time to decode pixel E. A is the pixel above E. There is no pixel to the left of E, so 0 is used as the left value. Thus:<br />
<br />
E = A + 0 + D5<br />
<br />
In the case of pixel F, both the up and left pixel values (called the vertical and horizontal predictors, respectively) are available:<br />
<br />
F = B + E + D6<br />
<br />
That is the general idea behind decoding TM1 data. The actual decoding algorithm is more involved. The TM1 bytestream actually contains a series of indices that point into tables with the delta values to be<br />
applied to the vertical and horizontal predictor pixels. These tables only specify small deltas to be added to pixel predictors. When a larger delta is needed because the delta between 2 numbers is too large, then a special bytestream code indicates that the next delta is to be multiplied by 5 before it is applied.<br />
<br />
TM1 operates on 4x4 macroblocks of pixels. Each block in a frame can be broken into 4 2x2 blocks, 2 halves (either 4x2 or 2x4), or encoded as a 4x4 block. The block type is encoded in the frame header.<br />
<br />
While the TM1 algorithm operates on RGB colorspace data at the input and output level, it borrows some ideas from YUV colorspaces. For more information of RGB and YUV colorspaces, see the References section.<br />
<br />
TM1 uses a modified colorspace that embodies luminance (Y) and chrominance (C) information encoded as RGB deltas. Since Y information is more important to the human eye than C information, the Y data must be updated more frequently than the C data (i.e., more Y predictors than C predictors are applied to the image). For every pixel within a given block in a macroblock, a Y predictor must be applied. However, only one C predictor is applied for each block in the macroblock.<br />
<br />
== TrueMotion v1 Frame Format and Header ==<br />
<br />
All multi-byte numbers are stored in little endian format.<br />
<br />
An encoded intraframe (keyframe) of TM1 data is laid out as:<br />
<br />
header | predictor indices<br />
<br />
An encoded interframe is laid out as:<br />
<br />
header | block change bits | predictor indices<br />
<br />
The difference between the 2 types of frames is that an interframe has a section of bits that specify which blocks in the frame are unchanged from the previous frame.<br />
<br />
A TM1 header is quasi-encrypted with a logical XOR operation. This is probably done to provide some obfuscation of the header and thwart casual inspection of the data format.<br />
<br />
An encoded TM1 frame begins with one byte that indicates the length of the decrypted header, stored with a dummy high bit and rotated left by 5. To obtain the actual length from byte B:<br />
<br />
L = ((B >> 5) | (B << 3)) & 0x7F<br />
<br />
Then, decrypt the header by starting with byte 1 in the encoded frame (indexing from 0) and XORing each byte with its successive byte. Assuming the header is of length L bytes as computed above, and that the encoded header starts at buffer[1] (buffer[0] had the rotated length), the decode process is:<br />
<br />
for (i = 1; i < L; i++)<br />
decoded_header[i - 1] = buffer[i] ^ buffer[i + 1];<br />
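<br />
Both steps can be combined into a short Python sketch. The function names are hypothetical, and the bounds check is an addition that is not part of the description above.<br />
<br />
```python
def tm1_header_length(first_byte):
    """Undo the rotation of the length byte and drop the dummy high bit."""
    return ((first_byte >> 5) | (first_byte << 3)) & 0x7F


def tm1_decode_header(buffer):
    """Return the de-obfuscated header bytes of an encoded TM1 frame."""
    length = tm1_header_length(buffer[0])
    if length >= len(buffer):
        raise ValueError("frame too short for the signalled header length")
    # Each decoded byte is the XOR of an encoded byte with its successor.
    return bytes(buffer[i] ^ buffer[i + 1] for i in range(1, length))
```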
<br />
The decoded header data structure is laid out as follows (depending on the version and header type not all of the subsequent fields may be present though):<br />
<br />
byte 0 compression method<br />
byte 1 delta set<br />
byte 2 vector set<br />
bytes 3-4 frame height<br />
bytes 5-6 frame width<br />
bytes 7-8 checksum<br />
byte 9 version<br />
byte 10 header type<br />
byte 11 flags<br />
byte 12 control<br />
bytes 13-14 x offset<br />
bytes 15-16 y offset<br />
bytes 17-18 width<br />
bytes 19-20 height<br />
<br />
The compression method field indicates the type of compression used to encode this frame. There are 2 general types: 16-bit and 24-bit. Further, each has 4 block sizes to select from. The valid compression types are<br />
<br />
0, 9, 11, 13, 15: NOP frames; frame is unchanged from previous frame<br />
1: 16-bit 4x4 (V)<br />
2: 16-bit 4x4 (H)<br />
3: 16-bit 4x2 (V)<br />
4: 16-bit 4x2 (H)<br />
5: 16-bit 2x4 (V)<br />
6: 16-bit 2x4 (H)<br />
7: 16-bit 2x2 (V)<br />
8: 16-bit 2x2 (H)<br />
10: 24-bit 4x4 (H)<br />
12: 24-bit 4x2 (H)<br />
14: 24-bit 2x4 (H)<br />
16: 24-bit 2x2 (H)<br />
>16: invalid compression type (or TrueMotion RT)<br />
<br />
The (H) and (V) designations come from the original Duck source code. It is unclear what they mean beyond the usual horizontal and vertical designations of video terminology.<br />
<br />
The delta set and vector set fields are used to generate the set of predictor tables that will be used to decode this frame.<br />
<br />
The height and width fields should be the same as those specified in the AVI file that contains the data.<br />
<br />
The checksum field appears to contain the frame's sequence number modulo 512. The first frame is 0x0000 and the next frame is 0x0001. Frame #511 has a checksum of 0x01FF while frame #512 wraps around to 0x0000.<br />
<br />
If the version field is less than 2, then the frame is intracoded (this may indicate that early versions of the coding method were purely intracoded). If the version field is greater than or equal to 2, the header type field decides: for header type 2 or 3, the frame is an intraframe when bit 4 of the flags field is set (flags & 0x10) and an interframe otherwise; a header type greater than 3 is invalid; any other header type means the frame is intracoded.<br />
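<br />
The decision can be written as a small Python function. This sketch reads the rule as "header types 2 and 3 yield an interframe when bit 4 of the flags is clear"; <code>is_intra_frame</code> is a hypothetical name.<br />
<br />
```python
def is_intra_frame(version, header_type, flags):
    """Return True for an intraframe, False for an interframe."""
    if version < 2:
        return True
    if header_type in (2, 3):
        return bool(flags & 0x10)
    if header_type > 3:
        raise ValueError("invalid header type: %d" % header_type)
    return True
```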
<br />
The control field contains the information for which platform the video file was mastered (e.g. PC or SEGA console).<br />
<br />
The x & y offset, width, and height fields pertain to a sprite mode.<br />
<br />
== Overall frame coding ==<br />
Frame is split into 4x4 pixel blocks (or 2x4 pixel blocks for 24-bit mode; each pixel should be repeated twice during the reconstruction phase). Depending on the coding mode, each block may contain 1-4 chroma deltas (for each sub-block) and luma deltas for each pixel. Data is still coded per-line though. Delta values obtained during compression are substituted with the values from the corresponding delta table (with possible escapes, more about them below), grouped into pairs and sent to the Tunstall code compressor with fixed codebook (i.e. the coding method that replaces a sequence of input codes with one fixed-width output value, TM1 codebooks replace sequences of 2-8 delta pairs with a single byte).<br />
<br />
In case the delta value is too large, it is coded as a sum of a small delta and an escape delta value. In this case the Tunstall coder flushes the output sequence, sends a zero byte to signal the escape and codes delta indices for the escape values. In 16-bit mode those values are the ordinary delta values multiplied by five; in 24-bit mode they come from the so-called fat tables instead.<br />
<br />
== 16-bit Data ==<br />
<br />
Deltas in 16-bit mode are applied to two pixels at once, so horizontal predictor contains two pixels as well. Luma delta pair codes deltas that should be applied to all components in the pixel, chroma delta codes deltas for red and blue components of the pixel pair (i.e. first delta codes red offset for both pixels, second delta codes blue offset for both pixels).<br />
<br />
=== Sprite mode ===<br />
Sprite mode augments 16-bit intra-only coding mode with 4x4 sub-blocks with some additional modes signalled by two bits per block. First bit tells whether the block is coded like the ordinary TM1 data, second bit signals that the block is either coded with the transparency information (transmitted as a delta pair right after luma delta pair) or that the block is completely transparent.<br />
<br />
A working implementation of sprite support can be found in NihAV: https://git.nihav.org/?p=nihav.git;a=blob;f=nihav-duck/src/codecs/truemotion1.rs;hb=HEAD <br />
<br />
== 24-bit Data ==<br />
In 24-bit mode delta pairs code components of a single pixel. Because of the nature of the compression, deltas should be applied as a single 32-bit word to the <code>((R << 16) | (G << 8) | B)</code> packed pixel value. Luma delta pair should be unpacked as <code>((lo << 16) | (lo << 8) | hi)</code>, chroma delta pair should be unpacked as <code>((lo << 16) | hi)</code>.<br />
<br />
== Duck TrueMotion v1 Tables ==<br />
[https://github.com/FFmpeg/FFmpeg/blob/5541cffa17a8c45004e5ceeda52d4d6b2acee037/libavcodec/truemotion1.c FFmpeg implementation]<br />
<br />
== Games Using Duck TrueMotion 1 ==<br />
These software titles are known to use the Duck TrueMotion 1 video codec to encode full motion video:<br />
<br />
* [https://www.mobygames.com/game/bubble-bobble-also-featuring-rainbow-islands Bubble Bobble special CD release with Rainbow Island (DOS)]<br />
* [https://www.mobygames.com/game/saturn/blazing-dragons Blazing Dragons (Sega Saturn)]<br />
* [https://www.mobygames.com/game/sega-saturn/congo-the-movie-the-lost-city-of-zinj Congo: The Movie - The Lost City of Zinj (Sega Saturn)]<br />
* [https://www.mobygames.com/game/dos/d D (DOS version only)]<br />
* [https://www.mobygames.com/game/saturn/horde The Horde (Sega Saturn)]<br />
* [https://www.mobygames.com/game/saturn/nhl-all-star-hockey NHL All-Star Hockey (Sega Saturn)]<br />
* [https://www.mobygames.com/game/phantasmagoria-a-puzzle-of-flesh Phantasmagoria: A Puzzle of Flesh (DOS & Windows)]<br />
* [https://www.mobygames.com/game/santa-fe-mysteries-the-elk-moon-murder Santa Fe Mysteries: The Elk Moon Murder (DOS & Windows)]<br />
* [https://www.mobygames.com/game/saturn/solar-eclipse Solar Eclipse (Sega Saturn)]<br />
* [https://www.mobygames.com/game/saturn/sonic-3d-blast Sonic 3D Blast (Sega Saturn)] - [https://www.youtube.com/watch?v=Fpqs4Fta3Ho Video]<br />
* [https://www.mobygames.com/game/spycraft-the-great-game Spycraft: The Great Game (DOS & Windows)]<br />
* [https://www.mobygames.com/game/saturn/theme-park Theme Park (Sega Saturn)]<br />
* [https://www.mobygames.com/game/saturn/virtua-cop-2 Virtua Cop 2 (Sega Saturn)]<br />
* [https://www.mobygames.com/game/saturn/virtua-fighter-2 Virtua Fighter 2 (Sega Saturn)]<br />
* [https://www.mobygames.com/game/windows/zork-grand-inquisitor Zork: Grand Inquisitor (Windows)]<br />
* [https://www.mobygames.com/game/windows/zork-nemesis-the-forbidden-lands Zork Nemesis: The Forbidden Lands (Windows)]<br />
<br />
[[Category:Video Codecs]]<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=NUVision&diff=15703NUVision2023-07-08T09:11:02Z<p>Kostya: fill information</p>
<hr />
<div>* Company: Zoran Ltd.<br />
* FOURCCs: <code>NTN1</code>, <code>NTN2</code>, <code>NTN3</code><br />
<br />
This is a rather simple video codec based on delta coding. Each frame is split into lines, each line is split into slices of width 24, 16 or 8, and each slice is coded using its own prediction method and delta quantisation.<br />
<br />
A frame consists of a frame header and line data packed into chunks. A word is a 16-bit little-endian value (or a two-byte pair when used for mode/delta coding).<br />
<br />
=== Frame header ===<br />
word - should be 0x55 0xAA<br />
byte - header size (normally 12)<br />
byte - frame sequence number (reset with each new frame group)<br />
word - unknown, probably CRC<br />
byte - unknown <br />
byte - probably intra frame flag (should be 0x80)<br />
word - frame width<br />
word - frame height<br />
<br />
=== Line chunks ===<br />
1 byte - should be always 0x5A<br />
1 byte - chunk size in 16-bit words<br />
1 byte - line number<br />
luma slices coding modes<br />
chroma slices coding modes<br />
luma slice data<br />
chroma slice data (may be just one component per line)<br />
<br />
=== Line chunk coding ===<br />
As mentioned above, each line is split into 24-pixel slices plus a tail. Each slice can be coded using one of four different modes; the modes are packed into bytes as two-bit values (MSB first) and padded to an even length (the same packing is used for delta modes).<br />
<br />
Slice coding modes:<br />
* 0 - do not update slice from the previous frame<br />
* 1 - unpack deltas and add them to the slice from the previous frame<br />
* 2 - copy top slice (should not be present for the first line)<br />
* 3 - unpack deltas and use top slice for prediction (or do horizontal prediction for the first line)<br />
<br />
Delta values are coded in the following manner:<br />
1 byte - quantiser<br />
packed delta modes<br />
escape values<br />
<br />
There are four delta modes: -1, 0, 1 and escape. For example, when delta mode 2 is read, the output value is equal to the quantiser value. In case of delta mode 3 the actual value is stored right after all the mode flags.<br />
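<br />
A minimal sketch of the delta unpacking follows. It rests on several assumptions that the description does not confirm: modes 0, 1 and 2 map to -quantiser, 0 and +quantiser, the mode flags are padded to an even number of bytes, and escape values are single signed bytes stored in order of appearance. The function name is hypothetical:<br />
<br />
```python
def decode_deltas(data: bytes, count: int) -> tuple[list[int], int]:
    """Decode `count` deltas; return them with the number of bytes consumed."""
    quant = data[0]
    flag_bytes = (count + 3) // 4      # four two-bit modes per byte
    flag_bytes += flag_bytes & 1       # assumed padding to an even byte count
    esc_pos = 1 + flag_bytes           # escapes follow all the mode flags
    deltas = []
    for i in range(count):
        # Two-bit fields are taken MSB first.
        mode = (data[1 + i // 4] >> (6 - 2 * (i % 4))) & 3
        if mode == 3:
            raw = data[esc_pos]
            esc_pos += 1
            # Assumption: escape values are signed bytes.
            deltas.append(raw - 256 if raw >= 128 else raw)
        else:
            deltas.append((mode - 1) * quant)
    return deltas, esc_pos
```
<br />
The same two-bit extraction should apply to the slice coding modes, which use the same packing.<br />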
<br />
[[Category:Video Codecs]]<br />
[[Category:Incomplete Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=OSQ&diff=15702OSQ2023-06-28T14:30:59Z<p>Kostya: add OSQ description</p>
<hr />
<div>* Company: Steinberg<br />
* Extension: osq<br />
<br />
OSQ (short for Original Sound Quality) is a lossless audio format reportedly created by Philippe Goutier for storing the audio output of the WaveLab software. The format is based on [[Shorten]].<br />
<br />
=== Header ===<br />
The file starts with a 56-byte header (all values are little-endian):<br />
<br />
4 bytes - "OSQ "<br />
1 byte - version? (should be 1)<br />
1 byte - unknown (always zero?)<br />
1 byte - bits per sample<br />
1 byte - number of channels (mono or stereo)<br />
4 bytes - unknown<br />
4 bytes - sample rate?<br />
4 bytes - number of samples<br />
36 bytes - unknown<br />
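<br />
As an illustration, the header layout above could be read like this. The class and field names are made up for the example, and the fields marked with a question mark in the table are taken at face value:<br />
<br />
```python
import struct
from typing import NamedTuple


class OsqHeader(NamedTuple):
    version: int
    bits_per_sample: int
    channels: int
    sample_rate: int
    num_samples: int


def parse_osq_header(buf: bytes) -> OsqHeader:
    """Parse the first 56 bytes of an OSQ file (little-endian fields)."""
    if len(buf) < 56:
        raise ValueError("OSQ header needs 56 bytes")
    # magic, four single-byte fields, then three 32-bit fields
    magic, version, _unk0, bits, channels, _unk1, rate, samples = \
        struct.unpack_from("<4s4B3I", buf, 0)
    if magic != b"OSQ ":
        raise ValueError("not an OSQ file")
    return OsqHeader(version, bits, channels, rate, samples)
```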
<br />
=== Block format ===<br />
Blocks consist of samples with some redundancy removed by one of the fixed prediction schemes and residues coded as fixed-width bitfields or Rice codes. A block starts with the block header(s) for each channel, and then the data for each channel follows.<br />
<br />
Block header format is the following:<br />
1 bit - should be zero<br />
1 bit - stereo decorrelation present<br />
1 bit - use 8-bit samples internally<br />
for each channel {<br />
rice(5) - prediction mode<br />
rice(3) - coefficient coding mode<br />
if coefficient coding mode < 3 {<br />
rice(4) - residue Rice parameter<br />
} else {<br />
rice(4) - number of bits per residue<br />
}<br />
}<br />
for each frame {<br />
for each channel {<br />
coefficient residue<br />
}<br />
}<br />
0-7 bits - byte boundary align<br />
<br />
Coefficient residues are coded either as signed fixed-width bitfield or using Rice code plus sign bit (even for zeros). In case of coding mode 2 an adaptive Rice coding is used (in this case Rice parameter is calculated using the statistics of decoded residues).<br />
<br />
Prediction modes are the following (for brevity <code>predict2 = dst[-1] * 2 - dst[-2]</code> and <code>predict3 = dst[-1] * 3 - dst[-2] * 3 + dst[-3]</code>):<br />
* 0 - no prediction<br />
* 1 - add previous sample<br />
* 2 - add previous sample and half of previous difference<br />
* 3 - <code>predict2</code><br />
* 4 - <code>predict2 + prev_diff / 2</code><br />
* 5 - <code>predict3</code><br />
* 6 - <code>predict3 + prev_diff / 2</code><br />
* 7 - <code>(predict2 + predict3) / 2 + prev_diff / 2</code><br />
* 8 - <code>(predict2 + predict3) / 2</code><br />
* 9 - <code>(predict2 * 2 + predict3) / 3 + prev_diff / 2</code><br />
* 10 - <code>(predict2 + predict3 * 2) / 3 + prev_diff / 2</code><br />
* 11 - <code>(dst[-1] + dst[-2]) / 2</code><br />
* 12 - <code>dst[-2]</code><br />
* 13 - <code>(dst[-4] + dst[-2]) / 2</code><br />
* 14 - <code>(dst[-1] + predict2) / 2 + prev_diff / 2</code><br />
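<br />
The mode list can be written down as a lookup, shown here as a sketch. The exact definition of <code>prev_diff</code> and the rounding of the divisions are not specified above, so the example takes <code>prev_diff</code> as a parameter and uses floor division; both are assumptions, as is the function name:<br />
<br />
```python
def osq_predict(mode: int, hist: list[int], prev_diff: int) -> int:
    """Prediction for one sample; hist[-1] is dst[-1], hist[-2] is dst[-2], etc.

    `hist` must hold at least four previously decoded samples.
    """
    d1, d2, d3, d4 = hist[-1], hist[-2], hist[-3], hist[-4]
    predict2 = d1 * 2 - d2
    predict3 = d1 * 3 - d2 * 3 + d3
    half = prev_diff // 2  # assumption: floor division
    predictions = [
        0,                                      # 0: no prediction
        d1,                                     # 1: previous sample
        d1 + half,                              # 2
        predict2,                               # 3
        predict2 + half,                        # 4
        predict3,                               # 5
        predict3 + half,                        # 6
        (predict2 + predict3) // 2 + half,      # 7
        (predict2 + predict3) // 2,             # 8
        (predict2 * 2 + predict3) // 3 + half,  # 9
        (predict2 + predict3 * 2) // 3 + half,  # 10
        (d1 + d2) // 2,                         # 11
        d2,                                     # 12
        (d4 + d2) // 2,                         # 13
        (d1 + predict2) // 2 + half,            # 14
    ]
    return predictions[mode]
```
<br />
The decoded sample would then be the residue plus the returned prediction.<br />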
<br />
After the residue is read, prediction should be applied to it, then an optional clearing of low eight bits and finally, for the decoded pair of samples, stereo decorrelation in form <code>L+R, L-R</code> may be applied.<br />
<br />
[[Category:Container Formats]]<br />
[[Category:Audio Codecs]]<br />
[[Category:Lossless Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=NXL&diff=15701NXL2023-06-21T16:02:35Z<p>Kostya: add yet another game format description</p>
<hr />
<div>NXL is a format used for cutscenes in [https://www.mobygames.com/game/898/the-lawnmower-man/ The Lawnmower Man] game. The format is big-endian, uses chunks padded to 64 bytes and stores raw video and audio.<br />
<br />
NXL files comprise a series of chunk headers with optional payload following. The first chunk should be of type 1.<br />
<br />
Known chunk types:<br />
* 1 -- file header (no other payload)<br />
* 3 -- video frame<br />
* 4 -- audio frame<br />
* 5 -- something, probably signals a required delay<br />
* 255 -- end of file marker<br />
<br />
=== Header chunk ===<br />
1 byte - should be 1 (chunk type)<br />
1 byte - should be 1 (file version)<br />
2 bytes - padding<br />
4 bytes - chunk size (should be 64)<br />
4 bytes - should be zero (chunk number)<br />
4 bytes - unknown<br />
4 bytes - should be "NXL1"<br />
2 bytes - video width<br />
2 bytes - video height<br />
2 bytes - number of video planes<br />
4 bytes - unknown<br />
4 bytes - maximum video chunk size?<br />
4 bytes - audio sampling rate?<br />
2 bytes - some audio parameter?<br />
24 bytes - padding<br />
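<br />
A sketch of reading this 64-byte header chunk with big-endian fields is given below. The dictionary keys are invented for the example, and the fields marked as uncertain in the table are passed through unchanged:<br />
<br />
```python
import struct

NXL_HEADER_FORMAT = ">BB2xIII4sHHHIIIH"  # 40 bytes, followed by 24 bytes of padding


def parse_nxl_header(chunk: bytes) -> dict:
    """Parse the NXL file header chunk (chunk type 1)."""
    if len(chunk) < 64:
        raise ValueError("NXL header chunk needs 64 bytes")
    (ctype, version, size, number, _unk0, magic, width, height, planes,
     _unk1, max_video, rate, audio_param) = struct.unpack_from(NXL_HEADER_FORMAT, chunk, 0)
    if ctype != 1 or magic != b"NXL1":
        raise ValueError("not an NXL header chunk")
    return {
        "version": version,
        "chunk_size": size,
        "chunk_number": number,
        "width": width,
        "height": height,
        "planes": planes,
        "max_video_chunk_size": max_video,
        "audio_rate": rate,
        "audio_param": audio_param,
    }
```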
<br />
=== Audio chunk ===<br />
1 byte - should be 4<br />
3 bytes - padding<br />
4 bytes - chunk size<br />
4 bytes - chunk number<br />
4 bytes - unknown<br />
2 bytes - sampling rate<br />
2 bytes - actual audio payload size<br />
16 bytes - unknown<br />
28 bytes - padding<br />
<br />
Audio is raw 8-bit unsigned PCM.<br />
<br />
=== Video chunk ===<br />
1 byte - should be 3<br />
3 bytes - padding<br />
4 bytes - chunk size<br />
4 bytes - chunk number<br />
4 bytes - unknown<br />
2 bytes - current frame width<br />
2 bytes - current frame height<br />
2 bytes - current frame left offset<br />
2 bytes - current frame top offset<br />
1 byte - scaling mode (0 - none, 1 - double horizontal, 2 - double vertical, 3 - scale 2x)<br />
1 byte - frame flags (bit 0 seems to signal delta frame, bit 1 also signals something)<br />
2 bytes - number of colours?<br />
4 bytes - actual video payload size<br />
32 bytes - padding<br />
<br />
Video payload starts with VGA palette (usually for 16 colours) followed by raw video data in planar format. I.e. for each line palette indices are split in bits for each position, those are packed together (MSB first) with LSB of the index stored first. E.g. entries <code>4 3 0 1 ...</code> will be stored as <code>0101... | 0100... | 1000... | 0000...</code><br />
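<br />
The planar layout can be illustrated with a short conversion routine. This is a sketch that assumes each bit plane of a line is padded to a whole number of bytes and that the planes of one line are stored one after another; the function name is hypothetical:<br />
<br />
```python
def unpack_planar_line(data: bytes, width: int, planes: int) -> list[int]:
    """Convert one planar line into a list of palette indices."""
    row_bytes = (width + 7) // 8
    pixels = [0] * width
    for plane in range(planes):
        row = data[plane * row_bytes:(plane + 1) * row_bytes]
        for x in range(width):
            # Bits are packed MSB first; plane 0 carries the LSB of each index.
            bit = (row[x // 8] >> (7 - x % 8)) & 1
            pixels[x] |= bit << plane
    return pixels
```
<br />
Feeding it the four plane bytes from the example above (<code>0101</code>, <code>0100</code>, <code>1000</code>, <code>0000</code>, each padded with zero bits) gives back the entries <code>4 3 0 1</code>.<br />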
<br />
[[Category:Game Format]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Eurocom_FMV&diff=15700Eurocom FMV2023-06-15T12:21:39Z<p>Kostya: Created page with "* Extension: fmv * Company: Eurocom Developments Ltd. This is a simple cutscene format used at least in [https://www.mobygames.com/game/6925/machine-hunter/ Machine Hunter] game. Header format (all values are little-endian): 4 bytes - "AFMV" 2 bytes - number of frames 1 byte - initial background value 768 bytes - palette 1 byte - padding The header is followed by frame data which starts with 32-bit frame size and 32-bit number of opcode quads. Opco..."</p>
<hr />
<div>* Extension: fmv<br />
* Company: Eurocom Developments Ltd.<br />
<br />
This is a simple cutscene format used at least in the game [https://www.mobygames.com/game/6925/machine-hunter/ Machine Hunter].<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - "AFMV"<br />
2 bytes - number of frames<br />
1 byte - initial background value<br />
768 bytes - palette<br />
1 byte - padding<br />
<br />
The header is followed by frame data which starts with a 32-bit frame size and a 32-bit number of opcode quads. An opcode quad contains two pairs of (skip, run) values that tell how many quads of pixels should be skipped or read. The new pixel data follows the opcode quad. E.g. <code>10 01 05 00 02 03 04 05</code> means skip 16*4 pixels, output pixels <code>02 03 04 05</code>, skip 5*4 pixels and output no new pixels.<br />
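<br />
The frame update can be sketched as follows. The example assumes that the pixels for both (skip, run) pairs follow their opcode quad in order and that the caller has already read the frame size and the number of opcode quads; the function name is made up:<br />
<br />
```python
def apply_fmv_ops(frame: bytearray, data: bytes, num_quads: int) -> int:
    """Apply opcode quads to `frame` in place; return the bytes consumed."""
    src = 0  # read position in the opcode/pixel data
    dst = 0  # write position in the frame
    for _ in range(num_quads):
        skip1, run1, skip2, run2 = data[src:src + 4]
        src += 4
        for skip, run in ((skip1, run1), (skip2, run2)):
            dst += skip * 4            # skip counts are in quads of pixels
            count = run * 4            # so are run counts
            if dst + count > len(frame) or src + count > len(data):
                raise ValueError("opcode runs past the frame or the input")
            frame[dst:dst + count] = data[src:src + count]
            src += count
            dst += count
    return src
```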
<br />
[[Category:Game Formats]]<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=IAVF&diff=15699IAVF2023-02-24T15:19:55Z<p>Kostya: fill information about Ripper AVI</p>
<hr />
<div>* Extension: AVI<br />
* Company: Take-Two Interactive<br />
<br />
This format packs another animation format for (presumably) making videos more interactive. It embeds audio tracks and [[Flic Video]] or [[Smacker]] video.<br />
<br />
Internally the file consists of a 145-byte header and various commands, each with a 14-byte header.<br />
<br />
File header:<br />
8 bytes - "IAVF2.00"<br />
4 bytes - always zero?<br />
4 bytes - always zero?<br />
4 bytes - number of frames<br />
4 bytes - always zero?<br />
4 bytes - always zero?<br />
2 bytes - sampling rate<br />
1 byte - number of channels<br />
1 byte - bits per sample<br />
4 bytes - bytes per second<br />
4 bytes - audio block align<br />
1 byte - unknown<br />
4 bytes - some offset or size<br />
2 bytes - frames per second<br />
2 bytes - width<br />
2 bytes - height<br />
4 bytes - some size<br />
4 bytes - some size<br />
4 bytes - some size<br />
4 bytes - some size<br />
 (the remaining 78 bytes are zero)<br />
<br />
Command header:<br />
2 bytes - command ID<br />
4 bytes - metainformation 1<br />
4 bytes - metainformation 2<br />
4 bytes - metainformation 3<br />
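<br />
A sketch of reading a command header is shown below. The byte order is not stated above; little-endian is assumed here because the format comes from a PC game, and the names are invented for the example:<br />
<br />
```python
import struct

COMMAND_HEADER_SIZE = 14


def parse_command_header(buf: bytes, offset: int = 0) -> tuple[int, int, int, int]:
    """Return (command ID, meta 1, meta 2, meta 3) read at `offset`."""
    if offset + COMMAND_HEADER_SIZE > len(buf):
        raise ValueError("truncated command header")
    # Assumption: all four fields are little-endian.
    return struct.unpack_from("<H3I", buf, offset)
```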
<br />
Known commands:<br />
* 0x66 - audio payload. First field is sequence number, second field is the expected audio block size, third field is actual data length<br />
* 0x67 - some pause command<br />
* 0x68 - clear frame command (no payload expected)<br />
* 0x69 - 20-byte mouse region information followed by 128-byte FLI header<br />
* 0x6A - 20-byte mouse region information followed by Smacker header (the same as in SMK file but without frame sizes)<br />
* 0x6B - unknown, probably FLI frame data<br />
* 0x6C - SMK frame<br />
* 0x70 - EOF (not necessarily present)<br />
* 0x75 - some audio command<br />
* 0x76 - unknown<br />
* 0x77 - unknown, probably related to Smacker<br />
* 0x78 - Smacker preload data<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=WavArc&diff=15697WavArc2023-01-23T13:24:03Z<p>Kostya: fill in the description</p>
<hr />
<div>* Original link: [http://www.firstpr.com.au/audiocomp/lossless/wavarc/ WavArc]<br />
* Extension: .wa<br />
* Developer: Dennis Lee<br />
<br />
WA or WavArc is a program for manipulating archives of several compressed audio files. There are several audio compression formats employed.<br />
<br />
== Archive format ==<br />
WA files do not have any special headers and contain a series of archive entries with the following header (all values are little-endian):<br />
1 byte - original filename length<br />
N bytes - original filename<br />
1 byte - should be in 0-5 range<br />
4 bytes - compression method name<br />
4 bytes - original file size<br />
4 bytes - compressed file size<br />
4 bytes - file timestamp (DOS format)<br />
4 bytes - CRC32 (Ethernet one, with 0xedb88320 polynomial)<br />
M bytes - the original WAV header (parsed by WA to get the audio parameters)<br />
<br />
Known compression methods (actually only the digit matters):<br />
* <code>0CPY</code><br />
* <code>1DIF</code><br />
* <code>2SLP</code><br />
* <code>3NLP</code><br />
* <code>4ALP</code><br />
* <code>5ELP</code><br />
<br />
The first one is simple copy, the others are based on LPC and Rice codes (method 5 may employ arithmetic coding as well). All methods (beside the copy, of course) split audio data into short blocks and may apply different compression to them. This is signalled by the block type.<br />
<br />
Rice codes can be signed or unsigned. Unsigned Rice codes (or <code>rice(k)</code>) are represented as a run of zero bits plus additional <code>k</code> bits. Signed Rice codes (or <code>rice_s(k)</code>) are unsigned Rice codes mapped as 0, -1, 1, -2, 2... Bitstream is big-endian MSB first.<br />
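<br />
The two Rice variants might be read as in the sketch below. It assumes that the run of zero bits is terminated by a one bit and that the run length forms the upper part of the value; the class and function names are illustrative:<br />
<br />
```python
class BitReader:
    """Reads bits MSB first from a byte string."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits

    def bit(self) -> int:
        byte = self.data[self.pos >> 3]
        value = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return value

    def bits(self, count: int) -> int:
        value = 0
        for _ in range(count):
            value = (value << 1) | self.bit()
        return value


def rice(br: BitReader, k: int) -> int:
    """Unsigned Rice code: a run of zero bits, then k extra bits."""
    run = 0
    while br.bit() == 0:  # assumption: a one bit ends the run
        run += 1
    return (run << k) | br.bits(k)


def rice_s(br: BitReader, k: int) -> int:
    """Signed Rice code, mapping 0, 1, 2, 3, 4... to 0, -1, 1, -2, 2..."""
    value = rice(br, k)
    return -((value + 1) >> 1) if value & 1 else value >> 1
```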
<br />
=== Compression method 1 ===<br />
Audio is split into interleaved chunks and uses the same fixed prediction modes as [[Shorten]]. For stereo sound there's an additional flag between left and right channel blocks signalling whether the right channel is coded as the difference to the left channel or not. Block types are signalled with <code>rice(1)</code> values; the normal block size is 256 samples but it can be set to a lower value with block type 7.<br />
<br />
Data is coded with Rice codes using a fixed parameter <code>k</code> transmitted before block data (for types 0-3). For 8-bit audio it's <code>rice(1)</code>, for 16-bit audio it's <code>rice(2)</code>. Delta values are transmitted as <code>rice_s(k + 1)</code>.<br />
<br />
Block types:<br />
* 0 - samples have no prediction;<br />
* 1 - samples are coded as the difference to the previous sample;<br />
* 2 - prediction is <code>src[-1] * 2 - src[-2]</code>;<br />
* 3 - prediction is <code>src[-1] * 3 - src[-2] * 3 + src[-3]</code>;<br />
* 4 - zero block;<br />
* 5 - fill block with constant 8- or 16-bit value (stored as unsigned);<br />
* 6 - read <code>rice(2)</code> shift for data samples and decode no data;<br />
* 7 - read new 8-bit block size and decode no data;<br />
* 8 - end of file<br />
<br />
=== Compression methods 2-4 ===<br />
This is a development of compression method 1, now with arbitrary LPC filters, more block types and the default block size set to 570 samples. Block types 1-9 map to block types of 0-8 of compression method 1 except that block size is transmitted as <code>rice(8)</code> block size (it should be no more than 570). Block type 0 is a new LPC coding.<br />
<br />
Block type 0:<br />
rice(2) - filter order (<= 70)<br />
Norder x rice_s(2) - filter coefficients<br />
Nsamples x rice_s(k + 1) - deltas<br />
<br />
Sample prediction is performed as <code>((sum_(0..Norder) filter[i] * src[-(i + 1)]) + 15) >> 4</code>.<br />
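<br />
The prediction formula translates directly into code. In this sketch <code>history</code> holds the previously decoded samples (most recent last) and the function name is hypothetical:<br />
<br />
```python
def lpc_predict(history: list[int], coeffs: list[int]) -> int:
    """Predict the next sample from previous samples and filter coefficients."""
    # filter[i] is applied to src[-(i + 1)], i.e. the most recent sample first.
    acc = sum(c * history[-(i + 1)] for i, c in enumerate(coeffs))
    # Python's >> on negative integers is an arithmetic (flooring) shift.
    return (acc + 15) >> 4
```
<br />
The reconstructed sample would then be this prediction plus the decoded delta.<br />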
<br />
=== Compression method 5 ===<br />
This is also a development of the previous compression methods, now with arithmetic coding for deltas as an alternative. Another feature is splitting LPC-predicted data in two parts, initial one with the same length as LPC filter and the rest. The second part is LPC predicted (in the same way as in compression method 4) using zeroes for initial sample history, the first part uses some fixed prediction.<br />
<br />
Data coding order is still the same: Rice code parameter for modes 0-7, LPC filter for the modes with LPC. Modes 13-20 are coded with static arithmetic coding described below (modes 0-7 use Rice codes as before).<br />
<br />
Block modes (residue coding is described above, only restoration part is mentioned):<br />
* 0/13 - LPC prediction with no split;<br />
* 1/14 - LPC with previous sample prediction for the initial samples;<br />
* 2/15 - LPC with <code>src[-1] * 2 - src[-2]</code> prediction for the initial samples;<br />
* 3/16 - predict from previous sample<br />
* 4/17 - no prediction<br />
* 5/18 - prediction is <code>src[-1] * 2 - src[-2]</code>;<br />
* 6/19 - LPC with <code>src[-1] * 3 - src[-2] * 3 + src[-3]</code> prediction for the initial samples;<br />
* 7/20 - prediction is <code>src[-1] * 3 - src[-2] * 3 + src[-3]</code>;<br />
* 8 - zero block<br />
* 9 - fill block<br />
* 10 - set shift<br />
* 11 - set block length<br />
* 12 - EOF<br />
<br />
Arithmetic coding data has a 12-bit prefix specifying the compressed data size in bits. The data starts with the symbol frequency table and is followed by the arithmetic coder data. Frequencies are stored as ranges in the form <code>(start value, end value, probabilities for start, start+1,..., end)</code>. A zero <code>start value</code> anywhere except in the first pair means end of table. Model values represent deltas in -128..127 range.<br />
<br />
Arithmetic coder is the classic one with 16-bit range.<br />
<br />
Initialisation:<br />
high = 0xFFFF;<br />
low = 0;<br />
value = get 16 bits from the stream (MSB first)<br />
<br />
Symbol decoding:<br />
 model_range = model[256]; // total cumulative frequency<br />
 range = high - low + 1;<br />
 helper = ((value - low) * model_range + (model_range - 1)) / range;<br />
 idx = index of the model symbol whose cumulative range contains helper<br />
 high = low + range * model[idx + 1] / model_range - 1;<br />
 low += range * model[idx] / model_range;<br />
 return idx - 128;<br />
<br />
Normalisation:<br />
for (;;) {<br />
   if (((high & 0x8000) ^ (low & 0x8000)) != 0) {<br />
if ((high & 0x4000) == 0 && (low & 0x4000) != 0)<br />
return;<br />
value ^= 0x8000;<br />
low &= 0x3FFF;<br />
high |= 0x4000;<br />
}<br />
low <<= 1;<br />
high = high * 2 + 1;<br />
value = value * 2 + get_bit();<br />
}<br />
<br />
[[Category: Audio Codecs]]<br />
[[Category: Lossless Audio Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=UMV&diff=15695UMV2022-12-16T11:22:44Z<p>Kostya: fill an updated description</p>
<hr />
<div>* Extension: UMV<br />
* Samples: http://samples.mplayerhq.hu/game-formats/umv/<br />
<br />
UMV is a full motion video file format used in several DOS games like [http://www.mobygames.com/game/dos/are-you-afraid-of-the-dark-the-tale-of-orpheos-curse Are You Afraid of the Dark? The Tale of Orpheo's Curse] and [https://www.mobygames.com/game/dracula-unleashed Dracula Unleashed].<br />
<br />
The format is split into individual packets, with no special header.<br />
Each packet starts with 4 bytes packet size (big-endian, including the 4 bytes for the size itself), followed by the size value of the previous packet (0 if there is no previous one).<br />
The following 4 bytes are packet type.<br />
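<br />
Walking the packet chain could look like the sketch below. It assumes that the previous-size and type fields are 32-bit big-endian values like the size field, and it stops at a packet whose reported size is too small to advance (such as the audio packets with size 0 mentioned below); the function name is invented:<br />
<br />
```python
import struct


def iter_umv_packets(data: bytes):
    """Yield (offset, size, previous size, type) for each packet header."""
    pos = 0
    while pos + 12 <= len(data):
        size, prev_size, ptype = struct.unpack_from(">III", data, pos)
        yield pos, size, prev_size, ptype
        if size < 12:
            # The size includes the header, so smaller values cannot be followed.
            break
        pos += size
```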
<br />
Packets are grouped together, each group starting with packet type 2 and usually ending with packet type 0x80 (which usually has reported size 0 and occupies all the space until the next packet group or file end). Additionally there may be alternative scenes stored after the main video. Alternative scenes data has different chunk types but video compression remains the same.<br />
<br />
There are the following packet types known:<br />
* 0x01 -- video header<br />
* 0x02 -- packet group header, its first 32-bit value is an offset to the start of the next packet group<br />
* 0x04 -- delta frame<br />
* 0x10 -- intra frame for alternative scenes<br />
* 0x20 -- intra frame<br />
* 0x40 -- global palette<br />
* 0x80 -- PCM audio<br />
* 0x100 -- probably alternative scenes header<br />
* 0x200 -- probably offsets for alternative scene frames<br />
<br />
Video header usually consists of ten 32-bit words:<br />
* unknown<br />
* unknown<br />
* video width (160)<br />
* video height (100)<br />
* framerate?<br />
* audio sample rate<br />
* unknown<br />
* unknown<br />
* offset to alternative scenes data (or 0 if not present)<br />
* unknown<br />
<br />
Palette chunk format:<br />
* 32-bit start colour<br />
* 32-bit number of colours to update (for <i>Dracula</i> value 255 actually means 256 colours)<br />
* full-scale palette data<br />
<br />
Frame data starts with 32-bit frame number, 32-bit local offset to audio data (local means offset inside the packet group) and some additional fields: intra frames (type 0x20) have palette data offset, alternative intra frames (type 0x10) have three additional fields, delta frames have no additional information.<br />
<br />
Intra frames are stored in raw form (i.e. usually 16000 bytes for 160x100 frame) followed by palette change data (the same as palette chunk).<br />
<br />
Delta frames employ a rather simple compression scheme for conveying pixel changes: first a byte is read and treated as a bitmask (MSB first) telling which of the following 8 pixels should be updated with values read from the bitstream. E.g. <code>02 2A</code> will skip 6 pixels, set the following pixel to <code>0x2A</code> and skip one more pixel. There's a special case though: code 0x05 may be used to signal a long skip, so <code>05 00 aa bb</code> means updating two pixels and <code>05 nz aa bb</code> (with a non-zero second byte) means skipping <code>(aa * 256 + bb) * 8</code> pixels.<br />
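<br />
A sketch of the delta frame update follows. It reads the byte after a <code>0x05</code> mask as a flag (zero for an ordinary mask, non-zero for a long skip), which is an interpretation of the examples above rather than a confirmed rule; the function name is hypothetical:<br />
<br />
```python
def apply_umv_delta(frame: bytearray, data: bytes) -> None:
    """Apply delta frame data to `frame` in place."""
    src = 0  # read position in the delta data
    dst = 0  # current pixel position in the frame
    while src < len(data):
        mask = data[src]
        src += 1
        if mask == 0x05:
            flag = data[src]
            src += 1
            if flag != 0:
                # Long skip: 16-bit big-endian count of 8-pixel groups.
                dst += ((data[src] << 8) | data[src + 1]) * 8
                src += 2
                continue
        for bit in range(7, -1, -1):  # MSB first
            if mask & (1 << bit):
                frame[dst] = data[src]
                src += 1
            dst += 1
```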
<br />
[[Category:Video Codecs]]<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CI2&diff=15694CI22022-12-13T13:33:46Z<p>Kostya: create a redirect for CI2</p>
<hr />
<div>#redirect [[CNM]]<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CNM&diff=15693CNM2022-12-13T13:32:12Z<p>Kostya: mention CI2 game</p>
<hr />
<div>* Company: [[Arxel Tribe]]<br />
* Extension: cnm, ci2<br />
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/]<br />
<br />
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen]. The CI2 is the next iteration of CNM with slightly different compression that is used in [https://www.mobygames.com/game/seven-games-of-the-soul Faust: The Seven Games of the Soul].<br />
<br />
== Container format ==<br />
<br />
Container has the following structure:<br />
* magic <code>CNM UNR\0</code><br />
* header<br />
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present)<br />
* frames<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
1 byte - unknown<br />
4 bytes - image width<br />
4 bytes - image height<br />
2 bytes - unknown<br />
1 byte - number of audio tracks<br />
4 bytes - number of video frames?<br />
4 bytes - number of frames repeated?<br />
4 bytes - size of offsets table (v1 only)<br />
152 bytes - always zero?<br />
when audio is present for each track:<br />
1 byte - number of channels<br />
   1 byte  - bits per sample<br />
4 bytes - audio rate<br />
10 bytes - unused?<br />
<br />
Each frame is prefixed by a byte containing its type. Known frame types:<br />
* 0x41 - audio data<br />
* 0x42 - audio data<br />
* 0x53 - image<br />
* 0x54 - tile data<br />
* 0x55 - image (v2)<br />
* 0x5A - audio data<br />
<br />
Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below.<br />
<br />
== Video compression for version 1 ==<br />
<br />
Each frame is an independently compressed image (in bottom-up format) split into tiles.<br />
Frame header:<br />
4 bytes - payload size (not counting the header)<br />
4 bytes - offset to the colour data<br />
2 bytes - number of tiles<br />
2 bytes - tile data size<br />
4 bytes - width<br />
4 bytes - height<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
3 bytes - unused?<br />
<br />
Colour data may contain either raw tile pixels (32-bit BGR0) or it may be packed. In that case tile data size is set to 4 or 2 and deltas stored right after it. Overall tile restoration algorithm is the following:<br />
<br />
copy 16 bytes (4x1 tile) from the stream<br />
for (tile = 1; tile < num_tiles; tile++) {<br />
tile_data[tile] = tile_data[tile - 1];<br />
bits = get_bits(3) + 1; //the same bit reading as below, bits=8 should not happen<br />
for (i = 0; i < 16; i++) {<br />
delta = get_bits(bits);<br />
if (delta && get_bit())<br />
delta = -delta;<br />
tile_data[tile][i] += delta;<br />
}<br />
}<br />
<br />
<br />
Tile control data is compressed using variable amount of bits, bits are stored MSB first. Tile index is read depending on the number of tiles: if it can fit into 10 bits then it's ten bits, if it can fit into 11 bits then it's 11 bits, otherwise it's 12 bits.<br />
<br />
Single tile decoding flow:<br />
<br />
if (!getbit()) {<br />
offset = get_bits(tile_index_bits);<br />
copy tile data from the colour data using offset*16<br />
} else { // copy existing tile<br />
decode motion vector, copy tile to which it points to<br />
(e.g. -1,0 means previous tile and 0,-1 means top tile)<br />
}<br />
<br />
Motion vector codebook:<br />
1 - 0,-1<br />
0100 - -1, 0<br />
0101 - -1,-1<br />
0110 - 1,-1<br />
0111 - 0,-2<br />
000000 - -2,-3<br />
000001 - 2,-3<br />
000010 - -1,-4<br />
000011 - 1,-4<br />
000100 - -1,-2<br />
000101 - 1,-2<br />
000110 - 0,-3<br />
000111 - 0,-4<br />
001000 - -2, 0<br />
001001 - -2,-1<br />
001010 - 1,-1<br />
001011 - -2,-2<br />
001100 - 2,-2<br />
001101 - -1,-3<br />
001110 - 1,-3<br />
001111 - 0,-5<br />
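<br />
Since the codebook is a prefix code, a vector can be decoded by accumulating bits until a code matches. The sketch below copies the table exactly as listed (including the vector 1,-1, which appears under two codes) and assumes the code bits are written in stream order; the names are illustrative:<br />
<br />
```python
MV_CODES = {
    "1": (0, -1),
    "0100": (-1, 0), "0101": (-1, -1), "0110": (1, -1), "0111": (0, -2),
    "000000": (-2, -3), "000001": (2, -3), "000010": (-1, -4), "000011": (1, -4),
    "000100": (-1, -2), "000101": (1, -2), "000110": (0, -3), "000111": (0, -4),
    "001000": (-2, 0), "001001": (-2, -1), "001010": (1, -1), "001011": (-2, -2),
    "001100": (2, -2), "001101": (-1, -3), "001110": (1, -3), "001111": (0, -5),
}


def read_motion_vector(bits) -> tuple[int, int]:
    """Consume bits from an iterator until a complete code has been read."""
    code = ""
    for bit in bits:
        code += "1" if bit else "0"
        if code in MV_CODES:
            return MV_CODES[code]
    raise ValueError("bitstream ended inside a motion vector code")
```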
<br />
Actual image may be interlaced, i.e. only half of the lines are decoded.<br />
<br />
== Video compression for version 2 ==<br />
In this version tile data is usually stored separately, in chunk type 0x54. Also bitstream format has changed to LSB first little-endian.<br />
<br />
=== Tile format ===<br />
Chunk type <code>0x54</code> starts with the usual header: 32-bit data size, 16-bit number of tiles and 16-bit tile size. Tile data is packed almost but not exactly like in version 1:<br />
<br />
read raw data for tile 0<br />
for each tile {<br />
copy previous tile data<br />
for each component of tile { // i.e. all Rs, Gs, Bs and As<br />
bits = get_bits(3);<br />
if (bits < 7) {<br />
for (i = 0; i < tile_size; i++) {<br />
delta = get_bits(bits); // get_bits(0)=0<br />
if (delta && get_bit(1))<br />
delta = -delta;<br />
tile[component][i] += delta;<br />
}<br />
} else {<br />
for (i = 0; i < tile_size; i++) {<br />
tile[component][i] = get_bits(8);<br />
}<br />
}<br />
}<br />
}<br />
<br />
=== Frame format ===<br />
Frames are now packed using several LRU lists: tile indices are restored first and afterwards replaced with the actual tile data. Frame data is coded in groups of 8 tiles using a bit prefix: 1 - copy 8 tile indices from the previous line, 0 - switch to individual tile index decoding. Individual tile indices are coded in several ways (depending on code):<br />
* <code>&nbsp;&nbsp;1</code> -- copy index from the top line<br />
* <code>000</code> -- get <code>ceil(log2(tile_size))</code> bits for a new tile index, add it to LRU list (see below)<br />
* <code>100</code> -- get 4-bit delta value, a sign bit, add that to top index value, output and add it to LRU list<br />
* <code>010</code> -- form a list of 0-4 context-dependent values (see below), select one using 0-2 bits, output and add it to LRU list<br />
* <code>110</code> -- get 4-bit index, output value retrieved from LRU list using that index<br />
<br />
==== LRU list ====<br />
Decoder keeps context-dependent (i.e. one list for each possible tile index) cyclic list of last 15 values. The actual buffer is selected using the top tile index (so it is not in use for the first line). Initially it contains all zeroes.<br />
<br />
==== Context-dependent list ====<br />
For mode <code>010</code> such a list is formed and then used as the source of the tile index:<br />
<br />
// list forming<br />
list = (empty);<br />
top = y > 0 ? top tile index : NONE;<br />
for left, top-left, top-right and top-top positions {<br />
idx = tile index at the search position<br />
if (!contains(list, idx) && (top == NONE || top != idx)) {<br />
push(list, idx)<br />
}<br />
}<br />
//decoding<br />
if (length(list) < 2) {<br />
new_idx = list[0]; // it should be empty<br />
} else if (length(list) == 2) {<br />
new_idx = list[get_bit()];<br />
} else {<br />
new_idx = list[get_bits(2)];<br />
}<br />
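The list forming and selection steps above could look roughly like this in Python. It is a sketch under assumptions that the description does not confirm: neighbours outside the picture are simply skipped, "top-top" is taken to mean the tile two rows above, and an empty list yields <code>None</code>. The function names and the <code>get_bits</code> callback are hypothetical.

```python
def build_context_list(indices, width, x, y):
    """Collect distinct neighbour tile indices for position (x, y).

    indices is a flat row-major list of the tile indices decoded so far.
    """
    def at(px, py):
        # Assumption: positions outside the picture contribute nothing.
        if 0 <= px < width and py >= 0:
            return indices[py * width + px]
        return None

    top = at(x, y - 1)  # None on the first line
    candidates = []
    # left, top-left, top-right and top-top (assumed: two rows above)
    for px, py in ((x - 1, y), (x - 1, y - 1), (x + 1, y - 1), (x, y - 2)):
        idx = at(px, py)
        if idx is None:
            continue
        if idx not in candidates and idx != top:
            candidates.append(idx)
    return candidates


def select_from_context(candidates, get_bits):
    """Pick a tile index from the list using 0-2 bits."""
    if len(candidates) < 2:
        # Assumption: None stands in for the unspecified empty-list case.
        return candidates[0] if candidates else None
    if len(candidates) == 2:
        return candidates[get_bits(1)]
    return candidates[get_bits(2)]
```

The top index is excluded from the list, presumably because the shorter code <code>1</code> already covers copying it.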
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CNM&diff=15692CNM2022-12-13T13:28:44Z<p>Kostya: add CI2 description</p>
<hr />
<div>* Company: [[Arxel Tribe]]<br />
* Extension: cnm, ci2<br />
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/]<br />
<br />
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen].<br />
<br />
== Container format ==<br />
<br />
Container has the following structure:<br />
* magic <code>CNM UNR\0</code><br />
* header<br />
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present)<br />
* frames<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
1 byte - unknown<br />
4 bytes - image width<br />
4 bytes - image height<br />
2 bytes - unknown<br />
1 byte - number of audio tracks<br />
4 bytes - number of video frames?<br />
4 bytes - number of frames repeated?<br />
4 bytes - size of offsets table (v1 only)<br />
152 bytes - always zero?<br />
when audio is present for each track:<br />
1 byte - number of channels<br />
1 byte - bits per sample<br />
4 bytes - audio rate<br />
10 bytes - unused?<br />
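As an illustration, the header layout above maps onto Python's <code>struct</code> module as follows. This is a sketch, not a verified parser: it assumes that the header starts right after the 8-byte magic and that the per-track audio records directly follow the 152 zero bytes. The function and field names are made up for the example.

```python
import struct

MAGIC = b"CNM UNR\0"
# number of frames, unknown, unknown, width, height, unknown,
# audio tracks, video frames?, repeated frames?, offsets table size,
# then 152 bytes that are skipped ("always zero?").
HEADER = struct.Struct("<IIBIIHBIII152x")
# channels, bits per sample, audio rate, then 10 unused bytes
TRACK = struct.Struct("<BBI10x")


def parse_cnm_header(data):
    """Return (header dict, offset of the first byte after the header)."""
    if data[:len(MAGIC)] != MAGIC:
        raise ValueError("not a CNM file")
    (nframes, _unk1, _unk2, width, height, _unk3,
     ntracks, nvframes, nrepeated, offsets_size) = HEADER.unpack_from(data, len(MAGIC))
    pos = len(MAGIC) + HEADER.size
    tracks = []
    for _ in range(ntracks):
        channels, bits, rate = TRACK.unpack_from(data, pos)
        tracks.append({"channels": channels, "bits": bits, "rate": rate})
        pos += TRACK.size
    header = {
        "frames": nframes,
        "width": width,
        "height": height,
        "video_frames": nvframes,
        "repeated_frames": nrepeated,
        "offsets_size": offsets_size,
        "tracks": tracks,
    }
    return header, pos
```

With the field sizes listed above the fixed part is 184 bytes long and each audio track record takes 16 bytes.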
<br />
Each frame is prefixed by a byte containing its type. Known frame types:<br />
* 0x41 - audio data<br />
* 0x42 - audio data<br />
* 0x53 - image<br />
* 0x54 - tile data<br />
* 0x55 - image (v2)<br />
* 0x5A - audio data<br />
<br />
Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below.<br />
<br />
== Video compression for version 1 ==<br />
<br />
Each frame is an independently compressed image (in bottom-up format) split into tiles.<br />
Frame header:<br />
4 bytes - payload size (not counting the header)<br />
4 bytes - offset to the colour data<br />
2 bytes - number of tiles<br />
2 bytes - tile data size<br />
4 bytes - width<br />
4 bytes - height<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
3 bytes - unused?<br />
<br />
Colour data may contain either raw tile pixels (32-bit BGR0) or it may be packed. In that case tile data size is set to 4 or 2 and the deltas are stored right after it. The overall tile restoration algorithm is the following:<br />
<br />
copy 16 bytes (4x1 tile) from the stream<br />
for (tile = 1; tile < num_tiles; tile++) {<br />
tile_data[tile] = tile_data[tile - 1];<br />
bits = get_bits(3) + 1; //the same bit reading as below, bits=8 should not happen<br />
for (i = 0; i < 16; i++) {<br />
delta = get_bits(bits);<br />
if (delta && get_bit())<br />
delta = -delta;<br />
tile_data[tile][i] += delta;<br />
}<br />
}<br />
<br />
<br />
Tile control data is compressed using a variable number of bits; bits are stored MSB first. The tile index is read depending on the number of tiles: if it can fit into 10 bits then it's 10 bits, if it can fit into 11 bits then it's 11 bits, otherwise it's 12 bits.<br />
<br />
Single tile decoding flow:<br />
<br />
 if (!get_bit()) {<br />
     offset = get_bits(tile_index_bits);<br />
     copy tile data from the colour data using offset*16<br />
 } else { // copy existing tile<br />
     decode motion vector, copy the tile it points to<br />
     (e.g. -1,0 means previous tile and 0,-1 means top tile)<br />
 }<br />
<br />
Motion vector codebook:<br />
1 - 0,-1<br />
0100 - -1, 0<br />
0101 - -1,-1<br />
0110 - 1,-1<br />
0111 - 0,-2<br />
000000 - -2,-3<br />
000001 - 2,-3<br />
000010 - -1,-4<br />
000011 - 1,-4<br />
000100 - -1,-2<br />
000101 - 1,-2<br />
000110 - 0,-3<br />
000111 - 0,-4<br />
001000 - -2, 0<br />
001001 - -2,-1<br />
001010 - 1,-1<br />
001011 - -2,-2<br />
001100 - 2,-2<br />
001101 - -1,-3<br />
001110 - 1,-3<br />
001111 - 0,-5<br />
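Since the codebook above is a prefix code with code lengths 1, 4 and 6, it can be decoded by reading bits until the accumulated string matches an entry. The following Python sketch shows that approach; <code>MV_CODES</code>, <code>decode_mv</code> and the <code>get_bit</code> callback are hypothetical names, and the table entries are copied from the list above exactly as given.

```python
# Code string -> (dx, dy), copied verbatim from the codebook above.
MV_CODES = {
    "1": (0, -1),
    "0100": (-1, 0), "0101": (-1, -1), "0110": (1, -1), "0111": (0, -2),
    "000000": (-2, -3), "000001": (2, -3), "000010": (-1, -4),
    "000011": (1, -4), "000100": (-1, -2), "000101": (1, -2),
    "000110": (0, -3), "000111": (0, -4), "001000": (-2, 0),
    "001001": (-2, -1), "001010": (1, -1), "001011": (-2, -2),
    "001100": (2, -2), "001101": (-1, -3), "001110": (1, -3),
    "001111": (0, -5),
}


def decode_mv(get_bit):
    """Read bits (MSB first) until they form a complete codeword."""
    code = ""
    while code not in MV_CODES:
        if len(code) >= 6:
            raise ValueError("invalid motion vector code " + code)
        code += str(get_bit())
    return MV_CODES[code]
```

A real decoder would more likely branch on the first bits directly; the dictionary form just keeps the correspondence with the table easy to check.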
<br />
Actual image may be interlaced, i.e. only half of the lines are decoded.<br />
<br />
== Video compression for version 2 ==<br />
In this version tile data is usually stored separately, in chunk type 0x54. The bitstream format has also changed to LSB-first little-endian.<br />
<br />
=== Tile format ===<br />
Chunk type <code>0x54</code> starts with the usual header: 32-bit data size, 16-bit number of tiles and 16-bit tile size. Tile data is packed almost but not exactly like in version 1:<br />
<br />
 read raw data for tile 0<br />
 for each subsequent tile {<br />
     copy previous tile data<br />
     for each component of tile { // i.e. all Rs, Gs, Bs and As<br />
         bits = get_bits(3);<br />
         if (bits < 7) {<br />
             for (i = 0; i < tile_size; i++) {<br />
                 delta = get_bits(bits); // get_bits(0)=0<br />
                 if (delta && get_bit())<br />
                     delta = -delta;<br />
                 tile[component][i] += delta;<br />
             }<br />
         } else {<br />
             for (i = 0; i < tile_size; i++) {<br />
                 tile[component][i] = get_bits(8);<br />
             }<br />
         }<br />
     }<br />
 }<br />
<br />
=== Frame format ===<br />
The frame is now packed using several LRU lists. Tile indices are restored first and are replaced with actual tile data afterwards. Frame data is coded in groups of 8 tiles using a bit prefix: 1 - copy 8 tile indices from the previous line, 0 - switch to individual tile index decoding. Individual tile indices are coded in several ways (depending on the code):<br />
* <code>&nbsp;&nbsp;1</code> -- copy index from the top line<br />
* <code>000</code> -- get <code>ceil(log2(tile_size))</code> bits for a new tile index, add it to LRU list (see below)<br />
* <code>100</code> -- get 4-bit delta value, a sign bit, add that to top index value, output and add it to LRU list<br />
* <code>010</code> -- form a list of 0-4 context-dependent values (see below), select one using 0-2 bits, output and add it to LRU list<br />
* <code>110</code> -- get 4-bit index, output value retrieved from LRU list using that index<br />
<br />
==== LRU list ====<br />
The decoder keeps a context-dependent cyclic list of the last 15 values (i.e. one list for each possible tile index). The actual buffer is selected using the top tile index (so it is not in use for the first line). Initially it contains all zeroes.<br />
<br />
==== Context-dependent list ====<br />
For mode <code>010</code> such a list is formed and then used as the source of the tile index:<br />
<br />
// list forming<br />
list = (empty);<br />
top = y > 0 ? top tile index : NONE;<br />
for left, top-left, top-right and top-top positions {<br />
idx = tile index at the search position<br />
if (!contains(list, idx) && (top == NONE || top != idx)) {<br />
push(list, idx)<br />
}<br />
}<br />
//decoding<br />
if (length(list) < 2) {<br />
new_idx = list[0]; // it should be empty<br />
} else if (length(list) == 2) {<br />
new_idx = list[get_bit()];<br />
} else {<br />
new_idx = list[get_bits(2)];<br />
}<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Lightning_Strike_Video_Codec&diff=15691Lightning Strike Video Codec2022-12-10T16:47:45Z<p>Kostya: /* Wavelet coding */ mention transform</p>
<hr />
<div>* FourCCs: LSVC, LSVM, LSVX<br />
* Company: [[Espre Solutions]]<br />
* Samples: http://samples.mplayerhq.hu/V-codecs/LSV/<br />
<br />
The Lightning Strike Video Codec has gone through a few iterations as indicated by its various FourCCs. The codec is designed for internet teleconferencing applications. It is based on [[H.263]] with an addition of wavelet coding for intra frames.<br />
<br />
== Frame structure ==<br />
<br />
Each frame begins with a 5-byte header; the second byte denotes the frame type and the remaining bytes are usually garbage.<br />
<br />
Known types:<br />
* <code>78 01 yy cc xx</code> -- frame actually starts at line <code>yy</code> and the rest should be filled with colour <code>cc</code>. It has 8 additional bytes following the header, usually with codec version like <code>"lsvx2.0"</code>. Usually it's the first frame;<br />
* <code>xx 01 xx xx xx</code> -- the same as above but without start line and fill values;<br />
* <code>xx 05 xx xx xx</code> -- skip frame, no further data is transmitted;<br />
* <code>xx 08 xx xx xx</code> -- probably a wavelet-coded keyframe that happens once in several seconds<br />
* <code>xx 09 xx xx xx</code> -- the usual frame<br />
<br />
This header is followed by the normal H.263++ picture header with a special exception: picture code 7 means wavelet-coded frame, otherwise it should be a conventional H.263++ frame.<br />
<br />
== Wavelet coding ==<br />
Wavelet data begins at byte-aligned position after H.263 picture header with 24-bit big-endian data size preceding actual frame data.<br />
<br />
Then a frame header follows:<br />
32-bit LE - data size<br />
1/3 bytes - end depth for each plane (grayscale/YUV420) relative to the maximum one <br />
<br />
After the header there's data for each plane that should end with <code>'c' 'o' 'd'</code> marker.<br />
<br />
Each plane is split down to depth 4 and each band is coded in one of three possible ways using rather simple prediction and binary coder.<br />
<br />
Wavelet transform seems to be the usual LGT5/3.<br />
<br />
=== Plane coding ===<br />
Each plane begins with three 16-bit LE values for each band (there should be depth*3+1 = 13 bands in total): some quantisers and maximum coded symbol (used in high-frequency band coding). Each band is coded with a binary coder and an appropriate model.<br />
<br />
=== Band models ===<br />
There are three band models (one for the LL band and two for the remaining bands, selected depending on whether the default depth is used) that provide states for decoding variable-length integers. The code is decoded in the following way in all cases:<br />
<br />
 if (coder.decode_bit(model.get_state_nz())) {<br />
     sign = coder.decode_bit(model.get_state_sign());<br />
     large = coder.decode_bit(sign ? model.get_state3() : model.get_state2());<br />
     if (!large) {<br />
         val = sign ? -1 : 1;<br />
     } else {<br />
         idx = 1;<br />
         pfx = 2;<br />
         while (coder.decode_bit(model.get_state_exponent(idx))) {<br />
             pfx <<= 1;<br />
             idx += 1;<br />
         }<br />
         mant_state = model.get_state_mantissa(idx);<br />
         val = pfx >> 1;<br />
         mask = val >> 1;<br />
         while (mask) {<br />
             if (coder.decode_bit(mant_state)) { // the same state for every mantissa bit<br />
                 val |= mask;<br />
             }<br />
             mask >>= 1;<br />
         }<br />
         val++;<br />
         if (sign) {<br />
             val = -val;<br />
         }<br />
     }<br />
 } else {<br />
     val = 0;<br />
 }<br />
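To make the value coding easier to follow, here is a Python sketch of the same flow driven by a scripted bit source instead of the real binary coder. The <code>LLModel</code> class uses the state numbering given for the LL band model in this article; <code>scripted_coder</code> is a purely hypothetical test helper that replays a fixed bit list and records which states were requested.

```python
class LLModel:
    """State numbering of the LL band model (offset is updated by the caller)."""

    def __init__(self):
        self.offset = 0

    def state_nz(self):
        return self.offset + 0

    def state_sign(self):
        return self.offset + 1

    def state2(self):
        return self.offset + 2

    def state3(self):
        return self.offset + 3

    def state_exponent(self, n):
        return 20 + n

    def state_mantissa(self, n):
        return 20 + 14 + n


def decode_value(decode_bit, model):
    """Decode one signed integer; decode_bit(state) returns 0 or 1."""
    if not decode_bit(model.state_nz()):
        return 0
    sign = decode_bit(model.state_sign())
    large = decode_bit(model.state3() if sign else model.state2())
    if not large:
        return -1 if sign else 1
    idx = 1
    pfx = 2
    while decode_bit(model.state_exponent(idx)):  # unary exponent
        pfx <<= 1
        idx += 1
    mant_state = model.state_mantissa(idx)  # one state for all mantissa bits
    val = pfx >> 1
    mask = val >> 1
    while mask:
        if decode_bit(mant_state):
            val |= mask
        mask >>= 1
    val += 1
    return -val if sign else val


def scripted_coder(bits):
    """Hypothetical stand-in for the binary coder: replays a bit list."""
    source = iter(bits)
    used_states = []

    def decode_bit(state):
        used_states.append(state)
        return next(source)

    return decode_bit, used_states
```

For example, the bit sequence 1, 0, 1, 1, 1, 0, 1, 0 decodes to 7: non-zero, positive, large, two exponent continuation bits and a stop bit, then the mantissa bits 1 and 0.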
<br />
==== Model for LL band ====<br />
This model uses 49 bit states and a prediction for data decoding.<br />
* NZ state - state offset+0<br />
* sign state - state offset+1<br />
* state2 - state offset+2<br />
* state3 - state offset+3<br />
* exponent state N - state 20+N<br />
* mantissa state N - state 20+14+N<br />
<br />
Band decoding:<br />
pred = 0;<br />
model.offset = 0;<br />
for all values {<br />
val = decode_value(coder, model);<br />
pred += val;<br />
pixel_value = pred;<br />
if (val < -8)<br />
model.offset = 16;<br />
else if (val < 0)<br />
model.offset = 8;<br />
else if (val == 0)<br />
model.offset = 0;<br />
else if (val <= 8)<br />
model.offset = 4;<br />
else<br />
model.offset = 12;<br />
}<br />
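The context switching in the loop above depends only on the last decoded difference, so it can be isolated into a small helper. The Python sketch below uses hypothetical function names; it takes already decoded differences as input and does not involve the binary coder.

```python
def ll_offset(val):
    """Context offset selected by the previously decoded difference."""
    if val < -8:
        return 16
    if val < 0:
        return 8
    if val == 0:
        return 0
    if val <= 8:
        return 4
    return 12


def reconstruct_ll(deltas):
    """Turn decoded differences into LL band values by running summation."""
    pred = 0
    values = []
    for val in deltas:
        pred += val
        values.append(pred)
    return values
```

The five possible offsets are spaced four apart, matching the four states (NZ, sign, state2, state3) addressed relative to the offset.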
<br />
==== Model for normal-case bands ====<br />
This model uses a dynamic size that depends on the number of wavelet levels to decode. Band data decoding is performed on a row basis, with an end-of-line bit decoded after each non-zero coefficient. Additionally, different bit contexts of the model are selected for the first few decoded coefficients.<br />
<br />
==== Model for reduced-case bands ====<br />
This model uses 31 bit states and a prediction for data decoding.<br />
* NZ state - state 0<br />
* sign state - state with index 113 and mps=0 (will not change)<br />
* state2 - state 1<br />
* state3 - state 1<br />
* exponent state N - state N<br />
* mantissa state N - state 14+N<br />
<br />
Band decoding is done by decoding data in every row until a value equal to the maximum value+2 is encountered (the maximum value is signalled in the plane data header for each band).<br />
<br />
=== Binary coder ===<br />
Binary coder resembles CABAC since it also codes single bits using static probabilities and updates model state after decoding each bit. Additionally coder bitstream uses <code>FF</code> as a marker for the end of stream (and <code>FF 00</code> for transmitting actual <code>FF</code> value).<br />
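The byte escaping described above might be handled by a small input wrapper like the following Python sketch. The class name is hypothetical, and returning zero bytes once the end marker has been seen is an assumption; the description does not say what the coder consumes after the end of the stream.

```python
class EscapedByteReader:
    """Byte source that understands FF 00 escapes and the FF end marker."""

    def __init__(self, data):
        self.data = data
        self.pos = 0
        self.ended = False

    def next_byte(self):
        if self.ended or self.pos >= len(self.data):
            self.ended = True
            return 0  # assumption: feed zeroes after the end of stream
        byte = self.data[self.pos]
        self.pos += 1
        if byte != 0xFF:
            return byte
        if self.pos < len(self.data) and self.data[self.pos] == 0x00:
            self.pos += 1  # FF 00 stands for a literal FF
            return 0xFF
        self.ended = True  # a bare FF marks the end of stream
        return 0
```

Such a reader would play the role of <code>next_byte()</code> in the initialisation and renormalisation routines of the coder.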
<br />
==== Initialisation ====<br />
range = 1 << 16;<br />
bits = 8;<br />
value = 0;<br />
for (i = 0; i < 4; i++)<br />
value = (value << 8) | next_byte();<br />
<br />
==== Decoding a bit ====<br />
This function takes and modifies <code>state_idx</code> (model index) and <code>state_mps</code> (most probable symbol) for decoding one bit:<br />
<br />
prob = model_probabilities[state_idx];<br />
help = range - prob;<br />
if (help <= (value >> 16)) {<br />
value -= help << 16;<br />
range = prob;<br />
if (help < prob) {<br />
bit = state_mps;<br />
state_idx = model_state_mps[state_idx];<br />
} else {<br />
bit = 1 - state_mps;<br />
state_idx = model_state_lps[state_idx];<br />
state_mps ^= model_mps_switch[state_idx];<br />
}<br />
} else if (help & 0x8000) {<br />
return state_mps;<br />
} else {<br />
if (help < prob) {<br />
bit = 1 - state_mps;<br />
state_idx = model_state_lps[state_idx];<br />
state_mps ^= model_mps_switch[state_idx];<br />
} else {<br />
bit = state_mps;<br />
state_idx = model_state_mps[state_idx];<br />
}<br />
}<br />
renorm();<br />
return bit;<br />
<br />
==== Renormalisation ====<br />
while (range < 0x8000) {<br />
if (bits == 0) {<br />
value += next_byte() << 8;<br />
bits = 8;<br />
}<br />
range <<= 1;<br />
value <<= 1;<br />
bits -= 1;<br />
}<br />
<br />
==== Tables ====<br />
Probabilities:<br />
23069, 9606, 4372, 2059, 984, 474, 229, 111, 54, 26, 13, 6, 3, 1,<br />
23167, 16165, 11506, 8316, 6073, 4482, 3311, 2465, 1839, 1372, 1030, 771, 576, 433, 324, 245, 183, 138, 104, 78, 59, 44,<br />
23265, 18508, 14861, 12017, 9759, 7987, 6568, 5400, 4471, 3700, 3067, 2552, 2145, 1798, 1485, 1246, 1039, 867, 724, 604, 504, 420, 352, 293, 246, 203, 171, 143,<br />
23314, 19716, 16684, 14296, 12264, 10556, 9081, 7903, 6825, 5966, 5156, 4508, 3947, 3409, 2998, 2624,<br />
22578, 19740, 17294, 15325, 13550, 11950, 10650, 9494,<br />
21872, 19625, 17625, 15906, 14372, 12980, 11799,<br />
22184, 20294, 18405, 16847, 15421, 14174,<br />
21041, 19471, 17977, 16734,<br />
22055, 20711, 19333,<br />
21911, 20559,<br />
23056, 21794,<br />
23019,<br />
23069<br />
<br />
MPS switch flags:<br />
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0<br />
<br />
MPS transition table:<br />
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13, 15, 16,<br />
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,<br />
33, 34, 35, 9, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,<br />
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 32,<br />
65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 48,<br />
81, 82, 83, 84, 85, 86, 87, 71, 89, 90, 91, 92, 93, 94, 86, 96,<br />
97, 98, 99, 100, 93, 102, 103, 104, 99, 106, 107, 103, 109, 107, 111, 109, 111, 113<br />
<br />
LPS transition table:<br />
1, 14, 16, 18, 20, 23, 25, 28, 30, 33, 35, 9, 10, 12, 15, 36,<br />
38, 39, 40, 42, 43, 45, 46, 48, 49, 51, 52, 54, 56, 57, 59, 60,<br />
62, 63, 32, 33, 37, 64, 65, 67, 68, 69, 70, 72, 73, 74, 75, 77,<br />
78, 79, 48, 50, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 61, 61,<br />
65, 80, 81, 82, 83, 84, 86, 87, 87, 72, 72, 74, 74, 75, 77, 77,<br />
80, 88, 89, 90, 91, 92, 93, 86, 88, 95, 96, 97, 99, 99, 93, 95,<br />
101, 102, 103, 104, 99, 105, 106, 107, 103, 105, 108, 109, 110, 111, 110, 112, 112, 113<br />
<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Lightning_Strike_Video_Codec&diff=15690Lightning Strike Video Codec2022-12-10T16:38:57Z<p>Kostya: /* Band models */</p>
<hr />
<div>* FourCCs: LSVC, LSVM, LSVX<br />
* Company: [[Espre Solutions]]<br />
* Samples: http://samples.mplayerhq.hu/V-codecs/LSV/<br />
<br />
The Lightning Strike Video Codec has gone through a few iterations as indicated by its various FourCCs. The codec is designed for internet teleconferencing applications. It is based on [[H.263]] with an addition of wavelet coding for intra frames.<br />
<br />
== Frame structure ==<br />
<br />
Each frame begins with a 5-byte header; the second byte denotes the frame type and the remaining bytes are usually garbage.<br />
<br />
Known types:<br />
* <code>78 01 yy cc xx</code> -- frame actually starts at line <code>yy</code> and the rest should be filled with colour <code>cc</code>. It has 8 additional bytes following the header, usually with codec version like <code>"lsvx2.0"</code>. Usually it's the first frame;<br />
* <code>xx 01 xx xx xx</code> -- the same as above but without start line and fill values;<br />
* <code>xx 05 xx xx xx</code> -- skip frame, no further data is transmitted;<br />
* <code>xx 08 xx xx xx</code> -- probably a wavelet-coded keyframe that happens once in several seconds<br />
* <code>xx 09 xx xx xx</code> -- the usual frame<br />
<br />
This header is followed by the normal H.263++ picture header with a special exception: picture code 7 means wavelet-coded frame, otherwise it should be a conventional H.263++ frame.<br />
<br />
== Wavelet coding ==<br />
Wavelet data begins at byte-aligned position after H.263 picture header with 24-bit big-endian data size preceding actual frame data.<br />
<br />
Then a frame header follows:<br />
32-bit LE - data size<br />
1/3 bytes - end depth for each plane (grayscale/YUV420) relative to the maximum one <br />
<br />
After the header there's data for each plane that should end with <code>'c' 'o' 'd'</code> marker.<br />
<br />
Each plane is split down to depth 4 and each band is coded in one of three possible ways using rather simple prediction and binary coder.<br />
<br />
=== Plane coding ===<br />
Each plane begins with three 16-bit LE values for each band (there should be depth*3+1 = 13 bands in total): some quantisers and maximum coded symbol (used in high-frequency band coding). Each band is coded with a binary coder and an appropriate model.<br />
<br />
=== Band models ===<br />
There are three band models (one for the LL band and two for the remaining bands, selected depending on whether the default depth is used) that provide states for decoding variable-length integers. The code is decoded in the following way in all cases:<br />
<br />
 if (coder.decode_bit(model.get_state_nz())) {<br />
     sign = coder.decode_bit(model.get_state_sign());<br />
     large = coder.decode_bit(sign ? model.get_state3() : model.get_state2());<br />
     if (!large) {<br />
         val = sign ? -1 : 1;<br />
     } else {<br />
         idx = 1;<br />
         pfx = 2;<br />
         while (coder.decode_bit(model.get_state_exponent(idx))) {<br />
             pfx <<= 1;<br />
             idx += 1;<br />
         }<br />
         mant_state = model.get_state_mantissa(idx);<br />
         val = pfx >> 1;<br />
         mask = val >> 1;<br />
         while (mask) {<br />
             if (coder.decode_bit(mant_state)) { // the same state for every mantissa bit<br />
                 val |= mask;<br />
             }<br />
             mask >>= 1;<br />
         }<br />
         val++;<br />
         if (sign) {<br />
             val = -val;<br />
         }<br />
     }<br />
 } else {<br />
     val = 0;<br />
 }<br />
<br />
==== Model for LL band ====<br />
This model uses 49 bit states and a prediction for data decoding.<br />
* NZ state - state offset+0<br />
* sign state - state offset+1<br />
* state2 - state offset+2<br />
* state3 - state offset+3<br />
* exponent state N - state 20+N<br />
* mantissa state N - state 20+14+N<br />
<br />
Band decoding:<br />
pred = 0;<br />
model.offset = 0;<br />
for all values {<br />
val = decode_value(coder, model);<br />
pred += val;<br />
pixel_value = pred;<br />
if (val < -8)<br />
model.offset = 16;<br />
else if (val < 0)<br />
model.offset = 8;<br />
else if (val == 0)<br />
model.offset = 0;<br />
else if (val <= 8)<br />
model.offset = 4;<br />
else<br />
model.offset = 12;<br />
}<br />
<br />
==== Model for normal-case bands ====<br />
This model uses a dynamic size that depends on the number of wavelet levels to decode. Band data decoding is performed on a row basis, with an end-of-line bit decoded after each non-zero coefficient. Additionally, different bit contexts of the model are selected for the first few decoded coefficients.<br />
<br />
==== Model for reduced-case bands ====<br />
This model uses 31 bit states and a prediction for data decoding.<br />
* NZ state - state 0<br />
* sign state - state with index 113 and mps=0 (will not change)<br />
* state2 - state 1<br />
* state3 - state 1<br />
* exponent state N - state N<br />
* mantissa state N - state 14+N<br />
<br />
Band decoding is done by decoding data in every row until a value equal to the maximum value+2 is encountered (the maximum value is signalled in the plane data header for each band).<br />
<br />
=== Binary coder ===<br />
Binary coder resembles CABAC since it also codes single bits using static probabilities and updates model state after decoding each bit. Additionally coder bitstream uses <code>FF</code> as a marker for the end of stream (and <code>FF 00</code> for transmitting actual <code>FF</code> value).<br />
<br />
==== Initialisation ====<br />
range = 1 << 16;<br />
bits = 8;<br />
value = 0;<br />
for (i = 0; i < 4; i++)<br />
value = (value << 8) | next_byte();<br />
<br />
==== Decoding a bit ====<br />
This function takes and modifies <code>state_idx</code> (model index) and <code>state_mps</code> (most probable symbol) for decoding one bit:<br />
<br />
prob = model_probabilities[state_idx];<br />
help = range - prob;<br />
if (help <= (value >> 16)) {<br />
value -= help << 16;<br />
range = prob;<br />
if (help < prob) {<br />
bit = state_mps;<br />
state_idx = model_state_mps[state_idx];<br />
} else {<br />
bit = 1 - state_mps;<br />
state_idx = model_state_lps[state_idx];<br />
state_mps ^= model_mps_switch[state_idx];<br />
}<br />
} else if (help & 0x8000) {<br />
return state_mps;<br />
} else {<br />
if (help < prob) {<br />
bit = 1 - state_mps;<br />
state_idx = model_state_lps[state_idx];<br />
state_mps ^= model_mps_switch[state_idx];<br />
} else {<br />
bit = state_mps;<br />
state_idx = model_state_mps[state_idx];<br />
}<br />
}<br />
renorm();<br />
return bit;<br />
<br />
==== Renormalisation ====<br />
while (range < 0x8000) {<br />
if (bits == 0) {<br />
value += next_byte() << 8;<br />
bits = 8;<br />
}<br />
range <<= 1;<br />
value <<= 1;<br />
bits -= 1;<br />
}<br />
<br />
==== Tables ====<br />
Probabilities:<br />
23069, 9606, 4372, 2059, 984, 474, 229, 111, 54, 26, 13, 6, 3, 1,<br />
23167, 16165, 11506, 8316, 6073, 4482, 3311, 2465, 1839, 1372, 1030, 771, 576, 433, 324, 245, 183, 138, 104, 78, 59, 44,<br />
23265, 18508, 14861, 12017, 9759, 7987, 6568, 5400, 4471, 3700, 3067, 2552, 2145, 1798, 1485, 1246, 1039, 867, 724, 604, 504, 420, 352, 293, 246, 203, 171, 143,<br />
23314, 19716, 16684, 14296, 12264, 10556, 9081, 7903, 6825, 5966, 5156, 4508, 3947, 3409, 2998, 2624,<br />
22578, 19740, 17294, 15325, 13550, 11950, 10650, 9494,<br />
21872, 19625, 17625, 15906, 14372, 12980, 11799,<br />
22184, 20294, 18405, 16847, 15421, 14174,<br />
21041, 19471, 17977, 16734,<br />
22055, 20711, 19333,<br />
21911, 20559,<br />
23056, 21794,<br />
23019,<br />
23069<br />
<br />
MPS switch flags:<br />
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0<br />
<br />
MPS transition table:<br />
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13, 15, 16,<br />
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,<br />
33, 34, 35, 9, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,<br />
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 32,<br />
65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 48,<br />
81, 82, 83, 84, 85, 86, 87, 71, 89, 90, 91, 92, 93, 94, 86, 96,<br />
97, 98, 99, 100, 93, 102, 103, 104, 99, 106, 107, 103, 109, 107, 111, 109, 111, 113<br />
<br />
LPS transition table:<br />
1, 14, 16, 18, 20, 23, 25, 28, 30, 33, 35, 9, 10, 12, 15, 36,<br />
38, 39, 40, 42, 43, 45, 46, 48, 49, 51, 52, 54, 56, 57, 59, 60,<br />
62, 63, 32, 33, 37, 64, 65, 67, 68, 69, 70, 72, 73, 74, 75, 77,<br />
78, 79, 48, 50, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 61, 61,<br />
65, 80, 81, 82, 83, 84, 86, 87, 87, 72, 72, 74, 74, 75, 77, 77,<br />
80, 88, 89, 90, 91, 92, 93, 86, 88, 95, 96, 97, 99, 99, 93, 95,<br />
101, 102, 103, 104, 99, 105, 106, 107, 103, 105, 108, 109, 110, 111, 110, 112, 112, 113<br />
<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Lightning_Strike_Video_Codec&diff=15689Lightning Strike Video Codec2022-12-10T11:33:57Z<p>Kostya: start documenting</p>
<hr />
<div>* FourCCs: LSVC, LSVM, LSVX<br />
* Company: [[Espre Solutions]]<br />
* Samples: http://samples.mplayerhq.hu/V-codecs/LSV/<br />
<br />
The Lightning Strike Video Codec has gone through a few iterations as indicated by its various FourCCs. The codec is designed for internet teleconferencing applications. It is based on [[H.263]] with an addition of wavelet coding for intra frames.<br />
<br />
== Frame structure ==<br />
<br />
Each frame begins with a 5-byte header; the second byte denotes the frame type and the remaining bytes are usually garbage.<br />
<br />
Known types:<br />
* <code>78 01 yy cc xx</code> -- frame actually starts at line <code>yy</code> and the rest should be filled with colour <code>cc</code>. It has 8 additional bytes following the header, usually with codec version like <code>"lsvx2.0"</code>. Usually it's the first frame;<br />
* <code>xx 01 xx xx xx</code> -- the same as above but without start line and fill values;<br />
* <code>xx 05 xx xx xx</code> -- skip frame, no further data is transmitted;<br />
* <code>xx 08 xx xx xx</code> -- probably a wavelet-coded keyframe that happens once in several seconds<br />
* <code>xx 09 xx xx xx</code> -- the usual frame<br />
<br />
This header is followed by the normal H.263++ picture header with a special exception: picture code 7 means wavelet-coded frame, otherwise it should be a conventional H.263++ frame.<br />
<br />
== Wavelet coding ==<br />
Wavelet data begins at byte-aligned position after H.263 picture header with 24-bit big-endian data size preceding actual frame data.<br />
<br />
Then a frame header follows:<br />
32-bit LE - data size<br />
1/3 bytes - end depth for each plane (grayscale/YUV420) relative to the maximum one <br />
<br />
After the header there's data for each plane that should end with <code>'c' 'o' 'd'</code> marker.<br />
<br />
Each plane is split down to depth 4 and each band is coded in one of three possible ways using rather simple prediction and binary coder.<br />
<br />
=== Plane coding ===<br />
Each plane begins with three 16-bit LE values for each band (there should be depth*3+1 = 13 bands in total): some quantisers and maximum coded symbol (used in high-frequency band coding). Each band is coded with a binary coder and an appropriate model.<br />
<br />
=== Band models ===<br />
There are three band models (one for the LL band and two for the remaining bands, selected depending on whether the default depth is used) that provide states for decoding variable-length integers. The code is decoded in the following way in all cases:<br />
<br />
 if (coder.decode_bit(model.get_state_nz())) {<br />
     sign = coder.decode_bit(model.get_state_sign());<br />
     large = coder.decode_bit(sign ? model.get_state3() : model.get_state2());<br />
     if (!large) {<br />
         val = sign ? -1 : 1;<br />
     } else {<br />
         idx = 1;<br />
         pfx = 2;<br />
         while (coder.decode_bit(model.get_state_exponent(idx))) {<br />
             pfx <<= 1;<br />
             idx += 1;<br />
         }<br />
         mant_state = model.get_state_mantissa(idx);<br />
         val = pfx >> 1;<br />
         mask = val >> 1;<br />
         while (mask) {<br />
             if (coder.decode_bit(mant_state)) { // the same state for every mantissa bit<br />
                 val |= mask;<br />
             }<br />
             mask >>= 1;<br />
         }<br />
         val++;<br />
         if (sign) {<br />
             val = -val;<br />
         }<br />
     }<br />
 } else {<br />
     val = 0;<br />
 }<br />
<br />
==== Model for LL band ====<br />
This model uses 49 bit states and a prediction for data decoding.<br />
* NZ state - state offset+0<br />
* sign state - state offset+1<br />
* state2 - state offset+2<br />
* state3 - state offset+3<br />
* exponent state N - state 20+N<br />
* mantissa state N - state 20+14+N<br />
<br />
Band decoding:<br />
pred = 0;<br />
model.offset = 0;<br />
for all values {<br />
val = decode_value(coder, model);<br />
pred += val;<br />
pixel_value = pred;<br />
if (val < -8)<br />
model.offset = 16;<br />
else if (val < 0)<br />
model.offset = 8;<br />
else if (val == 0)<br />
model.offset = 0;<br />
else if (val <= 8)<br />
model.offset = 4;<br />
else<br />
model.offset = 12;<br />
}<br />
<br />
==== Model for normal-case bands ====<br />
**TODO<br />
<br />
==== Model for reduced-case bands ====<br />
**TODO<br />
<br />
=== Binary coder ===<br />
Binary coder resembles CABAC since it also codes single bits using static probabilities and updates model state after decoding each bit. Additionally coder bitstream uses <code>FF</code> as a marker for the end of stream (and <code>FF 00</code> for transmitting actual <code>FF</code> value).<br />
<br />
==== Initialisation ====<br />
range = 1 << 16;<br />
bits = 8;<br />
value = 0;<br />
for (i = 0; i < 4; i++)<br />
value = (value << 8) | next_byte();<br />
<br />
==== Decoding a bit ====<br />
This function takes and modifies <code>state_idx</code> (model index) and <code>state_mps</code> (most probable symbol) for decoding one bit:<br />
<br />
prob = model_probabilities[state_idx];<br />
help = range - prob;<br />
if (help <= (value >> 16)) {<br />
value -= help << 16;<br />
range = prob;<br />
if (help < prob) {<br />
bit = state_mps;<br />
state_idx = model_state_mps[state_idx];<br />
} else {<br />
bit = 1 - state_mps;<br />
state_idx = model_state_lps[state_idx];<br />
state_mps ^= model_mps_switch[state_idx];<br />
}<br />
} else if (help & 0x8000) {<br />
return state_mps;<br />
} else {<br />
if (help < prob) {<br />
bit = 1 - state_mps;<br />
state_idx = model_state_lps[state_idx];<br />
state_mps ^= model_mps_switch[state_idx];<br />
} else {<br />
bit = state_mps;<br />
state_idx = model_state_mps[state_idx];<br />
}<br />
}<br />
renorm();<br />
return bit;<br />
<br />
==== Renormalisation ====<br />
while (range < 0x8000) {<br />
if (bits == 0) {<br />
value += next_byte() << 8;<br />
bits = 8;<br />
}<br />
range <<= 1;<br />
value <<= 1;<br />
bits -= 1;<br />
}<br />
<br />
==== Tables ====<br />
Probabilities:<br />
23069, 9606, 4372, 2059, 984, 474, 229, 111, 54, 26, 13, 6, 3, 1,<br />
23167, 16165, 11506, 8316, 6073, 4482, 3311, 2465, 1839, 1372, 1030, 771, 576, 433, 324, 245, 183, 138, 104, 78, 59, 44,<br />
23265, 18508, 14861, 12017, 9759, 7987, 6568, 5400, 4471, 3700, 3067, 2552, 2145, 1798, 1485, 1246, 1039, 867, 724, 604, 504, 420, 352, 293, 246, 203, 171, 143,<br />
23314, 19716, 16684, 14296, 12264, 10556, 9081, 7903, 6825, 5966, 5156, 4508, 3947, 3409, 2998, 2624,<br />
22578, 19740, 17294, 15325, 13550, 11950, 10650, 9494,<br />
21872, 19625, 17625, 15906, 14372, 12980, 11799,<br />
22184, 20294, 18405, 16847, 15421, 14174,<br />
21041, 19471, 17977, 16734,<br />
22055, 20711, 19333,<br />
21911, 20559,<br />
23056, 21794,<br />
23019,<br />
23069<br />
<br />
MPS switch flags:<br />
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br />
1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,<br />
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0<br />
<br />
MPS transition table:<br />
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13, 15, 16,<br />
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,<br />
33, 34, 35, 9, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,<br />
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 32,<br />
65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 48,<br />
81, 82, 83, 84, 85, 86, 87, 71, 89, 90, 91, 92, 93, 94, 86, 96,<br />
97, 98, 99, 100, 93, 102, 103, 104, 99, 106, 107, 103, 109, 107, 111, 109, 111, 113<br />
<br />
LPS transition table:<br />
1, 14, 16, 18, 20, 23, 25, 28, 30, 33, 35, 9, 10, 12, 15, 36,<br />
38, 39, 40, 42, 43, 45, 46, 48, 49, 51, 52, 54, 56, 57, 59, 60,<br />
62, 63, 32, 33, 37, 64, 65, 67, 68, 69, 70, 72, 73, 74, 75, 77,<br />
78, 79, 48, 50, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 61, 61,<br />
65, 80, 81, 82, 83, 84, 86, 87, 87, 72, 72, 74, 74, 75, 77, 77,<br />
80, 88, 89, 90, 91, 92, 93, 86, 88, 95, 96, 97, 99, 99, 93, 95,<br />
101, 102, 103, 104, 99, 105, 106, 107, 103, 105, 108, 109, 110, 111, 110, 112, 112, 113<br />
<br />
[[Category:Video Codecs]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Ark_of_Time_AN&diff=15688Ark of Time AN2022-11-27T17:04:42Z<p>Kostya: </p>
<hr />
<div>AN is an animation format used in the game [https://www.mobygames.com/game/ark-of-time Ark of Time].<br />
<br />
Container format (all numbers are little-endian):<br />
<br />
Header:<br />
4 bytes - header size<br />
N bytes - flags for each frame (variable-length)<br />
0x1400 bytes - ignored<br />
frames in the following format:<br />
4 bytes - frame size<br />
N bytes - frame data<br />
<br />
Flags are variable-length pieces of information:<br />
1 byte - initial flags<br />
2 bytes - frame delay<br />
if (initial_flags & 2)<br />
// no additional data, it signals that actual frame data is present<br />
if (initial_flags & 8)<br />
2 bytes - unknown<br />
if (initial_flags & 0x20) // something related to displaying subtitles<br />
2 bytes - unknown<br />
1 byte - unknown<br />
1 byte - unknown<br />
1 byte - unknown<br />
if (initial_flags & 0x40)<br />
1 byte - also something related to subtitles<br />
if (initial_flags & 0x80) // should not happen simultaneously with flag 0x40<br />
1 byte - also something related to subtitles<br />
if (initial_flags & 0x10)<br />
768 bytes - VGA palette<br />
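
The variable-length flags record above can be walked roughly as follows. This is a sketch only: the function name and the dictionary keys are made up for illustration, the unknown fields are simply skipped, and the fields are assumed to appear in exactly the order listed.

```python
import struct

def parse_frame_flags(buf: bytes, pos: int = 0):
    """Parse one per-frame flags record; returns (record, next_pos)."""
    flags = buf[pos]
    (delay,) = struct.unpack_from("<H", buf, pos + 1)  # little-endian frame delay
    pos += 3
    rec = {"flags": flags, "delay": delay,
           "has_frame": bool(flags & 0x02), "palette": None}
    if flags & 0x08:
        pos += 2          # 2 bytes, unknown
    if flags & 0x20:
        pos += 5          # 2 + 1 + 1 + 1 bytes, subtitle-related
    if flags & 0x40:
        pos += 1          # subtitle-related
    if flags & 0x80:
        pos += 1          # subtitle-related
    if flags & 0x10:
        rec["palette"] = buf[pos:pos + 768]  # VGA palette
        pos += 768
    return rec, pos
```

Calling it repeatedly with the returned position would step through the records for all frames.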
<br />
Frames are packed using a very simple copy/skip method: read a byte; its top bit is the copy/skip marker (if set, pixel data is read from the stream, otherwise pixels are skipped) and its low 7 bits are the run length (0 means that the actual length follows as a 16-bit value). Depending on the marker, either read that many pixels from the stream or skip that many pixels in the output.<br />
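
A minimal sketch of this copy/skip unpacking is given below. It assumes one byte per pixel, that skipped pixels keep their values from the previous frame, and that decoding stops when the frame data is exhausted; the function name is hypothetical.

```python
def unpack_frame(src: bytes, dst: bytearray) -> None:
    """Apply copy/skip-coded frame data on top of dst (the previous frame)."""
    sp = 0  # read position in the packed data
    dp = 0  # write position in the frame buffer
    while sp < len(src):
        op = src[sp]
        sp += 1
        length = op & 0x7F
        if length == 0:
            # zero means the real length follows as a 16-bit little-endian value
            length = src[sp] | (src[sp + 1] << 8)
            sp += 2
        if dp + length > len(dst):
            raise ValueError("run exceeds frame size")
        if op & 0x80:
            # top bit set: copy pixels from the stream
            if sp + length > len(src):
                raise ValueError("truncated frame data")
            dst[dp:dp + length] = src[sp:sp + length]
            sp += length
        dp += length  # both copy and skip advance the output position
```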
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Ark_of_Time_AV&diff=15687Ark of Time AV2022-11-27T17:04:28Z<p>Kostya: Kostya moved page Ark of Time AV to Ark of Time AN: typo in the extension</p>
<hr />
<div>#REDIRECT [[Ark of Time AN]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Ark_of_Time_AN&diff=15686Ark of Time AN2022-11-27T17:04:28Z<p>Kostya: Kostya moved page Ark of Time AV to Ark of Time AN: typo in the extension</p>
<hr />
<div>AV is an animation format used in [https://www.mobygames.com/game/ark-of-time Ark of Time] game.<br />
<br />
Container format (all numbers are little-endian):<br />
<br />
Header:<br />
4 bytes - header size<br />
N bytes - flags for each frame (variable-length)<br />
0x1400 bytes - ignored<br />
frames in the following format:<br />
4 bytes - frame size<br />
N bytes - frame data<br />
<br />
Flags are variable-length pieces of information:<br />
1 byte - initial flags<br />
2 bytes - frame delay<br />
if (initial_flags & 2)<br />
// no additional data, it signals that actual frame data is present<br />
if (initial_flags & 8)<br />
2 bytes - unknown<br />
if (initial_flags & 0x20) // something related to displaying subtitles<br />
2 bytes - unknown<br />
1 byte - unknown<br />
1 byte - unknown<br />
1 byte - unknown<br />
if (initial_flags & 0x40)<br />
1 byte - also something related to subtitles<br />
if (initial_flags & 0x80) // should not happen simultaneously with flag 0x40<br />
1 byte - also something related to subtitles<br />
if (initial_flags & 0x10)<br />
768 bytes - VGA palette<br />
<br />
Frames are packed using very simple copy/skip method: read a byte, top bit of it is a copy/skip marker (if set - read data from stream, otherwise skip it), low 7 bits are copy/skip length (0 means that 16-bit actual length should be read); depending on it either read the requested amount of pixels from the stream or skip them.<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Ark_of_Time_AN&diff=15685Ark of Time AN2022-11-27T17:03:47Z<p>Kostya: document another game format</p>
<hr />
<div>AV is an animation format used in [https://www.mobygames.com/game/ark-of-time Ark of Time] game.<br />
<br />
Container format (all numbers are little-endian):<br />
<br />
Header:<br />
4 bytes - header size<br />
N bytes - flags for each frame (variable-length)<br />
0x1400 bytes - ignored<br />
frames in the following format:<br />
4 bytes - frame size<br />
N bytes - frame data<br />
<br />
Flags are variable-length pieces of information:<br />
1 byte - initial flags<br />
2 bytes - frame delay<br />
if (initial_flags & 2)<br />
// no additional data, it signals that actual frame data is present<br />
if (initial_flags & 8)<br />
2 bytes - unknown<br />
if (initial_flags & 0x20) // something related to displaying subtitles<br />
2 bytes - unknown<br />
1 byte - unknown<br />
1 byte - unknown<br />
1 byte - unknown<br />
if (initial_flags & 0x40)<br />
1 byte - also something related to subtitles<br />
if (initial_flags & 0x80) // should not happen simultaneously with flag 0x40<br />
1 byte - also something related to subtitles<br />
if (initial_flags & 0x10)<br />
768 bytes - VGA palette<br />
<br />
Frames are packed using very simple copy/skip method: read a byte, top bit of it is a copy/skip marker (if set - read data from stream, otherwise skip it), low 7 bits are copy/skip length (0 means that 16-bit actual length should be read); depending on it either read the requested amount of pixels from the stream or skip them.<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=Interspective_animation&diff=15684Interspective animation2022-11-15T17:13:55Z<p>Kostya: fill information from the source code for ScummVM implementation of Interspecive engine</p>
<hr />
<div>This is a cutscene format used by Divide By Zero in two of its adventure games built on the Interspective engine, namely [https://www.mobygames.com/game/innocent-until-caught Innocent Until Caught] and its sequel [https://www.mobygames.com/game/guilty Guilty]. The former has its cutscene files named <code>IUC_F0[1-7].DAT</code>, the latter names them <code>GBG_F[01][0-9].DAT</code>.<br />
<br />
Animation files contain one or more blocks, each consisting of an I-frame and several P-frames. Each block starts with a 32-bit little-endian block size and a 16-bit number of frames in the block. Each frame starts with a 16-bit frame data size.<br />
<br />
An I-frame starts with 16-bit width and height (which should always be 320 and 200) followed by image data compressed in the same way as in [[PCX]]. At the end of the image data there should be a <code>0xC0</code> byte signalling the end of the image, followed by a 768-byte palette.<br />
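
The I-frame payload could be unpacked along these lines. The sketch assumes the usual PCX run-length scheme (a byte with both top bits set carries a run count in its low 6 bits and is followed by the value to repeat) and ignores any scanline padding; the function name is hypothetical.

```python
def unpack_iframe(src: bytes, pos: int = 0):
    """Decode PCX-style RLE data starting at pos; returns (pixels, palette)."""
    pixels = bytearray()
    while True:
        b = src[pos]
        pos += 1
        if b == 0xC0:
            break                      # zero-length run: end of image
        if (b & 0xC0) == 0xC0:
            # run: low 6 bits are the count, next byte is the pixel value
            pixels += bytes([src[pos]]) * (b & 0x3F)
            pos += 1
        else:
            pixels.append(b)           # literal pixel
    palette = src[pos:pos + 768]       # palette follows the end marker
    return bytes(pixels), palette
```

For a valid frame the returned pixel buffer would be expected to hold width*height bytes.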
<br />
A P-frame starts with two bytes whose values are used to signal short (one-byte) and long (16-bit) skips. Every byte other than those two values is a pixel value, and a signal byte is followed by a skip value. If the skip value is zero, the signal byte itself should be output as a pixel. E.g. if <code>02</code> means long skip, <code>02 03 01</code> means "skip 0x103 (=259) pixels" and <code>02 00 00</code> means "output 02 pixel".<br />
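
The P-frame rules can be sketched as below. It is assumed here that the short-skip marker is the first of the two leading bytes and the long-skip marker the second, and that the frame is applied on top of the previous one; the function name is made up for illustration.

```python
def unpack_pframe(src: bytes, dst: bytearray) -> None:
    """Apply P-frame data on top of dst (the previous frame)."""
    short_mark, long_mark = src[0], src[1]  # assumption: short marker comes first
    sp, dp = 2, 0
    while sp < len(src):
        b = src[sp]
        sp += 1
        if b == short_mark:
            skip = src[sp]                          # one-byte skip value
            sp += 1
        elif b == long_mark:
            skip = src[sp] | (src[sp + 1] << 8)     # 16-bit little-endian skip value
            sp += 2
        else:
            skip = 0                                # ordinary pixel value
        if skip:
            dp += skip
        else:
            dst[dp] = b   # plain pixel, or a marker value escaped with a zero skip
            dp += 1
```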
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CNM&diff=15683CNM2022-11-12T11:16:17Z<p>Kostya: /* Video compression */ typo</p>
<hr />
<div>* Company: [[Arxel Tribe]]<br />
* Extension: cnm<br />
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/]<br />
<br />
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen].<br />
<br />
== Container format ==<br />
<br />
Container has the following structure:<br />
* magic <code>CNM UNR\0</code><br />
* header<br />
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present)<br />
* frames<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
1 byte - unknown<br />
4 bytes - image width<br />
4 bytes - image height<br />
2 bytes - unknown<br />
1 byte - number of audio tracks<br />
4 bytes - number of video frames?<br />
4 bytes - number of frames repeated?<br />
4 bytes - size of offsets table<br />
152 bytes - always zero?<br />
when audio is present for each track:<br />
1 byte - number of channels<br />
1 byte - bits per sample<br />
4 bytes - audio rate<br />
10 bytes - unused?<br />
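
Reading the magic and the header fields listed above might look like this. It is a sketch under the assumption that the fields are packed without padding in the listed order; the names used for the function and the result keys are hypothetical.

```python
import struct

MAGIC = b"CNM UNR\x00"
HEADER = struct.Struct("<IIBIIHBIII152x")  # 184 bytes, fields in the listed order
TRACK = struct.Struct("<BBI10x")           # 16 bytes per audio track

def parse_cnm_header(buf: bytes) -> dict:
    if buf[:len(MAGIC)] != MAGIC:
        raise ValueError("not a CNM file")
    pos = len(MAGIC)
    (nframes, _unk1, _unk2, width, height, _unk3, ntracks,
     _nvideo, _nrepeat, offsets_size) = HEADER.unpack_from(buf, pos)
    pos += HEADER.size
    tracks = []
    for _ in range(ntracks):
        channels, bits, rate = TRACK.unpack_from(buf, pos)
        tracks.append({"channels": channels, "bits": bits, "rate": rate})
        pos += TRACK.size
    # pos now points at the frame offsets table
    return {"frames": nframes, "width": width, "height": height,
            "offsets_size": offsets_size, "tracks": tracks, "offsets_pos": pos}
```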
<br />
Each frame is prefixed by a byte containing its type. Known frame types:<br />
* 0x41 - audio data<br />
* 0x42 - audio data<br />
* 0x53 - image<br />
* 0x54 - some control marker<br />
* 0x5A - audio data<br />
<br />
Audio data is PCM prefixed by a 32-bit data size; video frames are described below.<br />
<br />
== Video compression ==<br />
<br />
Each frame is an independently compressed image (in bottom-up format) split into tiles.<br />
Frame header:<br />
4 bytes - payload size (not counting the header)<br />
4 bytes - offset to the colour data<br />
2 bytes - number of tiles<br />
2 bytes - tile data size<br />
4 bytes - width<br />
4 bytes - height<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
3 bytes - unused?<br />
<br />
Colour data may contain either raw tile pixels (32-bit BGR0) or it may be packed. In the packed case the tile data size is set to 4 or 2 and the deltas are stored right after it. The overall tile restoration algorithm is the following:<br />
<br />
copy 16 bytes (4x1 tile) from the stream<br />
for (tile = 1; tile < num_tiles; tile++) {<br />
tile_data[tile] = tile_data[tile - 1];<br />
bits = get_bits(3) + 1; //the same bit reading as below, bits=8 should not happen<br />
for (i = 0; i < 16; i++) {<br />
delta = get_bits(bits);<br />
if (delta && get_bit())<br />
delta = -delta;<br />
tile_data[tile][i] += delta;<br />
}<br />
}<br />
<br />
<br />
Tile control data is compressed using a variable number of bits, and bits are stored MSB first. The tile index width depends on the number of tiles: if it can fit into 10 bits then it's 10 bits, if it can fit into 11 bits then it's 11 bits, otherwise it's 12 bits.<br />
<br />
Single tile decoding flow:<br />
<br />
if (!get_bit()) {<br />
offset = get_bits(tile_index_bits);<br />
copy tile data from the colour data using offset*16<br />
} else { // copy existing tile<br />
decode motion vector, copy the tile it points to<br />
(e.g. -1,0 means previous tile and 0,-1 means top tile)<br />
}<br />
<br />
Motion vector codebook:<br />
1 - 0,-1<br />
0100 - -1, 0<br />
0101 - -1,-1<br />
0110 - 1,-1<br />
0111 - 0,-2<br />
000000 - -2,-3<br />
000001 - 2,-3<br />
000010 - -1,-4<br />
000011 - 1,-4<br />
000100 - -1,-2<br />
000101 - 1,-2<br />
000110 - 0,-3<br />
000111 - 0,-4<br />
001000 - -2, 0<br />
001001 - -2,-1<br />
001010 - 1,-1<br />
001011 - -2,-2<br />
001100 - 2,-2<br />
001101 - -1,-3<br />
001110 - 1,-3<br />
001111 - 0,-5<br />
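
Since the codebook is a prefix code with 1-, 4- and 6-bit codes, it can be decoded without a search. The sketch below pairs it with an MSB-first bit reader; the class and function names are hypothetical and the vector values are copied from the table as listed.

```python
class BitReader:
    """MSB-first bit reader."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits

    def get_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def get_bits(self, n: int) -> int:
        val = 0
        for _ in range(n):
            val = (val << 1) | self.get_bit()
        return val

# codes 0100..0111
MV_SHORT = [(-1, 0), (-1, -1), (1, -1), (0, -2)]
# codes 000000..001111
MV_LONG = [(-2, -3), (2, -3), (-1, -4), (1, -4),
           (-1, -2), (1, -2), (0, -3), (0, -4),
           (-2, 0), (-2, -1), (1, -1), (-2, -2),
           (2, -2), (-1, -3), (1, -3), (0, -5)]

def decode_mv(br: BitReader):
    if br.get_bit():
        return (0, -1)                   # code "1"
    if br.get_bit():
        return MV_SHORT[br.get_bits(2)]  # codes "01xx"
    return MV_LONG[br.get_bits(4)]       # codes "00xxxx"
```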
<br />
Actual image may be interlaced, i.e. only half of the lines are decoded.<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CNM&diff=15682CNM2022-11-11T17:31:14Z<p>Kostya: /* Video compression */ fix video description</p>
<hr />
<div>* Company: [[Arxel Tribe]]<br />
* Extension: cnm<br />
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/]<br />
<br />
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen].<br />
<br />
== Container format ==<br />
<br />
Container has the following structure:<br />
* magic <code>CNM UNR\0</code><br />
* header<br />
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present)<br />
* frames<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
1 byte - unknown<br />
4 bytes - image width<br />
4 bytes - image height<br />
2 bytes - unknown<br />
1 byte - number of audio tracks<br />
4 bytes - number of video frames?<br />
4 bytes - number of frames repeated?<br />
4 bytes - size of offsets table<br />
152 bytes - always zero?<br />
when audio is present for each track:<br />
1 byte - number of channels<br />
1 bytes - bits per sample<br />
4 bytes - audio rate<br />
10 bytes - unused?<br />
<br />
Each frame is prefixed by a byte containing its type. Known frame types:<br />
* 0x41 - audio data<br />
* 0x42 - audio data<br />
* 0x53 - image<br />
* 0x54 - some control marker<br />
* 0x5A - audio data<br />
<br />
Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below.<br />
<br />
== Video compression ==<br />
<br />
Each frame is an independently compressed image (in bottoms-up format) split into tiles.<br />
Frame header:<br />
4 bytes - payload size (not counting the header)<br />
4 bytes - offset to the colour data<br />
2 bytes - number of tiles<br />
2 bytes - tile data size<br />
4 bytes - width<br />
4 bytes - height<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
3 bytes - unused?<br />
<br />
Colour data may contain either raw tile pixels (32-bit BGR0) or it may be packed. In that case tile data size is set to 4 or 2 and deltas stored right after it. Overall tile restoration algorithm is the following:<br />
<br />
copy 16 bytes (4x1 tile) from the stream<br />
for (tile = 1; tile < num_tiles; tile++) {<br />
tile_data[tile] = tile_data[tile - 1];<br />
bits = get_bits(3) + 1; //the same bit reading as below, bits=8 should not happen<br />
for (i = 0; i < 16; i++) {<br />
delta = get_bits(bits);<br />
if (delta && get_bit())<br />
delta = -delta;<br />
tile_data[tile][i] += delta;<br />
}<br />
}<br />
<br />
<br />
Tile control data is compressed using variable amount of bits, bits are stored MSB first. Tile index is read depending on the number of tiles: if it can fit into 10 bits then it's ten bits, if it can fit into 11 bits then it's 11 bits, otherwise it's 12 bits.<br />
<br />
Single tile decoding flow:<br />
<br />
if (getbit()) {<br />
offset = get_bits(tile_index_bits);<br />
copy tile data from the colour data using offset*16<br />
} else { // copy existing tile<br />
decode motion vector, copy tile to which it points to<br />
(e.g. -1,0 means previous tile and 0,-1 means top tile)<br />
}<br />
<br />
Motion vector codebook:<br />
1 - 0,-1<br />
0100 - -1, 0<br />
0101 - -1,-1<br />
0110 - 1,-1<br />
0111 - 0,-2<br />
000000 - -2,-3<br />
000001 - 2,-3<br />
000010 - -1,-4<br />
000011 - 1,-4<br />
000100 - -1,-2<br />
000101 - 1,-2<br />
000110 - 0,-3<br />
000111 - 0,-4<br />
001000 - -2, 0<br />
001001 - -2,-1<br />
001010 - 1,-1<br />
001011 - -2,-2<br />
001100 - 2,-2<br />
001101 - -1,-3<br />
001110 - 1,-3<br />
001111 - 0,-5<br />
<br />
Actual image may be interlaced, i.e. only half of the lines are decoded.<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CNM&diff=15681CNM2022-11-11T16:04:38Z<p>Kostya: /* Container format */ small fix</p>
<hr />
<div>* Company: [[Arxel Tribe]]<br />
* Extension: cnm<br />
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/]<br />
<br />
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen].<br />
<br />
== Container format ==<br />
<br />
Container has the following structure:<br />
* magic <code>CNM UNR\0</code><br />
* header<br />
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present)<br />
* frames<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
1 byte - unknown<br />
4 bytes - image width<br />
4 bytes - image height<br />
2 bytes - unknown<br />
1 byte - number of audio tracks<br />
4 bytes - number of video frames?<br />
4 bytes - number of frames repeated?<br />
4 bytes - size of offsets table<br />
152 bytes - always zero?<br />
when audio is present for each track:<br />
1 byte - number of channels<br />
1 bytes - bits per sample<br />
4 bytes - audio rate<br />
10 bytes - unused?<br />
<br />
Each frame is prefixed by a byte containing its type. Known frame types:<br />
* 0x41 - audio data<br />
* 0x42 - audio data<br />
* 0x53 - image<br />
* 0x54 - some control marker<br />
* 0x5A - audio data<br />
<br />
Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below.<br />
<br />
== Video compression ==<br />
<br />
Each frame is an independently compressed image (in bottoms-up format) split into 4x4 tiles.<br />
Frame header:<br />
4 bytes - payload size (not counting the header)<br />
4 bytes - offset to the colour data<br />
2 bytes - number of tiles?<br />
2 bytes - tile size?<br />
4 bytes - width<br />
4 bytes - height<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
3 bytes - unused?<br />
<br />
Colour data begins with 16 bytes containing string "ARXEL".<br />
<br />
Tile control data is compressed using variable amount of bits, bits are stored MSB first. Tile index is read depending on the number of tiles: if it can fit into 10 bits then it's ten bits, if it can fit into 11 bits then it's 11 bits, otherwise it's 12 bits.<br />
<br />
Single tile decoding flow:<br />
<br />
if (getbit()) {<br />
offset = get_bits(tile_index_bits);<br />
copy tile data from the colour data using offset*16<br />
} else { // copy existing tile<br />
decode motion vector, copy tile to which it points to<br />
(e.g. -1,0 means previous tile and 0,-1 means top tile)<br />
}<br />
<br />
Motion vector codebook:<br />
1 - 0,-1<br />
0100 - -1, 0<br />
0101 - -1,-1<br />
0110 - 1,-1<br />
0111 - 0,-2<br />
000000 - -2,-3<br />
000001 - 2,-3<br />
000010 - -1,-4<br />
000011 - 1,-4<br />
000100 - -1,-2<br />
000101 - 1,-2<br />
000110 - 0,-3<br />
000111 - 0,-4<br />
001000 - -2, 0<br />
001001 - -2,-1<br />
001010 - 1,-1<br />
001011 - -2,-2<br />
001100 - 2,-2<br />
001101 - -1,-3<br />
001110 - 1,-3<br />
001111 - 0,-5<br />
<br />
After decoding twice as many data it is somehow combined and converted into 32- or 24-bit image.<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CNM&diff=15680CNM2022-11-11T14:34:20Z<p>Kostya: fill information</p>
<hr />
<div>* Company: [[Arxel Tribe]]<br />
* Extension: cnm<br />
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/]<br />
<br />
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen].<br />
<br />
== Container format ===<br />
<br />
Container has the following structure:<br />
* magic <code>CNM UNR\0</code><br />
* header<br />
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present)<br />
* frames<br />
<br />
Header format (all values are little-endian):<br />
4 bytes - number of frames<br />
4 bytes - unknown<br />
1 byte - unknown<br />
4 bytes - image width<br />
4 bytes - image height<br />
2 bytes - unknown<br />
1 byte - audio present flag<br />
4 bytes - number of video frames?<br />
4 bytes - number of frames repeated?<br />
4 bytes - size of offsets table<br />
152 bytes - always zero?<br />
when audio is present:<br />
1 byte - number of channels<br />
1 bytes - bits per sample<br />
4 bytes - audio rate<br />
<br />
Each frame is prefixed by a byte containing its type. Known frame types:<br />
* 0x41 - audio data<br />
* 0x42 - audio data<br />
* 0x53 - image<br />
* 0x54 - some control marker<br />
* 0x5A - audio data<br />
<br />
Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below.<br />
<br />
== Video compression ==<br />
<br />
Each frame is an independently compressed image (in bottoms-up format) split into 4x4 tiles.<br />
Frame header:<br />
4 bytes - payload size (not counting the header)<br />
4 bytes - offset to the colour data<br />
2 bytes - number of tiles?<br />
2 bytes - tile size?<br />
4 bytes - width<br />
4 bytes - height<br />
4 bytes - unknown<br />
4 bytes - unknown<br />
3 bytes - unused?<br />
<br />
Colour data begins with 16 bytes containing string "ARXEL".<br />
<br />
Tile control data is compressed using variable amount of bits, bits are stored MSB first. Tile index is read depending on the number of tiles: if it can fit into 10 bits then it's ten bits, if it can fit into 11 bits then it's 11 bits, otherwise it's 12 bits.<br />
<br />
Single tile decoding flow:<br />
<br />
if (getbit()) {<br />
offset = get_bits(tile_index_bits);<br />
copy tile data from the colour data using offset*16<br />
} else { // copy existing tile<br />
decode motion vector, copy tile to which it points to<br />
(e.g. -1,0 means previous tile and 0,-1 means top tile)<br />
}<br />
<br />
Motion vector codebook:<br />
1 - 0,-1<br />
0100 - -1, 0<br />
0101 - -1,-1<br />
0110 - 1,-1<br />
0111 - 0,-2<br />
000000 - -2,-3<br />
000001 - 2,-3<br />
000010 - -1,-4<br />
000011 - 1,-4<br />
000100 - -1,-2<br />
000101 - 1,-2<br />
000110 - 0,-3<br />
000111 - 0,-4<br />
001000 - -2, 0<br />
001001 - -2,-1<br />
001010 - 1,-1<br />
001011 - -2,-2<br />
001100 - 2,-2<br />
001101 - -1,-3<br />
001110 - 1,-3<br />
001111 - 0,-5<br />
<br />
After decoding twice as many data it is somehow combined and converted into 32- or 24-bit image.<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=SIFF&diff=15679SIFF2022-11-05T10:10:43Z<p>Kostya: /* Audio file structures */ fix one field description</p>
<hr />
<div>* Extensions: vb, vbc, son<br />
* Samples: http://samples.mplayerhq.hu/game-formats/SIFF/<br />
<br />
Certain PC games developed by [[Beam Software]] use a multimedia format with the file signature 'SIFF'. The format can transport audio (in <code>.son</code> files), images (in <code>.pim</code> files) and video (in <code>.vb</code>/<code>.vbc</code> files).<br />
<br />
== File Format ==<br />
<br />
The general fourcc chunk format is as follows:<br />
<br />
bytes 0-3 chunk type<br />
bytes 4-7 chunk size (not including 8-byte chunk preamble and usually in MSB format)<br />
bytes 8.. chunk payload<br />
<br />
It is important to note that in the fourcc chunk preamble, the size field is big-endian. However, all other multi-byte numbers in the file are little-endian.<br />
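
The mixed endianness is easy to get wrong, so here is a sketch of reading one chunk with its big-endian size. The function name is hypothetical, and the code assumes the size is indeed big-endian and that chunks are not padded.

```python
import struct

def read_chunk(buf: bytes, pos: int):
    """Return (chunk_type, payload, next_pos) for the chunk starting at pos."""
    # the preamble is the only place where a big-endian number is expected
    ctype, size = struct.unpack_from(">4sI", buf, pos)
    start = pos + 8
    if start + size > len(buf):
        raise ValueError("truncated chunk")
    return ctype, buf[start:start + size], start + size
```

Fields inside the returned payload would then be read as little-endian values.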
<br />
=== Audio file structures ===<br />
The audio file type is signalled by the 'SOUN' signature. The header is stored in an 8-byte 'SHDR' chunk:<br />
4 bytes - audio duration<br />
2 bytes - sampling rate<br />
1 byte - bits per sample (usually 8 or 12)<br />
1 byte - audio flags (1 - stereo sound)<br />
<br />
Audio data is stored in 'BODY' chunk. 12-bit PCM audio is unpacked as <code>AB CD EF -> DAB0 EFC0</code>.<br />
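
The 12-bit unpacking rule can be written out as follows. This sketch returns the samples as unsigned 16-bit values and does not decide whether they should be interpreted as signed; the function name is hypothetical.

```python
def unpack_pcm12(data: bytes) -> list:
    """Expand packed 12-bit samples: AB CD EF -> DAB0 EFC0."""
    out = []
    for i in range(0, len(data) - 2, 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        out.append(((b1 & 0x0F) << 12) | (b0 << 4))  # DAB0
        out.append((b2 << 8) | (b1 & 0xF0))          # EFC0
    return out
```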
<br />
=== Sprite file structures ===<br />
<br />
Files of this kind have the 'PXAN' signature; the header is stored in the 'AHDR' chunk, the palette is transmitted in the 'CMAP' chunk and the 'BODY' chunk contains the image data.<br />
<br />
=== FCP video structures ===<br />
<br />
FCP video is signalled by 'FCPK' signature. It has the following header in 16-byte 'FCHD' chunk:<br />
2 bytes - flags (0x2000 - global MV present for each frame)<br />
2 bytes - width<br />
2 bytes - height<br />
2 bytes - number of frames<br />
<br />
Frame data is stored inside 'BODY' chunk prefixed with 16-bit frame size and frames have the following header:<br />
2 bytes - audio data size<br />
2 bytes - frame duration (presumably)<br />
2 bytes - number of changed palette colours<br />
(if there are changed colours)<br />
2 bytes - palette change start<br />
3*N bytes - new palette bytes<br />
M bytes - audio data<br />
2 bytes (if present) - global motion X component<br />
2 bytes (if present) - global motion Y component<br />
the rest is video data<br />
<br />
For actual compression description see [[Beam Video]].<br />
<br />
=== VBV video structures ===<br />
<br />
A VB file has 'VBV1' signature. The header is stored in 32-byte 'VBHD' chunk:<br />
2 bytes - version/bytes per sample (1 - palettised video, 2 - RGB555 video)<br />
2 bytes - width<br />
2 bytes - height<br />
2 bytes - x offset of the video<br />
2 bytes - y offset of the video<br />
2 bytes - number of frames<br />
1 byte - bits per audio sample<br />
1 byte - audio flags (probably)<br />
2 bytes - sample rate<br />
16 bytes - reserved, set to 0<br />
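
The 32-byte 'VBHD' payload maps directly onto a fixed structure. The following is a sketch that assumes all fields are unsigned and packed in the listed order; the function name and dictionary keys are made up for illustration.

```python
import struct

VBHD = struct.Struct("<6HBBH16x")  # 32 bytes, little-endian

def parse_vbhd(payload: bytes) -> dict:
    if len(payload) != VBHD.size:
        raise ValueError("unexpected VBHD size")
    (version, width, height, x_off, y_off, nframes,
     audio_bits, audio_flags, sample_rate) = VBHD.unpack(payload)
    return {"version": version, "width": width, "height": height,
            "x_offset": x_off, "y_offset": y_off, "frames": nframes,
            "audio_bits": audio_bits, "audio_flags": audio_flags,
            "sample_rate": sample_rate}
```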
<br />
All data is stored in 'BODY' chunk in blocks. Block format:<br />
<br />
bytes 0-3 block size (with header)<br />
bytes 4-5 flags<br />
bytes 6.. block payload<br />
<br />
The flags are defined as:<br />
<br />
bit 0 - frame has global motion vector<br />
bit 2 - block contains audio<br />
bit 3 - block contains video<br />
bit 4 - palette change<br />
bit 5 - block has frame duration<br />
<br />
Global motion vector is stored as two signed words (16 bits each). Duration is 16-bit value. The rest of sub-payloads have 32-bit size preceding them (the size includes 4 bytes of the size field itself).<br />
<br />
Palette change data:<br />
<br />
byte 0 - start index<br />
byte 1 - number of entries to change<br />
bytes 2.. - RGB entry, 3 bytes per entry<br />
<br />
For the actual compression algorithm see [[Beam Video]].<br />
<br />
== Games That Use The SIFF Format ==<br />
<br />
* [https://www.mobygames.com/game/dame-was-loaded The Dame was Loaded] (FCP and 12-bit PCM in SON)<br />
* [https://www.mobygames.com/game/bug Bug!]<br />
* [https://www.mobygames.com/game/cricket-97 Cricket 97]<br />
* [https://www.mobygames.com/game/kknd-krush-kill-n-destroy KKND]<br />
* [https://www.mobygames.com/game/windows/kknd2-krossfire KKND2: Krossfire]<br />
* [https://www.mobygames.com/game/dos/norse-by-norse-west-the-return-of-the-lost-vikings The Lost Vikings II] (a.k.a. Norse By Norse West: The Return of the Lost Vikings)<br />
* [https://www.mobygames.com/game/dethkarz Dethkarz]<br />
* [https://www.mobygames.com/game/alien-earth Alien Earth] (the only one with 15-bit video)<br />
<br />
[[Category:Game Formats]]</div>Kostyahttps://wiki.multimedia.cx/index.php?title=CRI_P256&diff=15678CRI P2562022-10-31T11:12:06Z<p>Kostya: save some information for the posterity</p>
<hr />
<div>* Company: [[CRI]]<br />
* Platform: Nintendo DS<br />
<br />
This description is based on the [https://web.archive.org/web/20070602010100/http://cri-ch.tv/iwai/363.shtml B256 description] from CRI Channel blog (in Japanese).<br />
<br />
P256 is a paletted video codec that employs prediction and B256 entropy coding.<br />
<br />
B256 (internal [[CRI]] name for the technology) seems to be Tunstall coding (like in [[Duck TrueMotion 1]]) where one byte corresponds to a codebook entry with a variable amount of symbols.<br />
<br />
[[Category:Game Formats]]<br />
[[Category:Video Codecs]]<br />
[[Category:Incomplete Video Codecs]]</div>Kostya