CNM: Difference between revisions
(→Video compression: fix video description) |
(fill information about CI2) |
||
(4 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
* Company: [[Arxel Tribe]] | * Company: [[Arxel Tribe]] | ||
* Extension: cnm | * Extension: cnm, ci2 | ||
* Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/] | * Samples: [http://samples.mplayerhq.hu/game-formats/ring-cnm/ http://samples.mplayerhq.hu/game-formats/ring-cnm/] | ||
CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen]. | CNM is a multimedia format used in the computer game [http://www.mobygames.com/game/windows/ring-the-legend-of-the-nibelungen Ring: The Legend of the Nibelungen]. The CI2 is the next iteration of CNM with slightly different compression that is used in [https://www.mobygames.com/game/seven-games-of-the-soul Faust: The Seven Games of the Soul]. | ||
== Container format == | == Container format == | ||
Line 10: | Line 10: | ||
* magic <code>CNM UNR\0</code> | * magic <code>CNM UNR\0</code> | ||
* header | * header | ||
* frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present) | * frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present; completely zero in CI2) | ||
* frames | * frames | ||
Line 23: | Line 23: | ||
4 bytes - number of video frames? | 4 bytes - number of video frames? | ||
4 bytes - number of frames repeated? | 4 bytes - number of frames repeated? | ||
4 bytes - size of offsets table | 4 bytes - size of offsets table (v1 only) | ||
152 bytes - always zero? | 152 bytes - always zero? | ||
when audio is present for each track: | when audio is present for each track: | ||
Line 35: | Line 35: | ||
* 0x42 - audio data | * 0x42 - audio data | ||
* 0x53 - image | * 0x53 - image | ||
* 0x54 - | * 0x54 - tile data | ||
* 0x55 - image (v2) | |||
* 0x5A - audio data | * 0x5A - audio data | ||
Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below. | Audio data is PCM prefixed by 32-bit data size, video frames are reviewed below. | ||
== Video compression == | == Video compression for version 1 == | ||
Each frame is an independently compressed image (in bottoms-up format) split into tiles. | Each frame is an independently compressed image (in bottoms-up format) split into tiles. | ||
Line 73: | Line 74: | ||
Single tile decoding flow: | Single tile decoding flow: | ||
if (getbit()) { | if (!getbit()) { | ||
offset = get_bits(tile_index_bits); | offset = get_bits(tile_index_bits); | ||
copy tile data from the colour data using offset*16 | copy tile data from the colour data using offset*16 | ||
Line 97: | Line 98: | ||
001000 - -2, 0 | 001000 - -2, 0 | ||
001001 - -2,-1 | 001001 - -2,-1 | ||
001010 - | 001010 - 2,-1 | ||
001011 - -2,-2 | 001011 - -2,-2 | ||
001100 - 2,-2 | 001100 - 2,-2 | ||
Line 105: | Line 106: | ||
Actual image may be interlaced, i.e. only half of the lines are decoded. | Actual image may be interlaced, i.e. only half of the lines are decoded. | ||
== Video compression for version 2 == | |||
In this version frames are coded in small groups (usually by four) with the common tile data (chunk <code>0x54</code>) preceding keyframe (chunk <code>0x55</code>) and inter frames (chunk <code>0x53</code>). | |||
Also note that in this version bitstream format is little-endian LSB first. | |||
=== Tile format === | |||
Chunk type <code>0x54</code> starts with the usual header: 32-bit data size, 16-bit number of tiles and 16-bit tile size. Tile data is packed almost but not exactly like in version 1: | |||
read raw data for tile 0 | |||
for each tile { | |||
copy previous tile data | |||
for each component of tile { // i.e. all Rs, Gs, Bs and As | |||
bits = get_bits(3); | |||
if (bits < 7) { | |||
for (i = 0; i < tile_size; i++) { | |||
delta = get_bits(bits); // get_bits(0)=0 | |||
if (delta && get_bit(1)) | |||
delta = -delta; | |||
tile[component][i] += delta; | |||
} | |||
} else { | |||
for (i = 0; i < tile_size; i++) { | |||
tile[component][i] = get_bits(8); | |||
} | |||
} | |||
} | |||
} | |||
=== Frame format === | |||
Frame is now packed using various methods of prediction operating on tile indices. In inter frame tile index 0 means unchanged area. | |||
Frame data is split into regions of eight tiles, for each a bit is transmitted. Bit 1 means the whole region should be copied from above, bit 0 means that each individual tile index needs to be treated separately. | |||
Individual tile indices have the following mode codewords: | |||
* <code> 1</code> -- copy index from the top line | |||
* <code>000</code> -- get <code>ceil(log2(tile_size))</code> bits for a new tile index, add it to context list (see below) | |||
* <code>100</code> -- get 4-bit delta value, a sign bit, add/subtract <code>delta+1</code> to/from top index value, output and add it to the context list | |||
* <code>010</code> -- form a list of 1-4 unique neighbour values (see below), select one using 0-2 bits, output and add it to the context list | |||
* <code>110</code> -- get 4-bit index in the corresponding context list and output it (without updating the list) | |||
==== Context list ==== | |||
Decoder keeps context-dependent (i.e. one list for each possible tile index) cyclic list of last 16 values that had it as a top neighbour value. Initially it contains all zeroes. | |||
For all but one single-index operations the list should be updated: | |||
if (y > 0) { // not the first line | |||
top_idx = frame[cur_pos - stride]; | |||
contexts[top_idx].list[contexts[top_idx].pos] = cur_idx; | |||
contexts[top_idx].pos = (contexts[top_idx].pos + 1) & 15; | |||
} | |||
==== Context-dependent list ==== | |||
For one of the modes such list is formed and then used as the pixel source: | |||
// list forming | |||
list = (empty); | |||
top = y > 0 ? top tile index : NONE; | |||
for left, top-left, top-right and top-top positions { | |||
idx = tile index at the search position | |||
if (!contains(list, idx) && (top == NONE || top != idx)) { | |||
push(list, idx) | |||
} | |||
} | |||
//decoding | |||
if (length(list) < 2) { | |||
new_idx = list[0]; // it should not be empty | |||
} else if (length(list) == 2) { | |||
new_idx = list[get_bit()]; | |||
} else { | |||
new_idx = list[get_bits(2)]; | |||
} | |||
[[Category:Game Formats]] | [[Category:Game Formats]] |
Latest revision as of 09:00, 9 November 2023
- Company: Arxel Tribe
- Extension: cnm, ci2
- Samples: http://samples.mplayerhq.hu/game-formats/ring-cnm/
CNM is a multimedia format used in the computer game Ring: The Legend of the Nibelungen. CI2 is the next iteration of CNM, with slightly different compression, and is used in Faust: The Seven Games of the Soul.
Container format
Container has the following structure:
- magic CNM UNR\0
- header
- frame offsets table (video and audio interleaved, audio offsets are zero when audio is not present; completely zero in CI2)
- frames
Header format (all values are little-endian):
 4 bytes - number of frames
 4 bytes - unknown
 1 byte - unknown
 4 bytes - image width
 4 bytes - image height
 2 bytes - unknown
 1 byte - number of audio tracks
 4 bytes - number of video frames?
 4 bytes - number of frames repeated?
 4 bytes - size of offsets table (v1 only)
 152 bytes - always zero?
 when audio is present for each track:
 1 byte - number of channels
 1 byte - bits per sample
 4 bytes - audio rate
 10 bytes - unused?
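The header layout above can be read with Python's `struct` module. This is a minimal sketch, not a reference implementation: it assumes the header starts immediately after the 8-byte magic and that the per-track records follow the 152 zero bytes directly. The function name `parse_cnm_header` and the dictionary keys are hypothetical.

```python
import struct

MAGIC = b"CNM UNR\x00"
# "<" selects little-endian with no alignment padding: 184 bytes in total.
HEADER = struct.Struct("<IIBIIHBIII152s")
# Per-track record: channels, bits per sample, rate, 10 unused bytes.
TRACK = struct.Struct("<BBI10s")

def parse_cnm_header(data):
    """Return (header dict, offset of the first byte after the header)."""
    if data[:len(MAGIC)] != MAGIC:
        raise ValueError("missing CNM magic")
    pos = len(MAGIC)  # assumption: header follows the magic directly
    (nframes, _unk1, _unk2, width, height, _unk3, ntracks,
     nvframes, nrepeated, offsets_size, _zeros) = HEADER.unpack_from(data, pos)
    pos += HEADER.size
    tracks = []
    for _ in range(ntracks):
        channels, bits, rate, _unused = TRACK.unpack_from(data, pos)
        tracks.append({"channels": channels, "bits": bits, "rate": rate})
        pos += TRACK.size
    header = {
        "frames": nframes, "width": width, "height": height,
        "video_frames": nvframes, "repeated": nrepeated,
        "offsets_size": offsets_size, "tracks": tracks,
    }
    return header, pos
```

The returned offset should point at the frame offsets table, provided the assumptions above hold.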
Each frame is prefixed by a byte containing its type. Known frame types:
- 0x41 - audio data
- 0x42 - audio data
- 0x53 - image
- 0x54 - tile data
- 0x55 - image (v2)
- 0x5A - audio data
Audio data is PCM prefixed by a 32-bit data size; video frames are described below.
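As an illustration of the frame prefix and the audio payload, the sketch below reads one audio frame. It assumes the 32-bit size is little-endian (as the header values are) and directly follows the type byte; the names `FRAME_TYPES` and `read_audio_frame` are hypothetical.

```python
import struct

# Frame type byte -> meaning, as listed above.
FRAME_TYPES = {
    0x41: "audio", 0x42: "audio", 0x5A: "audio",
    0x53: "image", 0x54: "tile data", 0x55: "image (v2)",
}

def read_audio_frame(data, pos):
    """Return (PCM bytes, offset of the next frame) for an audio frame at pos."""
    if FRAME_TYPES.get(data[pos]) != "audio":
        raise ValueError("not an audio frame")
    # Assumption: the size field is little-endian and follows the type byte.
    (size,) = struct.unpack_from("<I", data, pos + 1)
    start = pos + 5
    return data[start:start + size], start + size
```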
Video compression for version 1
Each frame is an independently compressed image (in bottom-up format) split into tiles. Frame header:
 4 bytes - payload size (not counting the header)
 4 bytes - offset to the colour data
 2 bytes - number of tiles
 2 bytes - tile data size
 4 bytes - width
 4 bytes - height
 4 bytes - unknown
 4 bytes - unknown
 3 bytes - unused?
Colour data may contain either raw tile pixels (32-bit BGR0) or packed data. In the packed case the tile data size is set to 4 or 2 and the deltas are stored right after it. The overall tile restoration algorithm is the following:
 copy 16 bytes (4x1 tile) from the stream
 for (tile = 1; tile < num_tiles; tile++) {
     tile_data[tile] = tile_data[tile - 1];
     bits = get_bits(3) + 1; //the same bit reading as below, bits=8 should not happen
     for (i = 0; i < 16; i++) {
         delta = get_bits(bits);
         if (delta && get_bit())
             delta = -delta;
         tile_data[tile][i] += delta;
     }
 }
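The restoration loop can be sketched in Python as follows. Several details are assumptions rather than facts from the format description: the bitstream is taken to start right after the 16 raw bytes, bits are read MSB first (as stated for the tile control data), and sample values wrap modulo 256. `MsbBitReader` and `restore_tiles_v1` are hypothetical names.

```python
class MsbBitReader:
    """Reads bits MSB first, starting at a given byte offset."""
    def __init__(self, data, byte_pos=0):
        self.data = data
        self.bitpos = byte_pos * 8

    def get_bits(self, n):
        val = 0
        for _ in range(n):
            byte = self.data[self.bitpos >> 3]
            val = (val << 1) | ((byte >> (7 - (self.bitpos & 7))) & 1)
            self.bitpos += 1
        return val

def restore_tiles_v1(data, num_tiles):
    """Return a list of 16-byte tiles (as lists of ints)."""
    tiles = [list(data[:16])]            # tile 0 is stored raw
    br = MsbBitReader(data, 16)          # assumption: deltas follow directly
    for _ in range(1, num_tiles):
        tile = list(tiles[-1])           # start from the previous tile
        bits = br.get_bits(3) + 1
        for i in range(16):
            delta = br.get_bits(bits)
            if delta and br.get_bits(1): # sign bit only for non-zero deltas
                delta = -delta
            tile[i] = (tile[i] + delta) & 0xFF  # assumption: byte wraparound
        tiles.append(tile)
    return tiles
```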
Tile control data is compressed using a variable number of bits; bits are stored MSB first. The width of a tile index depends on the number of tiles: 10 bits if the indices fit into 10 bits, 11 bits if they fit into 11 bits, and 12 bits otherwise.
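The index width rule might be expressed as below. The exact boundary is an assumption: the sketch treats "fits into 10 bits" as "the largest index, `num_tiles - 1`, is below 1024". The function name `tile_index_bits` mirrors the variable used in the decoding flow.

```python
def tile_index_bits(num_tiles):
    """Number of bits used to code a tile index (boundaries are assumed)."""
    if num_tiles <= 1 << 10:
        return 10
    if num_tiles <= 1 << 11:
        return 11
    return 12
```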
Single tile decoding flow:
 if (!getbit()) {
     offset = get_bits(tile_index_bits);
     copy tile data from the colour data using offset*16
 } else { // copy existing tile
     decode motion vector, copy tile to which it points to
     (e.g. -1,0 means previous tile and 0,-1 means top tile)
 }
Motion vector codebook:
      1 -  0,-1
   0100 - -1, 0
   0101 - -1,-1
   0110 -  1,-1
   0111 -  0,-2
 000000 - -2,-3
 000001 -  2,-3
 000010 - -1,-4
 000011 -  1,-4
 000100 - -1,-2
 000101 -  1,-2
 000110 -  0,-3
 000111 -  0,-4
 001000 - -2, 0
 001001 - -2,-1
 001010 -  2,-1
 001011 - -2,-2
 001100 -  2,-2
 001101 - -1,-3
 001110 -  1,-3
 001111 -  0,-5
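The codebook is a prefix code with three groups: `1`, `01xx` and `00xxxx`, so it can be decoded with two table lookups. The sketch below assumes MSB-first bit order as stated for the tile control data; the helper names `msb_bits`, `read_bits` and `decode_mv` are hypothetical.

```python
def msb_bits(data):
    """Yield the bits of a byte string, most significant bit first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1

def read_bits(bits, n):
    """Read n bits from the iterator as an MSB-first integer."""
    val = 0
    for _ in range(n):
        val = (val << 1) | next(bits)
    return val

# Codewords 0100..0111, indexed by their last two bits.
SHORT_MVS = [(-1, 0), (-1, -1), (1, -1), (0, -2)]
# Codewords 000000..001111, indexed by their last four bits.
LONG_MVS = [
    (-2, -3), (2, -3), (-1, -4), (1, -4),
    (-1, -2), (1, -2), (0, -3), (0, -4),
    (-2, 0), (-2, -1), (2, -1), (-2, -2),
    (2, -2), (-1, -3), (1, -3), (0, -5),
]

def decode_mv(bits):
    """Decode one motion vector (dx, dy) from a bit iterator."""
    if next(bits):
        return (0, -1)                       # codeword "1"
    if next(bits):
        return SHORT_MVS[read_bits(bits, 2)] # codewords "01xx"
    return LONG_MVS[read_bits(bits, 4)]      # codewords "00xxxx"
```

For example, the bytes `0xB1 0x40` hold the codewords `1`, `0110` and `001010` back to back, which decode to the vectors 0,-1, then 1,-1, then 2,-1.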
The actual image may be interlaced, i.e. only half of the lines are decoded.
Video compression for version 2
In this version frames are coded in small groups (usually of four), with the common tile data (chunk 0x54) preceding the keyframe (chunk 0x55) and the inter frames (chunk 0x53).
Also note that in this version bitstream format is little-endian LSB first.
Tile format
Chunk type 0x54 starts with the usual header: 32-bit data size, 16-bit number of tiles and 16-bit tile size. Tile data is packed almost but not exactly like in version 1:
 read raw data for tile 0
 for each tile {
     copy previous tile data
     for each component of tile { // i.e. all Rs, Gs, Bs and As
         bits = get_bits(3);
         if (bits < 7) {
             for (i = 0; i < tile_size; i++) {
                 delta = get_bits(bits); // get_bits(0)=0
                 if (delta && get_bit(1))
                     delta = -delta;
                 tile[component][i] += delta;
             }
         } else {
             for (i = 0; i < tile_size; i++) {
                 tile[component][i] = get_bits(8);
             }
         }
     }
 }
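A possible Python rendering of this loop is given below, under several assumptions that the description does not settle: the loop starts at tile 1 (tile 0 being raw), there are four components per tile, the first bit read is the least significant bit of a multi-bit value, and sample values wrap modulo 256. The raw layout of tile 0 is not specified, so the caller passes it in already split into components. `LsbBitReader` and `decode_tiles_v2` are hypothetical names.

```python
class LsbBitReader:
    """Reads bits LSB first; the first bit read becomes bit 0 of the value."""
    def __init__(self, data):
        self.data = data
        self.bitpos = 0

    def get_bits(self, n):
        val = 0
        for i in range(n):
            byte = self.data[self.bitpos >> 3]
            val |= ((byte >> (self.bitpos & 7)) & 1) << i
            self.bitpos += 1
        return val          # get_bits(0) returns 0 without reading

def decode_tiles_v2(br, tile0, num_tiles, tile_size):
    """tile0 is a list of components, each a list of tile_size samples."""
    tiles = [[list(comp) for comp in tile0]]
    for _ in range(1, num_tiles):            # assumption: tile 0 is skipped
        tile = [list(comp) for comp in tiles[-1]]
        for comp in tile:                    # all Rs, then Gs, Bs and As
            bits = br.get_bits(3)
            if bits < 7:
                for i in range(tile_size):
                    delta = br.get_bits(bits)
                    if delta and br.get_bits(1):
                        delta = -delta
                    comp[i] = (comp[i] + delta) & 0xFF  # assumed wraparound
            else:
                for i in range(tile_size):
                    comp[i] = br.get_bits(8)  # component stored raw
        tiles.append(tile)
    return tiles
```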
Frame format
Frame is now packed using various methods of prediction operating on tile indices. In inter frame tile index 0 means unchanged area.
Frame data is split into regions of eight tiles, for each a bit is transmitted. Bit 1 means the whole region should be copied from above, bit 0 means that each individual tile index needs to be treated separately.
Individual tile indices have the following mode codewords:
- 1 -- copy index from the top line
- 000 -- get ceil(log2(tile_size)) bits for a new tile index, add it to context list (see below)
- 100 -- get 4-bit delta value, a sign bit, add/subtract delta+1 to/from top index value, output and add it to the context list
- 010 -- form a list of 1-4 unique neighbour values (see below), select one using 0-2 bits, output and add it to the context list
- 110 -- get 4-bit index in the corresponding context list and output it (without updating the list)
Context list
The decoder keeps a context-dependent cyclic list (i.e. one list for each possible tile index) of the last 16 values that had that index as their top neighbour. Initially each list contains all zeroes.
For all but one single-index operations the list should be updated:
 if (y > 0) { // not the first line
     top_idx = frame[cur_pos - stride];
     contexts[top_idx].list[contexts[top_idx].pos] = cur_idx;
     contexts[top_idx].pos = (contexts[top_idx].pos + 1) & 15;
 }
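The update rule translates almost directly into Python. In this sketch the frame is a flat list of tile indices, `cur_pos >= stride` stands in for `y > 0`, and the class name `ContextLists` is hypothetical.

```python
class ContextLists:
    """One 16-entry cyclic list per possible tile index, initially all zero."""
    def __init__(self, num_indices):
        self.lists = [[0] * 16 for _ in range(num_indices)]
        self.pos = [0] * num_indices

    def update(self, frame, cur_pos, stride, cur_idx):
        """Record cur_idx in the list belonging to its top neighbour."""
        if cur_pos >= stride:                 # not the first line
            top_idx = frame[cur_pos - stride]
            self.lists[top_idx][self.pos[top_idx]] = cur_idx
            self.pos[top_idx] = (self.pos[top_idx] + 1) & 15
```

Each list overwrites its oldest entry once 16 values have been stored, because the write position wraps with `& 15`.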
Context-dependent list
For one of the modes (codeword 010) such a list is formed and then used as the source of the new tile index:
 // list forming
 list = (empty);
 top = y > 0 ? top tile index : NONE;
 for left, top-left, top-right and top-top positions {
     idx = tile index at the search position
     if (!contains(list, idx) && (top == NONE || top != idx)) {
         push(list, idx)
     }
 }
 //decoding
 if (length(list) < 2) {
     new_idx = list[0]; // it should not be empty
 } else if (length(list) == 2) {
     new_idx = list[get_bit()];
 } else {
     new_idx = list[get_bits(2)];
 }
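The same two steps in Python, as a sketch. How positions outside the frame are handled is not stated above, so this version simply skips them, which is an assumption. The frame is a flat list of tile indices with `stride` entries per line; `neighbour_candidates` and `select_candidate` are hypothetical names, and `get_bits` is any callable returning the next `n` bits.

```python
def neighbour_candidates(frame, x, y, stride):
    """Collect unique neighbour indices that differ from the top index."""
    top = frame[(y - 1) * stride + x] if y > 0 else None
    cands = []
    # left, top-left, top-right, top-top
    for dx, dy in ((-1, 0), (-1, -1), (1, -1), (0, -2)):
        nx, ny = x + dx, y + dy
        if not (0 <= nx < stride and ny >= 0):
            continue  # assumption: positions outside the frame are skipped
        idx = frame[ny * stride + nx]
        if idx not in cands and idx != top:
            cands.append(idx)
    return cands

def select_candidate(cands, get_bits):
    """Pick the new tile index, reading 0, 1 or 2 selector bits."""
    if len(cands) < 2:
        return cands[0]          # the list should not be empty
    if len(cands) == 2:
        return cands[get_bits(1)]
    return cands[get_bits(2)]
```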