Motion Wavelets: Difference between revisions

Latest revision as of 10:32, 26 December 2021

FourCC: MWV1
Samples: http://samples.mplayerhq.hu/V-codecs/MWV1/

This is a rather simple intra-only wavelet codec that uses static codebooks.

Extradata format

Surprisingly, extradata values are stored in big-endian order. The extradata contains the following fields:

 4 bytes - extradata length including the header
 4 bytes - always seems to be 7
 4 bytes - version
 4 bytes - width
 4 bytes - height
 2 bytes - unknown
 2 bytes - bits per pixel
 4 bytes - source format (e.g. 0 - RGB, '024I' for YUV420)
 4 bytes - raw image size
 4 bytes - always zero?
 4 bytes - always zero?
 4 bytes - always zero?
 4 bytes - always zero?
 4 bytes - codec flags (0x100 seems to mean grayscale)
 3x2x2 bytes - vertical and horizontal transform levels for each YUV component (usually 4, 4, 3, 3, 3, 3)

Frame format

A Motion Wavelets frame consists of tags, each starting with the bytes FF FF FF FF followed by a tag ID.

Known tags are:

0xAA - frame header
0xAB - alternative frame header
0xD1 - wavelet band data
0xD2 - alternative band data?
0xDA - alternative band data?
0xDD - unknown meaning
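As a rough illustration of the tag structure, the sketch below scans a frame for the FF FF FF FF marker and reports the byte that follows it. It assumes the tag ID is a single byte, which the IDs listed above suggest but the description does not state. The helper name `find_tags` is hypothetical. A plain scan like this may also hit marker-like bytes inside band data, so a real decoder would more likely walk tags using their sizes.

```python
from typing import Iterator, Tuple

MARKER = b"\xff\xff\xff\xff"


def find_tags(frame: bytes) -> Iterator[Tuple[int, int]]:
    """Yield (marker_offset, tag_id) for each marker found in the frame.

    ASSUMPTION: the tag ID is the single byte right after the marker.
    """
    pos = frame.find(MARKER)
    while pos != -1 and pos + 4 < len(frame):
        yield pos, frame[pos + 4]
        # Continue searching after the tag ID byte.
        pos = frame.find(MARKER, pos + 5)
```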

0xAA tag (frame header)

 4 bytes - tag size
 2 bytes - default Y plane bias?
 (non-grayscale) 2 bytes - default U plane bias?
 (non-grayscale) 2 bytes - default V plane bias?
 (for version > 1) 1 byte - unknown
 (for version > 4) 4 bytes - total packed frame size
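The conditional fields of the 0xAA header can be read as in the sketch below. Several things here are assumptions rather than documented facts: the byte order of frame data is not stated (little-endian is only a guess and is exposed as a parameter), the bias fields are treated as unsigned, and `parse_frame_header` is a hypothetical name. The `payload` argument is expected to start right after the tag ID.

```python
def parse_frame_header(payload: bytes, version: int, grayscale: bool,
                       byteorder: str = "little") -> dict:
    """Read the 0xAA tag fields in the order listed above.

    ASSUMPTIONS: byte order (default "little") and unsigned bias fields.
    """
    pos = 0

    def take(n: int) -> int:
        nonlocal pos
        chunk = payload[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("truncated frame header")
        pos += n
        return int.from_bytes(chunk, byteorder)

    hdr = {"tag_size": take(4), "y_bias": take(2)}
    if not grayscale:
        hdr["u_bias"] = take(2)
        hdr["v_bias"] = take(2)
    if version > 1:
        hdr["unknown"] = take(1)
    if version > 4:
        hdr["packed_frame_size"] = take(4)
    return hdr
```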

0xD1 tag (wavelet band data)

This tag contains data for one wavelet band. Bands are stored in interleaved order (Y LL band, U LL band, V LL band, Y LH band, U LH band...) with their dimensions implicitly derived from the image size and the number of transform levels.

Pixels in bands are coded in boustrophedon order (i.e. the first line runs left to right, the next line right to left, the one after that left to right again, and so on).
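The scan order described above can be written as a small generator. This is only an illustration of the traversal; the name `boustrophedon` and the (x, y) tuple convention are choices made for this example.

```python
from typing import Iterator, Tuple


def boustrophedon(width: int, height: int) -> Iterator[Tuple[int, int]]:
    """Yield (x, y) positions: even rows left to right, odd rows right to left."""
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        for x in xs:
            yield x, y
```

For a 3x2 band this visits (0,0), (1,0), (2,0), then (2,1), (1,1), (0,1).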

Old tag header (before version 3):

 4 bytes - band size?
 4 bytes - unknown
 4 bytes - band quantiser multiplied by 32768 and stored as integer

New tag header (version 3 and later):

 1 byte  - band mode (0 means the band is not coded and no further data is present)
 4 bytes - band quantiser multiplied by 32768 and stored as integer (for non-empty bands)

The following band modes are known:

0 - empty uncoded band
5 - LL band (coefficients are coded as differences to the previous ones, no bias)
1 - same coding as mode 5 but for coefficients instead of deltas, without quantisation bias
9 - same as mode 1 but with quantisation bias
2 - alternative band coding, no quantisation bias
10 - alternative band coding, with quantisation bias
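The mode list above can be captured in a lookup table so that a decoder picks the codebook, delta handling and bias handling in one step. This is just a restatement of the list as data; `BandMode` and `BAND_MODES` are hypothetical names.

```python
from typing import Dict, NamedTuple, Optional


class BandMode(NamedTuple):  # hypothetical helper type
    coding: str      # "ordinary" or "alternative"
    deltas: bool     # True when values are differences to the previous ones
    bias: bool       # True when quantisation bias is applied


# None marks the empty, uncoded band (mode 0).
BAND_MODES: Dict[int, Optional[BandMode]] = {
    0: None,
    1: BandMode("ordinary", deltas=False, bias=False),
    2: BandMode("alternative", deltas=False, bias=False),
    5: BandMode("ordinary", deltas=True, bias=False),
    9: BandMode("ordinary", deltas=False, bias=True),
    10: BandMode("alternative", deltas=False, bias=True),
}
```

Looking up a mode outside this table raises `KeyError`, which seems a reasonable way to flag modes that are not known.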

Band data coding

Ordinary band coding:

 get code from codebook 1
 switch (code) {
   case 0: read 8 bits of escape code, remap to -0xFB..-0x7C, 0x7C..0xFB range
   case 1: read 12 bits of escape code, remap to -0x8FB..-0xFC, 0xFC..0x8FB range
   case 0x80: read 16 bits of escape code, remap to -0x88FB..-0x8FC, 0x8FC..0x88FB range
   case 0xFC/0xFD/0xFE/0xFF: zero run of length 1/2/3/4
   case 2/3/4: read 4/8/12 bits and add 5/21/277 in order to obtain zero run value (or repeat count for mode 5)
   default: integer value (or delta for mode 5) is equal to code - 0x80
 }
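The pseudocode above can be turned into a small function that maps one codebook symbol to either a value or a zero run. This is a sketch under stated assumptions: the codebook itself is not reproduced, so the caller supplies the decoded `code` and a `read_bits(n)` callable, and the exact escape remapping is not documented, so `remap_escape` (a hypothetical helper) simply assumes the lower half of the raw range is positive and the upper half negative.

```python
from typing import Callable, Tuple

# code -> (escape bits, smallest magnitude)
ESCAPES = {0x00: (8, 0x7C), 0x01: (12, 0xFC), 0x80: (16, 0x8FC)}
# code -> (extra bits, value added to obtain the run length)
LONG_RUNS = {2: (4, 5), 3: (8, 21), 4: (12, 277)}


def remap_escape(raw: int, bits: int, base: int) -> int:
    # ASSUMPTION: lower half of the raw range is positive, upper half negative.
    # The real bit layout of escapes is not documented in the text above.
    half = 1 << (bits - 1)
    return base + raw if raw < half else -(base + raw - half)


def decode_ordinary(code: int, read_bits: Callable[[int], int]) -> Tuple[str, int]:
    """Return ("value", v) or ("run", n) for one codebook 1 symbol."""
    if code in ESCAPES:
        bits, base = ESCAPES[code]
        return "value", remap_escape(read_bits(bits), bits, base)
    if code >= 0xFC:                      # 0xFC..0xFF -> run of 1..4
        return "run", code - 0xFB
    if code in LONG_RUNS:
        bits, base = LONG_RUNS[code]
        return "run", read_bits(bits) + base
    return "value", code - 0x80           # direct value
```

For mode 5 the "value" result would be a delta and the "run" result a repeat count, as noted in the pseudocode.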

Alternative band coding:

 get code from codebook 2
 if (code <= 0x40) {
   output code - 32
 } else if (code <= 0x7B) {
   output zero run of (code - 0x40)
 } else if (code >= 0xEF && code <= 0xFA) {
   read 16/14/10/9/8/7/6/5/4/3/2/1 bits, add 0x483A/0x83A/0x43A/0x23A/0x13A/0xBA/0x7A/0x5A/0x4A/0x42/0x3E/0x3C
   output zero run of the resulting length
 } else if (code == 0xFE || code == 0xFF) {
   read 14/10 bits for escape value and remap it to -0x2220..-0x221,0x221..0x2220/-0x220..-0x21,0x21..0x220 range
 } // other codes should not be present
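A table-driven version of the alternative coding might look like the sketch below. It assumes the slash-separated bit counts and offsets above apply to codes 0xEF through 0xFA in the order written, and that 14 bits go with 0xFE and 10 bits with 0xFF. As with the ordinary coding, the sign layout of escapes is an assumption (lower half positive, upper half negative), and `decode_alternative` is a hypothetical name. The caller supplies the decoded `code` and a `read_bits(n)` callable.

```python
from typing import Callable, Tuple

# Indexed by code - 0xEF.
ALT_RUN_BITS = (16, 14, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
ALT_RUN_BASE = (0x483A, 0x83A, 0x43A, 0x23A, 0x13A, 0xBA,
                0x7A, 0x5A, 0x4A, 0x42, 0x3E, 0x3C)
# code -> (escape bits, smallest magnitude)
ALT_ESCAPES = {0xFE: (14, 0x221), 0xFF: (10, 0x21)}


def decode_alternative(code: int, read_bits: Callable[[int], int]) -> Tuple[str, int]:
    """Return ("value", v) or ("run", n) for one codebook 2 symbol."""
    if code <= 0x40:
        return "value", code - 32
    if code <= 0x7B:
        return "run", code - 0x40
    if 0xEF <= code <= 0xFA:
        i = code - 0xEF
        return "run", read_bits(ALT_RUN_BITS[i]) + ALT_RUN_BASE[i]
    if code in ALT_ESCAPES:
        bits, base = ALT_ESCAPES[code]
        raw = read_bits(bits)
        half = 1 << (bits - 1)
        # ASSUMPTION: lower half positive, upper half negative.
        value = base + raw if raw < half else -(base + raw - half)
        return "value", value
    raise ValueError(f"unexpected code 0x{code:02X}")
```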

Quantisation without bias is simply value * scale; with a bias it is value > 0 ? (value + 0.5) * scale : (value - 0.5) * scale. Band quantisers on upper levels should be multiplied by a power of 2 (i.e. for the smallest bands the multiplier is 1.0, for the next level it is 2.0, for the next one 4.0 and so on).
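The two formulas and the per-level multiplier can be sketched as follows. This assumes that "scale" is the stored quantiser divided by 32768 and then multiplied by the level factor, and that `level` counts from 0 for the smallest bands. Neither is spelled out above, and `band_scale` and `dequantise` are hypothetical names.

```python
def band_scale(raw_quant: int, level: int) -> float:
    """Convert the stored integer quantiser into a floating-point scale.

    ASSUMPTION: level 0 is the smallest band; each next level doubles the scale.
    """
    return raw_quant / 32768.0 * (1 << level)


def dequantise(value: int, scale: float, bias: bool) -> float:
    if not bias:
        return value * scale
    # Follows the formula literally: anything that is not > 0 takes the
    # negative branch.
    return (value + 0.5) * scale if value > 0 else (value - 0.5) * scale
```

For example, a stored quantiser of 16384 on the third level (`level=2`) gives a scale of 2.0, so a coded value of 3 becomes 6.0 without bias and 7.0 with bias.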

Reconstruction uses a simple lifting scheme:

 dst[2n]   = (lo[n] + hi[n]) / 16.0 + (lo[n-1] - lo[n+2]) / 128.0
 dst[2n+1] = (lo[n] - hi[n]) / 16.0 - (lo[n-1] - lo[n+2]) / 128.0

After vertical reconstruction band values should be multiplied by 128.
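A one-dimensional pass of the formulas above can be sketched as below. Edge handling is not described, so this example clamps out-of-range `lo` indices to the nearest valid one, which is purely an assumption. It also assumes `lo` and `hi` have equal length. The name `reconstruct_1d` is hypothetical, and the final multiplication by 128 after the vertical pass is left to the caller.

```python
from typing import List, Sequence


def reconstruct_1d(lo: Sequence[float], hi: Sequence[float]) -> List[float]:
    """Merge a low band and a high band into one line of twice the length.

    ASSUMPTIONS: len(lo) == len(hi); lo indices are clamped at the edges.
    """
    last = len(lo) - 1

    def tap(i: int) -> float:
        return lo[min(max(i, 0), last)]

    dst = [0.0] * (2 * len(lo))
    for n in range(len(lo)):
        corr = (tap(n - 1) - tap(n + 2)) / 128.0
        dst[2 * n] = (lo[n] + hi[n]) / 16.0 + corr
        dst[2 * n + 1] = (lo[n] - hi[n]) / 16.0 - corr
    return dst
```

With a constant low band and an all-zero high band the correction term vanishes, so the output is just `lo[n] / 16.0` repeated.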
