VGM Video

From MultimediaWiki
Revision as of 10:22, 17 February 2021 by Kostya (talk | contribs) (document XVD)

Java player: http://www.ila-ila.com/xvd-hist/sites/lab1454/eng/products/jpl_dm2.htm

There are several codecs in this family:

  • VT
  • Domain
  • V2K aka BigBits
  • V2K-II
  • XVD

VT

TODO

Domain

TODO

V2K

TODO

V2K-II

TODO

XVD

This is the last instalment in the VGM Video series. The codec is now DCT-only and uses one of three entropy-coding schemes: a context-adaptive binary coder, an arithmetic coder, or a mix of an arithmetic coder and variable-length codes. There is still one halfpel-precision motion vector per 8x8 block.

Extradata format

4 bytes - width
4 bytes - height
4 bytes - bitrate?
4 bytes - FPS
4 bytes - edge size (always 4?)
4 bytes - unknown (always 1?)
4 bytes - flags
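
A minimal sketch of reading these seven fields, assuming they are stored as unsigned little-endian 32-bit integers (neither byte order nor signedness is stated here). The names `XvdExtradata` and `parse_extradata` are made up for illustration.

```python
import struct
from typing import NamedTuple


class XvdExtradata(NamedTuple):
    width: int
    height: int
    bitrate: int     # meaning uncertain ("bitrate?")
    fps: int
    edge_size: int   # always 4?
    unknown: int     # always 1?
    flags: int


def parse_extradata(data: bytes) -> XvdExtradata:
    """Unpack the seven 4-byte fields; little-endian order is an assumption."""
    if len(data) < 28:
        raise ValueError("extradata needs at least 28 bytes")
    return XvdExtradata(*struct.unpack_from("<7I", data, 0))


# Synthetic example data, not taken from a real file.
example = struct.pack("<7I", 320, 240, 500000, 25, 4, 1, 0x0E00)
hdr = parse_extradata(example)
print(hdr.width, hdr.height, hex(hdr.flags))
```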

Flags meaning:

  • bit 8 - probably interlaced coding
  • bit 9 - use DC prediction
  • bit 10 - use MV prediction
  • bit 11 - use binary coder
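
The flag bits can be tested with plain masks. The sketch below assumes that bit numbers count from the least significant bit (bit 0), which is not stated above; the constant and function names are illustrative only.

```python
# Assumption: "bit N" means the mask 1 << N, counting from the LSB.
CODEC_FLAG_INTERLACED = 1 << 8    # probably interlaced coding
CODEC_FLAG_DC_PRED = 1 << 9       # use DC prediction
CODEC_FLAG_MV_PRED = 1 << 10      # use MV prediction
CODEC_FLAG_BINCODER = 1 << 11     # use binary coder


def decode_codec_flags(flags: int) -> dict:
    """Split the extradata flags word into named booleans."""
    return {
        "interlaced": bool(flags & CODEC_FLAG_INTERLACED),
        "use_dc_pred": bool(flags & CODEC_FLAG_DC_PRED),
        "use_mv_pred": bool(flags & CODEC_FLAG_MV_PRED),
        "use_bincoder": bool(flags & CODEC_FLAG_BINCODER),
    }


# 0xA00 has bits 9 and 11 set under the assumption above.
print(decode_codec_flags(0xA00))
```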

Frame format

16 bits - flags
if (flags & 1) {
 this is a skip frame, do nothing else
}
8 bits  - quantiser
if (flags & 0x100) {
 8 bits - altquant difference
 if (flags & 0x200) {
  read a bit plane of flags telling which quantiser each macroblock should use
 }
}
if (flags & 2) { // intra frame
 reset binary coder state
} else {
 if (flags & 0x40) {
  8 bits - num_def_mv, the number of default MVs (0-3)
  if (num_def_mv == 0) {
   8 bits signed - mv_x
   8 bits signed - mv_y
  } else {
   read num_def_mv MVs in the same format as above
   decode per-macroblock default motion vector index using arithmetic coder with top/left/topleft context
  }
 }

 if (flags & 0x400) {
  all macroblocks are inter
 } else if (use_bincoder) {
  decode intra-MB flags using binary coder with top/left/topleft/previous value context
 } else {
  decode intra-MB flags using arithmetic coder with top/left/topleft/previous value context
 }

 if (!(flags & 0x800)) {
  decode MVs using arithmetic coder with top/left/topleft/previous value context
 } else if (use_bincoder) {
  decode x component using binary coder with top/left/topleft/previous value context
  for values >= 3 read that many bits as the actual value; read sign bits for component
  decode y component using binary coder with top/left/topleft/previous value context
  for values >= 3 read that many bits as the actual value; read sign bits for component
  median-predict MVs
 } else {
  decode MV present flags using arithmetic coder with top/left/topleft/previous value context
  decode actual MVs using MV codebook and apply prediction on them if codec flags say so
 }
 add corresponding full-pel default per-macroblock MV to each halfpel block MV if applicable
}
decode Y plane using either binary coder or arithmetic coder and codebooks
decode U plane using either binary coder or arithmetic coder and codebooks
decode V plane using either binary coder or arithmetic coder and codebooks
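
The fixed-size start of the frame header (frame flags, quantiser and optional altquant difference) can be sketched as follows. This assumes byte-aligned fields and a little-endian 16-bit flags word, neither of which is confirmed above; the altquant difference is returned as a raw byte because its signedness is not specified. All names are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Optional

# Frame flag masks as used in the pseudocode above (names are illustrative).
FRAME_SKIP = 0x0001
FRAME_INTRA = 0x0002
FRAME_DEFAULT_MVS = 0x0040
FRAME_ALTQUANT = 0x0100
FRAME_ALTQUANT_MAP = 0x0200
FRAME_ALL_INTER = 0x0400
FRAME_MV_MODE = 0x0800


@dataclass
class FrameHeader:
    flags: int
    skip: bool
    intra: bool = False
    quantiser: Optional[int] = None
    altquant_diff: Optional[int] = None   # raw byte, signedness unknown
    has_altquant_map: bool = False        # bit plane follows, not parsed here
    payload_offset: int = 0               # first byte after the parsed fields


def parse_frame_header(data: bytes) -> FrameHeader:
    (flags,) = struct.unpack_from("<H", data, 0)  # little-endian is assumed
    if flags & FRAME_SKIP:
        # Skip frame: nothing else is read.
        return FrameHeader(flags=flags, skip=True, payload_offset=2)
    pos = 2
    quantiser = data[pos]
    pos += 1
    altquant_diff = None
    has_map = False
    if flags & FRAME_ALTQUANT:
        altquant_diff = data[pos]
        pos += 1
        has_map = bool(flags & FRAME_ALTQUANT_MAP)
    return FrameHeader(flags, False, bool(flags & FRAME_INTRA),
                       quantiser, altquant_diff, has_map, pos)


# Synthetic header: intra frame with altquant, quantiser 20, difference 3.
frame_hdr = parse_frame_header(bytes([0x02, 0x01, 20, 3]))
print(frame_hdr)
```

The per-macroblock quantiser bit plane, the motion data and the planes themselves depend on the entropy coders and are left out of this sketch.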

DC prediction uses gradient prediction from neighbouring intra-coded blocks.

Plane decoding with arithmetic coder and codebooks:

decode block uncoded flags using arithmetic coder and top/left/topleft context
for each 8x8 block {
 if (intra block) {
  if (!use_dc_pred) {
   read 8-bit DC
  } else if (block has no intra-block top neighbours) {
   read DC using raw DC codebook
  } else {
   read DC difference using DC difference codebook
   add predicted DC value
  }
  if (!uncoded block) {
   decode coefficients 1-63 for a block
  }
 } else if (!uncoded block) {
  decode coefficients 0-63 for a block
 }
}

In this case coefficients are decoded with a simple run-length codebook, either the inter or the intra one.
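
The three-way choice for the intra DC value in the pseudocode above can be sketched as below. The codebooks themselves are not described here, so the readers are passed in as hypothetical callables; only the selection logic is shown.

```python
from typing import Callable


def decode_intra_dc(use_dc_pred: bool,
                    has_intra_top_neighbour: bool,
                    predicted_dc: int,
                    read_raw8: Callable[[], int],
                    read_raw_dc_code: Callable[[], int],
                    read_dc_diff_code: Callable[[], int]) -> int:
    """Pick the DC source for one intra block (reader callables are stand-ins)."""
    if not use_dc_pred:
        return read_raw8()                 # plain 8-bit DC
    if not has_intra_top_neighbour:
        return read_raw_dc_code()          # raw DC codebook
    return read_dc_diff_code() + predicted_dc  # difference plus prediction


# Stub readers returning fixed values, for illustration only.
dc = decode_intra_dc(True, True, 100, lambda: 7, lambda: 8, lambda: 9)
print(dc)
```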

Plane decoding with binary coder:

for each 8x8 block {
 decode block uncoded flag for current block using top/left/topleft context
 if (intra block) {
  decode DC difference: a unary-coded length N, then N bypass bits holding the DC difference value
  add DC prediction (use 128 when it is not available)
  if (!uncoded block) {
   decode coefficients 1-63 for a block
  }
 } else if (!uncoded block) {
  decode coefficients 0-63 for a block
 }
}

Coefficients are decoded by first decoding a unary value giving the number of coefficients N and then N run-length pairs, using position-adaptive models.
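
Both the DC difference and the coefficient count rely on a unary-coded value. A minimal sketch of the "unary length, then N bypass bits" pattern follows. It assumes that the unary code is a run of 1 bins ended by a 0 bin and that bypass bits arrive most significant bit first; neither detail is given above, and sign handling is omitted. `decode_bin` and `read_bypass_bit` stand in for the real binary coder and are hypothetical.

```python
from typing import Callable


def decode_unary(decode_bin: Callable[[], int], limit: int = 32) -> int:
    """Count 1 bins until a 0 bin appears (the polarity is an assumption)."""
    n = 0
    while n < limit and decode_bin():
        n += 1
    return n


def decode_length_prefixed(decode_bin: Callable[[], int],
                           read_bypass_bit: Callable[[], int]) -> int:
    """Read a unary length N, then N bypass bits, MSB first (assumed)."""
    nbits = decode_unary(decode_bin)
    value = 0
    for _ in range(nbits):
        value = (value << 1) | read_bypass_bit()
    return value


# Stand-in bit sources: unary bins 1,1,1,0 give N = 3, then bypass bits 1,0,1.
bins = iter([1, 1, 1, 0])
bypass = iter([1, 0, 1])
value = decode_length_prefixed(lambda: next(bins), lambda: next(bypass))
print(value)
```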