# VGM Video

- Company: XVD Corporation (DigitalStream-USA)
- FourCC: VGMV (probably V2K-II), VTLP (VT codec)
- Samples: http://samples.mplayerhq.hu/internets/bha-xvd-vg2/ http://web.archive.org/web/http://xvd.bha.co.jp/download/sample.html http://samples.mplayerhq.hu/V-codecs/VTLP-VGPX/
- Binaries: http://web.archive.org/web/*/http://updater.bha.co.jp:80/XVDplusWin/* http://web.archive.org/web/*/http://updater.bha.co.jp:80/XVDfree/*

Java player: http://www.ila-ila.com/xvd-hist/sites/lab1454/eng/products/jpl_dm2.htm

There are several codecs in this family:

- VT
- Domen
- VT2k aka BigBits
- V2K-II
- XVD

## VT

The first codec. It uses variable-length codes for macroblock types, and a frame consists either of 8x8 IDCT blocks (coded in triplets, by component) or of quantised deltas.

## Domen

This codec is the real base for all subsequent video codecs, as it has most of the features present in the later ones. It uses codebooks and so-called RL-coding, a binary run-length coding. In this mode the length of a run of identical bits is read using a static codebook, and the run is then used either to decode a whole bitplane or as a flag for the current (macro)block. When decoding reaches the end of the run, the next length is read and the bit value is flipped.
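The run expansion described above can be sketched as follows. This is only an illustration: the static codebook is not reproduced, so `run_lengths` stands in for the values the decoder would read from the bitstream, and the function name and the handling of zero-length runs are assumptions.

```python
def rl_decode_bits(run_lengths, first_bit, count):
    """Expand alternating runs into `count` bits.

    `run_lengths` is a stand-in for the lengths decoded with the static
    codebook (assumption: the caller has already decoded them).
    """
    bits = []
    bit = first_bit
    runs = iter(run_lengths)
    remaining = next(runs)
    while len(bits) < count:
        while remaining == 0:
            # End of the run: read the next length and flip the bit value.
            remaining = next(runs)
            bit ^= 1
        bits.append(bit)
        remaining -= 1
    return bits


print(rl_decode_bits([3, 2, 4], 0, 9))
```

The same routine serves both uses mentioned in the text: called once for a whole bitplane, or consumed one bit at a time as a per-block flag.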

Frame format:

    4 bits - quality (used to derive quantiser, codebooks to use and maximum number of coefficients in the block)
    1 bit  - intra frame flag
    RL-coded bitplane for macroblock flags
    if not intra frame {
        1 bit - has global MV
        if has global MV {
            4 bits signed - global MV x + 15
            4 bits signed - global MV y + 15
        }
        1 bit - initial value for MV present RL map
        for each 8x8 block {
            if RL bit is set {
                decode MV value symbol using codebook
                if (sym != 420) {
                    blk.mv.x = sym / 29 - 14;
                    blk.mv.y = sym % 29 - 14;
                } else {
                    blk.mv.x = get_bits(6) - 31;
                    blk.mv.y = get_bits(6) - 31;
                }
            }
        }
    }
    RL-coded map for coded block flags
    2 bits - codebook index
    Y DCs
    2 bits - codebook index
    Y ACs 1-2
    Y ACs 3-
    UV DCs
    U ACs
    V ACs
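A small sketch of the motion vector symbol split from the frame format above. The constants 29, 14, 420 and 31 come from the description; the function name and the `get_bits` callback are hypothetical stand-ins for the real bit reader.

```python
ESCAPE_SYM = 420  # symbol that switches to two raw 6-bit fields


def domen_block_mv(sym, get_bits):
    """Turn a decoded MV symbol into an (x, y) pair.

    `get_bits(n)` is a hypothetical callback returning the next n bits
    of the stream as an unsigned integer.
    """
    if sym != ESCAPE_SYM:
        return sym // 29 - 14, sym % 29 - 14
    x = get_bits(6) - 31
    y = get_bits(6) - 31
    return x, y


raw = iter([0, 63])
print(domen_block_mv(421, None))                    # regular symbol
print(domen_block_mv(420, lambda n: next(raw)))     # escape path
```

Note that symbol 420 would otherwise map to the zero vector (420 / 29 = 14, 420 % 29 = 14), which fits its use as the escape value.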

Luma coefficients 0-2 are coded per macroblock: for each macroblock and each coefficient, a pattern is first decoded (using a codebook) that tells which of the four blocks have this coefficient coded, and then (if coded) the sign and the actual coefficient value are coded as well.
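The per-macroblock loop for luma coefficients 0-2 might look like the sketch below. The three reader callbacks are hypothetical placeholders for the codebook-based decoding, and the mapping of pattern bit N to block N is an assumption; the source only says that the pattern tells which of the four blocks are coded.

```python
def decode_mb_low_coefs(read_pattern, read_sign, read_value):
    """Return coefs[blk][c] for blocks 0-3 and coefficients 0-2."""
    coefs = [[0, 0, 0] for _ in range(4)]
    for c in range(3):
        pattern = read_pattern()          # 4-bit mask, one bit per block
        for blk in range(4):
            if pattern & (1 << blk):      # assumption: bit N <-> block N
                negative = read_sign()
                value = read_value()
                coefs[blk][c] = -value if negative else value
    return coefs


patterns = iter([0b0101, 0b0000, 0b1000])
signs = iter([0, 1, 0])
values = iter([7, 2, 9])
table = decode_mb_low_coefs(lambda: next(patterns),
                            lambda: next(signs),
                            lambda: next(values))
print(table)
```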

Luma coefficients 3-63 are coded using RL-coding to tell which blocks have the coefficient coded.

Chroma coefficients are coded in a similar way but with a simpler scheme: a unary prefix for small codes and a fixed-width bitfield for large codes.

## V2K

This version of the codec dropped RL-coding and started to use an arithmetic coder with a single static model per data type for coding frame data.

The first frame presumably contains a Run-Level map consisting of 49 entries, each in the following format:

- 1 bit - end-of-block flag
- 4 bits - run value
- 4 bits - level value
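Reading such a table could be sketched as follows. The field widths and the entry count come from the list above; the MSB-first bit order, the back-to-back packing of entries and the `BitReader` helper are assumptions made for the example.

```python
class BitReader:
    """Minimal MSB-first bit reader (bit order is an assumption)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def get_bits(self, n):
        val = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            val = (val << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return val


def read_rl_map(br, entries=49):
    """Read `entries` records of 1-bit EOB, 4-bit run and 4-bit level."""
    table = []
    for _ in range(entries):
        eob = br.get_bits(1)
        run = br.get_bits(4)
        level = br.get_bits(4)
        table.append((eob, run, level))
    return table


# 49 entries * 9 bits = 441 bits, so 56 bytes are enough.
data = bytes([0x9A, 0x80]) + bytes(54)
rl_map = read_rl_map(BitReader(data))
print(rl_map[0], len(rl_map))
```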

Frame format:

    8 bits - flags
    if (flags & 4) {
        8 bits - fade speed
    }
    if (flags & 1) {
        skip frame - fade the previous frame if needed, do nothing else
    }
    8 bits - quantiser
    if (flags & 0x10) {
        8 bits - altquant difference
        if (flags & 0x20) {
            read bit plane with flags telling which quantiser macroblock should use
        }
    }
    if (flags & 2) {
        // intra frame
    } else {
        // inter frame
        if (flags & 8) {
            8 bits signed - global MV x
            8 bits signed - global MV y
        }
        if (!(flags & 0x40)) {
            read MB intra flags bitplane
        }
        for all macroblocks {
            if not intra MB {
                decode compound MV value using arithmetic coder
                mv.x = val / (radius * 4 + 1) - radius * 2;
                mv.y = val % (radius * 4 + 1) - radius * 2;
                (radius is coded in the codec extradata and most likely it is 1;
                 the y component is presumably the remainder, as in Domen)
            }
        }
    }
    decode Y plane blocks
    decode U plane blocks
    decode V plane blocks
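The compound MV split can be illustrated with a short sketch. It assumes that the x component is the quotient and the y component the remainder, by analogy with the Domen symbol split; the function name is hypothetical and the default radius of 1 follows the "most likely" note above.

```python
def v2k_split_mv(val, radius=1):
    """Split a compound MV value into (x, y).

    Assumption: x is the quotient and y the remainder of the division
    by the side length (radius * 4 + 1).
    """
    side = radius * 4 + 1
    return val // side - radius * 2, val % side - radius * 2


for val in (0, 12, 24):
    print(val, v2k_split_mv(val))
```

With `radius = 1` each component covers the range -2..2, so the compound value has 25 possible states and the middle one (12) is the zero vector.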

Block plane decoding:

    for each 8x8 block {
        coded = ac_get_sym(2);
        if block is intra {
            blk[0] = ac_get_sym(256);
            idx = 1;
        } else {
            idx = 0;
        }
        if (coded) {
            while (idx < 64) {
                sym = ac_get_mdl(COEF_MODEL);
                if (sym < 49) {
                    level = RL_MAP[sym].level;
                    run   = RL_MAP[sym].run;
                    eob   = RL_MAP[sym].eob;
                    if (ac_get_sym(2))
                        level = -level;
                } else {
                    level = ac_get_sym(254) - 127;
                    if (level >= 0)
                        level++;
                    run = ac_get_sym(64);
                    eob = ac_get_sym(2);
                }
                if (level > 0)
                    level *= quant * 2 + 1;
                else
                    level *= quant * 2 - 1;
                idx += run;
                blk[zigzag[idx++]] = level;
                if (eob)
                    break;
            }
        }
    }
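The dequantisation and coefficient placement from the loop above can be sketched separately from the arithmetic coder. In this example the (level, run, eob) triples are assumed to be already decoded, the scan table is supplied by the caller (the codec's actual zigzag table is not reproduced here), and the end-of-block flag is assumed to terminate the block after the current coefficient.

```python
def dequant(level, quant):
    """Scale a level; positive and other levels use different factors."""
    if level > 0:
        return level * (quant * 2 + 1)
    return level * (quant * 2 - 1)


def place_coefs(triples, quant, scan, start=0):
    """Fill a 64-entry block from (level, run, eob) triples.

    `start` is 1 for intra blocks (DC is coded separately) and 0 otherwise.
    """
    blk = [0] * 64
    idx = start
    for level, run, eob in triples:
        idx += run
        if idx >= 64:
            break  # guard against damaged data
        blk[scan[idx]] = dequant(level, quant)
        idx += 1
        if eob:
            break  # assumption: EOB marks the last coded coefficient
    return blk


identity_scan = list(range(64))  # placeholder, not the real zigzag order
blk = place_coefs([(3, 0, 0), (-2, 2, 1), (5, 0, 0)], 4, identity_scan)
print(blk[:4])
```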

## V2K-II

This version of the codec adds wavelet coding as an alternative coding mode for intra frames and uses context-adaptive arithmetic coding (i.e. usually the top, left and top-left elements are used to select a model for decoding, and then the same values along with the newly decoded value are used to derive the actual output value). Alternatively, codebooks can be used to code coefficients.
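To illustrate the idea of neighbour-based model selection, here is a purely hypothetical sketch for binary flags. The way the three neighbours are combined into a model index and the treatment of missing neighbours as 0 are assumptions; the actual codec may combine them differently.

```python
def neighbour_context(flags, x, y):
    """Pick a model index 0-7 from already decoded neighbour flags.

    `flags` is a 2D list indexed as flags[y][x]; only positions above
    and to the left of (x, y) are read.
    """
    def at(cx, cy):
        if cx < 0 or cy < 0:
            return 0  # assumption: neighbours outside the picture count as 0
        return flags[cy][cx]

    top = at(x, y - 1)
    left = at(x - 1, y)
    topleft = at(x - 1, y - 1)
    return top | (left << 1) | (topleft << 2)


decoded = [[1, 0],
           [1, 0]]
print(neighbour_context(decoded, 1, 1))
```

A decoder built this way would keep one adaptive model per index and use the returned value to choose the model before decoding the flag at (x, y).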

Frame format:

    16 bits - flags
    if (flags & 8) {
        8 bits - fade speed
    }
    if (flags & 1) {
        skip frame - fade the previous frame if needed, do nothing else
    }
    if (flags & 0x20) {
        4 bits - unknown
        4 bits - unknown
    }
    8 bits - quantiser
    if (flags & 0x100) {
        8 bits - altquant difference
        if (flags & 0x200) {
            read bit plane with flags telling which quantiser macroblock should use
        }
    }
    if (flags & 2) {
        // intra frame
        reset state
    } else {
        if (flags & 0x40) {
            8 bits - number of default MVs (0-3)
            if (num_def_mv == 0) {
                8 bits signed - mv_x
                8 bits signed - mv_y
            } else {
                read num_def_mv MVs in the same format as above
                decode per-macroblock default motion vector index using arithmetic coder with top/left/topleft context
            }
        }
        if (flags & 0x400) {
            all macroblocks are inter
        } else {
            decode intra-MB flags using arithmetic coder with top/left/topleft/previous value context
        }
        decode MVs using arithmetic coder with top/left/topleft/previous value context
        add corresponding full-pel default per-macroblock MV to each halfpel block MV if applicable
    }
    if (!(flags & 4)) {
        decode Y blocks
        decode U blocks
        decode V blocks
    } else {
        decode wavelet picture
    }
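The flag bits used in the frame format above can be collected in a small table. The masks come from the description; the names are made up for this sketch. The helper for adding the default MV assumes that a full-pel vector is converted to half-pel units by doubling, which the source does not state explicitly.

```python
# Hypothetical names for the V2K-II frame flag bits listed above.
V2K2_FLAGS = {
    "skip": 0x001,
    "intra": 0x002,
    "wavelet": 0x004,
    "fade": 0x008,
    "unknown_nibbles": 0x020,
    "default_mvs": 0x040,
    "altquant": 0x100,
    "altquant_map": 0x200,
    "all_inter": 0x400,
}


def parse_v2k2_flags(flags):
    """Return the set of flag names present in a 16-bit flags word."""
    return {name for name, mask in V2K2_FLAGS.items() if flags & mask}


def add_default_mv(block_mv, default_mv):
    """Add a full-pel default MV to a half-pel block MV.

    Assumption: one full-pel step equals two half-pel steps.
    """
    return (block_mv[0] + 2 * default_mv[0],
            block_mv[1] + 2 * default_mv[1])


print(sorted(parse_v2k2_flags(0x146)))
print(add_default_mv((1, -3), (2, 0)))
```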

Plane decoding with arithmetic coder and codebooks:

    decode block uncoded flags using arithmetic coder and top/left/topleft context
    for each 8x8 block {
        if (intra block) {
            read 8-bit DC
            if (!uncoded block) {
                decode coefficients 1-63 for a block
            }
        } else if (!uncoded block) {
            decode coefficients 0-63 for a block
        }
    }

Wavelet decoding seems to be based on the LGT 5/3 wavelet, discarding the HH band, and coding data in bitslicing mode (i.e. all top bits first, then next-to-top bits, and so on) using binary runs very similar to RL-coding in `Domen`.
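The bitslicing order mentioned above can be illustrated by rebuilding values from bit planes that arrive most significant plane first. This sketch only covers the plane accumulation for unsigned magnitudes; how signs are coded and how the binary runs produce each plane are not covered, and the function name is hypothetical.

```python
def assemble_bitslices(planes):
    """Rebuild unsigned values from bit planes given top plane first.

    Each plane is a list with one bit per coefficient.
    """
    values = [0] * len(planes[0])
    for plane in planes:                 # most significant plane first
        for i, bit in enumerate(plane):
            values[i] = (values[i] << 1) | bit
    return values


planes = [
    [1, 0, 0],   # top bits
    [0, 1, 0],   # next-to-top bits
    [1, 1, 0],   # lowest bits
]
print(assemble_bitslices(planes))
```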

## XVD

This is the last instalment in the VGM Video series. The codec is now DCT-only and uses either a context-adaptive binary coder, an arithmetic coder, or a mix of arithmetic coder and variable-length codes. There is still one halfpel-precision motion vector per 8x8 block.

### Extradata format

- 4 bytes - width
- 4 bytes - height
- 4 bytes - bitrate?
- 4 bytes - FPS
- 4 bytes - edge size (always 4?)
- 4 bytes - MV radius (always 1?)
- 4 bytes - flags

Flags meaning:

- bit 8 - probably interlaced coding
- bit 9 - use DC prediction
- bit 10 - use MV prediction
- bit 11 - use binary coder
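A minimal parsing sketch for the extradata layout and flag bits listed above. The byte order is not given in the description, so little-endian is an assumption here, as is reading "bit N" as the mask `1 << N`; the dictionary keys are made-up names.

```python
import struct


def parse_xvd_extradata(data):
    """Unpack the seven 32-bit fields of the XVD extradata.

    Assumption: fields are little-endian unsigned integers.
    """
    (width, height, bitrate, fps,
     edge, mv_radius, flags) = struct.unpack_from("<7I", data)
    return {
        "width": width,
        "height": height,
        "bitrate": bitrate,          # meaning uncertain in the source
        "fps": fps,
        "edge_size": edge,
        "mv_radius": mv_radius,
        "interlaced": bool(flags & (1 << 8)),    # "probably" per the source
        "use_dc_pred": bool(flags & (1 << 9)),
        "use_mv_pred": bool(flags & (1 << 10)),
        "use_bincoder": bool(flags & (1 << 11)),
    }


sample = struct.pack("<7I", 320, 240, 0, 25, 4, 1, 0xA00)
info = parse_xvd_extradata(sample)
print(info["width"], info["height"], info["use_dc_pred"], info["use_bincoder"])
```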

### Frame format

    16 bits - flags
    if (flags & 1) {
        this is skip frame, do nothing else
    }
    8 bits - quantiser
    if (flags & 0x100) {
        8 bits - altquant difference
        if (flags & 0x200) {
            read bit plane with flags telling which quantiser macroblock should use
        }
    }
    if (flags & 2) {
        // intra frame
        reset binary coder state
    } else {
        if (flags & 0x40) {
            8 bits - number of default MVs (0-3)
            if (num_def_mv == 0) {
                8 bits signed - mv_x
                8 bits signed - mv_y
            } else {
                read num_def_mv MVs in the same format as above
                decode per-macroblock default motion vector index using arithmetic coder with top/left/topleft context
            }
        }
        if (flags & 0x400) {
            all macroblocks are inter
        } else if (use_bincoder) {
            decode intra-MB flags using binary coder with top/left/topleft/previous value context
        } else {
            decode intra-MB flags using arithmetic coder with top/left/topleft/previous value context
        }
        if (!(flags & 0x800)) {
            decode MVs using arithmetic coder with top/left/topleft/previous value context
        } else if (use_bincoder) {
            decode x component using binary coder with top/left/topleft/previous value context
            for values >= 3 read that amount of bits as actual value; read sign bits for component
            decode y component using binary coder with top/left/topleft/previous value context
            for values >= 3 read that amount of bits as actual value; read sign bits for component
            median-predict MVs
        } else {
            decode MV present flags using arithmetic coder with top/left/topleft/previous value context
            decode actual MVs using MV codebook and apply prediction on them if codec flags say so
        }
        add corresponding full-pel default per-macroblock MV to each halfpel block MV if applicable
    }
    decode Y plane using either binary coder or arithmetic coder and codebooks
    decode U plane using either binary coder or arithmetic coder and codebooks
    decode V plane using either binary coder or arithmetic coder and codebooks
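The "median-predict MVs" step can be sketched as a component-wise median of three candidate vectors added to the decoded difference. The description does not say which neighbours serve as candidates, so the function simply takes three vectors from the caller; all names are hypothetical.

```python
def median3(a, b, c):
    """Median of three integers."""
    return sorted((a, b, c))[1]


def median_predict_mv(delta, cand0, cand1, cand2):
    """Add a component-wise median prediction to a decoded MV difference.

    Which neighbouring blocks supply the candidates is not stated in the
    description; the caller chooses them.
    """
    pred_x = median3(cand0[0], cand1[0], cand2[0])
    pred_y = median3(cand0[1], cand1[1], cand2[1])
    return delta[0] + pred_x, delta[1] + pred_y


print(median_predict_mv((1, -1), (0, 0), (2, 4), (4, -2)))
```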

DC prediction uses gradient prediction from neighbouring intra-coded blocks.

Plane decoding with arithmetic coder and codebooks:

    decode block uncoded flags using arithmetic coder and top/left/topleft context
    for each 8x8 block {
        if (intra block) {
            if (!use_dc_pred) {
                read 8-bit DC
            } else if (block has no intra-block top neighbours) {
                read DC using raw DC codebook
            } else {
                read DC difference using DC difference codebook
                add predicted DC value
            }
            if (!uncoded block) {
                decode coefficients 1-63 for a block
            }
        } else if (!uncoded block) {
            decode coefficients 0-63 for a block
        }
    }

In this case coefficient decoding is done with a simple run-length codebook, one for inter and one for intra blocks.

Plane decoding with binary coder:

    for each 8x8 block {
        decode block uncoded flag for current block using top/left/topleft context
        if (intra block) {
            decode DC difference by unary coding for actual value length and N bypass bits for the DC difference value
            add DC prediction (use 128 when it is not available)
            if (!uncoded block) {
                decode coefficients 1-63 for a block
            }
        } else if (!uncoded) {
            decode coefficients 0-63 for a block
        }
    }

Coefficient decoding is done by decoding a unary value for the number of coefficients, followed by that many run-length pairs using position-adaptive models.