There is also a newer format from China called Chinese AVS, which is not the format described on this page.

This page is based on the document 'Description of the AVS Format' by Mike Melanson and Vladimir "VAG" Gneushev found at http://multimedia.cx/avs-format.txt.

AVS is a full motion video (FMV) file format that is used in the game Creature Shock developed by Argonaut Games and published by Virgin Interactive in 1994. The game is an FMV-based shooting game that relies on this file format for much of its graphics.

File Format

All multi-byte numbers are little-endian.

An AVS file starts with a header and is followed by a series of frames. The file header has the following layout:

bytes 0-1    file signature - 0x77 0x57
bytes 2-3    size of main header (should be 0x0010)
bytes 4-5    frame width
bytes 6-7    frame height
bytes 8-9    color depth? (always 8)
bytes 10-11  frames per second
bytes 12-15  frame count
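
As an illustration of the header layout above, here is a minimal Python sketch that unpacks the 16 header bytes. The names AvsHeader and parse_avs_header, and the field names, are made up for this example and do not come from the original document.

```python
import struct
from collections import namedtuple

# Field names are illustrative; only the byte layout comes from the table above.
AvsHeader = namedtuple(
    "AvsHeader", "signature header_size width height depth fps frame_count"
)


def parse_avs_header(data):
    """Unpack the 16-byte little-endian AVS file header."""
    if len(data) < 16:
        raise ValueError("need at least 16 bytes for the file header")
    # 2-byte signature, five 16-bit fields, one 32-bit frame count.
    header = AvsHeader(*struct.unpack_from("<2sHHHHHI", data, 0))
    if header.signature != b"\x77\x57":
        raise ValueError("bad AVS signature")
    return header
```

A caller would typically read the first 16 bytes of the file, pass them to parse_avs_header, and then start reading frames at the offset given by header_size.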

After the file header is a series of frames. Each frame has the following layout:

bytes 0-1    data present; if this is zero then the file is finished;
             otherwise, there is data in the payload
bytes 2-3    length of entire frame, including this 4-byte preamble
bytes 4..    blocks 1 through n, stored back to back

Each block has the following layout:

bytes 0-1    block type
bytes 2-3    block length (including this 4-byte preamble)
bytes 4..    block payload

Keep processing blocks until the entire frame is depleted. These are the various block types:

0x0100: video intraframe (keyframe) with 3x3 vectors
0x0101: video interframe with 3x3 vectors
0x0102: video interframe with 2x2 vectors
0x0103: video interframe with 2x3 vectors
0x0200: audio frame
0x0300: palette
0x0400: game-related data; disregard
0x0401: game-related data; disregard
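
The frame and block layouts above can be walked with a small loop. The following Python sketch is one possible way to do it; iter_blocks and BLOCK_NAMES are hypothetical names, and the handling of a malformed block length (stop processing) is an assumption rather than documented behavior.

```python
import struct

# Human-readable labels for the block types listed above.
BLOCK_NAMES = {
    0x0100: "video intraframe, 3x3 vectors",
    0x0101: "video interframe, 3x3 vectors",
    0x0102: "video interframe, 2x2 vectors",
    0x0103: "video interframe, 2x3 vectors",
    0x0200: "audio frame",
    0x0300: "palette",
    0x0400: "game-related data",
    0x0401: "game-related data",
}


def iter_blocks(frame):
    """Yield (block_type, payload) for each block in one frame.

    `frame` holds the whole frame, including its 4-byte preamble.
    """
    data_present, frame_length = struct.unpack_from("<HH", frame, 0)
    if data_present == 0:
        return  # a zero here marks the end of the file
    pos = 4
    end = min(frame_length, len(frame))
    while pos + 4 <= end:
        block_type, block_length = struct.unpack_from("<HH", frame, pos)
        if block_length < 4:
            break  # assumption: treat an impossible length as end of frame
        # Block length includes the 4-byte block preamble.
        yield block_type, frame[pos + 4 : pos + block_length]
        pos += block_length
```

A decoder would dispatch on block_type, skipping 0x0400 and 0x0401 as the list above suggests.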

The next sections describe the video and audio formats in detail.

Palette Format

Block type 0x0300 denotes a palette chunk. The decoder maintains a 256-element array of palette entries. Each palette entry consists of a red value, a green value, and a blue value. Each of these R, G, and B components is a 6-bit VGA value with a range limited to 0..63. This is important to understand if the video will be decoded and rendered in more common formats that expect 8-bit components (0..255).

Palette data has the following format:

bytes 0-1    index of first palette entry to replace
bytes 2-3    number of palette entries to replace
bytes 4..    RGB byte triplets

For example, if the index of the first entry is 0x0001 and the number of palette entries to replace is 0x0077 then the decoder would iterate through palette indices 1..0x77 and read the RGB entries out of the encoded palette.
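
A palette update along these lines might look like the Python sketch below. The function names are hypothetical. The 6-bit to 8-bit expansion (replicating the top bits into the low bits) is a common choice and an assumption here, not something the format mandates; clamping the entry count to the table size and to the available payload is also an assumption.

```python
import struct


def scale6to8(value):
    """Expand a 6-bit VGA component (0..63) to 8 bits (0..255)."""
    value &= 0x3F
    return (value << 2) | (value >> 4)


def apply_palette_block(palette, payload):
    """Update a 256-entry list of (r, g, b) tuples from a 0x0300 block payload."""
    first, count = struct.unpack_from("<HH", payload, 0)
    # Assumption: never write past entry 255 or read past the payload.
    count = min(count, 256 - first, (len(payload) - 4) // 3)
    for i in range(count):
        r, g, b = payload[4 + 3 * i : 7 + 3 * i]
        palette[first + i] = (scale6to8(r), scale6to8(g), scale6to8(b))
```

With a first index of 1 and a count of 0x77, the loop writes entries 1 through 0x77, matching the example above.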

Video Format

AVS uses a very simple vector quantization video coding scheme. Possible vector sizes are 2x2, 2x3, and 3x3, depending on block type, and each video block uses only one vector size. The codec was designed to operate on the standard IBM VGA 320x200 256-color video mode. However, the videos always appear to be encoded at a resolution of 318x198. This is the total number of vectors comprising a frame for each vector size:

vector size    vectors in a 318x198 frame
-----------    --------------------------
    2x2                  15741
    2x3                  10494
    3x3                   6996
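
The counts in this table follow from dividing the frame dimensions by the vector dimensions. A short Python check, with vectors_per_frame as a hypothetical helper name:

```python
WIDTH, HEIGHT = 318, 198  # encoded resolution stated above


def vectors_per_frame(vector_width, vector_height):
    """Number of whole vectors that tile a 318x198 frame."""
    return (WIDTH // vector_width) * (HEIGHT // vector_height)


for size in ((2, 2), (2, 3), (3, 3)):
    print(size, vectors_per_frame(*size))
```

Both 318 and 198 are divisible by 2 and by 3, so every vector size tiles the frame exactly.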

An intraframe is always painted with 3x3 vectors. An encoded intraframe consists of a 256-entry vector codebook followed by a vector map instructing the decoder where to place the vectors in the final frame. As a result, an encoded intraframe always has the same size: 256 x 9 = 2304 codebook bytes plus 6996 one-byte indices, for a total of 9300 (0x2454) bytes. This is the intraframe layout:

bytes 0..2303     256 9-pixel vectors
bytes 2304..9299  6996 codebook indices
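
The following Python sketch shows one way to paint an intraframe from this layout. The name decode_intraframe is hypothetical. Two details are assumptions not stated above: the pixels inside each 9-byte vector are taken to be stored row by row, and the codebook indices are taken to fill the frame left to right, top to bottom.

```python
def decode_intraframe(payload, width=318, height=198):
    """Return a bytearray of palette indices for one 3x3 intraframe."""
    vw = vh = 3
    codebook_size = 256 * vw * vh  # 2304 bytes
    cols, rows = width // vw, height // vh
    if len(payload) < codebook_size + cols * rows:
        raise ValueError("intraframe payload too short")
    frame = bytearray(width * height)
    index_pos = codebook_size
    for vy in range(rows):
        for vx in range(cols):
            base = payload[index_pos] * vw * vh
            index_pos += 1
            # Assumption: vector pixels are stored row-major.
            for r in range(vh):
                dst = (vy * vh + r) * width + vx * vw
                frame[dst : dst + vw] = payload[base + r * vw : base + (r + 1) * vw]
    return frame
```

The result is an 8-bit indexed image; the current palette is still needed to turn it into RGB.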

An interframe is painted with varying vector sizes depending on the block type:

type 0x0101    3x3 vectors
type 0x0102    2x2 vectors
type 0x0103    2x3 vectors

An encoded interframe has the following layout:

vector codebook
vector change bitmap
codebook indices

The vector codebook consists of 256 pixel vectors. The size of each vector (either 4, 6, or 9 bytes) depends on which vector size the current interframe type uses. The size of the vector change bitmap in bytes, using integer division throughout, is defined as:

size_of_change_bitmap = ((319 / vector_width + 7) / 8) * 
    (199 / vector_height)
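
As a worked example of this formula, the Python sketch below evaluates it for each interframe type. The helper name change_bitmap_size is hypothetical, and reading "2x3" as width 2 by height 3 is an assumption.

```python
def change_bitmap_size(vector_width, vector_height):
    """Bytes in the change bitmap: whole bytes per vector row, times rows."""
    bytes_per_row = (319 // vector_width + 7) // 8
    vector_rows = 199 // vector_height
    return bytes_per_row * vector_rows


for size in ((3, 3), (2, 2), (2, 3)):
    print(size, change_bitmap_size(*size))
```

Note that the formula rounds each row of vectors up to a whole number of bytes, so the bitmap is slightly larger than one bit per vector.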

The interframe decoding algorithm is:

initialize pointers to the codebook, change bitmap, and indices
read the next byte from the change bitmap
for each vector position in image, left -> right, top -> bottom
  if bit 7 of change byte is 1 then
    read next codebook index from index portion of stream
    copy the vector at codebook[index] into the current vector position
  if bit 7 of change byte is 0 then
    vector is unchanged from previous frame
  shift the change byte left by 1
  if the change byte has been exhausted (shifted 8 times)
    read the next byte from the change bitmap
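
The algorithm above can be sketched in Python as follows. The name decode_interframe and its parameters are hypothetical. This sketch makes three assumptions that go beyond the text: vector pixels are stored row by row, "2x3" means width 2 by height 3, and each row of vectors starts on a fresh change-bitmap byte. The last assumption is suggested by the bitmap size formula, which rounds every row up to whole bytes; the pseudocode as written reads bits continuously, so a literal implementation would use one running bit counter instead.

```python
def decode_interframe(payload, prev_frame, vw, vh, width=318, height=198):
    """Return a new frame built from prev_frame and one interframe payload."""
    cols, rows = width // vw, height // vh
    vec_size = vw * vh
    codebook_size = 256 * vec_size
    row_bytes = (cols + 7) // 8  # assumption: bitmap rows are byte-aligned
    bitmap_pos = codebook_size
    index_pos = bitmap_pos + row_bytes * rows
    frame = bytearray(prev_frame)  # unchanged vectors carry over
    for vy in range(rows):
        for vx in range(cols):
            change = payload[bitmap_pos + vy * row_bytes + vx // 8]
            # Bits are consumed from bit 7 down to bit 0.
            if change & (0x80 >> (vx % 8)):
                base = payload[index_pos] * vec_size
                index_pos += 1
                for r in range(vh):
                    dst = (vy * vh + r) * width + vx * vw
                    frame[dst : dst + vw] = payload[base + r * vw : base + (r + 1) * vw]
    return frame
```

For block type 0x0102 a caller would pass vw=2, vh=2; for 0x0103, vw=2, vh=3 under the assumption above; for 0x0101, vw=3, vh=3.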

Audio Format

The AVS audio format is actually taken from the Creative VOC format. A VOC chunk has the following layout:

byte  0    chunk type (should be 1)
bytes 1-3  chunk length (including 4-byte payload but not next 2 bytes)
byte  4    frequency divisor
byte  5    data packing field (should be 0 which indicates unpacked)
bytes 6..  audio data

The caveat when processing audio blocks (type 0x0200) is that the block may be a complete VOC chunk (with a header and complete data), or may be the beginning of a VOC chunk (with header and some data), or a continuation of a VOC chunk started in a previous frame, possibly along with the start of a new VOC chunk. For example, the first audio block below contains an entire VOC chunk, the second contains the beginning of a new chunk, and the third contains the rest of that chunk:

 block 0x200, length = 8196
  voc-header, chunk_length = 8188
  samples -> 8188 samples
 block 0x200, length = 50
  voc-header, chunk_length = 90
  samples -> to the end of the subblock (50 - 6 VOC header bytes)
 block 0x200, length = 50
  samples -> the rest of the data as counted by previous chunk_length 
    (90 - (50-6))

The frequency divisor, which should ideally remain constant through playback, is an unsigned byte that is fed directly into the classic Creative Sound Blaster to initialize the DAC for digital audio output. The formula to calculate the playback sample rate is:

sample_rate = 1000000 / (256 - frequency_divisor)

A common divisor is 0xA6 (166), which yields a sample rate of 1000000 / 90, or about 11111 Hz.
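
The sample rate formula can be checked with a few lines of Python. The function name voc_sample_rate is hypothetical, and truncating the result to an integer is a choice made for this sketch.

```python
def voc_sample_rate(frequency_divisor):
    """Playback rate in Hz for a Sound Blaster frequency divisor byte."""
    if not 0 <= frequency_divisor <= 255:
        raise ValueError("frequency divisor must fit in an unsigned byte")
    return 1000000 // (256 - frequency_divisor)


print(voc_sample_rate(0xA6))
```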

The audio data is single-channel, unsigned, 8-bit, PCM audio data.