# VC-1 Data Structures

Part of Understanding VC-1

This page is a discussion of the various data structures and constant enumerations employed in the SMPTE Reference Decoder (SRD) VC-1 reference implementation. These are mostly found in the file vc1types.h.

## Macroblocks, Blocks, and Sub-blocks

A macroblock embodies a 16x16 block of pixels in a YUV 4:2:0 colorspace. A macroblock consists of 6 blocks: 4 8x8 Y blocks, 1 U block, and 1 V block. Further, these blocks may be divided into sub-blocks of sizes 8x4, 4x8, or 4x4.

## Intra Block

An intra block data structure requires the following information:

• number of non-zero AC coefficients
• quantized DC coefficient for prediction
• 7 quantized AC coefficients along the top row of the block, used for prediction
• 7 quantized AC coefficients along the left side of the block, used for prediction
• 16 samples representing the bottom 2 8-pixel rows of the block, maintained for overlap smoothing filters
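
The list above can be sketched as a C structure. This is only an illustration: the names `IntraBlockSketch` and `intra_save_predictors`, the field widths, and the raster-order coefficient layout are assumptions, not the SRD's actual declarations.

```c
#include <stdint.h>

/* Hypothetical layout; names and widths are illustrative, not the SRD's. */
typedef struct {
    uint8_t num_nonzero_ac;  /* number of non-zero AC coefficients */
    int16_t dc;              /* quantized DC coefficient, kept for prediction */
    int16_t ac_top[7];       /* 7 quantized AC coefficients from the top row */
    int16_t ac_left[7];      /* 7 quantized AC coefficients from the left column */
    int16_t overlap[16];     /* bottom 2 rows of 8 samples, for overlap smoothing */
} IntraBlockSketch;

/* Save the coefficients needed for prediction from an 8x8 block of
 * quantized coefficients stored in raster (row-major) order.
 * Index 0 is the DC term; it is stored separately from the AC arrays. */
static void intra_save_predictors(IntraBlockSketch *b, const int16_t coef[64])
{
    b->dc = coef[0];
    for (int i = 1; i < 8; i++) {
        b->ac_top[i - 1]  = coef[i];      /* first row, columns 1..7 */
        b->ac_left[i - 1] = coef[i * 8];  /* first column, rows 1..7 */
    }
}
```

Only 7 coefficients per edge are kept because the eighth position on each edge is the DC term, which has its own field.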

## Motion Vector

This is the information maintained for an individual motion vector:

• X offset
• Y offset
• a flag indicating whether the vector pertains to the top or bottom field of interlaced video

The (X, Y) vector is relative to the top-left coordinate of a block. The fractional pel resolution of the vectors depends on the motion vector coding mode.
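
As a rough sketch of what a fractional-pel vector implies, the following splits one vector component into a whole-pel offset and a fractional remainder. The structure name, field widths, and the `frac_bits` parameter (1 for half-pel, 2 for quarter-pel) are assumptions made for illustration; the SRD may store and interpret these differently.

```c
#include <stdint.h>

/* Hypothetical motion vector record; names and widths are illustrative. */
typedef struct {
    int16_t x;             /* X offset, in fractional-pel units */
    int16_t y;             /* Y offset, in fractional-pel units */
    uint8_t bottom_field;  /* 0 = top field, 1 = bottom field */
} MotionVectorSketch;

/* Split a component into whole pels and a non-negative fraction.
 * frac_bits = 1 means half-pel units, frac_bits = 2 means quarter-pel.
 * Floor division is used so the fraction is always in [0, unit). */
static void mv_split(int v, int frac_bits, int *whole, int *frac)
{
    int unit = 1 << frac_bits;
    int q = v / unit;
    int r = v % unit;
    if (r < 0) {   /* C division truncates toward zero; convert to floor */
        r += unit;
        q -= 1;
    }
    *whole = q;
    *frac = r;
}
```

For example, a quarter-pel component of -5 splits into a whole offset of -2 and a fraction of 3 quarter-pels, since -2 * 4 + 3 = -5.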

## Motion

This structure encapsulates motion vectors along with mode information.

• prediction mode, as enumerated in Hybrid Prediction Modes
• motion vector data structure
• differential motion vector data structure, represented in quarter-pel units

The SRD contains the following note accompanying this data structure:

```
/*
 * If Two Reference Images (NUMREF=1) then:
 *      Y=2*(YValue)+PredFlag
 *      PredFlag: 0=dominant 1=non-dominant
 */
```
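
The packing described in that comment can be sketched as a pair of helper functions. The function names are hypothetical; only the formula `Y = 2*YValue + PredFlag` comes from the SRD comment.

```c
/* Pack a Y value and a prediction flag as described in the SRD note:
 * Y = 2 * YValue + PredFlag, where PredFlag is 0 (dominant) or
 * 1 (non-dominant). Function names are illustrative. */
static int mv_pack_y(int y_value, int pred_flag)
{
    return 2 * y_value + (pred_flag & 1);
}

/* Recover both parts. The low bit is the flag; subtracting it first
 * makes the division exact, so negative Y values unpack correctly. */
static void mv_unpack_y(int y, int *y_value, int *pred_flag)
{
    *pred_flag = y & 1;
    *y_value = (y - *pred_flag) / 2;
}
```

Subtracting the flag before dividing avoids relying on how C rounds negative odd numbers.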

## Motion Vector History

This data structure consists of an array of 4 motion vector data structures which stores the motion vector history for 4 Y blocks, used for direct mode.

## Inter Block

An inter block data structure requires the following information:

• an array of 4 integers representing the number of non-zero coefficients (DC and AC together) for up to 4 sub-blocks in the block
• 2 motion structures, 1 for backward prediction and 1 for forward prediction

## Block

This is the information that the SRD maintains for an individual block:

• block type, as defined in the Block Types section
• a flag indicating whether there are non-zero AC coefficients for an intra block, or non-zero DC or AC coefficients for an inter block
• a C union that contains either an intra or inter block data structure
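
A block that is either intra or inter maps naturally onto a tagged union. The sketch below assumes hypothetical type names and a hypothetical set of block type values; the real enumeration lives in the Block Types section and in vc1types.h.

```c
#include <stdint.h>

/* All names below are illustrative, not the SRD's declarations. */
typedef struct {
    uint8_t num_nonzero_ac;
    int16_t dc;
    int16_t ac_top[7];
    int16_t ac_left[7];
    int16_t overlap[16];
} IntraSketch;

typedef struct { int16_t x, y; uint8_t bottom_field; } MvSketch;
typedef struct { int mode; MvSketch mv; MvSketch dmv; } MotionSketch;

typedef struct {
    int num_coefs[4];        /* non-zero coefficients per sub-block */
    MotionSketch motion[2];  /* one backward, one forward */
} InterSketch;

/* Hypothetical block types: one intra type plus inter sub-block splits. */
typedef enum {
    BLK_INTRA, BLK_INTER_8X8, BLK_INTER_8X4, BLK_INTER_4X8, BLK_INTER_4X4
} BlockTypeSketch;

typedef struct {
    BlockTypeSketch type;  /* selects which union member is valid */
    uint8_t coded;         /* non-zero coefficients present */
    union {
        IntraSketch intra;
        InterSketch inter;
    } u;
} BlockSketch;

/* Count non-zero coefficients: AC only for intra blocks,
 * DC and AC summed over the sub-blocks for inter blocks. */
static int block_coef_count(const BlockSketch *b)
{
    if (b->type == BLK_INTRA)
        return b->u.intra.num_nonzero_ac;
    int total = 0;
    for (int i = 0; i < 4; i++)
        total += b->u.inter.num_coefs[i];
    return total;
}
```

The block type acts as the discriminator, so code must check it before touching either union member.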

## Quantizer

This data structure maintains information about quantization parameters.

• quantizer step, range 0..31
• quantizer half-step, either 0 or 1
• a flag indicating whether the quantization is uniform or non-uniform

## Macroblock Properties

These properties are associated with various macroblock coding options:

• a block is intra-coded, or has 1, 2, or 4 motion vectors associated
• a block predicts backwards from a previous frame, from a forward frame, from both a backward and a forward frame, or uses "direct" mode
• a flag that indicates whether MVs apply to interlaced fields
• a flag that indicates that "bottom field different direction to top"
• a flag that indicates "field transform"

## Macroblock

The SRD maintains the following information about an individual macroblock:

• macroblock type: this is a combination of properties enumerated in Macroblock Properties
• AC prediction status as enumerated in AC Prediction
• block type as enumerated in Block Types; might be "any"
• a flag indicating whether the overlap filter is active for this macroblock
• a flag indicating whether the macroblock is motion predicted only (presumably, this means that a predicted block is formed but no other transform/addition occurs for the MB)
• a 6-bit flag vector that indicates which of the 6 constituent blocks in the MB are coded
• a 4-bit flag vector that indicates the motion vector block pattern; these flags are set if the differential MV for a respective Y block is not 0
• quantizer data structure for the macroblock
• 6 block structures
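
The two flag vectors can be queried with simple bit tests. The bit ordering used here (block 0 in the most significant of the 6 or 4 bits) and the function names are assumptions of this sketch, not something the list above specifies.

```c
/* Assumed ordering: block 0 is the most significant of the 6 bits,
 * block 5 the least significant. This ordering is an assumption. */
static int mb_block_is_coded(unsigned coded_pattern, int block)
{
    return (int)((coded_pattern >> (5 - block)) & 1u);
}

/* Same idea for the 4-bit motion vector block pattern: a set bit means
 * the differential MV for that Y block is not 0. */
static int mb_mv_is_nonzero(unsigned mv_pattern, int y_block)
{
    return (int)((mv_pattern >> (3 - y_block)) & 1u);
}

/* Count how many of the 6 blocks in the macroblock are coded. */
static int mb_count_coded(unsigned coded_pattern)
{
    int n = 0;
    for (int b = 0; b < 6; b++)
        n += mb_block_is_coded(coded_pattern, b);
    return n;
}
```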

## Intensity Compensation

The SRD uses the following data structure to maintain information about intensity compensation:

• a flag to indicate whether intensity compensation is enabled
• IC scale
• IC shift

## B Fraction

This data structure tracks B-fraction data:

• B-fraction numerator
• B-fraction denominator
• scalefactor: approximately numerator * 256 / denominator
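
A minimal sketch of the scale factor as 8-bit fixed point follows. The names are hypothetical, and truncating integer division is only one reading of "approximately"; the SRD's stored values may be rounded differently. Using the factor to scale a motion vector component is likewise an assumption about how it is applied.

```c
/* Hypothetical B-fraction record; names are illustrative. */
typedef struct {
    int numerator;
    int denominator;
    int scalefactor;  /* roughly numerator * 256 / denominator */
} BFractionSketch;

/* Build the record using truncating integer division.
 * A zero denominator yields a scale factor of 0 rather than a crash. */
static BFractionSketch bfraction_make(int numerator, int denominator)
{
    BFractionSketch f;
    f.numerator = numerator;
    f.denominator = denominator;
    f.scalefactor = denominator != 0 ? (numerator * 256) / denominator : 0;
    return f;
}

/* Scale a value by the fraction using the 8-bit fixed-point factor.
 * Division by 256 is used instead of a shift so negative inputs are
 * well defined in C (the result truncates toward zero). */
static int bfraction_scale(int value, int scalefactor)
{
    return (value * scalefactor) / 256;
}
```

With this reading, a B-fraction of 1/2 gives a scale factor of 128, and 1/3 gives 85.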

## Hypothetical Reference Decoder

The SRD contains data structures pertaining to a hypothetical reference decoder involving a leaky bucket algorithm.

## Pan Scan Window

This data structure pertains to pan and scan windows:

• horizontal offset in pixels
• vertical offset in pixels
• width in pixels
• height in pixels

## Pan Scan Parameters

This data structure contains a flag indicating whether pan and scan is present, and an array of 3 Pan Scan Window data structures (3 is the maximum supported).

## Sequence And Layer Parameters

The SRD maintains all of the information for the sequence layer:

• profile (simple, main, or advanced)
• maximum coded width
• maximum coded height
• coded width
• coded height
• display width
• display height
• aspect width
• aspect height
• profile level
• interface flag
• frame rate numerator (0 unless otherwise specified)
• frame rate denominator (the comments list 1000, 1001, and 32 as valid values, and claim that it is 0 if not specified, which sounds potentially problematic)
• color format indicator flag
• chroma format
• color primaries
• transfer characteristics
• matrix coefficients
• hypothetical reference decoder
• loop filter flag
• multi resolution coding flag
• fast chrominance motion compensation flag
• extended motion vector flag
• extended differential motion vector flag
• d-quant
• variable size transform flag (VSTransform)
• overlapped transform flag
• sync marker flag
• range reduction flag
• max B-frames
• quantizer mode
• post processing flag
• frame counter flag
• pull down flag
• PsF (progressive segmented frame flag)
• Q framerate for post processing
• Q bitrate for post processing
• pan scan flag
• reserved RTM flag
• frame interpolation flag
• range scale Y (comment: scale value times 8)
• range scale UV (comment: scale value times 8)
• number of pan scan windows
• broken link flag
• closed entry flag
• refdist flag (refdist refers to distance to previous reference frame)
• frame user data present flag
• end-of-sequence marker present flag

## Component

The SRD defines a component as a single Y, U, or V plane and associates these members with the data structure:

• data: the raw bytes that comprise the Y, U, or V sample data
• bytes/line, a.k.a. stride
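
To show why the stride is stored alongside the data pointer, here is a small sketch of sample addressing. The names `ComponentSketch` and `component_at` are hypothetical, and 8-bit samples are assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plane descriptor; names are illustrative. */
typedef struct {
    uint8_t *data;   /* first sample of the plane */
    size_t stride;   /* bytes from the start of one line to the next */
} ComponentSketch;

/* Address of the sample at column x, row y. The stride may be larger
 * than the visible width when lines are padded, so it must be used
 * instead of the width when stepping between rows. */
static uint8_t *component_at(const ComponentSketch *c, size_t x, size_t y)
{
    return c->data + y * c->stride + x;
}
```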

## Field

This data structure defines all of the parameters that pertain to a particular field:

• picture type
• conditional overlap filter mode
• quantization mode
• motion vector mode
• motion vector range
• block transform type
• post processing flag
• extended X differential motion vector range
• extended Y differential motion vector range
• number of reference fields (either 1 or 2)
• reference field (either last or last-but-one)
• motion vector VLC table (0..7)
• MB mode VLC table (0..7)
• block pattern 2 motion vector table (0..3)
• block pattern 4 motion vector table (0..3)
• inter-coded block coding pattern VLC table (0..3)
• AC coding set to use for intra-coded Y blocks (0..2)
• AC coding set to use for all inter-coded blocks or U and V intra (0..2)
• DC coding set (0..1)
• rows per slice (0 = no slicing used)

## Picture

The picture is the fundamental data unit in the VC-1 coding scheme. A picture can be one of the following things:

• a progressive frame
• an interlaced top field
• an interlaced bottom field
• an interlaced frame

The SRD maintains the following information about a picture:

• frame number (modulo 1<<32)
• picture format
• Y component data structure
• U component data structure
• V component data structure
• 2 field data structures
• picture resolution index
• top field first flag
• repeat first field flag
• range reduction used flag
• frame interpolation hint flag
• chrominance sample format flag
• repeat frame count
• pan scan parameters data structure
• post processing mode

## Scale Motion Vectors

This data structure contains information about scaling motion vectors for interlaced frames:

• scale (comment: down scale factor * 256)
• scale 1 (comment: up scale factor * 256) if in zone 1
• scale 2 (comment: down scale factor * 256) if not in zone 1
• zone 1 X size
• zone 1 Y size
• zone 1 X offset
• zone 1 Y offset
• flag indicating scaling up or down for opposite
• flag indicating top or bottom field
• motion vector range
• motion vector mode

## Interpolation

This data structure contains information to be passed to a bilinear or bicubic interpolation function:

• component data structure
• width of resulting filtered rectangle
• height of resulting filtered rectangle
• flag indicating rounding behavior

• simple or main profile - pad from macroblock edge
• advanced profile progressive - pad from image edge

## Rectangle

Nothing complicated about this data structure: it's just 2 (X, Y) coordinate pairs specifying the upper-left and lower-right corners of a rectangle.

## Image Position

This data structure contains rectangles to control padding and cropping:

• total width of buffer
• total height of buffer
• image rectangle in pels relative to buffer origin
• rectangle to pad outwards from in pels relative to buffer origin
• rectangle limits to pad outwards to in pels relative to buffer origin

## Reference Picture

This data structure contains all of the information to comprise a reference picture (I-frame).

• valid flag
• broken link flag: the reference is no longer available due to a broken link
• parameter indicating whether top field, bottom field, or both are padded
• range Y scale (comment: Y scaling factor times 8)
• range UV scale (comment: UV scaling factor times 8)
• number of frames between this and the last reference frame
• frame number modulo (1<<32)
• top field first flag
• repeat first field flag
• PsF (progressive segmented frame flag)
• pan and scan parameters data structure
• frame interpolation hint flag (comment indicates it is not used in decoding process)
• chrominance plane sampling mode, pertains to interlaced modes
• repeated frame count
• post processing mode
• coded width
• coded height
• max coded width
• max coded height
• picture format
• 2 motion vector ranges, 1 for each field
• 2 picture types, 1 for each field
• picture resolution scaling mode
• Y component data structure
• U component data structure
• V component data structure
• pointer to Y data top-left corner
• pointer to U data top-left corner
• pointer to V data top-left corner
• image position data structure indicating position of Y samples in image buffer
• image position data structure indicating position of C samples in image buffer

## Level Limits

This data structure holds information about various limits at each profile and level.

• max macroblocks per second
• max macroblocks per frame
• max peak transmission rate in kilobits per second
• max buffer size in multiples of 16 kilobits
• motion vector range allowed
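
One plausible use of these limits is to check whether a stream's picture size and frame rate fit a given level. The sketch below uses hypothetical names, and the check itself is an assumption about how such limits would be applied, not the SRD's logic.

```c
#include <stdint.h>

/* Hypothetical limits record; names are illustrative. */
typedef struct {
    unsigned max_mb_per_sec;
    unsigned max_mb_per_frame;
    unsigned max_rate_kbps;      /* peak transmission rate, kilobits/s */
    unsigned max_buffer_16kbit;  /* buffer size in multiples of 16 kilobits */
    int mv_range;
} LevelLimitsSketch;

/* Number of 16x16 macroblocks needed to cover a picture, rounding up. */
static unsigned mb_count(unsigned width, unsigned height)
{
    return ((width + 15) / 16) * ((height + 15) / 16);
}

/* Return 1 if the picture size and frame rate fit within the limits.
 * The rate check cross-multiplies to avoid rounding fps_num / fps_den. */
static int level_fits(const LevelLimitsSketch *l, unsigned width,
                      unsigned height, unsigned fps_num, unsigned fps_den)
{
    unsigned mbs = mb_count(width, height);
    if (mbs > l->max_mb_per_frame)
        return 0;
    return (uint64_t)mbs * fps_num <= (uint64_t)l->max_mb_per_sec * fps_den;
}
```

A 352x288 picture needs 22 * 18 = 396 macroblocks, so at 30 frames per second it requires 11880 macroblocks per second.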

## Position

This data structure describes the current macroblock being processed:

• picture type
• picture format
• profile
• motion vector mode
• motion vector range
• flag indicating top vs. bottom field
• flag indicating first vs. second field
• pointer to the current macroblock data structure
• pointer to the start of the macroblock circular data structure
• pointer to the current position in the motion vector history buffer
• circular buffer size in macroblocks
• X macroblock offset in current slice
• Y macroblock offset in current slice
• Y macroblock offset of slice in picture
• width in macroblocks of coded picture
• height in macroblocks of coded picture
• coded width
• coded height
• max coded width
• max coded height
• picture quantizer (PQUANT)
• B-fraction syntax element
• number of reference fields, minus 1
• reference field when previous field is 0
• bias to add to intra blocks after transform
• Y scaling factor (times 8)
• UV scaling factor (times 8)
• fast chrominance motion compensation flag
• picture resolution scale mode
• reference picture data structure: old I/P
• reference picture data structure: new/current I/P
• reference picture data structure: reconstructed B picture
• reference picture data structure: backup copy of reference before IC applied
• 2 scale motion vector data structures (1 forward, 1 backward)
• 6 64-element arrays for reconstructing samples

## Bitstream

The SRD maintains a typical bitstream data structure. It simply treats a bytestream as a sequence of bits to be read from left to right.
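
A minimal sketch of such a reader, consuming the most significant bit of each byte first, might look like the following. The structure and function names are hypothetical, and returning zero bits past the end of the buffer is a choice made here for safety, not documented SRD behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader state; names are illustrative. */
typedef struct {
    const uint8_t *buf;
    size_t size_bytes;
    size_t bit_pos;   /* index of the next bit to read, 0 = MSB of buf[0] */
} BitReaderSketch;

/* Read one bit, left to right within each byte (MSB first).
 * Reads past the end of the buffer return 0. */
static unsigned br_read_bit(BitReaderSketch *br)
{
    if (br->bit_pos >= br->size_bytes * 8)
        return 0;
    uint8_t byte = br->buf[br->bit_pos >> 3];
    unsigned bit = (byte >> (7 - (br->bit_pos & 7))) & 1u;
    br->bit_pos++;
    return bit;
}

/* Read n bits (0 <= n <= 32) as an unsigned value, first bit read
 * becoming the most significant bit of the result. */
static uint32_t br_read_bits(BitReaderSketch *br, unsigned n)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < n; i++)
        v = (v << 1) | br_read_bit(br);
    return v;
}
```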

## VLC

The SRD uses a simple and highly inefficient VLC lookup mechanism. A table of VLCs consists of these data structures:

• the bit pattern of the VLC
• the number of bits in the VLC
• the number that the VLC represents

The first entry of a VLC table has the following meaning:

• bits = 0
• length = number of codes in table
• maximum VLC code length

The SRD's VLC reading function marches through each entry in a table, sequentially, until it finds a bit/length pattern that matches the bits at the current position in the bitstream.
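
The sequential search described above can be sketched as follows. The names are hypothetical, and two details are assumptions: that the header entry stores the maximum code length in the value field, and that bits are compared MSB first. The bounds check is also an addition of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; names are illustrative. */
typedef struct {
    uint32_t bits;   /* bit pattern of the code, right-aligned */
    int length;      /* number of bits in the code */
    int value;       /* number that the code represents */
} VlcCodeSketch;

/* Peek n bits starting at bit_pos without consuming them (MSB first). */
static uint32_t vlc_peek(const uint8_t *buf, size_t bit_pos, int n)
{
    uint32_t v = 0;
    for (int i = 0; i < n; i++) {
        size_t p = bit_pos + (size_t)i;
        v = (v << 1) | ((buf[p >> 3] >> (7 - (p & 7))) & 1u);
    }
    return v;
}

/* Linear search: table[0] is the header (length = number of codes),
 * entries 1..count are the codes. Returns 1 and advances *bit_pos on a
 * match, 0 if no code matches or too few bits remain. */
static int vlc_decode(const VlcCodeSketch *table, const uint8_t *buf,
                      size_t total_bits, size_t *bit_pos, int *value)
{
    int count = table[0].length;
    for (int i = 1; i <= count; i++) {
        size_t len = (size_t)table[i].length;
        if (*bit_pos + len > total_bits)
            continue;  /* not enough bits left for this code */
        if (vlc_peek(buf, *bit_pos, table[i].length) == table[i].bits) {
            *bit_pos += len;
            *value = table[i].value;
            return 1;
        }
    }
    return 0;
}
```

The cost is one comparison per table entry in the worst case, which is why this approach is much slower than a lookup table indexed by the next few bits.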

## Bitplane

A bitplane data structure is used for representing a series of bit values which represent properties of the macroblocks in a picture. The data structure has the following properties:

## Picture Layer Parameters

The SRD maintains the following information for picture layer parameters:

• frame count
• 2 picture types (per field?)
• buffer fullness
• pq index
• per-picture quantizer mode
• PQUANT
• half q step
• frame transform AC coding set index
• frame transform AC coding set index 2
• intra transform DC table flag
• temporal reference frame counter
• top field first flag
• repeat first field flag
• U&V sample mode flag
• post processing mode
• quantization step size
• ALTPQUANT
• interpolation data structure
• pointer to selected motion vector VLC table
• pointer to selected coded block pattern VLC table
• transform type flag
• repeat frame count
• frame interpolation hint
• overlapping filter mode
• pan scan parameters data structure
• dquant frame flag (comment: per MB quant mode)
• bitplane for AC prediction
• bitplane for MB skip
• bitplane for MV type
• bitplane for Direct MB
• bitplane for overlap flags
• bitplane for Forward MB
• bitplane for FieldTX MB
• extend horizontal differential MV flag
• extend vertical differential MV flag
• pointer to selected VLC table for macroblock modes
• pointer to selected VLC table for macroblock 4 motion vector block pattern table
• pointer to selected VLC table for macroblock 2 motion vector block pattern table
• 2 intensity compensation data structures, for top and bottom fields

## State

The SRD maintains the following information about the overall state:

• macroblock position data structure
• picture data structure
• current frame number
• pointer to macroblock data structure
• number of fields per frame
• maximum number of macroblocks per frame
• pointer to the level limits for the combination of profile & level
• sequence layer data structure
• picture layer parameters data structure
• "1 if not first mode 3 escape in frame"
• "Level code size for mode 3 escape, per frame"
• "Run code size for mode 3 escape, per frame"
• zig zag table index
• flag indicating if frame is first in stream
• flag indicating whether bitplane coding is in use
• number of first coded block in current macroblock
• reference picture data structure; this is where the current frame will be decoded
• motion vector history buffer
• number of fields present in the current picture

## Decoder Configuration

The SRD maintains the following information when configuring the decoder:

• max coded width
• max coded height
• highest profile supported by the decoder
• highest level supported by the decoder
• framerate numerator
• framerate denominator

## Run Level Data Structure

SRD: vc1DEC3DH_sRunLevel; this data structure ties together 3 bytes that form a (run, level, last) triple, and is meant to be used as an array.
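
A sketch of such a triple and of the generic run-level-last expansion it supports is shown below. The names are hypothetical and this is not the SRD's `vc1DEC3DH_sRunLevel` declaration; sign handling and escape codes are deliberately left out.

```c
#include <stdint.h>

/* Hypothetical 3-byte triple; names are illustrative. */
typedef struct {
    uint8_t run;    /* number of zero coefficients before this one */
    uint8_t level;  /* magnitude of the non-zero coefficient */
    uint8_t last;   /* 1 if this is the last coefficient in the block */
} RunLevelSketch;

/* Apply one triple to a 64-entry coefficient array in scan order.
 * *pos is the next scan position to fill. Returns the last flag, or
 * -1 if the run would step outside the block. */
static int rl_apply(const RunLevelSketch *rl, int16_t coef[64], int *pos)
{
    *pos += rl->run;            /* skip over the zero run */
    if (*pos >= 64)
        return -1;
    coef[*pos] = rl->level;     /* place the non-zero coefficient */
    (*pos)++;
    return rl->last;
}
```

A table of these triples would be indexed by the symbol that the VLC decoder returns, which is presumably why the structure is meant to be used as an array.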

## AC Coding Set

SRD: vc1DEC3DH_sACCodingSet

The SRD defines the following fields for an AC coding set:

• VLC table (vc1DEC_sVLCCode data type)
• run-level-last (RLL) table (vc1DEC3DH_sRunLevel data type)
• delta level byte array
• delta run byte array
• delta level last byte array
• delta run last byte array