VC-1 Data Structures
Part of Understanding VC-1
This page is a discussion of the various data structures and constant enumerations employed in the SMPTE Reference Decoder (SRD) VC-1 reference implementation. These are mostly found in the file vc1types.h.
Macroblocks, Blocks, and Sub-blocks
A macroblock embodies a 16x16 block of pixels in a YUV 4:2:0 colorspace. A macroblock consists of 6 blocks: 4 8x8 Y blocks, 1 U block, and 1 V block. Further, these blocks may be divided into sub-blocks of sizes 8x4, 4x8, or 4x4.
Intra Block
An intra block data structure requires the following information:
- number of non-zero AC coefficients
- quantized DC coefficient for prediction
- 7 quantized AC coefficients along the top row of the block, used for prediction
- 7 quantized AC coefficients along the left side of the block, used for prediction
- 16 samples representing the bottom 2 8-pixel rows of the block, maintained for overlap smoothing filters
Motion Vector
This is the information maintained for an individual motion vector:
- X offset
- Y offset
- a flag indicating whether the vector pertains to the top or bottom field of interlaced video
The (X, Y) vector is relative to the top-left coordinate of a block. The fractional pel resolution of the vectors depends on the motion vector coding mode.
Motion
This structure encapsulates motion vectors along with mode information.
- prediction mode, as enumerated in Hybrid Prediction Modes
- motion vector data structure
- differential motion vector data structure, represented in quarter-pel units
The SRD contains the following note accompanying this data structure:
/*
 * If Two Reference Images (NUMREF=1) then:
 *   Y=2*(YValue)+PredFlag
 *   PredFlag: 0=dominant 1=non-dominant
 */
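The packing described in that note can be sketched in C as follows. This is only an illustration of the arithmetic Y = 2*YValue + PredFlag; the function names are hypothetical and do not come from the SRD.

```c
#include <assert.h>

/* Hypothetical helper: combine a vertical MV value and the prediction
 * flag into one integer, as in the SRD comment Y = 2*(YValue) + PredFlag. */
static int sketch_pack_y(int y_value, int pred_flag)
{
    assert(pred_flag == 0 || pred_flag == 1);
    return 2 * y_value + pred_flag;
}

/* Hypothetical inverse: recover YValue and PredFlag from a packed Y.
 * The flag is Y modulo 2, computed so that it stays 0 or 1 even when
 * Y is negative. */
static void sketch_unpack_y(int y, int *y_value, int *pred_flag)
{
    int flag = ((y % 2) + 2) % 2;
    *pred_flag = flag;
    *y_value = (y - flag) / 2;
}
```

For example, a YValue of 5 with PredFlag 1 (non-dominant) packs to 11, and unpacking 11 returns the pair (5, 1).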
Motion Vector History
This data structure consists of an array of 4 motion vector data structures which stores the motion vector history for 4 Y blocks, used for direct mode.
Inter Block
An inter block data structure requires the following information:
- an array of 4 integers representing the number of non-zero coefficients (DC and AC together) for up to 4 sub-blocks in the block
- 2 motion structures, 1 for backward prediction and 1 for forward prediction
Block
This is the information that the SRD maintains for an individual block:
- block type, as defined in Block Types section
- a flag indicating whether there are non-zero AC coefficients for an intra block, or non-zero DC or AC coefficients for an inter block
- a C union that contains either an intra or inter block data structure
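To make the nesting concrete, here is a rough C sketch of how the motion vector, motion, intra block, inter block, and block structures described above could fit together. Every identifier, field width, and enum value is an assumption chosen for illustration; the real definitions live in vc1types.h and will differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names and field widths here are hypothetical, not the SRD's. */
typedef struct {
    int16_t x;             /* X offset */
    int16_t y;             /* Y offset */
    uint8_t bottom_field;  /* 0 = top field, 1 = bottom field */
} SketchMV;

typedef struct {
    int      hybrid_mode;  /* placeholder for the hybrid prediction mode */
    SketchMV mv;           /* motion vector */
    SketchMV diff_mv;      /* differential motion vector, quarter-pel units */
} SketchMotion;

typedef struct {
    uint8_t num_coef;      /* number of non-zero AC coefficients */
    int16_t dc;            /* quantized DC coefficient */
    int16_t ac_top[7];     /* quantized AC coefficients, top row */
    int16_t ac_left[7];    /* quantized AC coefficients, left column */
    int16_t overlap[16];   /* bottom two 8-pixel rows (sample type assumed) */
} SketchIntraBlock;

typedef struct {
    int          num_coef[4];  /* non-zero coefficients per sub-block */
    SketchMotion motion[2];    /* assumed order: 0 = forward, 1 = backward */
} SketchInterBlock;

/* Simplified stand-in for the block type enumeration. */
typedef enum { SKETCH_BLK_INTRA, SKETCH_BLK_INTER } SketchBlockKind;

typedef struct {
    SketchBlockKind kind;
    uint8_t         coded;     /* non-zero coefficient flag */
    union {
        SketchIntraBlock intra;
        SketchInterBlock inter;
    } u;
} SketchBlock;

/* Total non-zero coefficient count, reading whichever union member
 * the block kind says is active. */
static int sketch_block_total_coefs(const SketchBlock *b)
{
    assert(b != NULL);
    if (b->kind == SKETCH_BLK_INTRA)
        return b->u.intra.num_coef;
    int total = 0;
    for (int i = 0; i < 4; i++)
        total += b->u.inter.num_coef[i];
    return total;
}
```

The union means a block pays only for the larger of the two layouts, and the block type field tells the reader which member is valid.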
Quantizer
This data structure maintains information about quantization parameters.
- quantizer step, range 0..31
- quantizer half-step, either 0 or 1
- a flag indicating whether the quantization is uniform or non-uniform
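A minimal C sketch of such a structure follows. The field names are hypothetical, and the helper that combines the step and half-step into a single value in half-step units (2 * step + half_step) is an assumption about how the two fields relate, not something taken from the SRD.

```c
#include <assert.h>

/* Hypothetical layout; the SRD's own names and types may differ. */
typedef struct {
    int step;         /* quantizer step, 0..31 as listed above */
    int half_step;    /* 0 or 1 */
    int non_uniform;  /* 0 = uniform, 1 = non-uniform quantization */
} SketchQuantizer;

/* Express the quantizer in half-step units (assumed relationship). */
static int sketch_double_step(const SketchQuantizer *q)
{
    assert(q->step >= 0 && q->step <= 31);
    assert(q->half_step == 0 || q->half_step == 1);
    return 2 * q->step + q->half_step;
}
```

Under this assumption, a step of 5 with the half-step flag set corresponds to 11 half-steps.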
Macroblock Properties
These properties are associated with various macroblock coding options:
- a block is intra-coded, or has 1, 2, or 4 motion vectors associated
- a block predicts backwards from a previous frame, from a forward frame, from both a backward and a forward frame, or uses "direct" mode
- a flag that indicates whether MVs apply to interlaced fields
- a flag that indicates that "bottom field different direction to top"
- a flag that indicates "field transform"
Macroblock
The SRD maintains the following information about an individual macroblock:
- macroblock type: this is a combination of properties enumerated in Macroblock Properties
- AC prediction status as enumerated in AC Prediction
- block type as enumerated in Block Types; might be "any"
- a flag indicating whether the overlap filter is active for this macroblock
- a flag indicating whether the macroblock is motion predicted only (presumably, this means that a predicted block is formed but no other transform/addition occurs for the MB)
- a 6-bit flag vector that indicates which of the 6 constituent blocks in the MB are coded
- a 4-bit flag vector that indicates motion vector block pattern-- these flags are set if the differential MV for a respective Y block is not 0
- quantizer data structure for the macroblock
- 6 block structures
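The 6-bit coded block flag vector can be queried with simple bit tests. The sketch below assumes that block 0 is stored in the most significant of the 6 bits; that ordering, like the function names, is an assumption made for illustration and is not confirmed by the SRD.

```c
#include <assert.h>
#include <stdint.h>

enum { SKETCH_BLOCKS_PER_MB = 6 };

/* Return 1 if the given block (0..5) is flagged as coded.
 * Assumed bit order: block 0 is bit 5, block 5 is bit 0. */
static int sketch_block_is_coded(uint8_t cbp, int block)
{
    assert(block >= 0 && block < SKETCH_BLOCKS_PER_MB);
    return (cbp >> (SKETCH_BLOCKS_PER_MB - 1 - block)) & 1;
}

/* Count how many of the 6 blocks are flagged as coded. */
static int sketch_count_coded(uint8_t cbp)
{
    int n = 0;
    for (int i = 0; i < SKETCH_BLOCKS_PER_MB; i++)
        n += sketch_block_is_coded(cbp, i);
    return n;
}
```

The 4-bit motion vector block pattern could be handled the same way, with 4 in place of 6.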
Alternate Description
SRD maintains the following information about each macroblock:
- macroblock type, contains the following attributes:
  - (attribute 1)
    - intra
    - 1 MV
    - 2 MV
    - 4 MV
  - (attribute 2)
    - direct macroblock
    - forward prediction
    - backward prediction
    - forward and backward prediction
  - (attribute 3)
    - MVs apply to fields
    - bottom different than top
    - field transform
- AC prediction status, one of the following attributes:
  - AC prediction off
  - AC prediction on
  - no blocks can be predicted
- block type, one of the following types:
  - 8x8 inter-coded
  - 8x4 inter-coded
  - 4x8 inter-coded
  - 4x4 inter-coded
  - intra-coded, no AC prediction
  - intra-coded, AC prediction from top values
  - intra-coded, AC prediction from left values
- flag indicating whether overlap filter is active for this macroblock
- flag indicating whether macroblock is motion predicted only (no residual)
- byte indicating coded block pattern which indicates which of the 6 sub-blocks are coded
- quantizer information, this includes:
  - quantizer step in the range 1..31
  - quantizer half step, either 0 or 1
  - flag indicating uniform or non-uniform quantizer
- information for each of the 6 constituent blocks in the macroblock:
  - block type (same choices as the macroblock attributes)
  - flag indicating non-zero AC coeffs for intra, non-zero AC/DC for inter
  - union between an intra block structure and an inter block structure
    - intra structure:
      - number of zero/non-zero AC coeffs
      - quantized DC coeff
      - quantized AC top row for prediction (7 values)
      - quantized AC left column for prediction (7 values)
      - bottom 2 rows (16 values) kept for overlap smoothing
    - inter structure:
      - number of zero/non-zero AC coeffs for 4 sub-blocks (Y blocks?)
      - forward and backward motion vector structures, each includes:
        - hybrid prediction mode, one of the following attributes:
          - predict from left
          - predict from top
          - no hybrid prediction
        - (x,y) motion vectors for each of the 4 Y blocks
        - (x,y) differential motion vectors in 1/4 pel units
(note: I am a little confused as to why each of the 6 sub-blocks stores the motion vector data for the entire macroblock)
Intensity Compensation
The SRD uses the following data structure to maintain information about intensity compensation:
- a flag to indicate whether intensity compensation is enabled
- IC scale
- IC shift
B Fraction
This data structure tracks B-fraction data:
- B-fraction numerator
- B-fraction denominator
- scalefactor: approximate numerator * 256 / denominator
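As a worked example of the scale factor, the sketch below fills in a B-fraction structure using truncating integer division. The struct and function names are hypothetical, and the SRD may round differently or take the value from a table; the word "approximate" above is all the source says.

```c
#include <assert.h>

/* Hypothetical layout mirroring the three fields listed above. */
typedef struct {
    int numerator;
    int denominator;
    int scale_factor;  /* approximately numerator * 256 / denominator */
} SketchBFraction;

/* Build a B-fraction entry; the scale factor is truncated toward zero. */
static SketchBFraction sketch_bfraction(int num, int den)
{
    assert(den > 0 && num > 0 && num < den);
    SketchBFraction f = { num, den, (num * 256) / den };
    return f;
}
```

With this rule a B-fraction of 1/2 gives a scale factor of 128, and 1/3 gives 85 (256/3 truncated).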
Hypothetical Reference Decoder
The SRD contains data structures pertaining to a hypothetical reference decoder involving a leaky bucket algorithm.
Pan Scan Window
This data structure pertains to pan and scan windows:
- horizontal offset in pixels
- vertical offset in pixels
- width in pixels
- height in pixels
Pan Scan Parameters
This data structure contains a flag indicating whether pan and scan is present, and an array of 3 Pan Scan Window data structures (3 is the maximum supported).
Sequence And Layer Parameters
The SRD maintains all of the information for the sequence layer:
- profile (simple, main, or advanced)
- maximum coded width
- maximum coded height
- coded width
- coded height
- display width
- display height
- aspect width
- aspect height
- profile level
- interface flag
- frame rate numerator (0 unless otherwise specified)
- frame rate denominator (the comments list 1000, 1001, and 32 as valid values, and claim that it is 0 if not specified, which sounds potentially problematic)
- color format indicator flag
- chroma format
- color primaries
- transfer characteristics
- matrix coefficients
- hypothetical reference decoder
- loop filter flag
- multi resolution coding flag
- fast chrominance motion compensation flag
- extended motion vector flag
- extended differential motion vector flag
- d-quant
- variable size transform flag (VSTransform)
- overlapped transform flag
- sync marker flag
- range reduction flag
- max B-frames
- quantizer mode
- post processing flag
- frame counter flag
- pull down flag
- PsF (progressive segmented frame) flag
- Q framerate for post processing
- Q bitrate for post processing
- pan scan flag
- reserved RTM flag
- frame interpolation flag
- range scale Y (comment: scale value times 8)
- range scale UV (comment: scale value times 8)
- number of pan scan windows
- broken link flag
- closed entry flag
- refdist flag (refdist refers to distance to previous reference frame)
- frame user data present flag
- end-of-sequence marker present flag
Component
The SRD defines a component as a single Y, U, or V plane and associates these members with the data structure:
- data: the raw bytes that comprise the Y, U, or V sample data
- bytes/line, a.k.a. stride
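The stride is what turns an (x, y) sample position into a byte offset within the plane. A small C sketch, using hypothetical names and assuming one byte per sample:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one plane of 8-bit samples. */
typedef struct {
    uint8_t *data;    /* raw Y, U, or V sample bytes */
    size_t   stride;  /* bytes per line */
} SketchComponent;

/* Address of the sample at column x, row y. The stride may be larger
 * than the visible width, so rows are stepped by stride, not width. */
static uint8_t *sketch_sample_ptr(const SketchComponent *c, size_t x, size_t y)
{
    assert(c != NULL && c->data != NULL);
    assert(x < c->stride);
    return c->data + y * c->stride + x;
}
```

With a stride of 8, the sample at (3, 1) lives at byte offset 1 * 8 + 3 = 11.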
Field
This data structure defines all of the parameters that pertain to a particular field:
- picture type
- conditional overlap filter mode
- quantization mode
- motion vector mode
- motion vector range
- block transform type
- post processing flag
- extended X differential motion vector range
- extended Y differential motion vector range
- number of reference fields (either 1 or 2)
- reference field (either last or last-but-one)
- motion vector VLC table (0..7)
- MB mode VLC table (0..7)
- block pattern 2 motion vector table (0..3)
- block pattern 4 motion vector table (0..3)
- inter-coded block coding pattern VLC table (0..3)
- AC coding set to use for intra-coded Y blocks (0..2)
- AC coding set to use for all inter-coded blocks or U and V intra (0..2)
- DC coding set (0..1)
- rows per slice (0 = no slicing used)
Picture
The picture is the fundamental data unit in the VC-1 coding scheme. A picture can be one of the following things:
- a progressive frame
- an interlaced top field
- an interlaced bottom field
- an interlaced frame
The SRD maintains the following information about a picture:
- frame number (modulo 1<<32)
- picture format
- Y component data structure
- U component data structure
- V component data structure
- 2 field data structures
- picture resolution index
- top field first flag
- repeat first field flag
- range reduction used flag
- frame interpolation hint flag
- chrominance sample format flag
- repeat frame count
- pan scan parameters data structure
- post processing mode
Scale Motion Vectors
This data structure contains information about scaling motion vectors for interlaced frames:
- scale (comment: down scale factor * 256)
- scale 1 (comment: up scale factor * 256) if in zone 1
- scale 2 (comment: down scale factor * 256) if not in zone 1
- zone 1 X size
- zone 1 Y size
- zone 1 X offset
- zone 1 Y offset
- flag indicating scaling up or down for opposite
- flag indicating top or bottom field
- motion vector range
- motion vector mode
Interpolation
This data structure contains information to be passed to a bilinear or bicubic interpolation function:
- component data structure
- width of resulting filtered rectangle
- height of resulting filtered rectangle
- flag indicating rounding behavior
Padding Modes
- simple or main profile - pad from macroblock edge
- advanced profile progressive - pad from image edge
- advanced profile interlaced field padding
Rectangle
Nothing complicated about this data structure-- it's just 2 (X, Y) coordinate pairs specifying the upper-left and lower-right corners of a rectangle.
Image Position
This data structure contains rectangles to control padding and cropping:
- total width of buffer
- total height of buffer
- image rectangle in pels relative to buffer origin
- rectangle to pad outwards from in pels relative to buffer origin
- rectangle limits to pad outwards to in pels relative to buffer origin
Reference Picture
This data structure contains all of the information needed to describe a reference picture (I- or P-frame).
- valid flag
- broken link flag-- reference is no longer available due to a broken link
- parameter indicating whether top field, bottom field, or both are padded
- range Y scale (comment: Y scaling factor times 8)
- range UV scale (comment: UV scaling factor times 8)
- number of frames between this and the last reference frame
- frame number modulo (1<<32)
- top field first flag
- repeat first field flag
- PsF (progressive segmented frame) flag
- pan and scan parameters data structure
- frame interpolation hint flag (comment indicates it is not used in decoding process)
- chrominance plane sampling mode, pertains to interlaced modes
- repeated frame count
- post processing mode
- coded width
- coded height
- max coded width
- max coded height
- picture format
- 2 motion vector ranges, 1 for each field
- 2 picture types, 1 for each field
- padding mode
- picture resolution scaling mode
- Y component data structure
- U component data structure
- V component data structure
- pointer to Y data top-left corner
- pointer to U data top-left corner
- pointer to V data top-left corner
- image position data structure indicating position of Y samples in image buffer
- image position data structure indicating position of C samples in image buffer
Level Limits
This data structure holds information about various limits at each profile and level.
- max macroblocks per second
- max macroblocks per frame
- max peak transmission rate in kilobits per second
- max buffer size in multiples of 16 kilobits
- motion vector range allowed
Position
This data structure describes the current macroblock being processed:
- picture type
- picture format
- profile
- motion vector mode
- motion vector range
- flag indicating top vs. bottom field
- flag indicating first vs. second field
- pointer to the current macroblock data structure
- pointer to the start of the macroblock circular data structure
- pointer to the current position in the motion vector history buffer
- circular buffer size in macroblocks
- X macroblock offset in current slice
- Y macroblock offset in current slice
- Y macroblock offset of slice in picture
- width in macroblocks of coded picture
- height in macroblocks of coded picture
- coded width
- coded height
- max coded width
- max coded height
- picture quantizer (PQUANT)
- B-fraction syntax element
- number of reference fields, minus 1
- reference field when previous field is 0
- bias to add to intra blocks after transform
- Y scaling factor (times 8)
- UV scaling factor (times 8)
- fast chrominance motion compensation flag
- picture resolution scale mode
- reference picture data structure: old I/P
- reference picture data structure: new/current I/P
- reference picture data structure: reconstructed B picture
- reference picture data structure: backup copy of reference before IC applied
- 2 scale motion vector data structures (1 forward, 1 backward)
- 6 64-element arrays for reconstructing samples
Bitstream
The SRD maintains a typical bitstream data structure. It simply treats a bytestream as a sequence of bits to be read from left -> right.
VLC
The SRD uses a simple and highly inefficient VLC lookup mechanism. A table of VLCs consists of these data structures:
- the bit pattern of the VLC
- the number of bits in the VLC
- the number that the VLC represents
The first entry of a VLC table has the following meaning:
- bits = 0
- length = number of codes in table
- value = maximum VLC code length
The SRD's VLC reading function marches through each entry in a table, sequentially, until it finds a bit/length pattern that matches the bits at the current position in the bitstream.
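The linear search described above can be sketched in C as follows. This is not the SRD's code: the type and function names are hypothetical, the bit reader is a minimal stand-in that reads most-significant bit first, and the sketch assumes the code count in the header entry does not include the header itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: bit pattern, length, decoded value. */
typedef struct {
    uint32_t bits;    /* code bit pattern, right-aligned */
    int      length;  /* code length in bits (code count in entry 0) */
    int      value;   /* decoded symbol (max code length in entry 0) */
} SketchVLC;

/* Minimal bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;
    size_t size;  /* buffer size in bytes */
    size_t pos;   /* current position in bits */
} SketchBits;

/* Peek n bits (n <= 32) MSB-first without consuming them; bits past
 * the end of the buffer read as 0. */
static uint32_t sketch_peek(const SketchBits *bs, int n)
{
    uint32_t v = 0;
    for (int i = 0; i < n; i++) {
        size_t p = bs->pos + (size_t)i;
        unsigned bit = 0;
        if (p / 8 < bs->size)
            bit = (bs->data[p / 8] >> (7 - p % 8)) & 1u;
        v = (v << 1) | bit;
    }
    return v;
}

/* Walk the table entry by entry until one matches the upcoming bits.
 * Returns the decoded value, or -1 if no code matches. */
static int sketch_vlc_decode(const SketchVLC *table, SketchBits *bs)
{
    assert(table != NULL && bs != NULL);
    int count = table[0].length;  /* header entry: number of codes */
    for (int i = 1; i <= count; i++) {
        if (sketch_peek(bs, table[i].length) == table[i].bits) {
            bs->pos += (size_t)table[i].length;
            return table[i].value;
        }
    }
    return -1;
}
```

Each symbol costs up to one comparison per table entry, which is why this approach is simple but slow compared with a lookup table indexed directly by the upcoming bits.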
Bitplane
A bitplane data structure is used for representing a series of bit values which represent properties of the macroblocks in a picture. The data structure has the following properties:
Picture Layer Parameters
The SRD maintains the following information for picture layer parameters:
- frame count
- 2 picture types (per field?)
- buffer fullness
- pq index
- per-picture quantizer mode
- PQUANT
- half q step
- frame transform AC coding set index
- frame transform AC coding set index 2
- intra transform DC table flag
- temporal reference frame counter
- top field first flag
- repeat first field flag
- U&V sample mode flag
- post processing mode
- quantization step size
- ALTPQUANT
- interpolation data structure
- pointer to selected motion vector VLC table
- pointer to selected coded block pattern VLC table
- transform type flag
- repeat frame count
- frame interpolation hint
- overlapping filter mode
- pan scan parameters data structure
- dquant frame flag (comment: per MB quant mode)
- bitplane for AC prediction
- bitplane for MB skip
- bitplane for MV type
- bitplane for Direct MB
- bitplane for overlap flags
- bitplane for Forward MB
- bitplane for FieldTX MB
- extend horizontal differential MV flag
- extend vertical differential MV flag
- pointer to selected VLC table for macroblock modes
- pointer to selected VLC table for macroblock 4 motion vector block pattern table
- pointer to selected VLC table for macroblock 2 motion vector block pattern table
- 2 intensity compensation data structures, for top and bottom fields
State
The SRD maintains the following information about the overall state:
- macroblock position data structure
- picture data structure
- current frame number
- pointer to macroblock data structure
- number of fields per frame
- maximum number of macroblocks per frame
- pointer to the level limits for the combination of profile & level
- sequence layer data structure
- picture layer parameters data structure
- "1 if not first mode 3 escape in frame"
- "Level code size for mode 3 escape, per frame"
- "Run code size for mode 3 escape, per frame"
- zig zag table index
- flag indicating if frame is first in stream
- flag indicating whether bitplane coding is in use
- number of first coded block in current macroblock
- reference picture data structure-- this is where the current frame will be decoded
- motion vector history buffer
- number of fields present in the current picture
Decoder Configuration
The SRD maintains the following information when configuring the decoder:
- max coded width
- max coded height
- highest profile supported by the decoder
- highest level supported by the decoder
- framerate numerator
- framerate denominator
Run Level Data Structure
SRD: vc1DEC3DH_sRunLevel; this data structure ties together 3 bytes holding a (run, level, last) triple, and is meant to be used as an array.
AC Coding Set
SRD: vc1DEC3DH_sACCodingSet
The SRD defines the following fields for an AC coding set:
- VLC table (vc1DEC_sVLCCode data type)
- run-level-last (RLL) table (vc1DEC3DH_sRunLevel data type)
- delta level byte array
- delta run byte array
- delta level last byte array
- delta run last byte array