VC-1 Data Structures

Latest revision as of 20:12, 31 May 2006

This page is a discussion of the various data structures and constant enumerations employed in the SMPTE Reference Decoder (SRD) VC-1 reference implementation. These are mostly found in the file vc1types.h.

Macroblocks, Blocks, and Sub-blocks

A macroblock embodies a 16x16 block of pixels in a YUV 4:2:0 colorspace. A macroblock consists of 6 blocks: 4 8x8 Y blocks, 1 U block, and 1 V block. Further, these blocks may be divided into sub-blocks of sizes 8x4, 4x8, or 4x4.

Intra Block

An intra block data structure requires the following information:

number of non-zero AC coefficients
quantized DC coefficient for prediction
7 quantized AC coefficients along the top row of the block, used for prediction
7 quantized AC coefficients along the left side of the block, used for prediction
16 samples representing the bottom 2 8-pixel rows of the block, maintained for overlap smoothing filters
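
As a rough illustration, the fields listed above could be laid out as in the following C sketch. The type name, member names, field widths, and the helper intra_save_predictors() are all hypothetical; they are not taken from vc1types.h. The helper assumes the quantized coefficients are available in raster order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: names and field widths are assumptions, not the SRD's. */
typedef struct {
    uint8_t num_nonzero_ac; /* number of non-zero AC coefficients */
    int16_t dc;             /* quantized DC coefficient, kept for prediction */
    int16_t ac_top[7];      /* quantized AC coefficients, top row (columns 1..7) */
    int16_t ac_left[7];     /* quantized AC coefficients, left column (rows 1..7) */
    int16_t overlap[16];    /* bottom 2 rows of 8 samples, for overlap smoothing */
} IntraBlockSketch;

/* Save the prediction candidates from an 8x8 block of quantized
 * coefficients stored in raster order (index = row * 8 + column). */
static void intra_save_predictors(IntraBlockSketch *blk, const int16_t coeffs[64])
{
    assert(blk != NULL && coeffs != NULL);
    blk->dc = coeffs[0];
    blk->num_nonzero_ac = 0;
    for (int i = 1; i < 64; i++) {
        if (coeffs[i] != 0)
            blk->num_nonzero_ac++;
    }
    for (int i = 1; i < 8; i++) {
        blk->ac_top[i - 1] = coeffs[i];       /* row 0, column i */
        blk->ac_left[i - 1] = coeffs[i * 8];  /* row i, column 0 */
    }
}
```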

Motion Vector

This is the information maintained for an individual motion vector:

X offset
Y offset
a flag indicating whether the vector pertains to the top or bottom field of interlaced video

The (X, Y) vector is relative to the top-left coordinate of a block. The fractional pel resolution of the vectors depends on the motion vector coding mode.

Motion

This structure encapsulates motion vectors along with mode information.

prediction mode, as enumerated in Hybrid Prediction Modes
motion vector data structure
differential motion vector data structure, represented in quarter-pel units

The SRD contains the following note accompanying this data structure:

/*
 * If Two Reference Images (NUMREF=1) then:
 *      Y=2*(YValue)+PredFlag
 *      PredFlag: 0=dominant 1=non-dominant
*/
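
The note above describes folding the predictor flag into the low bit of the Y value when two reference fields are in use (NUMREF=1). A minimal sketch of that packing and its inverse follows; the structure and function names are hypothetical, and only the formula Y = 2*YValue + PredFlag comes from the SRD comment.

```c
#include <assert.h>

/* Hypothetical motion vector layout; member names are assumptions. */
typedef struct {
    int x;            /* X offset */
    int y;            /* Y offset (packed with PredFlag when NUMREF=1) */
    int bottom_field; /* 0 = top field, 1 = bottom field */
} MotionVectorSketch;

/* Pack: Y = 2 * YValue + PredFlag (0 = dominant, 1 = non-dominant). */
static int mv_pack_y(int y_value, int pred_flag)
{
    assert(pred_flag == 0 || pred_flag == 1);
    return 2 * y_value + pred_flag;
}

/* PredFlag is the parity of the packed value; this also holds for
 * negative values because an odd number has a non-zero remainder. */
static int mv_unpack_pred_flag(int packed_y)
{
    return (packed_y % 2) != 0;
}

/* Remove the flag first so the division by 2 is always exact. */
static int mv_unpack_y_value(int packed_y)
{
    return (packed_y - mv_unpack_pred_flag(packed_y)) / 2;
}
```

Subtracting the flag before dividing avoids relying on how right shifts or division round for negative values.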

Motion Vector History

This data structure consists of an array of 4 motion vector data structures that store the motion vector history for the 4 Y blocks; it is used for direct mode.

Inter Block

An inter block data structure requires the following information:

an array of 4 integers representing the number of non-zero coefficients (DC and AC together) for up to 4 sub-blocks in the block
2 motion structures, 1 for backward prediction and 1 for forward prediction

Block

This is the information that the SRD maintains for an individual block:

block type, as defined in Block Types section
a flag indicating whether there are non-zero AC coefficients for an intra block, or non-zero DC or AC coefficients for an inter block
a C union that contains either an intra or inter block data structure
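
The block structure is essentially a tagged union: the block type says which member of the union is meaningful. The following sketch shows the idea with abbreviated intra and inter members. All type names, member names, and enumerator values are hypothetical and do not reflect the actual declarations in vc1types.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enumeration; the SRD's names and values may differ. */
typedef enum {
    BLK_INTER_8X8,
    BLK_INTER_8X4,
    BLK_INTER_4X8,
    BLK_INTER_4X4,
    BLK_INTRA_NO_PRED,
    BLK_INTRA_PRED_TOP,
    BLK_INTRA_PRED_LEFT
} BlockTypeSketch;

typedef struct {             /* abbreviated intra data */
    int16_t dc;
    int16_t ac_top[7];
    int16_t ac_left[7];
} IntraPartSketch;

typedef struct {             /* abbreviated inter data */
    uint8_t num_coefs[4];    /* non-zero coefficients per sub-block */
    int16_t mv_x[2];         /* stand-in for the 2 motion structures */
    int16_t mv_y[2];
} InterPartSketch;

typedef struct {
    BlockTypeSketch type;    /* selects the valid union member */
    int coded;               /* non-zero coefficients present */
    union {
        IntraPartSketch intra;
        InterPartSketch inter;
    } u;
} BlockSketch;

static int block_is_intra(const BlockSketch *b)
{
    assert(b != NULL);
    return b->type >= BLK_INTRA_NO_PRED;
}

/* Number of sub-blocks an inter block is divided into; 0 for intra. */
static int block_num_subblocks(const BlockSketch *b)
{
    assert(b != NULL);
    switch (b->type) {
    case BLK_INTER_8X8: return 1;
    case BLK_INTER_8X4:
    case BLK_INTER_4X8: return 2;
    case BLK_INTER_4X4: return 4;
    default:            return 0;
    }
}
```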

Quantizer

This data structure maintains information about quantization parameters.

quantizer step, range 0..31
quantizer half-step, either 0 or 1
a flag indicating whether the quantization is uniform or non-uniform
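
A possible C rendering of these three fields is sketched below. The names are hypothetical, and the helper that expresses the quantizer in half-step units is only an illustration of how the step and half-step values relate; it is not claimed to be how the SRD applies them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; member names are assumptions. */
typedef struct {
    uint8_t step;        /* quantizer step, 0..31 */
    uint8_t half_step;   /* quantizer half-step, 0 or 1 */
    uint8_t non_uniform; /* 0 = uniform, 1 = non-uniform quantizer */
} QuantizerSketch;

/* Check the ranges listed above. */
static int quant_is_valid(const QuantizerSketch *q)
{
    assert(q != NULL);
    return q->step <= 31 && q->half_step <= 1 && q->non_uniform <= 1;
}

/* Express the quantizer in half-step units: 2 * step + half_step. */
static int quant_in_half_steps(const QuantizerSketch *q)
{
    assert(q != NULL);
    return 2 * q->step + q->half_step;
}
```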

Macroblock Properties

These properties are associated with various macroblock coding options:

a block is intra-coded, or has 1, 2, or 4 motion vectors associated
a block predicts backward from a previous frame, forward from a future frame, from both a backward and a forward frame, or uses "direct" mode
a flag that indicates whether MVs apply to interlaced fields
a flag that indicates that "bottom field different direction to top"
a flag that indicates "field transform"
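
Since a macroblock type is a combination of these properties, one natural representation is a set of bit flags. The sketch below assumes that representation; the flag names and bit positions are hypothetical and are not the SRD's actual values.

```c
#include <assert.h>

/* Hypothetical bit assignments; the SRD's actual values may differ. */
enum {
    MB_INTRA       = 0x001, /* intra-coded */
    MB_1MV         = 0x002, /* 1 motion vector */
    MB_2MV         = 0x004, /* 2 motion vectors */
    MB_4MV         = 0x008, /* 4 motion vectors */
    MB_FORWARD     = 0x010, /* predicts from a forward frame */
    MB_BACKWARD    = 0x020, /* predicts from a backward frame */
    MB_DIRECT      = 0x040, /* "direct" mode */
    MB_FIELD_MV    = 0x080, /* MVs apply to interlaced fields */
    MB_BOTTOM_DIFF = 0x100, /* bottom field different direction to top */
    MB_FIELD_TX    = 0x200  /* field transform */
};

/* Number of motion vectors implied by the type; 0 for intra. */
static int mb_num_mvs(unsigned type)
{
    if (type & MB_4MV) return 4;
    if (type & MB_2MV) return 2;
    if (type & MB_1MV) return 1;
    return 0;
}

/* True when both the forward and backward bits are set. */
static int mb_is_bidirectional(unsigned type)
{
    const unsigned both = MB_FORWARD | MB_BACKWARD;
    return (type & both) == both;
}
```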

Macroblock

The SRD maintains the following information about an individual macroblock:

macroblock type: this is a combination of properties enumerated in Macroblock Properties
AC prediction status as enumerated in AC Prediction
block type as enumerated in Block Types; might be "any"
a flag indicating whether the overlap filter is active for this macroblock
a flag indicating whether the macroblock is motion predicted only (presumably, this means that a predicted block is formed but no other transform/addition occurs for the MB)
a 6-bit flag vector that indicates which of the 6 constituent blocks in the MB are coded
a 4-bit flag vector that indicates motion vector block pattern-- these flags are set if the differential MV for a respective Y block is not 0
quantizer data structure for the macroblock
6 block structures
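
The 6-bit coded block flag vector can be queried with simple shifts and masks, as in the sketch below. The function names are hypothetical, and the bit order (the most significant of the 6 bits corresponds to the first Y block) is an assumption that should be checked against the SRD.

```c
#include <assert.h>

/* Assumption: bit 5 flags block 0 (first Y block), bit 0 flags block 5 (V). */
static int mb_block_is_coded(unsigned cbp, int block_index)
{
    assert(block_index >= 0 && block_index < 6);
    return (int)((cbp >> (5 - block_index)) & 1u);
}

/* Count how many of the 6 constituent blocks are flagged as coded. */
static int mb_count_coded_blocks(unsigned cbp)
{
    int count = 0;
    for (int i = 0; i < 6; i++)
        count += mb_block_is_coded(cbp, i);
    return count;
}
```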

Alternate Description

SRD maintains the following information about each macroblock:

 macroblock type, contains the following attributes:
   (attribute 1)
   intra
   1 MV
   2 MV
   4 MV

   (attribute 2)
   direct macroblock
   forward prediction
   backward prediction
   forward and backward prediction

   (attribute 3)
   MVs apply to fields
   bottom different than top
   field transform

 AC prediction status, one of the following attributes:
   AC prediction off
   AC prediction on
   no blocks can be predicted

 Block type, one of the following types:
   8x8 inter-coded
   8x4 inter-coded
   4x8 inter-coded
   4x4 inter-coded
   intra-coded, no AC prediction
   intra-coded, AC prediction from top values
   intra-coded, AC prediction from left values

 flag indicating whether overlap filter is active for this macroblock
 flag indicating whether macroblock is motion predicted only (no residual)
 byte indicating coded block pattern which indicates which of the 6
   sub-blocks are coded

 Quantizer information, this includes:
   quantizer step in the range 1..31
   quantizer half step, either 0 or 1
   flag indicating uniform or non-uniform quantizer

 Information for each of the 6 constituent blocks in the macroblock:
   block type (same choices as the macroblock attributes)
   flag indicating non-zero AC coeffs for intra, non-zero AC/DC for inter
   union between an intra block structure and an inter block structure
   intra structure:
     number of zero/non-zero AC coeffs
     quantized DC coeff
     quantized AC top row for prediction (7 values)
     quantized AC left column for prediction (7 values)
     bottom 2 rows (16 values) kept for overlap smoothing
   inter structure:
     number of zero/non-zero AC coeffs for 4 sub-blocks (Y blocks?)
     forward and backward motion vector structures, each includes:
       hybrid prediction mode, one of the following attributes:
         predict from left
         predict from top
         no hybrid prediction
       (x,y) motion vectors for each of the 4 Y blocks
       (x,y) differential motion vectors in 1/4 pel units
 (note: I am a little confused as to why each of the 6 sub-blocks stores the motion vector data for the entire macroblock)

Intensity Compensation

The SRD uses the following data structure to maintain information about intensity compensation:

a flag to indicate whether intensity compensation is enabled
IC scale
IC shift

B Fraction

This data structure tracks B-fraction data:

B-fraction numerator
B-fraction denominator
scalefactor: approximate numerator * 256 / denominator
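
The relationship between the three fields can be sketched as follows. The names are hypothetical, and the scale factor here is computed with plain truncating integer division; because the description above says "approximate", the values actually stored by the SRD may differ slightly from this computation.

```c
#include <assert.h>

/* Hypothetical layout; member names are assumptions. */
typedef struct {
    int numerator;
    int denominator;
    int scale_factor; /* roughly numerator * 256 / denominator */
} BFractionSketch;

/* Build a B-fraction entry using truncating integer division. */
static BFractionSketch bfraction_make(int numerator, int denominator)
{
    BFractionSketch f;
    assert(denominator > 0 && numerator > 0 && numerator < denominator);
    f.numerator = numerator;
    f.denominator = denominator;
    f.scale_factor = numerator * 256 / denominator;
    return f;
}
```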

Hypothetical Reference Decoder

The SRD contains data structures pertaining to a hypothetical reference decoder involving a leaky bucket algorithm.

Pan Scan Window

This data structure pertains to pan and scan windows:

horizontal offset in pixels
vertical offset in pixels
width in pixels
height in pixels

Pan Scan Parameters

This data structure contains a flag indicating whether pan and scan is present, and an array of 3 Pan Scan Window data structures (3 is the maximum supported).

Sequence And Layer Parameters

The SRD maintains all of the information for the sequence layer:

profile (simple, main, or advanced)
maximum coded width
maximum coded height
coded width
coded height
display width
display height
aspect width
aspect height
profile level
interface flag
frame rate numerator (0 unless otherwise specified)
frame rate denominator (the comments list 1000, 1001, and 32 as valid values, and claim that it is 0 if not specified, which sounds potentially problematic)
color format indicator flag
chroma format
color primaries
transfer characteristics
matrix coefficients
hypothetical reference decoder
loop filter flag
multi resolution coding flag
fast chrominance motion compensation flag
extended motion vector flag
extended differential motion vector flag
d-quant
variable size transform flag (VSTransform)
overlapped transform flag
sync marker flag
range reduction flag
max B-frames
quantizer mode
post processing flag
frame counter flag
pull down flag
PsF flag (progressive segmented frame)
Q framerate for post processing
Q bitrate for post processing
pan scan flag
reserved RTM flag
frame interpolation flag
range scale Y (comment: scale value times 8)
range scale UV (comment: scale value times 8)
number of pan scan windows
broken link flag
closed entry flag
refdist flag (refdist refers to distance to previous reference frame)
frame user data present flag
end-of-sequence marker present flag

Component

The SRD defines a component as a single Y, U, or V plane and associates these members with the data structure:

data: the raw bytes that comprise the Y, U, or V sample data
bytes/line, a.k.a. stride
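
With a data pointer and a stride, a sample at (x, y) sits at offset y * stride + x from the start of the plane. The sketch below illustrates this addressing; the type and function names are hypothetical, and it assumes one byte per sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; assumes 8-bit samples. */
typedef struct {
    uint8_t *data;  /* raw Y, U, or V sample bytes */
    size_t stride;  /* bytes per line */
} ComponentSketch;

/* Pointer to the first sample of row y. */
static uint8_t *component_row(const ComponentSketch *c, size_t y)
{
    assert(c != NULL && c->data != NULL);
    return c->data + y * c->stride;
}

/* Sample at column x of row y. */
static uint8_t component_sample(const ComponentSketch *c, size_t x, size_t y)
{
    assert(x < c->stride);
    return component_row(c, y)[x];
}
```

Keeping the stride separate from the visible width allows each line to carry extra bytes, for example for padding.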

Field

This data structure defines all of the parameters that pertain to a particular field:

picture type
conditional overlap filter mode
quantization mode
motion vector mode
motion vector range
block transform type
post processing flag
extended X differential motion vector range
extended Y differential motion vector range
number of reference fields (either 1 or 2)
reference field (either last or last-but-one)
motion vector VLC table (0..7)
MB mode VLC table (0..7)
block pattern 2 motion vector table (0..3)
block pattern 4 motion vector table (0..3)
inter-coded block coding pattern VLC table (0..3)
AC coding set to use for intra-coded Y blocks (0..2)
AC coding set to use for all inter-coded blocks or U and V intra (0..2)
DC coding set (0..1)
rows per slice (0 = no slicing used)

Picture

The picture is the fundamental data unit in the VC-1 coding scheme. A picture can be one of the following things:

a progressive frame
an interlaced top field
an interlaced bottom field
an interlaced frame

The SRD maintains the following information about a picture:

frame number (modulo 1<<32)
picture format
Y component data structure
U component data structure
V component data structure
2 field data structures
picture resolution index
top field first flag
repeat first field flag
range reduction used flag
frame interpolation hint flag
chrominance sample format flag
repeat frame count
pan scan parameters data structure
post processing mode

Scale Motion Vectors

This data structure contains information about scaling motion vectors for interlaced frames:

scale (comment: down scale factor * 256)
scale 1 (comment: up scale factor * 256) if in zone 1
scale 2 (comment: down scale factor * 256) if not in zone 1
zone 1 X size
zone 1 Y size
zone 1 X offset
zone 1 Y offset
flag indicating scaling up or down for opposite
flag indicating top or bottom field
motion vector range
motion vector mode

Interpolation

This data structure contains information to be passed to a bilinear or bicubic interpolation function:

component data structure
width of resulting filtered rectangle
height of resulting filtered rectangle
flag indicating rounding behavior

Padding Modes

simple or main profile - pad from macroblock edge
advanced profile progressive - pad from image edge
advanced profile interlaced field padding

Rectangle

Nothing complicated about this data structure-- it's just 2 (X, Y) coordinate pairs specifying the upper-left and lower-right corners of a rectangle.

Image Position

This data structure contains rectangles to control padding and cropping:

total width of buffer
total height of buffer
image rectangle in pels relative to buffer origin
rectangle to pad outwards from in pels relative to buffer origin
rectangle limits to pad outwards to in pels relative to buffer origin
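
To make the role of these rectangles concrete, the sketch below shows a rectangle type and a helper that clamps a coordinate into it, which is one common way to pad outwards by edge replication. The names are hypothetical, and treating the lower-right corner as exclusive is an assumption; the SRD may use an inclusive convention and a different padding routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle; lower-right corner assumed exclusive. */
typedef struct {
    int x0, y0; /* upper-left corner */
    int x1, y1; /* lower-right corner (one past the last pel) */
} RectSketch;

static int rect_width(const RectSketch *r)  { return r->x1 - r->x0; }
static int rect_height(const RectSketch *r) { return r->y1 - r->y0; }

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Map any buffer coordinate to the nearest pel inside the rectangle,
 * i.e. the source pel that edge-replication padding would copy. */
static void rect_clamp_point(const RectSketch *r, int *x, int *y)
{
    assert(r != NULL && x != NULL && y != NULL);
    assert(rect_width(r) > 0 && rect_height(r) > 0);
    *x = clamp_int(*x, r->x0, r->x1 - 1);
    *y = clamp_int(*y, r->y0, r->y1 - 1);
}
```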

Reference Picture

This data structure contains all of the information to comprise a reference picture (I-frame).

valid flag
broken link flag-- reference is no longer available due to a broken link
parameter indicating whether top field, bottom field, or both are padded
range Y scale (comment: Y scaling factor times 8)
range UV scale (comment: UV scaling factor times 8)
number of frames between this and the last reference frame
frame number modulo (1<<32)
top field first flag
repeat first field flag
PsF flag (progressive segmented frame)
pan and scan parameters data structure
frame interpolation hint flag (comment indicates it is not used in decoding process)
chrominance plane sampling mode, pertains to interlaced modes
repeated frame count
post processing mode
coded width
coded height
max coded width
max coded height
picture format
2 motion vector ranges, 1 for each field
2 picture types, 1 for each field
padding mode
picture resolution scaling mode
Y component data structure
U component data structure
V component data structure
pointer to Y data top-left corner
pointer to U data top-left corner
pointer to V data top-left corner
image position data structure indicating position of Y samples in image buffer
image position data structure indicating position of C samples in image buffer

Level Limits

This data structure holds information about various limits at each profile and level.

max macroblocks per second
max macroblocks per frame
max peak transmission rate in kilobits per second
max buffer size in multiples of 16 kilobits
motion vector range allowed

Position

This data structure describes the current macroblock being processed:

picture type
picture format
profile
motion vector mode
motion vector range
flag indicating top vs. bottom field
flag indicating first vs. second field
pointer to the current macroblock data structure
pointer to the start of the macroblock circular data structure
pointer to the current position in the motion vector history buffer
circular buffer size in macroblocks
X macroblock offset in current slice
Y macroblock offset in current slice
Y macroblock offset of slice in picture
width in macroblocks of coded picture
height in macroblocks of coded picture
coded width
coded height
max coded width
max coded height
picture quantizer (PQUANT)
B-fraction syntax element
number of reference fields, minus 1
reference field when previous field is 0
bias to add to intra blocks after transform
Y scaling factor (times 8)
UV scaling factor (times 8)
fast chrominance motion compensation flag
picture resolution scale mode
reference picture data structure: old I/P
reference picture data structure: new/current I/P
reference picture data structure: reconstructed B picture
reference picture data structure: backup copy of reference before IC applied
2 scale motion vector data structures (1 forward, 1 backward)
6 64-element arrays for reconstructing samples

Bitstream

The SRD maintains a typical bitstream data structure. It simply treats a bytestream as a sequence of bits to be read from left -> right.
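
A minimal sketch of such a left-to-right (most significant bit first) reader is shown below. The structure and function names are hypothetical, and returning zero bits past the end of the buffer is a choice made for this example rather than documented SRD behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader; names are assumptions. */
typedef struct {
    const uint8_t *data;
    size_t size_bytes;
    size_t bit_pos; /* next bit to read; bit 0 is the MSB of data[0] */
} BitReaderSketch;

/* Read 'count' bits (0..32), most significant bit first. */
static uint32_t br_read_bits(BitReaderSketch *br, unsigned count)
{
    uint32_t value = 0;
    assert(br != NULL && count <= 32);
    for (unsigned i = 0; i < count; i++) {
        size_t byte_index = br->bit_pos >> 3;
        unsigned shift = 7u - (unsigned)(br->bit_pos & 7u);
        uint32_t bit = 0;
        if (byte_index < br->size_bytes)
            bit = (uint32_t)(br->data[byte_index] >> shift) & 1u;
        value = (value << 1) | bit;
        br->bit_pos++;
    }
    return value;
}
```

Reading one bit at a time is slow but keeps the left-to-right ordering easy to follow.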

VLC

The SRD uses a simple and highly inefficient VLC lookup mechanism. A table of VLCs consists of these data structures:

the bit pattern of the VLC
the number of bits in the VLC
the number that the VLC represents

The first entry of a VLC table has the following meaning:

bits = 0
length = number of codes in table
maximum VLC code length

The SRD's VLC reading function marches through each entry in a table, sequentially, until it finds a bit/length pattern that matches the bits at the current position in the bitstream.
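
The sequential search described above can be sketched as follows. The entry layout, the function names, and the decision to store the maximum code length in the value member of the header entry are assumptions made for this example; only the overall approach (header entry followed by a linear scan) comes from the description of the SRD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; member names are assumptions. */
typedef struct {
    uint32_t bits; /* bit pattern of the code */
    int length;    /* number of bits in the code */
    int value;     /* number the code represents */
} VlcEntrySketch;

/* Peek 'count' bits at bit_pos, MSB first; bits past the end read as 0. */
static uint32_t vlc_peek_bits(const uint8_t *data, size_t size_bytes,
                              size_t bit_pos, int count)
{
    uint32_t v = 0;
    for (int i = 0; i < count; i++, bit_pos++) {
        size_t byte_index = bit_pos >> 3;
        uint32_t bit = 0;
        if (byte_index < size_bytes)
            bit = (uint32_t)(data[byte_index] >> (7u - (unsigned)(bit_pos & 7u))) & 1u;
        v = (v << 1) | bit;
    }
    return v;
}

/* table[0] is the header: its length member holds the number of codes.
 * Returns the decoded value and advances *bit_pos, or -1 if no code matches. */
static int vlc_decode_linear(const VlcEntrySketch *table, const uint8_t *data,
                             size_t size_bytes, size_t *bit_pos)
{
    assert(table != NULL && data != NULL && bit_pos != NULL);
    int count = table[0].length;
    for (int i = 1; i <= count; i++) {
        const VlcEntrySketch *e = &table[i];
        if (vlc_peek_bits(data, size_bytes, *bit_pos, e->length) == e->bits) {
            *bit_pos += (size_t)e->length;
            return e->value;
        }
    }
    return -1;
}
```

The cost is proportional to the table size for every symbol, which is why this mechanism is inefficient compared with a lookup indexed directly by the upcoming bits.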

Bitplane

A bitplane data structure is used for representing a series of bit values which represent properties of the macroblocks in a picture. The data structure has the following properties:

Picture Layer Parameters

The SRD maintains the following information for picture layer parameters:

frame count
2 picture types (per field?)
buffer fullness
pq index
per-picture quantizer mode
PQUANT
half q step
frame transform AC coding set index
frame transform AC coding set index 2
intra transform DC table flag
temporal reference frame counter
top field first flag
repeat first field flag
U&V sample mode flag
post processing mode
quantization step size
ALTPQUANT
interpolation data structure
pointer to selected motion vector VLC table
pointer to selected coded block pattern VLC table
transform type flag
repeat frame count
frame interpolation hint
overlapping filter mode
pan scan parameters data structure
dquant frame flag (comment: per MB quant mode)
bitplane for AC prediction
bitplane for MB skip
bitplane for MV type
bitplane for Direct MB
bitplane for overlap flags
bitplane for Forward MB
bitplane for FieldTX MB
extend horizontal differential MV flag
extend vertical differential MV flag
pointer to selected VLC table for macroblock modes
pointer to selected VLC table for macroblock 4 motion vector block pattern table
pointer to selected VLC table for macroblock 2 motion vector block pattern table
2 intensity compensation data structures, for top and bottom fields

State

The SRD maintains the following information about the overall state:

macroblock position data structure
picture data structure
current frame number
pointer to macroblock data structure
number of fields per frame
maximum number of macroblocks per frame
pointer to the level limits for the combination of profile & level
sequence layer data structure
picture layer parameters data structure
"1 if not first mode 3 escape in frame"
"Level code size for mode 3 escape, per frame"
"Run code size for mode 3 escape, per frame"
zig zag table index
flag indicating if frame is first in stream
flag indicating whether bitplane coding is in use
number of first coded block in current macroblock
reference picture data structure-- this is where the current frame will be decoded
motion vector history buffer
number of fields present in the current picture

Decoder Configuration

The SRD maintains the following information when configuring the decoder:

max coded width
max coded height
highest profile supported by the decoder
highest level supported by the decoder
framerate numerator
framerate denominator

Run Level Data Structure

SRD: vc1DEC3DH_sRunLevel; this data structure ties together 3 bytes forming a (run, level, last) triple, and is meant to be used as an array.
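
To illustrate how an array of such triples is typically consumed, the sketch below expands a list of (run, level, last) entries into a 64-element coefficient array in scan order. The type and function names are hypothetical (this is not vc1DEC3DH_sRunLevel itself), and sign handling and escape modes are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 3-byte triple; names are assumptions. */
typedef struct {
    uint8_t run;   /* number of zero coefficients to skip */
    uint8_t level; /* magnitude of the next non-zero coefficient */
    uint8_t last;  /* non-zero if this is the final coefficient */
} RunLevelSketch;

/* Expand triples into out[] (scan order). Returns the position after the
 * last written coefficient, or -1 if a run goes past the end of the block. */
static int rll_expand(const RunLevelSketch *triples, int count, int16_t out[64])
{
    int pos = 0;
    assert(triples != NULL && out != NULL);
    memset(out, 0, 64 * sizeof out[0]);
    for (int i = 0; i < count; i++) {
        pos += triples[i].run;          /* skip 'run' zeros */
        if (pos >= 64)
            return -1;
        out[pos++] = triples[i].level;  /* place the coefficient */
        if (triples[i].last)
            break;
    }
    return pos;
}
```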

AC Coding Set

SRD: vc1DEC3DH_sACCodingSet

The SRD defines the following fields for an AC coding set:

VLC table (vc1DEC_sVLCCode data type)
run-level-last (RLL) table (vc1DEC3DH_sRunLevel data type)
delta level byte array
delta run byte array
delta level last byte array
delta run last byte array
