On2 VP6: Difference between revisions

Revision as of 00:17, 5 May 2006

FOURCCs: VP60, VP61, VP62
Company: On2
Whitepaper: http://www.on2.com/cms-data/pdf/1125607149174329.pdf
Samples: http://www.mplayerhq.hu/MPlayer/samples/V-codecs/VP6/

Other Implementations

An early open-source implementation was hosted at http://libvp62.sourceforge.net/ (the link is now broken; the project has been removed).

Format

The aim here is to open this standard with a full description of the bitstream format and decoding process. Contributions from On2 are especially encouraged, but it is anticipated that this section will be completed through reverse engineering and by people who saw the libvp62 source code before it was censored.

Please do not submit any copyrighted text or code here.

Introduction

VP6 uses unidirectional inter-frame ("P-frame") prediction and intra-frame (within the current frame) prediction. Entropy coding is performed using binary arithmetic (range) coding, and an 8x8 iDCT is used. The format supports dynamic adjustment of the encoded video resolution.

Macroblocks

Each video frame is composed of an array of 16x16 macroblocks, as in MPEG-2 and MPEG-4 Parts 2 and 10. Each MB (macroblock) takes one of the following modes ("MV" means "motion vector"):

Intra MB
Inter MB, null MV, previous frame reference
Inter MB, differential MV, previous frame reference
Inter MB, four MVs, previous frame reference
Inter MB, MV 1, previous frame reference
Inter MB, MV 2, previous frame reference
Inter MB, null MV, bookmarked frame reference
Inter MB, differential MV, bookmarked frame reference
Inter MB, MV 1, bookmarked frame reference
Inter MB, MV 2, bookmarked frame reference
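
The ten modes listed above can be captured in a small lookup structure. The following Python sketch is only an illustration: the names MBMode and RefFrame, the member names and the string labels for the MV kinds are hypothetical, and the member order simply follows the list. It does not claim to reflect the numeric codes used in the bitstream, which are not documented here.

```python
from enum import Enum


class RefFrame(Enum):
    """Which frame a macroblock predicts from (hypothetical naming)."""
    NONE = "none"
    PREVIOUS = "previous"
    BOOKMARK = "bookmark"


class MBMode(Enum):
    """Macroblock modes as (reference frame, MV kind) pairs.

    Assumption: the order mirrors the list in the text only; the actual
    bitstream codes for these modes are not documented on this page.
    """
    INTRA = (RefFrame.NONE, None)
    INTER_NULL_MV_PREV = (RefFrame.PREVIOUS, "null")
    INTER_DIFF_MV_PREV = (RefFrame.PREVIOUS, "differential")
    INTER_FOUR_MV_PREV = (RefFrame.PREVIOUS, "four")
    INTER_MV1_PREV = (RefFrame.PREVIOUS, "mv1")
    INTER_MV2_PREV = (RefFrame.PREVIOUS, "mv2")
    INTER_NULL_MV_BOOKMARK = (RefFrame.BOOKMARK, "null")
    INTER_DIFF_MV_BOOKMARK = (RefFrame.BOOKMARK, "differential")
    INTER_MV1_BOOKMARK = (RefFrame.BOOKMARK, "mv1")
    INTER_MV2_BOOKMARK = (RefFrame.BOOKMARK, "mv2")

    @property
    def ref_frame(self) -> RefFrame:
        return self.value[0]

    @property
    def mv_kind(self):
        return self.value[1]

    @property
    def is_intra(self) -> bool:
        return self is MBMode.INTRA
```

Note that the list contains no four-MV mode for the bookmarked frame, which is why the sketch has four bookmark modes but five previous-frame inter modes.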

Frame Header

The frame header commences with a section that is encoded using conventional big-endian bit packing.

Syntax	Number of bits	Type	Semantics
frame_mode	1	Enum	0x1 signifies an intra frame
qp	6	Unsigned	Quantization parameter valid range 0..63
marker_bit	1	Constant	Value should be 0x1
if (frame_mode == 0x01) {
version	7	Constant	Value should be 0x23
interlace	1	Boolean	true (1) means interlace will be used
dim_y	8	Unsigned	Macroblock height of video
dim_x	8	Unsigned	Macroblock width of video
render_y	8	Unsigned	Display height of video
render_x	8	Unsigned	Display width of video
}

If dim_x or dim_y differs from its value in the previous intra frame, then the resolution of the encoded image has changed.
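
As an illustration of the bit-packed part of the header, here is a minimal Python sketch that reads the fields in the order given in the table, most significant bit first. The class BitReader, the function parse_fixed_header and the extra ac_offset entry are hypothetical names introduced for this example; the field widths and the meaning of frame_mode are taken from the table above as written.

```python
class BitReader:
    """Reads bits MSB-first from a byte string (big-endian bit packing)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_fixed_header(data: bytes) -> dict:
    """Parse the bit-packed header fields listed in the table."""
    r = BitReader(data)
    hdr = {}
    hdr["frame_mode"] = r.read(1)   # 1 signifies an intra frame (per the table)
    hdr["qp"] = r.read(6)           # quantization parameter, 0..63
    hdr["marker_bit"] = r.read(1)   # expected to be 1
    if hdr["frame_mode"] == 1:
        hdr["version"] = r.read(7)  # expected to be 0x23
        hdr["interlace"] = r.read(1)
        hdr["dim_y"] = r.read(8)    # height in macroblocks
        hdr["dim_x"] = r.read(8)    # width in macroblocks
        hdr["render_y"] = r.read(8)
        hdr["render_x"] = r.read(8)
    # Hypothetical convenience field: byte offset where arithmetic coding
    # begins. The fields above total 8 or 48 bits, so this is byte-aligned.
    hdr["ac_offset"] = r.pos // 8
    return hdr
```

For example, the six bytes 0x95 0x46 0x0f 0x14 0x0f 0x14 would parse as an intra frame with qp 10, version 0x23 and a 20x15 macroblock image, with arithmetic coding starting at byte 6.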

Arithmetic coding commences at the next bit (which should be on a byte boundary):

Syntax	Type	Semantics
if (frame_mode == 0x1) {
marker1	Equiprobable 2-bit	Ignored
} else {
bookmark	Equiprobable 1-bit	bookmark == 0x1 means this frame will be the next bookmark frame
filter1	Equiprobable 1-bit
if (filter1 == 0x1) {
filter2	Equiprobable 1-bit
}
filter_info	Equiprobable 1-bit
}
if (frame_mode == 0x1 || filter_info == 0x1) {
filter_mode1	Equiprobable 1-bit
if (filter_mode1 == 0x1) {
filter_threshold1	Equiprobable 5-bit
filter_motion_param	Equiprobable 3-bit
} else {
filter_mode2	Equiprobable 1-bit
}
filter_mode3	Equiprobable 4-bit
}
marker2	Equiprobable 1-bit	Ignored
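
The conditional structure of this table can be sketched in Python as follows. The function name parse_ac_header and the read_bits parameter are hypothetical: read_bits(n) stands for any callable that returns an equiprobable n-bit integer from the arithmetic decoder. The semantics of the filter fields are not documented above, so the sketch only collects their raw values.

```python
from typing import Callable, Dict


def parse_ac_header(frame_mode: int,
                    read_bits: Callable[[int], int]) -> Dict[str, int]:
    """Read the arithmetic-coded header fields in table order."""
    hdr: Dict[str, int] = {}
    filter_info = 0
    if frame_mode == 1:
        read_bits(2)                      # marker1, ignored
    else:
        hdr["bookmark"] = read_bits(1)    # 1: this frame becomes the bookmark
        hdr["filter1"] = read_bits(1)
        if hdr["filter1"] == 1:
            hdr["filter2"] = read_bits(1)
        filter_info = read_bits(1)
        hdr["filter_info"] = filter_info
    if frame_mode == 1 or filter_info == 1:
        hdr["filter_mode1"] = read_bits(1)
        if hdr["filter_mode1"] == 1:
            hdr["filter_threshold1"] = read_bits(5)
            hdr["filter_motion_param"] = read_bits(3)
        else:
            hdr["filter_mode2"] = read_bits(1)
        hdr["filter_mode3"] = read_bits(4)
    read_bits(1)                          # marker2, ignored
    return hdr
```

Passing the reader in as a callable keeps the header logic separate from the entropy decoder, so it can be exercised with a scripted sequence of values.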

Entropy Coding

Described here is the decoding process for the arithmetically-coded (AC) parts of the bitstream. VP6 uses a 16-bit range coding scheme to code binary symbols.

The AC decoder maintains three state variables: code, mask and high.

Initialization

At initialization, the first two bytes of the AC bitstream are shifted into code. The variable high is set to 0xff00. The variable mask is set to 0xffff.

Decoding a Binary Value

Each binary symbol has an associated probability p in the range 0 to 0xff.

A threshold, t, is computed thus:

t = 0x100 + ( 0xff00 & ( ( (high-0x100) * p ) >> 8 ) )

Equiprobable binary symbols are treated somewhat differently:

t = 0xff00 & ( (high+0x100) >> 1 )

The binary value may then be decoded by comparing code and t. If code is less than t, the binary value is decoded as 0. If code is equal to or greater than t, the binary value is decoded as 1.
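
The two threshold formulas and the comparison rule can be written directly in Python. This is a small sketch with hypothetical function names; it covers only the threshold and the decision, not the state update or renormalization.

```python
def threshold(high: int, p: int) -> int:
    """Threshold for a symbol with probability p (0..0xff)."""
    return 0x100 + (0xFF00 & (((high - 0x100) * p) >> 8))


def threshold_equiprobable(high: int) -> int:
    """Threshold for an equiprobable symbol."""
    return 0xFF00 & ((high + 0x100) >> 1)


def decide(code: int, t: int) -> int:
    """0 if code is below the threshold, 1 otherwise."""
    return 0 if code < t else 1
```

With the initial value high = 0xff00, both threshold(0xff00, 0x80) and threshold_equiprobable(0xff00) give 0x8000. Because of the 0xff00 mask, every threshold is a multiple of 0x100.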

If a 1 was decoded, then

high = high - t

code = code - t

If a 0 was decoded, then

high = t

The following renormalization is then repeated while (high & 0x8000) is zero, that is, until the top bit of high is set again:

high = 2 * high

code = 2 * code

mask = 2 * mask

if ((mask & 0xff) == 0x00) {

code = code | next byte from bitstream

mask = mask | 0xff

}
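
Putting initialization, symbol decoding and renormalization together gives the following Python sketch. It is not a reference implementation. The class and method names are hypothetical, and three assumptions go beyond the text: renormalization runs until bit 15 of high is set (which keeps high between 0x8000 and 0xff00), all three state variables are truncated to 16 bits, and reads past the end of the data return zero bytes.

```python
class RangeDecoder:
    """Sketch of the 16-bit binary range decoder described above."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # index of the next unread byte
        # The first two bytes of the stream are shifted into code.
        self.code = (self._next_byte() << 8) | self._next_byte()
        self.high = 0xFF00
        self.mask = 0xFFFF

    def _next_byte(self) -> int:
        # Assumption: reading past the end of the data yields 0.
        b = self.data[self.pos] if self.pos < len(self.data) else 0
        self.pos += 1
        return b

    def _decode(self, t: int) -> int:
        if self.code >= t:
            bit = 1
            self.high -= t
            self.code -= t
        else:
            bit = 0
            self.high = t
        # Assumption: renormalize until the top bit of high is set.
        while self.high < 0x8000:
            self.high = (self.high << 1) & 0xFFFF
            self.code = (self.code << 1) & 0xFFFF
            self.mask = (self.mask << 1) & 0xFFFF
            if (self.mask & 0xFF) == 0:
                # Eight shifts have emptied the low byte of code: refill it.
                self.code |= self._next_byte()
                self.mask |= 0xFF
        return bit

    def get_bit(self, p: int) -> int:
        """Decode one binary symbol with probability p (0..0xff)."""
        t = 0x100 + (0xFF00 & (((self.high - 0x100) * p) >> 8))
        return self._decode(t)

    def get_equiprobable_bit(self) -> int:
        """Decode one equiprobable binary symbol."""
        t = 0xFF00 & ((self.high + 0x100) >> 1)
        return self._decode(t)
```

In this sketch mask acts purely as a shift counter: its low byte becomes zero after every eighth doubling, which is the moment a fresh byte is ORed into the low byte of code.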

Decoding an Equiprobable n-bit Integer Value

Integer values are coded as a big-endian sequence of equiprobable binary values. To decode an n-bit equiprobable integer value, n equiprobable binary values should be decoded using the sequence above and left-shifted into an integral result variable.
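
A minimal Python sketch of this procedure follows. The function name and the read_bit parameter are hypothetical: read_bit stands for any callable that decodes one equiprobable binary value and returns 0 or 1.

```python
from typing import Callable


def read_equiprobable_int(read_bit: Callable[[], int], n: int) -> int:
    """Decode an n-bit integer, most significant bit first."""
    value = 0
    for _ in range(n):
        value = (value << 1) | read_bit()
    return value
```

For instance, if the decoder yields the bits 1, 0, 1, 1 in that order, a 4-bit read returns 11.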

Inverse DCT

The inverse DCT is performed on 8x8 blocks of pixels. The algorithm is the same as (or a small variation of) the one used by the VP3 decoder in FFmpeg [1]; the original VP3 iDCT code is here [2].
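
For orientation only, the following Python sketch computes a textbook floating-point 8x8 inverse DCT. It is not the VP3 or VP6 algorithm: those decoders use a fixed-point integer approximation, and the scaling and rounding here are the conventional mathematical definition rather than anything documented for VP6. The function name idct_8x8 is hypothetical.

```python
import math


def idct_8x8(coeffs):
    """Textbook 2-D inverse DCT of an 8x8 block.

    coeffs[v][u] is the coefficient at vertical frequency v and
    horizontal frequency u; the result is indexed as out[y][x].
    """
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            s = 0.0
            for v in range(8):
                for u in range(8):
                    s += (c(u) * c(v) * coeffs[v][u]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[y][x] = s / 4.0
    return out
```

With this normalization, a block whose only non-zero coefficient is a DC value of 8 produces a flat block of ones.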

