On2 VP6: Difference between revisions
(VP60/VP61 are now supported.) |
Suxen drol (talk | contribs) (vp60 headers) |
Revision as of 02:33, 6 January 2007
- FOURCCs: VP60, VP61, VP62
- Company: On2
- Whitepaper: http://www.on2.com/cms-data/pdf/1125607149174329.pdf
- Samples: http://samples.mplayerhq.hu/V-codecs/VP6/
Implementations
An early open source implementation could be found at http://libvp62.sourceforge.net/, but it was driven underground by On2's copyright infringement claims.
The FFmpeg implementation supports VP61 and VP62 only to the extent necessary to play all known samples. It is known to be incomplete with regard to the VP6 specification, though.
A decoder implementation may be found in the FFmpeg source file vp6.c (http://svn.mplayerhq.hu/ffmpeg/trunk/libavcodec/vp6.c?view=markup).
Format
The aim here is to open this standard with a full description of the bitstream format and decoding process. Contributions from On2 are especially encouraged, but it is anticipated that this section will be completed through reverse engineering and by people who saw the libvp62 source code before it was censored.
Please do not submit any copyrighted text or code here.
Introduction
VP6 uses unidirectional ("P-frame") and intra-frame (within the current frame) prediction. Entropy coding is performed using arithmetic (range) coding, and an 8x8 iDCT is used. The format supports dynamic adjustment of encoded video resolution. There are three variants of the VP6 codec: VP60 (Simple Profile), VP61 (Advanced Profile) and VP62 (Heightened Sharpness Profile).
Macroblocks
Each video frame is composed of an array of 16x16 macroblocks, as in MPEG-2 and MPEG-4 Parts 2 and 10. Each MB (macroblock) takes one of the following modes ("MV" means "motion vector"):
- Intra MB
- Inter MB, null MV, previous frame reference
- Inter MB, differential MV, previous frame reference
- Inter MB, four MVs, previous frame reference
- Inter MB, MV 1, previous frame reference
- Inter MB, MV 2, previous frame reference
- Inter MB, null MV, bookmarked frame reference
- Inter MB, differential MV, bookmarked frame reference
- Inter MB, MV 1, bookmarked frame reference
- Inter MB, MV 2, bookmarked frame reference
Frame Header
The frame header commences with a section that is encoded using conventional big-endian bit packing.
Syntax | Number of bits | Type | Semantics |
---|---|---|---|
frame_mode | 1 | Enum | 0x0 signifies an intra frame |
qp | 6 | Unsigned | Quantization parameter, valid range 0..63 |
marker | 1 | Constant | 0=VP61/62, 1=VP60 |
if (frame_mode == 0) { | | | 0 corresponds to INTRA_FRAME |
version | 5 | Constant | 6=VP60/61, 7=VP60, 8=VP62 |
version2 | 2 | Constant | 0=VP60, 3=VP61/62 |
interlace | 1 | Boolean | true (1) means interlace will be used |
if (marker==1 or version2==0) { | | | |
offset | 16 | Unsigned | Secondary buffer offset (bytes relative to start of buffer) |
} | | | |
dim_y | 8 | Unsigned | Macroblock height of video |
dim_x | 8 | Unsigned | Macroblock width of video |
render_y | 8 | Unsigned | Display height of video |
render_x | 8 | Unsigned | Display width of video |
} else { | | | |
if (marker==1 or version2==0) { | | | |
offset | 16 | Unsigned | Secondary buffer offset (bytes relative to start of buffer) |
} | | | |
} | | | |
The coding algorithm for VP60 sequences with a secondary buffer is not yet documented. VP60 is reported to have "the ability to switch to a faster entropy encoding strategy to ensure smooth playback" and "The ability to decode different parts of the bitstream on different sub-processors (for instance the vlx and the core), to ensure better overall system utilization" [1].
If dim_x or dim_y differs from its value in the previous intra frame, then the resolution of the encoded image has changed.
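The bit-packed part of the frame header can be read with a simple MSB-first bit reader. The following is a minimal sketch of the table above, not a reference implementation. The names `BitReader`, `parse_frame_header`, `prev_version2` and `ac_start` are hypothetical. In particular, the table tests version2 in inter frames without transmitting it there, so the sketch assumes the value from the most recent intra frame is passed in by the caller.

```python
class BitReader:
    """Reads big-endian (most significant bit first) bit fields from bytes."""

    def __init__(self, data):
        self.data = bytes(data)
        self.pos = 0  # current position in bits

    def read(self, n):
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_frame_header(data, prev_version2=3):
    """Parse the bit-packed header fields in table order.

    prev_version2 is an assumption: inter frames do not carry version2,
    so the last value seen in an intra frame is supplied by the caller.
    """
    r = BitReader(data)
    h = {"frame_mode": r.read(1), "qp": r.read(6), "marker": r.read(1)}
    if h["frame_mode"] == 0:  # intra frame
        h["version"] = r.read(5)
        h["version2"] = r.read(2)
        h["interlace"] = r.read(1)
        if h["marker"] == 1 or h["version2"] == 0:
            h["offset"] = r.read(16)
        for name in ("dim_y", "dim_x", "render_y", "render_x"):
            h[name] = r.read(8)
    else:  # inter frame
        if h["marker"] == 1 or prev_version2 == 0:
            h["offset"] = r.read(16)
    # Every branch ends on a byte boundary; this is the byte index at
    # which the arithmetic-coded data starts.
    h["ac_start"] = r.pos // 8
    return h
```

A caller would keep the `version2` value returned for each intra frame and pass it as `prev_version2` when parsing the following inter frames.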
Arithmetic coding commences at the next bit (which should be on a byte boundary):
Syntax | Type | Semantics |
---|---|---|
if (frame_mode == 0) { | ||
marker1 | Equiprobable 2-bit | Ignored |
} else { | ||
bookmark | Equiprobable 1-bit | bookmark == 0x1 means this frame will be the next bookmark frame |
filter1 | Equiprobable 1-bit | |
if (filter1 == 0x1) { | ||
filter2 | Equiprobable 1-bit | |
} | ||
filter_info | Equiprobable 1-bit | |
} | ||
if (frame_mode == 0 || filter_info == 0x1) { | ||
filter_mode1 | Equiprobable 1-bit | |
if (filter_mode1 == 0x1) { | ||
filter_threshold1 | Equiprobable 5-bit | |
filter_motion_param | Equiprobable 3-bit | |
} else { | ||
filter_mode2 | Equiprobable 1-bit | |
} | ||
filter_mode3 | Equiprobable 4-bit | |
} | ||
marker2 | Equiprobable 1-bit | Ignored |
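The control flow of the table above can be sketched as follows. The function name `parse_ac_header` and the `read_bits` callable are hypothetical; `read_bits(n)` is assumed to return an equiprobable n-bit integer from the arithmetic decoder described under Entropy Coding. The semantics of the filter fields are not documented here, so the sketch only collects their raw values.

```python
def parse_ac_header(frame_mode, read_bits):
    """Read the arithmetic-coded header fields in table order.

    read_bits(n) is an assumed callable that decodes an equiprobable
    n-bit integer from the arithmetic-coded bitstream.
    """
    fields = {}
    filter_info = 0
    if frame_mode == 0:
        fields["marker1"] = read_bits(2)  # ignored by the decoder
    else:
        fields["bookmark"] = read_bits(1)
        fields["filter1"] = read_bits(1)
        if fields["filter1"] == 1:
            fields["filter2"] = read_bits(1)
        filter_info = fields["filter_info"] = read_bits(1)
    if frame_mode == 0 or filter_info == 1:
        fields["filter_mode1"] = read_bits(1)
        if fields["filter_mode1"] == 1:
            fields["filter_threshold1"] = read_bits(5)
            fields["filter_motion_param"] = read_bits(3)
        else:
            fields["filter_mode2"] = read_bits(1)
        fields["filter_mode3"] = read_bits(4)
    fields["marker2"] = read_bits(1)  # ignored by the decoder
    return fields
```

Passing the reader in as a callable keeps the field layout separate from the arithmetic decoder, so the layout can be exercised with a stub that returns canned values.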
Entropy Coding
Described here is the decoding process for the arithmetically-coded (AC) parts of the bitstream. VP6 uses a 16-bit range coding scheme to code binary symbols.
The AC decoder maintains three state variables: code, mask and high.
Initialization
At initialization, the first two bytes of the AC bitstream are shifted into code. The variable high is set to 0xff00. The variable mask is set to 0xffff.
Decoding a Binary Value
Each binary symbol has an associated probability p in the range 0 to 0xff.
A threshold, t, is computed thus:
- t = 0x100 + ( 0xff00 & ( ( (high-0x100) * p ) >> 8 ) )
Equiprobable binary symbols are treated somewhat differently:
- t = 0xff00 & ( (high+0x100) >> 1 )
The binary value may then be decoded by comparing code and t. If code is less than t, the binary value is decoded as 0. If code is equal to or greater than t, the binary value is decoded as 1.
If a 1 was decoded, then
- high = high - t
- code = code - t
If a 0 was decoded, then
- high = t
The following renormalization is now repeated while (high & 0x8000) is zero, that is, until the top bit of high is set again.
- high = 2 * high
- code = 2 * code
- mask = 2 * mask
- if ((mask & 0xff) == 0x00) {
- code = code | next byte from bitstream
- mask = mask | 0xff
- }
Decoding an Equiprobable n-bit Integer Value
Integer values are coded as a big-endian sequence of equiprobable binary values. To decode an n-bit equiprobable integer value, n equiprobable binary values should be decoded using the sequence above and left-shifted into an integral result variable.
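The decoding steps in this section can be put together as the sketch below. It is an illustration under stated assumptions rather than a reference decoder. The class and method names are hypothetical, reads past the end of the buffer are assumed to yield zero bytes, and mask is truncated to 16 bits, which leaves its low byte (the only part that is tested) unchanged. The renormalization loop shifts until the top bit of high is set.

```python
class RangeDecoder:
    """Sketch of the binary range decoder described in this section."""

    def __init__(self, data):
        self.data = bytes(data)
        # The first two bytes of the AC bitstream are shifted into code.
        self.code = (self._byte(0) << 8) | self._byte(1)
        self.pos = 2
        self.high = 0xFF00
        self.mask = 0xFFFF

    def _byte(self, index):
        # Assumption: bytes past the end of the buffer read as zero.
        return self.data[index] if index < len(self.data) else 0

    def _decode(self, t):
        if self.code >= t:
            bit = 1
            self.high -= t
            self.code -= t
        else:
            bit = 0
            self.high = t
        # Renormalize until the top bit of high is set.
        while (self.high & 0x8000) == 0:
            self.high *= 2
            self.code *= 2
            # Truncation to 16 bits is an addition of this sketch; only
            # the low byte of mask is ever tested.
            self.mask = (self.mask * 2) & 0xFFFF
            if (self.mask & 0xFF) == 0x00:
                self.code |= self._byte(self.pos)
                self.pos += 1
                self.mask |= 0xFF
        return bit

    def decode_bit(self, p):
        """Decode one binary symbol with probability p in 0..0xff."""
        t = 0x100 + (0xFF00 & (((self.high - 0x100) * p) >> 8))
        return self._decode(t)

    def decode_equiprobable(self):
        """Decode one equiprobable binary symbol."""
        t = 0xFF00 & ((self.high + 0x100) >> 1)
        return self._decode(t)

    def read_bits(self, n):
        """Decode an n-bit equiprobable integer, first bit most significant."""
        value = 0
        for _ in range(n):
            value = (value << 1) | self.decode_equiprobable()
        return value
```

A decoder instance would be constructed over the bytes that follow the bit-packed frame header, and its `read_bits` method used wherever a table lists an "Equiprobable n-bit" field.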
Inverse DCT
Inverse DCT is performed on 8x8 blocks of pixels. The algorithm used is the same as (or a small variation of) the one used for the VP3 decoder in FFmpeg [2]; the original VP3 iDCT code is here [3].