|
|
(42 intermediate revisions by 8 users not shown) |
Line 1: |
Line 1: |
| * FOURCCs: WMV3, WMV9, WMVA, WVC1 | | * FourCCs: WMV3, WMV9, WMVA, WVC1, WMVP, WVP2, WMVR |
| * Company: [[Microsoft]] | | * Company: [[Microsoft]] |
| * General info: [http://www.microsoft.com/windows/windowsmedia/forpros/events/NAB2005/VC-1.aspx http://www.microsoft.com/windows/windowsmedia/forpros/events/NAB2005/VC-1.aspx] | | * Samples: |
| Old specs can be found here: [http://jovian.com/files/C24.008-VC9-Spec-CD1.pdf http://jovian.com/files/C24.008-VC9-Spec-CD1.pdf]
| | ** WMV9: http://samples.mplayerhq.hu/V-codecs/WMV9/ |
| | ** WMVA: http://samples.mplayerhq.hu/V-codecs/WMVA/ |
| | ** WMVP: http://samples.mplayerhq.hu/V-codecs/WMVP/ |
| | ** WVP2: http://samples.mplayerhq.hu/V-codecs/WVP2/ |
| | ** WVC1: http://samples.mplayerhq.hu/V-codecs/WVC1/ |
| | ** WMVB: |
| | ** WMVR: |
| | * General Overview: http://www.microsoft.com/windows/windowsmedia/forpros/events/NAB2005/VC-1.aspx |
| | * Draft specification: http://multimedia.cx/mirror/C24.008-VC9-Spec-CD1.pdf |
|
| |
|
| VC-1 is the codec Microsoft is pushing as an SMPTE standard. VC-1 is what WMV9 became, and specs for it can be found here: | | VC-1 is a video coding standard developed by Microsoft. It began as Windows Media Video 9. It is prevalent in [[ASF]] files downloaded from the internet. It is also expected to be used on [[HD-DVD]]s. |
|
| |
|
| VC-1 Compressed Video Bitstream Format and Decoding Process [http://www.smpte.org/smpte_store/standards/pdf/s421m.pdf http://www.smpte.org/smpte_store/standards/pdf/s421m.pdf] | | '''See [[Understanding VC-1]] for more information about the technical details of the format.''' |
|
| |
|
| VC-1 Bitstream Transport Encodings (specs for placing VC-1 in MPEG-2 Program and Transport streams) [http://www.smpte.org/smpte_store/standards/pdf/rp227.pdf http://www.smpte.org/smpte_store/standards/pdf/rp227.pdf] | | == Encapsulation == |
| | Most commonly, VC-1 data is found inside Microsoft [[ASF]] files and identified with the FourCC 'WMV3' for VC-1 simple and main profile and the FourCC 'WVC1' for advanced profile. Note that the FourCC 'WMV9' may not actually exist in the wild, but the acronym gained prominence anyway because this video codec was introduced as part of the Windows Media 9 tool suite. VC-1 video will probably also be encapsulated in other types of containers and stream formats, such as [[MPEG]] for [[HD-DVD]] transport. |
|
| |
|
| VC-1 Decoder and Bitstream Conformance [http://www.smpte.org/smpte_store/standards/pdf/rp228.pdf http://www.smpte.org/smpte_store/standards/pdf/rp228.pdf]
| | == Profiles And Levels == |
| | ''This table is cribbed wholesale from http://www.microsoft.com/windows/windowsmedia/forpros/events/NAB2005/VC-1.aspx'' |
|
| |
|
| Googling for VC1_reference_decoder_release6.zip might turn up sources for the reference decoder.
| | VC-1 has 3 profiles: simple, main, and advanced. Each has various levels. The combinations of profiles and levels represent trade-offs between encoding/decoding complexity, compression quality, and compressed image size. |
|
| |
|
| == Data Format == | | {| class="wikitable" border="1" align="center" cellpadding="6" cellspacing="3" |
| | |- style="background:silver; color=black" |
| | ! Profile !! Level !! Maximum Bit Rate !! Representative Resolutions by Frame Rate (Format) |
| | |- |
| | |Simple |
| | |Low |
| | |96 kilobits per second (Kbps) |
| | |176 x 144 @ 15 Hz ([[QCIF]]) |
| | |- |
| | | |
| | |Medium |
| | |384 Kbps |
| | |240 x 176 @ 30 Hz<br>352 x 288 @ 15 Hz ([[CIF]]) |
| | |- |
| | |Main |
| | |Low |
| | |2 megabits per second (Mbps) |
| | |320 x 240 @ 24 Hz ([[QVGA]]) |
| | |- |
| | | |
| | |Medium |
| | |10 Mbps |
| | |720 x 480 @ 30 Hz (480p)<br>720 x 576 @ 25 Hz (576p) |
| | |- |
| | | |
| | |High |
| | |20 Mbps |
| | |1920 x 1080 @ 30 Hz (1080p) |
| | |- |
| | |Advanced |
| | |L0 |
| | |2 Mbps |
| | |352 x 288 @ 30 Hz (CIF) |
| | |- |
| | | |
| | |L1 |
| | |10 Mbps |
| | |720 x 480 @ 30 Hz (NTSC-SD)<br>720 x 576 @ 25 Hz (PAL-SD) |
| | |- |
| | | |
| | |L2 |
| | |20 Mbps |
| | |720 x 480 @ 60 Hz (480p)<br>1280 x 720 @ 30 Hz (720p) |
| | |- |
| | | |
| | |L3 |
| | |45 Mbps |
| | |1920 x 1080 @ 24 Hz (1080p)<br>1920 x 1080 @ 30 Hz (1080i)<br>1280 x 720 @ 60 Hz (720p) |
| | |- |
| | | |
| | |L4 |
| | |135 Mbps |
| | |1920 x 1080 @ 60 Hz (1080p)<br>2048 x 1536 @ 24 Hz |
| | |} |
|
| |
|
| This description assumes that the data to be decoded is WMV3 data coming in from a [[Microsoft Advanced Streaming Format|Microsoft ASF]] file. The video data should be packaged with "extradata" which is attached to the end of a [[BITMAPINFOHEADER]] structure and transported in the ASF file. The format of the extradata is as follows:
| | == Coding Concepts == |
|
| |
|
| 2 bits VC-1 Profile
| | === Colorspace === |
| if (profile == 3)
| | VC-1 codes a sequence of images in the [[YUV 4:2:0]] colorspace. |
| 3 bits Profile level
| |
| 2 bits Chroma format (SRD does not care)
| |
| 3 bits VC1_BITS_FRMRTQ_POSTPROC (? SRD does not care)
| |
| 5 bits VC1_BITS_BITRTQ_POSTPROC (? SRD does not care)
| |
| 1 bit VC1_BITS_POSTPROCFLAG (? SRD does not care)
| |
| 12 bits Encoded width (actual width = (w + 1) * 2)
| |
| 12 bits Encoded height (actual height = (h + 1) * 2)
| |
|
| |
|
| There are 4 VC-1 profiles:
| | === Macroblocks, Blocks, and Sub-blocks === |
| | When VC-1 codes an image, it divides the image into [[Macroblock|macroblocks]]. Each 16x16 macroblock comprises six 8x8 sample blocks (4 Y blocks, 1 U block, and 1 V block). Further, the coding method may divide an individual 8x8 block into two 8x4 blocks, two 4x8 blocks, or four 4x4 blocks. |
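A minimal sketch of how many 16x16 macroblocks cover a frame, using the ceiling-division rule <code>(dimension + 15) / 16</code> given in this page's decoder notes. The function name is hypothetical and not taken from any decoder source.

```python
from typing import Tuple


def macroblock_dimensions(frame_width: int, frame_height: int) -> Tuple[int, int, int]:
    """Return (macroblock_width, macroblock_height, total_macroblocks).

    Each dimension is rounded up to a whole number of 16-sample macroblocks,
    so a partially covered macroblock at the right or bottom edge still counts.
    """
    macroblock_width = (frame_width + 15) // 16
    macroblock_height = (frame_height + 15) // 16
    return macroblock_width, macroblock_height, macroblock_width * macroblock_height
```

For example, a 1920 x 1080 frame needs 68 macroblock rows because 1080 is not a multiple of 16.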
|
| |
|
| * 0 simple profile
| | === Transform Coding === |
| * 1 main profile
| | VC-1 uses a variation of the [[Discrete Cosine Transform]] to convert blocks of samples into a transform domain to facilitate more efficient coding. The transform may operate on the full 8x8 block or any of the 3 supported sub-block sizes (8x4, 4x8, or 4x4). Unlike many codec standards preceding VC-1, the specification defines a bit-accurate transform method that all implementations are expected to conform to so as to minimize transform error. |
| * 2 reserved
| |
| * 3 advanced profile
| |
|
| |
|
| If profile is advanced, the extradata carries a lot of setup information. For simple and main profiles, the relevant setup data is established outside of the decoder, e.g., the [[BITMAPINFOHEADER]] of a Microsoft ASF file. This information provides the width and height that the decoder uses to set up its state.
| | === Zigzag === |
| | After transforming sample data into the transform domain, VC-1 reorders the transformed data in a [[Zigzag Reordering|zigzag]] pattern, which makes certain subsequent coding techniques more effective. VC-1 has 13 different zigzag patterns, selected according to various parameters (block size, interlacing, prediction mode and intra/inter). |
|
| |
|
| The decoder computes the macroblock width and height as the ceiling of each dimension divided by 16:
| | === Quantization === |
| | [[Quantization]] is the compression step that potentially loses the most information in a lossy compression scheme such as VC-1. Unlike many other codecs, VC-1 defines a direct way to scale DC/AC coefficients using a quantization parameter instead of specifying quantization matrices. |
|
| |
|
| macroblock_width = (frame_width + 15) / 16
| | The quantizer may vary between macroblocks in one of several ways: (1) every macroblock may have its own quantizer, (2) only the macroblocks on all four picture edges differ, (3) only the macroblocks on two adjacent edges differ,
| macroblock_height = (frame_height + 15) / 16
| | (4) only the macroblocks on one edge differ, or (5) all macroblocks share the same quantizer. In cases 2-4 a second quantizer is stored for the selected edge macroblocks; in case 1 the difference between the main quantizer and the actual quantizer is stored. |
|
| |
|
| The total number of macroblocks in a frame is defined as:
| | === Bitplane Coding === |
| | VC-1 uses a number of bitplanes, which are simply maps of ones and zeros that specify properties of the macroblocks in an image. For example, one bitplane records which macroblocks in a frame are not coded. These bitplanes are coded into the final bitstream using one of several methods: |
| | * raw (the bitplane data is actually stored in the macroblock headers) |
| | * rowskip/colskip (for each row or column, a '0' bit means the whole row or column is zero; a '1' bit is followed by the raw bits of that row or column) |
| | * tiling (the bitplane is split into 2x3 or 3x2 tiles, each coded with its own codeword; the remainder is coded with the rowskip and colskip methods) |
| | A bitplane may be coded in inverted mode, which is signalled by an additional bit before the bitplane data. |
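The rowskip method described above can be sketched as follows. This is only an illustration under stated assumptions: the bits arrive as an iterator of 0/1 integers, the caller has already read the inversion bit and passes it in as <code>invert</code>, and the function name is hypothetical.

```python
from typing import Iterator, List


def decode_rowskip(bits: Iterator[int], width: int, height: int,
                   invert: bool = False) -> List[List[int]]:
    """Decode a bitplane of `height` rows by `width` macroblocks (rowskip mode)."""
    plane = []
    for _ in range(height):
        if next(bits):
            # '1' bit: the row is coded, so `width` raw bits follow.
            row = [next(bits) for _ in range(width)]
        else:
            # '0' bit: the whole row is zero and no further bits are sent.
            row = [0] * width
        plane.append(row)
    if invert:
        # Inverted mode: every decoded flag is flipped.
        plane = [[bit ^ 1 for bit in row] for row in plane]
    return plane
```

Colskip would follow the same pattern with the roles of rows and columns swapped.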
|
| |
|
| total_macroblocks = macroblock_width * macroblock_height
| | === Motion Compensation === |
| | VC-1 uses half-pel and quarter-pel interframe [[Motion Compensation|motion compensation]] with either bilinear (like in [[H.264|H.264]]) or bicubic (extended version of motion compensation employed in [[WMV2|Windows Media 2]]) interpolation. |
|
| |
|
| If the level is marked as unknown during the initialization process, figure out what level the video belongs to. This is determined by the number of macroblocks in combination with the profile. The relevant table is vc1gentab.c:vc1GENTAB_LevelLimits<nowiki>[][]</nowiki>. The profile/level combination defines the following limits:
| | === Huffman Coding === |
| | All essential data in frames (such as motion vectors and block coefficients) is stored using static Huffman codes. Usually there are several sets of codes for each data type, and one set is used throughout the whole frame. The set index is usually defined in the frame header or derived from other parameters (such as the quantizer or whether the frame is intra or inter). Many of those code sets are inherited from MS [[MPEG-4|MPEG-4]] variants. |
|
| |
|
| max macroblocks/second
| | === Intensity Compensation === |
| max macroblocks/frame
| | Intensity compensation is a special mode in which the reference frame's luma and chroma data are scaled before being used for motion compensation. |
| max peak transmission rate in kbps
| |
| max buffer size in multiples of 16 kbits
| |
| motion vector range
| |
|
| |
|
| SRD maintains the following information about each macroblock:
| | === Range Reduction === |
| | This is a special mode in which the luma and chroma sample range (0..255) is scaled down by a factor of two to 64..192 (centered on 128), so it must be expanded back before display (and, in simple and main profiles, before use for prediction). |
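A minimal sketch of the range mapping described above, assuming 8-bit samples. The expansion doubles the distance from 128 and clips to 0..255; the exact rounding used on the reduction side is an assumption here, and both function names are hypothetical.

```python
def reduce_range(sample: int) -> int:
    """Halve the distance of an 8-bit sample from the center value 128."""
    return ((sample - 128) >> 1) + 128


def expand_range(sample: int) -> int:
    """Undo range reduction: double the distance from 128, then clip to 0..255."""
    return max(0, min(255, (sample - 128) * 2 + 128))
```

Because the reduction discards one bit of precision, expanding a reduced sample recovers the original value only approximately.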
|
| |
|
| macroblock type, contains the following attributes:
| | === Overlap Transform === |
| (attribute 1)
| | For blocks with a large quantization value, an overlap transform may be performed. This is done by smoothing the borders between adjacent blocks. |
| intra
| |
| 1 MV
| |
| 2 MV
| |
| 4 MV
| |
|
| |
| (attribute 2)
| |
| direct macroblock
| |
| forward prediction
| |
| backward prediction
| |
| forward and backward prediction
| |
|
| |
| (attribute 3)
| |
| MVs apply to fields
| |
| bottom different than top
| |
| field transform
| |
|
| |
| AC prediction status, one of the following attributes:
| |
| AC prediction off
| |
| AC prediction on
| |
| no blocks can be predicted
| |
|
| |
| Block type, one of the following types:
| |
| 8x8 inter-coded
| |
| 8x4 inter-coded
| |
| 4x8 inter-coded
| |
| 4x4 inter-coded
| |
| intra-coded, no AC prediction
| |
| intra-coded, AC prediction from top values
| |
| intra-coded, AC prediction from left values
| |
|
| |
| flag indicating whether overlap filter is active for this macroblock
| |
| flag indicating whether macroblock is motion predicted only (no residual)
| |
| byte indicating coded block pattern which indicates which of the 6
| |
| sub-blocks are coded
| |
|
| |
| Quantizer information, this includes:
| |
| quantizer step in the range 1..31
| |
| quantizer half step, either 0 or 1
| |
| flag indicating uniform or non-uniform quantizer
| |
|
| |
| Information for each of the 6 constituent blocks in the macroblock:
| |
| block type (same choices as the macroblock attributes)
| |
| flag indicating non-zero AC coeffs for intra, non-zero AC/DC for inter
| |
| union between an intra block structure and an inter block structure
| |
| intra structure:
| |
| number of zero/non-zero AC coeffs
| |
| quantized DC coeff
| |
| quantized AC top row for prediction (7 values)
| |
| quantized AC left column for prediction (7 values)
| |
| bottom 2 rows (16 values) kept for overlap smoothing
| |
| inter structure:
| |
| number of zero/non-zero AC coeffs for 4 sub-blocks (Y blocks?)
| |
| forward and backward motion vector structures, each includes:
| |
| hybrid prediction mode, one of the following attributes:
| |
| predict from left
| |
| predict from top
| |
| no hybrid prediction
| |
| (x,y) motion vectors for each of the 4 Y blocks
| |
| (x,y) differential motion vectors in 1/4 pel units
| |
| (note: I am a little confused as to why each of the 6 sub-blocks stores the motion vector data for the entire macroblock)
| |
|
| |
|
| The initializer then needs to compute how much space to allocate for each reference frame. The size of a frame is determined by frame width and height, encoding profile, and interlacing. This size is used to allocate space for 4 different frames:
| | == Bitstream Packing == |
| | VC-1 bitstreams are packed as bits into bytes in left -> right order: |
|
| |
|
| reference new (new/current I/P frame) | | byte 0 byte 1 byte 2 byte 3 byte 4 .... |
| reference old (old I/P reference frame) | |
| reference B (reconstructed B frame) | |
| reference NoIC (B reference before intensity compensation was applied) | |
|
| |
|
| Further, the initializer allocates space for 7 different bitplanes. Each bitplane has 1 flag per macroblock, up to the max macroblocks per frame for the profile/level. The bitplanes are: | | byte 0 byte 1 |
| | byte 0 byte 1 |
| | abcdefgh ijklmnop |
|
| |
|
| ACPRED
| | Given the preceding bytestream/bitstream, a get_bit() operation to retrieve the next bit in the stream would return bit a. A get_bits(5) operation to request the next 5 bits would return 'bcdef'. The next get_bits(4) operation would return 'ghij'. |
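The get_bit() and get_bits() behaviour described above can be sketched with a small most-significant-bit-first reader. The class name and layout are illustrative only, not taken from any particular decoder.

```python
class BitReader:
    """Reads bits from a byte string in left-to-right (MSB-first) order."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # position in bits from the start of the stream

    def get_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        # Bit 'a' of a byte 'abcdefgh' is the most significant bit.
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def get_bits(self, count: int) -> int:
        value = 0
        for _ in range(count):
            value = (value << 1) | self.get_bit()
        return value
```

With the two bytes 10110010 01101001, get_bit() yields bit a, get_bits(5) yields 'bcdef', and get_bits(4) crosses the byte boundary to yield 'ghij'.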
| SKIPMB
| |
| MVTYPEMB
| |
| DIRECTMB
| |
| OVERFLAGS
| |
| FORWARDMB
| |
| FIELDTX
| |
|
| |
|
| Allocate space for motion vector history. The number of entries in this array is macroblock_width * (macroblock_height + 1) (extra height is for interlaced field). Each entry is a motion vector history structure which contains the 4 Y block motion vectors for a particular macroblock. The individual motion
| | == Setup Data / Sequence Layer == |
| vector structures are the same as in the intra structure which provides hybrid prediction, motion vectors, and diff MVs (again, 4 for each block?).
| | When VC-1 data is encapsulated inside an [[ASF]] file, it is accompanied by setup data attached as the extradata of a [[BITMAPINFOHEADER]] data structure. In VC-1 parlance, this data is called the sequence layer. The format of this data is as follows: |
|
| |
|
| And that's it for the SRD "requirements gathering" process (vc1dec.c:vc1DEC_DecoderRequirements()). The function returns the number of bytes needed for the decoder's internal state. The client app is expected
| | * 2 bits: profile (0 - simple, 1 - main, 2 - complex, 3 - advanced). Complex profile is not covered by VC-1 standard and may occur in old WMV3 files where it was called "advanced profile". |
| to allocate enough space for this state.
| | * if profile is simple or main (0 or 1, respectively) |
| | ** 2 bits: reserved, should be 0 |
| | * if profile is advanced (3) |
| | ** 3 bits: level of advanced profile (values 5-7 are invalid) |
| | ** 2 bits: chroma format (note that only format 1, [[YUV 4:2:0]] is defined; other values are invalid) |
| | * 3 bits: Q frame rate for post processing; unused |
| | * 5 bits: Q bit rate for post proc; unused |
| | * if profile is simple or main |
| | ** 1 bit: loop filter flag |
| | ** 1 bit: reserved, should be 0 (looks like special coding mode known as J-frames in WMV2) |
| | ** 1 bit: multiresolution coding flag |
| | ** 1 bit: reserved, should be 1 |
| | ** 1 bit: fast U/V motion compensation (note: must be 1 in simple profile) - hints if decoder should round chroma motion values to halves |
| | ** 1 bit: extended motion vectors (note: must be 0 in simple profile) |
| | ** 2 bits: macroblock dequantization mode |
| | ** 1 bit: variable sized transform (i.e. allow 8x4, 4x8 and 4x4 blocks) |
| | ** 1 bit: reserved, should be 0 (possibly means if codeset for decoding AC coefficient is specified explicitly) |
| | ** 1 bit: overlapped transform flag |
| | ** 1 bit: sync marker flag |
| | ** 1 bit: range reduction flag |
| | ** 3 bits: maximum number of consecutive B frames |
| | ** 2 bits: quantizer mode |
| | * if profile is advanced |
| | ** 1 bit: post processing flag |
| | ** 12 bits: max coded width (actual width = (width + 1) * 2) |
| | ** 12 bits: max coded height (actual height = (height + 1) * 2) |
| | ** 1 bit: pulldown flag |
| | ** 1 bit: interlaced |
| | ** 1 bit: frame counter flag |
| | * 1 bit: frame interpolation flag ('finterp'); signals that an interpolation flag is present in each frame header |
| | * if profile is advanced |
| | ** '''(UNFINISHED: lots more stuff to be filled in when advanced profile is needed)''' |
| | * if profile is simple or main |
| | ** 1 bit: 'release-to-manufacturer' flag, should be 1; a value of 0 indicates an old WMV3 encoding with a different bitstream format for P/B frames that has not been figured out yet |
|
| |
|
| Next is the vc1dec.c:vc1DEC_DecoderInitialise() function. This sets up the positions and structures contained within the memory pool allocated for space.
| | In the case of simple or main profile data encapsulated in a general container format, the max coded width and height parameters will come from the container format rather than being encoded in the sequence layer. Observe that the total number of bits that comprise the sequence layer for simple or main profile data is 32 which, incidentally, ought to be the size of the extradata transmitted from the ASF container to a VC-1 decoder. |
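As a rough sketch, the 32-bit simple/main sequence layer can be unpacked field by field in the order listed above. The field names are illustrative labels rather than names from the standard, and the sketch assumes that the frame interpolation and release-to-manufacturer bits each occur exactly once, so that the widths add up to the 32 bits stated above.

```python
# (label, width in bits), in bitstream order; labels are illustrative only.
SIMPLE_MAIN_FIELDS = [
    ("profile", 2), ("reserved_a", 2), ("frmrtq_postproc", 3), ("bitrtq_postproc", 5),
    ("loop_filter", 1), ("reserved_b", 1), ("multires", 1), ("reserved_c", 1),
    ("fast_uvmc", 1), ("extended_mv", 1), ("dquant", 2), ("vstransform", 1),
    ("reserved_d", 1), ("overlap", 1), ("sync_marker", 1), ("range_reduction", 1),
    ("max_b_frames", 3), ("quantizer", 2), ("finterp", 1), ("rtm", 1),
]
assert sum(width for _, width in SIMPLE_MAIN_FIELDS) == 32


def parse_simple_main_sequence_layer(extradata: bytes) -> dict:
    """Unpack the first 4 bytes of extradata, most significant bit first."""
    if len(extradata) < 4:
        raise ValueError("simple/main sequence layer needs at least 4 bytes")
    word = int.from_bytes(extradata[:4], "big")
    remaining = 32
    fields = {}
    for name, width in SIMPLE_MAIN_FIELDS:
        remaining -= width
        fields[name] = (word >> remaining) & ((1 << width) - 1)
    if fields["profile"] not in (0, 1):
        raise ValueError("not simple or main profile")
    return fields
```

Advanced profile data uses a different and longer layout, so this sketch rejects it instead of guessing.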
|
| |
|
| Next is the vc1dec.c:vc1DEC_DecodeSequence() function which unpacks the sequence layer:
| | == High Level Decoding Algorithm == |
| | For each encoded frame: |
| | * unpack the frame information such as quantization parameters, bitplanes, and tables |
| | * for each field |
| | ** unpack each macroblock |
| | ** for each macroblock |
| | *** perform motion compensation if needed |
| | *** determine each block's coding mode (whether it is intra, and whether it is coded) |
| | *** for each block |
| | **** if block is intra or coded then decode it else proceed to the next block |
| | **** do inverse transform and in some cases add 128 to every sample value |
| | **** do overlapping if needed |
| | **** do postprocessing if requested |
|
| |
|
| 2 bits profile
| | === Decoding Motion Vector === |
| if (profile is simple or main)
| | Each motion vector is stored in the form F*36+LY*6+LX, where F is a flag which signals that the macroblock is coded, and LX and LY encode the sign and length of the dX and dY values. |
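The packing F*36+LY*6+LX can be undone with two divisions. This sketch assumes F is 0 or 1 and that LX and LY each lie in 0..5, which is what the base-6 packing implies; the function name is hypothetical.

```python
from typing import Tuple


def split_mv_index(index: int) -> Tuple[int, int, int]:
    """Split a decoded motion vector index into (F, LY, LX)."""
    if not 0 <= index < 72:
        raise ValueError("index must be in 0..71")
    coded_flag, rest = divmod(index, 36)
    ly, lx = divmod(rest, 6)
    return coded_flag, ly, lx
```

Turning LX and LY into the actual dX and dY values needs further bits from the stream and is not covered here.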
| 2 bits VC1_BITS_RES_SM (? SRD does not care)
| |
| if (profile is advanced)
| |
| 3 bits level of advanced profile
| |
| 2 bits chroma format (note that only format 1, YUV 4:2:0 is defined)
| |
| 3 bits QFrameRateForPostProc ("see standard", SRD does not use)
| |
| 5 bits QBitRateForPostProc ("see standard", SRD does not use)
| |
| if (profile is simple or main)
| |
| 1 bit loop filter flag
| |
| 1 bit reserved, should be 0
| |
| 1 bit multiresolution coding flag
| |
| 1 bit reserved, should be 1, SRD calls it "RES_FASTTX"
| |
| 1 bit fast U/V motion compensation
| |
| note: must be 1 in simple profile
| |
| 1 bit extended motion vectors
| |
| note: must be 0 in simple profile
| |
| 2 bits macroblock dequantization
| |
| 1 bit variable sized transform
| |
| 1 bit reserved, should be 0, SRD calls it "RES_TRANSTAB"
| |
| 1 bit overlapped transform flag
| |
| 1 bit sync marker flag
| |
| 1 bit range reduction flag
| |
| 3 bits maximum number of consecutive B frames
| |
| 2 bits quantizer
| |
| if (profile is advanced)
| |
| 1 bit post processing flag
| |
| 12 bits max coded width (actual width = (w + 1) * 2)
| |
| 12 bits max coded height (actual height = (h + 1) * 2)
| |
| 1 bit pulldown flag
| |
| 1 bit interlaced
| |
| 1 bit frame counter flag
| |
| 1 bit frame interpolation flag
| |
| if (profile is advanced)
| |
| [lots more stuff to be filled in when advanced profile is needed]
| |
| if (profile is simple or main)
| |
| 1 bit reserved, should be 1, SRD calls it "RES_RTM_FLAG"
| |
|
| |
|
| Finally, it is time to decode an actual frame (referred to as "unpacking the picture layer"). The decode process iterates through however many fields comprise the frame (1 or 2).
| | === Decoding Intra Block === |
| | Overall decoding process: |
| | * predict DC value |
| | * read and apply delta value |
| | * if block has AC coeffs |
| | ** select dezigzag matrix |
| | ** while it is not the last coefficient |
| | *** decode AC coefficient information |
| | *** put AC coefficient into designated place in block |
| | * if specified, do AC prediction based on the predicted DC direction |
| | * unquantize all coefficients |
|
| |
|
| Choose from among 5 different zigzag table sets depending on profile and interlacing:
| | === Decoding Inter Block === |
| | An inter block is composed of AC coefficients starting from the top left corner. The only catch is that it may be divided into sub-blocks, which need to be decoded separately. |
|
| |
|
| if (picture format is interlaced frame)
| | === Decoding AC Coefficient === |
| choose set 4
| | Each AC coefficient is coded as a special value that decomposes into the number of zeroes preceding the coefficient, the coefficient value, and a flag indicating whether this is the last non-zero coefficient. All information needed to decode these values is contained in special tables. |
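To illustrate how (run, value, last) triples fill a block, here is a small sketch. It assumes the triples have already been decoded from the Huffman tables, and it uses the classic JPEG-style zigzag scan purely as a stand-in, since the 13 actual VC-1 scan patterns are not listed on this page. All names are hypothetical.

```python
from typing import Iterable, List, Tuple


def classic_zigzag() -> List[int]:
    """Raster positions of an 8x8 block in classic zigzag order (stand-in scan)."""
    cells = [(r, c) for r in range(8) for c in range(8)]
    # Walk anti-diagonals; odd diagonals go top to bottom, even ones bottom to top.
    cells.sort(key=lambda rc: (rc[0] + rc[1],
                               rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [r * 8 + c for r, c in cells]


def place_ac_coefficients(symbols: Iterable[Tuple[int, int, bool]],
                          scan: List[int], first_index: int = 1) -> List[int]:
    """Place decoded (run, value, last) triples into a 64-entry block.

    `first_index` is 1 when position 0 is reserved for the DC coefficient.
    """
    block = [0] * 64
    index = first_index
    for run, value, last in symbols:
        index += run  # skip the run of zero coefficients
        if index >= 64:
            raise ValueError("coefficient index runs past the end of the block")
        block[scan[index]] = value
        index += 1
        if last:
            break
    return block
```

For an inter block, where the coefficients start from the top left corner, `first_index` would be 0 under the same assumptions.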
| if (picture is intra)
| |
| choose set 0
| |
| else
| |
| if (profile is simple or main)
| |
| choose set 1
| |
| else
| |
| if (picture format is progressive)
| |
| choose set 2
| |
| else
| |
| choose set 3
| |
| '''(unfinished)'''
| |
| ... there is a lot more logic dealing with frame accounting; let's skip to the real meat: macroblock decoding! ... | |
|
| |
|
| Decode a macroblock:
| | === AC Prediction === |
| set the macroblock overlap filter flag, coding type, quantizer and halfstep
| | AC prediction simply adds seven AC coefficient values, taken from the first row of the block above or the first column of the block to the left of the current one, to the corresponding coefficients of the destination block. The source block is the same one whose DC value was used for prediction. |
| parameters to the same as the picture
| | This process is performed only when the special bit is set. |
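A minimal sketch of the AC prediction step, assuming blocks are held as 8x8 lists of quantized coefficients indexed as <code>block[row][column]</code>. The function and parameter names are hypothetical.

```python
from typing import List


def apply_ac_prediction(block: List[List[int]], predictor: List[List[int]],
                        from_top: bool) -> List[List[int]]:
    """Add seven predictor AC coefficients to `block` in place and return it.

    from_top=True uses the first row of the block above; otherwise the first
    column of the block to the left is used. The DC coefficient at [0][0] is
    not touched here because it is predicted separately.
    """
    for i in range(1, 8):
        if from_top:
            block[0][i] += predictor[0][i]
        else:
            block[i][0] += predictor[i][0]
    return block
```

The caller is expected to pass the same neighbouring block that was chosen for DC prediction, and to skip this step entirely when the prediction bit is not set.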
| clear the skipped flag
| |
| set the CBP to 0 (no coded blocks)
| |
| choose the quantizer (long list of logic, see
| |
| vc1iquant.c:vc1IQUANT_ChooseQuantizer())
| |
| for each of the 6 sub-blocks, set coded field to 0, clear down all MV data
| |
| decide on non-uniform quantizer
| |
|
| |
| unpack an I or BI macroblock:
| |
|
| |
|
| '''(unfinished)'''
| | == Official Information == |
| | This Wiki aims to provide a complete, independent, and understandable description of the VC-1 format. Until such time, here are some external references on the format. |
| | * Old specs can be found here: http://jovian.com/files/C24.008-VC9-Spec-CD1.pdf |
| | * VC-1 Compressed Video Bitstream Format and Decoding Process http://www.smpte.org/smpte_store/standards/pdf/s421m.pdf |
| | * VC-1 Bitstream Transport Encodings (specs for placing VC-1 in MPEG-2 Program and Transport streams) http://www.smpte.org/smpte_store/standards/pdf/rp227.pdf |
| | * VC-1 Decoder and Bitstream Conformance [http://www.smpte.org/smpte_store/standards/pdf/rp228.pdf http://www.smpte.org/smpte_store/standards/pdf/rp228.pdf] |
| | * Googling for VC1_reference_decoder_release6.zip might turn up sources for the reference decoder. |
|
| |
|
| [[Category:Video Codecs]]
| | == WMVP differences from WMV3 == |
|
| |
|
| | WMVP is essentially a slide show containing source material and transform information. |
|
| |
|
|
| | Source material is stored as a sprite, which is transformed and cropped afterwards (sprite dimensions are usually larger than the output). |
| | Sprite properties are stored in the sequence header in place of the <code>RES_RTM</code> flag: |
|
| |
|
| == Data Format ==
| | sprite width - 11 bits |
| | sprite height - 11 bits |
| | frame rate - 5 bits |
| | X8 presence - 1 bit |
| | skip DC/AC tables - 1 bit |
| | slice code - 3 bits |
|
| |
|
| This description assumes that the data to be decoded in WMV3 data coming in from a [[Microsoft Advanced Streaming Format|Microsoft ASF]] file. The video data should be packaged with &quot;extradata&quot; which is attached to the end of a [[BITMAPINFOHEADER]] structure and transported in the ASF file. The format of the extradata is as follows:
| | Frames are preceded by two bits with undiscovered meaning. |
| | | I-frames contain sprites and transform coefficients in 15.15 fixed point format, P-frames contain only transform coefficients. |
| 2 bits VC-1 Profile
| |
| if (profile == 3)
| |
| 3 bits Profile level
| |
| 2 bits Chroma format (SRD does not care)
| |
| 3 bits VC1_BITS_FRMRTQ_POSTPROC (? SRD does not care)
| |
| 5 bits VC1_BITS_BITRTQ_POSTPROC (? SRD does not care)
| |
| 1 bit VC1_BITS_POSTPROCFLAG (? SRD does not care)
| |
| 12 bits Encoded width (actual width = (w + 1) * 2)
| |
| 12 bits Encoded height (actual height = (h + 1) * 2)
| |
| | |
| There are 4 VC-1 profiles:
| |
| | |
| * 0 simple profile
| |
| * 1 main profile
| |
| * 2 reserved
| |
| * 3 advanced profile
| |
| | |
| If profile is advanced, the extradata carries a lot of setup information. For simple and main profiles, the relevant setup data is established outside of the decoder, e.g., the [[BITMAPINFOHEADER]] of a Microsoft ASF file. This information provides the width and height that the decoder uses to set up its state.
| |
| | |
| The decoder computes the macroblock width and height as the ceiling of each dimension divided by 16:
| |
| | |
| macroblock_width = (frame_width + 15) / 16
| |
| macroblock_height = (frame_height + 15) / 16
| |
| | |
| The total number of macroblocks in a frame is defined as:
| |
| | |
| total_macroblocks = macroblock_width * macroblock_height
| |
| | |
| If the level is marked as unknown during the initialization process, figure out what level the video belongs at. This is determined by the number of macroblocks in combination with the profile. The relevant table is vc1gentab.c:vc1GENTAB_LevelLimits&lt;nowiki&gt;[][]&lt;/nowiki&gt;. The profile/level combination defines the following limits:
| |
| | |
| max macroblocks/second
| |
| max macroblocks/frame
| |
| max peak transmission rate in kbps
| |
| max buffer size in multiples of 16 kbits
| |
| motion vector range
| |
| | |
| SRD maintains the following information about each macroblock:
| |
| | |
| macroblock type, contains the following attributes:
| |
| (attribute 1)
| |
| intra
| |
| 1 MV
| |
| 2 MV
| |
| 4 MV
| |
|
| |
| (attribute 2)
| |
| direct macroblock
| |
| forward prediction
| |
| backward prediction
| |
| forward and backward prediction
| |
|
| |
| (attribute 3)
| |
| MVs apply to fields
| |
| bottom different than top
| |
| field transform
| |
|
| |
| AC prediction status, one of the following attributes:
| |
| AC prediction off
| |
| AC prediction on
| |
| no blocks can be predicted
| |
|
| |
| Block type, one of the following types:
| |
| 8x8 inter-coded
| |
| 8x4 inter-coded
| |
| 4x8 inter-coded
| |
| 4x4 inter-coded
| |
| intra-coded, no AC prediction
| |
| intra-coded, AC prediction from top values
| |
| intra-coded, AC prediction from left values
| |
|
| |
| flag indicating whether overlap filter is active for this macroblock
| |
| flag indicating whether macroblock is motion predicted only (no residual)
| |
| byte indicating coded block pattern which indicates which of the 6
| |
| sub-blocks are coded
| |
|
| |
| Quantizer information, this includes:
| |
| quantizer step in the range 1..31
| |
| quantizer half step, either 0 or 1
| |
| flag indicating uniform or non-uniform quantizer
| |
|
| |
| Information for each of the 6 constituent blocks in the macroblock:
| |
| block type (same choices as the macroblock attributes)
| |
| flag indicating non-zero AC coeffs for intra, non-zero AC/DC for inter
| |
| union between an intra block structure and an inter block structure
| |
| intra structure:
| |
| number of zero/non-zero AC coeffs
| |
| quantized DC coeff
| |
| quantized AC top row for prediction (7 values)
| |
| quantized AC left column for prediction (7 values)
| |
| bottom 2 rows (16 values) kept for overlap smoothing
| |
| inter structure:
| |
| number of zero/non-zero AC coeffs for 4 sub-blocks (Y blocks?)
| |
| forward and backward motion vector structures, each includes:
| |
| hybrid prediction mode, one of the following attributes:
| |
| predict from left
| |
| predict from top
| |
| no hybrid prediction
| |
| (x,y) motion vectors for each of the 4 Y blocks
| |
| (x,y) differential motion vectors in 1/4 pel units
| |
| (note: I am a little confused as to why each of the 6 sub-blocks stores the motion vector data for the entire macroblock)
| |
| | |
| The initializer then needs to computer how much space to allocate for each reference frame. The size of a frame determined by frame width and height, encoding profile, and interlacing. This size is used to allocate space for 4 different frames:
| |
| | |
| reference new (new/current I/P frame)
| |
| reference old (old I/P reference frame)
| |
| reference B (reconstructed B frame)
| |
| reference NoIC (B reference before intensity compensation was applied)
| |
| | |
| Further, the initializer allocates space for 7 different bitplanes. Each bitplanes has 1 flag per each macroblock as enumerated by the max macroblocks per frame for the profile/level. The bitplanes are:
| |
| | |
| ACPRED
| |
| SKIPMB
| |
| MVTYPEMB
| |
| DIRECTMB
| |
| OVERFLAGS
| |
| FORWARDMB
| |
| FIELDTX
| |
| | |
| Allocate space for motion vector history. The number of entries in this array is macroblock_width * (macroblock_height + 1) (extra height is for interlaced field). Each entry is a motion vector history structure which contains the 4 Y block motion vectors for a particular macroblock. The individual motion
| |
| vector structures are the same as in the intra structure which provides hybrid prediction, motion vectors, and diff MVs (again, 4 for each block?).
| |
| | |
| And that's it for the SRD &quot;requirements gathering&quot; process (vc1dec.c:vc1DEC_DecoderRequirements()). The function returns the number of bytes needed for the decoder's internal state. The client app is expected
| |
| to allocate enough space for this state.
| |
| | |
| Next is the vc1dec.c:vc1DEC_DecoderInitialise() function. This sets up the positions and structures contained within the memory pool allocated for space.
| |
| | |
Next is the vc1dec.c:vc1DEC_DecodeSequence() function, which unpacks the sequence layer:

 2 bits   profile
 if (profile is simple or main)
   2 bits   VC1_BITS_RES_SM (? SRD does not care)
 if (profile is advanced)
   3 bits   level of advanced profile
   2 bits   chroma format (note that only format 1, YUV 4:2:0 is defined)
 3 bits   QFrameRateForPostProc ("see standard", SRD does not use)
 5 bits   QBitRateForPostProc ("see standard", SRD does not use)
 if (profile is simple or main)
   1 bit    loop filter flag
   1 bit    reserved, should be 0
   1 bit    multiresolution coding flag
   1 bit    reserved, should be 1, SRD calls it "RES_FASTTX"
   1 bit    fast U/V motion compensation
            note: must be 1 in simple profile
   1 bit    extended motion vectors
            note: must be 0 in simple profile
   2 bits   macroblock dequantization
   1 bit    variable sized transform
   1 bit    reserved, should be 0, SRD calls it "RES_TRANSTAB"
   1 bit    overlapped transform flag
   1 bit    sync marker flag
   1 bit    range reduction flag
   3 bits   maximum number of consecutive B frames
   2 bits   quantizer
 if (profile is advanced)
   1 bit    post processing flag
   12 bits  max coded width (actual width = (w + 1) * 2)
   12 bits  max coded height (actual height = (h + 1) * 2)
   1 bit    pulldown flag
   1 bit    interlaced
   1 bit    frame counter flag
 1 bit    frame interpolation flag
 if (profile is advanced)
   [lots more stuff to be filled in when advanced profile is needed]
 if (profile is simple or main)
   1 bit    reserved, should be 1, SRD calls it "RES_RTM_FLAG"
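For simple and main profile, the fields listed above add up to exactly 32 bits. The Python sketch below reads them in that order. It assumes the fields are packed most-significant-bit first, and the BitReader class, the dictionary keys, and the function name are all made up for this example rather than taken from the SRD. Advanced profile is deliberately not handled.

```python
class BitReader:
    """Reads MSB-first bit fields from a bytes object."""

    def __init__(self, data):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, nbits):
        if nbits > self.remaining:
            raise ValueError("read past end of header")
        self.remaining -= nbits
        return (self.value >> self.remaining) & ((1 << nbits) - 1)


# (field name, bit width) in listing order, after the 2-bit profile.
SIMPLE_MAIN_FIELDS = [
    ("res_sm", 2), ("frmrtq_postproc", 3), ("bitrtq_postproc", 5),
    ("loopfilter", 1), ("reserved_0", 1), ("multires", 1),
    ("res_fasttx", 1), ("fastuvmc", 1), ("extended_mv", 1),
    ("dquant", 2), ("vstransform", 1), ("res_transtab", 1),
    ("overlap", 1), ("syncmarker", 1), ("rangered", 1),
    ("maxbframes", 3), ("quantizer", 2), ("finterpflag", 1),
    ("res_rtm_flag", 1),
]


def parse_simple_main_sequence(data):
    """Unpack a 4-byte simple/main profile sequence header into a dict."""
    reader = BitReader(data)
    header = {"profile": reader.read(2)}
    if header["profile"] not in (0, 1):
        raise NotImplementedError("only simple (0) and main (1) are handled")
    for name, width in SIMPLE_MAIN_FIELDS:
        header[name] = reader.read(width)
    return header
```

A table of (name, width) pairs keeps the read order next to the listing it mirrors, so a mismatch between the two is easy to spot.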
| |
| | |
Finally, it is time to decode an actual frame (referred to as "unpacking the picture layer"). The decode process iterates through however many fields comprise the frame (1 or 2).
| |
| | |
Choose from among 5 different zigzag table sets depending on profile and interlacing:

 if (picture format is interlaced frame)
   choose set 4
 if (picture is intra)
   choose set 0
 else
   if (profile is simple or main)
     choose set 1
   else
     if (picture format is progressive)
       choose set 2
     else
       choose set 3
| |
| '''(unfinished)'''
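The zigzag selection above can be written as a small function. This sketch assumes the interlaced-frame test takes priority over the intra test (the listing does not say whether the second test is an "else" branch), and the function and parameter names are invented for the example.

```python
def choose_zigzag_set(interlaced_frame, intra, simple_or_main, progressive):
    """Return the zigzag table set index (0..4) following the listing above."""
    if interlaced_frame:
        # Assumption: interlaced frame pictures use set 4 regardless of type.
        return 4
    if intra:
        return 0
    if simple_or_main:
        return 1
    return 2 if progressive else 3
```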
| |
| ... there is a lot more logic dealing with frame accounting; let's skip to the real meat: macroblock decoding! ...
| |
| | |
Decode a macroblock:

 set the macroblock overlap filter flag, coding type, quantizer and halfstep parameters to the same as the picture
 clear the skipped flag
 set the CBP to 0 (no coded blocks)
 choose the quantizer (long list of logic, see vc1iquant.c:vc1IQUANT_ChooseQuantizer())
 for each of the 6 sub-blocks, set coded field to 0, clear down all MV data
 decide on non-uniform quantizer
 
 unpack an I or BI macroblock:
| |
| | |
| '''(unfinished)'''
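The per-macroblock reset steps listed under "Decode a macroblock" might look roughly like this in Python. The dictionary keys are hypothetical, and choose_quantizer is a caller-supplied stand-in for the SRD's vc1IQUANT_ChooseQuantizer(), whose logic is not reproduced here. The non-uniform quantizer decision is left out as well.

```python
def reset_macroblock(mb, picture, choose_quantizer):
    """Reset per-macroblock state to picture-level defaults before unpacking."""
    # Inherit the picture-level parameters.
    for key in ("overlap_filter", "coding_type", "quantizer_step", "half_step"):
        mb[key] = picture[key]
    mb["skipped"] = False
    mb["cbp"] = 0  # no coded blocks yet
    # Hypothetical hook standing in for vc1IQUANT_ChooseQuantizer().
    mb["quantizer_step"] = choose_quantizer(mb, picture)
    for block in mb["blocks"]:  # the 6 sub-blocks of the macroblock
        block["coded"] = 0
        block["motion_vectors"] = [(0, 0)] * 4  # clear all MV data
    return mb
```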
| |
| | |
| [[Category:Video Codecs]]
| |
| | |
| | |
|
| |
| | |
| == Data Format ==
| |
| | |
This description assumes that the data to be decoded is WMV3 data coming in from a [[Microsoft Advanced Streaming Format|Microsoft ASF]] file. The video data should be packaged with "extradata" which is attached to the end of a [[BITMAPINFOHEADER]] structure and transported in the ASF file. The format of the extradata is as follows:
| |
| | |
 2 bits   VC-1 Profile
 if (profile == 3)
   3 bits   Profile level
   2 bits   Chroma format (SRD does not care)
   3 bits   VC1_BITS_FRMRTQ_POSTPROC (? SRD does not care)
   5 bits   VC1_BITS_BITRTQ_POSTPROC (? SRD does not care)
   1 bit    VC1_BITS_POSTPROCFLAG (? SRD does not care)
   12 bits  Encoded width (actual width = (w + 1) * 2)
   12 bits  Encoded height (actual height = (h + 1) * 2)
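For profile 3, the fields in the extradata listing total 40 bits, so they fit in the first 5 bytes. The sketch below pulls them out with shifts and masks, assuming the fields are packed most-significant-bit first starting at the first extradata byte, as the listing implies. Real files may prefix these fields with additional bytes such as a start code; that is not handled here. The function name and dictionary keys are invented for the example.

```python
def parse_advanced_extradata_prefix(data):
    """Decode the first 40 bits of the extradata described above."""
    if len(data) < 5:
        raise ValueError("need at least 5 bytes")
    bits = int.from_bytes(data[:5], "big")
    profile = bits >> 38
    if profile != 3:
        # Simple/main: the remaining fields use a different layout.
        return {"profile": profile}
    return {
        "profile": profile,
        "level": (bits >> 35) & 0x7,
        "chroma_format": (bits >> 33) & 0x3,
        "frmrtq_postproc": (bits >> 30) & 0x7,
        "bitrtq_postproc": (bits >> 25) & 0x1F,
        "postprocflag": (bits >> 24) & 0x1,
        # Stored values are (dimension / 2) - 1.
        "width": (((bits >> 12) & 0xFFF) + 1) * 2,
        "height": ((bits & 0xFFF) + 1) * 2,
    }
```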
| |
| | |
| There are 4 VC-1 profiles:
| |
| | |
| * 0 simple profile
| |
| * 1 main profile
| |
| * 2 reserved
| |
| * 3 advanced profile
| |
| | |
| If profile is advanced, the extradata carries a lot of setup information. For simple and main profiles, the relevant setup data is established outside of the decoder, e.g., the [[BITMAPINFOHEADER]] of a Microsoft ASF file. This information provides the width and height that the decoder uses to set up its state.
| |
| | |
| The decoder computes the macroblock width and height as the ceiling of each dimension divided by 16:
| |
| | |
| macroblock_width = (frame_width + 15) / 16
| |
| macroblock_height = (frame_height + 15) / 16
| |
| | |
| The total number of macroblocks in a frame is defined as:
| |
| | |
| total_macroblocks = macroblock_width * macroblock_height
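The same arithmetic in Python, as a minimal sketch (the function name is an assumption). Note that the division in the formulas above is integer division, which is what makes adding 15 act as a ceiling.

```python
def macroblock_dimensions(frame_width, frame_height):
    """Return (macroblock_width, macroblock_height, total_macroblocks)."""
    mb_width = (frame_width + 15) // 16
    mb_height = (frame_height + 15) // 16
    return mb_width, mb_height, mb_width * mb_height
```

For example, a 1920x1080 frame gives 120 x 68 macroblocks, because 1080 is not a multiple of 16 and the height rounds up to 1088.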
| |
| | |
If the level is marked as unknown during the initialization process, determine which level the video belongs to. This is determined by the number of macroblocks in combination with the profile. The relevant table is vc1gentab.c:vc1GENTAB_LevelLimits<nowiki>[][]</nowiki>. The profile/level combination defines the following limits:
| |
| | |
| max macroblocks/second
| |
| max macroblocks/frame
| |
| max peak transmission rate in kbps
| |
| max buffer size in multiples of 16 kbits
| |
| motion vector range
| |
| | |
SRD maintains the following information about each macroblock:

 macroblock type, contains the following attributes:
   (attribute 1)
     intra
     1 MV
     2 MV
     4 MV
 
   (attribute 2)
     direct macroblock
     forward prediction
     backward prediction
     forward and backward prediction
 
   (attribute 3)
     MVs apply to fields
     bottom different than top
     field transform
 
 AC prediction status, one of the following attributes:
   AC prediction off
   AC prediction on
   no blocks can be predicted
 
 Block type, one of the following types:
   8x8 inter-coded
   8x4 inter-coded
   4x8 inter-coded
   4x4 inter-coded
   intra-coded, no AC prediction
   intra-coded, AC prediction from top values
   intra-coded, AC prediction from left values
 
 flag indicating whether overlap filter is active for this macroblock
 flag indicating whether macroblock is motion predicted only (no residual)
 byte indicating coded block pattern which indicates which of the 6 sub-blocks are coded
 
 Quantizer information, this includes:
   quantizer step in the range 1..31
   quantizer half step, either 0 or 1
   flag indicating uniform or non-uniform quantizer
 
 Information for each of the 6 constituent blocks in the macroblock:
   block type (same choices as the macroblock attributes)
   flag indicating non-zero AC coeffs for intra, non-zero AC/DC for inter
   union between an intra block structure and an inter block structure
     intra structure:
       number of zero/non-zero AC coeffs
       quantized DC coeff
       quantized AC top row for prediction (7 values)
       quantized AC left column for prediction (7 values)
       bottom 2 rows (16 values) kept for overlap smoothing
     inter structure:
       number of zero/non-zero AC coeffs for 4 sub-blocks (Y blocks?)
       forward and backward motion vector structures, each includes:
         hybrid prediction mode, one of the following attributes:
           predict from left
           predict from top
           no hybrid prediction
         (x,y) motion vectors for each of the 4 Y blocks
         (x,y) differential motion vectors in 1/4 pel units

(note: I am a little confused as to why each of the 6 sub-blocks stores the motion vector data for the entire macroblock)
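To make the layout of the per-macroblock state easier to picture, here is a loose Python model of part of it. Every class and field name is hypothetical. The mapping of CBP bit i to block i is an assumption, and the intra/inter union is modelled as two optional members instead of overlapping storage.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class ACPrediction(Enum):
    OFF = auto()
    ON = auto()
    NONE_POSSIBLE = auto()


class BlockType(Enum):
    INTER_8X8 = auto()
    INTER_8X4 = auto()
    INTER_4X8 = auto()
    INTER_4X4 = auto()
    INTRA = auto()
    INTRA_PRED_TOP = auto()
    INTRA_PRED_LEFT = auto()


@dataclass
class Quantizer:
    step: int = 1              # 1..31
    half_step: int = 0         # 0 or 1
    non_uniform: bool = False

    def __post_init__(self):
        if not 1 <= self.step <= 31:
            raise ValueError("quantizer step must be in 1..31")
        if self.half_step not in (0, 1):
            raise ValueError("half step must be 0 or 1")


@dataclass
class IntraBlock:
    dc: int = 0
    ac_top: List[int] = field(default_factory=lambda: [0] * 7)
    ac_left: List[int] = field(default_factory=lambda: [0] * 7)
    bottom_rows: List[int] = field(default_factory=lambda: [0] * 16)


@dataclass
class BlockState:
    block_type: BlockType = BlockType.INTRA
    coded: bool = False
    intra: Optional[IntraBlock] = None   # set when the block is intra
    inter: Optional[dict] = None         # MV data when the block is inter


@dataclass
class MacroblockState:
    ac_prediction: ACPrediction = ACPrediction.OFF
    overlap_filter: bool = False
    skipped: bool = False                # motion predicted only, no residual
    cbp: int = 0                         # coded block pattern, 6 bits used
    quantizer: Quantizer = field(default_factory=Quantizer)
    blocks: List[BlockState] = field(
        default_factory=lambda: [BlockState() for _ in range(6)])

    def block_is_coded(self, index):
        # Assumption: bit i of the CBP corresponds to block i.
        return bool(self.cbp & (1 << index))
```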
| |
| | |
The initializer then needs to compute how much space to allocate for each reference frame. The size of a frame is determined by frame width and height, encoding profile, and interlacing. This size is used to allocate space for 4 different frames:

 reference new (new/current I/P frame)
 reference old (old I/P reference frame)
 reference B (reconstructed B frame)
 reference NoIC (B reference before intensity compensation was applied)
| == Data Format ==
| |
| | |
| This description assumes that the data to be decoded in WMV3 data coming in from a [[Microsoft Advanced Streaming Format|Microsoft ASF]] file. The video data should be packaged with "extradata" which is attached to the end of a [[BITMAPINFOHEADER]] structure and transported in the ASF file. The format of the extradata is as follows:
| |
| | |
| 2 bits VC-1 Profile
| |
| if (profile == 3)
| |
| 3 bits Profile level
| |
| 2 bits Chroma format (SRD does not care)
| |
| 3 bits VC1_BITS_FRMRTQ_POSTPROC (? SRD does not care)
| |
| 5 bits VC1_BITS_BITRTQ_POSTPROC (? SRD does not care)
| |
| 1 bit VC1_BITS_POSTPROCFLAG (? SRD does not care)
| |
| 12 bits Encoded width (actual width = (w + 1) * 2)
| |
| 12 bits Encoded height (actual height = (h + 1) * 2)
| |
| | |
| There are 4 VC-1 profiles:
| |
| | |
| * 0 simple profile
| |
| * 1 main profile
| |
| * 2 reserved
| |
| * 3 advanced profile
| |
| | |
| If profile is advanced, the extradata carries a lot of setup information. For simple and main profiles, the relevant setup data is established outside of the decoder, e.g., the [[BITMAPINFOHEADER]] of a Microsoft ASF file. This information provides the width and height that the decoder uses to set up its state.
| |
| | |
| The decoder computes the macroblock width and height as the ceiling of each dimension divided by 16:
| |
| | |
| macroblock_width = (frame_width + 15) / 16
| |
| macroblock_height = (frame_height + 15) / 16
| |
| | |
| The total number of macroblocks in a frame is defined as:
| |
| | |
| total_macroblocks = macroblock_width * macroblock_height
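As a quick check of the arithmetic above, here is a minimal sketch. The function name <code>macroblock_dimensions</code> is made up for illustration. Note that the division must be integer division for the "+ 15" trick to act as a ceiling.

```python
def macroblock_dimensions(frame_width: int, frame_height: int):
    """Return (macroblock_width, macroblock_height, total_macroblocks)."""
    # Adding 15 before the integer division rounds up to whole 16x16 macroblocks.
    mb_width = (frame_width + 15) // 16
    mb_height = (frame_height + 15) // 16
    return mb_width, mb_height, mb_width * mb_height


# A 320x240 frame is exactly 20x15 macroblocks; 1080 lines need a partial 68th row.
print(macroblock_dimensions(320, 240))
print(macroblock_dimensions(1920, 1080))
```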
| |
| | |
| If the level is marked as unknown during the initialization process, the decoder works out which level the video belongs to. This is determined by the number of macroblocks in combination with the profile. The relevant table is vc1gentab.c:vc1GENTAB_LevelLimits<nowiki>[][]</nowiki>. The profile/level combination defines the following limits:
| |
| | |
| max macroblocks/second
| |
| max macroblocks/frame
| |
| max peak transmission rate in kbps
| |
| max buffer size in multiples of 16 kbits
| |
| motion vector range
| |
| | |
| SRD maintains the following information about each macroblock:
| |
| | |
| macroblock type, contains the following attributes:
| |
| (attribute 1)
| |
| intra
| |
| 1 MV
| |
| 2 MV
| |
| 4 MV
| |
|
| |
| (attribute 2)
| |
| direct macroblock
| |
| forward prediction
| |
| backward prediction
| |
| forward and backward prediction
| |
|
| |
| (attribute 3)
| |
| MVs apply to fields
| |
| bottom different than top
| |
| field transform
| |
|
| |
| AC prediction status, one of the following attributes:
| |
| AC prediction off
| |
| AC prediction on
| |
| no blocks can be predicted
| |
|
| |
| Block type, one of the following types:
| |
| 8x8 inter-coded
| |
| 8x4 inter-coded
| |
| 4x8 inter-coded
| |
| 4x4 inter-coded
| |
| intra-coded, no AC prediction
| |
| intra-coded, AC prediction from top values
| |
| intra-coded, AC prediction from left values
| |
|
| |
| flag indicating whether overlap filter is active for this macroblock
| |
| flag indicating whether macroblock is motion predicted only (no residual)
| |
| byte indicating coded block pattern which indicates which of the 6 sub-blocks are coded
| |
|
| |
| Quantizer information, this includes:
| |
| quantizer step in the range 1..31
| |
| quantizer half step, either 0 or 1
| |
| flag indicating uniform or non-uniform quantizer
| |
|
| |
| Information for each of the 6 constituent blocks in the macroblock:
| |
| block type (same choices as the macroblock attributes)
| |
| flag indicating non-zero AC coeffs for intra, non-zero AC/DC for inter
| |
| union between an intra block structure and an inter block structure
| |
| intra structure:
| |
| number of zero/non-zero AC coeffs
| |
| quantized DC coeff
| |
| quantized AC top row for prediction (7 values)
| |
| quantized AC left column for prediction (7 values)
| |
| bottom 2 rows (16 values) kept for overlap smoothing
| |
| inter structure:
| |
| number of zero/non-zero AC coeffs for 4 sub-blocks (Y blocks?)
| |
| forward and backward motion vector structures, each includes:
| |
| hybrid prediction mode, one of the following attributes:
| |
| predict from left
| |
| predict from top
| |
| no hybrid prediction
| |
| (x,y) motion vectors for each of the 4 Y blocks
| |
| (x,y) differential motion vectors in 1/4 pel units
| |
| (note: I am a little confused as to why each of the 6 sub-blocks stores the motion vector data for the entire macroblock)
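The per-block part of this bookkeeping might be modelled roughly as follows. This is a sketch, not the SRD's actual layout: all class and field names are invented, the intra/inter union is approximated with two optional fields, the macroblock type attributes and AC prediction status are left out for brevity, and storing one differential MV per Y block is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional, Tuple


class BlockType(Enum):
    INTER_8X8 = auto()
    INTER_8X4 = auto()
    INTER_4X8 = auto()
    INTER_4X4 = auto()
    INTRA_NO_AC_PRED = auto()
    INTRA_AC_PRED_TOP = auto()
    INTRA_AC_PRED_LEFT = auto()


class HybridPrediction(Enum):
    FROM_LEFT = auto()
    FROM_TOP = auto()
    NONE = auto()


@dataclass
class MotionVectors:
    hybrid: HybridPrediction = HybridPrediction.NONE
    # (x, y) vectors for each of the 4 Y blocks
    mv: List[Tuple[int, int]] = field(default_factory=lambda: [(0, 0)] * 4)
    # (x, y) differential vectors in 1/4 pel units (count of 4 is assumed)
    diff_mv: List[Tuple[int, int]] = field(default_factory=lambda: [(0, 0)] * 4)


@dataclass
class IntraData:
    num_coeffs: int = 0
    dc: int = 0
    ac_top: List[int] = field(default_factory=lambda: [0] * 7)
    ac_left: List[int] = field(default_factory=lambda: [0] * 7)
    overlap_rows: List[int] = field(default_factory=lambda: [0] * 16)


@dataclass
class InterData:
    num_coeffs: List[int] = field(default_factory=lambda: [0] * 4)
    forward: MotionVectors = field(default_factory=MotionVectors)
    backward: MotionVectors = field(default_factory=MotionVectors)


@dataclass
class Block:
    block_type: BlockType = BlockType.INTRA_NO_AC_PRED
    coded: bool = False
    intra: Optional[IntraData] = None  # set for intra blocks
    inter: Optional[InterData] = None  # set for inter blocks


@dataclass
class Macroblock:
    overlap_filter: bool = False
    skipped: bool = False  # motion predicted only, no residual
    cbp: int = 0           # coded block pattern, one bit per sub-block
    quant_step: int = 1    # 1..31
    quant_half_step: int = 0
    non_uniform_quant: bool = False
    blocks: List[Block] = field(default_factory=lambda: [Block() for _ in range(6)])
```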
| |
| | |
| The initializer then needs to compute how much space to allocate for each reference frame. The size of a frame is determined by frame width and height, encoding profile, and interlacing. This size is used to allocate space for 4 different frames:
| |
| | |
| reference new (new/current I/P frame)
| |
| reference old (old I/P reference frame)
| |
| reference B (reconstructed B frame)
| |
| reference NoIC (B reference before intensity compensation was applied)
| |
| | |
| Further, the initializer allocates space for 7 different bitplanes. Each bitplane has 1 flag per macroblock, with the count given by the max macroblocks per frame for the profile/level. The bitplanes are:
| |
| | |
| ACPRED
| |
| SKIPMB
| |
| MVTYPEMB
| |
| DIRECTMB
| |
| OVERFLAGS
| |
| FORWARDMB
| |
| FIELDTX
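A minimal sketch of the bitplane allocation, assuming one byte per flag for simplicity (the SRD may well pack the flags differently). <code>allocate_bitplanes</code> is a hypothetical name, and the caller is expected to pass in the max macroblocks per frame for the profile/level.

```python
BITPLANE_NAMES = (
    "ACPRED", "SKIPMB", "MVTYPEMB", "DIRECTMB",
    "OVERFLAGS", "FORWARDMB", "FIELDTX",
)


def allocate_bitplanes(max_mbs_per_frame: int) -> dict:
    """Return one zeroed flag array per bitplane, one entry per macroblock."""
    if max_mbs_per_frame <= 0:
        raise ValueError("max_mbs_per_frame must be positive")
    # Each plane gets its own bytearray so the planes never alias each other.
    return {name: bytearray(max_mbs_per_frame) for name in BITPLANE_NAMES}
```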
| |
| | |
| Allocate space for motion vector history. The number of entries in this array is macroblock_width * (macroblock_height + 1) (the extra height is for interlaced field). Each entry is a motion vector history structure which contains the 4 Y block motion vectors for a particular macroblock. The individual motion vector structures are the same as in the inter structure, which provides hybrid prediction, motion vectors, and diff MVs (again, 4 for each block?).
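The sizing of the motion vector history can be sketched as below. The entry layout is simplified to four (x, y) tuples per macroblock, and <code>allocate_mv_history</code> is an invented name; the real structure also carries the hybrid prediction mode and differential MVs.

```python
def allocate_mv_history(mb_width: int, mb_height: int) -> list:
    """Return a zeroed MV history with one entry per macroblock plus one extra row."""
    entries = mb_width * (mb_height + 1)  # the extra row is for interlaced field
    # Build each entry separately so that entries do not share the same list.
    return [[(0, 0)] * 4 for _ in range(entries)]
```

For a 320x240 frame (20 x 15 macroblocks) this gives 20 * 16 = 320 entries.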
| |
| | |
| And that's it for the SRD "requirements gathering" process (vc1dec.c:vc1DEC_DecoderRequirements()). The function returns the number of bytes needed for the decoder's internal state. The client app is expected to allocate enough space for this state.
| |
| | |
| Next is the vc1dec.c:vc1DEC_DecoderInitialise() function. This sets up the positions and structures contained within the memory pool that the client allocated for the decoder state.
| |
| | |
| Next is the vc1dec.c:vc1DEC_DecodeSequence() function which unpacks the sequence layer:
| |
| | |
| 2 bits profile
| |
| if (profile is simple or main)
| |
| 2 bits VC1_BITS_RES_SM (? SRD does not care)
| |
| if (profile is advanced)
| |
| 3 bits level of advanced profile
| |
| 2 bits chroma format (note that only format 1, YUV 4:2:0 is defined)
| |
| 3 bits QFrameRateForPostProc ("see standard", SRD does not use)
| |
| 5 bits QBitRateForPostProc ("see standard", SRD does not use)
| |
| if (profile is simple or main)
| |
| 1 bit loop filter flag
| |
| 1 bit reserved, should be 0
| |
| 1 bit multiresolution coding flag
| |
| 1 bit reserved, should be 1, SRD calls it "RES_FASTTX"
| |
| 1 bit fast U/V motion compensation
| |
| note: must be 1 in simple profile
| |
| 1 bit extended motion vectors
| |
| note: must be 0 in simple profile
| |
| 2 bits macroblock dequantization
| |
| 1 bit variable sized transform
| |
| 1 bit reserved, should be 0, SRD calls it "RES_TRANSTAB"
| |
| 1 bit overlapped transform flag
| |
| 1 bit sync marker flag
| |
| 1 bit range reduction flag
| |
| 3 bits maximum number of consecutive B frames
| |
| 2 bits quantizer
| |
| if (profile is advanced)
| |
| 1 bit post processing flag
| |
| 12 bits max coded width (actual width = (w + 1) * 2)
| |
| 12 bits max coded height (actual height = (h + 1) * 2)
| |
| 1 bit pulldown flag
| |
| 1 bit interlaced
| |
| 1 bit frame counter flag
| |
| 1 bit frame interpolation flag
| |
| if (profile is advanced)
| |
| [lots more stuff to be filled in when advanced profile is needed]
| |
| if (profile is simple or main)
| |
| 1 bit reserved, should be 1, SRD calls it "RES_RTM_FLAG"
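For simple and main profiles, the sequence layer above can be sketched as a table-driven parser. Several things here are assumptions rather than facts from the SRD: the field names are invented, bits are taken MSB-first, and the frame interpolation flag is read for simple/main as well (the listing's indentation was lost; under this reading the fields add up to exactly 32 bits). Advanced profile is not handled, matching the gap in the listing.

```python
# (name, width in bits), in the order given by the listing above
SIMPLE_MAIN_LAYOUT = [
    ("profile", 2), ("res_sm", 2),
    ("frmrtq_postproc", 3), ("bitrtq_postproc", 5),
    ("loop_filter", 1), ("reserved_0", 1), ("multires", 1), ("res_fasttx", 1),
    ("fastuvmc", 1), ("extended_mv", 1), ("dquant", 2), ("vstransform", 1),
    ("res_transtab", 1), ("overlap", 1), ("syncmarker", 1), ("rangered", 1),
    ("max_b_frames", 3), ("quantizer", 2),
    ("finterpflag", 1), ("res_rtm_flag", 1),
]


def read_fields(data: bytes, layout) -> dict:
    """Unpack consecutive MSB-first bit fields from data according to layout."""
    total_bits = 8 * len(data)
    value = int.from_bytes(data, "big")
    fields, pos = {}, 0
    for name, width in layout:
        if pos + width > total_bits:
            raise ValueError("sequence header is too short")
        shift = total_bits - pos - width
        fields[name] = (value >> shift) & ((1 << width) - 1)
        pos += width
    return fields


def parse_sequence_header(data: bytes) -> dict:
    """Hypothetical helper: parse a simple/main profile sequence header."""
    if not data:
        raise ValueError("empty sequence header")
    profile = data[0] >> 6
    if profile == 2:
        raise ValueError("profile 2 is reserved")
    if profile == 3:
        raise NotImplementedError("advanced profile is not covered by this sketch")
    return read_fields(data, SIMPLE_MAIN_LAYOUT)


def simple_profile_violations(fields: dict) -> list:
    """Check the two simple-profile notes from the listing; returns messages."""
    problems = []
    if fields["profile"] == 0:
        if fields["fastuvmc"] != 1:
            problems.append("fast U/V motion compensation must be 1")
        if fields["extended_mv"] != 0:
            problems.append("extended motion vectors must be 0")
    return problems
```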
| |
| | |
| Finally, it is time to decode an actual frame (referred to as "unpacking the picture layer"). The decode process iterates through however many fields comprise the frame (1 or 2).
| |
| | |
| Choose from among 5 different zigzag table sets depending on profile and interlacing:
| |
| | |
| if (picture format is interlaced frame)
| |
| choose set 4
| |
| if (picture is intra)
| |
| choose set 0
| |
| else
| |
| if (profile is simple or main)
| |
| choose set 1
| |
| else
| |
| if (picture format is progressive)
| |
| choose set 2
| |
| else
| |
| choose set 3
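One possible reading of the selection logic above, written as a function. The nesting of the original conditions was lost, so this sketch assumes the first matching condition wins (interlaced frame is checked before intra). The string values and the name <code>choose_zigzag_set</code> are invented for illustration.

```python
def choose_zigzag_set(picture_format: str, is_intra: bool, profile: str) -> int:
    """Return the zigzag table set index (0..4) under the assumed nesting.

    picture_format: "progressive", "interlaced_frame" or "interlaced_field"
    profile: "simple", "main" or "advanced"
    """
    if picture_format == "interlaced_frame":
        return 4
    if is_intra:
        return 0
    if profile in ("simple", "main"):
        return 1
    if picture_format == "progressive":
        return 2
    return 3  # advanced profile, inter picture, not progressive
```

Under this reading, set 3 ends up covering inter pictures coded as interlaced fields in advanced profile.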
| |
| '''(unfinished)'''
| |
| ... there is a lot more logic dealing with frame accounting; let's skip to the real meat: macroblock decoding! ...
| |
| | |
| Decode a macroblock:
| |
| set the macroblock overlap filter flag, coding type, quantizer and halfstep parameters to the same as the picture
| |
| clear the skipped flag
| |
| set the CBP to 0 (no coded blocks)
| |
| choose the quantizer (long list of logic, see vc1iquant.c:vc1IQUANT_ChooseQuantizer())
| |
| for each of the 6 sub-blocks, set coded field to 0, clear down all MV data
| |
| decide on non-uniform quantizer
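The per-macroblock reset described above could look something like this. It is a sketch with made-up types (<code>PictureParams</code>, <code>MacroblockState</code>, <code>BlockState</code>) that hold only the fields the steps mention. The quantizer selection and the uniform/non-uniform decision are deliberately left out, since that logic lives in vc1iquant.c and is not described here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PictureParams:
    overlap_filter: bool
    coding_type: str
    quant_step: int
    quant_half_step: int


@dataclass
class BlockState:
    coded: int = 0
    mv: List[int] = field(default_factory=lambda: [0, 0])


@dataclass
class MacroblockState:
    overlap_filter: bool = False
    coding_type: str = ""
    quant_step: int = 0
    quant_half_step: int = 0
    skipped: bool = False
    cbp: int = 0
    blocks: List[BlockState] = field(
        default_factory=lambda: [BlockState() for _ in range(6)]
    )


def reset_macroblock(mb: MacroblockState, pic: PictureParams) -> None:
    """Prepare a macroblock for decoding by inheriting picture-level defaults."""
    mb.overlap_filter = pic.overlap_filter
    mb.coding_type = pic.coding_type
    mb.quant_step = pic.quant_step
    mb.quant_half_step = pic.quant_half_step
    mb.skipped = False
    mb.cbp = 0  # no coded blocks yet
    # Quantizer choice and the non-uniform decision would happen here (omitted).
    for blk in mb.blocks:
        blk.coded = 0
        blk.mv = [0, 0]
```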
| |
|
| |
| unpack an I or BI macroblock:
| |
| | |
| '''(unfinished)'''
| |
| | |
| [[Category:Video Codecs]]
| |
| | |
| | |
| |
| | |
| |
| | |
| | |
| |
| | |
| |
|
| |
|
| [[Category:Video Codecs]] | | [[Category:Video Codecs]] |