Video XL
This page is based on the document 'Simple YUV Coding Formats' by Mike Melanson found at http://multimedia.cx/simple-yuv.txt.
This is a video codec used in hardware products by Miro Video and Pinnacle.
The Miro Video XL codec uses differential coding on a reduced-precision YUV 4:1:1 colorspace image. Each Y, U, or V component has only 7 bits of precision (where 8 is more typical). Each group of 32 bits in the bitstream holds six 5-bit delta table indices (plus 2 unused bits): one index for each of the next 4 Y samples on the line, one for the U sample, and one for the V sample shared by those 4 pixels.
The Pinnacle Video XL codec is apparently the same algorithm as the Miro codec except that the frames are 8 bytes longer. However, the same decoding process applies.
Data Format
For each block of 4 pixels on a line, fetch the next 32 bits as a little-endian number and then swap the two 16-bit words to achieve the correct bit orientation for decoding. To illustrate, this is the arrangement of the next 4 bytes (A, B, C, and D) on disk:
aaaaaaaa bbbbbbbb cccccccc dddddddd
Load the 4 bytes into a program variable so that the bytes are in this order:
dddddddd cccccccc bbbbbbbb aaaaaaaa
Then, swap the upper and lower 16-bit words to achieve this order:
31                                0
bbbbbbbb aaaaaaaa dddddddd cccccccc
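The load-and-swap step can be sketched in C as follows. The function name xl_load_dword is hypothetical and chosen only for this illustration; the byte handling follows the description above.

```c
#include <stdint.h>

/* Read 4 bytes (a, b, c, d in stream order) as a little-endian 32-bit
 * value, then swap the upper and lower 16-bit words.
 * xl_load_dword is an illustrative name, not part of any library. */
static uint32_t xl_load_dword(const uint8_t *p)
{
    /* little-endian load: dddddddd cccccccc bbbbbbbb aaaaaaaa */
    uint32_t le = (uint32_t)p[0]
                | ((uint32_t)p[1] << 8)
                | ((uint32_t)p[2] << 16)
                | ((uint32_t)p[3] << 24);

    /* word swap: bbbbbbbb aaaaaaaa dddddddd cccccccc */
    return (le << 16) | (le >> 16);
}
```

Assembling the value byte by byte keeps the result independent of the host's endianness.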
Further, the 32-bit blocks are stored in reverse order. So, for example, if an image is 16 pixels wide, it would have 4 pixel groups per line. Each pixel group would be represented by a 32-bit doubleword, swapped and mangled as described previously. The doublewords would be stored in the bytestream as:
D3 D2 D1 D0
D0 represents the first 4 pixels on the line and D3 represents the final 4 pixels on the line. Thus, a decoder must jump forward in the bytestream and work backwards through the bytestream while decoding in the forward direction on a particular line, then jump forward again in the bytestream when decoding the next line.
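As a rough illustration of this addressing, the helper below computes where a given pixel group lives in the frame data. It assumes that the image width is a multiple of 4 and that lines are stored back to back with no padding, so each line occupies exactly width bytes; neither assumption is stated explicitly above. The name xl_group_offset is hypothetical.

```c
#include <stddef.h>

/* Byte offset of pixel group `group` (0 = leftmost 4 pixels) on line `line`.
 * Assumptions for this sketch: width is a multiple of 4, and each line
 * occupies exactly `width` bytes (4 bytes per 4-pixel group, no padding). */
static size_t xl_group_offset(size_t width, size_t line, size_t group)
{
    size_t groups_per_line = width / 4;

    /* groups are stored right to left within the line */
    return line * width + (groups_per_line - 1 - group) * 4;
}
```

For the 16-pixel-wide example, D0 of the first line sits at offset 12 and D3 at offset 0; D0 of the second line sits at offset 28.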
The 32 bits of the doubleword represent the following values:
bit 31:      unused
bits 30-26:  V delta index
bits 25-21:  U delta index
bits 20-16:  Y3 delta index
bit 15:      unused
bits 14-10:  Y2 delta index
bits 9-5:    Y1 delta index
bits 4-0:    Y0 delta index
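A minimal sketch of the field extraction, assuming the doubleword has already been loaded and word-swapped. The XLIndices struct and the xl_unpack function are hypothetical names introduced for this example.

```c
#include <stdint.h>

/* The six 5-bit delta indices carried by one doubleword.
 * XLIndices and xl_unpack are illustrative names only. */
typedef struct {
    uint8_t y[4];  /* Y0..Y3 delta indices */
    uint8_t u;     /* U delta index */
    uint8_t v;     /* V delta index */
} XLIndices;

static XLIndices xl_unpack(uint32_t dw)
{
    XLIndices idx;

    idx.y[0] = dw         & 0x1F;  /* bits 4-0   */
    idx.y[1] = (dw >> 5)  & 0x1F;  /* bits 9-5   */
    idx.y[2] = (dw >> 10) & 0x1F;  /* bits 14-10 */
    idx.y[3] = (dw >> 16) & 0x1F;  /* bits 20-16 (bit 15 is unused) */
    idx.u    = (dw >> 21) & 0x1F;  /* bits 25-21 */
    idx.v    = (dw >> 26) & 0x1F;  /* bits 30-26 (bit 31 is unused) */
    return idx;
}
```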
Each delta index value is used to index into this table and the referenced value is added to the previous element on the same plane, either Y, U, or V:
const int xl_delta_table[32] = {
      0,   1,   2,   3,   4,   5,   6,   7,
      8,   9,  12,  15,  20,  25,  34,  46,
     64,  82,  94, 103, 108, 113, 116, 119,
    120, 121, 122, 123, 124, 125, 126, 127
};
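Remember that the YUV components have only 7 bits of precision, so the additions effectively wrap modulo 128. Thus, the values in the second half of the table all act as negative deltas: adding 127, for example, is equivalent to subtracting 1.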
At the beginning of a line, the Y0, U, and V delta indices actually represent the top 5 bits of the absolute 7-bit component value.
The final, concise decoding algorithm operates as follows:
foreach line in image
  foreach 32-bit doubleword, working from right -> left in bytestream
    load doubleword as little-endian number, swap 16-bit words
    if this is the first pixel group in line
      next Y value = (Y0 delta index) << 2
      next U value = (U delta index) << 2
      next V value = (V delta index) << 2
    else
      next Y value = last Y value + xl_delta_table[Y0 delta index]
      next U value = last U value + xl_delta_table[U delta index]
      next V value = last V value + xl_delta_table[V delta index]
    next Y value = last Y value + xl_delta_table[Y1 delta index]
    next Y value = last Y value + xl_delta_table[Y2 delta index]
    next Y value = last Y value + xl_delta_table[Y3 delta index]
Since the components have only 7 bits of meaningful precision, it will likely be necessary to shift each decoded component left by one more bit to achieve 8 bits of output precision.