Apple ProRes: Difference between revisions

From MultimediaWiki
Revision as of 17:49, 18 September 2011

ProRes Introduction

Apple ProRes is a family of proprietary video codecs used for storing and editing high-definition video data in Apple's Final Cut Pro. Apple's official whitepaper lists the following key features:

  • intra-only codecs
  • visually lossless compression (i.e. compressed images cannot be distinguished from the original by a human observer)
  • 4:2:2 / 4:4:4:4 source material
  • 10-bit (12-bit for ProRes 4444) sample depth
  • variable bitrate

ProRes 422 Standard Definition / High Quality codec

ProRes 422 SD/HQ is the same codec operating at two different bitrates (flavours). A separate FOURCC identifies each flavour:

Flavour name              FOURCC  Bitrate
Standard Definition (SD)  'apcn'  145 Mbps
High Quality (HQ)         'apch'  220 Mbps

The ProRes algorithm is based on the discrete cosine transform (DCT). The compression techniques it uses are described in the slice coding sections below.

The bitstream of the ProRes 422 has been designed to provide the following additional features:

  • frame-level multi-threaded encoding/decoding depending on available CPU cores
  • spatial scalability providing the possibility to decode a video at different partial resolutions (1/2, 1/4, 1/8 of the full size and so on). ProRes is capable of saving CPU cycles while decoding at smaller resolutions due to a special bitstream layout enabling partial bitstream access and parsing.


Binary packages and compatibility

The ProRes codec is currently available as the following binary libraries:

Lib name                      Version            Supported OS  Supported architecture  Encoding  Decoding
AppleProRes422.component      1.0.2 (Build 46)   Mac OS X      PowerPC                 Yes       Yes
AppleProResDecoder.qtx        1.0.0.1            Windows       x86                     No        Yes
AppleProResCodec.component    2.0 (Build 224)    Mac OS X      PowerPC/x86             Yes       Yes
AppleProResDecoder.component  2.0.1 (Build 227)  Mac OS X      PowerPC/x86             No        Yes
AppleProResDecoder.component  3.0.0              Mac OS X      x86                     No        Yes

Frame layout

A typical ProRes 422 frame has the following layout:

       Frame container atom
------------------------------------
           Frame header
------------------------------------
            Picture 1
------------------------------------
 Picture 2 (interlaced frames only)

Frame container atom

Each frame begins with the frame container atom. It has the classical QuickTime atom structure, with the ID set to the undocumented ProRes frame type ID:

Field size  Field name  Description
4 bytes     size        Frame size in bytes
4 bytes     type        'icpf' ("image codec prores frame"?)

All data is stored in big-endian format. The value of the "size" field must match the frame size given by the movie container.
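The check described above can be sketched in C as follows. This is only an illustration: the helper names `rd_be32` and `check_frame_atom` are hypothetical, and `container_size` stands for whatever frame size the movie container reports.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Validate the frame container atom at the start of a frame.
 * container_size is the frame size taken from the movie container.
 * Returns 0 and stores the atom size on success, -1 otherwise.
 */
static int check_frame_atom(const uint8_t *buf, size_t container_size,
                            uint32_t *atom_size)
{
    if (container_size < 8)
        return -1;                    /* too short for size + type */
    if (memcmp(buf + 4, "icpf", 4) != 0)
        return -1;                    /* not a ProRes frame atom */
    *atom_size = rd_be32(buf);
    if (*atom_size != container_size)
        return -1;                    /* "size" must match the container */
    return 0;
}
```

A decoder would typically run such a check before touching the frame header that follows the atom.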


Frame header

The frame header stores descriptive information such as the frame dimensions, the frame structure (progressive/interlaced) and color information. All data is stored in big-endian format.

Field size  Field name   Value  Description
2 bytes     hdrSize             Size of this header in bytes. Must be at least 28 bytes long.
2 bytes     version             Header version:
                                  • "0" - supported in all known decoders
                                  • "1" - supported in version 2.0 only
4 bytes     creatorID           FOURCC of the creator of the present stream. Ignored in all known decoders. Known values:
                                  • 'apl0' -> Apple Inc.
                                  • 'arri' -> Arnold & Richter Cine Technik (A&R)
                                  • 'aja0' -> AJA Kona Hardware
2 bytes     frameWidth          Width of encoded frame.
2 bytes     frameHeight         Height of encoded frame.
1 byte      frameFlags          Frame structure flags. Layout: AAxxBBxx where
                                  • bits AA = sample depth?
                                  • bits BB = frame type:
                                    • "0" - progressive
                                    • "1" - interlaced (top-field first)
                                    • "2" - interlaced (bottom-field first)
3 bytes     reserved1    0      Ignored in the decoder v1. It has some meaning in version 2.0 that needs to be clarified.
1 byte      colorMatrix         Color matrix ID for color conversion between YUV and RGB (see below):
                                  • "1" = ITU-R BT.709-2 / SMPTE 274M-1995 / SMPTE 296M-1997
                                  • "6" = ITU-R BT.601-4 / SMPTE 170M-1994 / SMPTE 293M-1996
2 bytes     reserved2    0      Ignored.
1 byte      QMatFlags           Custom quantization matrices presence indicators. Layout: xxxxxxCD where
                                  • bit C = QMatLuma is present
                                  • bit D = QMatChroma is present
64 bytes    QMatLuma            Custom quantization matrix for luminance. Only present if indicated by the bit "C" of the QMatFlags.
64 bytes    QMatChroma          Custom quantization matrix for chrominance. Only present if indicated by the bit "D" of the QMatFlags.
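A minimal C sketch of reading these fields is given below. The byte offsets are simply the running sum of the field sizes in the table, and the bit layouts are read with the leftmost letter as the most significant bit; both are assumptions of this sketch. The `FrameHeader` struct and the function names are hypothetical. The uncertain "sample depth?" bits are not interpreted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t hdr_size, version, width, height;
    char     creator[5];        /* FOURCC plus terminating NUL */
    unsigned frame_type;        /* 0 progressive, 1 top-field first, 2 bottom-field first */
    unsigned color_matrix;
    const uint8_t *qmat_luma;   /* NULL if no custom matrix is present */
    const uint8_t *qmat_chroma;
} FrameHeader;

static uint16_t rd_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if the header is truncated or too small. */
static int parse_frame_header(const uint8_t *buf, size_t len, FrameHeader *h)
{
    size_t pos = 20;                        /* end of the fixed-size fields */

    if (len < 20)
        return -1;
    h->hdr_size = rd_be16(buf);
    if (h->hdr_size < 28 || h->hdr_size > len)
        return -1;                          /* minimum size as stated in the table */
    h->version = rd_be16(buf + 2);
    memcpy(h->creator, buf + 4, 4);
    h->creator[4] = '\0';
    h->width  = rd_be16(buf + 8);
    h->height = rd_be16(buf + 10);
    h->frame_type   = (buf[12] >> 2) & 3;   /* bits BB of AAxxBBxx */
    h->color_matrix = buf[16];              /* follows the 3 reserved bytes */
    h->qmat_luma = h->qmat_chroma = NULL;
    if (buf[19] & 0x02) {                   /* bit C: luma matrix present */
        if (pos + 64 > h->hdr_size)
            return -1;
        h->qmat_luma = buf + pos;
        pos += 64;
    }
    if (buf[19] & 0x01) {                   /* bit D: chroma matrix present */
        if (pos + 64 > h->hdr_size)
            return -1;
        h->qmat_chroma = buf + pos;
    }
    return 0;
}
```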

Field/Picture header

This header is present for every picture (field).

Field size  Field name           Description
1 byte      pic_hdr_size         Size of this header in bits. Must be at least 64 bits (8 bytes) long.
4 bytes     pic_data_size        Size of the picture data in bytes.
2 bytes     total_slices         Total number of slices in the picture. It also gives the number of entries in the slice table.
4 bits      slice_width_factor   Slice width = 2 ^ slice_width_factor. Supported slice sizes are therefore 8, 4, 2 and 1 macroblocks wide.
4 bits      slice_height_factor  Ideally slice height = 2 ^ slice_height_factor, but all known decoders only allow the value "0" for this factor. Thus, only a slice height of 1 macroblock is supported.
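The picture header fields could be read along the following lines. This is a sketch under two assumptions: multi-byte fields are big-endian like the rest of the stream, and the two 4-bit factors share one byte with slice_width_factor in the high nibble. `PictureHeader` and `parse_picture_header` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned hdr_bytes;        /* header size converted from bits to bytes */
    uint32_t data_size;        /* picture data size in bytes */
    unsigned total_slices;     /* also the number of slice table entries */
    unsigned slice_mb_width;   /* slice width in macroblocks */
    unsigned slice_mb_height;  /* slice height in macroblocks */
} PictureHeader;

/* Returns 0 on success, -1 on a truncated or unsupported header. */
static int parse_picture_header(const uint8_t *buf, size_t len, PictureHeader *p)
{
    unsigned width_factor, height_factor;

    if (len < 8 || buf[0] < 64)
        return -1;                      /* header must be at least 64 bits */
    p->hdr_bytes = buf[0] >> 3;         /* size is stored in bits */
    p->data_size = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
                   ((uint32_t)buf[3] << 8)  |  (uint32_t)buf[4];
    p->total_slices = ((unsigned)buf[5] << 8) | buf[6];

    width_factor  = buf[7] >> 4;        /* assumed: high nibble */
    height_factor = buf[7] & 0x0F;      /* assumed: low nibble */
    if (width_factor > 3 || height_factor != 0)
        return -1;                      /* only widths 8, 4, 2, 1 and height 1 */
    p->slice_mb_width  = 1u << width_factor;
    p->slice_mb_height = 1u << height_factor;
    return 0;
}
```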

Slice coding

Slice header

 bits 0-2 unused?
 bits 3-7 header size
 1 byte   quantiser scale (1-224)
 2 bytes  luma data size
 2 bytes  U data size
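A possible way to read this layout in C is shown below. The sketch assumes that bit 0 is the least significant bit of the first byte, that the header size is counted in bytes, and that the two size fields are big-endian. `SliceHeader` and `parse_slice_header` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned hdr_size;   /* assumed to be in bytes */
    unsigned qscale;     /* quantiser scale, 1-224 */
    unsigned luma_size;  /* luma data size in bytes */
    unsigned u_size;     /* U data size in bytes */
} SliceHeader;

/* Returns 0 on success, -1 if the slice is truncated or out of range. */
static int parse_slice_header(const uint8_t *buf, size_t len, SliceHeader *s)
{
    if (len < 6)
        return -1;
    s->hdr_size = buf[0] >> 3;               /* bits 3-7; bits 0-2 unused */
    s->qscale   = buf[1];
    if (s->qscale < 1 || s->qscale > 224)
        return -1;
    s->luma_size = ((unsigned)buf[2] << 8) | buf[3];
    s->u_size    = ((unsigned)buf[4] << 8) | buf[5];
    if ((size_t)s->hdr_size + s->luma_size + s->u_size > len)
        return -1;                           /* sizes exceed the slice */
    return 0;
}
```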

Codeword encoding scheme

Every codeword is encoded with a hybrid Rice / Elias gamma (aka exp-Golomb) code controlled by three parameters: the maximum prefix length for Rice codes (MP), the Rice code parameter (R) and the Elias gamma code parameter (G).

The decoding process is as follows: read the unary prefix; if its value is greater than MP, treat the code as Elias gamma, otherwise treat it as a Rice code (or pure unary for R=0).

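 n = get_unary();                       // length of the unary prefix
 if (n > MP) {
   // Elias gamma (exp-Golomb) escape: the suffix grows with the prefix
   k   = G + (n - MP - 1);
   val = get_bits(k) + (1 << k) - (1 << G) + ((MP + 1) << R);
 } else if (R) {
   // Rice code: the prefix is the quotient, R remainder bits follow
   val = (n << R) | get_bits(R);
 } else {
   val = n;                             // pure unary
 }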

Coding parameters are packed into one byte:

 bits 0-1 MP
 bits 2-4 G
 bits 5-7 R

From here on, this packed byte value is used to denote the coding parameters.
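To make the scheme concrete, here is a self-contained C sketch of a decoder for one codeword. The `BitReader` type and all function names are hypothetical. The sketch assumes the unary prefix is a run of zero bits terminated by a one bit, uses the standard Rice form (prefix as quotient, shifted left by R), and offsets the Elias gamma values so that codes with successive prefix lengths cover contiguous ranges.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t size_bits;   /* total number of valid bits */
    size_t pos;         /* current position in bits */
} BitReader;

/* Read one bit, MSB first; reads past the end yield 0. */
static unsigned get_bit(BitReader *br)
{
    unsigned bit;
    if (br->pos >= br->size_bits)
        return 0;
    bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

static unsigned get_bits(BitReader *br, unsigned n)
{
    unsigned v = 0;
    while (n--)
        v = (v << 1) | get_bit(br);
    return v;
}

/* Count zero bits up to the terminating one bit, which is consumed. */
static unsigned get_unary(BitReader *br)
{
    unsigned n = 0;
    while (br->pos < br->size_bits && get_bit(br) == 0)
        n++;
    return n;
}

/* Decode one codeword using the packed parameter byte. */
static unsigned get_code(BitReader *br, uint8_t params)
{
    unsigned mp = params & 3;          /* bits 0-1 */
    unsigned g  = (params >> 2) & 7;   /* bits 2-4 */
    unsigned r  = params >> 5;         /* bits 5-7 */
    unsigned n  = get_unary(br);

    if (n > mp) {                      /* Elias gamma escape */
        unsigned k = g + (n - mp - 1);
        if (k > 31)
            return 0;                  /* malformed input, avoid bad shifts */
        return ((mp + 1) << r) + (1u << k) - (1u << g) + get_bits(br, k);
    }
    if (r)                             /* Rice code */
        return (n << r) | get_bits(br, r);
    return n;                          /* pure unary */
}
```

For example, with the parameter byte 0xB8 (MP = 0, G = 6, R = 5) the bit string 1 00101 decodes to 5 via the Rice branch, while 01 000011 takes the Elias gamma branch and decodes to 32 + 3 = 35.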

Overall slice coding

All data in slices is stored grouped: data for luma blocks is stored first, data for chroma blocks last. Within each group, DC coefficients are stored first, followed by AC coefficients.

DC coding scheme

DC values are delta-coded. The first value and the first difference are coded with fixed parameters; the parameters for every later difference depend on the previous raw code:

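 dc_code_params[] = { 0x04, 0x28, 0x28, 0x4D, 0x4D, 0x70, 0x70 };
 
 // first DC: fixed parameters, the low bit of the code carries the sign
 code = get_code(0xB8);
 dc[0] = (code >> 1) ^ -(code & 1);
 
 code = 5;   // initial "previous code", selects 0x70 for the first difference
 sign = 0;
 for (i = 1; i < num_dcs; i++) {
   code = get_code(dc_code_params[min(code, 6)]);
   sign ^= -(code & 1);   // an odd code flips the sign of the difference
   dc[i] = dc[i - 1] + (((code + 1) >> 1) ^ sign) - sign;
 }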
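The delta reconstruction can be tried out separately from the bitstream. The following C sketch takes raw codes that have already been read and rebuilds the DC values with the same arithmetic as above; `to_signed`, `next_dc_params` and `reconstruct_dcs` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

static const uint8_t dc_code_params[7] = {
    0x04, 0x28, 0x28, 0x4D, 0x4D, 0x70, 0x70
};

/* Map an unsigned code to a signed value: 0, -1, 1, -2, 2, ... */
static int to_signed(unsigned code)
{
    return (int)(code >> 1) ^ -(int)(code & 1);
}

/* Parameter byte for the code that follows prev_code. */
static uint8_t next_dc_params(unsigned prev_code)
{
    return dc_code_params[prev_code < 6 ? prev_code : 6];
}

/*
 * codes[0] is the raw code of the first DC value, codes[1..n-1] are the
 * raw codes of the differences. Results are written to dc[0..n-1].
 */
static void reconstruct_dcs(const unsigned *codes, size_t n, int *dc)
{
    int sign = 0;
    size_t i;

    if (n == 0)
        return;
    dc[0] = to_signed(codes[0]);
    for (i = 1; i < n; i++) {
        int mag = (int)((codes[i] + 1) >> 1);   /* magnitude of the difference */
        sign ^= -(int)(codes[i] & 1);           /* odd code toggles 0 <-> -1 */
        dc[i] = dc[i - 1] + ((mag ^ sign) - sign);
    }
}
```

In this arithmetic an odd code flips the sign of the difference relative to the previous one, while an even code keeps it.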

AC coding scheme

AC coefficients from all blocks are coded together as a single stream of (skip, level, sign) triples. The coefficients are interleaved across blocks, i.e. all coefficients at position 1 come first, then all coefficients at position 2, and so on. As with DC values, the parameters for coding the next value are selected depending on the previously decoded value:

 skip_code_params[] = { 0x06, 0x06, 0x05, 0x05, 0x04, 0x29, 0x29, 0x29, 0x29, 0x28, 0x28, 0x28, 0x28, 0x28, 0x28, 0x4C };
 level_code_params[] = { 0x04, 0x0A, 0x05, 0x06, 0x04, 0x28, 0x28, 0x28, 0x28, 0x4C };
 
 pos   = num_blocks - 1;   // last DC position, so skip = 0 reaches the first AC coefficient
 skip  = 4;
 level = 2;
 while (pos < 64 * num_blocks && has_bits_left()) {
   skip  = get_code(skip_code_params[min(skip, 15)]);
   level = get_code(level_code_params[min(level, 9)]) + 1;
   sign  = get_bit();
   
   pos += skip + 1;
   if (pos >= 64 * num_blocks)
     break;                // malformed data: position past the last coefficient
   block[pos % num_blocks][scan[pos / num_blocks]] = sign ? -level : level;
 }
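The interleaved addressing is easier to follow when separated from the entropy decoding. The C sketch below places already decoded (skip, level, sign) triples into the blocks of a slice. `AcSymbol` and `place_ac` are hypothetical names, and the sketch assumes the position counter starts one step before the first AC coefficient, so that a skip of 0 addresses position 1 of block 0.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned skip;      /* number of coefficient positions to skip */
    int      level;     /* magnitude, at least 1 */
    int      negative;  /* non-zero if the coefficient is negative */
} AcSymbol;

/*
 * blocks holds num_blocks * 64 coefficients, block b starting at
 * blocks[b * 64]. scan is the 64-entry scan order table.
 * Returns the number of symbols that were placed.
 */
static size_t place_ac(const AcSymbol *sym, size_t count, size_t num_blocks,
                       const uint8_t scan[64], int16_t *blocks)
{
    size_t pos = num_blocks - 1;    /* position of the last DC coefficient */
    size_t i;

    for (i = 0; i < count; i++) {
        size_t b, c;
        pos += sym[i].skip + 1;
        if (pos >= 64 * num_blocks)
            break;                  /* position past the last coefficient */
        b = pos % num_blocks;       /* which block */
        c = pos / num_blocks;       /* which coefficient in scan order */
        blocks[b * 64 + scan[c]] =
            (int16_t)(sym[i].negative ? -sym[i].level : sym[i].level);
    }
    return i;
}
```

With two blocks, three symbols with skips 0, 0 and 1 land at coefficient 1 of block 0, coefficient 1 of block 1 and coefficient 2 of block 1 respectively.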

Unquantising

 DC = 4096 + ((dc_val * quant_matrix[0] * quant_mul) >> 2);
 AC = (ac_val * quant_matrix[i] * quant_mul) >> 2;

The base quantising matrices are given in the frame header; the quantising multiplier is given in each slice header.
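Applied to a whole 8x8 block, the two formulas above might look like this in C. The function name `unquantise_block` is hypothetical, and the sketch assumes `coef` and `quant_matrix` are indexed in the same order, with index 0 holding the DC entry.

```c
#include <stdint.h>

/*
 * Unquantise one block of 64 coefficients.
 * coef[0] is the decoded DC value, coef[1..63] are the AC values.
 * Note: >> on negative values is assumed to be an arithmetic shift,
 * which is what common compilers provide.
 */
static void unquantise_block(const int16_t coef[64],
                             const uint8_t quant_matrix[64],
                             int quant_mul, int32_t out[64])
{
    int i;

    out[0] = 4096 + ((coef[0] * quant_matrix[0] * quant_mul) >> 2);
    for (i = 1; i < 64; i++)
        out[i] = (coef[i] * quant_matrix[i] * quant_mul) >> 2;
}
```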

Scan order

Progressive:

    0,  1,  8,  9,  2,  3, 10, 11,
   16, 17, 24, 25, 18, 19, 26, 27,
    4,  5, 12, 20, 13,  6,  7, 14,
   21, 28, 29, 22, 15, 23, 30, 31,
   32, 33, 40, 48, 41, 34, 35, 42,
   49, 56, 57, 50, 43, 36, 37, 44,
   51, 58, 59, 52, 45, 38, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63

Interlaced:

    0,  8,  1,  9, 16, 24, 17, 25,
    2, 10,  3, 11, 18, 26, 19, 27,
   32, 40, 33, 34, 41, 48, 56, 49,
   42, 35, 43, 50, 57, 58, 51, 59,
    4, 12,  5,  6, 13, 20, 28, 21,
   14,  7, 15, 22, 29, 36, 44, 37,
   30, 23, 31, 38, 45, 52, 60, 53,
   46, 39, 47, 54, 61, 62, 55, 63,