RealVideo 6

From MultimediaWiki
Revision as of 11:32, 2 December 2018 by Kostya (talk | contribs) (Initial RealVideo 6 description)
  • FourCC: rv60
  • Company: Real

RealVideo 6 is a combination of coding technologies from RealVideo 4 and HEVC. The codec is designed for speed and simplicity and thus has only a limited set of features: I/P/B frames with no complex reference lists, 64x64 coding blocks that can be split down to 8x8 blocks, and variable-length codes.

Frame structure

A frame is composed of rows of 64x64 blocks (slices), with each slice being coded independently (relying on top neighbours only for reconstruction). A frame consists of a frame header, slice sizes and coded slice data.

Frame header

  • sync (2 bits) - always 3
  • profile (2 bits) - always 0
  • unknown (4 bits)
  • frame type (2 bits) - I, P, B or some special frame type (probably preview frame)
  • quantiser (6 bits)
  • marker (1 bit) - always 0
  • toolset (2 bits) - always 0
  • osvquant (2 bits) - quantiser selection mode (for coefficient coding)
  • unknown flag (1 bit)
  • unknown field (2 bits)
  • picture number (24 bits)
  • width code (11 bits) - width is (code + 1) * 4
  • height code (11 bits) - height is code * 4
  • unknown flag (1 bit)

These two fields are present only in inter frames:

    • some flag (1 bit) - if set then three unknown flags after it are present too
    • use two forward references (1 bit)
  • luma QP difference? (1-2 bits) - coded as 0, 10 or 11
  • chroma QP difference (the same coding scheme) - should be 0
  • QP offset type (also coded as 0, 10 or 11) - defines how QP difference for each slice is coded
  • deblock flag (1 bit)
  • do not deblock chroma flag (1 bit, present only when deblock flag is set)
  • optional message present flag (1 bit)
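
As an illustration, the fixed-length part of the header (from sync up to the unknown flag after the height code) could be read as in the sketch below. The MSB-first bit order and the names `BitReader`, `FIXED_FIELDS`, `parse_fixed_header` and `pack_fields` are assumptions made for this example, not something stated in this description; the inter-only and variable-length fields are not handled.

```python
class BitReader:
    """Reads unsigned fields from bytes, MSB first (bit order is an assumption)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


# (name, width in bits) in the order listed above; names are illustrative.
FIXED_FIELDS = [
    ("sync", 2), ("profile", 2), ("unknown0", 4), ("frame_type", 2),
    ("quantiser", 6), ("marker", 1), ("toolset", 2), ("osvquant", 2),
    ("unknown_flag0", 1), ("unknown1", 2), ("picture_number", 24),
    ("width_code", 11), ("height_code", 11), ("unknown_flag1", 1),
]


def parse_fixed_header(data: bytes) -> dict:
    """Parses the fixed-length header fields and derives width and height."""
    br = BitReader(data)
    hdr = {name: br.read(nbits) for name, nbits in FIXED_FIELDS}
    if hdr["sync"] != 3:
        raise ValueError("sync field must be 3")
    hdr["width"] = (hdr["width_code"] + 1) * 4
    hdr["height"] = hdr["height_code"] * 4
    return hdr


def pack_fields(values: dict) -> bytes:
    """Helper for experiments: packs FIXED_FIELDS MSB first, zero-padded."""
    bits = "".join(format(values.get(name, 0), "0{}b".format(nbits))
                   for name, nbits in FIXED_FIELDS)
    bits += "0" * (-len(bits) % 8)
    return int(bits, 2).to_bytes(len(bits) // 8, "big")
```

For example, `parse_fixed_header(pack_fields({"sync": 3, "width_code": 159, "height_code": 90}))` yields a width of 640 and a height of 360.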

Optional message format

  • message chunks length - 2 bits
  • data

Message can be coded in 0-3 chunks that have lengths of 2, 4 and 16 bytes.

Slice sizes

There are always (height + 63) >> 6 slices so that number is calculated from transmitted frame height.

Sizes are coded as an array of differences from the previous size.

  • size difference length minus one - 5 bits
  • array of size change flags (1 bit per flag, 0 - add to the previous size, 1 - subtract from the previous size)
  • array of size differences (the first size equals the difference itself, each difference takes the number of bits signalled above)
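
A minimal sketch of the size reconstruction, working on already-read flags and differences. It assumes one flag per slice, with the flag of the first slice being ignored; that detail and the function names are assumptions of this example.

```python
def slice_count(height: int) -> int:
    """Number of slices (rows of 64x64 blocks) for a given frame height."""
    return (height + 63) >> 6


def slice_sizes(flags, diffs):
    """Rebuilds slice sizes from size change flags and size differences.

    flags[i] == 0 adds diffs[i] to the previous size, 1 subtracts it.
    The first size is diffs[0]; flags[0] is ignored (an assumption).
    """
    if len(flags) != len(diffs):
        raise ValueError("need one flag and one difference per slice")
    sizes = []
    prev = 0
    for i, (flag, diff) in enumerate(zip(flags, diffs)):
        if i == 0:
            size = diff
        elif flag == 0:
            size = prev + diff
        else:
            size = prev - diff
        sizes.append(size)
        prev = size
    return sizes
```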

Slice data

Slice data consists of QP difference for the whole slice and data for each individual 64x64 macroblock.

QP difference is read depending on the mode in frame header (QP offset type):

  • 0 - no difference
  • 1 - 0 = 0, 10 = +1, 11 = -1
  • 2 - 0 = 0, 100 = +1, 101 = +2, 110 = -1, 111 = -2
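
The three modes above can be sketched as follows. The function takes an iterator of single bits rather than a real bitstream reader; that interface and the name `read_slice_qp_delta` are choices made for this example.

```python
def read_slice_qp_delta(bits, qp_offset_type: int) -> int:
    """Decodes the slice QP difference.

    bits: iterator yielding 0 or 1 in bitstream order.
    qp_offset_type: the QP offset type from the frame header (0, 1 or 2).
    """
    if qp_offset_type == 0:
        return 0                      # nothing is coded
    if qp_offset_type not in (1, 2):
        raise ValueError("unknown QP offset type")
    if next(bits) == 0:
        return 0                      # code "0"
    if qp_offset_type == 1:
        return 1 if next(bits) == 0 else -1   # "10" = +1, "11" = -1
    negative = next(bits)             # second bit selects the sign
    magnitude = 1 + next(bits)        # third bit selects 1 or 2
    return -magnitude if negative else magnitude
```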

Coded block header

Coded blocks are coded recursively, the same way as in HEVC. If parts of the block are outside the frame then split the block and recursively process all parts that are (at least partly) inside the frame. Otherwise read a bit to decide whether the block should be split, unless the block size is 8x8, in which case it is not split any further.
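
The recursion can be sketched like this. The sub-block visiting order (top-left, top-right, bottom-left, bottom-right) and the treatment of an 8x8 block that crosses the frame edge as a leaf are assumptions of this example, as are the function and parameter names.

```python
def walk_coding_tree(x, y, size, width, height, read_split_bit, leaves):
    """Collects (x, y, size) of the blocks that end up being coded.

    read_split_bit: callable returning the next split bit (0 or 1).
    leaves: list that receives the resulting blocks.
    """
    if x >= width or y >= height:
        return  # block lies entirely outside the frame
    crosses_edge = x + size > width or y + size > height
    # A split bit is read only when the block fits inside the frame
    # and is larger than 8x8 (short-circuit evaluation skips the read).
    if size > 8 and (crosses_edge or read_split_bit()):
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                walk_coding_tree(x + dx, y + dy, half, width, height,
                                 read_split_bit, leaves)
    else:
        leaves.append((x, y, size))
```

For a 96x64 frame the second 64x64 block starts at x = 64 and crosses the right edge, so it is split without reading a bit and only its two left 32x32 parts are processed further.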

  • block type (always intra for I-frames, otherwise 2 bits) - types are intra, inter with motion vector, skip block and inter block without coded motion vector

If it is 8x8 intra block then we need to read another bit that signals whether the block should be coded as split one (i.e. four 4x4 subblocks with individual intra prediction and transform for luma instead of single 8x8 block).


*** todo the rest ***

Reconstruction

Transforms

The 4x4 transform is the same as in RealVideo 4 (ITransform4x4).

8x8 transform is defined by the following matrix:

    37,  37,  37,  37,  37,  37,  37,  37,
    51,  43,  29,  10, -10, -29, -43, -51,
    48,  20, -20, -48, -48, -20,  20,  48,
    43, -10, -51, -29,  29,  51,  10, -43,
    37, -37, -37,  37,  37, -37, -37,  37,
    29, -51,  10,  43, -43, -10,  51, -29,
    20, -48,  48, -20, -20,  48, -48,  20,
    10, -29,  43, -51,  51, -43,  29, -10

16x16 transform is defined by the following matrix:

    26,  26,  26,  26,  26,  26,  26,  26,  26,  26,  26,  26,  26,  26,  26,  26,
    37,  35,  32,  28,  23,  17,  11,   4,  -4, -11, -17, -23, -28, -32, -35, -37,
    36,  31,  20,   7,  -7, -20, -31, -36, -36, -31, -20,  -7,   7,  20,  31,  36,
    35,  23,   4, -17, -32, -37, -28, -11,  11,  28,  37,  32,  17,  -4, -23, -35,
    34,  14, -14, -34, -34, -14,  14,  34,  34,  14, -14, -34, -34, -14,  14,  34,
    32,   4, -28, -35, -11,  23,  37,  17, -17, -37, -23,  11,  35,  28,  -4, -32,
    31,  -7, -36, -20,  20,  36,   7, -31, -31,   7,  36,  20, -20, -36,  -7,  31,
    28, -17, -35,   4,  37,  11, -32, -23,  23,  32, -11, -37,  -4,  35,  17, -28,
    26, -26, -26,  26,  26, -26, -26,  26,  26, -26, -26,  26,  26, -26, -26,  26,
    23, -32, -11,  37,  -4, -35,  17,  28, -28, -17,  35,   4, -37,  11,  32, -23,
    20, -36,   7,  31, -31,  -7,  36, -20, -20,  36,  -7, -31,  31,   7, -36,  20,
    17, -37,  23,  11, -35,  28,   4, -32,  32,  -4, -28,  35, -11, -23,  37, -17,
    14, -34,  34, -14, -14,  34, -34,  14,  14, -34,  34, -14, -14,  34, -34,  14,
    11, -28,  37, -32,  17,   4, -23,  35, -35,  23,  -4, -17,  32, -37,  28, -11,
     7, -20,  31, -36,  36, -31,  20,  -7,  -7,  20, -31,  36, -36,  31, -20,   7,
     4, -11,  17, -23,  28, -32,  35, -37,  37, -35,  32, -28,  23, -17,  11,  -4

Transforms are done on columns first and then on rows, using the same rounded shift by 7 in both stages (for both 8x8 and 16x16).
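
A sketch of how the 8x8 inverse transform might look with the matrix above. It assumes that the inverse applies the transposed matrix and that "rounded shift by 7" means (x + 64) >> 7; neither detail is confirmed here, and the function names are illustrative.

```python
M8 = [
    [37,  37,  37,  37,  37,  37,  37,  37],
    [51,  43,  29,  10, -10, -29, -43, -51],
    [48,  20, -20, -48, -48, -20,  20,  48],
    [43, -10, -51, -29,  29,  51,  10, -43],
    [37, -37, -37,  37,  37, -37, -37,  37],
    [29, -51,  10,  43, -43, -10,  51, -29],
    [20, -48,  48, -20, -20,  48, -48,  20],
    [10, -29,  43, -51,  51, -43,  29, -10],
]


def rshift7(x: int) -> int:
    """Rounded right shift by 7 (assumed rounding: add 64 first)."""
    return (x + 64) >> 7


def inverse_transform_8x8(coeffs):
    """coeffs[row][col] -> residue[row][col], columns first, then rows."""
    n = 8
    tmp = [[0] * n for _ in range(n)]
    for col in range(n):              # first stage: every column
        for row in range(n):
            acc = sum(M8[k][row] * coeffs[k][col] for k in range(n))
            tmp[row][col] = rshift7(acc)
    out = [[0] * n for _ in range(n)]
    for row in range(n):              # second stage: every row
        for col in range(n):
            acc = sum(M8[k][col] * tmp[row][k] for k in range(n))
            out[row][col] = rshift7(acc)
    return out
```

Under these assumptions a block with only the DC coefficient set produces a flat output, which is a quick sanity check for the stage order and rounding.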

Intra prediction

This is done essentially like in H.265 though plane mode prediction might be a bit different.

Motion compensation

Motion compensation uses the same 1/4-th pel interpolation as RealVideo 4. Motion vector prediction is done from the neighbouring top, left and top right blocks. For three candidates a median prediction is used, for two candidates it is (A + B) >> 1.
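
The prediction rule can be sketched as below, with motion vectors as (x, y) tuples and the median taken per component. The behaviour for one or zero available candidates is not described here, so using the single candidate or a zero vector in those cases is an assumption of this example.

```python
def median3(a: int, b: int, c: int) -> int:
    """Median of three integers."""
    return max(min(a, b), min(max(a, b), c))


def predict_mv(candidates):
    """Predicts a motion vector from the available neighbour MVs.

    candidates: list of (x, y) tuples for the available neighbours
    (top, left, top right), with unavailable ones already left out.
    """
    if len(candidates) == 3:
        a, b, c = candidates
        return (median3(a[0], b[0], c[0]), median3(a[1], b[1], c[1]))
    if len(candidates) == 2:
        a, b = candidates
        return ((a[0] + b[0]) >> 1, (a[1] + b[1]) >> 1)
    if len(candidates) == 1:
        return candidates[0]          # assumption: use it directly
    return (0, 0)                     # assumption: no neighbours
```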

Blocks without a coded motion vector build a list of unique motion vectors from the neighbours (i.e. a motion vector that is already present in the list is not added a second time), pad it to the required length with zero MVs and take the Nth motion vector from the list, where N is the coded number.

Skip candidate selection order:

  • top
  • left
  • top right
  • left down
  • just above left down block
  • just left of the top right block
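
A sketch of the candidate list construction, given the neighbour motion vectors already gathered in the order above. The required list length is passed in as a parameter because it is not given here, and using `None` for an unavailable neighbour is a convention of this example.

```python
def build_skip_list(neighbour_mvs, length: int):
    """Builds the list of unique candidate MVs, padded with zero MVs.

    neighbour_mvs: (x, y) tuples or None, in candidate selection order.
    length: required list length (hypothetical parameter).
    """
    unique = []
    for mv in neighbour_mvs:
        if mv is None:
            continue                  # neighbour has no motion vector
        if mv not in unique:
            unique.append(mv)         # add each distinct MV only once
        if len(unique) == length:
            break
    unique += [(0, 0)] * (length - len(unique))
    return unique


def select_skip_mv(neighbour_mvs, length: int, index: int):
    """Returns the MV chosen by the coded number `index`."""
    return build_skip_list(neighbour_mvs, length)[index]
```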

For B-frames the averaging is done with (A + B) >> 1 formula.