H.264 Prediction: Difference between revisions

Revision as of 14:51, 6 August 2007

This page documents the various prediction methods used in H.264 and related formats such as Sorenson Video 3 and RealVideo 4.

4x4 Prediction Modes

4x4 prediction modes vary between codecs. H.264 and Sorenson Video 3 are almost identical. RealVideo 4 numbers the modes in a different order, and some of its modes differ significantly from their H.264 counterparts: they use left predictors where H.264 does not, and they use down-left predictors (L4..L7), which are not used by the other codecs.

Vertical

H.264: mode 0
SVQ3: mode 0
RV40: mode 1

 LT | T0  T1  T2  T3
---------------------
 L0 | T0  T1  T2  T3
 L1 | T0  T1  T2  T3
 L2 | T0  T1  T2  T3
 L3 | T0  T1  T2  T3

Horizontal

H.264: mode 1
SVQ3: mode 1
RV40: mode 2

 LT | T0  T1  T2  T3
---------------------
 L0 | L0  L0  L0  L0
 L1 | L1  L1  L1  L1
 L2 | L2  L2  L2  L2
 L3 | L3  L3  L3  L3

DC

H.264: mode 2
SVQ3: mode 2
RV40: mode 0

 LT | T0  T1  T2  T3
---------------------
 L0 |  a   a   a   a
 L1 |  a   a   a   a
 L2 |  a   a   a   a
 L3 |  a   a   a   a

where:

a = (T0 + T1 + T2 + T3 + L0 + L1 + L2 + L3 + 4) / 8
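The three modes described so far can be sketched in a few lines of Python. This is only an illustration: the function names and the convention of passing the predictors as lists (top = [T0..T3], left = [L0..L3]) and returning the block as a list of rows are assumptions made for this example, and the "/" in the formula is taken to be integer division.

```python
def pred4x4_vertical(top):
    """Every row is a copy of the top predictors T0..T3."""
    return [list(top[:4]) for _ in range(4)]


def pred4x4_horizontal(left):
    """Row y is filled with the left predictor Ly."""
    return [[left[y]] * 4 for y in range(4)]


def pred4x4_dc(top, left):
    """All 16 samples are the rounded mean of T0..T3 and L0..L3."""
    a = (sum(top[:4]) + sum(left[:4]) + 4) // 8
    return [[a] * 4 for _ in range(4)]
```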

Diagonal Down/Left

H.264: mode 3
SVQ3: not used
RV40: not used

 LT | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
 L0 |  a   b   c   d
 L1 |  b   c   d   e
 L2 |  c   d   e   f
 L3 |  d   e   f   g

where:

 a = (T0 + 2*T1 + T2 + 2) / 4
 b = (T1 + 2*T2 + T3 + 2) / 4
 c = (T2 + 2*T3 + T4 + 2) / 4
 d = (T3 + 2*T4 + T5 + 2) / 4
 e = (T4 + 2*T5 + T6 + 2) / 4
 f = (T5 + 2*T6 + T7 + 2) / 4
 g = (T6 + 3*T7      + 2) / 4
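Since every coefficient is the same 3-tap filter slid along the top row, the whole mode can be sketched as a filter pass followed by a diagonal fill. The function name and the list-based layout (top = [T0..T7], result as a list of rows) are assumptions made for this example.

```python
def pred4x4_down_left(top):
    """H.264 Diagonal Down/Left from the eight top predictors T0..T7."""
    t = list(top[:8])
    t.append(t[7])  # repeat T7 so the last tap yields (T6 + 3*T7 + 2) / 4
    # diag[0..6] correspond to the coefficients a..g
    diag = [(t[k] + 2 * t[k + 1] + t[k + 2] + 2) // 4 for k in range(7)]
    # the sample at column x, row y takes the coefficient with index x + y
    return [[diag[x + y] for x in range(4)] for y in range(4)]
```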

Diagonal Down/Left (SVQ3)

H.264: not used
SVQ3: mode 3
RV40: not used

 LT | T0  T1  T2  T3
---------------------
 L0 |  a   b   c   c
 L1 |  b   c   c   c
 L2 |  c   c   c   c
 L3 |  c   c   c   c

where:

 a = (L1 + T1) / 2
 b = (L2 + T2) / 2
 c = (L3 + T3) / 2
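A possible sketch of the SVQ3 variant follows; the name and the list-based arguments (top = [T0..T3], left = [L0..L3]) are assumptions made for this example. Note that the averages are not rounded, exactly as in the formulas above.

```python
def pred4x4_down_left_svq3(top, left):
    """SVQ3 Diagonal Down/Left: three unrounded averages of Lk and Tk."""
    vals = [(left[k] + top[k]) // 2 for k in (1, 2, 3)]  # a, b, c
    # anti-diagonals 0 and 1 get a and b, everything further out gets c
    return [[vals[min(x + y, 2)] for x in range(4)] for y in range(4)]
```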

Diagonal Down/Left (RV40)

H.264: not used
SVQ3: not used
RV40: mode 4

 LT | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
 L0 |  a   b   c   d
 L1 |  b   c   d   e
 L2 |  c   d   e   f
 L3 |  d   e   f   g
 L4 |
 L5 |
 L6 |
 L7 |

where:

 a = (T0 + 2*T1 + T2 + L0 + 2*L1 + L2 + 4) / 8
 b = (T1 + 2*T2 + T3 + L1 + 2*L2 + L3 + 4) / 8
 c = (T2 + 2*T3 + T4 + L2 + 2*L3 + L4 + 4) / 8
 d = (T3 + 2*T4 + T5 + L3 + 2*L4 + L5 + 4) / 8
 e = (T4 + 2*T5 + T6 + L4 + 2*L5 + L6 + 4) / 8
 f = (T5 + 2*T6 + T7 + L5 + 2*L6 + L7 + 4) / 8
 g = (T6 +   T7      + L6 +   L7      + 2) / 4
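The RV40 variant applies the same 3-tap filter to the top and the left edge at once and averages the two results. A minimal sketch, assuming eight predictors on each side are available (top = [T0..T7], left = [L0..L7]); the function name is hypothetical.

```python
def pred4x4_down_left_rv40(top, left):
    """RV40 Diagonal Down/Left using T0..T7 and L0..L7."""
    t, l = top, left
    # coefficients a..f: filtered top plus filtered left, rounded and divided by 8
    diag = [
        (t[k] + 2 * t[k + 1] + t[k + 2] + l[k] + 2 * l[k + 1] + l[k + 2] + 4) // 8
        for k in range(6)
    ]
    # coefficient g uses only the last two predictors of each edge
    diag.append((t[6] + t[7] + l[6] + l[7] + 2) // 4)
    return [[diag[x + y] for x in range(4)] for y in range(4)]
```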

Diagonal Down/Right

H.264: mode 4
SVQ3: mode 4
RV40: mode 3

 LT | T0  T1  T2  T3
---------------------
 L0 |  d   e   f   g
 L1 |  c   d   e   f
 L2 |  b   c   d   e
 L3 |  a   b   c   d

where:

 a = (L3 + 2*L2 + L1 + 2) / 4
 b = (L2 + 2*L1 + L0 + 2) / 4
 c = (L1 + 2*L0 + LT + 2) / 4
 d = (L0 + 2*LT + T0 + 2) / 4
 e = (LT + 2*T0 + T1 + 2) / 4
 f = (T0 + 2*T1 + T2 + 2) / 4
 g = (T1 + 2*T2 + T3 + 2) / 4
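All seven coefficients come from one 3-tap filter run over the border read from L3 up to LT and then along the top to T3. The sketch below relies on that observation; the function name and argument layout (top = [T0..T3], left = [L0..L3], lt = LT) are assumptions made for this example.

```python
def pred4x4_down_right(top, left, lt):
    """Diagonal Down/Right from T0..T3, L0..L3 and the corner LT."""
    # border in filter order: L3, L2, L1, L0, LT, T0, T1, T2, T3
    edge = [left[3], left[2], left[1], left[0], lt] + list(top[:4])
    # diag[0..6] correspond to the coefficients a..g
    diag = [(edge[k] + 2 * edge[k + 1] + edge[k + 2] + 2) // 4 for k in range(7)]
    # the main diagonal holds d (index 3); moving right or up increases the index
    return [[diag[3 - y + x] for x in range(4)] for y in range(4)]
```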

Vertical/Right

H.264: mode 5
SVQ3: mode 5
RV40: mode 5

 LT | T0  T1  T2  T3
---------------------
 L0 |  a   b   c   d
 L1 |  e   f   g   h
 L2 |  i   a   b   c
 L3 |  j   e   f   g

where:

 a = (LT + T0 + 1) / 2
 b = (T0 + T1 + 1) / 2
 c = (T1 + T2 + 1) / 2
 d = (T2 + T3 + 1) / 2
 e = (L0 + 2*LT + T0 + 2) / 4
 f = (LT + 2*T0 + T1 + 2) / 4
 g = (T0 + 2*T1 + T2 + 2) / 4
 h = (T1 + 2*T2 + T3 + 2) / 4
 i = (LT + 2*L0 + L1 + 2) / 4
 j = (L0 + 2*L1 + L2 + 2) / 4
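The table and formulas above translate directly into code. This is a plain transcription rather than an optimized routine; the function name and argument layout (top = [T0..T3], left = [L0..L2], lt = LT) are assumptions made for this example.

```python
def pred4x4_vertical_right(top, left, lt):
    """Vertical/Right from T0..T3, L0..L2 and the corner LT."""
    t, l = top, left
    a = (lt + t[0] + 1) // 2
    b = (t[0] + t[1] + 1) // 2
    c = (t[1] + t[2] + 1) // 2
    d = (t[2] + t[3] + 1) // 2
    e = (l[0] + 2 * lt + t[0] + 2) // 4
    f = (lt + 2 * t[0] + t[1] + 2) // 4
    g = (t[0] + 2 * t[1] + t[2] + 2) // 4
    h = (t[1] + 2 * t[2] + t[3] + 2) // 4
    i = (lt + 2 * l[0] + l[1] + 2) // 4
    j = (l[0] + 2 * l[1] + l[2] + 2) // 4
    return [[a, b, c, d], [e, f, g, h], [i, a, b, c], [j, e, f, g]]
```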

Horizontal/Down

H.264: mode 6
SVQ3: mode 6
RV40: mode 8

 LT | T0  T1  T2  T3
---------------------
 L0 |  a   b   c   d
 L1 |  e   f   a   b
 L2 |  g   h   e   f
 L3 |  i   j   g   h

where:

 a = (LT + L0 + 1) / 2
 b = (L0 + 2*LT + T0 + 2) / 4
 c = (LT + 2*T0 + T1 + 2) / 4
 d = (T0 + 2*T1 + T2 + 2) / 4
 e = (L0 + L1 + 1) / 2
 f = (LT + 2*L0 + L1 + 2) / 4
 g = (L1 + L2 + 1) / 2
 h = (L0 + 2*L1 + L2 + 2) / 4
 i = (L2 + L3 + 1) / 2
 j = (L1 + 2*L2 + L3 + 2) / 4
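A direct transcription of this mode might look as follows. The function name and argument layout (top = [T0..T2], left = [L0..L3], lt = LT) are assumptions made for this example; the coefficient at row L3, first column is called i, matching the table.

```python
def pred4x4_horizontal_down(top, left, lt):
    """Horizontal/Down from T0..T2, L0..L3 and the corner LT."""
    t, l = top, left
    a = (lt + l[0] + 1) // 2
    b = (l[0] + 2 * lt + t[0] + 2) // 4
    c = (lt + 2 * t[0] + t[1] + 2) // 4
    d = (t[0] + 2 * t[1] + t[2] + 2) // 4
    e = (l[0] + l[1] + 1) // 2
    f = (lt + 2 * l[0] + l[1] + 2) // 4
    g = (l[1] + l[2] + 1) // 2
    h = (l[0] + 2 * l[1] + l[2] + 2) // 4
    i = (l[2] + l[3] + 1) // 2
    j = (l[1] + 2 * l[2] + l[3] + 2) // 4
    return [[a, b, c, d], [e, f, a, b], [g, h, e, f], [i, j, g, h]]
```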

Vertical/Left

H.264: mode 7
SVQ3: mode 7
RV40: mode 6

 LT | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
 L0 |  a   b   c   d
 L1 |  f   g   h   i
 L2 |  b   c   d   e
 L3 |  g   h   i   j

where:

 a = (T0 + T1 + 1) / 2
 b = (T1 + T2 + 1) / 2
 c = (T2 + T3 + 1) / 2
 d = (T3 + T4 + 1) / 2
 e = (T4 + T5 + 1) / 2
 f = (T0 + 2*T1 + T2 + 2) / 4
 g = (T1 + 2*T2 + T3 + 2) / 4
 h = (T2 + 2*T3 + T4 + 2) / 4
 i = (T3 + 2*T4 + T5 + 2) / 4
 j = (T4 + 2*T5 + T6 + 2) / 4

For RV40, two coefficients differ (they additionally use the left predictors L1..L4):

 a = (2*T0 + 2*T1 + L1 + 2*L2 +   L3 +      4) / 8
 f = (  T0 + 2*T1 + T2 +   L2 + 2*L3 + L4 + 4) / 8
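The sketch below covers both variants: even rows use the 2-tap averages, odd rows the 3-tap filter, and the RV40 variant then overrides the two samples that hold a and f. The function name, the rv40 flag and the argument layout (top = [T0..T6], left = [L0..L4], needed only for RV40) are assumptions made for this example.

```python
def pred4x4_vertical_left(top, left=None, rv40=False):
    """Vertical/Left from T0..T6; the RV40 variant also needs L1..L4."""
    t = top
    avg2 = [(t[k] + t[k + 1] + 1) // 2 for k in range(5)]                 # a..e
    avg3 = [(t[k] + 2 * t[k + 1] + t[k + 2] + 2) // 4 for k in range(5)]  # f..j
    # rows 0 and 2 read avg2, rows 1 and 3 read avg3; rows 2 and 3 start one later
    block = [
        [(avg3 if y % 2 else avg2)[x + y // 2] for x in range(4)]
        for y in range(4)
    ]
    if rv40:
        l = left
        # a and f each occur exactly once, in the first column of rows 0 and 1
        block[0][0] = (2 * t[0] + 2 * t[1] + l[1] + 2 * l[2] + l[3] + 4) // 8
        block[1][0] = (t[0] + 2 * t[1] + t[2] + l[2] + 2 * l[3] + l[4] + 4) // 8
    return block
```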

Horizontal/Up

H.264: mode 8
SVQ3: mode 8
RV40: not used

 LT | T0  T1  T2  T3
---------------------
 L0 |  a   b   c   d
 L1 |  c   d   e   f
 L2 |  e   f   g   g
 L3 |  g   g   g   g

where:

 a = (L0 + L1 + 1) / 2
 b = (L0 + 2*L1 + L2 + 2) / 4
 c = (L1 + L2 + 1) / 2
 d = (L1 + 2*L2 + L3 + 2) / 4
 e = (L2 + L3 + 1) / 2
 f = (L2 + 2*L3 + L3 + 2) / 4
 g = L3
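The coefficients a..f alternate between a 2-tap average and a 3-tap filter over the left column, with L3 repeated once past the end. A minimal sketch under that reading; the function name and the list-based argument (left = [L0..L3]) are assumptions made for this example.

```python
def pred4x4_horizontal_up(left):
    """H.264/SVQ3 Horizontal/Up from the left predictors L0..L3."""
    l = list(left[:4]) + [left[3]]  # repeat L3 so f becomes (L2 + 3*L3 + 2) / 4
    z = []
    for k in range(3):
        z.append((l[k] + l[k + 1] + 1) // 2)                 # a, c, e
        z.append((l[k] + 2 * l[k + 1] + l[k + 2] + 2) // 4)  # b, d, f
    g = left[3]
    # the sample at column x, row y uses index x + 2*y; anything past f is g
    return [[z[x + 2 * y] if x + 2 * y < 6 else g for x in range(4)]
            for y in range(4)]
```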

Horizontal/Up (RV40)

H.264: not used
SVQ3: not used
RV40: mode 7

 LT | T0  T1  T2  T3
---------------------
 L0 |  a   b   c   d
 L1 |  c   d   e   f
 L2 |  e   f   g   h
 L3 |  g   h   i   j
 L4 |
 L5 |
 L6 |
 L7 |

where:

 a = (T1 + 2*T2 + T3 + 2*L0 + 2*L1 +      4) / 8
 b = (T2 + 2*T3 + T4 +   L0 + 2*L1 + L2 + 4) / 8
 c = (T3 + 2*T4 + T5 + 2*L1 + 2*L2 +      4) / 8
 d = (T4 + 2*T5 + T6 +   L1 + 2*L2 + L3 + 4) / 8
 e = (T5 + 2*T6 + T7 + 2*L2 + 2*L3 +      4) / 8
 f = (T6 + 3*T7 +        L2 + 3*L3 +      4) / 8
 g = (T6 +   T7 +        L3 +   L4      + 2) / 4
 h = (                   L3 + 2*L4 + L5 + 2) / 4
 i = (                   L4 +   L5      + 1) / 2
 j = (                   L4 + 2*L5 + L6 + 2) / 4
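For reference, the RV40 formulas above can be transcribed as follows. The function name and argument layout (top = [T0..T7], left = [L0..L6]) are assumptions made for this example, and the code assumes all of those predictors are available.

```python
def pred4x4_horizontal_up_rv40(top, left):
    """RV40 Horizontal/Up from T1..T7 and L0..L6."""
    t, l = top, left
    a = (t[1] + 2 * t[2] + t[3] + 2 * l[0] + 2 * l[1] + 4) // 8
    b = (t[2] + 2 * t[3] + t[4] + l[0] + 2 * l[1] + l[2] + 4) // 8
    c = (t[3] + 2 * t[4] + t[5] + 2 * l[1] + 2 * l[2] + 4) // 8
    d = (t[4] + 2 * t[5] + t[6] + l[1] + 2 * l[2] + l[3] + 4) // 8
    e = (t[5] + 2 * t[6] + t[7] + 2 * l[2] + 2 * l[3] + 4) // 8
    f = (t[6] + 3 * t[7] + l[2] + 3 * l[3] + 4) // 8
    g = (t[6] + t[7] + l[3] + l[4] + 2) // 4
    h = (l[3] + 2 * l[4] + l[5] + 2) // 4
    i = (l[4] + l[5] + 1) // 2
    j = (l[4] + 2 * l[5] + l[6] + 2) // 4
    return [[a, b, c, d], [c, d, e, f], [e, f, g, h], [g, h, i, j]]
```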

Left/DC

H.264: mode 9
SVQ3: mode 9
RV40: not used

 LT | T0  T1  T2  T3
---------------------
 L0 |  a   a   a   a
 L1 |  a   a   a   a
 L2 |  a   a   a   a
 L3 |  a   a   a   a

where:

a = (L0 + L1 + L2 + L3 + 2) / 4

Top/DC

H.264: mode 10
SVQ3: mode 10
RV40: not used

 LT | T0  T1  T2  T3
---------------------
 L0 |  a   a   a   a
 L1 |  a   a   a   a
 L2 |  a   a   a   a
 L3 |  a   a   a   a

where:

a = (T0 + T1 + T2 + T3 + 2) / 4

DC-128

H.264: mode 11
SVQ3: mode 11
RV40: not used

 LT |  T0   T1   T2   T3
------------------------
 L0 | 128  128  128  128
 L1 | 128  128  128  128
 L2 | 128  128  128  128
 L3 | 128  128  128  128
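The four DC variants differ only in which neighbours contribute to the mean. The sketch below picks the variant from whichever edges are passed in. That selection rule is an assumption based on how H.264 DC prediction falls back when neighbouring blocks are unavailable; it is not stated on this page. The function name and the use of None for a missing edge are likewise hypothetical.

```python
def pred4x4_dc_any(top=None, left=None):
    """Pick DC, Left/DC, Top/DC or DC-128 from the edges that are given."""
    if top is not None and left is not None:
        a = (sum(top[:4]) + sum(left[:4]) + 4) // 8   # DC
    elif left is not None:
        a = (sum(left[:4]) + 2) // 4                  # Left/DC
    elif top is not None:
        a = (sum(top[:4]) + 2) // 4                   # Top/DC
    else:
        a = 128                                       # DC-128
    return [[a] * 4 for _ in range(4)]
```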

16x16 Prediction Modes

DC

H.264: mode 0
SVQ3: mode 0
RV40: mode 0

Using the 16 top predictors (T0..T15) and the 16 left predictors (L0..L15), set all 256 elements to the mean, computed as:

 mean = (sum(T0..T15) + sum(L0..L15) + 16) / 32
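As a small illustration, the 16x16 DC mean can be computed as below. The function name and the list-based arguments (top = [T0..T15], left = [L0..L15]) are assumptions made for this example.

```python
def pred16x16_dc(top, left):
    """Fill a 16x16 block with the rounded mean of T0..T15 and L0..L15."""
    mean = (sum(top[:16]) + sum(left[:16]) + 16) // 32
    return [[mean] * 16 for _ in range(16)]
```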

Vertical

H.264: mode 1
SVQ3: mode 1
RV40: mode 1

  LT | T0  T1  T2  T3  T4  ..  T15
------------------------- .. -----
  L0 | T0  T1  T2  T3  T4  ..  T15
  L1 | T0  T1  T2  T3  T4  ..  T15
  L2 | T0  T1  T2  T3  T4  ..  T15
 ......
 L15 | T0  T1  T2  T3  T4  ..  T15

Horizontal

H.264: mode 2
SVQ3: mode 2
RV40: mode 2

  LT |  T0  T1  T2  T3  T4  ..  T15
--------------------------- .. -----
  L0 |  L0  L0  L0  L0  L0  ..   L0
  L1 |  L1  L1  L1  L1  L1  ..   L1
  L2 |  L2  L2  L2  L2  L2  ..   L2
 ......
 L15 | L15 L15 L15 L15 L15  ..  L15

Plane

H.264: mode 3
SVQ3: mode 3
RV40: mode 3

Note that SVQ3 scales H and V slightly differently from H.264 (see below). RV40 probably differs as well; the RV40 part of this description should be regarded as unfinished.

Given the top predictors (T0..T15), left predictors (L0..L15) and the left-top corner predictor (LT) arranged as follows:

  LT |   T0    T1    T2  ..  T15
------------------------ .. -----
  L0 |  c_0,0  c_1,0  c_2,0 .. c_15,0
  L1 |  c_0,1  c_1,1  c_2,1 .. c_15,1
 ......
 L15 | c_0,15 c_1,15 c_2,15 .. c_15,15

Compute H and V as:

 H = 1 *  (T8 - T6) +
     2 *  (T9 - T5) +
     3 * (T10 - T4) +
     4 * (T11 - T3) +
     5 * (T12 - T2) +
     6 * (T13 - T1) +
     7 * (T14 - T0) +
     8 * (T15 - LT)

 V = 1 *  (L8 - L6) +
     2 *  (L9 - L5) +
     3 * (L10 - L4) +
     4 * (L11 - L3) +
     5 * (L12 - L2) +
     6 * (L13 - L1) +
     7 * (L14 - L0) +
     8 * (L15 - LT)

For H.264, further compute H and V as:

 H = (5*H + 32) / 64
 V = (5*V + 32) / 64

For SVQ3, further compute H and V as:

 H = (5*(H/4)) / 16
 V = (5*(V/4)) / 16 
 swap H and V

The final process for filling in the 16x16 block is:

 a = 16 * (L15 + T15 + 1) - 7*(V+H)
 for (j = 0..15)
   for (i = 0..15)
     b = a + V * j + H * i
     c[i,j] = SATURATE_U8(b / 32)

The SATURATE_U8() function indicates that the result of the operation should be bounded to an unsigned 8-bit range (0..255).
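The H.264 variant of the plane mode can be sketched as follows. Two points are assumptions taken from the H.264 specification rather than from the notation on this page: each difference in H and V is weighted by its distance from the centre (1 to 8), and the divisions by 64 and 32 are arithmetic right shifts, which matters because H and V can be negative. The function names and the list-based arguments (top = [T0..T15], left = [L0..L15], lt = LT) are hypothetical. The SVQ3 scaling is not covered.

```python
def saturate_u8(value):
    """Bound a value to the unsigned 8-bit range 0..255."""
    return max(0, min(255, value))


def pred16x16_plane_h264(top, left, lt):
    """H.264 16x16 plane prediction; returns rows c[j][i], j = row, i = column."""
    # prepend LT so that t[k + 1] is Tk and t[0] is LT (same for l)
    t = [lt] + list(top[:16])
    l = [lt] + list(left[:16])
    # weighted gradients: k * (T(7+k) - T(7-k)), where T(-1) is LT
    h = sum(k * (t[8 + k] - t[8 - k]) for k in range(1, 9))
    v = sum(k * (l[8 + k] - l[8 - k]) for k in range(1, 9))
    # H.264 scaling; >> on Python ints floors like an arithmetic shift
    h = (5 * h + 32) >> 6
    v = (5 * v + 32) >> 6
    a = 16 * (left[15] + top[15] + 1) - 7 * (v + h)
    return [[saturate_u8((a + v * j + h * i) >> 5) for i in range(16)]
            for j in range(16)]
```

With a flat border the gradients vanish and the block is flat; with a border that ramps linearly from left to right the prediction continues that ramp.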

Left/DC

H.264: mode 4
SVQ3: mode 4
RV40: not used

Using the 16 left predictors (L0..L15), set all 256 elements to the mean, computed as:

 mean = (sum(L0..L15) + 8) / 16

Top/DC

H.264: mode 5
SVQ3: mode 5
RV40: not used

Using the 16 top predictors (T0..T15), set all 256 elements to the mean, computed as:

 mean = (sum(T0..T15) + 8) / 16

DC-128

H.264: mode 6
SVQ3: mode 6
RV40: not used

Set all 256 elements to 128.
