Difference between revisions of "H.264 Prediction"

From MultimediaWiki
m (→‎Plane: ohh, Mike, Mike...)
 
(24 intermediate revisions by 4 users not shown)
Line 1: Line 1:
This page documents the various prediction methods used in [[H.264]] and related formats such as [[Sorenson Video 3]] and [[RealVideo 4]].
This page documents the various prediction methods used in [[H.264]] and related formats such as [[Sorenson Video 3]], [[RealVideo 4]], and [[On2 VP8]].


== 4x4 Prediction Modes ==
== 4x4 Prediction Modes ==
4x4 prediction modes vary between different codecs. While they are almost the same for [[H.264]] and [[Sorenson Video 3]], [[RealVideo 4]] has a different order for these modes and some of them significantly differ from H.264 counterparts (by using left predictors where H.264 does not and down left predictors which are not used elsewhere).


=== Vertical ===
=== Vertical ===
Line 8: Line 10:
* SVQ3: mode 0
* SVQ3: mode 0
* RV40: mode 1
* RV40: mode 1
* VP8: not used


  LT | T0  T1  T2  T3
    | T0  T1  T2  T3
  ---------------------
  ---------------------
  L0 | T0  T1  T2  T3
    | T0  T1  T2  T3
  L1 | T0  T1  T2  T3
    | T0  T1  T2  T3
  L2 | T0  T1  T2  T3
    | T0  T1  T2  T3
   L3 | T0  T1  T2  T3
    | T0  T1  T2  T3
 
=== Vertical (VP8) ===
 
* H.264: not used
* SVQ3: not used
* RV40: not used
* VP8: mode 2
 
   LT | T0  T1  T2  T3 T4
------------------------
    |  a  b  c  d
    |  a  b  c  d
    |  a  b  c  d
    |  a  b  c  d
 
where:
 
  a = (LT + 2*T0 + T1 + 2) >> 2
  b = (T0 + 2*T1 + T2 + 2) >> 2
  c = (T1 + 2*T2 + T3 + 2) >> 2
  d = (T2 + 2*T3 + T4 + 2) >> 2


=== Horizontal ===
=== Horizontal ===
Line 21: Line 45:
* SVQ3: mode 1
* SVQ3: mode 1
* RV40: mode 2
* RV40: mode 2
* VP8: not used


  LT | T0  T1  T2  T3
    |  
  ---------------------
  ---------------------
   L0 | L0  L0  L0  L0
   L0 | L0  L0  L0  L0
Line 28: Line 53:
   L2 | L2  L2  L2  L2
   L2 | L2  L2  L2  L2
   L3 | L3  L3  L3  L3
   L3 | L3  L3  L3  L3
=== Horizontal (VP8) ===
* H.264: not used
* SVQ3: not used
* RV40: not used
* VP8: mode 3
  LT |
--------------------
  L0 |  a  a  a  a
  L1 |  b  b  b  b
  L2 |  c  c  c  c
  L3 |  d  d  d  d
where:
  a = (LT + 2*L0 + L1 + 2) >> 2
  b = (L0 + 2*L1 + L2 + 2) >> 2
  c = (L1 + 2*L2 + L3 + 2) >> 2
  d = (L2 + 2*L3 + L3 + 2) >> 2


=== DC ===
=== DC ===
Line 34: Line 80:
* SVQ3: mode 2
* SVQ3: mode 2
* RV40: mode 0
* RV40: mode 0
* VP8: mode 0


  LT | T0  T1  T2  T3
    | T0  T1  T2  T3
  ---------------------
  ---------------------
   L0 |  a  a  a  a
   L0 |  a  a  a  a
Line 44: Line 91:
where:
where:


  a = (T0 + T1 + T2 + T3 + L0 + L1 + L2 + L3 + 4) / 8
  if top and left predictors are available
  a = (T0 + T1 + T2 + T3 + L0 + L1 + L2 + L3 + 4) / 8
else if top predictors are available
  a = (T0 + T1 + T2 + T3 + 2) / 4
else if left predictors are available
  a = (L0 + L1 + L2 + L3 + 2) / 4
else
  a = 128
 
Note that the VP8 reference code does not make any provisions for either or both sets of predictors to be missing.


=== Diagonal Down/Left ===
=== Diagonal Down/Left ===
Line 51: Line 107:
* SVQ3: not used
* SVQ3: not used
* RV40: not used
* RV40: not used
* VP8: mode 4


  LT | T0  T1  T2  T3  T4  T5  T6  T7
    | T0  T1  T2  T3  T4  T5  T6  T7
  -------------------------------------
  -------------------------------------
  L0 |  a  b  c  d
    |  a  b  c  d
  L1 |  b  c  d  e
    |  b  c  d  e
  L2 |  c  d  e  f
    |  c  d  e  f
  L3 |  d  e  f  g
    |  d  e  f  g


where:
where:
Line 74: Line 131:
* SVQ3: mode 3
* SVQ3: mode 3
* RV40: not used
* RV40: not used
* VP8: not used


  LT | T0  T1  T2  T3
    |     T1  T2  T3
  ---------------------
  ---------------------
  L0 |  a  b  c  c
    |  a  b  c  c
   L1 |  b  c  c  c
   L1 |  b  c  c  c
   L2 |  c  c  c  c
   L2 |  c  c  c  c
Line 93: Line 151:
* SVQ3: not used
* SVQ3: not used
* RV40: mode 4
* RV40: mode 4
* VP8: not used


=== Diagonal Down/Left ===
    | T0  T1  T2  T3  T4  T5  T6  T7
 
* H.264: mode 3
* SVQ3: not used
* RV40: not used
 
  LT | T0  T1  T2  T3  T4  T5  T6  T7
  -------------------------------------
  -------------------------------------
   L0 |  a  b  c  d
   L0 |  a  b  c  d
Line 110: Line 163:
   L6 |
   L6 |
   L7 |
   L7 |


where:
where:
Line 127: Line 179:
* SVQ3: mode 4
* SVQ3: mode 4
* RV40: mode 3
* RV40: mode 3
* VP8: mode 5


   LT | T0  T1  T2  T3
   LT | T0  T1  T2  T3
Line 150: Line 203:
* SVQ3: mode 5
* SVQ3: mode 5
* RV40: mode 5
* RV40: mode 5
* VP8: mode 6


   LT | T0  T1  T2  T3
   LT | T0  T1  T2  T3
Line 156: Line 210:
   L1 |  e  f  g  h
   L1 |  e  f  g  h
   L2 |  i  a  b  c
   L2 |  i  a  b  c
  L3 |  j  e  f  g
    |  j  e  f  g


where:
where:
Line 176: Line 230:
* SVQ3: mode 6
* SVQ3: mode 6
* RV40: mode 8
* RV40: mode 8
* VP8: mode 8


   LT | T0  T1  T2  T3
   LT | T0  T1  T2   
  ---------------------
  ---------------------
   L0 |  a  b  c  d
   L0 |  a  b  c  d
Line 202: Line 257:
* SVQ3: mode 7
* SVQ3: mode 7
* RV40: mode 6
* RV40: mode 6
* VP8: mode 7


  LT | T0  T1  T2  T3  T4  T5  T6 T7
    | T0  T1  T2  T3  T4  T5  T6  
  -------------------------------------
  ---------------------------------
  L0 |  a  b  c  d
    |  a  b  c  d
   L1 |  f  g  h  i
   L1 |  f  g  h  i
   L2 |  b  c  d  e
   L2 |  b  c  d  e
   L3 |  g  h  i  j
   L3 |  g  h  i  j
  L4 |


where:
where:
Line 227: Line 284:
   a = (2*T0 + 2*T1 + L1 + 2*L2 +  L3 +      4) / 8
   a = (2*T0 + 2*T1 + L1 + 2*L2 +  L3 +      4) / 8
   f = (  T0 + 2*T1 + T2 +  L2 + 2*L3 + L4 + 4) / 8
   f = (  T0 + 2*T1 + T2 +  L2 + 2*L3 + L4 + 4) / 8
For VP8, 3 coefficients differ:
  c = (T2 + T3 + T4 + 2) / 4
  e = (T4 + 2*T5 + T6 + 2) / 4
  j = (T5 + 2*T6 + T7 + 2) / 4


=== Horizontal/Up ===
=== Horizontal/Up ===
Line 232: Line 295:
* H.264: mode 8
* H.264: mode 8
* SVQ3: mode 8
* SVQ3: mode 8
* RV40: mode 7
* RV40: not used
* VP8: mode 9


  LT | T0  T1  T2  T3
    |  
  ---------------------
  ---------------------
   L0 |  a  b  c  d
   L0 |  a  b  c  d
Line 248: Line 312:
   d = (L1 + 2*L2 + L3 + 2) / 4
   d = (L1 + 2*L2 + L3 + 2) / 4
   e = (L2 + L3 + 1) / 2
   e = (L2 + L3 + 1) / 2
   f = (L2 + 2*L3 + L3 + 2) / 4
   f = (L2 + 3*L3     + 2) / 4
   g = L3
   g = L3


=== Left/DC ===
=== Horizontal/Up (RV40)===


* H.264: mode 9
* H.264: not used
* SVQ3: mode 9
* SVQ3: not used
* RV40: not used
* RV40: mode 7
* VP8: not used


  LT | T0  T1  T2  T3
    | T0  T1  T2  T3 T4  T5  T6  T7
  ---------------------
  -------------------------------------
   L0 |  a  a   a   a
   L0 |  a  b   c   d
   L1 |  a   a   a   a
   L1 |  c   d   e   f
   L2 |  a   a   a   a
   L2 |  e   f   g   h
   L3 |  a   a   a   a
   L3 |  g  h  i  j
   L4 |
   L5 |
   L6 |


where:
where:


a = (L0 + L1 + L2 + L3 + 2) / 4
  a = (T1 + 2*T2 + T3 + 2*L0 + 2*L1 +      4) / 8
  b = (T2 + 2*T3 + T4 +  L0 + 2*L1 + L2 + 4) / 8
  c = (T3 + 2*T4 + T5 + 2*L1 + 2*L2 +      4) / 8
  d = (T4 + 2*T5 + T6 +  L1 + 2*L2 + L3 + 4) / 8
  e = (T5 + 2*T6 + T7 + 2*L2 + 2*L3 +      4) / 8
  f = (T6 + 3*T7 +        L2 + 3*L3 +      4) / 8
  g = (T6 +  T7 +        L3 +  L4      + 2) / 4
  h = (                  L3 + 2*L4 + L5 + 2) / 4
  i = (                  L4 +  L5      + 1) / 2
  j = (                  L4 + 2*L5 + L6 + 2) / 4


=== Top/DC ===
=== TrueMotion (VP8) ===


* H.264: mode 10
* H.264: not used
* SVQ3: mode 10
* SVQ3: not used
* RV40: not used
* RV40: not used
* VP8: mode 1


   LT | T0  T1  T2  T3
   LT | T0  T1  T2  T3
  ---------------------
  ---------------------
   L0 |  a  a   a   a
   L0 |  a  .   .   .
   L1 |  a   a   a   a
   L1 |  .   b   .   .
   L2 |  a   a   a   a
   L2 |  .   .   c   .
   L3 |  a   a   a   a
   L3 |  .   .   .   d
 
where:


a = (T0 + T1 + T2 + T3 + 2) / 4
where this pattern is satisfied:


=== DC-128 ===
  a = SATURATE_U8(T0 - LT + L0)
  b = SATURATE_U8(T1 - LT + L1)
  c = SATURATE_U8(T2 - LT + L2)
  d = SATURATE_U8(T3 - LT + L3)


* H.264: mode 11
I.e., for each of the 16 samples: (top predictor for column) - (left/top predictor) + (left predictor for row), then saturate in an unsigned byte range 0..255.
* SVQ3: mode 11
* RV40: not used


  LT |  T0  T1  T2  T3
------------------------
  L0 | 128  128  128  128
  L1 | 128  128  128  128
  L2 | 128  128  128  128
  L3 | 128  128  128  128


== 16x16 Prediction Modes ==
== 16x16 Prediction Modes ==
Line 305: Line 376:
* SVQ3: mode 0
* SVQ3: mode 0
* RV40: mode 0
* RV40: mode 0
* VP8: mode 0


Using the 16 top predictors (T0..T15) and the 16 left predictors (L0..L15), set all 256 elements to the mean, computed as:
Using the 16 top predictors (T0..T15) and the 16 left predictors (L0..L15), set all 256 elements to the mean, computed as:


  mean = (sum(T0..T15) + sum(L0..L15) + 16) / 32
if top and left predictors are available
  mean = (sum(T0..T15) + sum(L0..L15) + 16) / 32
else if top predictors are available
  mean = (sum(T0..T15) + 8) / 16
else if left predictors are available
  mean = (sum(L0..L15) + 8) / 16
else
  mean = 128


=== Vertical ===
=== Vertical ===
Line 315: Line 394:
* SVQ3: mode 1
* SVQ3: mode 1
* RV40: mode 1
* RV40: mode 1
* VP8: mode 1


  LT | T0  T1  T2  T3  T4  ..  T15
      | T0  T1  T2  T3  T4  ..  T15
  ------------------------- .. -----
  -------------------------- .. -----
  L0 | T0  T1  T2  T3  T4  ..  T15
      | T0  T1  T2  T3  T4  ..  T15
  L1 | T0  T1  T2  T3  T4  ..  T15
      | T0  T1  T2  T3  T4  ..  T15
  L2 | T0  T1  T2  T3  T4  ..  T15
      | T0  T1  T2  T3  T4  ..  T15
   ......
   ......
  L15 | T0  T1  T2  T3  T4  ..  T15
      | T0  T1  T2  T3  T4  ..  T15


=== Horizontal ===
=== Horizontal ===
Line 329: Line 409:
* SVQ3: mode 2
* SVQ3: mode 2
* RV40: mode 2
* RV40: mode 2
* VP8: mode 2


  LT | T0  T1  T2  T3  T4  ..  T15
      |
  --------------------------- .. -----
  --------------------------- .. -----
   L0 |  L0  L0  L0  L0  L0  ..  L0
   L0 |  L0  L0  L0  L0  L0  ..  L0
Line 343: Line 424:
* SVQ3: mode 3
* SVQ3: mode 3
* RV40: mode 3
* RV40: mode 3
* VP8: not used


Notice that SVQ3 follows a slightly different method here. RV40 is likely different as well and should be regarded as unfinished.
Notice that SVQ3 and RV40 follow a slightly different method here.


Given the top predictors (T0..T15), left predictors (L0..L15) and the left-top corner predictor (LT) arranged as follows:
Given the top predictors (T0..T15), left predictors (L0..L15) and the left-top corner predictor (LT) arranged as follows:


   LT |   T0   T1   T2 .. T15
   LT |   T0       T1       T2     ..   T15
  ------------------------ .. -----
  -----------------------------------  .. --------
   L0 | c<sub>0,0</sub> c<sub>1,0</sub> c<sub>2,0</sub> .. c<sub>15,0</sub>
   L0 | c[ 0, 0]  c[ 1, 0]  c[ 2, 0]  ..  c[15, 0]
   L1 | c<sub>0,1</sub> c<sub>1,1</sub> c<sub>2,1</sub> .. c<sub>15,1</sub>
   L1 | c[ 0, 1]  c[ 1, 1]  c[ 2, 1]  ..  c[15, 1]
   ......
   ......
   L15 | c<sub>0,15</sub> c<sub>1,15</sub> c<sub>2,15</sub> .. c<sub>15,15</sub>
   L15 | c[ 0,15]  c[ 1,15]  c[ 2,15]  ..  c[15,15]
 
Compute H' and V':
H' = 1* (T8 - T6) +
      2* (T9 - T5) +
      3*(T10 - T4) +
      4*(T11 - T3) +
      5*(T12 - T2) +
      6*(T13 - T1) +
      7*(T14 - T0) +
      8*(T15 - LT)
 
V' = 1* (L8 - L6) +
      2* (L9 - L5) +
      3*(L10 - L4) +
      4*(L11 - L3) +
      5*(L12 - L2) +
      6*(L13 - L1) +
      7*(L14 - L0) +
      8*(L15 - LT)


Compute H and V as:
For H.264, compute H and V as:
  H =  (T8 - T6) +
      (T9 - T5) +
      (T10 - T4) +
      (T11 - T3) +
      (T12 - T2) +
      (T13 - T1) +
      (T14 - T0) +
      (T15 - LT)


   V = (L8 - L6) +
   H = (5*H' + 32) / 64
      (L9 - L5) +
  V = (5*V' + 32) / 64
      (L10 - L4) +
      (L11 - L3) +
      (L12 - L2) +
      (L13 - L1) +
      (L14 - L0) +
      (L15 - LT)


For H.264, further compute H and V as:
For SVQ3, compute H and V as:


   H = (5*H + 32) / 64
   V = (5*(H'/4)) / 16
   V = (5*V + 32) / 64
   H = (5*(V'/4)) / 16
  (notice that V and H are computed from H' and V', respectively)


For SVQ3, further compute H and V as:
For RV40, compute H and V as:


   H = (5*(H/4)) / 16
   H = (5*(H' >> 2)) >> 4
   V = (5*(V/4)) / 16
   V = (5*(V' >> 2)) >> 4
   swap H and V
   (like SVQ3 but without swapping and it's important to use shifts here instead of divisions)


The final process for filling in the 16x16 block is:
The final process for filling in the 16x16 block is:
Line 390: Line 478:
   for (j = 0..15)
   for (j = 0..15)
     for (i = 0..15)
     for (i = 0..15)
       b = a + V * (15 - j) + (i * H * 4)
       b = a + V * j + H * i
       c[i,j] = SATURATE_U8((b + (i%4*H)) / 32)
       c[i,j] = SATURATE_U8(b / 32)


The SATURATE_U8() function indicates that the result of the operation should be bounded to an unsigned 8-bit range (0..255).
The SATURATE_U8() function indicates that the result of the operation should be bounded to an unsigned 8-bit range (0..255).


=== Left/DC ===
=== TrueMotion (VP8) ===


* H.264: mode 4
* H.264: not used
* SVQ3: mode 4
* SVQ3: not used
* RV40: not used
* RV40: not used
* VP8: mode 3


Using 16 left predictors (L0..L15), set all 256 elements to the mean, computed as:
Given the top predictors (T0..T15), left predictors (L0..L15) and the left-top corner predictor (LT) arranged as follows:


  mean = (sum(L0..L15) + 8) / 16
  LT |    T0        T1        T2    ..   T15
 
-----------------------------------  ..  --------
=== Top/DC ===
  L0 | c[ 0, 0]  c[ 1, 0]  c[ 2, 0]  ..  c[15, 0]
 
  L1 | c[ 0, 1]  c[ 1, 1]  c[ 2, 1]  ..  c[15, 1]
* H.264: mode 5
  ......
* SVQ3: mode 5
  L15 | c[ 0,15]  c[ 1,15]  c[ 2,15]  ..  c[15,15]
* RV40: not used


Using the 16 top predictors (T0..T15), set all 256 elements to the mean, computed as:
Compute c[t,l] as:


   mean = (sum(T0..T15) + 8) / 16
   c[t,l] = SATURATE_U8(L[l] + T[t] - LT)
 
=== DC-128 ===
 
* H.264: mode 6
* SVQ3: mode 6
* RV40: not used


Set all 256 elements to 128.
I.e., for each element, sum the left and top predictors for the row and column, respectively, and subtract the left-top predictor. Then, saturate the result between 0 and 255.


[[Category:Compression Theory]]
[[Category:Compression Theory]]

Latest revision as of 22:50, 22 April 2011

This page documents the various prediction methods used in H.264 and related formats such as Sorenson Video 3, RealVideo 4, and On2 VP8.

4x4 Prediction Modes

The 4x4 prediction modes vary between codecs. H.264 and Sorenson Video 3 use almost the same set, while RealVideo 4 numbers the modes in a different order, and some of its modes differ significantly from their H.264 counterparts: they use left predictors where H.264 does not, as well as down-left predictors (L4 and beyond) that are not used elsewhere.

Vertical

  • H.264: mode 0
  • SVQ3: mode 0
  • RV40: mode 1
  • VP8: not used
    | T0  T1  T2  T3
---------------------
    | T0  T1  T2  T3
    | T0  T1  T2  T3
    | T0  T1  T2  T3
    | T0  T1  T2  T3

Vertical (VP8)

  • H.264: not used
  • SVQ3: not used
  • RV40: not used
  • VP8: mode 2
 LT | T0  T1  T2  T3  T4
------------------------
    |  a   b   c   d
    |  a   b   c   d
    |  a   b   c   d
    |  a   b   c   d

where:

 a = (LT + 2*T0 + T1 + 2) >> 2
 b = (T0 + 2*T1 + T2 + 2) >> 2
 c = (T1 + 2*T2 + T3 + 2) >> 2
 d = (T2 + 2*T3 + T4 + 2) >> 2
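
The smoothed vertical mode above can be sketched in Python as follows. This is only an illustration of the four formulas; the function name `vp8_vertical_4x4` and the argument layout (`top` holding T0..T4) are assumptions made for this example and are not taken from any codec source.

```python
def vp8_vertical_4x4(lt, top):
    """Smoothed vertical prediction; `top` must hold at least T0..T4."""
    edge = [lt] + list(top[:5])  # LT, T0, T1, T2, T3, T4
    # Each column is a 1-2-1 filter centred on its own top predictor.
    row = [(edge[k] + 2 * edge[k + 1] + edge[k + 2] + 2) >> 2 for k in range(4)]
    # All four rows of the block are identical.
    return [row[:] for _ in range(4)]
```

The result is a list of four rows, each holding the values a, b, c, d.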

Horizontal

  • H.264: mode 1
  • SVQ3: mode 1
  • RV40: mode 2
  • VP8: not used
    | 
---------------------
 L0 | L0  L0  L0  L0
 L1 | L1  L1  L1  L1
 L2 | L2  L2  L2  L2
 L3 | L3  L3  L3  L3

Horizontal (VP8)

  • H.264: not used
  • SVQ3: not used
  • RV40: not used
  • VP8: mode 3
 LT | 
--------------------
 L0 |  a   a   a   a
 L1 |  b   b   b   b
 L2 |  c   c   c   c
 L3 |  d   d   d   d

where:

 a = (LT + 2*L0 + L1 + 2) >> 2
 b = (L0 + 2*L1 + L2 + 2) >> 2
 c = (L1 + 2*L2 + L3 + 2) >> 2
 d = (L2 + 2*L3 + L3 + 2) >> 2

DC

  • H.264: mode 2
  • SVQ3: mode 2
  • RV40: mode 0
  • VP8: mode 0
    | T0  T1  T2  T3
---------------------
 L0 |  a   a   a   a
 L1 |  a   a   a   a
 L2 |  a   a   a   a
 L3 |  a   a   a   a

where:

if top and left predictors are available
  a = (T0 + T1 + T2 + T3 + L0 + L1 + L2 + L3 + 4) / 8
else if top predictors are available
  a = (T0 + T1 + T2 + T3 + 2) / 4
else if left predictors are available
  a = (L0 + L1 + L2 + L3 + 2) / 4
else
  a = 128

Note that the VP8 reference code does not make any provisions for either or both sets of predictors to be missing.
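
The availability rules above can be sketched like this. The function name `dc_4x4` and the convention of passing `None` for an unavailable edge are assumptions of this example, not something defined by the codecs.

```python
def dc_4x4(top=None, left=None):
    """4x4 DC prediction; pass None for an edge that is not available."""
    if top is not None and left is not None:
        a = (sum(top[:4]) + sum(left[:4]) + 4) // 8
    elif top is not None:
        a = (sum(top[:4]) + 2) // 4
    elif left is not None:
        a = (sum(left[:4]) + 2) // 4
    else:
        a = 128  # no neighbours at all: mid-grey
    return [[a] * 4 for _ in range(4)]
```

Since all inputs are non-negative sample values, floor division and a right shift give the same result here.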

Diagonal Down/Left

  • H.264: mode 3
  • SVQ3: not used
  • RV40: not used
  • VP8: mode 4
    | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
    |  a   b   c   d
    |  b   c   d   e
    |  c   d   e   f
    |  d   e   f   g

where:

 a = (T0 + 2*T1 + T2 + 2) / 4
 b = (T1 + 2*T2 + T3 + 2) / 4
 c = (T2 + 2*T3 + T4 + 2) / 4
 d = (T3 + 2*T4 + T5 + 2) / 4
 e = (T4 + 2*T5 + T6 + 2) / 4
 f = (T5 + 2*T6 + T7 + 2) / 4
 g = (T6 + 3*T7      + 2) / 4
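
A minimal sketch of the H.264/VP8 variant above, assuming `top` holds T0..T7. The name `diag_down_left_4x4` is hypothetical. Repeating T7 once at the end of the edge lets the same 1-2-1 filter produce g, since T6 + 2*T7 + T7 equals T6 + 3*T7.

```python
def diag_down_left_4x4(top):
    """Diagonal down/left prediction from the eight top predictors T0..T7."""
    t = list(top[:8])
    t.append(t[7])  # pad with T7 so that g = (T6 + 3*T7 + 2) / 4
    # taps[0..6] correspond to a..g in the formulas above.
    taps = [(t[k] + 2 * t[k + 1] + t[k + 2] + 2) // 4 for k in range(7)]
    # The sample at column x, row y takes the tap with index x + y.
    return [[taps[x + y] for x in range(4)] for y in range(4)]
```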

Diagonal Down/Left (SVQ3)

  • H.264: not used
  • SVQ3: mode 3
  • RV40: not used
  • VP8: not used
    |     T1  T2  T3
---------------------
    |  a   b   c   c
 L1 |  b   c   c   c
 L2 |  c   c   c   c
 L3 |  c   c   c   c

where:

 a = (L1 + T1) / 2
 b = (L2 + T2) / 2
 c = (L3 + T3) / 2

Diagonal Down/Left (RV40)

  • H.264: not used
  • SVQ3: not used
  • RV40: mode 4
  • VP8: not used
    | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
 L0 |  a   b   c   d
 L1 |  b   c   d   e
 L2 |  c   d   e   f
 L3 |  d   e   f   g
 L4 |
 L5 |
 L6 |
 L7 |

where:

 a = (T0 + 2*T1 + T2 + L0 + 2*L1 + L2 + 4) / 8
 b = (T1 + 2*T2 + T3 + L1 + 2*L2 + L3 + 4) / 8
 c = (T2 + 2*T3 + T4 + L2 + 2*L3 + L4 + 4) / 8
 d = (T3 + 2*T4 + T5 + L3 + 2*L4 + L5 + 4) / 8
 e = (T4 + 2*T5 + T6 + L4 + 2*L5 + L6 + 4) / 8
 f = (T5 + 2*T6 + T7 + L5 + 2*L6 + L7 + 4) / 8
 g = (T6 +   T7      + L6 +   L7      + 2) / 4

Diagonal Down/Right

  • H.264: mode 4
  • SVQ3: mode 4
  • RV40: mode 3
  • VP8: mode 5
 LT | T0  T1  T2  T3
---------------------
 L0 |  d   e   f   g
 L1 |  c   d   e   f
 L2 |  b   c   d   e
 L3 |  a   b   c   d

where:

 a = (L3 + 2*L2 + L1 + 2) / 4
 b = (L2 + 2*L1 + L0 + 2) / 4
 c = (L1 + 2*L0 + LT + 2) / 4
 d = (L0 + 2*LT + T0 + 2) / 4
 e = (LT + 2*T0 + T1 + 2) / 4
 f = (T0 + 2*T1 + T2 + 2) / 4
 g = (T1 + 2*T2 + T3 + 2) / 4
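
The seven values above are all the same 1-2-1 filter sliding along the edge L3, L2, L1, L0, LT, T0, .., T3. A sketch under that reading follows; the name `diag_down_right_4x4` and the argument order are assumptions of this example, with `left` holding L0..L3 and `top` holding T0..T3.

```python
def diag_down_right_4x4(lt, top, left):
    """Diagonal down/right prediction from LT, T0..T3 and L0..L3."""
    # Walk the edge from the bottom-left corner up to LT, then to the right.
    edge = [left[3], left[2], left[1], left[0], lt] + list(top[:4])
    # taps[0..6] correspond to a..g in the formulas above.
    taps = [(edge[k] + 2 * edge[k + 1] + edge[k + 2] + 2) // 4 for k in range(7)]
    # Samples on the main diagonal all use d (index 3).
    return [[taps[3 + x - y] for x in range(4)] for y in range(4)]
```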

Vertical/Right

  • H.264: mode 5
  • SVQ3: mode 5
  • RV40: mode 5
  • VP8: mode 6
 LT | T0  T1  T2  T3
---------------------
 L0 |  a   b   c   d
 L1 |  e   f   g   h
 L2 |  i   a   b   c
    |  j   e   f   g

where:

 a = (LT + T0 + 1) / 2
 b = (T0 + T1 + 1) / 2
 c = (T1 + T2 + 1) / 2
 d = (T2 + T3 + 1) / 2
 e = (L0 + 2*LT + T0 + 2) / 4
 f = (LT + 2*T0 + T1 + 2) / 4
 g = (T0 + 2*T1 + T2 + 2) / 4
 h = (T1 + 2*T2 + T3 + 2) / 4
 i = (LT + 2*L0 + L1 + 2) / 4
 j = (L0 + 2*L1 + L2 + 2) / 4

Horizontal/Down

  • H.264: mode 6
  • SVQ3: mode 6
  • RV40: mode 8
  • VP8: mode 8
 LT | T0  T1  T2  
---------------------
 L0 |  a   b   c   d
 L1 |  e   f   a   b
 L2 |  g   h   e   f
 L3 |  i   j   g   h

where:

 a = (LT + L0 + 1) / 2
 b = (L0 + 2*LT + T0 + 2) / 4
 c = (LT + 2*T0 + T1 + 2) / 4
 d = (T0 + 2*T1 + T2 + 2) / 4
 e = (L0 + L1 + 1) / 2
 f = (LT + 2*L0 + L1 + 2) / 4
 g = (L1 + L2 + 1) / 2
 h = (L0 + 2*L1 + L2 + 2) / 4
 i = (L2 + L3 + 1) / 2
 j = (L1 + 2*L2 + L3 + 2) / 4

Vertical/Left

  • H.264: mode 7
  • SVQ3: mode 7
  • RV40: mode 6
  • VP8: mode 7
    | T0  T1  T2  T3  T4  T5  T6 
---------------------------------
    |  a   b   c   d
 L1 |  f   g   h   i
 L2 |  b   c   d   e
 L3 |  g   h   i   j
 L4 |

where:

 a = (T0 + T1 + 1) / 2
 b = (T1 + T2 + 1) / 2
 c = (T2 + T3 + 1) / 2
 d = (T3 + T4 + 1) / 2
 e = (T4 + T5 + 1) / 2
 f = (T0 + 2*T1 + T2 + 2) / 4
 g = (T1 + 2*T2 + T3 + 2) / 4
 h = (T2 + 2*T3 + T4 + 2) / 4
 i = (T3 + 2*T4 + T5 + 2) / 4
 j = (T4 + 2*T5 + T6 + 2) / 4

For RV40, two coefficients differ:

 a = (2*T0 + 2*T1 + L1 + 2*L2 +   L3 +      4) / 8
 f = (  T0 + 2*T1 + T2 +   L2 + 2*L3 + L4 + 4) / 8

For VP8, 3 coefficients differ:

 c = (T2 + T3 + T4 + 2) / 4
 e = (T4 + 2*T5 + T6 + 2) / 4
 j = (T5 + 2*T6 + T7 + 2) / 4

Horizontal/Up

  • H.264: mode 8
  • SVQ3: mode 8
  • RV40: not used
  • VP8: mode 9
    | 
---------------------
 L0 |  a   b   c   d
 L1 |  c   d   e   f
 L2 |  e   f   g   g
 L3 |  g   g   g   g

where:

 a = (L0 + L1 + 1) / 2
 b = (L0 + 2*L1 + L2 + 2) / 4
 c = (L1 + L2 + 1) / 2
 d = (L1 + 2*L2 + L3 + 2) / 4
 e = (L2 + L3 + 1) / 2
 f = (L2 + 3*L3      + 2) / 4
 g = L3
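
The H.264/SVQ3/VP8 form above can be written out directly. The sketch below is only a transcription of the seven formulas and the grid; the name `horizontal_up_4x4` is hypothetical, and `left` is assumed to hold L0..L3.

```python
def horizontal_up_4x4(left):
    """Horizontal/up prediction from the four left predictors L0..L3."""
    l0, l1, l2, l3 = left[:4]
    a = (l0 + l1 + 1) // 2
    b = (l0 + 2 * l1 + l2 + 2) // 4
    c = (l1 + l2 + 1) // 2
    d = (l1 + 2 * l2 + l3 + 2) // 4
    e = (l2 + l3 + 1) // 2
    f = (l2 + 3 * l3 + 2) // 4
    g = l3  # everything past the last left predictor repeats L3
    return [[a, b, c, d],
            [c, d, e, f],
            [e, f, g, g],
            [g, g, g, g]]
```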

Horizontal/Up (RV40)

  • H.264: not used
  • SVQ3: not used
  • RV40: mode 7
  • VP8: not used
    | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
 L0 |  a   b   c   d
 L1 |  c   d   e   f
 L2 |  e   f   g   h
 L3 |  g   h   i   j
 L4 |
 L5 |
 L6 |

where:

 a = (T1 + 2*T2 + T3 + 2*L0 + 2*L1 +      4) / 8
 b = (T2 + 2*T3 + T4 +   L0 + 2*L1 + L2 + 4) / 8
 c = (T3 + 2*T4 + T5 + 2*L1 + 2*L2 +      4) / 8
 d = (T4 + 2*T5 + T6 +   L1 + 2*L2 + L3 + 4) / 8
 e = (T5 + 2*T6 + T7 + 2*L2 + 2*L3 +      4) / 8
 f = (T6 + 3*T7 +        L2 + 3*L3 +      4) / 8
 g = (T6 +   T7 +        L3 +   L4      + 2) / 4
 h = (                   L3 + 2*L4 + L5 + 2) / 4
 i = (                   L4 +   L5      + 1) / 2
 j = (                   L4 + 2*L5 + L6 + 2) / 4

TrueMotion (VP8)

  • H.264: not used
  • SVQ3: not used
  • RV40: not used
  • VP8: mode 1
 LT | T0  T1  T2  T3
---------------------
 L0 |  a   .   .   .
 L1 |  .   b   .   .
 L2 |  .   .   c   .
 L3 |  .   .   .   d

where the samples on the diagonal are computed as follows, and all other samples follow the same pattern:

 a = SATURATE_U8(T0 - LT + L0)
 b = SATURATE_U8(T1 - LT + L1)
 c = SATURATE_U8(T2 - LT + L2)
 d = SATURATE_U8(T3 - LT + L3)

I.e., for each of the 16 samples: (top predictor for the column) - (left-top corner predictor) + (left predictor for the row), saturated to the unsigned byte range 0..255.
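
As a sketch, the rule for all 16 samples looks like this. The helper `saturate_u8` mirrors the SATURATE_U8() notation used on this page; the function name `truemotion_4x4` and the argument order are assumptions of this example.

```python
def saturate_u8(v):
    """Clamp v to the unsigned 8-bit range 0..255."""
    return max(0, min(255, v))

def truemotion_4x4(lt, top, left):
    """TrueMotion prediction: top[x] - LT + left[y] for every sample."""
    return [[saturate_u8(top[x] - lt + left[y]) for x in range(4)]
            for y in range(4)]
```

The same expression with ranges of 16 instead of 4 would cover the 16x16 TrueMotion mode.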


16x16 Prediction Modes

DC

  • H.264: mode 0
  • SVQ3: mode 0
  • RV40: mode 0
  • VP8: mode 0

Using the 16 top predictors (T0..T15) and the 16 left predictors (L0..L15), set all 256 elements to the mean, computed as:

if top and left predictors are available
  mean = (sum(T0..T15) + sum(L0..L15) + 16) / 32
else if top predictors are available
  mean = (sum(T0..T15) + 8) / 16
else if left predictors are available
  mean = (sum(L0..L15) + 8) / 16
else
  mean = 128

Vertical

  • H.264: mode 1
  • SVQ3: mode 1
  • RV40: mode 1
  • VP8: mode 1
     | T0  T1  T2  T3  T4  ..  T15
-------------------------- .. -----
     | T0  T1  T2  T3  T4  ..  T15
     | T0  T1  T2  T3  T4  ..  T15
     | T0  T1  T2  T3  T4  ..  T15
 ......
     | T0  T1  T2  T3  T4  ..  T15

Horizontal

  • H.264: mode 2
  • SVQ3: mode 2
  • RV40: mode 2
  • VP8: mode 2
     |
--------------------------- .. -----
  L0 |  L0  L0  L0  L0  L0  ..   L0
  L1 |  L1  L1  L1  L1  L1  ..   L1
  L2 |  L2  L2  L2  L2  L2  ..   L2
 ......
 L15 | L15 L15 L15 L15 L15  ..  L15

Plane

  • H.264: mode 3
  • SVQ3: mode 3
  • RV40: mode 3
  • VP8: not used

Note that SVQ3 and RV40 derive H and V slightly differently from H.264.

Given the top predictors (T0..T15), left predictors (L0..L15) and the left-top corner predictor (LT) arranged as follows:

  LT |    T0        T1        T2     ..    T15
-----------------------------------  ..  --------
  L0 | c[ 0, 0]  c[ 1, 0]  c[ 2, 0]  ..  c[15, 0]
  L1 | c[ 0, 1]  c[ 1, 1]  c[ 2, 1]  ..  c[15, 1]
 ......
 L15 | c[ 0,15]  c[ 1,15]  c[ 2,15]  ..  c[15,15]

Compute H' and V':

H' = 1* (T8 - T6) +
     2* (T9 - T5) +
     3*(T10 - T4) +
     4*(T11 - T3) +
     5*(T12 - T2) +
     6*(T13 - T1) +
     7*(T14 - T0) +
     8*(T15 - LT)
V' = 1* (L8 - L6) +
     2* (L9 - L5) +
     3*(L10 - L4) +
     4*(L11 - L3) +
     5*(L12 - L2) +
     6*(L13 - L1) +
     7*(L14 - L0) +
     8*(L15 - LT)

For H.264, compute H and V as:

 H = (5*H' + 32) / 64
 V = (5*V' + 32) / 64

For SVQ3, compute H and V as:

 V = (5*(H'/4)) / 16
 H = (5*(V'/4)) / 16 
 (notice that V and H are computed from H' and V', respectively)

For RV40, compute H and V as:

 H = (5*(H' >> 2)) >> 4
 V = (5*(V' >> 2)) >> 4 
 (like SVQ3 but without swapping; it is important to use arithmetic right shifts here rather than divisions, because the two round negative values differently)

The final process for filling in the 16x16 block is:

 a = 16 * (L15 + T15 + 1) - 7*(V+H)
 for (j = 0..15)
   for (i = 0..15)
     b = a + V * j + H * i
     c[i,j] = SATURATE_U8(b / 32)

The SATURATE_U8() function indicates that the result of the operation should be bounded to an unsigned 8-bit range (0..255).
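
The whole plane procedure, including the three ways of deriving H and V, can be sketched as below. Several points are assumptions of this example rather than statements from the text above: the H.264 `/ 64` and the final `/ 32` are implemented as arithmetic right shifts, the SVQ3 divisions are taken to truncate toward zero as C integer division does, and the names `plane_16x16`, `trunc_div` and the `codec` argument are hypothetical.

```python
def saturate_u8(v):
    """Clamp v to the unsigned 8-bit range 0..255."""
    return max(0, min(255, v))

def trunc_div(n, d):
    """C-style integer division (truncates toward zero); d must be positive."""
    return -((-n) // d) if n < 0 else n // d

def plane_16x16(lt, top, left, codec="h264"):
    """Return block[j][i] == c[i,j] for the 16x16 plane mode."""
    t = [lt] + list(top[:16])   # t[0] is LT, t[k + 1] is Tk
    l = [lt] + list(left[:16])  # l[0] is LT, l[k + 1] is Lk
    # H' and V': weighted differences around the centre of each edge.
    hp = sum(k * (t[8 + k] - t[8 - k]) for k in range(1, 9))
    vp = sum(k * (l[8 + k] - l[8 - k]) for k in range(1, 9))
    if codec == "h264":
        h = (5 * hp + 32) >> 6  # assumed: "/ 64" means a right shift
        v = (5 * vp + 32) >> 6
    elif codec == "svq3":
        # SVQ3 swaps the roles: V comes from H' and H from V'.
        v = trunc_div(5 * trunc_div(hp, 4), 16)
        h = trunc_div(5 * trunc_div(vp, 4), 16)
    elif codec == "rv40":
        h = (5 * (hp >> 2)) >> 4
        v = (5 * (vp >> 2)) >> 4
    else:
        raise ValueError("unknown codec: %r" % (codec,))
    a = 16 * (left[15] + top[15] + 1) - 7 * (v + h)
    # assumed: the final "/ 32" is a right shift by 5
    return [[saturate_u8((a + v * j + h * i) >> 5) for i in range(16)]
            for j in range(16)]
```

Python's `>>` on negative integers rounds toward negative infinity, which is the arithmetic-shift behaviour the RV40 formulas rely on, while `trunc_div` reproduces truncating division for the SVQ3 case.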

TrueMotion (VP8)

  • H.264: not used
  • SVQ3: not used
  • RV40: not used
  • VP8: mode 3

Given the top predictors (T0..T15), left predictors (L0..L15) and the left-top corner predictor (LT) arranged as follows:

  LT |    T0        T1        T2     ..    T15
-----------------------------------  ..  --------
  L0 | c[ 0, 0]  c[ 1, 0]  c[ 2, 0]  ..  c[15, 0]
  L1 | c[ 0, 1]  c[ 1, 1]  c[ 2, 1]  ..  c[15, 1]
 ......
 L15 | c[ 0,15]  c[ 1,15]  c[ 2,15]  ..  c[15,15]

Compute c[t,l] as:

 c[t,l] = SATURATE_U8(L[l] + T[t] - LT)

I.e., for each element, sum the left and top predictors for the row and column, respectively, and subtract the left-top predictor. Then, saturate the result between 0 and 255.