# H.264 Prediction

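This page documents the intra prediction methods used in H.264 and in related formats such as Sorenson Video 3 (SVQ3) and RealVideo 4 (RV40). In the diagrams, T0..Tn are the predictors above the block, L0..Ln are the predictors to its left, and LT is the top-left corner predictor.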

## 4x4 Prediction Modes

### Vertical

• H.264: mode 0
• SVQ3: mode 0
• RV40: mode 1
```
LT | T0  T1  T2  T3
---------------------
L0 | T0  T1  T2  T3
L1 | T0  T1  T2  T3
L2 | T0  T1  T2  T3
L3 | T0  T1  T2  T3
```

### Horizontal

• H.264: mode 1
• SVQ3: mode 1
• RV40: mode 2
```
LT | T0  T1  T2  T3
---------------------
L0 | L0  L0  L0  L0
L1 | L1  L1  L1  L1
L2 | L2  L2  L2  L2
L3 | L3  L3  L3  L3
```

### DC

• H.264: mode 2
• SVQ3: mode 2
• RV40: mode 0
```
LT | T0  T1  T2  T3
---------------------
L0 |  a   a   a   a
L1 |  a   a   a   a
L2 |  a   a   a   a
L3 |  a   a   a   a
```

where:

```
a = (T0 + T1 + T2 + T3 + L0 + L1 + L2 + L3 + 4) / 8
```
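The three simplest 4x4 modes (Vertical, Horizontal and DC) can be sketched in Python as follows. This is only an illustration of the diagrams above, not code from any decoder: the function names, the use of lists `t` and `l` for the top and left predictors, and the row-major list-of-rows result are all assumptions made for the example.

```python
def pred4x4_vertical(t):
    """Every row repeats the top predictors T0..T3."""
    return [list(t[:4]) for _ in range(4)]


def pred4x4_horizontal(l):
    """Row y is filled with the left predictor Ly."""
    return [[l[y]] * 4 for y in range(4)]


def pred4x4_dc(t, l):
    """All 16 samples take the rounded mean of T0..T3 and L0..L3."""
    a = (sum(t[:4]) + sum(l[:4]) + 4) // 8
    return [[a] * 4 for _ in range(4)]
```

The `+ 4` before the division by 8 rounds the mean to the nearest integer instead of truncating it.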

### Diagonal Down/Left

• H.264: mode 3
• SVQ3: not used
• RV40: not used
```
LT | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
L0 |  a   b   c   d
L1 |  b   c   d   e
L2 |  c   d   e   f
L3 |  d   e   f   g
```

where:

```
a = (T0 + 2*T1 + T2 + 2) / 4
b = (T1 + 2*T2 + T3 + 2) / 4
c = (T2 + 2*T3 + T4 + 2) / 4
d = (T3 + 2*T4 + T5 + 2) / 4
e = (T4 + 2*T5 + T6 + 2) / 4
f = (T5 + 2*T6 + T7 + 2) / 4
g = (T6 + 3*T7      + 2) / 4
```
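Because the value at column x, row y depends only on x + y, the H.264 Diagonal Down/Left mode reduces to one filtered array indexed by anti-diagonal. The sketch below assumes a hypothetical function name and a list `t` holding T0..T7; it pads the list with a copy of T7 so that the last entry, `(T6 + 3*T7 + 2) / 4`, falls out of the same three-tap filter.

```python
def pred4x4_diag_down_left(t):
    """t holds the eight top predictors T0..T7."""
    # Pad with a copy of T7 so the last tap needs no special case.
    p = list(t[:8]) + [t[7]]
    # v[k] is the value on anti-diagonal k = x + y (a..g in the text).
    v = [(p[k] + 2 * p[k + 1] + p[k + 2] + 2) // 4 for k in range(7)]
    return [[v[x + y] for x in range(4)] for y in range(4)]
```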

### Diagonal Down/Left (SVQ3)

• H.264: not used
• SVQ3: mode 3
• RV40: not used
```
LT | T0  T1  T2  T3
---------------------
L0 |  a   b   c   c
L1 |  b   c   c   c
L2 |  c   c   c   c
L3 |  c   c   c   c
```

where:

```
a = (L1 + T1) / 2
b = (L2 + T2) / 2
c = (L3 + T3) / 2
```

### Diagonal Down/Left (RV40)

• H.264: not used
• SVQ3: not used
• RV40: mode 4

```
LT | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
L0 |  a   b   c   d
L1 |  b   c   d   e
L2 |  c   d   e   f
L3 |  d   e   f   g
L4 |
L5 |
L6 |
L7 |
```

where:

```
a = (T0 + 2*T1 + T2 + L0 + 2*L1 + L2 + 4) / 8
b = (T1 + 2*T2 + T3 + L1 + 2*L2 + L3 + 4) / 8
c = (T2 + 2*T3 + T4 + L2 + 2*L3 + L4 + 4) / 8
d = (T3 + 2*T4 + T5 + L3 + 2*L4 + L5 + 4) / 8
e = (T4 + 2*T5 + T6 + L4 + 2*L5 + L6 + 4) / 8
f = (T5 + 2*T6 + T7 + L5 + 2*L6 + L7 + 4) / 8
g = (T6 +   T7      + L6 +   L7      + 2) / 4
```
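The formulas that combine top and left predictors (the variant using L0..L7 as well as T0..T7) follow the same anti-diagonal layout, with each value averaging a three-tap filter over the top row and the matching filter over the left column. A minimal sketch, assuming a hypothetical function name and lists `t` and `l` of eight predictors each:

```python
def pred4x4_diag_down_left_rv40(t, l):
    """t = T0..T7, l = L0..L7."""
    # a..f: sum of the top and left three-tap filters, rounded and divided by 8.
    v = [(t[k] + 2 * t[k + 1] + t[k + 2]
          + l[k] + 2 * l[k + 1] + l[k + 2] + 4) // 8 for k in range(6)]
    # g: only two samples remain on each edge, so the divisor drops to 4.
    v.append((t[6] + t[7] + l[6] + l[7] + 2) // 4)
    return [[v[x + y] for x in range(4)] for y in range(4)]
```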

### Diagonal Down/Right

• H.264: mode 4
• SVQ3: mode 4
• RV40: mode 3
```
LT | T0  T1  T2  T3
---------------------
L0 |  d   e   f   g
L1 |  c   d   e   f
L2 |  b   c   d   e
L3 |  a   b   c   d
```

where:

```
a = (L3 + 2*L2 + L1 + 2) / 4
b = (L2 + 2*L1 + L0 + 2) / 4
c = (L1 + 2*L0 + LT + 2) / 4
d = (L0 + 2*LT + T0 + 2) / 4
e = (LT + 2*T0 + T1 + 2) / 4
f = (T0 + 2*T1 + T2 + 2) / 4
g = (T1 + 2*T2 + T3 + 2) / 4
```
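One way to read the Diagonal Down/Right formulas is as a single three-tap filter run along the edge that goes up the left column, through the corner, and across the top row. The sketch below assumes that reading; the function name and the argument layout (`t` = T0..T3, `l` = L0..L3, `lt` = LT) are hypothetical.

```python
def pred4x4_diag_down_right(t, l, lt):
    """t = T0..T3, l = L0..L3, lt = corner predictor LT."""
    # Edge samples ordered from bottom-left to top-right.
    edge = [l[3], l[2], l[1], l[0], lt, t[0], t[1], t[2], t[3]]
    # v[0..6] correspond to a..g in the text.
    v = [(edge[k] + 2 * edge[k + 1] + edge[k + 2] + 2) // 4 for k in range(7)]
    # The value is constant along each diagonal x - y; d sits on the main one.
    return [[v[3 + x - y] for x in range(4)] for y in range(4)]
```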

### Vertical/Right

• H.264: mode 5
• SVQ3: mode 5
• RV40: mode 5
```
LT | T0  T1  T2  T3
---------------------
L0 |  a   b   c   d
L1 |  e   f   g   h
L2 |  i   a   b   c
L3 |  j   e   f   g
```

where:

```
a = (LT + T0 + 1) / 2
b = (T0 + T1 + 1) / 2
c = (T1 + T2 + 1) / 2
d = (T2 + T3 + 1) / 2
e = (L0 + 2*LT + T0 + 2) / 4
f = (LT + 2*T0 + T1 + 2) / 4
g = (T0 + 2*T1 + T2 + 2) / 4
h = (T1 + 2*T2 + T3 + 2) / 4
i = (LT + 2*L0 + L1 + 2) / 4
j = (L0 + 2*L1 + L2 + 2) / 4
```
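A direct transcription of the Vertical/Right table into Python might look like this. It is a sketch for illustration only; the function name and argument layout (`t` = T0..T3, `l` = L0..L3, `lt` = LT) are assumptions.

```python
def pred4x4_vertical_right(t, l, lt):
    """t = T0..T3, l = L0..L3, lt = corner predictor LT."""
    # Two-tap averages used on the even rows.
    a = (lt + t[0] + 1) // 2
    b = (t[0] + t[1] + 1) // 2
    c = (t[1] + t[2] + 1) // 2
    d = (t[2] + t[3] + 1) // 2
    # Three-tap filters used on the odd rows and in the first column.
    e = (l[0] + 2 * lt + t[0] + 2) // 4
    f = (lt + 2 * t[0] + t[1] + 2) // 4
    g = (t[0] + 2 * t[1] + t[2] + 2) // 4
    h = (t[1] + 2 * t[2] + t[3] + 2) // 4
    i = (lt + 2 * l[0] + l[1] + 2) // 4
    j = (l[0] + 2 * l[1] + l[2] + 2) // 4
    return [[a, b, c, d],
            [e, f, g, h],
            [i, a, b, c],
            [j, e, f, g]]
```

Rows 2 and 3 repeat rows 0 and 1 shifted one sample to the right, which is what gives the mode its slanted appearance.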

### Horizontal/Down

• H.264: mode 6
• SVQ3: mode 6
• RV40: mode 8
```
LT | T0  T1  T2  T3
---------------------
L0 |  a   b   c   d
L1 |  e   f   a   b
L2 |  g   h   e   f
L3 |  i   j   g   h
```

where:

```
a = (LT + L0 + 1) / 2
b = (L0 + 2*LT + T0 + 2) / 4
c = (LT + 2*T0 + T1 + 2) / 4
d = (T0 + 2*T1 + T2 + 2) / 4
e = (L0 + L1 + 1) / 2
f = (LT + 2*L0 + L1 + 2) / 4
g = (L1 + L2 + 1) / 2
h = (L0 + 2*L1 + L2 + 2) / 4
i = (L2 + L3 + 1) / 2
j = (L1 + 2*L2 + L3 + 2) / 4
```
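The Horizontal/Down table can be transcribed the same way. In this sketch the bottom-left sample is named `i` and is the rounded average of L2 and L3, as the diagram's last row requires; the function name and argument layout are assumptions made for the example.

```python
def pred4x4_horizontal_down(t, l, lt):
    """t = T0..T3 (only T0..T2 are read), l = L0..L3, lt = LT."""
    a = (lt + l[0] + 1) // 2
    b = (l[0] + 2 * lt + t[0] + 2) // 4
    c = (lt + 2 * t[0] + t[1] + 2) // 4
    d = (t[0] + 2 * t[1] + t[2] + 2) // 4
    e = (l[0] + l[1] + 1) // 2
    f = (lt + 2 * l[0] + l[1] + 2) // 4
    g = (l[1] + l[2] + 1) // 2
    h = (l[0] + 2 * l[1] + l[2] + 2) // 4
    i = (l[2] + l[3] + 1) // 2
    j = (l[1] + 2 * l[2] + l[3] + 2) // 4
    # Each row repeats the row above it, shifted two samples to the right.
    return [[a, b, c, d],
            [e, f, a, b],
            [g, h, e, f],
            [i, j, g, h]]
```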

### Vertical/Left

• H.264: mode 7
• SVQ3: mode 7
• RV40: mode 6
```
LT | T0  T1  T2  T3  T4  T5  T6  T7
-------------------------------------
L0 |  a   b   c   d
L1 |  f   g   h   i
L2 |  b   c   d   e
L3 |  g   h   i   j
```

where:

```
a = (T0 + T1 + 1) / 2
b = (T1 + T2 + 1) / 2
c = (T2 + T3 + 1) / 2
d = (T3 + T4 + 1) / 2
e = (T4 + T5 + 1) / 2
f = (T0 + 2*T1 + T2 + 2) / 4
g = (T1 + 2*T2 + T3 + 2) / 4
h = (T2 + 2*T3 + T4 + 2) / 4
i = (T3 + 2*T4 + T5 + 2) / 4
j = (T4 + 2*T5 + T6 + 2) / 4
```

For RV40, two of these values (a and f) are computed differently and also use the left predictors L1..L4:

```
a = (2*T0 + 2*T1 + L1 + 2*L2 +   L3 +      4) / 8
f = (  T0 + 2*T1 + T2 +   L2 + 2*L3 + L4 + 4) / 8
```
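Both Vertical/Left variants can share one routine, since the RV40 form only overrides the values a and f. The following is a minimal sketch under that assumption; the function name, the `rv40` flag and the argument layout are hypothetical.

```python
def pred4x4_vertical_left(t, l=None, rv40=False):
    """t = T0..T7 (T0..T6 are read); l = L0..L4, read only when rv40 is True."""
    # a..e: two-tap averages of neighbouring top predictors.
    half = [(t[k] + t[k + 1] + 1) // 2 for k in range(5)]
    # f..j: three-tap filters over the top predictors.
    quarter = [(t[k] + 2 * t[k + 1] + t[k + 2] + 2) // 4 for k in range(5)]
    if rv40:
        # RV40 mixes left predictors into a and f only.
        half[0] = (2 * t[0] + 2 * t[1] + l[1] + 2 * l[2] + l[3] + 4) // 8
        quarter[0] = (t[0] + 2 * t[1] + t[2] + l[2] + 2 * l[3] + l[4] + 4) // 8
    # Rows: a b c d / f g h i / b c d e / g h i j.
    return [half[0:4], quarter[0:4], half[1:5], quarter[1:5]]
```

Since a and f each appear only once in the table (at the start of rows 0 and 1), the RV40 override changes exactly two samples of the block.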

### Horizontal/Up

• H.264: mode 8
• SVQ3: mode 8
• RV40: not used
```
LT | T0  T1  T2  T3
---------------------
L0 |  a   b   c   d
L1 |  c   d   e   f
L2 |  e   f   g   g
L3 |  g   g   g   g
```

where:

```
a = (L0 + L1 + 1) / 2
b = (L0 + 2*L1 + L2 + 2) / 4
c = (L1 + L2 + 1) / 2
d = (L1 + 2*L2 + L3 + 2) / 4
e = (L2 + L3 + 1) / 2
f = (L2 + 3*L3      + 2) / 4
g = L3
```
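Horizontal/Up reads only the left predictors, and everything past the end of the left column collapses to L3. A short sketch, with a hypothetical function name and `l` = L0..L3:

```python
def pred4x4_horizontal_up(l):
    """l = L0..L3; the top predictors are not used by this mode."""
    a = (l[0] + l[1] + 1) // 2
    b = (l[0] + 2 * l[1] + l[2] + 2) // 4
    c = (l[1] + l[2] + 1) // 2
    d = (l[1] + 2 * l[2] + l[3] + 2) // 4
    e = (l[2] + l[3] + 1) // 2
    # L3 is counted three times because no L4 is available.
    f = (l[2] + 3 * l[3] + 2) // 4
    g = l[3]
    return [[a, b, c, d],
            [c, d, e, f],
            [e, f, g, g],
            [g, g, g, g]]
```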

### Left/DC

• H.264: mode 9
• SVQ3: mode 9
• RV40: not used
```
LT | T0  T1  T2  T3
---------------------
L0 |  a   a   a   a
L1 |  a   a   a   a
L2 |  a   a   a   a
L3 |  a   a   a   a
```

where:

```
a = (L0 + L1 + L2 + L3 + 2) / 4
```

### Top/DC

• H.264: mode 10
• SVQ3: mode 10
• RV40: not used
```
LT | T0  T1  T2  T3
---------------------
L0 |  a   a   a   a
L1 |  a   a   a   a
L2 |  a   a   a   a
L3 |  a   a   a   a
```

where:

```
a = (T0 + T1 + T2 + T3 + 2) / 4
```

### DC-128

• H.264: mode 11
• SVQ3: mode 11
• RV40: not used
```
LT |  T0   T1   T2   T3
------------------------
L0 | 128  128  128  128
L1 | 128  128  128  128
L2 | 128  128  128  128
L3 | 128  128  128  128
```
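The four DC variants (DC, Left/DC, Top/DC and DC-128) differ only in which predictors enter the mean. The sketch below assumes they are chosen according to which neighbouring edges are available, which is how H.264 decoders commonly use them; this page does not state the selection rule itself, and the function name `dc_value_4x4` and the `None`-for-unavailable convention are hypothetical.

```python
def dc_value_4x4(t=None, l=None):
    """Return the DC fill value; pass None for an edge that is unavailable."""
    if t is not None and l is not None:
        return (sum(t[:4]) + sum(l[:4]) + 4) // 8   # DC
    if l is not None:
        return (sum(l[:4]) + 2) // 4                # Left/DC
    if t is not None:
        return (sum(t[:4]) + 2) // 4                # Top/DC
    return 128                                      # DC-128: mid-grey for 8-bit
```

For example, `dc_value_4x4(l=[5, 6, 7, 8])` uses only the left column, matching the Left/DC formula.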

## 16x16 Prediction Modes

### DC

• H.264: mode 0
• SVQ3: mode 0
• RV40: mode 0

Using the 16 top predictors (T0..T15) and the 16 left predictors (L0..L15), set all 256 elements to the mean, computed as:

```
mean = (sum(T0..T15) + sum(L0..L15) + 16) / 32
```

### Vertical

• H.264: mode 1
• SVQ3: mode 1
• RV40: mode 1
```
 LT | T0  T1  T2  T3  T4  ..  T15
------------------------- .. -----
 L0 | T0  T1  T2  T3  T4  ..  T15
 L1 | T0  T1  T2  T3  T4  ..  T15
 L2 | T0  T1  T2  T3  T4  ..  T15
......
L15 | T0  T1  T2  T3  T4  ..  T15
```

### Horizontal

• H.264: mode 2
• SVQ3: mode 2
• RV40: mode 2
```
 LT |  T0  T1  T2  T3  T4  ..  T15
--------------------------- .. -----
 L0 |  L0  L0  L0  L0  L0  ..   L0
 L1 |  L1  L1  L1  L1  L1  ..   L1
 L2 |  L2  L2  L2  L2  L2  ..   L2
......
L15 | L15 L15 L15 L15 L15  ..  L15
```

### Plane

• H.264: mode 3
• SVQ3: mode 3
• RV40: mode 3

Note that SVQ3 uses a slightly different scaling step for this mode, described below. RV40 probably differs as well, so the description here should be regarded as unfinished for RV40.

Given the top predictors (T0..T15), left predictors (L0..L15) and the left-top corner predictor (LT) arranged as follows:

```
 LT |   T0    T1    T2  ..  T15
------------------------ .. -----
 L0 |  c0,0  c1,0  c2,0 .. c15,0
 L1 |  c0,1  c1,1  c2,1 .. c15,1
......
L15 | c0,15 c1,15 c2,15 .. c15,15
```

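Compute the gradients H and V as weighted sums of differences taken symmetrically around the centre of the top row and of the left column. The weights run from 1 for the innermost pair to 8 for the outermost pair:

```
H = 1*(T8  - T6) +
    2*(T9  - T5) +
    3*(T10 - T4) +
    4*(T11 - T3) +
    5*(T12 - T2) +
    6*(T13 - T1) +
    7*(T14 - T0) +
    8*(T15 - LT)
```
```
V = 1*(L8  - L6) +
    2*(L9  - L5) +
    3*(L10 - L4) +
    4*(L11 - L3) +
    5*(L12 - L2) +
    6*(L13 - L1) +
    7*(L14 - L0) +
    8*(L15 - LT)
```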

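For H.264, scale H and V as follows. H and V can be negative, and the division by 64 here is an arithmetic right shift by 6 (rounding toward negative infinity), not a truncating division:

```
H = (5*H + 32) / 64
V = (5*V + 32) / 64
```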

For SVQ3, scale H and V as follows and then swap them:

```
H = (5*(H/4)) / 16
V = (5*(V/4)) / 16
swap H and V
```

The final process for filling in the 16x16 block, where i is the column and j is the row, is:

```
a = 16 * (L15 + T15 + 1) - 7*(V+H)
for (j = 0..15)
    for (i = 0..15)
        c[i,j] = SATURATE_U8((a + H*i + V*j) / 32)
```

The SATURATE_U8() function indicates that the result of the operation should be bounded to an unsigned 8-bit range (0..255).
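The whole plane mode can be sketched in Python as below. This is an illustration under stated assumptions rather than a reference implementation: the differences are weighted by 1 through 8 as in the H.264 specification's definition of the plane mode, the H.264 scaling uses an arithmetic right shift, and the SVQ3 scaling is assumed to use truncating (C-style) division. All function names and the `svq3` flag are hypothetical.

```python
def clip_u8(x):
    """Bound x to the unsigned 8-bit range 0..255 (SATURATE_U8 in the text)."""
    return max(0, min(255, x))


def trunc_div(n, d):
    """Integer division rounding toward zero, as in C (assumed for SVQ3)."""
    q = abs(n) // d
    return q if n >= 0 else -q


def gradient(edge, corner):
    """Weighted sum of differences around the centre of one edge (H or V)."""
    def p(k):
        return corner if k < 0 else edge[k]   # index -1 is LT
    return sum((k + 1) * (p(8 + k) - p(6 - k)) for k in range(8))


def pred16x16_plane(t, l, lt, svq3=False):
    """t = T0..T15, l = L0..L15, lt = LT; returns 16 rows of 16 samples."""
    h = gradient(t, lt)
    v = gradient(l, lt)
    if svq3:
        h = trunc_div(5 * trunc_div(h, 4), 16)
        v = trunc_div(5 * trunc_div(v, 4), 16)
        h, v = v, h                           # SVQ3 swaps the two gradients
    else:
        h = (5 * h + 32) >> 6                 # arithmetic shift, floors negatives
        v = (5 * v + 32) >> 6
    a = 16 * (l[15] + t[15] + 1) - 7 * (v + h)
    return [[clip_u8((a + h * i + v * j) >> 5) for i in range(16)]
            for j in range(16)]
```

With flat predictors both gradients are zero and the block is filled with the common value. With a top row that ramps from left to right and a flat left column, the H.264 path produces a horizontal ramp, while the SVQ3 path produces a vertical one because of the swap.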

### Left/DC

• H.264: mode 4
• SVQ3: mode 4
• RV40: not used

Using 16 left predictors (L0..L15), set all 256 elements to the mean, computed as:

```
mean = (sum(L0..L15) + 8) / 16
```

### Top/DC

• H.264: mode 5
• SVQ3: mode 5
• RV40: not used

Using the 16 top predictors (T0..T15), set all 256 elements to the mean, computed as:

```
mean = (sum(T0..T15) + 8) / 16
```

### DC-128

• H.264: mode 6
• SVQ3: mode 6
• RV40: not used

Set all 256 elements to 128.