Lightning Strike Video Codec

From MultimediaWiki
* FourCCs: LSVC, LSVM, LSVX
* Company: [[Espre Solutions]]
* Samples: http://samples.mplayerhq.hu/V-codecs/LSV/


The Lightning Strike Video Codec has gone through a few iterations, as indicated by its various FourCCs. The codec is designed for internet teleconferencing applications. It is based on [[H.263]] with the addition of wavelet coding for intra frames.
 
== Frame structure ==
 
Each frame begins with a 5-byte header; the second byte denotes the frame type and the remaining bytes are usually garbage.
 
Known types:
* <code>78 01 yy cc xx</code> -- the frame actually starts at line <code>yy</code> and the rest should be filled with colour <code>cc</code>. The header is followed by 8 additional bytes, usually containing the codec version, e.g. <code>"lsvx2.0"</code>. This is usually the first frame;
* <code>xx 01 xx xx xx</code> -- the same as above but without the start line and fill values;
* <code>xx 05 xx xx xx</code> -- skip frame, no further data is transmitted;
* <code>xx 08 xx xx xx</code> -- probably a wavelet-coded keyframe that occurs once every several seconds;
* <code>xx 09 xx xx xx</code> -- the usual frame.
 
This header is followed by the normal H.263++ picture header with a special exception: picture code 7 means wavelet-coded frame, otherwise it should be a conventional H.263++ frame.
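
The header layout above can be sketched in Python roughly as follows. This is only an illustration: the names <code>FrameHeader</code>, <code>parse_frame_header</code> and the type descriptions are hypothetical, and it assumes that every type <code>01</code> frame carries the 8 extra bytes, which the description above only implies.

```python
from typing import NamedTuple, Optional

# Hypothetical labels for the known values of the second header byte.
FRAME_TYPES = {
    0x01: "frame with extra info",
    0x05: "skip",
    0x08: "wavelet keyframe (probable)",
    0x09: "usual frame",
}

class FrameHeader(NamedTuple):
    frame_type: int
    description: str
    start_line: Optional[int]   # only known for the 78 01 yy cc xx form
    fill_colour: Optional[int]  # only known for the 78 01 yy cc xx form
    extra: bytes                # 8 bytes after the header for type 01
    payload_offset: int         # where the H.263 picture header should start

def parse_frame_header(data: bytes) -> FrameHeader:
    if len(data) < 5:
        raise ValueError("frame header needs 5 bytes")
    frame_type = data[1]
    start_line = fill_colour = None
    extra = b""
    offset = 5
    if frame_type == 0x01:
        # Assumption: both 01 variants are followed by 8 extra bytes.
        if len(data) < 13:
            raise ValueError("type 01 frame needs 8 extra bytes")
        extra = data[5:13]
        offset = 13
        if data[0] == 0x78:
            start_line, fill_colour = data[2], data[3]
    description = FRAME_TYPES.get(frame_type, "unknown")
    return FrameHeader(frame_type, description, start_line,
                       fill_colour, extra, offset)
```

For a skip frame (type <code>05</code>) the returned <code>payload_offset</code> is not meaningful, since no further data is transmitted.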
 
== Wavelet coding ==
Wavelet data begins at a byte-aligned position after the H.263 picture header, with a 24-bit big-endian data size preceding the actual frame data.
 
Then a frame header follows:
  32-bit LE - data size
  1/3 bytes - end depth for each plane (grayscale/YUV420) relative to the maximum one
 
After the header there's data for each plane that should end with <code>'c' 'o' 'd'</code> marker.
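
A minimal Python sketch of reading these size fields, assuming the caller already knows whether the frame is grayscale (1 plane) or YUV420 (3 planes). The function name and the returned tuple are hypothetical, and the meaning of the depth bytes is taken from the description above without further interpretation.

```python
import struct

def parse_wavelet_header(data: bytes, pos: int, num_planes: int):
    """Return (outer_size, inner_size, end_depths, new_pos).

    outer_size: 24-bit big-endian size preceding the frame data
    inner_size: 32-bit little-endian data size from the frame header
    end_depths: one byte per plane, relative to the maximum depth
    """
    if num_planes not in (1, 3):
        raise ValueError("expected 1 (grayscale) or 3 (YUV420) planes")
    needed = 3 + 4 + num_planes
    if len(data) - pos < needed:
        raise ValueError("truncated wavelet header")
    outer_size = int.from_bytes(data[pos:pos + 3], "big")
    pos += 3
    (inner_size,) = struct.unpack_from("<I", data, pos)
    pos += 4
    end_depths = list(data[pos:pos + num_planes])
    pos += num_planes
    return outer_size, inner_size, end_depths, pos
```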
 
Each plane is split down to depth 4, and each band is coded in one of three possible ways using rather simple prediction and a binary coder.
 
The wavelet transform seems to be the usual LGT5/3.
 
=== Plane coding ===
Each plane begins with three 16-bit LE values for each band (there should be depth*3+1 = 13 bands in total): some quantisers and maximum coded symbol (used in high-frequency band coding). Each band is coded with a binary coder and an appropriate model.
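
As a rough illustration, the per-band values could be read like this in Python. It is a sketch under two assumptions that the description does not confirm: that the three values are stored band by band (rather than as three separate arrays), and that nothing else precedes them. The order and meaning of the three fields is left uninterpreted; the function names are hypothetical.

```python
import struct

def band_count(depth: int) -> int:
    # Three detail bands per decomposition level plus the final LL band.
    return depth * 3 + 1

def parse_band_params(data: bytes, pos: int, depth: int = 4):
    """Return (list of 3-tuples of 16-bit LE values, new position)."""
    count = band_count(depth)
    needed = count * 3 * 2
    if len(data) - pos < needed:
        raise ValueError("truncated plane header")
    values = struct.unpack_from("<%dH" % (count * 3), data, pos)
    # Assumption: values are grouped per band, three at a time.
    bands = [values[i * 3:i * 3 + 3] for i in range(count)]
    return bands, pos + needed
```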
 
=== Band models ===
There are three band models that provide states for decoding variable-length integers: one for the LL band and two for the remaining bands, selected depending on whether the default depth is used. The code is decoded in the following way in all cases:
 
  if (coder.decode_bit(model.get_state_nz())) {
      sign = coder.decode_bit(model.get_state_sign());
      large = coder.decode_bit(sign ? model.get_state3() : model.get_state2());
      if (!large) {
          val = sign ? -1 : 1;
      } else {
          idx = 1;
          pfx = 2;
          // each set exponent bit doubles the prefix
          while (coder.decode_bit(model.get_state_exponent(idx))) {
              pfx <<= 1;
              idx += 1;
          }
          mant_state = model.get_state_mantissa(idx);
          val = pfx >> 1;
          mask = val >> 1;
          while (mask) {
              if (coder.decode_bit(mant_state)) { // the same state for every mantissa bit
                  val |= mask;
              }
              mask >>= 1;
          }
          val++;
          if (sign) {
              val = -val;
          }
      }
  } else {
      val = 0;
  }
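
To make the bit layout easier to follow, here is a Python transcription of the same procedure together with a stand-in coder that simply replays a scripted list of bits. <code>ScriptedCoder</code> and <code>DummyModel</code> are hypothetical helpers for illustration only; they ignore the state arguments that the real binary coder would use.

```python
class ScriptedCoder:
    """Stand-in coder: returns pre-recorded bits and ignores the state."""
    def __init__(self, bits):
        self.bits = list(bits)
        self.pos = 0

    def decode_bit(self, state):
        bit = self.bits[self.pos]
        self.pos += 1
        return bit

class DummyModel:
    """Stand-in model: returns labels instead of real bit states."""
    def get_state_nz(self): return "nz"
    def get_state_sign(self): return "sign"
    def get_state2(self): return "state2"
    def get_state3(self): return "state3"
    def get_state_exponent(self, idx): return ("exp", idx)
    def get_state_mantissa(self, idx): return ("mant", idx)

def decode_value(coder, model):
    if not coder.decode_bit(model.get_state_nz()):
        return 0
    sign = coder.decode_bit(model.get_state_sign())
    large = coder.decode_bit(model.get_state3() if sign else model.get_state2())
    if not large:
        return -1 if sign else 1
    idx = 1
    pfx = 2
    # Unary-coded exponent: every 1 bit doubles the prefix.
    while coder.decode_bit(model.get_state_exponent(idx)):
        pfx <<= 1
        idx += 1
    mant_state = model.get_state_mantissa(idx)
    val = pfx >> 1
    mask = val >> 1
    # Mantissa bits below the leading one, all using the same state.
    while mask:
        if coder.decode_bit(mant_state):
            val |= mask
        mask >>= 1
    val += 1
    return -val if sign else val

# Bits 1,0,1,1,1,0,0,1: non-zero, positive, large, exponent "110", mantissa "01".
print(decode_value(ScriptedCoder([1, 0, 1, 1, 1, 0, 0, 1]), DummyModel()))  # 6
```

With this scheme the magnitudes 0 and 1 take one and three bits respectively, magnitude 2 takes four bits, and larger magnitudes use a unary exponent followed by the mantissa bits.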
 
==== Model for LL band ====
This model uses 49 bit states and a prediction for data decoding.
* NZ state - state offset+0
* sign state - state offset+1
* state2 - state offset+2
* state3 - state offset+3
* exponent state N - state 20+N
* mantissa state N - state 20+14+N
 
Band decoding:
    pred = 0;
    model.offset = 0;
    for all values {
        val = decode_value(coder, model);
        pred += val;
        pixel_value = pred;
        if (val < -8)
            model.offset = 16;
        else if (val < 0)
            model.offset = 8;
        else if (val == 0)
            model.offset = 0;
        else if (val <= 8)
            model.offset = 4;
        else
            model.offset = 12;
    }
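
The prediction and context switching can be tried out separately from the binary coder. The Python sketch below takes already decoded differences instead of calling <code>decode_value</code>; the function names are hypothetical, and the returned list of offsets is only there to show which context each value would have been decoded with.

```python
def ll_context_offset(val: int) -> int:
    """State offset selected by the previously decoded difference."""
    if val < -8:
        return 16
    if val < 0:
        return 8
    if val == 0:
        return 0
    if val <= 8:
        return 4
    return 12

def reconstruct_ll(deltas):
    """Return (pixel values, offset in effect while decoding each value)."""
    pred = 0
    offset = 0
    pixels = []
    offsets = []
    for val in deltas:
        offsets.append(offset)
        pred += val          # each coefficient is coded as a running difference
        pixels.append(pred)
        offset = ll_context_offset(val)
    return pixels, offsets

pixels, offsets = reconstruct_ll([5, -1, 0, 20, -9])
print(pixels)   # [5, 4, 4, 24, 15]
print(offsets)  # [0, 4, 8, 0, 12]
```

Since the largest offset is 16 and four states follow each offset, the context states occupy indices 0 to 19, which is consistent with the exponent states starting at 20.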
 
==== Model for normal-case bands ====
The size of this model is dynamic and depends on the number of wavelet levels to decode. Band data is decoded row by row, with an end-of-line bit decoded after each non-zero coefficient. Additionally, different bit contexts of the model are selected for the first few decoded coefficients.
 
==== Model for reduced-case bands ====
This model uses 31 bit states and a prediction for data decoding.
* NZ state - state 0
* sign state - state with index 113 and mps=0 (will not change)
* state2 - state 1
* state3 - state 1
* exponent state N - state N
* mantissa state N - state 14+N
 
Band decoding is done by decoding data in every row until a value equal to the maximum value + 2 is encountered (the maximum value is signalled in the plane data header for each band).
 
=== Binary coder ===
The binary coder resembles CABAC since it also codes single bits using static probabilities and updates the model state after decoding each bit. Additionally, the coder bitstream uses <code>FF</code> as a marker for the end of the stream (and <code>FF 00</code> for transmitting an actual <code>FF</code> value).
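
The escaping rule can be illustrated with a small Python byte reader. This is a sketch, not the codec's actual input routine: the class name is hypothetical, and returning zero bytes once the end marker (or the end of the buffer) is reached is an assumption, since the description does not say what the decoder should read after the marker.

```python
class EscapedByteReader:
    """Reads coder bytes, turning FF 00 into FF and stopping at a bare FF."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0
        self.ended = False

    def next_byte(self) -> int:
        if self.ended or self.pos >= len(self.data):
            self.ended = True
            return 0  # assumption: feed zeros after the end of the stream
        byte = self.data[self.pos]
        if byte != 0xFF:
            self.pos += 1
            return byte
        if self.pos + 1 < len(self.data) and self.data[self.pos + 1] == 0x00:
            self.pos += 2  # FF 00 stands for a literal FF
            return 0xFF
        self.ended = True  # FF followed by anything else ends the stream
        return 0

reader = EscapedByteReader(bytes([0x12, 0xFF, 0x00, 0x34, 0xFF, 0x99]))
print([reader.next_byte() for _ in range(5)])  # [18, 255, 52, 0, 0]
```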
 
==== Initialisation ====
    range = 1 << 16;
    bits  = 8;
    value = 0;
    for (i = 0; i < 4; i++)
        value = (value << 8) | next_byte();
 
==== Decoding a bit ====
This function takes and modifies <code>state_idx</code> (model index) and <code>state_mps</code> (most probable symbol) for decoding one bit:
 
    prob = model_probabilities[state_idx];
    help = range - prob;
    if (help <= (value >> 16)) {
        value -= help << 16;
        range = prob;
        if (help < prob) {
            bit = state_mps;
            state_idx = model_state_mps[state_idx];
        } else {
            bit = 1 - state_mps;
            state_idx = model_state_lps[state_idx];
            state_mps ^= model_mps_switch[state_idx];
        }
    } else if (help & 0x8000) {
        range = help; // assumption: the interval is narrowed on this path as well
        return state_mps;
    } else {
        range = help; // same assumption as above
        if (help < prob) {
            bit = 1 - state_mps;
            state_idx = model_state_lps[state_idx];
            state_mps ^= model_mps_switch[state_idx];
        } else {
            bit = state_mps;
            state_idx = model_state_mps[state_idx];
        }
    }
    renorm();
    return bit;
 
==== Renormalisation ====
    while (range < 0x8000) {
        if (bits == 0) {
            value += next_byte() << 8;
            bits  = 8;
        }
        range <<= 1;
        value <<= 1;
        bits  -= 1;
    }
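
The three routines can be combined into one small Python class for experimenting. This is a transcription sketch, not a verified decoder: the class and helper names are hypothetical, the tables are passed in by the caller, <code>value</code> is assumed to be a 32-bit register (hence the masking), and the sketch sets <code>range = help</code> on the path where nothing is subtracted from <code>value</code>, as a conventional arithmetic decoder would. The MPS switch flag is looked up with the updated state index, exactly as written in the pseudocode.

```python
class BitState:
    def __init__(self, idx=0, mps=0):
        self.idx = idx  # index into the probability/transition tables
        self.mps = mps  # current most probable symbol

class BinaryDecoder:
    def __init__(self, next_byte, probs, next_mps, next_lps, mps_switch):
        self.next_byte = next_byte  # callable returning the next input byte
        self.probs = probs
        self.next_mps = next_mps
        self.next_lps = next_lps
        self.mps_switch = mps_switch
        self.range = 1 << 16
        self.bits = 8
        self.value = 0
        for _ in range(4):
            self.value = (self.value << 8) | self.next_byte()

    def _renorm(self):
        while self.range < 0x8000:
            if self.bits == 0:
                self.value += self.next_byte() << 8
                self.bits = 8
            self.range <<= 1
            self.value = (self.value << 1) & 0xFFFFFFFF  # assumed 32-bit register
            self.bits -= 1

    def _take_mps(self, st):
        bit = st.mps
        st.idx = self.next_mps[st.idx]
        return bit

    def _take_lps(self, st):
        bit = 1 - st.mps
        st.idx = self.next_lps[st.idx]
        st.mps ^= self.mps_switch[st.idx]  # new index, as in the pseudocode
        return bit

    def decode_bit(self, st):
        prob = self.probs[st.idx]
        help_ = self.range - prob
        if help_ <= (self.value >> 16):
            self.value -= help_ << 16
            self.range = prob
            bit = self._take_mps(st) if help_ < prob else self._take_lps(st)
        else:
            self.range = help_  # ASSUMPTION: narrow the interval here too
            if help_ & 0x8000:
                return st.mps   # no renormalisation needed
            bit = self._take_lps(st) if help_ < prob else self._take_mps(st)
        self._renorm()
        return bit
```

A toy run with a single-state table (not the codec's real tables) shows how it is driven:

```python
source = iter(bytes(8))
dec = BinaryDecoder(lambda: next(source, 0), [23069], [0], [0], [0])
state = BitState()
print([dec.decode_bit(state) for _ in range(3)])
```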
 
==== Tables ====
Probabilities:
    23069,  9606,  4372,  2059,  984,  474,  229,  111, 54, 26, 13, 6, 3, 1,
    23167, 16165, 11506,  8316,  6073,  4482,  3311, 2465, 1839, 1372, 1030, 771, 576, 433, 324, 245, 183, 138,  104,  78,  59,  44,
    23265, 18508, 14861, 12017,  9759,  7987,  6568, 5400, 4471, 3700, 3067, 2552, 2145, 1798, 1485, 1246, 1039, 867, 724,  604,  504,  420,  352,  293,  246,  203,  171, 143,
    23314, 19716, 16684, 14296, 12264, 10556,  9081, 7903, 6825, 5966, 5156, 4508, 3947, 3409, 2998, 2624,
    22578, 19740, 17294, 15325, 13550, 11950, 10650, 9494,
    21872, 19625, 17625, 15906, 14372, 12980, 11799,
    22184, 20294, 18405, 16847, 15421, 14174,
    21041, 19471, 17977, 16734,
    22055, 20711, 19333,
    21911, 20559,
    23056, 21794,
    23019,
    23069
 
MPS switch flags:
    1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0
 
MPS transition table:
      1,  2,  3,  4,  5,  6,  7,  8,  9,  10,  11,  12,  13,  13,  15,  16,
    17,  18,  19,  20,  21,  22,  23,  24,  25,  26,  27,  28,  29,  30,  31,  32,
    33,  34,  35,  9,  37,  38,  39,  40,  41,  42,  43,  44,  45,  46,  47,  48,
    49,  50,  51,  52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  32,
    65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,  79,  48,
    81,  82,  83,  84,  85,  86,  87,  71,  89,  90,  91,  92,  93,  94,  86,  96,
    97,  98,  99, 100,  93, 102, 103, 104,  99, 106, 107, 103, 109, 107, 111, 109, 111, 113
 
LPS transition table:
      1,  14,  16,  18,  20,  23,  25,  28,  30,  33,  35,  9,  10,  12,  15,  36,
    38,  39,  40,  42,  43,  45,  46,  48,  49,  51,  52,  54,  56,  57,  59,  60,
    62,  63,  32,  33,  37,  64,  65,  67,  68,  69,  70,  72,  73,  74,  75,  77,
    78,  79,  48,  50,  50,  51,  52,  53,  54,  55,  56,  57,  58,  59,  61,  61,
    65,  80,  81,  82,  83,  84,  86,  87,  87,  72,  72,  74,  74,  75,  77,  77,
    80,  88,  89,  90,  91,  92,  93,  86,  88,  95,  96,  97,  99,  99,  93,  95,
    101, 102, 103, 104,  99, 105, 106, 107, 103, 105, 108, 109, 110, 111, 110, 112, 112, 113


[[Category:Video Codecs]]
[[Category:Undiscovered Video Codecs]]

Latest revision as of 09:47, 10 December 2022
