SANM

From MultimediaWiki

This page attempts to document the LucasArts Smush v2 codec, FOURCC "SANM".

A GPL'd decoder for the SNM format and relevant codecs can be found in the Residual reimplementation of the Grim Fandango Engine (GrimE), although it is largely unreadable because it was converted from the original assembler code to C.

Samples

Unique Samples

The following Grim Fandango movies are notable in that they used to make the Residual Smush implementation segfault; the issue was worked around by adding 5700 bytes of padding to some buffers. They make useful stress tests when writing a decoder.

lol.snm, byeruba.snm, crushed.snm, eldepot.snm, heltrain.snm, hostage.snm, tb_kitty.snm

Note - The segfault in the Residual implementation results from an improper breakdown of the destination image into 8x8 blocks: the calculation claims that an image N blocks high has (N+1) blocks (or similar), at which point the segfault is imminent. This can be solved either by trimming the remaining pixels that don't fit into the last 8x8 blocks (undesirable), or by setting the image buffer width/height to the image size in blocks * 8, not pixels, to account for said remaining pixels (desirable). We'll dismiss this as a Residual implementation issue.
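The "desirable" fix can be sketched as follows. This is only an illustration: the function name and the 636x476 frame size are made up for the example, not taken from Residual or from any particular movie.

```python
def padded_dimensions(width, height, block=8):
    """Round pixel dimensions up to a whole number of blocks."""
    wblocks = (width + block - 1) // block
    hblocks = (height + block - 1) // block
    return wblocks * block, hblocks * block

# Hypothetical frame whose sides are not multiples of 8.
padded_w, padded_h = padded_dimensions(636, 476)

# Allocate the buffer from the padded size (2 bytes per 16-bit pixel),
# so the last row/column of 8x8 blocks never writes out of bounds.
frame_buffer = bytearray(padded_w * padded_h * 2)
```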

Use in Grim Fandango

SANM is used in Grim Fandango for cut-scenes and in-game animations. The actual SNM movie files are gzipped and stored inside LAB archive files (which are quite easy to extract; many tools exist). After extracting the SNM files from the LAB archives, decompress them with a tool like *nix's "gunzip". A decompressed Smush file has the "SANM" FOURCC as its first four bytes.
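For those who prefer to do the decompression in code, here is a minimal sketch using Python's standard gzip module. The function name is ours, and we assume the whole extracted file fits in memory.

```python
import gzip

def decompress_snm(raw):
    """Return the decompressed movie bytes for an SNM file taken from a LAB."""
    # gzip streams begin with the magic bytes 1f 8b.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    # A decompressed Smush v2 movie must start with the SANM FOURCC.
    if raw[:4] != b"SANM":
        raise ValueError("missing SANM FOURCC")
    return raw
```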

Organization

This section deals with the structural properties of Smush movies. In other words, we describe the various headers used.

Note: each "chunk size" entry in a particular chunk header indicates the size of the chunk's contents without the chunk's FOURCC and size.

Preamble

The movie begins with a basic 8-byte section that looks like this:

0x00|"SANM" FOURCC        |4 bytes big endian
0x04|Movie size (in bytes)|4 bytes big endian
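Since every chunk in the file starts with the same FOURCC/size pair, a single helper can read the preamble and all later chunk headers. A small sketch (the helper name is our own):

```python
import struct

def read_chunk_header(data, offset=0):
    """Read a FOURCC and a big-endian 32-bit size at the given offset.

    Returns (fourcc, size, offset_of_contents). The size does not include
    the 8 bytes of the FOURCC and size fields themselves.
    """
    fourcc, size = struct.unpack_from(">4sI", data, offset)
    return fourcc, size, offset + 8
```

For the preamble, we would call read_chunk_header(data, 0) and check that the returned FOURCC is b"SANM".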

Video Header

This header immediately follows the preamble. It describes the movie's video properties.

0x000|"SHDR" FOURCC             |4 bytes big endian
0x004|Header size (in bytes)    |4 bytes big endian
0x008|Version value 1           |1 byte
0x009|Version value 2           |1 byte
0x00A|# of frames               |2 bytes little endian
0x00C|Video's x-coordinate      |2 bytes little endian
0x00E|Video's y-coordinate      |2 bytes little endian
0x010|Width                     |2 bytes little endian
0x012|Height                    |2 bytes little endian
0x014|Image type (see notes)    |2 bytes little endian
0x016|Frame delay (microseconds)|4 bytes little endian
0x01A|Maximal frame buffer size |4 bytes little endian
0x01E|Color Palette             |256 colors, 4-bytes little endian each
0x41E|Unused (see notes)        |16 bytes

Notes

  • It appears as though the original decoder ignores the Frame delay field completely. Instead, all movies are played with a frame delay of 66667 microseconds (15 frames per second). For now we ignore the field and force the 66667 usec value, until we find a sample that breaks with this.
  • Image type is "3" for all encountered samples so far. It might be helpful to assert() on this to easily single out samples that deviate from this.
  • The original decoder appears to ignore the last 16 bytes of the header.
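The header table above maps directly onto a struct format string. The following is a sketch only; the tuple and function names are ours, and the palette entries are returned as raw 32-bit values because their internal layout is not described here.

```python
import struct
from collections import namedtuple

VideoHeader = namedtuple(
    "VideoHeader",
    "version1 version2 frames x y width height image_type "
    "frame_delay max_frame_size",
)

# Per the notes above, we force this value instead of trusting the header.
FORCED_FRAME_DELAY_USEC = 66667

def parse_shdr(data, offset):
    """Parse the SHDR chunk starting at offset (normally 8)."""
    fourcc, _size = struct.unpack_from(">4sI", data, offset)
    if fourcc != b"SHDR":
        raise ValueError("expected SHDR chunk")
    # Fields at 0x008..0x01D, all little endian.
    fields = struct.unpack_from("<BBHHHHHHII", data, offset + 0x08)
    # 256 palette entries of 4 bytes each, starting at 0x01E.
    palette = struct.unpack_from("<256I", data, offset + 0x1E)
    return VideoHeader(*fields), palette
```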

Audio/Keyframe Header

Smush supports variable-size keyframes. An example of this usage can be seen in the Full Throttle highway chase scenes, where different images are composited into the streaming video depending on the player's actions.

Curiously enough, this header includes both audio and keyframe information.

0x00|"FLHD" FOURCC         |4 bytes big endian
0x04|Header size (in bytes)|4 bytes big endian

Followed by any number of keyframe dimension chunks, which should match the number of keyframes in the movie. Dimension chunks in this header specify the dimensions of corresponding keyframes in the stream, in the order they're encountered. This information has not been rigorously verified, though.

0x00|"Bl16" FOURCC         |4 bytes big endian
0x04|Header size (in bytes)|4 bytes big endian
0x08|Padding?              |2 bytes
0x0A|Width                 |2 bytes little endian
0x0C|Height                |2 bytes little endian
0x0E|Padding?              |2 bytes

Followed by exactly one audio info chunk.

0x00|"Wave" FOURCC         |4 bytes big endian
0x04|Header size (in bytes)|4 bytes big endian
0x08|Frequency (Hz)        |4 bytes little endian
0x0C|# of channels         |4 bytes little endian
0x10|See notes             |4 bytes

Notes

  • For some movies, the "Wave" chunk contains an extra 4-byte field at its end, the purpose of which is unknown.
  • Movies without audio do not contain an audio info chunk.
  • The order in which Wave/Bl16 chunks are organized in the FLHD header is unspecified and is known to vary between movies.
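Because the sub-chunk order varies, a parser should walk the FLHD contents and dispatch on each FOURCC rather than assume a layout. A rough sketch, with our own function name, and assuming sub-chunks follow one another with no alignment padding (which we have not verified):

```python
import struct

def parse_flhd(data, offset):
    """Return (keyframe_dims, audio) for the FLHD chunk at offset.

    keyframe_dims is a list of (width, height) in stream order;
    audio is (frequency, channels), or None for silent movies.
    """
    fourcc, size = struct.unpack_from(">4sI", data, offset)
    if fourcc != b"FLHD":
        raise ValueError("expected FLHD chunk")
    pos, end = offset + 8, offset + 8 + size
    keyframe_dims, audio = [], None
    while pos + 8 <= end:
        sub, sub_size = struct.unpack_from(">4sI", data, pos)
        body = pos + 8
        if sub == b"Bl16":
            _pad, width, height = struct.unpack_from("<HHH", data, body)
            keyframe_dims.append((width, height))
        elif sub == b"Wave":
            audio = struct.unpack_from("<II", data, body)
        # Skip by the advertised size, so the optional extra Wave
        # field and any unknown sub-chunks are stepped over.
        pos = body + sub_size
    return keyframe_dims, audio
```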

Annotation

Movies may contain an optional plaintext annotation. In Grim Fandango, the only such movies are in-game animations. Keep in mind that the string itself may not always be as large as the advertised annotation size. In that case, the remaining space is padded with zeros until the advertised length is reached.

0x00|"ANNO" FOURCC             |4 bytes big endian
0x04|Annotation size (in bytes)|4 bytes big endian
0x08|Null-terminated string    |(Annotation size) bytes
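Reading the annotation then amounts to cutting the field at the first zero byte while still skipping the full advertised size. A sketch (the function name is ours, and the ASCII decoding is an assumption about the text encoding):

```python
import struct

def parse_anno(data, offset):
    """Return (text, offset_of_next_chunk) for the ANNO chunk at offset."""
    fourcc, size = struct.unpack_from(">4sI", data, offset)
    if fourcc != b"ANNO":
        raise ValueError("expected ANNO chunk")
    raw = data[offset + 8 : offset + 8 + size]
    # The string may be shorter than the field; the rest is zero padding.
    text = raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return text, offset + 8 + size
```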

Frame

This header is used as a container for a video frame and/or an audio frame, stored in an arbitrary order. In itself, it's just a FOURCC and a size.

0x00|"FRME" FOURCC        |4 bytes big endian
0x04|Chunk size (in bytes)|4 bytes big endian

Audio

Please see the appropriate section in VIMA for an audio frame's header/codec details. Note that as far as we know right now, this codec is specific to Grim Fandango Smush files.

Video

This chunk stores a potentially encoded video frame, along with the opcodes and codebooks used to decode it. More details below.

0x000|"Bl16" FOURCC            |4 bytes big endian
0x004|Chunk size (in bytes)    |4 bytes big endian
0x008|Unknown                  |8 bytes
0x010|Width                    |4 bytes little endian
0x014|Height                   |4 bytes little endian
0x018|Sequence #               |2 bytes little endian
0x01A|Subcodec ID              |1 byte
0x01B|Diff buffer rotate code  |1 byte
0x01C|Unknown                  |4 bytes
0x020|Small codebook           |8 bytes, 4 color values 2 bytes little endian each
0x028|Background colour        |2 bytes little endian
0x02A|Unknown                  |2 bytes
0x02C|RLE output size (bytes)  |4 bytes little endian
0x030|Codebook                 |512 bytes, 256 color values each 2 bytes little endian
0x230|Unknown                  |8 bytes
0x238|Video stream             |...
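As a sanity check on the offsets, the table can be written as one struct format, with the unknown fields skipped as padding. This is a sketch; the dictionary keys and function name are our own.

```python
import struct

# Everything between 0x008 and 0x238, little endian; "x" skips unknown bytes.
FRAME_HEADER = struct.Struct("<8xIIHBB4x4HH2xI256H8x")

def parse_frame_header(data, offset):
    """Parse the video Bl16 chunk at offset; return (header, stream_offset)."""
    fourcc, _size = struct.unpack_from(">4sI", data, offset)
    if fourcc != b"Bl16":
        raise ValueError("expected Bl16 chunk")
    v = FRAME_HEADER.unpack_from(data, offset + 8)
    header = {
        "width": v[0],
        "height": v[1],
        "sequence": v[2],
        "subcodec": v[3],
        "rotate_code": v[4],
        "small_codebook": v[5:9],
        "bg_color": v[9],
        "rle_output_size": v[10],
        "codebook": v[11:267],
    }
    return header, offset + 0x238
```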

Codec

The codec is actually a combination of several subcodecs. The subcodec that's used for a particular frame is indicated by the appropriate field in the "Bl16" chunk of the frame.

Decompressed pixels come in 16-bit little endian, using the "565" bit arrangement.
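To display such a pixel, it has to be expanded to 8 bits per channel. The sketch below assumes the conventional RGB565 ordering (red in the top 5 bits, then 6 bits of green, then 5 bits of blue); the channel order is our assumption, not something confirmed above.

```python
def rgb565_to_rgb888(pixel):
    """Expand a 16-bit 565 pixel to an (r, g, b) tuple of 8-bit values."""
    r = (pixel >> 11) & 0x1F  # assumed: red in bits 15..11
    g = (pixel >> 5) & 0x3F   # assumed: green in bits 10..5
    b = pixel & 0x1F          # assumed: blue in bits 4..0
    # Replicate the high bits into the low bits so 0x1F maps to 255.
    return (r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2)
```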

Triple Diff Buffering

Smush uses a triple diff buffer mechanism to decode image data. A decoder's state includes three buffers, which are occasionally referenced by various subcodecs to decode individual frames. We will hereafter refer to said buffers as "db0", "db1", and "db2", where "db0" is the logical "current" diff buffer. It is crucial to note that "dbX" is only an alias to a particular diff buffer and does not stand for the contents of the buffer itself. In other words, it's a pointer.

Each frame contains an opcode that specifies how said buffers are rotated. Only two opcodes are used. Any other opcodes are ignored as "no-ops."

Opcode 1:
   swap(db0, db2)
Opcode 2:
   swap(db1, db2)
   swap(db2, db0) 
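Since the dbX names are only aliases, rotation just exchanges references and never copies pixel data. A small sketch, where the three buffers are held in a list [db0, db1, db2] (a representation we chose for illustration):

```python
def rotate_buffers(bufs, opcode):
    """Rotate the [db0, db1, db2] list in place according to the opcode."""
    if opcode == 1:
        bufs[0], bufs[2] = bufs[2], bufs[0]
    elif opcode == 2:
        bufs[1], bufs[2] = bufs[2], bufs[1]
        bufs[2], bufs[0] = bufs[0], bufs[2]
    # Any other opcode is a no-op.
```

Note that opcode 2 is a full three-way rotation: the old db1 becomes db0, the old db2 becomes db1, and the old db0 becomes db2.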

Initial Setup

We need to initialize two codebooks of 4x4 and 8x8 glyphs. The glyphs themselves are monochrome and thus consist of a foreground and background. We hereafter refer to said codebooks as glyph4_cb and glyph8_cb.

The construction algorithm iterates through two coordinate vectors, and interpolates an NxN glyph using every position in the x-vector with every position in the y-vector. Each vector contains 16 coordinates for a grand total of 256 glyphs per glyph size.

The vectors are defined for 4x4 and 8x8 glyphs as follows.

const int xvector4[] = { 0, 1, 2, 3, 3, 3, 3, 2, 1, 0, 0, 0, 1, 2, 2, 1 };
const int yvector4[] = { 0, 0, 0, 0, 1, 2, 3, 3, 3, 3, 2, 1, 1, 1, 2, 2 };
const int xvector8[] = { 0, 2, 5, 7, 7, 7, 7, 7, 7, 5, 2, 0, 0, 0, 0, 0 };
const int yvector8[] = { 0, 0, 0, 0, 1, 3, 4, 6, 7, 7, 7, 7, 6, 4, 3, 1 };

Here's how we make 4x4 glyphs. The algorithm for 8x8 glyphs is intuitively analogous.

for i = 0..15
{
   for j = 0..15
   {
      glyph[4][4] = all zeros

      vert1.x = xvector4[i]
      vert1.y = yvector4[i]
      vert2.x = xvector4[j]
      vert2.y = yvector4[j]
      
      edge1 = get_edge(vert1.x, vert1.y)
      edge2 = get_edge(vert2.x, vert2.y)
      direction = get_direction(edge1, edge2)

      width = largest side of line's bounding rectangle
      for each discrete point in _width_ points of our line (including the tips)
      {
         if direction is up, while row = point.y is >= 0, glyph[row--][point.x] = 1;
         if direction is down, while row = point.y is < 4, glyph[row++][point.x] = 1;
         if direction is left, while col = point.x is >= 0, glyph[point.y][col--] = 1;
         if direction is right, while col = point.x is < 4, glyph[point.y][col++] = 1;
      }
   
      codebook4.push_back(glyph) // order is important here, so yes, it's a push_back or equivalent
   }
}  

And here are the supplementary functions:

get_edge(x, y)
{
   if y == 0, return bottom_edge
   else if y == 3, return top_edge
   else if x == 0, return left_edge
   else if x == 3, return right_edge
   else, return no_edge
}

get_direction(2 edges)
{
   if (edges are left/right or right/left) or (edges are bottom/!top or !top/bottom), return up
   else if (edges are !bottom/top or top/!bottom), return down
   else if (edges are left/!right or !right/left), return left
   else if (edges are bottom/top or top/bottom) or (edges are right/!left or !left/right), return right
   else, return no direction (nothing is drawn for this glyph)
}
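The two helpers translate almost literally into runnable code. In this sketch the edge and direction names are plain strings of our choosing, the side parameter generalizes the hard-coded 3 to 8x8 glyphs, and returning None when no rule matches is our reading of the fall-through case.

```python
def get_edge(x, y, side=4):
    """Classify a point by the glyph edge it lies on (checked in this order)."""
    last = side - 1
    if y == 0:
        return "bottom"
    if y == last:
        return "top"
    if x == 0:
        return "left"
    if x == last:
        return "right"
    return "none"

def get_direction(e1, e2):
    """Pick the fill direction for a line between two classified points."""
    if ({e1, e2} == {"left", "right"}
            or (e1 == "bottom" and e2 != "top")
            or (e2 == "bottom" and e1 != "top")):
        return "up"
    if (e1 == "top" and e2 != "bottom") or (e2 == "top" and e1 != "bottom"):
        return "down"
    if (e1 == "left" and e2 != "right") or (e2 == "left" and e1 != "right"):
        return "left"
    if ({e1, e2} == {"top", "bottom"}
            or (e1 == "right" and e2 != "left")
            or (e2 == "right" and e1 != "left")):
        return "right"
    return None  # assumed: no fill at all (e.g. two interior points)
```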

Main Algorithm

if 0 == sequence number:
{
   // this is a keyframe
   fill db1 and db2 with background color.
}
handle subcodec according to ID.
copy contents of db0 into output image.
rotate buffers according to opcode.

Subcodecs

This section explains what the individual subcodecs mean, and how to extract image data in each case. Note that ImageSize, in bytes, is defined as (Width * Height * 2).

ID|What
 0|Keyframe. Copy ImageSize bytes from video stream into db0, loop by 16-bit values and reinterpret each one as little endian.
 1|Never encountered so far.
 2|Hierarchical VQ and motion compensation.
 3|Copy ImageSize bytes from db2 into db0.
 4|Copy ImageSize bytes from db1 into db0.
 5|RLE decode. See below.
 6|Simple lookup/write. See below.
 7|Never encountered so far.
 8|RLE-encoded codebook indices. See below.
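To tie the main algorithm to the table, here is a sketch of a frame decoder that implements only the trivial subcodecs (0, 3 and 4). The class and method names are ours, and buffers are kept as flat lists of 16-bit values rather than bytes for brevity.

```python
import struct

class SmushDecoder:
    def __init__(self, width, height):
        self.npixels = width * height
        # bufs[0], bufs[1], bufs[2] play the roles of db0, db1, db2.
        self.bufs = [[0] * self.npixels for _ in range(3)]

    def decode_frame(self, seq, subcodec, rotate, bg_color, stream):
        db0, db1, db2 = self.bufs
        if seq == 0:
            # Keyframe: reset the two reference buffers.
            db1[:] = [bg_color] * self.npixels
            db2[:] = [bg_color] * self.npixels
        if subcodec == 0:
            # Raw little-endian 16-bit pixels.
            db0[:] = struct.unpack_from("<%dH" % self.npixels, stream)
        elif subcodec == 3:
            db0[:] = db2
        elif subcodec == 4:
            db0[:] = db1
        else:
            raise NotImplementedError("subcodec %d" % subcodec)
        output = list(db0)  # copy out before the aliases move
        if rotate == 1:
            self.bufs[0], self.bufs[2] = self.bufs[2], self.bufs[0]
        elif rotate == 2:
            self.bufs[1], self.bufs[2] = self.bufs[2], self.bufs[1]
            self.bufs[2], self.bufs[0] = self.bufs[0], self.bufs[2]
        return output
```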

Subcodec 2

This codec is broken up into a three-level hierarchy, where each level decodes differently sized image blocks. The decoding algorithms are chosen based on opcodes provided by the video stream. Block sizes are 8x8, 4x4, and 2x2. Even though different block sizes exist, one pass through the decoding algorithm will always(!) decode an 8x8 block by either decoding an entire 8x8 block, by breaking the 8x8 block into four 4x4 blocks (and decoding each one), or by breaking a 4x4 block into four 2x2 blocks (and decoding each one). Any combination of said breakdown is possible and is up to the video stream.

  • We indicate the current x/y coordinates in db0 by "cx" and "cy", respectively.
  • We assume that two-dimensional arrays are row-major. That is, to access pixel (x, y) in an array, we write array[y][x].

Decoding is done by breaking the image into 8x8 blocks (rounding up where appropriate), and starting at the top, decoding each row from left to right with Level1. (Level1 will invoke any required levels, as described below.) For example:

int hblocks = (Height rounded up to the next multiple of 8) / 8
int wblocks = (Width rounded up to the next multiple of 8) / 8

cy = 0
for hblock in range(0, hblocks)
{
   cx = 0 // beginning of row
   for wblock in range(0, wblocks)
   {
      level1(cx, cy)
      cx += 8 // 8 pixels right
   }

   cy += 8 // 8 pixels down
}
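The same traversal can be written as a generator of (cx, cy) block origins. A sketch with a name of our own; the block counts are obtained by integer division after rounding up:

```python
def block_origins(width, height, block=8):
    """Yield (cx, cy) for each block, top to bottom, left to right."""
    wblocks = (width + block - 1) // block
    hblocks = (height + block - 1) // block
    for by in range(hblocks):
        for bx in range(wblocks):
            yield bx * block, by * block
```

A decoder would call level1(cx, cy) for each yielded pair, in order.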

We now describe the decoding algorithm for each opcode, per level. The general decoding algorithm of a level looks like this:

opcode = next byte in stream
handle opcode (see below)

Level 1 (8x8 blocks)

0x00 ... 0xF4
x, y = motion_vectors[opcode]
copy 8x8 block from db2[y + cy][x + cx] to db0[cy][cx]
0xF5
motion_vector = next 2 bytes of stream, little endian
x = motion_vector % image_width
y = motion_vector / image_width
copy 8x8 block from db2[y + cy][x + cx] to db0[cy][cx]
0xF6
copy 8x8 block from db1[cy][cx] to db0[cy][cx]
0xF7
glyph8_index = next byte of stream
indices = next 2 bytes of stream, little endian
fg_index = hi8bits(indices)
bg_index = lo8bits(indices)
fgcolor = codebook[fg_index]
bgcolor = codebook[bg_index]    
draw 8x8 glyph from glyph8_cb[glyph8_index] into db0[cy][cx] using fgcolor and bgcolor
0xF8
glyph8_index = next byte of stream
colors = next 4 bytes of stream, little endian
fgcolor = hi16bits(colors)
bgcolor = lo16bits(colors)   
draw 8x8 glyph from glyph8_cb[glyph8_index] into db0[cy][cx] using fgcolor and bgcolor
0xF9, 0xFA, 0xFB, 0xFC
color = value from small_codebook[opcode - 0xf9], little endian
fill 8x8 block in db0[cy][cx] with color
0xFD
index = value of next byte in stream
color = value from codebook[index], little endian
fill 8x8 block in db0[cy][cx] with color
0xFE
color = next 2 bytes in stream, little endian
fill 8x8 block in db0[cy][cx] with color
0xFF

This effectively breaks this block up into four 4x4 blocks and invokes the next level to decode them.

next_level(cx    , cy)
next_level(cx + 4, cy)
next_level(cx    , cy + 4)
next_level(cx + 4, cy + 4)

Level 2 (4x4 blocks)

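Exactly the same as Level 1, except with 4x4 blocks: glyphs come from glyph4_cb, and 0xFF breaks the block into four 2x2 blocks (offsets of 2 instead of 4) that are decoded by Level 3.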

Level 3 (2x2 blocks)

Same as the other levels except with 2x2 blocks, and with the following differences.

0xF7
indices[2][2] = next 4 bytes of stream, little endian
write a 2x2 block into db0[cy][cx] using codebook[indices[][]] for colors
0xF8, 0xFF
db0[cy][cx] = next 2 bytes of stream, little endian
db0[cy][cx + 1] = next 2 bytes of stream, little endian
db0[cy + 1][cx] = next 2 bytes of stream, little endian
db0[cy + 1][cx + 1] = next 2 bytes of stream, little endian
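The recursion between the three levels is easiest to see in code. The following sketch handles only the fill opcodes (0xF9 through 0xFE) and the 0xFF subdivision, including its Level 3 meaning of four raw pixels; motion and glyph opcodes are left out. The function signature is ours, and db0 is a list of rows.

```python
import struct

def decode_block(stream, pos, db0, cx, cy, size, codebook, small_codebook):
    """Decode one size x size block at (cx, cy); return the new stream position."""
    opcode = stream[pos]
    pos += 1
    if 0xF9 <= opcode <= 0xFC:
        color = small_codebook[opcode - 0xF9]
    elif opcode == 0xFD:
        color = codebook[stream[pos]]
        pos += 1
    elif opcode == 0xFE:
        (color,) = struct.unpack_from("<H", stream, pos)
        pos += 2
    elif opcode == 0xFF:
        if size == 2:
            # Level 3: four raw pixels, row by row.
            px = struct.unpack_from("<4H", stream, pos)
            db0[cy][cx], db0[cy][cx + 1] = px[0], px[1]
            db0[cy + 1][cx], db0[cy + 1][cx + 1] = px[2], px[3]
            return pos + 8
        half = size // 2
        # Order: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                pos = decode_block(stream, pos, db0, cx + dx, cy + dy,
                                   half, codebook, small_codebook)
        return pos
    else:
        raise NotImplementedError("opcode 0x%02X not covered by this sketch" % opcode)
    for row in range(cy, cy + size):
        for col in range(cx, cx + size):
            db0[row][col] = color
    return pos
```

Calling it with size 8 corresponds to Level 1; the recursive calls with size 4 and 2 correspond to Levels 2 and 3.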

Subcodec 5

This is an RLE scheme.

size = RLE output size field from Bl16 header
rle_decode(db0, video stream, size)
for each 16-bit value of (size / 2) values starting at db0
{
   little_endian(value)
}

And here's the routine itself:

rle_decode(dst, src, const size)
{
   remaining = size
   while (remaining)
   {
      code = next byte of stream
      line_length = (code >> 1) + 1
 
      if (code & 1) // RLE run
      {
         color = next byte of input stream
         fill line_length bytes in dst with color
      }
      else // raw image data
      {
         copy line_length bytes from src into dst
      }
   
      remaining -= line_length
   }
}
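A runnable version of the routine might look like this. It is a sketch: it returns the output instead of writing into a destination buffer, and the bounds checks are our own additions to guard against corrupt streams.

```python
def rle_decode(src, pos, size):
    """Decode exactly size bytes from src starting at pos.

    Returns (decoded_bytes, new_pos).
    """
    out = bytearray()
    while len(out) < size:
        code = src[pos]
        pos += 1
        run = (code >> 1) + 1
        if run > size - len(out):
            raise ValueError("run overflows the output size")
        if code & 1:
            # RLE run: one byte repeated run times.
            out += bytes([src[pos]]) * run
            pos += 1
        else:
            # Raw data: run bytes copied verbatim.
            if pos + run > len(src):
                raise ValueError("stream ends inside a raw run")
            out += src[pos:pos + run]
            pos += run
    return bytes(out), pos
```

For example, the bytes 05 AA 02 01 02 decode to AA AA AA 01 02: code 0x05 is a run of (5 >> 1) + 1 = 3 bytes, and code 0x02 is 2 raw bytes.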

Subcodec 6

This is a straightforward codebook lookup/write routine.

for each pixel in db0:
{
   index = value of next byte in video stream;
   pixel = (2 bytes little endian) codebook[index];
}
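In code, this is a single pass over width * height index bytes. A sketch, with a function name of our own, where codebook is the 256-entry table from the Bl16 header already read as 16-bit values:

```python
def decode_subcodec6(stream, pos, codebook, width, height):
    """Return (pixels, new_pos), one 16-bit codebook value per index byte."""
    count = width * height
    indices = stream[pos:pos + count]
    if len(indices) < count:
        raise ValueError("stream too short for subcodec 6")
    return [codebook[i] for i in indices], pos + count
```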

Subcodec 8

Used by loladies.snm and repmec3c.snm.

Another RLE scheme, where the actual indices into the codebook are RLE-compressed. The decompression algorithm uses the same RLE decoding as in Subcodec 5.

indices = []
rle_decode(indices, video stream, width * height)
for each pixel in db0, i in range(0, indices.size())
{
   pixel = codebook[indices[i]], as little endian
}

Appendix A: Motion Vectors

This is the static motion vector table used in Subcodec 2. Each element is an (x, y) pair.

int motion_vectors[256][2] =
{
   {0,   0}, {-1, -43}, {6, -43},  {-9, -42},  {13, -41},
   {-16, -40},  {19, -39}, {-23, -36},  {26, -34},  {-2, -33},
   {4, -33}, {-29, -32},  {-9, -32},  {11, -31}, {-16, -29},
   {32, -29},  {18, -28}, {-34, -26}, {-22, -25},  {-1, -25},
   {3, -25},  {-7, -24},   {8, -24},  {24, -23},  {36, -23},
   {-12, -22},  {13, -21}, {-38, -20},   {0, -20}, {-27, -19},
   {-4, -19},   {4, -19}, {-17, -18},  {-8, -17},   {8, -17},
   {18, -17},  {28, -17},  {39, -17}, {-12, -15},  {12, -15},
   {-21, -14},  {-1, -14},   {1, -14}, {-41, -13},  {-5, -13},
   {5, -13},  {21, -13}, {-31, -12}, {-15, -11},  {-8, -11},
   {8, -11},  {15, -11},  {-2, -10},   {1, -10},  {31, -10},
   {-23,  -9}, {-11,  -9},  {-5,  -9},   {4,  -9},  {11,  -9},
   {42,  -9},   {6,  -8},  {24,  -8}, {-18,  -7},  {-7,  -7},
   {-3,  -7},  {-1,  -7},   {2,  -7},  {18,  -7}, {-43,  -6},
   {-13,  -6},  {-4,  -6},   {4,  -6},   {8,  -6}, {-33,  -5},
   {-9,  -5},  {-2,  -5},   {0,  -5},   {2,  -5},   {5,  -5},
   {13,  -5}, {-25,  -4},  {-6,  -4},  {-3,  -4},   {3,  -4},
   {9,  -4}, {-19,  -3},  {-7,  -3},  {-4,  -3},  {-2,  -3},
   {-1,  -3},   {0,  -3},   {1,  -3},   {2,  -3},   {4,  -3},
   {6,  -3},  {33,  -3}, {-14,  -2}, {-10,  -2},  {-5,  -2},
   {-3,  -2},  {-2,  -2},  {-1,  -2},   {0,  -2},   {1,  -2},
   {2,  -2},   {3,  -2},   {5,  -2},   {7,  -2},  {14,  -2},
   {19,  -2},  {25,  -2},  {43,  -2},  {-7,  -1},  {-3,  -1},
   {-2,  -1},  {-1,  -1},   {0,  -1},   {1,  -1},   {2,  -1},
   {3,  -1},  {10,  -1},  {-5,   0},  {-3,   0},  {-2,   0},
   {-1,   0},   {1,   0},   {2,   0},   {3,   0},   {5,   0},
   {7,   0}, {-10,   1},  {-7,   1},  {-3,   1},  {-2,   1},
   {-1,   1},   {0,   1},   {1,   1},   {2,   1},   {3,   1},
   {-43,   2}, {-25,   2}, {-19,   2}, {-14,   2},  {-5,   2},
   {-3,   2},  {-2,   2},  {-1,   2},   {0,   2},   {1,   2},
   {2,   2},   {3,   2},   {5,   2},   {7,   2},  {10,   2},
   {14,   2}, {-33,   3},  {-6,   3},  {-4,   3},  {-2,   3},
   {-1,   3},   {0,   3},   {1,   3},   {2,   3},   {4,   3},
   {19,   3},  {-9,   4},  {-3,   4},   {3,   4},   {7,   4},
   {25,   4}, {-13,   5},  {-5,   5},  {-2,   5},   {0,   5},
   {2,   5},   {5,   5},   {9,   5},  {33,   5},  {-8,   6},
   {-4,   6},   {4,   6},  {13,   6},  {43,   6}, {-18,   7},
   {-2,   7},   {0,   7},   {2,   7},   {7,   7},  {18,   7},
   {-24,   8},  {-6,   8}, {-42,   9}, {-11,   9},  {-4,   9},
   {5,   9},  {11,   9},  {23,   9}, {-31,  10},  {-1,  10},
   {2,  10}, {-15,  11},  {-8,  11},   {8,  11},  {15,  11},
   {31,  12}, {-21,  13},  {-5,  13},   {5,  13},  {41,  13},
   {-1,  14},   {1,  14},  {21,  14}, {-12,  15},  {12,  15},
   {-39,  17}, {-28,  17}, {-18,  17},  {-8,  17},   {8,  17},
   {17,  18},  {-4,  19},   {0,  19},   {4,  19},  {27,  19},
   {38,  20}, {-13,  21},  {12,  22}, {-36,  23}, {-24,  23},
   {-8,  24},   {7,  24},  {-3,  25},   {1,  25},  {22,  25},
   {34,  26}, {-18,  28}, {-32,  29},  {16,  29}, {-11,  31},
   {9,  32},  {29,  32},  {-4,  33},   {2,  33}, {-26,  34},
   {23,  36}, {-19,  39},  {16,  40}, {-13,  41},   {9,  42},
   {-6,  43},   {1,  43},   {0,   0},   {0,   0},   {0,   0}
};