Smacker

Smacker is a technology that has been used in games and other entertainment software titles since the dawn of multimedia-era video games. The Smacker website claims that the format has been used for over 2600 titles.

Smacker files contain Smacker Video and one of a few different custom audio codecs.

Container Format

These are the data conventions used in this description:

  • All multi-byte numbers are stored in little-endian (Intel) format.
  • byte - 8-bit value
  • word - 16-bit value
  • dword - 32-bit value
  • All values are unsigned, unless stated otherwise.

This is the general layout of a Smacker file:

 struct Header
 dword  FrameSizes[];
 byte   FrameTypes[];
 byte   HuffmanTrees[];
 byte   FramesData[];

Header

General file description header. Total size is 104 (0x68) bytes.

  dword Signature;
  dword Width;
  dword Height;
  dword Frames;
  dword FrameRate;
  dword Flags;
  dword AudioSize[7];
  dword TreesSize;
  dword MMap_Size;
  dword MClr_Size;
  dword Full_Size;
  dword Type_Size;
  dword AudioRate[7];
  dword Dummy;
  • Signature - File signature. Either "SMK2" for original Smacker or "SMK4" for latest revisions.
  • Width, Height - Frame dimensions in pixels.
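  • Frames - Number of logical frames. The file may contain an extra "ring" frame, which is not counted here.
  • FrameRate - Frame timing value. Unlike the other header fields, it must be treated as signed. The frame rate can be determined this way: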
 if (FrameRate > 0)
   fps = 1000 / FrameRate;
 else if (FrameRate < 0)
   fps = 100000 / (-FrameRate);
 else
   fps = 10;
  • Flags - Only bits 0, 1 and 2 are used so far:
 Flag bits
 ---------
 0 - set to 1 if file contains a ring frame.
 1 - set to 1 if file is Y-interlaced, meaning that the frame should be
     scaled to twice its height before it is displayed.
 2 - set to 1 if file is Y-doubled, meaning that the frame should be
     scaled to twice its height before it is displayed.
  • AudioSize - Size of the largest unpacked audio data buffer in bytes; this is provided for up to 7 audio tracks.
  • TreesSize - Total size in bytes of HuffmanTrees stored in file.
  • MMap_Size, MClr_Size, Full_Size, Type_Size - Allocation size of corresponding Huffman table in memory.
  • AudioRate - Frequency and format information for each sound track, up to 7 audio tracks. The 32 constituent bits have the following meaning:
    • bit 31 - data is compressed
    • bit 30 - indicates that audio data is present for this track
    • bit 29 - 1 = 16-bit audio; 0 = 8-bit audio
    • bit 28 - 1 = stereo audio; 0 = mono audio
    • bits 27-26 - if both set to zero - use v2 sound decompression
    • bits 25-24 - unused
    • bits 23-0 - audio sample rate
  • Dummy - Unused.
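
The field descriptions above can be sketched in C roughly as follows. This is only an illustration: the names smk_le32, smk_fps, SmkAudioInfo and smk_audio_info are invented for this example, and returning the frame rate as a floating-point number is an assumption, since the pseudocode above does not say whether the division is integral.

```c
#include <stdint.h>

/* Read a little-endian dword from a byte buffer. */
static uint32_t smk_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Frames per second from the FrameRate field, read as a signed value. */
static double smk_fps(int32_t frame_rate)
{
    if (frame_rate > 0)
        return 1000.0 / frame_rate;
    if (frame_rate < 0)
        return 100000.0 / -(double)frame_rate;
    return 10.0;
}

/* Hypothetical container for the decoded AudioRate bits. */
typedef struct {
    int      compressed; /* bit 31 */
    int      present;    /* bit 30 */
    int      is16bit;    /* bit 29 */
    int      stereo;     /* bit 28 */
    uint32_t rate;       /* bits 23-0 */
} SmkAudioInfo;

static SmkAudioInfo smk_audio_info(uint32_t audio_rate)
{
    SmkAudioInfo a;
    a.compressed = (int)((audio_rate >> 31) & 1u);
    a.present    = (int)((audio_rate >> 30) & 1u);
    a.is16bit    = (int)((audio_rate >> 29) & 1u);
    a.stereo     = (int)((audio_rate >> 28) & 1u);
    a.rate       = audio_rate & 0x00FFFFFFu;
    return a;
}
```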

FrameSizes

Following the header is an array holding the size (length) of each physical frame in the file. Each size is a 32-bit number stored in little-endian format. The two lowest bits carry flags rather than size information: if bit 0 is 1, the frame is a keyframe; bit 1 is also used, for some undetermined purpose. NB: You DO NOT need to shift these bits out to get the proper length, but you do need to clear both of them.
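
A minimal sketch of splitting one FrameSizes entry into its length and keyframe flag, assuming the raw dword has already been read in little-endian order. SmkFrameEntry and smk_frame_entry are names made up for this example.

```c
#include <stdint.h>

/* Hypothetical holder for one decoded FrameSizes entry. */
typedef struct {
    uint32_t size;     /* frame length in bytes, low two bits cleared */
    int      keyframe; /* non-zero if bit 0 of the raw value was set */
} SmkFrameEntry;

static SmkFrameEntry smk_frame_entry(uint32_t raw)
{
    SmkFrameEntry e;
    e.keyframe = (int)(raw & 1u);
    e.size = raw & ~(uint32_t)3; /* mask, do not shift */
    return e;
}
```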

FrameTypes

The FrameTypes section of a Smacker file contains an array of bytes, where the 8 bits of each byte describe the contents of the corresponding frame. The 8 bits have the following meaning when set:

  • 7 - frame contains audio data corresponding to track 6
  • 6 - frame contains audio data corresponding to track 5
  • 5 - frame contains audio data corresponding to track 4
  • 4 - frame contains audio data corresponding to track 3
  • 3 - frame contains audio data corresponding to track 2
  • 2 - frame contains audio data corresponding to track 1
  • 1 - frame contains audio data corresponding to track 0
  • 0 - frame contains a palette record
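
The bit assignments above could be tested with two small helpers like these. The function names are illustrative only; track numbers are assumed to run from 0 to 6 as in the list.

```c
#include <stdint.h>

/* Bit 0: the frame starts with a palette record. */
static int smk_frame_has_palette(uint8_t type)
{
    return type & 1;
}

/* Bits 1..7: audio data for tracks 0..6. 'track' must be in 0..6. */
static int smk_frame_has_audio(uint8_t type, int track)
{
    return (type >> (track + 1)) & 1;
}
```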

HuffmanTrees

The HuffmanTrees section contains Huffman tree data for each decoding table. The tables are stored in this order:

  • MMap
  • MClr
  • Full
  • Type

Each tree consists of four parts:

  • Huffman tree for low byte values
  • Huffman tree for high byte values
  • 3 escape codes
  • Huffman tree data

The first two trees are decoded as described in section Packed Huffman Trees. The actual tree data is decoded in mostly the same way except that a leaf value is 16 bits long and decoded from two Huffman trees (low byte first). You should also store pointers to nodes with values equal to escape values.

FramesData

The FramesData section contains the data for each frame. Additionally, each frame can be subdivided into multiple chunks and stored in this order:

  • new palette
  • audio stream(s)
  • video data

Palette Chunk

A palette chunk contains palette change information. Note that Smacker files use a 256-entry RGB palette in which each component is 8 bits wide, i.e., individual red, green, and blue palette components have a possible range of 0..255.

The layout of a palette chunk is:

byte Length;
byte Blocks[];
  • Length - Size of the entire palette chunk, including this length byte, divided by 4.
  • Blocks - One or more blocks of palette data.

Each block may be either 1, 2, or 3 bytes long, depending on the top one or two bits of the first byte. These are the 3 possible cases:

  • 1ccccccc: The top bit of the first byte in the block is 1. Copy next (c + 1) color entries of the previous palette to the next entries of the new palette.
  • 01cccccc, ssssssss: The top 2 bits of the first byte in the block are 01. Copy (c + 1) color entries of the previous palette, starting from entry (s) to the next entries of the new palette.
  • 00bbbbbb, 00gggggg, 00rrrrrr: The top 2 bits of the first byte in the block are 00. Use (b, g, r) to create the next entry of the new palette. Note, that each component in this case is only 6 bits long. You need to upscale them to the full 8-bit values using the following lookup table:
 unsigned char palmap[64] = {
   0x00, 0x04, 0x08, 0x0C, 0x10, 0x14, 0x18, 0x1C,
   0x20, 0x24, 0x28, 0x2C, 0x30, 0x34, 0x38, 0x3C,
   0x41, 0x45, 0x49, 0x4D, 0x51, 0x55, 0x59, 0x5D,
   0x61, 0x65, 0x69, 0x6D, 0x71, 0x75, 0x79, 0x7D,
   0x82, 0x86, 0x8A, 0x8E, 0x92, 0x96, 0x9A, 0x9E,
   0xA2, 0xA6, 0xAA, 0xAE, 0xB2, 0xB6, 0xBA, 0xBE,
   0xC3, 0xC7, 0xCB, 0xCF, 0xD3, 0xD7, 0xDB, 0xDF,
   0xE3, 0xE7, 0xEB, 0xEF, 0xF3, 0xF7, 0xFB, 0xFF
 };

Keep parsing palette blocks until all 256 colors of the new palette have been filled, or until the palette chunk has been depleted.
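
The block rules above can be sketched as the following parser. It is an illustration under several assumptions: palettes are kept as flat 768-byte arrays with the three components of each entry stored in the order they appear in the stream, the function names are invented, and the error handling (returning -1) is a choice made for this example. The helper smk_expand6 computes (v << 2) | (v >> 4), which gives the same values as the palmap table for all 64 inputs.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Expand a 6-bit component to 8 bits; same result as palmap[v]. */
static uint8_t smk_expand6(uint8_t v)
{
    v &= 0x3F;
    return (uint8_t)((v << 2) | (v >> 4));
}

/* 'chunk' points at the Length byte. 'prev' and 'next' each hold
 * 256 entries of 3 bytes. Returns the number of entries filled,
 * or -1 if the chunk is truncated or overruns the palette. */
static int smk_parse_palette(const uint8_t *chunk, size_t avail,
                             const uint8_t *prev, uint8_t *next)
{
    size_t size, pos = 1;
    int filled = 0;

    if (avail < 1)
        return -1;
    size = (size_t)chunk[0] * 4; /* Length is stored divided by 4 */
    if (size > avail)
        return -1;

    while (filled < 256 && pos < size) {
        uint8_t b0 = chunk[pos++];
        if (b0 & 0x80) {                 /* 1ccccccc */
            int count = (b0 & 0x7F) + 1;
            if (filled + count > 256)
                return -1;
            memcpy(next + filled * 3, prev + filled * 3, (size_t)count * 3);
            filled += count;
        } else if (b0 & 0x40) {          /* 01cccccc, ssssssss */
            int count = (b0 & 0x3F) + 1;
            int start;
            if (pos >= size)
                return -1;
            start = chunk[pos++];
            if (filled + count > 256 || start + count > 256)
                return -1;
            memcpy(next + filled * 3, prev + start * 3, (size_t)count * 3);
            filled += count;
        } else {                         /* three 6-bit components */
            if (pos + 2 > size)
                return -1;
            next[filled * 3 + 0] = smk_expand6(b0);
            next[filled * 3 + 1] = smk_expand6(chunk[pos++]);
            next[filled * 3 + 2] = smk_expand6(chunk[pos++]);
            filled++;
        }
    }
    return filled;
}
```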

Audio Track Chunk

An audio track chunk contains a portion of audio data corresponding to a single track. A Smacker file can contain up to 7 audio tracks and a single frame may have several audio track chunks. The layout of a single audio track chunk is:

 dword Length;
 dword UnpackedLength; (optional)
 byte Data[];
  • Length - Size of the entire chunk, including this length field.
  • UnpackedLength - Size of decompressed Data. This field is present only if the corresponding AudioRate header field indicates that this track uses compressed data.
  • Data - Audio data, encoded as specified by corresponding AudioRate header field.

Video Chunk

A video chunk contains compressed video data for the remainder of a frame.

Bit Streams

HuffmanTrees, video data, and audio data are stored and accessed as bit streams. Bits are read starting from the least significant bit of each new byte. Thus, given the byte stream 0x5C, 0x96, 0xEF, sequentially reading 5, 6 and 7 bits yields 0x1C, 0x32, 0x72.
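
A small bit reader following this convention might look like the sketch below. SmkBits, smk_get_bit and smk_get_bits are hypothetical names, and returning zero bits past the end of the buffer is a simplification chosen for this example.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical bit reader state: buffer plus absolute bit position. */
typedef struct {
    const uint8_t *data;
    size_t size;   /* buffer size in bytes */
    size_t bitpos; /* index of the next bit to read */
} SmkBits;

/* Read one bit, least significant bit of each byte first. */
static unsigned smk_get_bit(SmkBits *b)
{
    unsigned bit;
    if (b->bitpos >= b->size * 8)
        return 0; /* past the end: yield zeros in this sketch */
    bit = (b->data[b->bitpos >> 3] >> (b->bitpos & 7)) & 1u;
    b->bitpos++;
    return bit;
}

/* Read n bits (n <= 32); the first bit read becomes bit 0 of the result. */
static uint32_t smk_get_bits(SmkBits *b, int n)
{
    uint32_t v = 0;
    for (int i = 0; i < n; i++)
        v |= (uint32_t)smk_get_bit(b) << i;
    return v;
}
```

Reading 5, 6 and 7 bits from the bytes 0x5C, 0x96, 0xEF with this reader gives 0x1C, 0x32 and 0x72, matching the example above.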

Packed Huffman Trees

Huffman trees are stored in the bitstream in a compressed format. Basic stream layout:

  Tag
  Flag[, Leaf][, Flag[, Leaf]][, ...]
  Tag  - Single bit, indicates that tree is present.
  Flag - Single bit, indicates whether tree entry is a Node (1) or Leaf (0).
  Leaf - If Flag bit is zero, then next 8 bits or variable size field follow,
         representing a tree leaf value.

Reconstructing the Tree

Before trees can be used, they need to be unpacked into an easily seekable form. Reconstruction algorithm ('stream' denotes packed tree, 'tree' - unpacked):

  1: Read Tag
  2: If Tag is zero, finish
  3: Read Flag
  4a: If Flag is non-zero:
    5a: Remember current tree node
    5b: Advance to its '0' branch
    5c: Repeat recursively from step 3 (one level down)
  4b: If flag is zero:
    5a: Read Leaf from stream
    5b: Assign Leaf value to current node (convert it to leaf)
  6: If no node was previously remembered, finish (one level up)
  7: Use the '1' branch of the node remembered in step 5a (of case 4a) as the current tree position
  8: Repeat from step 3

Suppose we have the following packed tree stream (these are not bytes, just a sequence of codes!):

  1, 1, 1, 1, 0, 3, 1, 0, 4, 0, 5, 0, 6, 1, 0, 7, 0, 8

Decompressing the tree, we'll get:

                               <--'0'- -'1'-->
                                     ( )
                                    /  \
                                   /   ( )
                                  /   /   \
                                ( ) (7)   (8)
                               /   \
                             ( )   (6)
                            /   \
                          (3)   ( )
                               /   \
                             (4)   (5)
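
The reconstruction can be sketched recursively in C. To stay close to the example, this sketch consumes an array of codes (tag, flags and leaf values) rather than a real bit stream; the types SmkNode and SmkTree, the function names, and the fixed node limit are all assumptions made for illustration.

```c
#include <stddef.h>

#define SMK_MAX_NODES 64

typedef struct {
    int is_leaf;
    int value;    /* meaningful only for leaves */
    int child[2]; /* node indices of the '0' and '1' branches */
} SmkNode;

typedef struct {
    SmkNode nodes[SMK_MAX_NODES];
    int count;
} SmkTree;

/* Unpack one subtree starting at codes[*pos]; returns its node index or -1. */
static int smk_unpack_node(SmkTree *t, const int *codes, size_t n, size_t *pos)
{
    int idx, c0, c1;
    if (*pos >= n || t->count >= SMK_MAX_NODES)
        return -1;
    idx = t->count++;
    if (codes[(*pos)++] == 0) {          /* Flag 0: a Leaf value follows */
        if (*pos >= n)
            return -1;
        t->nodes[idx].is_leaf = 1;
        t->nodes[idx].value = codes[(*pos)++];
        t->nodes[idx].child[0] = t->nodes[idx].child[1] = -1;
        return idx;
    }
    t->nodes[idx].is_leaf = 0;           /* Flag 1: node with two branches */
    t->nodes[idx].value = 0;
    c0 = smk_unpack_node(t, codes, n, pos); /* '0' branch first */
    if (c0 < 0)
        return -1;
    c1 = smk_unpack_node(t, codes, n, pos); /* then the '1' branch */
    if (c1 < 0)
        return -1;
    t->nodes[idx].child[0] = c0;
    t->nodes[idx].child[1] = c1;
    return idx;
}

/* Returns the root index, or -1 if the Tag is zero or the data is bad. */
static int smk_unpack_tree(SmkTree *t, const int *codes, size_t n)
{
    size_t pos = 0;
    t->count = 0;
    if (n == 0 || codes[pos++] == 0)
        return -1;
    return smk_unpack_node(t, codes, n, &pos);
}

/* Follow branch choices from the root; returns the leaf value or -1. */
static int smk_walk(const SmkTree *t, int root, const int *path, size_t len)
{
    int cur = root;
    for (size_t i = 0; i < len && !t->nodes[cur].is_leaf; i++)
        cur = t->nodes[cur].child[path[i] & 1];
    return t->nodes[cur].is_leaf ? t->nodes[cur].value : -1;
}
```

With the example stream, the path 0, 0, 0 leads to leaf 3, the path 0, 1 to leaf 6, and the path 1, 1 to leaf 8, as in the diagram.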

Optimized Compression

Smacker uses several techniques to improve the compression ratio. When Huffman codes are used to compress 8-bit values (bytes), each Leaf entry holds the raw 8 bits of data. When Huffman codes are used to compress 16-bit values (even if the actual value requires fewer than 16 bits), the packed tree's Leaves do not contain all 16 bits. Instead, the stored bits are decompressed using the two previously initialized 8-bit Huffman decoders (each with its own tree). The low byte is decompressed first, then the high byte. So when you enter steps 4b-5a of the tree unpacking algorithm, you need to invoke another Huffman decoder twice to read the full 16-bit Leaf value.

Another optimization technique employed is semi-dynamic trees. When a code unpacked from a tree differs from the previously unpacked one, it is moved, together with the two other most recent codes, to the shortest branches of the tree. The shortest branches are explicitly marked with special Leaf values, which are replaced with zeros during tree reconstruction.

Both optimization methods are used together, so typical tree initialization involves the following steps:

  1a: Read Tag (of main tree)
  1b: If Tag is zero, finish
  2: Read Huffman tree for low bytes
  3: Read Huffman tree for high bytes
  4a: Read 16-bit marker of shortest node #1
  4b: Read 16-bit marker of shortest node #2
  4c: Read 16-bit marker of shortest node #3
  5: Read the rest of the Huffman tree, which contains 16-bit values
  5a: When you read a Leaf and its value matches one of the markers obtained in step
      4, remember the leaf for this marker and set the leaf's value to zero.

Unpacking Data Using Huffman Trees

The unpacking process is fairly simple. Start from the top of the Huffman tree. Read and examine the next bit of the packed bitstream. According to its value, choose either the '0' or '1' branch of the tree. As soon as you encounter a leaf, the unpacking is finished and a value is obtained. Otherwise, repeat bit reading and tree branching. If a dynamic tree is used, compare the unpacked value with the previously unpacked one. If the value is different, move the two previously remembered values to shortest branches #2 and #3, and the newly unpacked value to #1.
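
The dynamic-tree update described here can be modelled as a three-entry list of recent values. This is only a sketch of the bookkeeping: SmkRecent and smk_recent_update are invented names, slot 0 stands for shortest branch #1, and how the slots are tied to actual tree leaves is left out.

```c
#include <stdint.h>

/* Hypothetical model of the three shortest-branch leaves:
 * slot[0] is branch #1, slot[1] is #2, slot[2] is #3. */
typedef struct {
    uint16_t slot[3];
} SmkRecent;

/* Call after each value is unpacked from a dynamic tree. */
static void smk_recent_update(SmkRecent *r, uint16_t value)
{
    if (value == r->slot[0])
        return;                 /* same as the previous value: no change */
    r->slot[2] = r->slot[1];    /* old #2 moves to #3 */
    r->slot[1] = r->slot[0];    /* old #1 moves to #2 */
    r->slot[0] = value;         /* new value becomes #1 */
}
```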

Unpacking Audio Track

Depending on AudioRate flags, sound can be stored in either uncompressed or packed form.

Uncompressed data is stored as raw PCM samples.

Compressed audio may be stored in Huffman-packed DPCM format or, when perceptual coding is used (that is, Bink Audio), as RLE-packed coefficients.

Huffman DPCM

DPCM-packed stream has the following format:

DataPresent, IsStereo, Is16Bits, Trees[], Bases[], Data[]
  • DataPresent - Bit, indicates sound presence
  • IsStereo - Bit, indicates mono or stereo sound
  • Is16Bits - Bit, indicates 8-bit or 16-bit samples
  • Trees - 8-bit Huffman trees, one tree per byte of an unpacked sample (i.e. a single tree for 8-bit mono sound, 4 trees for 16-bit stereo sound).
  • Bases - One to four 8-bit starting values, one for each of the sample's bytes
  • Data - Huffman-packed stream of 8-bit deltas for each byte of the samples

Obviously, Is16Bits and IsStereo must match the flags of the AudioRate field.

Decompression steps:

  1. Read DataPresent; if it is zero, finish
  2. Read IsStereo
  3. Read Is16Bits
  4. Read the Trees; their number depends on the resulting sample size (see above)
  5. Read the Bases; their number depends on the resulting sample size (see above). If the sample is 16 bits wide, its high base byte is stored first. The right-channel bases come first, then the left-channel one(s).
  6. Output the Base bytes as the first unpacked sample.
  7. For every byte of the sample, Huffman-decompress a delta from Data using that byte's Tree and add it to the Base byte. If the sample is 16 bits wide, the low byte is stored first, then the high byte; the left channel comes first, then the right (the opposite of step 5). When you unpack a 16-bit sample, take into account a possible overflow of the low byte and adjust the high byte accordingly.
  8. Repeat step 7 until all data is decompressed (UnpackedLength bytes of samples processed)
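
The carry handling mentioned in step 7 can be illustrated as below, assuming the running 16-bit sample is kept as separate low and high bytes and that both deltas have already been Huffman-decoded. The function names are hypothetical. Adding the two bytes with a carry is equivalent to one 16-bit addition modulo 65536.

```c
#include <stdint.h>

/* Add per-byte deltas to a 16-bit sample stored as two bytes,
 * propagating the overflow of the low byte into the high byte. */
static void smk_dpcm16_step(uint8_t *lo, uint8_t *hi,
                            uint8_t delta_lo, uint8_t delta_hi)
{
    unsigned sum_lo = (unsigned)*lo + delta_lo;
    *lo = (uint8_t)sum_lo;
    *hi = (uint8_t)(*hi + delta_hi + (sum_lo >> 8)); /* add the carry */
}

/* Combine the two bytes into one 16-bit value. */
static uint16_t smk_sample16(uint8_t lo, uint8_t hi)
{
    return (uint16_t)(lo | (hi << 8));
}
```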

Unpacking Video

The video in a Smacker file is encoded as a series of 4x4 pixel blocks. The image is decoded left to right, top to bottom. If the original image size is not divisible by 4, it is padded up to the next block boundary.

Block Types

  • Mono Block (0) - The whole block contains pixels of only two colors, plus a special map of their order.
  • Full Block (1) - Normal full-color block. v4 may perform extra compression of this block.
  • Void Block (2) - Empty (skip) block; indicates that the whole block is unchanged from the previous frame.
  • Solid Block (3) - The whole block is filled with a single color.

Mono block colors, Mono block maps and Full blocks each have their own 16-bit Huffman table, stored at the beginning of the file (MClr, MMap and Full respectively).

The compressed frame stream consists of packed block Type descriptors and the actual block data. Another 16-bit Huffman table (Type), stored alongside the other three, is used to decompress the Type descriptors.

Type descriptors have the following format in bits:

       1..0 - Type of block
       7..2 - Blocks chain length. Not a real length, but index in the table:
                unsigned int sizetable[64] = {
                  1,    2,    3,    4,    5,    6,    7,    8,
                  9,   10,   11,   12,   13,   14,   15,   16,
                 17,   18,   19,   20,   21,   22,   23,   24,
                 25,   26,   27,   28,   29,   30,   31,   32,
                 33,   34,   35,   36,   37,   38,   39,   40,
                 41,   42,   43,   44,   45,   46,   47,   48,
                 49,   50,   51,   52,   53,   54,   55,   56,
                 57,   58,   59,  128,  256,  512, 1024, 2048};
      15..8 - Extra data, mainly contains fill color for Solid block.
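
Splitting a decoded Type descriptor into its three fields could look like this. SmkBlockDesc and smk_block_desc are names chosen for this sketch; the table is the sizetable given above.

```c
#include <stdint.h>

static const unsigned smk_sizetable[64] = {
     1,  2,  3,   4,   5,   6,    7,    8,
     9, 10, 11,  12,  13,  14,   15,   16,
    17, 18, 19,  20,  21,  22,   23,   24,
    25, 26, 27,  28,  29,  30,   31,   32,
    33, 34, 35,  36,  37,  38,   39,   40,
    41, 42, 43,  44,  45,  46,   47,   48,
    49, 50, 51,  52,  53,  54,   55,   56,
    57, 58, 59, 128, 256, 512, 1024, 2048
};

/* Hypothetical holder for a decoded Type descriptor. */
typedef struct {
    unsigned type;  /* bits 1..0: block type */
    unsigned run;   /* chain length looked up via bits 7..2 */
    unsigned extra; /* bits 15..8: e.g. fill color for Solid blocks */
} SmkBlockDesc;

static SmkBlockDesc smk_block_desc(uint16_t v)
{
    SmkBlockDesc d;
    d.type  = v & 3u;
    d.run   = smk_sizetable[(v >> 2) & 0x3Fu];
    d.extra = (unsigned)(v >> 8);
    return d;
}
```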


Frame painting steps:

  1. Read Type descriptor
  2. Draw single block selected by Type bits
  3. If whole image drawn, finish
  4. Repeat from step 2 until whole chain complete
  5. Repeat from step 1

Mono Block

A Mono block contains 16 pixels of two colors. First, unpack the pixel colors from the stream using the MClr Huffman table. The high byte of the unpacked value contains color1 and the low byte contains color0. Next, decode the pixel map using the MMap table. For each bit of the map, starting from the least significant bit, paint the pixel using color0 if the bit is zero, or color1 if the bit is set. Repeat until all 16 pixels are painted.
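
A sketch of painting one Mono block, given the two values already decoded from the MClr and MMap tables. It assumes that map bit k corresponds to pixel k counted left to right, top to bottom within the block, and that the low byte of the color value is used for zero bits; smk_paint_mono and its parameters are illustrative names.

```c
#include <stdint.h>

/* dst points at the block's top-left pixel; stride is the image row pitch. */
static void smk_paint_mono(uint8_t *dst, int stride,
                           uint16_t colors, uint16_t map)
{
    uint8_t color0 = (uint8_t)(colors & 0xFF); /* used for 0 bits */
    uint8_t color1 = (uint8_t)(colors >> 8);   /* used for 1 bits */

    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            dst[y * stride + x] = (map & 1) ? color1 : color0;
            map >>= 1; /* next pixel uses the next higher bit */
        }
    }
}
```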

Full Block

A Full block contains 16 pixels of any color. The format of this block was changed in v4.

v2 Full Block

Decompress a 16-bit value and use it to draw pixels 3 and 4: the low byte is pixel 3 and the high byte is pixel 4. Then decompress the next value and paint pixels 1 and 2 using the same technique. Advance to the next 4 pixels. Repeat these steps to draw all 16 pixels.
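
The pixel order can be sketched as follows, assuming the eight 16-bit values of the block have already been decoded from the Full table in stream order and that "the next 4 pixels" means the next row of the block. smk_paint_full_v2 is a made-up name.

```c
#include <stdint.h>

/* values[] holds the 8 decoded 16-bit codes in the order they were read. */
static void smk_paint_full_v2(uint8_t *dst, int stride,
                              const uint16_t values[8])
{
    for (int y = 0; y < 4; y++) {
        uint8_t *row = dst + y * stride;
        uint16_t first  = values[2 * y];     /* pixels 3 and 4 */
        uint16_t second = values[2 * y + 1]; /* pixels 1 and 2 */
        row[2] = (uint8_t)(first & 0xFF);
        row[3] = (uint8_t)(first >> 8);
        row[0] = (uint8_t)(second & 0xFF);
        row[1] = (uint8_t)(second >> 8);
    }
}
```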

v4 Full Block

A chain of v4 Full blocks is preceded by extra bits that determine the sub-type of all following blocks in the chain. If the first bit is 0, paint the chain of blocks as v2 Full blocks. Otherwise, read the next bit. If it is zero, paint a chain of Double-blocks; if one, paint a chain of Half-blocks.

v4 Double Block

Decompress a 16-bit value and use it to draw the first two 2x2 subblocks (left to right, top to bottom), filling each whole subblock with a single color (i.e. the low byte is used for pixels 1, 2, 5, 6 and the high byte for pixels 3, 4, 7, 8). Advance to the next two subblocks and repeat to paint all 16 pixels.
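
A sketch of the Double block fill, assuming the two 16-bit values of the block have already been decoded in stream order. smk_paint_double_v4 is an illustrative name.

```c
#include <stdint.h>

/* values[0] covers the top two rows, values[1] the bottom two rows. */
static void smk_paint_double_v4(uint8_t *dst, int stride,
                                const uint16_t values[2])
{
    for (int half = 0; half < 2; half++) {
        uint8_t left  = (uint8_t)(values[half] & 0xFF); /* left 2x2 subblock */
        uint8_t right = (uint8_t)(values[half] >> 8);   /* right 2x2 subblock */
        for (int y = 0; y < 2; y++) {
            uint8_t *row = dst + (half * 2 + y) * stride;
            row[0] = row[1] = left;
            row[2] = row[3] = right;
        }
    }
}
```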

v4 Half Block

Perform steps similar to the v2 Full block, but instead of drawing each even line, just duplicate it from the previous line.

Void Block

All 16 pixels in the block are unchanged from the previous frame.

Solid Block

Fill the whole block using the color obtained as the Extra value during Type descriptor decoding.