Difference between revisions of "Smacker"

From MultimediaWiki

Revision as of 18:38, 28 June 2006

Smacker is a technology that has been used in games and other entertainment software titles since the dawn of multimedia-era video games. The Smacker website claims that the format has been used for over 2600 titles.

Smacker files contain Smacker Video and one of a few different custom audio codecs.

Container Format

These are the data conventions used in this description:

  • All multi-byte numbers are stored in little-endian (Intel) format.
  • byte - 8-bit value
  • word - 16-bit value
  • dword - 32-bit value
  • All values are unsigned, unless stated otherwise.

This is the general layout of a Smacker file:

 struct Header
 dword  FrameSizes[];
 byte   FrameTypes[];
 byte   HuffmanTrees[];
 byte   FramesData[];

Header

General file description header. Total size is 0x68 bytes.

  dword Signature;
  dword Width;
  dword Height;
  dword Frames;
  dword FrameRate;
  dword Flags;
  dword AudioSize[7];
  dword TreesSize;
  dword MMap_Size;
  dword MClr_Size;
  dword Full_Size;
  dword Type_Size;
  dword AudioRate[7];
  dword Dummy;
  • Signature - File signature. Either "SMK2" for original Smacker or "SMK4" for latest revisions.
  • Width, Height - Frame dimensions in pixels.
  • Frames - Number of logical frames. The file may contain an extra "ring" frame, which is not counted here.
  • FrameRate - Despite the general rule above, this field is treated as a signed value. The frame rate can be determined this way:
 if (FrameRate > 0)
   fps = 1000 / FrameRate;
 else if (FrameRate < 0)
   fps = 100000 / (-FrameRate);
 else
   fps = 10;
  • Flags - Only bits 0, 1 and 2 are used so far:
 Flag bits
 ---------
 0 - set to 1 if file contains a ring frame.
 1 - set to 1 if file is Y-interlaced, meaning that the frame should be
     scaled to twice its height before it is displayed.
 2 - set to 1 if file is Y-doubled, meaning that the frame should be
     scaled to twice its height before it is displayed.
  • AudioSize - Size in bytes of the largest unpacked audio data for each track.
  • TreesSize - Total size in bytes of HuffmanTrees stored in file.
  • MMap_Size, MClr_Size, Full_Size, Type_Size - Allocation size of corresponding Huffman table in memory.
  • AudioRate - Frequency and format of each sound track.
    • The lower 3 bytes contain the frequency; the upper byte contains flag bits:
    • 1..0 - unused
    • 3..2 - if both set to zero - use v2 sound decompression
    • 4 - stereo sound
    • 5 - 16bits sound
    • 6 - indicates data presence
    • 7 - data is compressed
  • Dummy - Unused.
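
The AudioRate layout above can be split with a few shifts and masks. The following is a minimal sketch, assuming the listed bit numbers refer to the upper byte of the dword; the type SmkAudioInfo and the function smk_parse_audio_rate are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type; field names are illustrative only. */
typedef struct {
    uint32_t frequency;   /* lower 3 bytes of AudioRate */
    bool     v2_decomp;   /* flag bits 3..2 are both zero */
    bool     stereo;      /* flag bit 4 */
    bool     is16bit;     /* flag bit 5 */
    bool     present;     /* flag bit 6 */
    bool     compressed;  /* flag bit 7 */
} SmkAudioInfo;

/* Split one AudioRate dword into frequency and flag bits. */
static SmkAudioInfo smk_parse_audio_rate(uint32_t audio_rate)
{
    SmkAudioInfo info;
    uint8_t flags = (uint8_t)(audio_rate >> 24);   /* upper byte */

    info.frequency  = audio_rate & 0x00FFFFFFu;
    info.v2_decomp  = (flags & 0x0C) == 0;
    info.stereo     = (flags & 0x10) != 0;
    info.is16bit    = (flags & 0x20) != 0;
    info.present    = (flags & 0x40) != 0;
    info.compressed = (flags & 0x80) != 0;
    return info;
}
```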

FrameSizes

Array of sizes (lengths) of each physical frame in the file. Additionally, bit 0 of each size indicates whether the frame is a keyframe. NB: You DON'T need to shift this bit out to get the length.

FrameTypes

Array of flags, describing contents of corresponding frame data.

   Flags bits
   ----------
       0 - frame contains a palette record
    7..1 - frame contains a sound track. Each bit indicates corresponding
           track presence (lower bits - lower tracks).
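
As a small illustration of the flag bits above, the checks below test for a palette record and for a given sound track. This is only a sketch: the function names are hypothetical, and numbering the tracks from 0 (so that track 0 maps to bit 1) is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0: the frame contains a palette record. */
static bool smk_frame_has_palette(uint8_t frame_type)
{
    return (frame_type & 0x01) != 0;
}

/* Bits 7..1: one bit per sound track; track 0 is assumed to be bit 1. */
static bool smk_frame_has_track(uint8_t frame_type, unsigned track)
{
    if (track > 6)
        return false;               /* only 7 tracks exist */
    return ((frame_type >> (track + 1)) & 1u) != 0;
}
```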


HuffmanTrees

Packed Huffman tree data for each of the decoding tables: first the MMap tree, then the MClr, Full, and Type trees.

Each tree consists of four parts: a Huffman tree for low byte values, a Huffman tree for high byte values, three escape codes, and the actual tree data.

The first two trees are decoded as described in the section "Packed Huffman Trees". The actual tree is decoded in mostly the same way, except that each leaf value is 16 bits wide and is decoded using the two Huffman trees (low byte first). You should also store pointers to the nodes whose values are equal to the escape values.

FramesData

Stream of frames. The data for each frame is further subdivided into chunks by type: palette, audio stream(s), and video data, in that order.

Palette Chunk

Contains palette change information.

byte Length;
byte Blocks[];

Length - Size of the following data, including this byte, divided by 4.

Blocks - One or more blocks of palette data.

Each Block may be 1, 2, or 3 bytes long, depending on the top two bits of its first byte. Possible cases:

       1ccccccc           : Copy next (c + 1) color entries of the previous
                            palette to the next entries of the new palette.
       01cccccc, ssssssss : Copy (c + 1) color entries of the previous palette,
                            starting from entry (s) to the next entries of the
                            new palette.
       00bbbbbb, 00gggggg, 00rrrrrr : Make (b, g, r) the next entry of the
                            new palette. Note that the components are only
                            6 bits long; you need to upscale them to real
                            values using the following lookup table:
                            unsigned char palmap[64] = {
                              0x00, 0x04, 0x08, 0x0C, 0x10, 0x14, 0x18, 0x1C,
                              0x20, 0x24, 0x28, 0x2C, 0x30, 0x34, 0x38, 0x3C,
                              0x41, 0x45, 0x49, 0x4D, 0x51, 0x55, 0x59, 0x5D,
                              0x61, 0x65, 0x69, 0x6D, 0x71, 0x75, 0x79, 0x7D,
                              0x82, 0x86, 0x8A, 0x8E, 0x92, 0x96, 0x9A, 0x9E,
                              0xA2, 0xA6, 0xAA, 0xAE, 0xB2, 0xB6, 0xBA, 0xBE,
                              0xC3, 0xC7, 0xCB, 0xCF, 0xD3, 0xD7, 0xDB, 0xDF,
                              0xE3, 0xE7, 0xEB, 0xEF, 0xF3, 0xF7, 0xFB, 0xFF};

Keep parsing Blocks until all 256 colors of the new palette are formed.
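
The block parsing described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference decoder: the function names are hypothetical, palettes are held as flat arrays of 256 three-byte entries stored in the order the components are read, and the 1ccccccc case is assumed to copy from the previous palette at the same position as the current entry of the new palette. The helper smk_upscale6 computes the same values as the palmap table by replicating the top two bits into the low bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Expand a 6-bit component to 8 bits; matches the palmap table values. */
static uint8_t smk_upscale6(uint8_t v)
{
    v &= 0x3F;
    return (uint8_t)((v << 2) | (v >> 4));
}

/* prev and next each hold 256 entries of 3 bytes.
 * Returns 0 on success, -1 if the data is truncated or a copy overflows. */
static int smk_apply_palette(const uint8_t *blocks, size_t size,
                             const uint8_t *prev, uint8_t *next)
{
    size_t pos = 0;
    unsigned idx = 0;                       /* next entry of the new palette */

    while (idx < 256) {
        if (pos >= size)
            return -1;
        uint8_t b0 = blocks[pos++];

        if (b0 & 0x80) {                    /* 1ccccccc */
            unsigned count = (b0 & 0x7Fu) + 1;
            if (idx + count > 256)
                return -1;
            memcpy(next + idx * 3, prev + idx * 3, count * 3);
            idx += count;
        } else if (b0 & 0x40) {             /* 01cccccc, ssssssss */
            unsigned count = (b0 & 0x3Fu) + 1;
            if (pos >= size)
                return -1;
            unsigned start = blocks[pos++];
            if (idx + count > 256 || start + count > 256)
                return -1;
            memcpy(next + idx * 3, prev + start * 3, count * 3);
            idx += count;
        } else {                            /* three 6-bit components */
            if (pos + 2 > size)
                return -1;
            next[idx * 3 + 0] = smk_upscale6(b0);
            next[idx * 3 + 1] = smk_upscale6(blocks[pos++]);
            next[idx * 3 + 2] = smk_upscale6(blocks[pos++]);
            idx++;
        }
    }
    return 0;
}
```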

Audio Track Chunk

Contains a portion of sound data for a single track. Depending on the FrameTypes bits, several audio chunks may belong to a single frame. Layout of a single chunk:

   dword Length;
     dword UnpackedLength;
   byte Data[];
   Length         - Size of the following data, including this counter.
   UnpackedLength - Size of decompressed Data. Present only if the AudioRate
                    bits indicate a compressed data format.
   Data           - Audio data in the format described by the AudioRate entry.


Video Chunk

Compressed video data to the end of Frame.

Bit Streams

HuffmanTrees, video and audio data are stored and accessed via bit streams. Bits are counted from the lowest bit of each new byte, and the first bit read becomes the least significant bit of the result. Thus, given the byte stream 0x5C, 0x96, 0xEF, sequentially reading 5, 6 and 7 bits yields 0x1C, 0x32, 0x72.
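
A bit reader following this rule might look like the sketch below. The names SmkBits, smk_read_bit and smk_read_bits are hypothetical, and returning zero bits past the end of the buffer is a simplification chosen for this sketch rather than documented behaviour.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t size;      /* buffer size in bytes */
    size_t bitpos;    /* absolute position of the next bit */
} SmkBits;

/* Read one bit, starting from the lowest bit of each byte. */
static unsigned smk_read_bit(SmkBits *bs)
{
    if (bs->bitpos >= bs->size * 8)
        return 0;     /* past the end: treated as zero in this sketch */
    unsigned bit = (bs->data[bs->bitpos >> 3] >> (bs->bitpos & 7)) & 1u;
    bs->bitpos++;
    return bit;
}

/* Read up to 32 bits; the first bit read becomes the lowest result bit. */
static uint32_t smk_read_bits(SmkBits *bs, unsigned count)
{
    uint32_t value = 0;
    for (unsigned i = 0; i < count && i < 32; i++)
        value |= (uint32_t)smk_read_bit(bs) << i;
    return value;
}
```

Reading one bit at a time keeps the sketch short; a real decoder would more likely keep a cache of several bytes.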

Packed Huffman Trees

Huffman trees are stored in the bitstream in a compressed format. Basic stream layout:

  Tag
  Flag[, Leaf][, Flag[, Leaf]][, ...]
  Tag  - Single bit, indicates that tree is present.
  Flag - Single bit, indicates whether tree entry is a Node (1) or Leaf (0).
  Leaf - If Flag bit is zero, then next 8 bits or variable size field follow,
         representing a tree leaf value.

Reconstructing the Tree

Before trees can be used, they need to be unpacked into an easily seekable form. Reconstruction algorithm ('stream' denotes packed tree, 'tree' - unpacked):

  1: Read Tag
  2: If Tag is zero, finish
  3: Read Flag
  4a: If Flag is non-zero:
    5a: Remember current tree node
    5b: Advance to its '0' branch
    5c: Repeat recursively from step 3 (one level down)
  4b: If flag is zero:
    5a: Read Leaf from stream
    5b: Assign Leaf value to current node (convert it to leaf)
  6: If no node previously remembered, finish (one level up)
  7: Use node's '1' branch from step 5a as current tree position
  8: Repeat from step 3

Imagine, we have following packed tree stream (these are not bytes, just a sequence of codes!):

  1, 1, 1, 1, 0, 3, 1, 0, 4, 0, 5, 0, 6, 1, 0, 7, 0, 8

Decompressing the tree, we'll get:

                               <--'0'- -'1'-->
                                     ( )
                                    /  \
                                   /   ( )
                                  /   /   \
                                ( ) (7)   (8)
                               /   \
                             ( )   (6)
                            /   \
                          (3)   ( )
                               /   \
                             (4)   (5)
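
The reconstruction steps can also be written recursively. The sketch below works on a sequence of codes like the example above (one int per Tag, Flag or Leaf) rather than on a real bit stream, and all type and function names are hypothetical. The helper smk_walk follows a string of '0' and '1' branch choices from the root, which is convenient for checking the result against the diagram.

```c
#include <stddef.h>

#define SMK_MAX_NODES 64

typedef struct {
    int is_leaf;
    int value;        /* valid when is_leaf is set */
    int child[2];     /* node indices of the '0' and '1' branches */
} SmkNode;

typedef struct {
    SmkNode nodes[SMK_MAX_NODES];
    int count;
} SmkTree;

/* Read one tree entry at codes[*pos]; returns its node index or -1. */
static int smk_build_node(SmkTree *t, const int *codes, size_t n, size_t *pos)
{
    if (*pos >= n || t->count >= SMK_MAX_NODES)
        return -1;
    int idx = t->count++;
    int flag = codes[(*pos)++];

    if (flag) {                                   /* Node: '0' branch first */
        int c0 = smk_build_node(t, codes, n, pos);
        int c1 = smk_build_node(t, codes, n, pos);
        if (c0 < 0 || c1 < 0)
            return -1;
        t->nodes[idx].is_leaf = 0;
        t->nodes[idx].value = 0;
        t->nodes[idx].child[0] = c0;
        t->nodes[idx].child[1] = c1;
    } else {                                      /* Leaf: value follows */
        if (*pos >= n)
            return -1;
        t->nodes[idx].is_leaf = 1;
        t->nodes[idx].value = codes[(*pos)++];
        t->nodes[idx].child[0] = t->nodes[idx].child[1] = -1;
    }
    return idx;
}

/* Returns the root index, or -1 if Tag is zero or the stream is malformed. */
static int smk_build_tree(SmkTree *t, const int *codes, size_t n)
{
    size_t pos = 0;
    t->count = 0;
    if (n == 0 || codes[pos++] == 0)
        return -1;
    return smk_build_node(t, codes, n, &pos);
}

/* Follow a path such as "0010" from the root; returns the leaf value or -1. */
static int smk_walk(const SmkTree *t, int root, const char *path)
{
    int idx = root;
    while (idx >= 0 && !t->nodes[idx].is_leaf && *path)
        idx = t->nodes[idx].child[*path++ - '0'];
    return (idx >= 0 && t->nodes[idx].is_leaf) ? t->nodes[idx].value : -1;
}
```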

Optimized Compression

Smacker uses several techniques to improve compression ratio. When Huffman codes are used to compress 8-bit values (bytes) each Leaf entry holds raw 8 bits of data. When Huffman codes are used to compress 16-bit values (even if the actual value requires fewer than 16 bits), the packed tree's Leaves do not contain all 16 bits. Instead, the stored bits are decompressed using the two previously initialized 8-bit Huffman decoders (each with its own tree). The lower byte is decompressed first, then highest. So when you enter steps 4b-5a of the tree unpacking algorithm, you need to invoke another Huffman decoder twice to read the full 16-bit Leaf value.

Another optimization technique employed is semi-dynamic trees. When a code unpacked from a tree differs from the previously unpacked one, it is moved, together with the two other most recent codes, to the tree's shortest branches. The shortest branches are explicitly marked with special Leaf values, which are replaced with zeros during tree reconstruction.

Both optimization methods are used together, so typical tree initialization involves the following steps:

  1a: Read Tag (of main tree)
  1b: If Tag is zero, finish
  2: Read Huffman tree for low bytes
  3: Read Huffman tree for high bytes
  4a: Read 16-bit marker of shortest node #1
  4b: Read 16-bit marker of shortest node #2
  4c: Read 16-bit marker of shortest node #3
  5: Read the rest of Huffman tree, contains 16-bit values
  5a: When you read a Leaf and its value matches one of the markers obtained in
      step 4, remember the leaf for this marker and set the leaf's value to zero.

Unpacking Data Using Huffman Trees

The unpacking process is fairly simple. Start from the top of the Huffman tree. Read and examine the next bit of the packed bitstream. According to its value, choose either the '0' or '1' branch of the tree. As soon as you encounter a leaf, the unpacking is finished and a value is obtained. Otherwise, repeat the bit reading and tree branching. If a dynamic tree is used, compare the unpacked value with the previously unpacked one. If the value is different, move the two previously remembered values to shortest branches #2 and #3, and the newly unpacked value to #1.
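
The dynamic update at the end of this process can be sketched on its own. In the sketch below, the array recent is a hypothetical stand-in for the values held by the three marked shortest-branch leaves (#1, #2, #3 at indices 0, 1, 2); a real decoder would write to the remembered leaves themselves.

```c
#include <stdint.h>

/* recent[0..2] model the values of shortest branches #1, #2 and #3. */
static void smk_update_recent(uint16_t recent[3], uint16_t value)
{
    if (value != recent[0]) {     /* differs from the previous value */
        recent[2] = recent[1];    /* old #2 moves to #3 */
        recent[1] = recent[0];    /* old #1 moves to #2 */
        recent[0] = value;        /* new value becomes #1 */
    }
}
```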

Unpacking Audio Track

Depending on AudioRate flags, sound can be stored in either uncompressed or packed form.

Uncompressed data is stored as raw PCM samples.

Compressed audio may be stored in Huffman-packed DPCM format or, when perceptual coding is used (that is, Bink Audio), as RLE-packed coefficients.

Huffman DPCM

DPCM-packed stream has the following format:

DataPresent, IsStereo, Is16Bits, Trees[], Bases[], Data[]
  • DataPresent - Bit, indicates sound presence
  • IsStereo - Bit, indicates mono or stereo sound
  • Is16Bits - Bit, indicates 8 or 16 bits samples
  • Trees - 8-bit Huffman trees, one tree for each byte of an unpacked sample (i.e. a single tree for 8-bit mono sound, 4 trees for 16-bit stereo sound).
  • Bases - One to four 8-bit starting values, one for each byte of the sample
  • Data - Huffman-packed stream of 8-bit deltas for each byte of the samples

Obviously, Is16Bits and IsStereo must match flags of AudioRate field.

Decompression steps:

  1. Read DataPresent; if no DataPresent, finish
  2. Read IsStereo
  3. Read Is16Bits
  4. Read number of Trees, according to resulting sample size (see above)
  5. Read number of Bases, according to resulting sample size (see above). If the sample is 16 bits wide, its high base byte is stored first. The right-channel bases come first, then the left-channel one(s).
  6. Output the Base bytes as the first unpacked sample.
  7. For every byte of the sample, Huffman-decompress a delta from Data using its Tree and add it to the Base byte. If the sample is 16 bits wide, the low byte is stored first, then the high byte; the left channel comes first, then the right (the opposite of step 5). When you unpack a 16-bit sample, take into account possible overflow of the low byte and adjust the high byte accordingly.
  8. Repeat step 7 until all data is decompressed (UnpackedLength bytes of samples processed)
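
The byte-wise addition for 16-bit samples can be sketched as below. This assumes that "overflow of the lower byte" means an ordinary carry into the high byte, so the result equals a 16-bit addition modulo 65536; the function name smk_dpcm_add16 is hypothetical.

```c
#include <stdint.h>

/* Add the two delta bytes to the previous 16-bit sample, propagating the
 * carry from the low byte into the high byte (assumed interpretation). */
static uint16_t smk_dpcm_add16(uint16_t prev, uint8_t delta_lo, uint8_t delta_hi)
{
    unsigned lo = (prev & 0xFFu) + delta_lo;
    unsigned hi = ((unsigned)prev >> 8) + delta_hi + (lo >> 8);
    return (uint16_t)(((hi & 0xFFu) << 8) | (lo & 0xFFu));
}
```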

Unpacking Video

The video in a Smacker is encoded as a series of 4x4 pixel blocks. The image is decoded left to right, top to bottom. If the original image size is not divisible by 4, it is padded up to the next block boundary.

Block Types

  • Mono Block (0) - Whole block contains only pixels of two colors and special map of their order.
  • Full Block (1) - Normal full-color block. v4 may perform extra compression of this block.
  • Void Block (2) - Empty (skip) block, indicates that whole block data is unchanged from previous frame.
  • Solid Block (3) - Whole block is filled with single color.

Mono block colors, Mono block maps and Full blocks each have their own 16-bit Huffman table, stored at the beginning of the file (MClr, MMap and Full respectively).

The compressed frame stream consists of packed block Type descriptors and the actual block data. Another 16-bit Huffman table, also stored at the beginning of the file, is used to decompress the Type descriptors.

Type descriptors have the following format in bits:

       1..0 - Type of block
       7..2 - Blocks chain length. Not a real length, but index in the table:
                unsigned int sizetable[64] = {
                  1,    2,    3,    4,    5,    6,    7,    8,
                  9,   10,   11,   12,   13,   14,   15,   16,
                 17,   18,   19,   20,   21,   22,   23,   24,
                 25,   26,   27,   28,   29,   30,   31,   32,
                 33,   34,   35,   36,   37,   38,   39,   40,
                 41,   42,   43,   44,   45,   46,   47,   48,
                 49,   50,   51,   52,   53,   54,   55,   56,
                 57,   58,   59,  128,  256,  512, 1024, 2048};
      15..8 - Extra data, mainly contains fill color for Solid block.
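
Splitting a decoded Type descriptor into its fields can be sketched as follows. The names are hypothetical; smk_chain_length computes the same values as sizetable instead of storing the table, which is only a convenience for this sketch.

```c
#include <stdint.h>

typedef struct {
    unsigned type;    /* bits 1..0: block type */
    unsigned chain;   /* number of blocks in the chain */
    unsigned extra;   /* bits 15..8: extra data (fill color for Solid) */
} SmkBlockDesc;

/* Same values as sizetable: 1..59 for indices 0..58, then 128..2048. */
static unsigned smk_chain_length(unsigned index)
{
    index &= 0x3Fu;
    return index < 59 ? index + 1 : 128u << (index - 59);
}

static SmkBlockDesc smk_parse_type(uint16_t descriptor)
{
    SmkBlockDesc d;
    d.type  = descriptor & 0x03u;
    d.chain = smk_chain_length((descriptor >> 2) & 0x3Fu);
    d.extra = (unsigned)descriptor >> 8;
    return d;
}
```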


Frame painting steps:

  1. Read Type descriptor
  2. Draw single block selected by Type bits
  3. If whole image drawn, finish
  4. Repeat from step 2 until whole chain complete
  5. Repeat from step 1

Mono Block

A Mono block contains 16 pixels of two colors. First, unpack the pixel colors from the stream using the MClr Huffman table. The high byte of the unpacked value contains color1 and the low byte contains color0. Next, decode the pixel map using the MMap table. For each bit of the map, starting from the least significant bit: if the bit is zero, paint the pixel using color0; if the bit is set, paint it with color1. Repeat until all 16 pixels are painted.
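
Painting a Mono block from already decoded values might look like this sketch. It assumes the low byte of the color value is the color for zero bits, that pixels are visited left to right and top to bottom within the block, and that dst points at the block's top-left pixel in an 8-bit frame buffer with stride bytes per line. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* colors: value decoded with the MClr table; map: value decoded with MMap. */
static void smk_paint_mono(uint8_t *dst, size_t stride,
                           uint16_t colors, uint16_t map)
{
    uint8_t color1 = (uint8_t)(colors >> 8);     /* used for set bits */
    uint8_t color0 = (uint8_t)(colors & 0xFF);   /* used for zero bits */

    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            dst[y * stride + x] = (map & 1u) ? color1 : color0;
            map >>= 1;                           /* next map bit, LSB first */
        }
    }
}
```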

Full Block

Full block contains 16 pixels of any color. Format of this block has been changed in v4.

v2 Full Block

Decompress a 16-bit value and use it to draw pixels 3 and 4: the low byte is pixel 3 and the high byte is pixel 4. Then decompress the next value and paint pixels 1 and 2 using the same technique. Advance to the next 4 pixels. Repeat these steps to draw all 16 pixels.
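
The pixel order of a v2 Full block can be sketched as below. The sketch takes the eight 16-bit values as already decompressed with the Full table, in stream order, and assumes that "the next 4 pixels" means the next line of the block. The function name and the values array are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* values: eight decoded 16-bit values, two per block line, in stream order. */
static void smk_paint_full_v2(uint8_t *dst, size_t stride,
                              const uint16_t values[8])
{
    for (int y = 0; y < 4; y++) {
        uint8_t *row = dst + (size_t)y * stride;
        uint16_t first  = values[2 * y];       /* pixels 3 and 4 */
        uint16_t second = values[2 * y + 1];   /* pixels 1 and 2 */

        row[2] = (uint8_t)(first & 0xFF);
        row[3] = (uint8_t)(first >> 8);
        row[0] = (uint8_t)(second & 0xFF);
        row[1] = (uint8_t)(second >> 8);
    }
}
```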

v4 Full Block

A chain of v4 Full blocks is prepended with extra bits that determine the sub-type of all following blocks in the chain. If the first bit is 0, paint the chain of blocks as v2 Full blocks. Otherwise, read the next bit. If that bit is zero, paint a chain of Double blocks; if it is one, paint a chain of Half blocks.

v4 Double Block

Decompress 16-bits value and use it to draw first two 2x2 subblocks (left to right, top to bottom), filling whole subblock with single color (i.e low byte used for pixels 1, 2, 5, 6 and high byte for pixels 3, 4, 7, 8). Advance to the next two subblocks and repeat again to paint all 16 pixels.
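
A Double block can be sketched in the same style. Following the description above, two decoded 16-bit values cover the whole block: the first fills the upper two 2x2 subblocks and the second the lower two. The function name and the values array are hypothetical, and the values are assumed to be already decompressed.

```c
#include <stddef.h>
#include <stdint.h>

/* values[0] covers the top two 2x2 subblocks, values[1] the bottom two. */
static void smk_paint_double_v4(uint8_t *dst, size_t stride,
                                const uint16_t values[2])
{
    for (int half = 0; half < 2; half++) {
        uint8_t left  = (uint8_t)(values[half] & 0xFF);   /* left subblock */
        uint8_t right = (uint8_t)(values[half] >> 8);     /* right subblock */

        for (int y = 0; y < 2; y++) {
            uint8_t *row = dst + (size_t)(2 * half + y) * stride;
            row[0] = row[1] = left;
            row[2] = row[3] = right;
        }
    }
}
```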

v4 Half Block

Perform steps similar to the v2 Full block, but instead of drawing each even line, just duplicate it from the previous line.

Void Block

All 16 pixels in the block are unchanged from the previous frame.

Solid Block

Fill whole block using color obtained as Extra value during Type descriptor decoding.