Smacker: Difference between revisions

From MultimediaWiki

Revision as of 17:33, 28 March 2006

Smacker is a technology that has been used in games and other entertainment software titles since the dawn of multimedia-era video games. The Smacker website claims that the format has been used for over 2600 titles.

Smacker files contain Smacker Video and one of a few different custom audio codecs.

Container Format

These are the data conventions used in this description:

  • All multi-byte numbers are stored in little-endian (Intel) format.
  • byte - 8-bit value
  • word - 16-bit value
  • dword - 32-bit value
  • All values are unsigned, unless stated otherwise.

This is the general layout of a Smacker file:

 struct Header
 dword  FrameSizes[];
 byte   FrameTypes[];
 byte   HuffmanTrees[];
 byte   FramesData[];

Header

General file description header. Total size is 0x68 bytes.

  dword Signature;
  dword Width;
  dword Height;
  dword Frames;
  dword FrameRate;
  dword Flags;
  dword AudioSize[7];
  dword TreesSize;
  dword MMap_Size;
  dword MClr_Size;
  dword Full_Size;
  dword Type_Size;
  dword AudioRate[7];
  dword Dummy;
  • Signature - File signature. Either "SMK2" for the original Smacker format or "SMK4" for later revisions.
  • Width, Height - Frame dimensions in pixels.
  • Frames - Number of logical frames. The file may contain an extra "ring" frame, which is not counted here.
  • FrameRate - Interpreted as a signed value; the frame rate is derived from it this way:
 if (FrameRate > 0)
   fps = 1000 / FrameRate;
 else if (FrameRate < 0)
   fps = 100000 / (-FrameRate);
 else
   fps = 10;
  • Flags - Only bit 0 is used so far; it is set to 1 if the file contains a ring frame.
  • AudioSize - Size in bytes of the largest unpacked audio data for each track.
  • TreesSize - Total size in bytes of the HuffmanTrees data stored in the file.
  • MMap_Size, MClr_Size, Full_Size, Type_Size - Allocation size of the corresponding Huffman table in memory.
  • AudioRate - Frequency and format of each sound track.
    • The lower 3 bytes contain the frequency; the upper byte holds these flag bits:
    • 1..0 - unused
    • 3..2 - if both are zero, use v2 sound decompression
    • 4 - stereo sound
    • 5 - 16-bit sound
    • 6 - indicates data presence
    • 7 - data is compressed
  • Dummy - Unused.
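
The FrameRate and AudioRate rules above can be sketched in C roughly as follows. The names smk_fps, smk_audio_info and SmkAudioInfo are hypothetical and exist only for this illustration; the sketch assumes the raw FrameRate dword has already been reinterpreted as a signed 32-bit value, and it uses floating-point division where the pseudocode above does not specify the arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: frames per second from the FrameRate field,
 * assuming the field is reinterpreted as a signed 32-bit value. */
static double smk_fps(int32_t frame_rate)
{
    if (frame_rate > 0)
        return 1000.0 / frame_rate;
    if (frame_rate < 0)
        return 100000.0 / -(double)frame_rate;
    return 10.0;
}

/* Hypothetical container for one decoded AudioRate entry. */
typedef struct {
    uint32_t frequency;  /* lower 3 bytes */
    int stereo;          /* flag bit 4 of the upper byte */
    int is16bit;         /* flag bit 5 */
    int present;         /* flag bit 6 */
    int compressed;      /* flag bit 7 */
} SmkAudioInfo;

static SmkAudioInfo smk_audio_info(uint32_t audio_rate)
{
    SmkAudioInfo info;
    uint32_t flags = audio_rate >> 24;      /* upper byte holds the flags */

    info.frequency  = audio_rate & 0x00FFFFFFu;
    info.stereo     = (int)((flags >> 4) & 1u);
    info.is16bit    = (int)((flags >> 5) & 1u);
    info.present    = (int)((flags >> 6) & 1u);
    info.compressed = (int)((flags >> 7) & 1u);
    return info;
}
```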

FrameSizes

Array of sizes (lengths) of each physical frame in the file. Additionally, bit 0 of each size indicates whether the frame is a keyframe. NB: You DON'T need to shift this bit out to get the length.
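
A minimal sketch of reading one FrameSizes entry. The helper names are hypothetical. Clearing the two lowest bits to obtain the length is an assumption made here (the text above documents only bit 0 as a flag); the point it illustrates is that the flag is masked off, not shifted out.

```c
#include <stdint.h>

/* Bit 0 of a FrameSizes entry marks a keyframe. */
static int smk_is_keyframe(uint32_t size_entry)
{
    return (int)(size_entry & 1u);
}

/* Assumption: the two lowest bits are flag bits, so the length is the
 * entry with those bits cleared (masked, not shifted). */
static uint32_t smk_frame_length(uint32_t size_entry)
{
    return size_entry & ~3u;
}
```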

FrameTypes

Array of flags describing the contents of the corresponding frame data.

   Flags bits
   ----------
       0 - frame contains a palette record
    7..1 - frame contains a sound track. Each bit indicates corresponding
           track presence (lower bits - lower tracks).


HuffmanTrees

Packed Huffman tree data for each decoding table: first MMap, then MClr, Full, and Type.

Each tree consists of four parts: a Huffman tree for the low byte values, a Huffman tree for the high byte values, three escape codes, and the actual tree data.

The first two trees are decoded as described in the section "Packed Huffman Trees". The actual tree is decoded in mostly the same way, except that each leaf value is 16 bits wide and is decoded using the two Huffman trees (low byte first). You should also store pointers to the nodes whose values equal the escape codes.

FramesData

Stream of frames. The data for each frame is further subdivided into chunks by type: palette, audio stream(s), and video data, in that order.

Palette Chunk

Contains palette change information.

byte Length;
byte Blocks[];

Length - Size of the following data, including this byte, divided by 4.
Blocks - One or more blocks of palette data.

Each Block is 1, 2, or 3 bytes long, depending on the top two bits of its first byte. Possible cases:

       1ccccccc           : Copy next (c + 1) color entries of the previous
                            palette to the next entries of the new palette.
       01cccccc, ssssssss : Copy (c + 1) color entries of the previous palette,
                            starting from entry (s) to the next entries of the
                            new palette.
       00bbbbbb, 00gggggg, 00rrrrrr : Set (b, g, r) as the next entry of the
                            new palette. Note that the components are only
                            6 bits long; you need to upscale them to real
                            values using the following lookup table:
                            unsigned char palmap[64] = {
                              0x00, 0x04, 0x08, 0x0C, 0x10, 0x14, 0x18, 0x1C,
                              0x20, 0x24, 0x28, 0x2C, 0x30, 0x34, 0x38, 0x3C,
                              0x41, 0x45, 0x49, 0x4D, 0x51, 0x55, 0x59, 0x5D,
                              0x61, 0x65, 0x69, 0x6D, 0x71, 0x75, 0x79, 0x7D,
                              0x82, 0x86, 0x8A, 0x8E, 0x92, 0x96, 0x9A, 0x9E,
                              0xA2, 0xA6, 0xAA, 0xAE, 0xB2, 0xB6, 0xBA, 0xBE,
                              0xC3, 0xC7, 0xCB, 0xCF, 0xD3, 0xD7, 0xDB, 0xDF,
                              0xE3, 0xE7, 0xEB, 0xEF, 0xF3, 0xF7, 0xFB, 0xFF};

Keep parsing Blocks until all 256 colors of the new palette are formed.
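
The block parsing described above might be sketched as follows. The function name and the flat 768-byte palette layout are assumptions of this sketch; each entry's three components are stored in the order they are read from the stream, without interpreting which is which.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scaling table from the text above (6-bit component to 8-bit). */
static const uint8_t palmap[64] = {
    0x00, 0x04, 0x08, 0x0C, 0x10, 0x14, 0x18, 0x1C,
    0x20, 0x24, 0x28, 0x2C, 0x30, 0x34, 0x38, 0x3C,
    0x41, 0x45, 0x49, 0x4D, 0x51, 0x55, 0x59, 0x5D,
    0x61, 0x65, 0x69, 0x6D, 0x71, 0x75, 0x79, 0x7D,
    0x82, 0x86, 0x8A, 0x8E, 0x92, 0x96, 0x9A, 0x9E,
    0xA2, 0xA6, 0xAA, 0xAE, 0xB2, 0xB6, 0xBA, 0xBE,
    0xC3, 0xC7, 0xCB, 0xCF, 0xD3, 0xD7, 0xDB, 0xDF,
    0xE3, 0xE7, 0xEB, 0xEF, 0xF3, 0xF7, 0xFB, 0xFF};

/* Hypothetical helper. `data` points at the Blocks bytes (after Length) and
 * `size` is their byte count. Both palettes are 256 entries of 3 bytes.
 * Returns 0 on success, -1 on truncated or overlong input. */
static int smk_decode_palette(const uint8_t *data, size_t size,
                              const uint8_t *old_pal, uint8_t *new_pal)
{
    size_t pos = 0;
    size_t n = 0;                       /* next entry of the new palette */

    while (n < 256) {
        if (pos >= size)
            return -1;
        uint8_t b0 = data[pos++];
        if (b0 & 0x80) {                /* 1ccccccc: keep entries in place */
            size_t count = (size_t)(b0 & 0x7F) + 1;
            if (n + count > 256)
                return -1;
            memcpy(new_pal + n * 3, old_pal + n * 3, count * 3);
            n += count;
        } else if (b0 & 0x40) {         /* 01cccccc, s: copy from entry s */
            size_t count = (size_t)(b0 & 0x3F) + 1;
            if (pos >= size)
                return -1;
            size_t start = data[pos++];
            if (n + count > 256 || start + count > 256)
                return -1;
            memcpy(new_pal + n * 3, old_pal + start * 3, count * 3);
            n += count;
        } else {                        /* three 6-bit components */
            if (pos + 2 > size)
                return -1;
            new_pal[n * 3 + 0] = palmap[b0 & 0x3F];
            new_pal[n * 3 + 1] = palmap[data[pos++] & 0x3F];
            new_pal[n * 3 + 2] = palmap[data[pos++] & 0x3F];
            n++;
        }
    }
    return 0;
}
```

The bounds checks are an addition of this sketch; the format description itself does not say how a decoder should react to malformed blocks.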

Audio Track Chunk

Contains a portion of sound data for a single track. Depending on the FrameTypes bits, several audio chunks may belong to a single frame. Single chunk layout:

   dword Length;
     dword UnpackedLength;
   byte Data[];
   Length         - Size of the following data, including this counter.
   UnpackedLength - Size of decompressed Data. Present only if the AudioRate bits
                    indicate a compressed data format.
   Data           - Audio data in the format described by the AudioRate entry.


Video Chunk

Compressed video data, running to the end of the frame.

Bit Streams

HuffmanTrees, video and audio data are stored and accessed via bit streams. Bits are counted from the lowest bit of each new byte, and the first bit read becomes the lowest bit of the result. Thus, if we have the stream of bytes 0x5C, 0x96, 0xEF and sequentially read 5, 6 and 7 bits, we'll get this output: 0x1C, 0x32, 0x72.
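
A bit reader following this convention could look like the sketch below. SmkBits, smk_get_bit and smk_get_bits are hypothetical names. Two choices are assumptions of the sketch: the first bit read is placed in bit 0 of the returned value, and reads past the end of the buffer return zero bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit-stream state. */
typedef struct {
    const uint8_t *data;
    size_t size;      /* buffer size in bytes */
    size_t bitpos;    /* absolute index of the next bit */
} SmkBits;

/* Read one bit, starting from the lowest bit of each byte. */
static unsigned smk_get_bit(SmkBits *bs)
{
    if (bs->bitpos >= bs->size * 8)
        return 0;     /* assumption: zeros past the end of the data */
    unsigned bit = (bs->data[bs->bitpos >> 3] >> (bs->bitpos & 7)) & 1u;
    bs->bitpos++;
    return bit;
}

/* Read `count` bits (up to 16 here); earlier bits land in lower positions. */
static unsigned smk_get_bits(SmkBits *bs, int count)
{
    unsigned value = 0;
    for (int i = 0; i < count; i++)
        value |= smk_get_bit(bs) << i;
    return value;
}
```

Reading bit by bit keeps the sketch short; a production reader would normally cache several bytes at a time.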

Packed Huffman Trees

Huffman trees are stored in the bitstream in a compressed format. Basic stream layout:

  Tag
  Flag[, Leaf][, Flag[, Leaf]][, ...]
  Tag  - Single bit, indicates that tree is present.
  Flag - Single bit, indicates whether tree entry is a Node (1) or Leaf (0).
  Leaf - If Flag bit is zero, then next 8 bits or variable size field follow,
         representing a tree leaf value.

Reconstructing the Tree

Before the trees can be used, they need to be unpacked into an easily traversable form. Reconstruction algorithm ('stream' denotes the packed tree, 'tree' the unpacked one):

  1: Read Tag
  2: If Tag is zero, finish
  3: Read Flag
  4a: If Flag is non-zero:
    5a: Remember current tree node
    5b: Advance to its '0' branch
    5c: Repeat recursively from step 3 (one level down)
  4b: If Flag is zero:
    5a: Read Leaf from stream
    5b: Assign Leaf value to current node (convert it to leaf)
  6: If no node previously remembered, finish (one level up)
  7: Use remembered node's '1' branch as current tree position
  8: Repeat from step 3

Imagine we have the following packed tree stream (these are not bytes, just a sequence of codes!):

  1, 1, 1, 1, 0, 3, 1, 0, 4, 0, 5, 0, 6, 1, 0, 7, 0, 8

Decompressing the tree, we'll get:

                               <--'0'- -'1'-->
                                     ( )
                                    /  \
                                   /   ( )
                                  /   /   \
                                ( ) (7)   (8)
                               /   \
                             ( )   (6)
                            /   \
                          (3)   ( )
                               /   \
                             (4)   (5)
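
The reconstruction algorithm can be sketched recursively. To stay close to the example, this sketch takes the packed tree as an array of codes (Tag, Flag and Leaf values as integers) and not as a real bit stream. SmkNode, SmkTree and the function names are hypothetical, and the fixed node limit is an arbitrary choice for the illustration.

```c
#include <stddef.h>

#define SMK_MAX_NODES 64   /* arbitrary limit for this sketch */

typedef struct {
    int is_leaf;
    int value;       /* leaf value, valid when is_leaf is set */
    int child[2];    /* node indices of the '0' and '1' branches */
} SmkNode;

typedef struct {
    SmkNode nodes[SMK_MAX_NODES];
    int count;
} SmkTree;

/* Steps 3-8: read one Flag and build the subtree below it.
 * Returns the index of the new node, or -1 on malformed input. */
static int smk_build_subtree(SmkTree *t, const int *codes, size_t n, size_t *pos)
{
    if (*pos >= n || t->count >= SMK_MAX_NODES)
        return -1;
    int idx = t->count++;
    int flag = codes[(*pos)++];

    t->nodes[idx].value = 0;
    t->nodes[idx].child[0] = t->nodes[idx].child[1] = -1;
    if (flag) {                              /* Node: '0' branch, then '1' */
        t->nodes[idx].is_leaf = 0;
        for (int b = 0; b < 2; b++) {
            int c = smk_build_subtree(t, codes, n, pos);
            if (c < 0)
                return -1;
            t->nodes[idx].child[b] = c;
        }
    } else {                                 /* Leaf: next code is its value */
        if (*pos >= n)
            return -1;
        t->nodes[idx].is_leaf = 1;
        t->nodes[idx].value = codes[(*pos)++];
    }
    return idx;
}

/* Steps 1-2: check the Tag, then build from the root.
 * Returns the root index (0), or -1 if the tree is absent or malformed. */
static int smk_build_tree(SmkTree *t, const int *codes, size_t n)
{
    size_t pos = 0;
    t->count = 0;
    if (n == 0 || codes[pos++] == 0)
        return -1;
    return smk_build_subtree(t, codes, n, &pos);
}

/* Follow a path of branch choices from the root; returns the leaf value,
 * or -1 if the path ends on an inner node. */
static int smk_walk(const SmkTree *t, const int *bits, size_t nbits)
{
    int idx = 0;
    for (size_t i = 0; i < nbits && !t->nodes[idx].is_leaf; i++)
        idx = t->nodes[idx].child[bits[i] & 1];
    return t->nodes[idx].is_leaf ? t->nodes[idx].value : -1;
}
```

Fed with the example stream above, the path '0','1' reaches the leaf 6 and the path '1','0' reaches the leaf 7, matching the diagram.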

Optimized Compression

Several techniques are used to improve the compression ratio. When Huffman coding is used to compress 8-bit values (bytes), each Leaf entry holds 8 raw bits of data. When it is used to compress 16-bit values (even if the actual value takes fewer than 16 bits), the packed tree's Leaf entries do not contain the whole 16 bits. Instead, the stored bits are decompressed using two previously initialized 8-bit Huffman decoders (each with its own tree). The lower byte is decompressed first, then the higher one. So when you reach step 4b-5a of the tree unpacking algorithm, you need to invoke these two Huffman decoders, one after the other, to read the full 16-bit leaf value.

Another optimization technique is the use of semi-dynamic trees. When you unpack a code from the tree and it is not the same as the previously unpacked one, it is moved, together with the two other most recent codes, to the tree's shortest branches. The shortest branches are explicitly marked with special Leaf values, which are replaced with zeros during tree reconstruction.

Both optimization methods are used together, so a typical tree initialization involves the following steps:

  1a: Read Tag (of main tree)
  1b: If Tag is zero, finish
  2: Read Huffman tree for low-bytes
  3: Read Huffman tree for high-bytes
  4a: Read 16-bit marker of shortest node #1
  4b: Read 16-bit marker of shortest node #2
  4c: Read 16-bit marker of shortest node #3
  5: Read the rest of the Huffman tree, which contains 16-bit values
  5a: When you read a Leaf and its value matches one of the markers obtained in
      step 4, remember the leaf for this marker and set the leaf's value to zero.

Unpacking Data Using Huffman

The unpacking process is fairly simple. Starting from the top of the tree, read and examine the next bit of the packed stream. According to its value, follow either the '0' or the '1' branch. As soon as you reach a leaf, stop; otherwise, keep reading bits and navigating. The unpacked data is the leaf's value. If a dynamic tree is used, compare the unpacked value with the previously unpacked one. If it has changed, move the two previously remembered values to shortest branches #2 and #3, and the newly unpacked one to #1.
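
The dynamic-tree update at the end of that procedure can be sketched as a small shuffle of three leaf values. SmkRecent and smk_update_recent are hypothetical names; the sketch assumes leaf values live in a flat array and that the positions of the three marker leaves were remembered during tree reconstruction.

```c
#include <stdint.h>

/* Hypothetical state for one semi-dynamic tree. */
typedef struct {
    uint16_t *values;   /* leaf values, indexed by leaf slot */
    int last[3];        /* slots of the marker leaves #1, #2, #3 */
} SmkRecent;

/* Call after each decoded value `v`. Slot #1 always holds the previously
 * unpacked value, so comparing against it detects a change. */
static void smk_update_recent(SmkRecent *r, uint16_t v)
{
    if (v != r->values[r->last[0]]) {
        r->values[r->last[2]] = r->values[r->last[1]];  /* old #2 moves to #3 */
        r->values[r->last[1]] = r->values[r->last[0]];  /* old #1 moves to #2 */
        r->values[r->last[0]] = v;                      /* new value becomes #1 */
    }
}
```

Because the marker leaves sit on the shortest branches, recently seen values are then reachable with the fewest bits.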

Unpacking Audio Track

Depending on AudioRate flags, sound can be stored in either uncompressed or packed form.

Uncompressed data is stored as raw PCM samples.

Compressed audio may be stored in Huffman-packed DPCM format or, when perceptual coding is used (that is, Bink Audio), as RLE-packed coefficients.

Huffman DPCM

DPCM-packed stream has the following format:

DataPresent, IsStereo, Is16Bits, Trees[], Bases[], Data[]
  • DataPresent - Bit, indicates sound presence
  • IsStereo - Bit, indicates mono or stereo sound
  • Is16Bits - Bit, indicates 8-bit or 16-bit samples
  • Trees - 8-bit Huffman trees, one tree per byte of an unpacked sample (i.e. a single tree for 8-bit mono sound, 4 trees for 16-bit stereo sound).
  • Bases - One to four 8-bit starting values, one for each sample byte
  • Data - Huffman-packed stream of 8-bit deltas for each sample byte

Is16Bits and IsStereo must match the flags of the AudioRate field.

Decompression steps:

  1. Read DataPresent; if it is zero, finish
  2. Read IsStereo
  3. Read Is16Bits
  4. Read the Trees; their number depends on the resulting sample size (see above)
  5. Read the Bases; their number depends on the resulting sample size (see above). If the sample is 16 bits wide, its high base byte is stored first. The right-channel bases come first, then the left-channel one(s).
  6. Output the Base bytes as an unpacked sample.
  7. For every byte of the sample, Huffman-decompress a delta from Data using its Tree and add it to the Base byte. If the sample is 16 bits wide, the low byte is stored first, then the high byte; the left channel comes first, then the right (the opposite of step 5). When you unpack a 16-bit sample, take into account possible overflow of the lower byte and adjust the high byte accordingly.
  8. Repeat from step 6 until all data is decompressed (UnpackedLength bytes of samples processed)
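
The overflow handling mentioned in step 7 can be illustrated in isolation. This is one reading of that step, not a confirmed reference implementation: the low-byte delta is added first and any carry out of it is propagated into the high byte, which is equivalent to a wrapping 16-bit addition. The function names are hypothetical.

```c
#include <stdint.h>

/* 8-bit case: the delta is added to the running base with wraparound. */
static uint8_t smk_apply_delta8(uint8_t base, uint8_t delta)
{
    return (uint8_t)(base + delta);
}

/* 16-bit case: add the low and high byte deltas, carrying any overflow
 * of the low byte into the high byte. */
static uint16_t smk_apply_delta16(uint16_t base, uint8_t delta_lo, uint8_t delta_hi)
{
    unsigned lo = (base & 0xFFu) + delta_lo;
    unsigned hi = (base >> 8) + delta_hi + (lo >> 8);   /* lo >> 8 is the carry */
    return (uint16_t)(((hi & 0xFFu) << 8) | (lo & 0xFFu));
}
```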

Unpacking Video

Video is encoded as chains of 4x4 blocks of the same type, ordered left to right, top to bottom. If the image size is not divisible by 4, it is padded to a block boundary.

Block Types

  • Mono Block (0) - The whole block contains pixels of only two colors, plus a map selecting between them.
  • Full Block (1) - Normal full-color block. v4 may apply extra compression to this block.
  • Void Block (2) - Empty (skip) block; the whole block is left unchanged from the previous frame.
  • Solid Block (3) - The whole block is filled with a single color.

Mono block colors, Mono block maps and Full blocks each have their own 16-bit Huffman table, stored at the beginning of the file (MClr, MMap and Full respectively).

The compressed frame stream consists of packed block Type descriptors and the actual block data. Another 16-bit Huffman table, stored with the others at the beginning of the file (Type), is used to decompress the Type descriptors.

Type descriptors have the following format in bits:

       1..0 - Type of block
       7..2 - Block chain length. Not the actual length, but an index into this table:
                unsigned int sizetable[64] = {
                  1,    2,    3,    4,    5,    6,    7,    8,
                  9,   10,   11,   12,   13,   14,   15,   16,
                 17,   18,   19,   20,   21,   22,   23,   24,
                 25,   26,   27,   28,   29,   30,   31,   32,
                 33,   34,   35,   36,   37,   38,   39,   40,
                 41,   42,   43,   44,   45,   46,   47,   48,
                 49,   50,   51,   52,   53,   54,   55,   56,
                 57,   58,   59,  128,  256,  512, 1024, 2048};
      15..8 - Extra data, mainly contains fill color for Solid block.


Frame painting steps:

  1. Read Type descriptor
  2. Draw single block selected by Type bits
  3. If whole image drawn, finish
  4. Repeat from step 2 until whole chain complete
  5. Repeat from step 1
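
Splitting a decoded Type descriptor into its fields might look like this sketch. The enum constants, SmkBlockRun and smk_parse_type are hypothetical names; the table and the bit layout are the ones given above.

```c
#include <stdint.h>

/* Block type values as listed under "Block Types". */
enum { SMK_BLOCK_MONO = 0, SMK_BLOCK_FULL = 1, SMK_BLOCK_VOID = 2, SMK_BLOCK_SOLID = 3 };

/* Chain length table from the text above. */
static const unsigned sizetable[64] = {
     1,    2,    3,    4,    5,    6,    7,    8,
     9,   10,   11,   12,   13,   14,   15,   16,
    17,   18,   19,   20,   21,   22,   23,   24,
    25,   26,   27,   28,   29,   30,   31,   32,
    33,   34,   35,   36,   37,   38,   39,   40,
    41,   42,   43,   44,   45,   46,   47,   48,
    49,   50,   51,   52,   53,   54,   55,   56,
    57,   58,   59,  128,  256,  512, 1024, 2048};

/* Hypothetical container for one decoded descriptor. */
typedef struct {
    int type;         /* bits 1..0 */
    unsigned chain;   /* number of blocks in the chain */
    uint8_t extra;    /* bits 15..8, e.g. the Solid block fill color */
} SmkBlockRun;

static SmkBlockRun smk_parse_type(uint16_t desc)
{
    SmkBlockRun run;
    run.type  = desc & 3;
    run.chain = sizetable[(desc >> 2) & 0x3F];
    run.extra = (uint8_t)(desc >> 8);
    return run;
}
```

A frame painter would then draw run.chain blocks of run.type, stopping early if the whole image has been drawn, before reading the next descriptor.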

Mono Block

A Mono block contains 16 pixels of two colors. First, unpack the pixel colors from the stream using the MClr Huffman table. The high byte of the unpacked value holds the '1' color and the low byte holds the '0' color. Next, decode the pixel map using the MMap table. Then, for each bit of the map, starting from the lowest, paint the pixel with the '0' color if the bit is zero, or with the '1' color if it is set. Repeat until all 16 pixels are done.
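
A sketch of painting one Mono block from the two already decoded 16-bit values. The function name is hypothetical, and the pixel order within the block (left to right, top to bottom, with the lowest map bit at the top-left pixel) is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* `dst` points at the top-left pixel of the 4x4 block in an 8-bit image,
 * `stride` is the image width in bytes. `colors` comes from the MClr
 * table and `map` from the MMap table. */
static void smk_paint_mono(uint8_t *dst, size_t stride, uint16_t colors, uint16_t map)
{
    uint8_t color0 = (uint8_t)(colors & 0xFF);   /* low byte: '0' color */
    uint8_t color1 = (uint8_t)(colors >> 8);     /* high byte: '1' color */

    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            dst[(size_t)y * stride + (size_t)x] = (map & 1) ? color1 : color0;
            map >>= 1;                           /* next map bit, lowest first */
        }
    }
}
```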

Full Block

A Full block contains 16 pixels of any color. The format of this block was changed in v4.

v2 Full Block

Decompress a 16-bit value and use it to draw pixels 3 and 4: the low byte for pixel 3 and the high byte for pixel 4. Then decompress the next value and paint pixels 1 and 2 in the same way. Advance to the next 4 pixels. Repeat these steps to draw all 16 pixels.
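
The pixel order above can be sketched as follows, with the eight 16-bit values assumed to be already decoded from the Full table, in stream order. The function name is hypothetical, and "the next 4 pixels" is taken to mean the next row of the block.

```c
#include <stddef.h>
#include <stdint.h>

/* `values` holds the 8 decoded 16-bit values of one block in stream order:
 * for each row, first the value for pixels 3-4, then the one for pixels 1-2. */
static void smk_paint_full_v2(uint8_t *dst, size_t stride, const uint16_t *values)
{
    for (int row = 0; row < 4; row++) {
        uint16_t right = values[row * 2];       /* pixels 3 and 4 */
        uint16_t left  = values[row * 2 + 1];   /* pixels 1 and 2 */
        uint8_t *p = dst + (size_t)row * stride;

        p[2] = (uint8_t)(right & 0xFF);
        p[3] = (uint8_t)(right >> 8);
        p[0] = (uint8_t)(left & 0xFF);
        p[1] = (uint8_t)(left >> 8);
    }
}
```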

v4 Full Block

A chain of v4 Full blocks is preceded by extra bits that determine the sub-type of all following blocks in the chain. If the first bit is 0, paint the chain as v2 Full blocks. Otherwise, read the next bit: if it is zero, paint a chain of Double blocks; if it is one, a chain of Half blocks.

v4 Double Block

Decompress a 16-bit value and use it to draw the first two 2x2 subblocks (left to right, top to bottom), filling each subblock with a single color (i.e. the low byte is used for pixels 1, 2, 5, 6 and the high byte for pixels 3, 4, 7, 8). Advance to the next two subblocks and repeat to paint all 16 pixels.
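
A sketch of the Double block fill, taking the two 16-bit values as already decoded. The function name is hypothetical; the pixel numbering follows the description above, with pixels 1-4 on the first row and 5-8 on the second.

```c
#include <stddef.h>
#include <stdint.h>

/* `values[0]` colors the top two 2x2 subblocks, `values[1]` the bottom two. */
static void smk_paint_double_v4(uint8_t *dst, size_t stride, const uint16_t *values)
{
    for (int half = 0; half < 2; half++) {
        uint8_t left  = (uint8_t)(values[half] & 0xFF);  /* left 2x2 subblock */
        uint8_t right = (uint8_t)(values[half] >> 8);    /* right 2x2 subblock */

        for (int row = half * 2; row < half * 2 + 2; row++) {
            uint8_t *p = dst + (size_t)row * stride;
            p[0] = p[1] = left;
            p[2] = p[3] = right;
        }
    }
}
```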

v4 Half Block

Perform the same steps as for a v2 Full block, but instead of drawing each even line, duplicate it from the previous line.

Void Block

Keep all 16 pixels unchanged from the previous frame.

Solid Block

Fill the whole block with the color obtained as the Extra value during Type descriptor decoding.