From MultimediaWiki

SAN stands for Smush Animation Format. It is a full-motion video format, initially created by Vince Lee, that was used in a number of LucasArts games. What follows are notes based on examining sample files and on the decoders found in versions of ScummVM and Residual, which are open source reimplementations of the interpreters for some of the games that used SAN.


Samples from various games are located at Keep in mind that these files may not all be encoded with the same version of Smush. We hope to describe the codec's usage in each game as we find out more information.

Vince Lee Quote From Old LucasFans Interview

Both Full Throttle and The Dig used the INSANE engine, originally designed for the Rebel Assault games. How much were you involved with these two games, and what exactly did INSANE accomplish in those titles?

INSANE was primarily used as a cut scene engine for both The Dig and Full Throttle, but the latter made more interesting use of it in the mineroad sequences. Fortunately, by the time both these products came around, I had broken up the INSANE engine into reusable modules, so the programmers for those respective games were able to integrate my code with little input from me.

One could assume that the INSANE engine and SMUSH movie codec will still be used in future LucasArts games. How does it feel to know that your code will continue to be used at the company long after you left?

It's kind of funny. I think four or five times in my career there I had anticipated the death of INSANE as an internal cut scene codec. Outside companies kept bringing in codecs that just seemed amazing. But many times, when the decision had to be made, we found the outside codec wouldn't run on our target platform, or I found that I could tweak my codec to get better results. This was the case with Indy and the Infernal Machine. They had originally planned to use Windows DirectPlay to handle their cutscenes, but switched to INSANE after they found DirectPlay couldn't meet their needs. Will it continue? I left the code in good hands, so it certainly could. I'm sure it won't last forever, but I do get a little bit of a kick knowing that it might go on a bit without me.

Just out of curiosity, what does INSANE stand for, anyway?

INSANE stands for the INteractive Streaming ANimation Engine, and originally referred to the streaming video engine from Rebel Assault. Nowadays, it's made up of some 18 code libraries which encompass most of the code I wrote in 8 and a half years at LucasArts.

Discussion Of Video Used in Rebel Assault II

Full article: Optimizing CD-ROM Performance Under DOS/4GW

Case Study: Rebel Assault II

Rebel Assault II: The Hidden Empire from LucasArts is the sequel to the action-arcade game Rebel Assault. Set in the Star Wars universe, it features 15 chapters of play and uses high-quality cinematic video sequences to advance the story and mood. The game play features various flying, dodging, and shooting sequences set in front of interactive streamed backgrounds.

The minimum platform for Rebel II is a 486/50 with a 2X CD-ROM drive. To achieve acceptable performance and image quality on this platform, LucasArts wrote a custom animation system. This system, the INteractive Streamed ANimation Engine (INSANE), is a collection of code libraries designed primarily to compress and play back video sequences. The system is modular, easily portable, and will be used in a majority of LucasArts's upcoming titles. In Rebel Assault II, noninteractive sequences are 320-by-200 pixels, while interactive sequences are rendered in 424-by-260 resolution. Both use 8-bit, 256-color imagery and appear full screen. For higher-end machines, optional interpolation up to 640-by-400 resolution is available. High resolution is more CPU-intensive, so this may result in a slower frame rate than low resolution, even on a moderately-powered system. To account for this, the system was designed to elegantly handle a less-than-optimal frame rate.

Each frame of video typically consists of 13K of video and 2K of audio. With a data rate of 225K per second, this allows a frame rate of 15fps. Due to the large quantity of video generated for the game, it would have been unreasonable to generate multiple copies of the video streams, each running at a different frame rate. Instead, all video sequences are designed to run at the machine's maximum speed, capping the rate at an optimal 15 frames per second. For high-end systems, the extra CPU time can be used to run in high resolution.

To account for possible synchronization problems due to variable frame rates, two approaches were taken. For sequences without onscreen speech, music and sound effects are linked to specific key frames and designed to accommodate up to a 15% variance in frame rate. For sequences with on-screen speech, rigid synchronization is used. For these sequences, every other frame of video can be optionally omitted, saving decompression and display time and allowing the animation engine to catch up to lip-synched audio.

For some interactive sequences, smooth branching must occur. To achieve this, the system allows video segments to be interlaced into the data stream and preloaded before a possible branch point. When the branch point is reached, the preloaded segment is played to cover up the seek delay to the new animation.

The INSANE library performs reads through DOS for portability. To achieve smooth, uninterrupted animation, it uses a hybrid preemptive cooperative multitasking system, in which data reads are performed within a mainline DOS thread; decompression and game logic run in time slices granted via the timer interrupt. Decompression time can vary from frame to frame depending on the layers of imagery and compression options used in a particular frame. To achieve best overall performance on all video sequences, the system dynamically varies both CPU time-slice allocation and decompression frame rate based on CD-ROM read performance and decompression time.

Chunk Format

A .SAN/.SNM/.NUT file consists of a series of chunks with the following format:

 bytes 0-3    chunk type FourCC
 bytes 4-7    chunk size not including this 8-byte preamble, stored in big endian format
 bytes 8..    chunk payload

The chunk structure is hierarchical/recursive for specific chunk types. For example, the payload of an ANIM chunk comprises a series of chunks using the format described above.
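
The layout above can be walked with a few lines of code. The following Python sketch is illustrative only: the function names (make_chunk, iter_chunks, walk) are hypothetical, the set of container chunks is an assumption taken from the layout notes on this page (ANIM and FRME), and it assumes chunks follow one another with no padding, which this page does not discuss.

```python
import struct
from typing import Iterator, List, Tuple

# Chunk types assumed to contain nested chunks (taken from the layout notes).
CONTAINERS = {b"ANIM", b"FRME"}


def make_chunk(fourcc: bytes, payload: bytes) -> bytes:
    """Build one chunk: FourCC, big-endian payload size, payload."""
    return fourcc + struct.pack(">I", len(payload)) + payload


def iter_chunks(data: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (fourcc, payload) for each chunk stored back to back in data."""
    offset = 0
    while offset + 8 <= len(data):
        # bytes 0-3: FourCC, bytes 4-7: size excluding the 8-byte preamble
        fourcc, size = struct.unpack_from(">4sI", data, offset)
        start = offset + 8
        if start + size > len(data):
            raise ValueError("chunk %r overruns its parent" % fourcc)
        yield fourcc, data[start:start + size]
        offset = start + size


def walk(data: bytes, depth: int = 0) -> List[Tuple[int, str, int]]:
    """Return (depth, fourcc, payload size) for every chunk, recursing into containers."""
    found = []
    for fourcc, payload in iter_chunks(data):
        found.append((depth, fourcc.decode("ascii"), len(payload)))
        if fourcc in CONTAINERS:
            found.extend(walk(payload, depth + 1))
    return found
```

Calling walk() on the raw bytes of a file would give a flat listing of the chunk tree, which is handy for comparing files from different games.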

Chunk Layout

This section attempts to document the chunk layouts used by different variants of the multimedia format. There are at least two variants.

  • ANIM: Animation
    • AHDR: Animation Header, carries palette
    • FRME: Frame
      • NPAL: Intra palette
      • XPAL: Inter/delta palette
      • FOBJ: Frame Object
      • IACT: Audio - payload is within chunk
      • PSAD: Audio - payload is not hierarchical/recursive, but contains SAUD, STRK and SDAT FourCC strings
      • TRES

The different audio chunk types (IACT and PSAD) suggest that there are sub-revisions within the SAN/NUT format.


The following FourCCs have been identified, but not categorised:

  • LACT
  • STOR
  • FTCH
  • SKIP
  • STRK
  • SMRK
  • SHDR
  • SDAT
  • SAUD
  • iMUS
  • FRMT
  • TEXT
  • REGN
  • STOP
  • MAP_
  • DATA
  • ETRS

FOBJ Chunk Details

  • multi-byte numbers were first noted as big endian, but are possibly little endian; the codec notes below read their fields as LE_16 and LE_32
  • these compressor algorithms are known:
    • codec 1
    • codec 37
    • codec 44
    • codec 47

Codec 1: RLE Encoding

  • for each line in image height:
    • 16-bit number indicates encoded line size
    • while there are still encoded data bytes for this line:
      • next byte is code
      • length = code / 2 + 1
      • if bit 0 of code is set:
        • value = next byte
        • if value is 0:
          • skip (length) pixels in output
        • else:
          • put (value) in output (length) times
      • else:
        • for each count in length:
          • value = next byte
          • if value is 0:
            • skip pixel in output
          • else:
            • put value in output
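
The steps above can be sketched in Python as follows. This is a minimal sketch, not a reference decoder: the function name and signature are hypothetical, the 16-bit line size is assumed to be little endian, and "skip" is taken to mean that the existing pixel in the output buffer is left untouched (so 0 acts as a transparent value). Writes past the right edge of the image are clipped as a safety measure that the notes do not mention.

```python
from typing import Optional


def decode_codec1(src: bytes, width: int, height: int,
                  out: Optional[bytearray] = None) -> bytearray:
    """Decode codec 1 RLE data into a width*height 8-bit buffer."""
    if out is None:
        out = bytearray(width * height)
    pos = 0
    for y in range(height):
        # Assumption: the per-line size is a little-endian 16-bit value.
        line_size = int.from_bytes(src[pos:pos + 2], "little")
        pos += 2
        end = pos + line_size
        x = 0
        while pos < end:
            code = src[pos]
            pos += 1
            length = code // 2 + 1
            if code & 1:
                # Run: one value repeated `length` times (0 means skip).
                value = src[pos]
                pos += 1
                if value != 0:
                    for i in range(x, min(x + length, width)):
                        out[y * width + i] = value
                x += length
            else:
                # Literals: `length` individual values (0 means skip).
                for _ in range(length):
                    value = src[pos]
                    pos += 1
                    if value != 0 and x < width:
                        out[y * width + x] = value
                    x += 1
        pos = end
    return out
```

Passing an existing buffer as out lets the skipped pixels show whatever was drawn before, which is how a frame object would be layered over a background.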

Codec 37

  • assign width and height
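  • assign bw as block width (the image width counted in 4-pixel blocks, rounded up)
  • assign bh as block height (the image height counted in 4-pixel blocks, rounded up)
  • codec must operate on 4x4 blocks
  • assign pitch as bw * 4 (not necessarily the same as width, since the width is effectively rounded up to the nearest multiple of 4)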
  • assign chunk size as size of input chunk - 14
  • allocate a buffer with that size
  • read chunk_size bytes into new buffer
  • sequence number LE_16 @ chunk[2]
  • decoded size is LE_32 @ chunk[4]
  • maskflags = chunk[12]
  • make table with pitch and chunk_buffer[1] as index:
    • index *= 255
    • assert that index + 254 < sizeof(maketable_bytes) / 2; anything else is an error condition
    • for i = 0..254
      • j = (i + index) * 2
      • offsettable[i] = maketable_bytes[j+1] * pitch + maketable_bytes[j]
  • if (chunk[0] == 0)
  • else if (chunk[0] == 1)
    • "missing opcode codec47" (?)
  • else if (chunk[0] == 2)
    • ...
  • else if (chunk[0] == 3)
    • ...
  • else if (chunk[0] == 4)
    • ...
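
The setup portion of the notes above (everything before the per-mode decoding, which is not documented here) can be sketched as follows. All names are hypothetical. The sketch assumes that the block width is the image width rounded up to whole 4-pixel blocks, that maketable_bytes is a flat list of signed (x, y) byte pairs supplied by the caller (the actual table contents are not reproduced on this page), and that each table holds 255 entries, which is what the index + 254 bounds check suggests.

```python
import struct
from typing import List, NamedTuple, Sequence


class Codec37Header(NamedTuple):
    mode: int          # chunk[0], selects the decoding branch
    table_index: int   # chunk[1], index passed to the table builder
    sequence: int      # LE_16 at chunk[2]
    decoded_size: int  # LE_32 at chunk[4]
    mask_flags: int    # chunk[12]


def block_pitch(width: int) -> int:
    """Pitch in pixels: block width (width in 4-pixel blocks, rounded up) times 4."""
    bw = (width + 3) // 4
    return bw * 4


def parse_codec37_header(chunk: bytes) -> Codec37Header:
    """Pull the header fields named in the notes out of the codec 37 buffer."""
    if len(chunk) < 13:
        raise ValueError("codec 37 buffer too short")
    (sequence,) = struct.unpack_from("<H", chunk, 2)
    (decoded_size,) = struct.unpack_from("<I", chunk, 4)
    return Codec37Header(chunk[0], chunk[1], sequence, decoded_size, chunk[12])


def make_offset_table(maketable_bytes: Sequence[int], pitch: int, index: int) -> List[int]:
    """Turn (x, y) pairs into linear buffer offsets for one table index."""
    base = index * 255
    if base + 254 >= len(maketable_bytes) // 2:
        raise ValueError("table index out of range")
    table = []
    for i in range(255):  # assumption: 255 entries per table
        j = (i + base) * 2
        table.append(maketable_bytes[j + 1] * pitch + maketable_bytes[j])
    return table
```

Each resulting offset is y * pitch + x, so an entry can be added directly to a pixel position in a buffer whose rows are pitch bytes apart.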

Codec 44

  • iterate through the encoded chunk from 0 to size - 14 (?):
    • size of encoded line = next LE_16 in chunk
    • while size is not 0:
      • count = next byte
      • pixel = next byte
      • put (pixel) in output (count) times
      • if size of line is not 0 at this point:
        • count = next LE_16 + 1
        • copy (count) pixels from encoded stream to output
    • at the end of line, output buffer rewinds by one pixel (?)
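
A rough Python sketch of the loop above follows. It makes several assumptions that the notes leave open: the function name and signature are hypothetical, one encoded line is read per image row, the line size counts every byte consumed for that line (both the run pairs and the literal blocks, including their 2-byte counts), and the uncertain one-pixel rewind at the end of each line is replaced by simply clipping each decoded line to the image width.

```python
def decode_codec44(src: bytes, width: int, height: int) -> bytearray:
    """Decode codec 44 data (alternating runs and literal blocks) line by line."""
    out = bytearray(width * height)
    pos = 0
    for y in range(height):
        if pos + 2 > len(src):
            break  # ran out of encoded data
        remaining = int.from_bytes(src[pos:pos + 2], "little")
        pos += 2
        line = bytearray()
        while remaining > 0:
            # Run: a count byte followed by the pixel value to repeat.
            count = src[pos]
            pixel = src[pos + 1]
            pos += 2
            remaining -= 2
            line += bytes([pixel]) * count
            if remaining > 0:
                # Literal block: LE_16 count (stored minus one), then raw pixels.
                n = int.from_bytes(src[pos:pos + 2], "little") + 1
                pos += 2
                line += src[pos:pos + n]
                pos += n
                remaining -= 2 + n
        # Assumption: clip to the row instead of emulating the rewind.
        row = line[:width]
        out[y * width:y * width + len(row)] = row
    return out
```

If real files turn out to emit one pixel more than the width per line, the clipping here would hide that, so this sketch is not suitable for settling the rewind question.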

Codec 47

  • chunk size = size of chunk passed in minus 14 bytes
  • sequence number = first LE_16 of chunk
  • encoded graphic data begins at chunk + 26
  • the bytes at chunk[12] and chunk[13] serve as initializers for deltabufs[0] and [1] respectively
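
The few fields listed above can be pulled out of the buffer as shown in this Python sketch. The names are hypothetical, and the sketch only covers what is documented here: it does not attempt the actual codec 47 decoding, and it treats the two initializer bytes as fill values for the delta buffers, which is an assumption about what "initializers" means.

```python
import struct
from typing import NamedTuple, Tuple


class Codec47Header(NamedTuple):
    sequence: int                # LE_16 at the start of the chunk
    delta_fill: Tuple[int, int]  # chunk[12] and chunk[13], for deltabufs[0] and [1]
    data: bytes                  # encoded graphic data, starting at chunk + 26


def parse_codec47_header(chunk: bytes) -> Codec47Header:
    """Split a codec 47 buffer into the documented header fields and the data."""
    if len(chunk) < 26:
        raise ValueError("codec 47 buffer too short")
    (sequence,) = struct.unpack_from("<H", chunk, 0)
    return Codec47Header(sequence, (chunk[12], chunk[13]), chunk[26:])


def init_delta_buffers(header: Codec47Header, size: int) -> Tuple[bytearray, bytearray]:
    """Assumption: each delta buffer is filled with its initializer byte."""
    return (bytearray([header.delta_fill[0]]) * size,
            bytearray([header.delta_fill[1]]) * size)
```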