Difference between revisions of "SGA"

From MultimediaWiki


[[Category:Game Formats]]
[[Category:Platform-Dependent Codecs]]

Revision as of 14:21, 25 July 2014

SGA is a chunk-based multimedia file format used primarily in games by Digital Pictures for the Sega CD console system. Early versions of the format store uncompressed video frames, while later revisions add features such as LZ compression, tile maps, and overlapping chunks. The extension for SGA files is usually ".SGA", but some files from Supreme Warrior have the extensions ".CLP" and ".PT1", and some files from Slam City with Scottie Pippen have extensions that include numbers (e.g. ".SG0", ".S37", etc.). Variations of the format are found in versions of Digital Pictures games for DOS PCs and the 3DO. In those versions all video and audio is stored in large, monolithic files rather than individual files.

File Format

All multi-byte numbers in an SGA file are big-endian, since the Sega Genesis and Sega CD both use big-endian Motorola 68000 CPUs.

There are three known methods of storing data in SGA files:

(1) In the majority of files, each chunk is stored in 2048-byte sectors due to the nature of CD storage. The first sector in the file contains 2048 bytes of data; each subsequent sector contains a 2-byte header specifying how much of the chunk is left, followed by 2046 bytes of data. Note that if the value of the header is zero, the next 2046 bytes should be skipped and the next header checked. (I'm presuming this has something to do with padding the CD for faster loading?)

(2) Some files, in particular the audio-only files in Night Trap (all variations), do not use length indicators at the start of sectors. The files contain strictly chunk headers and data.

(3) Later versions of the SGA format, as seen in Slam City with Scottie Pippen, Prize Fighter, and Supreme Warrior, introduced a scheme in which chunks can overlap/interrupt other chunks. In this type of file, each sector begins with a two-byte sector header, with the top four bits set to a non-zero value, which we might call the chunk index. If the following twelve bits are set to zero, then we are at the start of a new chunk using the current index. If the following twelve bits are non-zero, then we are looking at a length value, similar to previous formats. A decoder written to handle this type of file would have to keep track of multiple chunks simultaneously. Videos used in Slam City use a container chunk (type F1) which generally contains a video chunk followed by an audio chunk. Videos in Supreme Warrior begin with what appears to be a global header (type F0) in addition to the F1 container chunk, but at present it is unknown how to interpret the metadata. There is also another chunk type, F2, whose use is presently unknown. F2 chunks can also exist in separate files with the extension ".F2". So far, all that is known about F2 chunks is that they often contain what appear to be filenames.
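As a rough illustration of storage methods (1) and (3), the Python sketch below walks the sectors of a method (1) file and splits a method (3) sector header into its two fields. The function names are hypothetical. The sketch does not trim the last sector of a chunk, because the text above does not spell out exactly how the remaining-length value should be applied.

```python
SECTOR_SIZE = 2048


def iter_method1_sectors(data):
    """Yield (remaining, payload) pairs for a method (1) file.

    The first sector is yielded whole with remaining=None. Every later
    sector starts with a big-endian 2-byte "bytes left in chunk" value;
    sectors whose value is zero are skipped entirely.
    """
    yield None, data[:SECTOR_SIZE]
    for offset in range(SECTOR_SIZE, len(data), SECTOR_SIZE):
        sector = data[offset:offset + SECTOR_SIZE]
        if len(sector) < 2:
            break
        remaining = int.from_bytes(sector[:2], "big")
        if remaining == 0:
            continue  # padding sector: ignore its 2046 bytes
        yield remaining, sector[2:]


def split_method3_sector_header(word):
    """Split a method (3) sector header into (chunk_index, low12, starts_chunk)."""
    chunk_index = (word >> 12) & 0xF
    low12 = word & 0x0FFF
    # Twelve zero bits mark the start of a new chunk for this index;
    # otherwise the twelve bits hold a length value.
    return chunk_index, low12, low12 == 0
```

A decoder for method (3) would presumably keep one reassembly buffer per chunk index and append each sector's payload to the buffer selected by the header.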

Most SGA files do not contain a global header, but have headers for each chunk. In files of type (1) and (2), headers are never split across sector boundaries. In files of type (3), chunk headers can be split across sector boundaries as long as they are contained within a chunk of type F1. Chunks are word aligned and can contain video or audio. The basic header is 4 bytes long, and is shared by all chunk types.

Byte  Description
----  -----------
0     Chunk type 
1     Stream index (Sewer Shark in particular uses this for its branching path-based gameplay)
2-3   Payload length

This is followed by more metadata:

Byte  Description
----  -----------
4-7   Time in SMPTE format
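The first eight bytes described above could be read as in the following sketch. The names `ChunkHeader` and `parse_chunk_header` are made up for this example, and the SMPTE time is left as four raw bytes because its internal layout is not detailed here.

```python
import struct
from typing import NamedTuple


class ChunkHeader(NamedTuple):
    chunk_type: int      # byte 0
    stream_index: int    # byte 1
    payload_length: int  # bytes 2-3, big-endian
    timecode: bytes      # bytes 4-7, raw SMPTE time (not interpreted)


def parse_chunk_header(buf, offset=0):
    """Read the 4-byte basic header plus the 4-byte time field."""
    fields = struct.unpack_from(">BBH4s", buf, offset)
    return ChunkHeader(*fields)
```

For example, `parse_chunk_header(data).chunk_type == 0xC1` would identify an uncompressed video chunk.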

Video chunks for the Sega CD contain the following metadata:

Byte  Description
----  -----------
8     Data flags
9     Palette count (1-4)

A value of 1 for the topmost bit of byte 8 means that the chunk uses a tile map. Data for the tile map immediately follows the tile data. Each entry in the tile map is two bytes long, and consists of a tile index and flags for features such as vertical and horizontal flipping. Bit 3 and the lowermost bit (bit 1) are usually set to 1, although their exact purpose is unknown. There may be some connection between bit 1 and tilemaps, though, since tilemap behavior seems to change when the bit is not set.

Metadata in video chunks for the Sega CD 32X is as follows:

Byte  Description
----  -----------
8     Palette start offset (usually 1)
9     Palette update size (a value of 0 means use existing palette)

Then more metadata (both Sega CD and Sega CD 32X):

Byte  Description
----  -----------
10    Tiles per column
11    Tiles per row
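Putting bytes 8 to 11 together for a Sega CD (non-32X) video chunk, a reader might look like the sketch below. The function name and dictionary keys are illustrative only; just the topmost flag bit is interpreted, since the other flag bits are not understood.

```python
def parse_sega_cd_video_metadata(chunk):
    """Interpret bytes 8-11 of a Sega CD video chunk.

    `chunk` is assumed to start at byte 0 of the chunk header.
    """
    flags = chunk[8]
    return {
        "flags": flags,
        "uses_tilemap": bool(flags & 0x80),  # topmost bit of byte 8
        "palette_count": chunk[9],           # expected to be 1-4
        "tiles_per_column": chunk[10],
        "tiles_per_row": chunk[11],
    }
```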

In overlay chunk types D1 (and possibly D4), byte 10 indicates the number of tiles in the chunk, and byte 11 indicates the length of the data for what are assumed to be codes that specify the layout of the tiles. Since each tile takes up 32 bytes, the length of the tile data is equal to the value in byte 10 * 32. Tile data is immediately followed by layout codes, which as of this writing are yet to be deciphered.

Certain types of video chunks contain additional metadata, which is detailed below in the sections related to the various chunk types.

Audio chunks have the following metadata:

Byte  Description
----  -----------
8-9   Sample rate
10    (Probably) Number of channels / bytes per sample (usually 1)
11    Unknown (usually 0)

Chunk Types

Some known chunk types are:

  • $81: encoded video (used in most Sega CD 32X games by Digital Pictures)
  • $A1: audio, sign/magnitude 8-bit PCM
  • $C1: uncompressed video (used in Night Trap SCD, Sewer Shark, Corpse Killer SCD, and others)
  • $C2: compressed video (used in Corpse Killer SCD, Slam City with Scottie Pippen, and others?)
  • $C4: compressed? video (used in Slam City with Scottie Pippen)
  • $C6: compressed video (used in Night Trap SCD "DPLOGO.SGA", Sewer Shark, Make My Video C&C, and others)
  • $C7: compressed video (used in Prize Fighter and others)
  • $C8: compressed video (used in Sewer Shark, Make My Video C&C, and others)
  • $CB: compressed video (used in Prize Fighter, Double Switch "DPLOGO.SGA", Corpse Killer 32X)
  • $CD: compressed video (used in Double Switch)
  • $D1: uncompressed overlay video (used in Sewer Shark when killing a Ratigator(TM), etc)
  • $D4: compressed? overlay video (used in Corpse Killer SCD/32X for the zombies)
  • $E7: (un)compressed video (used in the Make My Video series)
  • $E8: compressed video (used in Ground Zero Texas)
  • $E9: compressed video (used in Ground Zero Texas)
  • $F0: container for other chunks (used in Slam City with Scottie Pippen)
  • $F1: container for other chunks (used in Slam City with Scottie Pippen)
  • $F2: metadata of unknown use

As of this writing, the format and compression of the following types are generally understood: 81, A1, C1, C6, C7, C8, CB, CD, E7, and F1.

Encoded Video $81

This type of video is probably most appropriately called "encoded" rather than "compressed". Video frames of this type are represented with a series of codes that indicate the size, color, and pixel layout of the graphics to be drawn, rather than, for example, LZSS encoded byte streams as in previous versions of SGA files.

The encoding scheme has been completely reverse-engineered, and a functional decoder has been written. A detailed description of the codes will be added at a later time.

Audio $A1

Audio sample rate can be determined in the following way (for NTSC systems only?):

SamplesPerSecond = ((Byte 8 << 8) + Byte 9) * SEGA_CD_PCM_INCREMENT

SEGA_CD_PCM_INCREMENT is ~15.8945719, and is calculated as SEGA_CD_PCM_FREQUENCY_MAX / 2048

SEGA_CD_PCM_FREQUENCY_MAX is ~32552.083, and is calculated as SEGA_CD_CPU_FREQUENCY / 384

SEGA_CD_CPU_FREQUENCY is 12500000, and is calculated as SEGA_CD_CRYSTAL_FREQUENCY / 4

SEGA_CD_CRYSTAL_FREQUENCY is 50000000

The sample rate can also be used to determine the frame rate of video by using the formula SamplesPerSecond / NumSamples, where NumSamples is the length of the audio data in the audio chunk.
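The chain of constants above translates directly into code. The following sketch uses the constant names from the text; the two function names are hypothetical.

```python
SEGA_CD_CRYSTAL_FREQUENCY = 50_000_000
SEGA_CD_CPU_FREQUENCY = SEGA_CD_CRYSTAL_FREQUENCY / 4
SEGA_CD_PCM_FREQUENCY_MAX = SEGA_CD_CPU_FREQUENCY / 384
SEGA_CD_PCM_INCREMENT = SEGA_CD_PCM_FREQUENCY_MAX / 2048


def samples_per_second(byte8, byte9):
    """Sample rate from bytes 8-9 of an audio chunk (big-endian)."""
    return ((byte8 << 8) + byte9) * SEGA_CD_PCM_INCREMENT


def frames_per_second(sample_rate, num_samples):
    """Video frame rate, given the audio data length of one audio chunk."""
    return sample_rate / num_samples
```

A stored value of 0x0800 (2048) therefore corresponds to the maximum PCM frequency of about 32552 Hz, and 0x0400 to half of that.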

Uncompressed Video $C1

For compatibility with the Genesis' video hardware, video frames in this format (and all its derivatives) are made up of linear 8x8 pixel tiles. Each pixel consists of a 4-bit (one nibble) palette index, thus each tile takes up 32 bytes. The length of the tile data can be calculated as tilesPerColumn * tilesPerRow * 32.
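A single tile could be unpacked into an 8x8 grid of palette indices roughly as follows. This sketch assumes the usual Genesis layout of four bytes per tile row with the high nibble holding the left pixel of each pair; the function names are hypothetical.

```python
TILE_BYTES = 32  # 8 rows * 4 bytes, two 4-bit pixels per byte


def tile_data_length(tiles_per_column, tiles_per_row):
    """Length in bytes of the tile data for one frame."""
    return tiles_per_column * tiles_per_row * TILE_BYTES


def decode_tile(tile):
    """Return 8 rows of 8 palette indices (0-15) for one 32-byte tile."""
    if len(tile) != TILE_BYTES:
        raise ValueError("a tile is exactly 32 bytes")
    rows = []
    for y in range(8):
        row = []
        for byte in tile[y * 4:y * 4 + 4]:
            row.append(byte >> 4)    # assumed: high nibble is the left pixel
            row.append(byte & 0x0F)  # low nibble is the right pixel
        rows.append(row)
    return rows
```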

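Palette data immediately follows the tile data. Each palette is 18 bytes long (16 colors * 3 components * 3 bits = 144 bits). Palettes are stored in an unusual format: the Genesis normally uses either RGB or BGR stored in nibbles (even though only the top 3 bits of each nibble are used), but here each color component is assembled one bit at a time, as follows: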

bitmap={1,2,4}

 For bit=0 to 2
    for color=0 to 15
        red[color]+=Top Most Bit of Data *bitmap[bit]
    next
 next

Repeat for green and blue.
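One way to read the pseudocode above is that each palette holds three 16-bit planes per color component, least significant plane first, with the topmost bit of each plane belonging to color 0. The Python sketch below follows that reading. The component order (red, then green, then blue) and the function name are assumptions taken from the order in which the text mentions them.

```python
def decode_palette(data):
    """Decode one 18-byte palette into 16 (red, green, blue) tuples.

    Each component ends up as a 3-bit value in the range 0-7.
    """
    if len(data) < 18:
        raise ValueError("a palette is 18 bytes long")
    components = []
    pos = 0
    for _ in range(3):                      # red, green, blue (assumed order)
        values = [0] * 16
        for bit in range(3):                # weights 1, 2, 4
            plane = int.from_bytes(data[pos:pos + 2], "big")
            pos += 2
            for color in range(16):
                if plane & (0x8000 >> color):  # topmost bit is color 0
                    values[color] |= 1 << bit
        components.append(values)
    return list(zip(*components))
```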

Two bits are then read for each tile to determine which of the 4 palettes to use.

 For Row=0 to RowMax
    For Col=0 to ColMax
        PalMap[Row*ColMax+Col]=Top 2 Bits of data
    Next
 Next

When drawing a tile you would select the palette based upon PalMap. Note that palette maps for 2 palette frames use only 1 bit per palette map entry vs 2 bits for 3 or 4 palettes.
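A palette map reader that handles both entry widths might look like the sketch below. It assumes the map is a packed bitstream read from the topmost bit of each byte downward, as the pseudocode suggests. The single-palette case is not described above, so the hypothetical `read_palette_map` simply rejects it.

```python
def read_palette_map(data, tile_count, palette_count):
    """Return one palette number per tile, in row-major tile order."""
    if palette_count < 2:
        raise ValueError("palette map layout for 1 palette is not described")
    bits_per_entry = 1 if palette_count == 2 else 2
    mask = (1 << bits_per_entry) - 1
    entries = []
    for i in range(tile_count):
        bit_pos = i * bits_per_entry
        byte = data[bit_pos // 8]
        # Entries are 1 or 2 bits wide, so they never straddle a byte.
        shift = 8 - bits_per_entry - (bit_pos % 8)
        entries.append((byte >> shift) & mask)
    return entries
```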

If a frame has a tilemap, the tilemap immediately follows the palette data. Each tilemap entry is 16 bits long, and so the length of the tilemap in bytes is tilesPerColumn * tilesPerRow * 2. Tilemaps also contain palette information, and so frames with tilemaps do not contain a separate palette map. The top 4 bits of a tilemap entry indicate the palette index of the tile, and the remaining 12 bits indicate the index of the tile itself.
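The tilemap entry layout given above (4 bits of palette index, 12 bits of tile index) can be unpacked with two bit operations. The function names here are hypothetical.

```python
def tilemap_length(tiles_per_column, tiles_per_row):
    """Length in bytes of a frame's tilemap (one 16-bit entry per tile)."""
    return tiles_per_column * tiles_per_row * 2


def split_tilemap_entry(entry):
    """Split a big-endian 16-bit tilemap entry into (palette_index, tile_index)."""
    return (entry >> 12) & 0xF, entry & 0x0FFF
```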

Compressed Video $C6, $C7, $C8

Chunks in this format contain the same basic elements as chunks of type C1 (tiles, palette, palette map, tilemap, etc), but are compressed in LZSS format. The compressed data consist of several 34-byte LZSS blocks. Each block contains a one-word (2 bytes) tag of compression flags followed by 16 words of data.

The tag is read in bits, starting with the most significant (left most) bit. For each bit set to 0, there is an uncompressed word literal. For each bit set to 1, the following word is a displacement/length reference in the following format:

LLLD DDDD DDDD DDDD

L = Number of words to copy (number of bytes to copy = L * 2)
D = Displacement

This may be calculated as:

for count = 0 to (top 3 bits of LZ word * 2) - 1
   data[current + count] = data[current + count - last 13 bits of LZ word]
next

Note that the displacement, unlike the copy amount, is based on bytes, not words. Also, the displacement does not have to be word aligned.

After the entire tag is read, the next flag block is read, and the process continues. The sequence ends when a reference word's top three bits are all zeros.
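The block structure and reference format described above can be combined into a small decoder. The sketch below covers only the rule as stated for this format: a 3-bit word count, a 13-bit byte displacement, and termination on a zero count. The function name is hypothetical, and tolerating a short final block is an assumption.

```python
def lzss_decode_words(src):
    """Decode word-based LZSS data made of tag words and 16-word blocks."""
    out = bytearray()
    pos = 0
    while pos + 2 <= len(src):
        tag = int.from_bytes(src[pos:pos + 2], "big")
        pos += 2
        for bit in range(15, -1, -1):       # most significant tag bit first
            if pos + 2 > len(src):
                return bytes(out)           # assumed: short final block is fine
            word = int.from_bytes(src[pos:pos + 2], "big")
            pos += 2
            if not (tag >> bit) & 1:
                out += word.to_bytes(2, "big")   # literal word
                continue
            byte_count = (word >> 13) * 2   # top 3 bits: words to copy
            displacement = word & 0x1FFF    # low 13 bits: distance in bytes
            if byte_count == 0:
                return bytes(out)           # end-of-data marker
            if displacement == 0 or displacement > len(out):
                raise ValueError("displacement points outside decoded data")
            for _ in range(byte_count):
                # Byte-by-byte copy so overlapping references repeat data.
                out.append(out[-displacement])
    return bytes(out)
```

Copying one byte at a time matters because a reference may reach into bytes that the same reference has just produced.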

In C7 format chunks, the top three bits represent the amount of words to copy minus one. Any decoder that implements C7 decoding must take this into account.

Most chunks in C6 format, and all chunks in C8 and E7 formats, require that adjacent pixels in even-numbered lines be swapped for frames to be correctly displayed. The exact reason for this is unclear, although since Sega CD video often contains a lot of checkerboard dithering, swapping pixels would eliminate the checkerboard patterns and lead to higher amounts of identical/redundant data, thus leading to more efficient compression. There is currently no known way to determine which C6 chunks require pixel swapping. However, it seems that pixel swapping only occurs in C6 frames that use fewer than 3 palettes and that do not use a tilemap.
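If "adjacent pixels" is taken to mean the two pixels packed into each byte, the swap reduces to exchanging the nibbles of every byte on the affected lines. The sketch below applies that to one 32-byte tile and treats rows 0, 2, 4 and 6 (counting from zero) as the even-numbered lines. Both of these readings are assumptions, and the function name is hypothetical.

```python
def swap_pixels_in_even_lines(tile):
    """Swap the two pixels in each byte of rows 0, 2, 4 and 6 of a tile."""
    if len(tile) != 32:
        raise ValueError("a tile is exactly 32 bytes")
    out = bytearray(tile)
    for y in range(0, 8, 2):                # assumed: even means 0-based even
        for i in range(y * 4, y * 4 + 4):   # 4 bytes per tile row
            byte = out[i]
            out[i] = ((byte & 0x0F) << 4) | (byte >> 4)
    return bytes(out)
```

Applying the function twice restores the original tile, so the same routine would serve for both encoding and decoding.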

Compressed Video $CB, $CD, $E7

Like other compressed chunk formats, chunks in this format consist of several 34 byte LZSS blocks. Each block contains a one word (2 bytes) tag of compression flags followed by 16 words of data.

For each bit set to 1, the following word is a displacement/length reference in the following format:

LLLL DDDD DDDD DDDD

L = Number of words to copy minus one (number of bytes to copy = (L + 1) * 2)
D = Displacement

This may be calculated as:

for count = 0 to ((top 4 bits of LZ word + 1) * 2) - 1
   data[current + count] = data[current + count - last 12 bits of LZ word]
next
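In Python, expanding one such reference word could be sketched as below. Only the reference step is shown, because how the sequence terminates in this format is not stated above. The function name is hypothetical.

```python
def copy_reference_4_12(out, word):
    """Append the bytes referenced by a 4-bit length / 12-bit displacement word."""
    byte_count = ((word >> 12) + 1) * 2   # top 4 bits hold words-to-copy minus one
    displacement = word & 0x0FFF          # low 12 bits: distance in bytes
    if displacement == 0 or displacement > len(out):
        raise ValueError("displacement points outside decoded data")
    for _ in range(byte_count):
        out.append(out[-displacement])    # byte-wise, so overlaps repeat data
```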

Chunks of type E7 are used in the Make My Video series, and contain three subframes, which are stacked on top of each other to complete the whole frame. E7 chunks have an additional six bytes of metadata containing three 16-bit values. The topmost bit of each value indicates whether the data for that subframe is raw (1) or compressed (0). The remaining 15 bits represent the length of the data for the subframe, with an apparent maximum value of 0x1500. The data for each subframe immediately follows the metadata in order, which is followed by 180 bytes of palette data in the usual format.
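The six bytes of subframe metadata could be unpacked as in the following sketch. The function name is hypothetical, and the caller is assumed to have located the start of the metadata already.

```python
import struct


def parse_subframe_table(meta):
    """Return three (is_raw, length) pairs from the 6-byte subframe metadata."""
    if len(meta) < 6:
        raise ValueError("subframe metadata is 6 bytes long")
    entries = []
    for (value,) in struct.iter_unpack(">H", meta[:6]):
        is_raw = bool(value & 0x8000)   # topmost bit: 1 = raw, 0 = compressed
        length = value & 0x7FFF         # remaining 15 bits: subframe data length
        entries.append((is_raw, length))
    return entries
```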

Games Using SGA

  • Night Trap
  • Sewer Shark
  • Power Factory Featuring C&C Music Factory
  • Make My Video series
  • Ground Zero: Texas
  • Prize Fighter
  • Double Switch
  • Slam City with Scottie Pippen
  • Corpse Killer
  • Supreme Warrior

Decoders/Converters

  • A decoder has been written that can currently decode SGA files from Night Trap (SCD and 32X), Sewer Shark (including stream selection), Corpse Killer (SCD and 32X), Prize Fighter, Double Switch, as well as various other games, and convert them to AVI format files. It has yet to be released to the public.