Sonarc

From MultimediaWiki

Sonarc is a lossless audio compression format created by Richard P. Sprague and published by the company Speech Compression. The documentation accompanying v2.1i (the latest known version of the software) carries a copyright date of 1994. This documentation further states, "Sonarc is now being used as an installation utility in PC titles published by Interplay, Origin, and The Software Toolworks, among others."

The algorithm operates on 8- or 16-bit PCM samples in either mono or stereo configurations. It typically achieves 2:1 to 3:1 compression ratios for suitable audio, while falling back to an uncompressed mode for data which falls outside of its coding model.

File Format

All multi-byte numbers are little endian.

Sonarc data can be transported in 2 different containers: either a custom container format or a standard WAV file.

Custom Container Format

Sonarc data can be stored in its own custom container with the following format:

bytes 0-25    file signature: 'Sonarc-squeezed PCM file.\x1A'
bytes 26-39   unknown

After this is a sequence of frames.
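
The header check above can be sketched in C as follows. This is a minimal sketch, not taken from any Sonarc tool: the function name sonarc_first_frame_offset and the constant names are hypothetical, and the 14 unknown bytes are simply skipped.

```c
#include <stddef.h>
#include <string.h>

/* 26-byte signature from the layout above, including the trailing 0x1A. */
static const char SONARC_SIGNATURE[] = "Sonarc-squeezed PCM file.\x1A";

enum {
    SONARC_SIGNATURE_LEN = 26, /* bytes 0-25 */
    SONARC_HEADER_LEN    = 40  /* bytes 0-39; bytes 26-39 are unknown */
};

/* Hypothetical helper: returns the offset of the first frame, or 0 if
 * buf does not start with the Sonarc custom container signature. */
size_t sonarc_first_frame_offset(const unsigned char *buf, size_t len)
{
    if (len < SONARC_HEADER_LEN)
        return 0;
    if (memcmp(buf, SONARC_SIGNATURE, SONARC_SIGNATURE_LEN) != 0)
        return 0;
    return SONARC_HEADER_LEN;
}
```

A caller would pass the first bytes of the file and, on a non-zero result, start reading frames at the returned offset.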

WAV Format

Sonarc files have the extension .SNC. However, the files are just standard Microsoft WAV audio files with a codec ID (format tag) of 0x0021. The size of the 'data' chunk gives the total number of compressed bytes. The next 4 bytes give the total number of decompressed bytes. After this is a sequence of frames.
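
A rough C sketch of the two checks described above. It assumes the caller has already located the 'fmt ' and 'data' chunk payloads with an ordinary RIFF parser, and it assumes that the 4-byte decompressed size is the first thing inside the 'data' payload (one reading of the description, not a confirmed fact). All function and constant names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SONARC_FORMAT_TAG 0x0021u /* codec ID given in the text */

static uint16_t rd16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The format tag is the first 16-bit field of the 'fmt ' chunk payload. */
int wav_is_sonarc(const unsigned char *fmt_payload, size_t fmt_len)
{
    return fmt_len >= 2 && rd16le(fmt_payload) == SONARC_FORMAT_TAG;
}

/* Assumption: the decompressed byte count occupies the first 4 bytes of
 * the 'data' payload and the frames follow immediately after it.
 * Returns 1 on success, 0 if the payload is too short. */
int sonarc_wav_payload_info(const unsigned char *data_payload, size_t data_len,
                            uint32_t *decoded_bytes, size_t *first_frame)
{
    if (data_len < 4)
        return 0;
    *decoded_bytes = rd32le(data_payload);
    *first_frame = 4;
    return 1;
}
```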

Frame Format

A frame has the following format:

bytes 0-1    frame size in bytes (including this 2-byte length field)
bytes 2-3    number of samples in decompressed frame (256-2048)
bytes 4-5    CRC value (full unpacked frame data words XORed together shall produce 0xACED)
byte  6      coding mode (top bit signals RLE compression of frame data, bottom bits are coding method used)
byte  7      LPC order
bytes 8-..   LPC coefficients (16-bit words)
bytes ...    frame data (optionally RLE compressed)
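
The frame layout above might be read along these lines. This is a sketch under stated assumptions: the number of 16-bit LPC coefficients is taken to equal the LPC order, the "bottom bits" of the coding mode are taken to be the low 7 bits, and the exact byte range covered by the XOR check is not specified, so the checksum helper just XORs whatever buffer it is given. Struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SONARC_XOR_MAGIC 0xACEDu /* expected XOR of the unpacked frame words */

typedef struct {
    uint16_t frame_size;  /* bytes 0-1, includes the length field itself */
    uint16_t num_samples; /* bytes 2-3 */
    uint16_t crc;         /* bytes 4-5 */
    int      rle;         /* top bit of byte 6 */
    unsigned method;      /* assumed: low 7 bits of byte 6 */
    unsigned lpc_order;   /* byte 7 */
    size_t   data_offset; /* where frame data starts inside the frame */
} SonarcFrameHeader;

static uint16_t rd16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 1 if a plausible header was read from f, 0 otherwise. */
int sonarc_parse_frame_header(const unsigned char *f, size_t avail,
                              SonarcFrameHeader *h)
{
    if (avail < 8)
        return 0;
    h->frame_size  = rd16le(f);
    h->num_samples = rd16le(f + 2);
    h->crc         = rd16le(f + 4);
    h->rle         = (f[6] & 0x80) != 0;
    h->method      = f[6] & 0x7Fu;
    h->lpc_order   = f[7];
    /* Assumption: one 16-bit coefficient per unit of LPC order. */
    h->data_offset = 8 + 2 * (size_t)h->lpc_order;
    if (h->frame_size > avail || h->data_offset > h->frame_size)
        return 0;
    return 1;
}

/* XORs the little-endian 16-bit words of p; a trailing odd byte is ignored. */
uint16_t sonarc_xor_words(const unsigned char *p, size_t len)
{
    uint16_t acc = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        acc ^= rd16le(p + i);
    return acc;
}
```

A decoder would compare sonarc_xor_words() over the unpacked frame against SONARC_XOR_MAGIC once it knows which bytes the check covers.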

Buffer RLE decompression works as follows:

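 /* One decoding step (presumably repeated until the input is consumed). */
 byte = *src++;
 if (byte != 0x81) *dst++ = byte;     /* ordinary byte: copy as-is */
 else {                               /* 0x81 is the escape byte */
   len = *src++;
   if (len == 0) *dst++ = byte;       /* 0x81 0x00: literal 0x81 */
   else if (len <= 2) {               /* 0x81 0x01 or 0x02: explicit run */
     len = *src++;                    /* run length */
     byte = *src++;                   /* byte to repeat */
     while (len--) *dst++ = byte;
   } else {                           /* 0x81 N, N >= 3: N bytes of 0xFF */
     while (len--) *dst++ = 0xFF;
   }
 }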
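
The step above can be wrapped into a complete, bounds-checked routine roughly like this. It follows the same rules as the snippet and adds only the outer loop and the length checks. The function name and the error convention (returning (size_t)-1) are choices made for this sketch.

```c
#include <stddef.h>

#define SONARC_RLE_ESCAPE 0x81

/* Unpacks src into dst. Returns the number of bytes written, or
 * (size_t)-1 if the input is truncated or dst is too small. */
size_t sonarc_rle_unpack(const unsigned char *src, size_t src_len,
                         unsigned char *dst, size_t dst_cap)
{
    size_t si = 0, di = 0;

    while (si < src_len) {
        unsigned char byte = src[si++];
        size_t run = 1; /* default: emit one byte */

        if (byte == SONARC_RLE_ESCAPE) {
            if (si >= src_len)
                return (size_t)-1;
            unsigned len = src[si++];
            if (len == 0) {
                /* escaped literal 0x81: run stays 1, byte stays 0x81 */
            } else if (len <= 2) {
                /* explicit run: count byte, then value byte */
                if (src_len - si < 2)
                    return (size_t)-1;
                run  = src[si++];
                byte = src[si++];
            } else {
                /* implicit run of len bytes of 0xFF */
                run  = len;
                byte = 0xFF;
            }
        }

        if (dst_cap - di < run)
            return (size_t)-1;
        while (run--)
            dst[di++] = byte;
    }
    return di;
}
```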

There are at least 2 modes that Sonarc can use to code audio information: compressed and uncompressed. The details of the compression format are unknown. However, like most (all?) lossless compression algorithms, Sonarc includes a fallback mode which encodes the data raw. The size of an uncompressed chunk is predictably calculated as (frame size + 28).

8-bit audio may be coded in the following ways:

  • old coding (version 1)
    • methods 0-7 -- static Huffman codes for residues
    • method 8 -- raw data
  • newer coding (version 2) -- some adaptive Huffman coding
  • new coding (version 3) -- residue coding as for 16-bit samples but with its own 14 sets of up to nine categories

16-bit audio is coded using a set of codebooks describing the code. To decode a residue sample, a unary prefix is read first to determine the code category, which defines the number of bits that follow and how to interpret them. The code itself is then read in a way resembling JPEG DC coefficients. There may be 30 category sets, selected by the compression method.
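
The general shape of "unary prefix, then JPEG DC-style bits" can be illustrated as below. This is only an illustration of the technique, not Sonarc's actual bitstream: the bit order (MSB first), the unary convention (count of 1 bits terminated by a 0), and the category table mapping each category to a bit count are all assumptions, and the real category sets are not documented here. The sign handling is the standard JPEG rule, where an n-bit value below 2^(n-1) stands for a negative number.

```c
#include <stddef.h>

typedef struct {
    const unsigned char *buf;
    size_t len;    /* buffer length in bytes */
    size_t bitpos; /* next bit to read, counted from the start */
} BitReader;

/* Reads one bit, MSB first (assumed order). Returns -1 at end of data. */
static int get_bit(BitReader *br)
{
    if (br->bitpos >= br->len * 8)
        return -1;
    int bit = (br->buf[br->bitpos >> 3] >> (7 - (br->bitpos & 7))) & 1;
    br->bitpos++;
    return bit;
}

/* JPEG-style sign extension of an nbits-wide value. */
static int jpeg_extend(int v, int nbits)
{
    if (nbits == 0)
        return 0;
    return (v < (1 << (nbits - 1))) ? v - (1 << nbits) + 1 : v;
}

/* cat_bits is a hypothetical table: bits to read for each category.
 * Returns 1 and stores the residue in *out, or 0 on error. */
int read_residue(BitReader *br, const unsigned char *cat_bits, int ncats,
                 int *out)
{
    int cat = 0, bit;

    /* Unary prefix: number of 1 bits before the terminating 0. */
    while ((bit = get_bit(br)) == 1) {
        if (++cat >= ncats)
            return 0;
    }
    if (bit < 0)
        return 0;

    int nbits = cat_bits[cat];
    int v = 0;
    for (int i = 0; i < nbits; i++) {
        bit = get_bit(br);
        if (bit < 0)
            return 0;
        v = (v << 1) | bit;
    }
    *out = jpeg_extend(v, nbits);
    return 1;
}
```

With a made-up table {0, 1, 2, 3}, the bit string 110 01 selects category 2 and reads the 2-bit value 01, which JPEG-style extension maps to -2.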