ATRAC3plus

From MultimediaWiki

ATRAC3plus introduction

ATRAC3plus is a proprietary audio compression algorithm developed by Sony. Like ATRAC3, ATRAC3plus represents the next generation of the ATRAC codec, which was introduced in 1992 with the MiniDisc. The codec is commonly used in newer MiniDisc players and in PlayStation Portable consoles made by Sony.

Streams coded with ATRAC3plus are usually stored either in the WAV container (although those files have the ".at3" extension) or in Sony's proprietary Oma/Omg container. In the case of the WAV container, the undocumented GUID:

E923AABF-CB58-4471-A119-FFFA01E4CE62

is used in order to indicate the ATRAC3plus codec.
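
As an illustration of how a demuxer might test for this GUID, here is a minimal Python sketch. It assumes the GUID is stored in the WAV header in the usual Windows mixed-endian layout (as for the SubFormat field of WAVE_FORMAT_EXTENSIBLE); the text above does not state this. The function name is_atrac3plus_subformat is hypothetical.

```python
import uuid

# GUID quoted in the text above.
ATRAC3PLUS_GUID = uuid.UUID("E923AABF-CB58-4471-A119-FFFA01E4CE62")


def is_atrac3plus_subformat(raw: bytes) -> bool:
    """Return True if the 16 raw header bytes encode the ATRAC3plus GUID.

    Assumption: the bytes use the Windows on-disk layout, in which the
    first three GUID fields are little-endian (uuid's "bytes_le" form).
    """
    if len(raw) != 16:
        return False
    return uuid.UUID(bytes_le=raw) == ATRAC3PLUS_GUID


# Byte sequence one would expect to find in the file under that assumption.
print(ATRAC3PLUS_GUID.bytes_le.hex(" "))
```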

There is a very limited number of software products that support encoding/decoding of ATRAC3plus streams; unfortunately, most of them are available for Microsoft Windows only. Those are:

  • Sony's own SonicStage software (Windows only)
  • ATRAC Codec Plugin for Sony Media Software (Windows only)
  • Sonic Studio's expensive N-code plugin for professionals (available for Windows and Mac OS X)

There is a multi-channel version of ATRAC3plus called "ATRAC-X".

ATRAC3plus technical documentation

Supported bitrates

ATRAC3plus operates on fixed bitrates only. The following bitrates are supported:

   bitrate      frame size (stereo)
-------------   -------------------
   48 Kbps           280 bytes
   64 Kbps           376 bytes
   96 Kbps           560 bytes
  128 Kbps           744 bytes
  160 Kbps           936 bytes
  192 Kbps          1120 bytes
  256 Kbps          1488 bytes
  320 Kbps          1864 bytes
  352 Kbps          2048 bytes
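
The frame sizes above can be cross-checked against the nominal bitrates with a short calculation. The following Python sketch assumes a 44100 Hz sample rate and that each fixed-size frame carries 2048 samples per channel; neither value is stated in the table, so treat both as assumptions. The helper name effective_bitrate_bps is hypothetical.

```python
SAMPLE_RATE_HZ = 44100     # assumption: typical sample rate, not given above
SAMPLES_PER_FRAME = 2048   # assumption: one frame covers 2048 samples per channel

# Nominal bitrate (kbit/s) -> stereo frame size in bytes, copied from the table.
FRAME_SIZES = {
    48: 280, 64: 376, 96: 560, 128: 744, 160: 936,
    192: 1120, 256: 1488, 320: 1864, 352: 2048,
}


def effective_bitrate_bps(frame_bytes: int,
                          sample_rate_hz: int = SAMPLE_RATE_HZ,
                          samples_per_frame: int = SAMPLES_PER_FRAME) -> float:
    """Bits per second implied by a fixed frame size."""
    frames_per_second = sample_rate_hz / samples_per_frame
    return frame_bytes * 8 * frames_per_second


for nominal_kbps, frame_bytes in FRAME_SIZES.items():
    effective = effective_bitrate_bps(frame_bytes) / 1000
    print(f"{nominal_kbps:>3} kbit/s nominal, {frame_bytes:>4} bytes/frame, "
          f"{effective:.1f} kbit/s effective")
```

Under these assumptions the effective rates land within about 2% of the nominal values, which suggests the listed bitrates are rounded labels rather than exact figures.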

Coding techniques

ATRAC3plus is a hybrid subband/MDCT codec like MP3. The signal is split into 16 subbands using a quadrature mirror filter (QMF) bank before MDCT and bit allocation. The MDCT window has a size of 2048 samples per channel.

Various algorithms are used to improve compression results:

  • gain control for reducing pre-echo artifacts
  • generalized harmonic analysis (GHA) for separating tone components
  • power compensation for better quality at low bitrates

The following techniques are used in order to make the compressed data smaller:

Probably the most interesting part of the ATRAC3plus codec is the Generalized Harmonic Analysis (GHA), an inharmonic frequency analysis proposed by Norbert Wiener in 1930. Its main advantage is an excellent frequency resolution that surpasses that of the short-time discrete Fourier transform. However, it requires a huge amount of calculation. Several algorithms that work around this problem have been introduced during the last 20 years, for example the one proposed by Dr. Hirata.

Multichannel ATRAC3plus (ATRAC-X)

ATRAC3plus supports multichannel streams (up to 8 channels). Such streams are encoded in units customarily called "channel blocks"; each block contains at most 2 channels (i.e. it can be MONO or STEREO). For example, channel_id = 3 in the table below denotes a stream containing 2 channel blocks (1 stereo + 1 mono) and thus 3 channels. The base codec operates on either MONO or STEREO channel blocks only.

ATRAC-X channel configurations

channel_id   total channels   number of channel blocks   speaker mapping
----------   --------------   ------------------------   ---------------
    0              0                 undefined           undefined
    1              1                     1               front: center (MONO)
    2              2                     1               front: L, R (STEREO)
    3              3                     2               front: L, R
                                                         front: center
    4              4                     3               front: L, R
                                                         front: center
                                                         rear: surround
    5             5+1                    4               front: L, R
                                                         front: center
                                                         rear: L, R
                                                         LFE
    6             6+1                    5               front: L, R
                                                         front: center
                                                         rear: L, R
                                                         rear: center
                                                         LFE
    7             7+1                    5               front: L, R
                                                         front: center
                                                         rear: L, R
                                                         side: L, R
                                                         LFE
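
The table above can be expressed as a small lookup structure. The Python sketch below assumes that each speaker-mapping entry corresponds to exactly one channel block, that "L, R" entries are STEREO blocks, and that all other entries (including LFE) are MONO blocks; the table does not spell this out, although the block counts agree with that reading. The names CHANNEL_CONFIGS and describe_config are hypothetical.

```python
# channel_id -> list of (speaker mapping, channels in that block).
# Assumption: one list entry per channel block, 2 = STEREO, 1 = MONO.
FRONT_LR, FRONT_C = ("front: L, R", 2), ("front: center", 1)
REAR_LR, LFE = ("rear: L, R", 2), ("LFE", 1)

CHANNEL_CONFIGS = {
    1: [FRONT_C],
    2: [FRONT_LR],
    3: [FRONT_LR, FRONT_C],
    4: [FRONT_LR, FRONT_C, ("rear: surround", 1)],
    5: [FRONT_LR, FRONT_C, REAR_LR, LFE],
    6: [FRONT_LR, FRONT_C, REAR_LR, ("rear: center", 1), LFE],
    7: [FRONT_LR, FRONT_C, REAR_LR, ("side: L, R", 2), LFE],
}


def describe_config(channel_id: int) -> tuple[int, int]:
    """Return (number of channel blocks, total channels incl. LFE)."""
    if channel_id not in CHANNEL_CONFIGS:
        raise ValueError(f"undefined channel_id: {channel_id}")
    blocks = CHANNEL_CONFIGS[channel_id]
    return len(blocks), sum(channels for _, channels in blocks)


for cid in sorted(CHANNEL_CONFIGS):
    n_blocks, n_channels = describe_config(cid)
    print(f"channel_id {cid}: {n_blocks} block(s), {n_channels} channel(s)")
```

Note that the totals returned here count the LFE channel, so channel_id 5 yields 6 channels where the table writes 5+1.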

Bitstream overview

The table below shows the top-level bitstream organization of ATRAC3plus. Depending on the channel configuration, the bitstream may contain more than one channel block. In this case additional channel_block_type and channel_block_data fields are included for each block.


name                 number of bits   value                   description
------------------   --------------   ---------------------   -----------
start_marker               1          0                       marks the start of the ATRAC3plus bitstream
channel_block_type         2          00b - MONO block        type of the channel block
                                      01b - STEREO block
                                      10b - EXTENSION block
channel_block_data      variable                              contains encoded sound information
terminator                 2          11b                     indicates the end of the bitstream
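
The top-level layout above suggests a simple parsing loop: check the start marker, then keep reading 2-bit block types until the 11b terminator appears. The Python sketch below illustrates this under two assumptions that the table does not confirm: bits are read most-significant-bit first, and the terminator occupies the same position as a channel_block_type field. BitReader, walk_frame and the decode_block callback are hypothetical names; because channel_block_data has variable length, the callback must consume exactly the bits of one block.

```python
class BitReader:
    """Reads bits MSB-first from a bytes object (bit order is an assumption)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


BLOCK_TYPES = {0b00: "MONO", 0b01: "STEREO", 0b10: "EXTENSION"}
TERMINATOR = 0b11


def walk_frame(data: bytes, decode_block) -> list[str]:
    """Return the channel block types found in one frame, in order.

    decode_block(reader, block_type) is a caller-supplied (hypothetical)
    function that consumes the channel_block_data of a single block.
    """
    reader = BitReader(data)
    if reader.read(1) != 0:
        raise ValueError("start_marker must be 0")
    blocks = []
    while True:
        code = reader.read(2)
        if code == TERMINATOR:
            return blocks
        block_type = BLOCK_TYPES[code]
        decode_block(reader, block_type)
        blocks.append(block_type)


# Toy usage: a fake decoder that pretends every block is 5 bits long.
# Frame bits: 0 | 01 ddddd | 00 ddddd | 11, padded with zeros.
frame = bytes([0x35, 0x2B, 0x80])
print(walk_frame(frame, lambda reader, block_type: reader.read(5)))
```

A real decoder would replace the fake callback with the actual channel block parser; the loop itself only mirrors the field order given in the table.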

Channel block types

ATRAC3plus has the following channel block types:

  • Mono channel block: contains monaural sound data.
  • Stereo channel block: contains stereophonic sound data.
  • Extension block: as its name indicates, it is intended to carry some extension information. Its purpose is unknown, however, because no official description is available. All existing decoder implementations are programmed to ignore such blocks.

Channel block layout

ATRAC3plus was designed to provide high-quality sound compression. Therefore it tries to save as many bits as possible. Compared to ATRAC3, it uses a new coding scheme for channel blocks: the channels of a stereo signal are no longer coded separately but rather in one stereo channel block. The bitstream for such a block allows both channels to share several sound parameters, so that there is no need to transmit the same things twice. Depending on the correlation between the channels, this can lead to a significant bit reduction and thus improve coding quality.

A mono/stereo channel block contains the following pieces of sound information:

name               size in bits   description
----------------   ------------   -----------
sound_header            6         defines some global sound parameters
wordlength_info      variable     quantization word length information for each coded subband
scalefactor_info     variable     quantization scale factor indexes for each coded subband
huffman_info         variable     Huffman table information for each coded subband
spectra              variable     Huffman-coded spectral information for each coded subband
window_info          variable     tells which IMDCT window shape should be used during the sound reconstruction