ATRAC3plus
- Format tag: uses WAVE_FORMAT_EXTENSIBLE with the "SubFormat" field set to the following GUID: E923AABF-CB58-4471-A119-FFFA01E4CE62
- Company: Sony
- Samples: http://samples.mplayerhq.hu/A-codecs/ATRAC3+/
- Stored in: WAV and Oma/Omg containers.
- Official information: http://www.sony.net/Products/ATRAC3/tech/atrac3plus.html
ATRAC3plus introduction
ATRAC3plus is a proprietary audio compression algorithm developed by Sony. Like ATRAC3, it is a successor to the original ATRAC codec introduced in 1992 with the MiniDisc. The codec is commonly used in newer MiniDisc players and in Sony's PlayStation Portable.
Streams coded with ATRAC3plus are usually stored either in the WAV container (although such files carry the ".at3" extension) or in Sony's proprietary Oma/Omg container. In the case of the WAV container, the undocumented GUID:
E923AABF-CB58-4471-A119-FFFA01E4CE62
is used in order to indicate the ATRAC3plus codec.
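As an illustration, the check for this GUID could be sketched as follows. This is a minimal sketch, not taken from any existing decoder: the function name `is_atrac3plus_fmt` is hypothetical, and it assumes the caller has already extracted the body of the WAV "fmt " chunk. It relies on the standard WAVEFORMATEXTENSIBLE layout, in which the format tag is 0xFFFE and the SubFormat GUID occupies bytes 24 to 39, stored with its first three fields in little-endian order.

```python
import struct
import uuid

WAVE_FORMAT_EXTENSIBLE = 0xFFFE
# GUID quoted in the text above.
ATRAC3PLUS_SUBFORMAT = uuid.UUID("E923AABF-CB58-4471-A119-FFFA01E4CE62")


def is_atrac3plus_fmt(fmt_chunk: bytes) -> bool:
    """Return True if the body of a WAV 'fmt ' chunk declares ATRAC3plus.

    Hypothetical helper: `fmt_chunk` is the chunk body without the
    8-byte chunk header.
    """
    if len(fmt_chunk) < 40:  # WAVEFORMATEXTENSIBLE is 40 bytes long
        return False
    (format_tag,) = struct.unpack_from("<H", fmt_chunk, 0)
    if format_tag != WAVE_FORMAT_EXTENSIBLE:
        return False
    # SubFormat is the last 16 bytes of the structure (offset 24).
    sub_format = uuid.UUID(bytes_le=bytes(fmt_chunk[24:40]))
    return sub_format == ATRAC3PLUS_SUBFORMAT
```

Comparing parsed `uuid.UUID` objects rather than raw bytes avoids mistakes with the mixed byte order of on-disk GUIDs.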
There is a very limited number of software products that support encoding/decoding of ATRAC3plus streams; unfortunately, most of them are available for Microsoft Windows only. Those are:
- Sony's own SonicStage software (Windows only)
- ATRAC Codec Plugin for Sony Media Software (Windows only)
- Sonic Studio's expensive N-code plugin for professionals (available for Windows and Mac OS X)
There is a multi-channel version of ATRAC3plus called "ATRAC-X".
ATRAC3plus technical documentation
Supported bitrates
ATRAC3plus operates at fixed bitrates only. The following bitrates are supported:
bitrate | frame size (stereo) |
---|---|
48 Kbps | 280 bytes |
64 Kbps | 376 bytes |
96 Kbps | 560 bytes |
128 Kbps | 744 bytes |
160 Kbps | 936 bytes |
192 Kbps | 1120 bytes |
256 Kbps | 1488 bytes |
320 Kbps | 1864 bytes |
352 Kbps | 2048 bytes |
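The relation between frame size and bitrate can be checked with a little arithmetic. The sketch below rests on two assumptions that the table itself does not state: that each frame carries 2048 samples per channel (the size quoted under "Coding techniques") and that the sample rate is 44100 Hz. Under those assumptions the computed values land within about 1.3 Kbps of the nominal bitrates.

```python
SAMPLES_PER_FRAME = 2048  # assumption: one frame covers 2048 samples per channel
SAMPLE_RATE_HZ = 44100    # assumption: not stated in the table

# Nominal bitrate (Kbps) -> stereo frame size (bytes), from the table above.
FRAME_SIZES = {
    48: 280, 64: 376, 96: 560, 128: 744, 160: 936,
    192: 1120, 256: 1488, 320: 1864, 352: 2048,
}


def bitrate_kbps(frame_bytes: int,
                 sample_rate: int = SAMPLE_RATE_HZ,
                 samples_per_frame: int = SAMPLES_PER_FRAME) -> float:
    """Bitrate implied by a fixed frame size, in kilobits per second."""
    frames_per_second = sample_rate / samples_per_frame
    return frame_bytes * 8 * frames_per_second / 1000


if __name__ == "__main__":
    for nominal, size in FRAME_SIZES.items():
        print(f"{nominal:3d} Kbps nominal, {size:4d} bytes -> "
              f"{bitrate_kbps(size):7.2f} Kbps computed")
```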
Coding techniques
ATRAC3plus is a hybrid subband/MDCT codec, like MP3. The signal is split into 16 subbands using a quadrature mirror filter (QMF) before the MDCT and bit allocation. The MDCT window is 2048 samples long per channel.
Various algorithms are used to improve compression results:
- gain control for reducing pre-echo artifacts
- generalized harmonic analysis (GHA) for separating tone components
- power compensation for better quality at low bitrates
The following techniques are used in order to make the compressed data smaller:
- variable-length (Huffman) coding
- vector quantization based on trained tables
- differential coding
Probably the most interesting part of the ATRAC3plus codec is generalized harmonic analysis (GHA), an inharmonic frequency analysis proposed by Norbert Wiener in 1930. Its main advantage is an excellent frequency resolution that surpasses that of the short-time discrete Fourier transform. However, it requires a huge amount of computation. Several algorithms that work around this problem have been introduced during the last 20 years, for example the one proposed by Dr. Hirata.
Multichannel ATRAC3plus (ATRAC-X)
ATRAC3plus supports multichannel streams (up to 8 channels). Such streams are encoded in units customarily called "channel blocks"; each block contains at most 2 channels (i.e. it is either MONO or STEREO). For example, for channel_id = 3 the table below gives a stream containing 2 channel blocks, 1 stereo + 1 mono, and thus 3 channels. The base codec operates on either MONO or STEREO channel blocks only.
ATRAC-X channel configurations
channel_id | total channels | number of channel blocks | speaker mapping |
---|---|---|---|
0 | 0 | undefined | |
1 | 1 | 1 | |
2 | 2 | 1 | |
3 | 3 | 2 | |
4 | 4 | 3 | |
5 | 5+1 | 4 | |
6 | 6+1 | 5 | |
7 | 7+1 | 5 | |
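Because every channel block holds either one or two channels, the number of stereo and mono blocks in each configuration follows from the two table columns alone: with c channels in b blocks, there are c - b stereo blocks and 2b - c mono blocks. The sketch below works this out. It assumes that "5+1", "6+1" and "7+1" mean 6, 7 and 8 channels in total (main channels plus one LFE channel); the names `CHANNEL_CONFIGS` and `block_split` are hypothetical. It yields only the counts, not the order of the blocks in the stream.

```python
# channel_id -> (total channels, number of channel blocks), from the table above.
# Assumption: "5+1", "6+1" and "7+1" are read as 6, 7 and 8 channels in total.
CHANNEL_CONFIGS = {
    1: (1, 1),
    2: (2, 1),
    3: (3, 2),
    4: (4, 3),
    5: (6, 4),
    6: (7, 5),
    7: (8, 5),
}


def block_split(channel_id: int) -> tuple[int, int]:
    """Return (stereo_blocks, mono_blocks) for a channel configuration.

    Solves  stereo + mono = blocks  and  2*stereo + mono = channels.
    """
    channels, blocks = CHANNEL_CONFIGS[channel_id]
    stereo = channels - blocks
    mono = 2 * blocks - channels
    if stereo < 0 or mono < 0:
        raise ValueError(f"inconsistent configuration for channel_id {channel_id}")
    return stereo, mono


if __name__ == "__main__":
    for cid in sorted(CHANNEL_CONFIGS):
        stereo, mono = block_split(cid)
        print(f"channel_id {cid}: {stereo} stereo + {mono} mono block(s)")
```

For channel_id = 3 this gives 1 stereo + 1 mono block, matching the example in the text.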
Bitstream overview
The table below shows the top-level bitstream organization of ATRAC3plus. Depending on the channel configuration, the bitstream may contain more than one channel block. In this case the fields channel_block_type and channel_block_data are repeated for each additional block.
name | number of bits | value | description |
---|---|---|---|
start_marker | 1 | 0 | marks the start of the ATRAC3plus bitstream |
channel_block_type | 2 | | type of the channel block |
channel_block_data | variable | | contains encoded sound information |
terminator | 2 | 11b | indicates the end of the bitstream |
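The top-level loop implied by this table can be sketched as below. It is a minimal sketch under several assumptions that the table does not confirm: bits are read most significant bit first, the 2-bit codes 0, 1 and 2 stand for mono, stereo and extension blocks (only the terminator value 11b is given above), and the actual decoding of channel_block_data is delegated to a caller-supplied function. `BitReader`, `parse_frame` and `decode_block` are hypothetical names.

```python
class BitReader:
    """Reads bits MSB-first from a bytes object (bit order is an assumption)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


# Assumed numeric codes; only TERMINATOR (11b) is stated in the table.
BLOCK_MONO, BLOCK_STEREO, BLOCK_EXTENSION, TERMINATOR = 0, 1, 2, 3


def parse_frame(data: bytes, decode_block):
    """Walk the top-level fields of one frame.

    `decode_block(reader, block_type)` is a hypothetical callback that must
    consume exactly the bits of channel_block_data and return its result.
    """
    reader = BitReader(data)
    if reader.read(1) != 0:
        raise ValueError("start_marker must be 0")
    blocks = []
    while True:
        block_type = reader.read(2)
        if block_type == TERMINATOR:
            break
        blocks.append((block_type, decode_block(reader, block_type)))
    return blocks
```

Since channel_block_data has no length field in the table, the loop can only find the next channel_block_type if the callback consumes the block data exactly.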
Channel block types
ATRAC3plus defines the following channel block types:
- Mono channel block: contains monaural sound data.
- Stereo channel block: contains stereophonic sound data.
- Extension block: as its name indicates, it is intended to carry extension information. Its exact purpose is unknown, however, because no official description is available. All existing decoder implementations are programmed to ignore such blocks.
Channel block layout
ATRAC3plus was designed to provide high-quality sound compression and therefore tries to save as many bits as possible. Compared to ATRAC3, it uses a new coding scheme for channel blocks: the channels of a stereo signal are no longer coded separately but together in one stereo channel block. The bitstream for such a block lets both channels share several sound parameters, so the same information need not be transmitted twice. Depending on the correlation between the channels, this can lead to a significant bit reduction and thus improve coding quality.
A mono/stereo channel block contains the following pieces of sound information:
name | size in bits | description |
---|---|---|
sound_header | 6 | defines some global sound parameters |
wordlength_info | variable | quantization word length information for each coded subband |
scalefactor_info | variable | quantization scale factor indexes for each coded subband |
huffman_info | variable | Huffman table information for each coded subband |
spectra | variable | Huffman-coded spectral information for each coded subband |
window_info | variable | tells which IMDCT window shape should be used during the sound reconstruction |