Meridian Lossless Packing

From MultimediaWiki

Also known as Dolby TrueHD. This is a lossless audio codec for DVD-Audio. As stated in the whitepaper, it uses IIR prediction (like TrueAudio or Monkey's Audio) and packs the residue with Golomb codes.

You can download samples created by rjamorim at RareWares:

They were created with Surcode MLP, and the original samples are also available (WavPack-encoded). God Save The Queen is a 6ch/96k/24b sample from Queen's A Night At The Opera DVD-A. luckynight is the same sample used elsewhere in this wiki, and 440hz is, as the name implies, a simple 440 Hz tone.

You can contact rjamorim if you want other samples encoded.


Overview

A good overview of the encoding/decoding process is available on Meridian's website:

An MLP audio stream can contain multiple substreams, and a decoder can optionally decode only a subset of them. Typically stereo files are encoded with a single substream, while surround files have one substream containing a stereo downmix and another substream containing the additional surround channels, plus the matrixing parameters needed to extract the front channels from the stereo downmix.

MLP streams can give the second substream a lower bit depth or sampling frequency than the first substream. TrueHD removes this capability.

Stream structure

A stream is split up into a number of small blocks, or "access units". Each access unit contains coded data for 40 (44100/48000Hz), 80 (88200/96000Hz) or 160 (176400/192000Hz) samples. Periodically an access unit contains a "major sync" header that contains stream metadata (analogous to a sequence header in a video codec).
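The mapping from sample rate to access unit size can be written down as a small lookup. This is only a sketch of the figures quoted above; the function name is made up for illustration.

```python
# Samples per access unit for each sample rate listed in the text above.
SAMPLES_PER_ACCESS_UNIT = {
    44100: 40, 48000: 40,
    88200: 80, 96000: 80,
    176400: 160, 192000: 160,
}

def samples_per_access_unit(sample_rate: int) -> int:
    """Return the number of samples coded in one access unit."""
    try:
        return SAMPLES_PER_ACCESS_UNIT[sample_rate]
    except KeyError:
        raise ValueError(f"unsupported sample rate: {sample_rate}") from None
```

With these figures, every access unit in the 48000 Hz family covers the same duration (40/48000 = 80/96000 = 160/192000 of a second).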

All values in the stream are big endian.

Access unit structure

Each access unit starts with a parity nibble (details later), followed by a 12-bit length field (in units of two bytes). Following this is a 2-byte value that looks like it may be some form of decoding timestamp (DTS): in most cases each packet's value is the sum of the previous packet's value and the number of samples per packet, but some values diverge from this.

After these first four bytes, the major sync header can optionally appear - if the next four bits are all one, then the following 28 bytes (including the four one bits) are a major sync header.
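As a rough illustration, the first bytes of an access unit could be picked apart as follows. This is a minimal sketch of the layout described above, not a complete parser; the names `AccessUnitHeader`, `parse_access_unit_header` and `timing` are invented here.

```python
import struct
from typing import NamedTuple

class AccessUnitHeader(NamedTuple):
    parity_nibble: int    # first 4 bits
    length_bytes: int     # 12-bit length field, converted from 2-byte units
    timing: int           # 2-byte value that may be some form of timestamp
    has_major_sync: bool  # True if the next four bits are all ones

def parse_access_unit_header(data: bytes) -> AccessUnitHeader:
    if len(data) < 5:
        raise ValueError("need at least 5 bytes")
    # Two big-endian 16-bit words: parity nibble + length, then the timing value.
    first, timing = struct.unpack(">HH", data[:4])
    return AccessUnitHeader(
        parity_nibble=first >> 12,
        length_bytes=(first & 0x0FFF) * 2,
        timing=timing,
        has_major_sync=(data[4] >> 4) == 0xF,
    )
```

For example, `parse_access_unit_header(bytes([0x50, 0x0A, 0x00, 0x28, 0xF8]))` reports a 20-byte access unit with a timing value of 40 and a major sync header following.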

Following this is an index of two- or four-byte entries for each substream. These have the following format:

Size      Meaning
1 bit     If set, indicates an extra 16 bits of data are present
1 bit     If clear, indicates that the substream data starts with a "restart header"
1 bit     If set, indicates that the substream data contains extra parity information
1 bit     Unknown
12 bits   Offset, in units of two bytes, of the end of the data for this substream from the start of the substream data block (i.e. length of this substream's data plus all previous)
16 bits   (Optional, present if first bit was set) Unknown

Following these entries is the data for each substream.
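The substream index described above could be read like this. It is a sketch under the assumption that the fields are packed most significant bit first in the order listed; the function and key names are hypothetical, and the number of substreams has to come from the major sync header.

```python
def parse_substream_index(data: bytes, num_substreams: int):
    """Parse the two- or four-byte index entries.

    Returns (entries, bytes_consumed).
    """
    entries = []
    pos = 0
    for _ in range(num_substreams):
        if pos + 2 > len(data):
            raise ValueError("truncated substream index")
        word = int.from_bytes(data[pos:pos + 2], "big")
        pos += 2
        entry = {
            "extra_word_present": bool(word & 0x8000),
            # Per the table, a *clear* bit means a restart header is present.
            "starts_with_restart_header": not (word & 0x4000),
            "parity_present": bool(word & 0x2000),
            "unknown_bit": bool(word & 0x1000),
            # Offset of the end of this substream, converted to bytes.
            "end_offset_bytes": (word & 0x0FFF) * 2,
        }
        if entry["extra_word_present"]:
            if pos + 2 > len(data):
                raise ValueError("truncated substream index")
            entry["extra_word"] = int.from_bytes(data[pos:pos + 2], "big")
            pos += 2
        entries.append(entry)
    return entries, pos
```

Because each offset is cumulative, the length of substream N is its end offset minus the end offset of substream N-1 (or zero for the first).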

The first nibble of the access unit is a parity check on the access unit header - XOR'ing together the nibbles of the first four bytes and the substream entry table above (excluding the major sync header if present and the substream data) should give a result of 0xF.
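The nibble parity check can be sketched as below. XORing whole bytes first and then folding the two halves of the result gives the same value as XORing every nibble individually. The function names are illustrative only.

```python
def header_nibble_parity(first_four_bytes: bytes, substream_index: bytes) -> int:
    """XOR together every nibble of the access unit header.

    The major sync header and the substream data are not included.
    """
    acc = 0
    for byte in first_four_bytes + substream_index:
        acc ^= byte
    # Fold the high nibble onto the low nibble.
    return (acc >> 4) ^ (acc & 0x0F)

def header_parity_ok(first_four_bytes: bytes, substream_index: bytes) -> bool:
    return header_nibble_parity(first_four_bytes, substream_index) == 0xF
```

An encoder could compute the parity nibble by running the same function with the nibble set to zero and XORing the result with 0xF.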

Major sync header

The major sync header is 28 bytes long, and starts with the three bytes F8 72 6F, followed by BB for MLP streams, or BA for TrueHD streams. It contains details like the number of substreams, the channel arrangement, bit depth and sampling frequency. In MLP streams, consecutive major syncs must be between 8 and 32 access units apart. In TrueHD streams, spacings of 128 access units have been observed.
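Telling the two stream types apart from the sync bytes might look like this. It is a small sketch based only on the byte values given above; the function name is invented.

```python
MAJOR_SYNC_PREFIX = bytes([0xF8, 0x72, 0x6F])
MAJOR_SYNC_LENGTH = 28  # bytes, as stated above

def major_sync_kind(data: bytes):
    """Return "MLP", "TrueHD", or None if data does not start with a major sync."""
    if len(data) < 4 or data[:3] != MAJOR_SYNC_PREFIX:
        return None
    if data[3] == 0xBB:
        return "MLP"
    if data[3] == 0xBA:
        return "TrueHD"
    return None
```

Note that `data` here starts at the major sync itself, i.e. four bytes into the access unit.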

The major sync header for MLP streams has the following format:

Size      Meaning
24 bits   SYNC_MAJOR (0xf8726f)
8 bits    SYNC_MLP (0xbb)
4 bits    Bit-depth (substream 0)
4 bits    Bit-depth (substream 1)
4 bits    Sample-rate (substream 0)
4 bits    Sample-rate (substream 1)
11 bits   Unknown
5 bits    Channel arrangement
16 bits   MAJOR_SYNC_INFO_SIGNATURE (0xb752)
16 bits   Info flags
16 bits   Unknown
1 bit     Set to 1 if VBR
15 bits   Coded peak bitrate
4 bits    Number of substreams
4 bits    Unknown (set to 0x1)
8 bits    Substream info
5 bits    Unknown (some substream info)
5 bits    Sample wordlength (no. of bits)
6 bits    Channel occupancy
3 bits    Unknown
10 bits   Speaker layout?
3 bits    Copy protection
16 bits   Unknown (set to 0x8080)
7 bits    Unknown
4 bits    Source format
5 bits    Summary info
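The table can be turned into a field-by-field reader. The sketch below assumes the fields are packed most significant bit first in the order listed. The field names (including the numbered `unknown_*` ones) are made up for illustration. The listed widths add up to 208 bits, i.e. 26 of the 28 bytes, so the last two bytes of the header are not covered here.

```python
# (name, width in bits), in the order given in the table above.
MLP_MAJOR_SYNC_FIELDS = [
    ("sync_major", 24), ("sync_mlp", 8),
    ("bit_depth_0", 4), ("bit_depth_1", 4),
    ("sample_rate_0", 4), ("sample_rate_1", 4),
    ("unknown_1", 11), ("channel_arrangement", 5),
    ("signature", 16), ("info_flags", 16), ("unknown_2", 16),
    ("is_vbr", 1), ("peak_bitrate", 15),
    ("num_substreams", 4), ("unknown_3", 4),
    ("substream_info", 8),
    ("unknown_4", 5), ("sample_wordlength", 5), ("channel_occupancy", 6),
    ("unknown_5", 3), ("speaker_layout", 10), ("copy_protection", 3),
    ("unknown_6", 16),
    ("unknown_7", 7), ("source_format", 4), ("summary_info", 5),
]

def parse_mlp_major_sync(data: bytes) -> dict:
    total_bits = sum(width for _, width in MLP_MAJOR_SYNC_FIELDS)  # 208
    nbytes = total_bits // 8
    if len(data) < nbytes:
        raise ValueError(f"need at least {nbytes} bytes")
    value = int.from_bytes(data[:nbytes], "big")
    fields = {}
    remaining = total_bits
    for name, width in MLP_MAJOR_SYNC_FIELDS:
        remaining -= width
        fields[name] = (value >> remaining) & ((1 << width) - 1)
    if fields["sync_major"] != 0xF8726F or fields["sync_mlp"] != 0xBB:
        raise ValueError("not an MLP major sync header")
    return fields
```

A decoder would typically also check that `signature` is 0xB752 before trusting the remaining fields.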

Substream data block

A substream data block contains one or more chunks of data.

A data chunk may start with a restart header followed by a decoding parameter block, just a decoding parameter block, or may just go straight into data.

A decoding parameter block is present if the first bit is '1'. If the next bit is also '1', a restart header is also present, before the decoding parameter block.
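The two flag bits at the start of a chunk could be read as follows. This is a sketch assuming bits are consumed most significant bit first. Chunks after the first are not necessarily byte-aligned, so the reader works on a bit position. The helper names are hypothetical.

```python
def read_bit(data: bytes, bit_pos: int) -> int:
    """Return the bit at bit_pos, counting from the MSB of data[0]."""
    return (data[bit_pos >> 3] >> (7 - (bit_pos & 7))) & 1

def read_chunk_prefix(data: bytes, bit_pos: int):
    """Return (params_present, restart_present, new_bit_pos)."""
    params_present = bool(read_bit(data, bit_pos))
    bit_pos += 1
    restart_present = False
    if params_present:
        # The second flag is only read when the first one is set.
        restart_present = bool(read_bit(data, bit_pos))
        bit_pos += 1
    return params_present, restart_present, bit_pos
```

When both flags are set, the restart header comes first in the stream, followed by the decoding parameter block.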

The encoded residual data then follows. If data_check_present was set in the last restart header for this substream, the data is preceded by a 16-bit count indicating how many bits of data should be present, and followed by an 8-bit value (currently unknown, maybe a CRC). The bit count includes the 16 count bits, but not the 8 extra bits.

A single bit marks the end of a chunk - this is set to 1 if it is the last chunk in the data block.

After the last chunk, padding bits are inserted to align the bitstream to a 16-bit boundary. The 32-bit value D2 34 D2 34 may then optionally be present, which indicates that this was the last access unit of the audio stream.
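Skipping the padding and checking for the end-of-stream marker might be done like this. It is a minimal sketch; it assumes bit positions are counted from the start of the substream data and that the substream itself starts on a 16-bit boundary, and the function names are invented.

```python
END_OF_STREAM_MARKER = bytes([0xD2, 0x34, 0xD2, 0x34])

def align_to_16_bits(bit_pos: int) -> int:
    """Round a bit position up to the next multiple of 16."""
    return (bit_pos + 15) & ~15

def is_last_access_unit(data: bytes, bit_pos: int) -> bool:
    """Check for the optional end marker after the padding bits."""
    byte_pos = align_to_16_bits(bit_pos) // 8
    return data[byte_pos:byte_pos + 4] == END_OF_STREAM_MARKER
```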

If substream parity information was indicated in the substream index earlier, the data for the substream is terminated by a parity check byte and a CRC byte. The parity check is such that XORing all the bytes of the substream up to (and including) the parity byte should give the value 0xA9.
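The substream parity check can be sketched as below, assuming the substream data ends with the parity byte followed by the CRC byte. The CRC is not verified here since its details are not documented on this page. The function names are illustrative.

```python
from functools import reduce
from operator import xor

def parity_byte_for(payload: bytes) -> int:
    """Parity byte that makes the XOR of payload + parity equal 0xA9."""
    return reduce(xor, payload, 0) ^ 0xA9

def substream_parity_ok(substream: bytes) -> bool:
    """Check a substream that ends with [parity byte][CRC byte]."""
    if len(substream) < 2:
        raise ValueError("substream too short to hold parity and CRC bytes")
    # XOR everything up to and including the parity byte; skip the CRC byte.
    return reduce(xor, substream[:-1], 0) == 0xA9
```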

Restart header

The restart header contains information necessary for the decoding of the audio stream. It is generally present in packets containing a major sync header. The restart header starts with a 14-bit code, either 0x31EA or 0x31EB. MLP streams only use 0x31EA headers, while TrueHD streams use either or both. The difference lies mostly in the way rematrixing is done.
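Reading and checking the 14-bit code could look like this. It is a sketch assuming MSB-first bit order, with made-up helper names. In the example, the restart header follows the two chunk flag bits, so the code starts at bit 2.

```python
RESTART_SYNC_CODES = (0x31EA, 0x31EB)

def read_bits(data: bytes, bit_pos: int, count: int) -> int:
    """Read count bits starting at bit_pos, most significant bit first."""
    value = 0
    for i in range(bit_pos, bit_pos + count):
        value = (value << 1) | ((data[i >> 3] >> (7 - (i & 7))) & 1)
    return value

def read_restart_sync(data: bytes, bit_pos: int, is_truehd: bool) -> int:
    """Return the 14-bit restart sync code, or raise if it is not valid."""
    code = read_bits(data, bit_pos, 14)
    if code not in RESTART_SYNC_CODES:
        raise ValueError(f"bad restart sync code: {code:#06x}")
    if code == 0x31EB and not is_truehd:
        raise ValueError("0x31EB restart header in an MLP stream")
    return code
```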

TODO: Document more

Weirdness

It appears that the sample rate field is a full byte, with each sample rate being encoded in chunks of 16: 0-15 = 48000, 16-31 = 96000 and 32-47 = 192000, then there is a hole from 48-127, followed by 128-143 = 44100, 144-159 = 88200 and 160-175 = 176400, then a second hole from 176-255.
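The ranges above follow a simple pattern in the top four bits of the byte: the highest bit selects the 44100 Hz or 48000 Hz family, and the next three bits give a power-of-two multiplier. The sketch below encodes that reading. It is an interpretation of the observed values rather than a documented rule, and the function name is invented.

```python
def decode_sample_rate(field: int):
    """Map the 8-bit sample rate field to a rate in Hz, or None for the holes."""
    if not 0 <= field <= 0xFF:
        raise ValueError("field must fit in one byte")
    code = field >> 4            # only the top nibble matters for the ranges above
    shift = code & 0x7           # 0, 1 or 2 -> x1, x2, x4
    if shift > 2:
        return None              # falls in one of the two holes
    base = 44100 if code & 0x8 else 48000
    return base << shift
```

This would also be consistent with the byte being two 4-bit fields, as in the major sync table, with the top nibble carrying the rate for substream 0.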