Dolby E

From MultimediaWiki

Dolby E is a codec from Dolby Laboratories that is used to transport up to 8 channels of audio across AES-3 cabling (AES-3 is the professional version of SPDIF). It is carried in a SMPTE-337M data burst. Dolby E also carries metadata such as downmixing information which is intended to be passed through to the final distribution encoder.

Dolby E is very similar to AC-3, but it uses a longer transform length and different windows. The LFE channel also has a postfilter, and the overall bitrate is higher. The official decoder is very slow and has no SIMD at all. A decoder in FFmpeg is 20-30x faster.


Frame Structure

Frame structure.png

Dolby E is designed to match up with video frames to allow for easy cutting. Guard Bands are also present at the beginning and the end of the frame to reduce the risk of bad splicing causing problems.

There are 3 input bit depths of Dolby E: 16-bit, 20-bit and 24-bit. 16-bit mode has a maximum of 6 channels and 20-bit and 24-bit modes have a maximum of 8 channels.


Dolby E uses the following startcodes:

16-bit: 0x78e
20-bit: 0x788e
24-bit: 0x7888e

The LSB of the startcode signals the presence of a Bitstream Key. The Bitstream Key is mandatory in 16-bit mode.
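
As a minimal sketch of the check described above, the following Python function classifies a sync word and reads the key flag from its LSB. It assumes the word arrives left-aligned in a 24-bit AES3 sample, which is not stated above; `SYNC_WORDS` and `detect_sync` are illustrative names.

```python
# Startcodes from the list above, with the LSB (key flag) cleared.
SYNC_WORDS = {16: 0x078E, 20: 0x0788E, 24: 0x07888E}


def detect_sync(word24: int):
    """Return (bit_depth, has_bitstream_key), or None if no startcode matches.

    Assumption: shorter words sit in the most significant bits of the
    24-bit sample, so they are shifted down before comparison.
    """
    for bits, sync in SYNC_WORDS.items():
        value = (word24 & 0xFFFFFF) >> (24 - bits)
        if value & ~1 == sync:
            return bits, bool(value & 1)
    return None
```

A caller would scan the incoming samples with `detect_sync` until it returns a result, then read the following word as the Bitstream Key when the flag is set.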

Bitstream Key

Certain parts of the bitstream seem to be XOR ciphered. The key is always the first word of the section that is XORed.
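
A sketch of undoing the cipher, taking the statement above at face value: the first word of a section is the key and every following word is XORed with it. That the same key applies unchanged to every word of the section is an assumption, and `unscramble_section` is a hypothetical helper name.

```python
def unscramble_section(words, word_bits):
    """Strip the leading key word and XOR it out of the rest of the section.

    words:     list of integers, one per AES3 word (16, 20 or 24 bits each)
    word_bits: width of each word, used to mask the result
    """
    if not words:
        return []
    mask = (1 << word_bits) - 1
    key = words[0] & mask
    # Assumption: a single key covers every remaining word in the section.
    return [(w ^ key) & mask for w in words[1:]]
```

Because XOR is its own inverse, the same routine would also scramble a section if the key word is prepended to the plain payload first.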


Each audio subsegment, metadata section and the metering section is CRCed using the AV_CRC_16_ANSI in libavutil. (TODO: describe 20-bit mode because it's slightly different)
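
For readers without libavutil at hand, a bitwise equivalent can be sketched as follows. It assumes AV_CRC_16_ANSI means the polynomial 0x8005 processed MSB-first, and that the initial value is zero; how the AES3 words are serialised to bytes and where the stored CRC sits within a section are not covered above, so the function simply works on a byte string.

```python
def crc16_ansi(data: bytes, crc: int = 0) -> int:
    """CRC-16 with polynomial 0x8005, MSB-first, no reflection, no final XOR."""
    for byte in data:
        crc ^= byte << 8              # feed the next byte into the top of the register
        for _ in range(8):
            if crc & 0x8000:          # top bit set: shift and apply the polynomial
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

With these parameters, running the function over the data followed by its own CRC (most significant byte first) yields zero, which is a convenient way to verify a section in one pass.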


Dolby E contains "Professional Metadata", which includes SMPTE timecodes and the configuration for the downstream encoder, and "Consumer Metadata", which is passed on to the viewer in AC-3 and Dolby Pulse bitstreams.

Size (bits) | Explanation | Value
16/20/24 | Sync word | See above
0 | if(has_bitstream_key){ |
16/20/24 |   Bitstream Key |
0 | } |
4 | Metadata revision id | Only seen 0 in the wild
10 | Size of first metadata section | In AES3 words
6 | Program configuration | Lookup table gives you number of channels and programs.
4 | Framerate byte pt1 (does something else too) | Seems to be the same as above ("unknown3" in the code)
4 | Framerate byte pt2 (does something else too) | Seems to be the same as above ("unknown2" in the code)
16 | Frame counter | Designed to detect splices
10 | Unknown |
2 | SMPTE Timecode Hours tens |
4 | SMPTE Timecode Hours units | NOTE: Timecode of "45" signifies "No Timecode" - Other sections are zeroed out. Max value for hours is 23.
9 | Unknown |
3 | SMPTE Timecode Minutes tens |
4 | SMPTE Timecode Minutes units | Max value for minutes is 59.
9 | Unknown |
3 | SMPTE Timecode Seconds tens |
4 | SMPTE Timecode Seconds units | Max value for seconds is 59.
9 | Unknown |
1 | Drop Frame Flag |
2 | SMPTE Timecode Frames tens |
4 | SMPTE Timecode Frames units |
8 | Unknown |
0 | for(int i=0; i < num_channels; i++){ |
10 |   Size in words of channel i |
0 | } |
0 | if(Framerate_byte_pt1 <= 5){ |
8 |   Size of metadata section 2 | In AES3 words
0 | } |
8 | Size of meter section | In AES3 words
0 | for(int i=0; i < num_programs; i++){ |
8 |   Description character | Part of program info word (why is this needed for decoding with a LUT?)
2 |   Zero (seemingly always) | Part of program info word (why is this needed for decoding with a LUT?)
0 | } |
0 | for(int i=0; i < num_channels; i++){ |
10 |   Gain word (first audio subsegment) |
10 |   Gain word (second audio subsegment) |
0 | } |
4 | "unknown4" in code | Seems to do a lot of things
0 | if(unknown4 & 0x3){ | Implies existence of more metadata
0 |   if(metadata_segment == 0 && (unknown4 == 1 || unknown4 == 2)){ |
12 |     Unknown |
0 |     for(int i=0; i < num_programs; i++){ |
5 |       Data Rate | Need to find a sample with this set
3 |       Bitstream Mode |
3 |       Coding Mode | What about 7.1 mode? (not enough bits to signal)
2 |       Centre Mix |
2 |       Surround Mix |
2 |       Surround Mode |
1 |       LFE Enable |
5 |       Dialogue Normalisation |
1 |       Unknown | Seems to always be zero
8 |       Unknown | TODO
1 |       Production information exists |
5 |       Mix Level |
2 |       Room Type |
1 |       Copyright |
1 |       Original |
0 |     } |
0 |   }
0 |   else if(metadata_segment == 1){ |
0 |   } |
0 | } |
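
To make the fixed part of the layout concrete, here is a rough Python sketch that reads the fields from the metadata revision id up to the end of the timecode. It treats the rows above as one contiguous MSB-first bit sequence starting right after the sync word and optional key, with any XOR ciphering already removed; that packing is an assumption, and `BitReader` and `parse_header_start` are illustrative names rather than anything from an existing decoder.

```python
class BitReader:
    """Minimal MSB-first bit reader over a byte string."""

    def __init__(self, data: bytes):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, nbits: int) -> int:
        if nbits > self.remaining:
            raise ValueError("read past end of data")
        self.remaining -= nbits
        return (self.value >> self.remaining) & ((1 << nbits) - 1)


def parse_header_start(payload: bytes) -> dict:
    r = BitReader(payload)
    fields = {
        "metadata_revision_id": r.read(4),
        "metadata_size_words": r.read(10),
        "program_config": r.read(6),
        "framerate_pt1": r.read(4),
        "framerate_pt2": r.read(4),
        "frame_counter": r.read(16),
    }
    r.read(10)                                # unknown
    hours = r.read(2) * 10 + r.read(4)        # tens, then units
    r.read(9)                                 # unknown
    minutes = r.read(3) * 10 + r.read(4)
    r.read(9)                                 # unknown
    seconds = r.read(3) * 10 + r.read(4)
    r.read(9)                                 # unknown
    fields["drop_frame"] = bool(r.read(1))
    frames = r.read(2) * 10 + r.read(4)
    r.read(8)                                 # unknown
    # All-ones hours fields give 3*10 + 15 = 45, the "No Timecode" marker.
    if hours == 45:
        fields["timecode"] = None
    else:
        fields["timecode"] = (hours, minutes, seconds, frames)
    return fields
```

The per-channel sizes, section sizes and gain words that follow depend on the program configuration lookup table, which is not reproduced here, so the sketch stops before them.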

Audio Segments

Dolby e Audio segments.png

There's always an even number of channels so the split is trivial.

Exponents and bit allocation

There seem to be only 2 exponent strategies. Bit allocation is similar to AC-3.

Mantissa Quantisation

Dolby E uses gain adaptive quantisation for its mantissas. (TODO: describe further)


Dolby E uses a slightly modified MDCT:

Dolby E Mdct.png

Sample Rate Conversion

The internal sample rate of Dolby E depends on the associated video frame rate and ranges from 42.965 kHz to 53.760 kHz. After decoding, the audio is sample-rate converted to 48 kHz.
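
The two limits quoted above are consistent with a fixed number of coded samples per video frame. The sketch below assumes 1792 samples per frame, a figure chosen because it reproduces both limits at 23.976 and 30 frames per second; it is not stated in the text above. It then derives the ratio a resampler would need to reach 48 kHz.

```python
from fractions import Fraction

SAMPLES_PER_FRAME = 1792  # assumption: value that matches both quoted limits


def internal_rate(frame_rate: Fraction) -> Fraction:
    """Internal sample rate in Hz for a given video frame rate."""
    return SAMPLES_PER_FRAME * frame_rate


def resample_ratio(frame_rate: Fraction, output_rate: int = 48000) -> Fraction:
    """Output samples produced per internal sample when converting to output_rate."""
    return Fraction(output_rate) / internal_rate(frame_rate)


if __name__ == "__main__":
    for name, fps in [("23.976", Fraction(24000, 1001)), ("30", Fraction(30))]:
        print(name, float(internal_rate(fps)), float(resample_ratio(fps)))
```

Keeping the rates as exact fractions avoids accumulating rounding error when the ratio is later used to schedule output samples over many frames.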

Metering Information

There is also metering information available at the end of the frame. (TODO: describe further)


A free trial of a software Dolby E encoder and decoder is available; it supports encoding of 16-bit and 20-bit modes and decoding of 16-bit, 20-bit and possibly 24-bit. However, it requires Pace iLok to run, which features kernel-level anti-debugging.


The application library uses the Dolby SIP interface to decode. More information about the Dolby SIP interface can be found here.

Dolby Subroutines

Each function name is followed by a function ID number. The names take the form DD_XXXXD_YYYY for decoding and DD_XXXXE_YYYY for encoding.

DD_SYS_INIT – 0x00
System Initialise

meter DD_CRCD_VER – 0x1E
Verify CRC of meter section.

Metadata %d DD_CRCD_VER – 0x1E
Verify CRC of metadata section.

Channel decode %d:%d DD_CRCD_VER – 0x1E
Verify CRC of channel.

DD_DDED_DEC – 0x20
Dolby E Decode. Seemingly needs to be called 8 times (similar to AC-3's 6 times per frame).

Unpack Metadata

Return only the AC-3 compatible metadata?

Unpack Metering data

meter DD_KEYD_EXTR – 0x28
Extract Bitstream Key

Sample rate convert

External links

Discussion on VideoLan forum about E-distribution decoder