MPEG-4 Audio
Latest revision as of 20:59, 13 August 2021
- Company: ISO
- Samples: http://samples.mplayerhq.hu/MPEG-4/ and http://ghostarchive.org/samples/MPEG-4
- Samples: http://samples.mplayerhq.hu/A-codecs/AAC/ and http://ghostarchive.org/samples/AAC
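- Samples: sample repo at standards.iso.org: http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_IEC_14496-26_2010_Bitstreams/
- Sample Docs: sample docs: http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_IEC_14496-26_2010_Bitstreams/DVD1/mpeg4audio-conformance/doc/fileNameConventions.html
- Sample Docs: more sample docs: https://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_IEC_14496-26_2010_Bitstreams/DVD1/mpeg4audio-conformance/doc/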
Specification links:
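- MPEG-4 Audio: ISO/IEC 14496-3:2009: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=53943
- Conformance: ISO/IEC 14496-26:2010: http://standards.iso.org/ittf/PubliclyAvailableStandards/c053750_ISO_IEC_14496-26_2010.zip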
MPEG-4 Audio
MPEG-4 includes a system for handling a diverse group of audio formats in a uniform manner. Each format is assigned a unique Audio Object Type (AOT) to represent it. The common global header format shared by all AOTs is called the Audio Specific Config.
Subparts
- Subpart 0: Overview
- Subpart 1: Main (Systems Interaction)
- Subpart 2: Speech coding - HVXC
- Subpart 3: Speech coding - CELP
- Subpart 4: General Audio coding (GA) - AAC, TwinVQ, BSAC
- Subpart 5: Structured Audio (SA)
- Subpart 6: Text To Speech Interface (TTSI)
- Subpart 7: Parametric Audio Coding - HILN
- Subpart 8: Parametric coding for high quality audio - SSC (and Parametric Stereo)
- Subpart 9: MPEG-1/2 Audio in MPEG-4
- Subpart 10: Lossless coding of oversampled audio - DST
- Subpart 11: Audio lossless coding - ALS
- Subpart 12: Scalable lossless coding - SLS
Audio Specific Config
The Audio Specific Config is the global header for MPEG-4 Audio:
    5 bits: object type
    if (object type == 31)
        6 bits + 32: object type
    4 bits: frequency index
    if (frequency index == 15)
        24 bits: frequency
    4 bits: channel configuration
    var bits: AOT Specific Config
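The header layout above can be read with a small bit reader. The following is a minimal sketch, not a reference implementation: the `BitReader` class, the `parse_audio_specific_config` function, and the keys of the returned dictionary are hypothetical names chosen for illustration. It assumes the fields are packed most significant bit first and stops before the AOT Specific Config, whose layout depends on the object type.

```python
class BitReader:
    """Hypothetical helper: reads MSB-first bit fields from a bytes object."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current read position, in bits

    def read(self, nbits: int) -> int:
        """Return the next nbits as an unsigned integer.

        Raises IndexError if the data runs out.
        """
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_audio_specific_config(data: bytes) -> dict:
    """Parse the fixed leading fields of an Audio Specific Config."""
    reader = BitReader(data)

    # 5 bits: object type, with 31 acting as an escape value.
    object_type = reader.read(5)
    if object_type == 31:
        object_type = 32 + reader.read(6)

    # 4 bits: frequency index; 15 means a 24-bit frequency follows.
    frequency_index = reader.read(4)
    explicit_frequency = reader.read(24) if frequency_index == 15 else None

    # 4 bits: channel configuration.
    channel_configuration = reader.read(4)

    return {
        "object_type": object_type,
        "frequency_index": frequency_index,
        "explicit_frequency": explicit_frequency,
        "channel_configuration": channel_configuration,
        # The AOT Specific Config starts at this bit offset.
        "bits_consumed": reader.pos,
    }
```

For example, the two bytes `0x12 0x10` decode to object type 2 (AAC LC), frequency index 4 (44100 Hz), and channel configuration 2.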
Audio Object Types
MPEG-4 Audio Object Types:
- 0: Null
- 1: AAC Main
- 2: AAC LC (Low Complexity)
- 3: AAC SSR (Scalable Sample Rate)
- 4: AAC LTP (Long Term Prediction)
- 5: SBR (Spectral Band Replication)
- 6: AAC Scalable
- 7: TwinVQ
- 8: CELP (Code Excited Linear Prediction)
- 9: HVXC (Harmonic Vector eXcitation Coding)
- 10: Reserved
- 11: Reserved
- 12: TTSI (Text-To-Speech Interface)
- 13: Main Synthesis
- 14: Wavetable Synthesis
- 15: General MIDI
- 16: Algorithmic Synthesis and Audio Effects
- 17: ER (Error Resilient) AAC LC
- 18: Reserved
- 19: ER AAC LTP
- 20: ER AAC Scalable
- 21: ER TwinVQ
- 22: ER BSAC (Bit-Sliced Arithmetic Coding)
- 23: ER AAC LD (Low Delay)
- 24: ER CELP
- 25: ER HVXC
- 26: ER HILN (Harmonic and Individual Lines plus Noise)
- 27: ER Parametric
- 28: SSC (SinuSoidal Coding)
- 29: PS (Parametric Stereo)
- 30: MPEG Surround
- 31: (Escape value)
- 32: Layer-1
- 33: Layer-2
- 34: Layer-3
- 35: DST (Direct Stream Transfer)
- 36: ALS (Audio Lossless)
- 37: SLS (Scalable LosslesS)
- 38: SLS non-core
- 39: ER AAC ELD (Enhanced Low Delay)
- 40: SMR (Symbolic Music Representation) Simple
- 41: SMR Main
- 42: USAC (Unified Speech and Audio Coding) (no SBR)
- 43: SAOC (Spatial Audio Object Coding)
- 44: LD MPEG Surround
- 45: USAC
Sampling Frequencies
There are 13 supported frequencies:
- 0: 96000 Hz
- 1: 88200 Hz
- 2: 64000 Hz
- 3: 48000 Hz
- 4: 44100 Hz
- 5: 32000 Hz
- 6: 24000 Hz
- 7: 22050 Hz
- 8: 16000 Hz
- 9: 12000 Hz
- 10: 11025 Hz
- 11: 8000 Hz
- 12: 7350 Hz
- 13: Reserved
- 14: Reserved
- 15: frequency is written explicitly
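The index table above maps directly onto a lookup. This is a small sketch under stated assumptions: `SAMPLING_FREQUENCIES` and `resolve_sampling_frequency` are hypothetical names, and the `explicit_frequency` argument stands for the 24-bit value that follows the index in the header when the index is 15.

```python
from typing import Optional

# Indices 0..12 from the table above; 13 and 14 are reserved.
SAMPLING_FREQUENCIES = (
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000, 7350,
)


def resolve_sampling_frequency(
    index: int, explicit_frequency: Optional[int] = None
) -> int:
    """Return the sampling frequency in Hz for a 4-bit frequency index."""
    if 0 <= index < len(SAMPLING_FREQUENCIES):
        return SAMPLING_FREQUENCIES[index]
    if index == 15:
        # The frequency is written explicitly in the header.
        if explicit_frequency is None:
            raise ValueError("index 15 requires an explicit frequency")
        return explicit_frequency
    raise ValueError(f"frequency index {index} is reserved or out of range")
```

Raising on the reserved indices, rather than returning a default, keeps a malformed header from silently producing a plausible-looking rate.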
Channel Configurations
These are the channel configurations:
- 0: Defined in AOT Specific Config
- 1: 1 channel: front-center
- 2: 2 channels: front-left, front-right
- 3: 3 channels: front-center, front-left, front-right
- 4: 4 channels: front-center, front-left, front-right, back-center
- 5: 5 channels: front-center, front-left, front-right, back-left, back-right
- 6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
- 7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
- 8-15: Reserved
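The channel configuration list can likewise be held as a table keyed by the 4-bit value. The sketch below is illustrative only: `CHANNEL_LAYOUTS` and `channel_layout` are hypothetical names, and the speaker labels are copied from the list above. Note that configuration 7 carries 8 channels, so the value is not always the channel count.

```python
from typing import Optional, Tuple

# Speaker order for each channel configuration, as listed above.
CHANNEL_LAYOUTS = {
    1: ("front-center",),
    2: ("front-left", "front-right"),
    3: ("front-center", "front-left", "front-right"),
    4: ("front-center", "front-left", "front-right", "back-center"),
    5: ("front-center", "front-left", "front-right",
        "back-left", "back-right"),
    6: ("front-center", "front-left", "front-right",
        "back-left", "back-right", "LFE-channel"),
    7: ("front-center", "front-left", "front-right",
        "side-left", "side-right", "back-left", "back-right",
        "LFE-channel"),
}


def channel_layout(configuration: int) -> Optional[Tuple[str, ...]]:
    """Return the speaker layout for a 4-bit channel configuration.

    Returns None for 0, where the layout is defined in the
    AOT Specific Config instead. Raises ValueError for 8..15 (reserved)
    and for values outside the 4-bit range.
    """
    if configuration == 0:
        return None
    if configuration in CHANNEL_LAYOUTS:
        return CHANNEL_LAYOUTS[configuration]
    raise ValueError(
        f"channel configuration {configuration} is reserved or out of range"
    )
```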