Difference between revisions of "PCM"
Revision as of 11:40, 4 February 2006
PCM stands for pulse code modulation. In the context of audio coding, PCM encodes an audio waveform in the time domain as a series of amplitudes.
- 1 PCM Parameters
- 2 PCM Types
- 3 Platform-Specific PCM Identifiers And Characteristics
- 4 Identifying PCM Data
PCM audio is coded using a combination of various parameters.
This parameter specifies the amount of data used to represent each discrete amplitude sample. The most common values are 8 bits (1 byte), which gives a range of 256 amplitude steps, or 16 bits (2 bytes), which gives a range of 65536 amplitude steps. Other sizes, such as 12, 20, and 24 bits, are occasionally seen. Some king-sized formats even opt for 32 and 64 bits per sample.
When more than one byte is used to represent a PCM sample, the byte order (big endian vs. little endian) must be known. Due to the widespread use of little-endian Intel CPUs, little-endian PCM tends to be the most common byte orientation.
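As a small illustration of why byte order matters, the sketch below decodes the same four bytes as two signed 16-bit samples in both orientations, using Python's standard struct module. The byte values are made up for the example.

```python
import struct

# Four raw bytes holding two 16-bit samples (arbitrary example values).
raw = bytes([0x34, 0x12, 0xFE, 0xFF])

# "<" selects little endian, ">" selects big endian;
# "h" is a signed 16-bit integer.
little = struct.unpack("<2h", raw)
big = struct.unpack(">2h", raw)

print(little)  # (4660, -2)
print(big)     # (13330, -257)
```

The same bytes produce entirely different amplitudes, which is why data decoded with the wrong byte order usually sounds like loud noise.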
It is not enough to know that a PCM sample is, for example, 8 bits wide. Whether the sample is signed or unsigned must also be known in order to interpret its range. If the sample is unsigned, the sample range is 0..255 with a centerpoint of 128. If the sample is signed, the sample range is -128..127 with a centerpoint of 0. If a PCM type is signed, the sign encoding is almost always 2's complement. In very rare cases, signed PCM audio is represented as a series of sign/magnitude coded numbers.
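The ranges described above can be computed directly from the sample width. The following is a minimal sketch; the function names sample_range and unsigned8_to_signed8 are hypothetical helpers chosen for this example, and only 2's complement signed data is considered.

```python
def sample_range(bits, signed):
    """Return (minimum, maximum, centerpoint) for a linear PCM sample."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1, 0
    return 0, (1 << bits) - 1, 1 << (bits - 1)


def unsigned8_to_signed8(data):
    """Re-center unsigned 8-bit samples (0..255) to signed values (-128..127)."""
    return [b - 128 for b in data]


print(sample_range(8, signed=False))   # (0, 255, 128)
print(sample_range(8, signed=True))    # (-128, 127, 0)
print(sample_range(16, signed=True))   # (-32768, 32767, 0)
print(unsigned8_to_signed8(bytes([0, 128, 255])))  # [-128, 0, 127]
```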
Channels And Interleaving
If the PCM type is monaural, each sample will belong to that one channel. If there is more than one channel, the channels will almost always be interleaved: Left sample, right sample, left, right, etc., in the case of stereo interleaved data. In some rare cases, usually when optimized for special playback hardware, chunks of audio destined for different channels will not be interleaved.
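A rough sketch of splitting interleaved data into per-channel lists, and joining it back, might look like the following. It assumes the samples have already been decoded into a flat list of integers; deinterleave and interleave are hypothetical names for this example.

```python
def deinterleave(samples, channels):
    """Split a flat interleaved sample list into one list per channel."""
    return [samples[ch::channels] for ch in range(channels)]


def interleave(channel_lists):
    """Merge per-channel lists back into a single interleaved list."""
    return [s for frame in zip(*channel_lists) for s in frame]


# Stereo data: left, right, left, right, ...
stereo = [10, -10, 20, -20, 30, -30]
left, right = deinterleave(stereo, 2)

print(left)   # [10, 20, 30]
print(right)  # [-10, -20, -30]
print(interleave([left, right]) == stereo)  # True
```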
Frequency And Sample Rate
This parameter measures how many samples/channel are played each second. Frequency is measured in samples/second (Hz). Common frequency values include 8000, 11025, 16000, 22050, 32000, 44100, and 48000 Hz.
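Combined with the sample size and channel count, the sample rate determines how much data one second of audio occupies. The sketch below works under the assumption that samples occupy a whole number of bytes (8, 16, 24, ... bits); the function names are hypothetical.

```python
def bytes_per_second(sample_rate, channels, bits_per_sample):
    """Data rate of uncompressed PCM, assuming whole-byte sample sizes."""
    return sample_rate * channels * (bits_per_sample // 8)


def duration_seconds(num_bytes, sample_rate, channels, bits_per_sample):
    """Playback time for a block of raw PCM data."""
    return num_bytes / bytes_per_second(sample_rate, channels, bits_per_sample)


print(bytes_per_second(8000, 1, 8))     # 8000
print(bytes_per_second(44100, 2, 16))   # 176400
print(duration_seconds(352800, 44100, 2, 16))  # 2.0
```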
Integer And Floating Point
Most PCM formats encode samples using integers. However, some applications which demand higher precision will store and process PCM samples using floating point numbers.
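Converting between the two representations is a matter of scaling. The sketch below maps signed 16-bit integers to floats in roughly the range -1.0..1.0 and back. The scale factors (32768 in one direction, 32767 with clamping in the other) are an assumption made for this example; applications differ on the exact convention.

```python
def int16_to_float(sample):
    """Map a signed 16-bit sample to a float in the range [-1.0, 1.0)."""
    return sample / 32768.0


def float_to_int16(value):
    """Map a float back to a signed 16-bit sample, clamping out-of-range input."""
    scaled = int(round(value * 32767.0))
    return max(-32768, min(32767, scaled))


print(int16_to_float(-32768))  # -1.0
print(int16_to_float(16384))   # 0.5
print(float_to_int16(1.0))     # 32767
print(float_to_int16(2.0))     # 32767 (clamped)
```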
Rather than representing sample amplitudes on a linear scale as linear PCM coding does, logarithmic PCM coding plots the amplitudes on a logarithmic scale. Log PCM is more often used in telephony and communications applications than in entertainment multimedia applications.
There are two major variants of log PCM: mu-law (u-law) and A-law. Mu-law coding uses the format number 0x07 in Microsoft multimedia files (WAV/AVI/ASF) and the fourcc 'ulaw' in Apple QuickTime files. A-law coding uses the format number 0x06 in Microsoft multimedia files and the fourcc 'alaw' in Apple QuickTime files.
Every byte of a log PCM data chunk maps to a signed 16-bit linear PCM sample. [TODO: Add either the conversion tables or conversion formulas]
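One commonly used way to expand a log PCM byte follows the bit-manipulation approach of widely circulated G.711 decoders. The sketch below is offered as an illustration under that assumption rather than as the definitive conversion for every file format; ulaw_to_linear and alaw_to_linear are hypothetical names.

```python
def ulaw_to_linear(u):
    """Expand one mu-law byte to a signed 16-bit linear sample."""
    u = ~u & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def alaw_to_linear(a):
    """Expand one A-law byte to a signed 16-bit linear sample."""
    a ^= 0x55                     # A-law bytes have alternate bits inverted
    sign = a & 0x80
    exponent = (a >> 4) & 0x07
    mantissa = a & 0x0F
    if exponent == 0:
        sample = (mantissa << 4) + 8
    else:
        sample = ((mantissa << 4) + 0x108) << (exponent - 1)
    return sample if sign else -sample   # in A-law a set sign bit means positive


print(ulaw_to_linear(0xFF))  # 0
print(ulaw_to_linear(0x00))  # -32124
print(alaw_to_linear(0xD5))  # 8
print(alaw_to_linear(0x55))  # -8
```

Since there are only 256 possible input bytes, a decoder can evaluate either function once for every byte value and keep the results in a lookup table.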
Platform-Specific PCM Identifiers And Characteristics
This section describes how different computing platforms store PCM audio data and any format identifiers they use.
The first widely available PC audio card that could play back PCM audio was the Creative Labs Sound Blaster. This drove the audio format for a lot of early audio-capable DOS applications and games. The original Sound Blaster could only play mono, unsigned 8-bit PCM data. Later Sound Blaster cards were capable of playing back 16-bit audio data. However, while these cards still played unsigned 8-bit PCM data, 16-bit data needed to be signed.
Likely owing to the DOS/Intel little endian architecture, 16-bit PCM for the Sound Blaster also needs to be little endian.
Microsoft WAV/AVI/ASF Identifiers
Microsoft multimedia file formats such as WAV, AVI, and ASF all share the WAVEFORMATEX data structure. The structure defines, among other properties, a 16-bit little endian audio identifier. The following audio identifiers correspond to various PCM formats:
- 0x0001 denotes linear PCM
- 0x0006 denotes A-law logarithmic PCM
- 0x0007 denotes mu-law logarithmic PCM
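As a sketch of how the identifier is read in practice, the following decodes the first 16 bytes of a WAVEFORMATEX structure (format tag, channel count, sample rate, average byte rate, block alignment, bits per sample), all little endian. The function name parse_wave_format and the dictionary keys are hypothetical, and the example bytes are constructed in the script rather than read from a real file.

```python
import struct

# Names for the PCM-related format tags listed above.
FORMAT_NAMES = {0x0001: "linear PCM", 0x0006: "A-law", 0x0007: "mu-law"}


def parse_wave_format(fmt):
    """Decode the fixed 16-byte prefix of a WAVEFORMATEX structure."""
    (format_tag, channels, sample_rate,
     avg_bytes_per_sec, block_align, bits_per_sample) = struct.unpack(
        "<HHIIHH", fmt[:16])
    return {
        "format": FORMAT_NAMES.get(format_tag, "unknown (0x%04X)" % format_tag),
        "channels": channels,
        "sample_rate": sample_rate,
        "bits_per_sample": bits_per_sample,
    }


# Example header: linear PCM, stereo, 44100 Hz, 16 bits per sample.
example = struct.pack("<HHIIHH", 0x0001, 2, 44100, 176400, 4, 16)
info = parse_wave_format(example)
print(info["format"], info["channels"], info["sample_rate"])
# linear PCM 2 44100
```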
Early Apple Macintosh audio hardware had some unique properties and the upshot was that 2 common sample rates were 11127 Hz and 22254 Hz. These sample rates are commonly seen in early Apple QuickTime files.
This page contains more details on this property: http://developer.apple.com/qa/qtpc/qtpc01.html.
Apple QuickTime Identifiers
- 'raw ' (the trailing space character, ASCII 0x20, is needed to round out the FOURCC) denotes unsigned, linear PCM. 16-bit data is stored in little endian format.
- 'twos' denotes signed (i.e. twos-complement) linear PCM. 16-bit data is stored in big endian format.
- 'sowt' ('twos' spelled backwards) also denotes signed linear PCM. However, 16-bit data is stored in little endian format.
- 'in24' denotes 24-bit linear PCM. [BYTE ORDER?]
- 'in32' denotes 32-bit linear PCM. [BYTE ORDER?]
- 'fl32' denotes 32-bit floating point PCM. [Presumably IEEE 32-bit; byte order?]
- 'fl64' denotes 64-bit floating point PCM. [Presumably IEEE 64-bit; byte order?]
- 'alaw' denotes A-law logarithmic PCM.
- 'ulaw' denotes mu-law logarithmic PCM.
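The identifiers above whose 16-bit layout is fully stated ('raw ', 'twos', and 'sowt') can be turned into a small lookup table for decoding. This is only a sketch based on the descriptions in this list: the table QT_16BIT_FORMATS and the function decode_16bit are hypothetical, and the identifiers with open byte-order questions are deliberately left out.

```python
import struct

# struct format codes for 16-bit samples, per the list above:
# "<" little endian, ">" big endian, "H" unsigned, "h" signed.
QT_16BIT_FORMATS = {
    b"raw ": "<H",   # unsigned, little endian
    b"twos": ">h",   # signed, big endian
    b"sowt": "<h",   # signed, little endian
}


def decode_16bit(fourcc, data):
    """Decode raw bytes into a list of 16-bit samples for a known FOURCC."""
    code = QT_16BIT_FORMATS[fourcc]
    count = len(data) // 2
    return list(struct.unpack(code[0] + str(count) + code[1], data[:count * 2]))


payload = bytes([0x12, 0x34, 0xFF, 0xFE])
print(decode_16bit(b"twos", payload))  # [4660, -2]
print(decode_16bit(b"sowt", payload))  # [13330, -257]
```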
Red Book CD Audio
The "Red Book" defines the format of a standard audio compact disc (CD). The audio data on a standard CD consists of 16-bit linear PCM samples stored in little endian format, replayed at 44100 Hz (hence the standard term "CD-quality audio"), with left-right stereo interleaving.