WavPack

From MultimediaWiki
* Website: http://www.wavpack.com/
* Samples: http://samples.mplayerhq.hu/A-codecs/lossless/ (luckynight.wv)
* FOURCC (unofficial): WVPK



Revision as of 16:31, 7 October 2006

WavPack is an open source lossless audio coding algorithm with floating point data support and optional lossy audio compression.

WavPack v.4

File Format

General details of the WavPack format can be found in the file 'format.txt' in the WavPack source archive. A WavPack file consists of blocks, each beginning with 'wvpk'. Every block carries all the information about the sound data (sampling rate, channels, bits per sample, etc.) together with so-called metadata. Metadata may contain the coefficients used for restoring samples, a correction bitstream, and the actual compressed samples.

Block structure

Each block consists of a header followed by metadata, which carries the compressed data.

Block header (all data is stored in little-endian words)

 4 bytes - 'wvpk'
 32 bits - block size
 16 bits - version
 8 bits  - track number
 8 bits  - track sub index
 32 bits - total samples in file (may be 0xFFFFFFFF)
 32 bits - offset in samples for current block (i.e. how many samples should have been decoded by now)
 32 bits - samples in this block
 32 bits - flags
 32 bits - CRC
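
The header layout above adds up to 32 bytes. As a minimal sketch (not taken from the WavPack sources), it could be read like this in C. The names WvBlockHeader, wv_parse_header, rd_le16 and rd_le32 are hypothetical and chosen only for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define WV_HEADER_SIZE 32  /* 4 + 4 + 2 + 1 + 1 + 4 + 4 + 4 + 4 + 4 bytes */

/* Hypothetical in-memory form of the block header listed above. */
typedef struct {
    uint32_t block_size;
    uint16_t version;
    uint8_t  track_no;
    uint8_t  track_sub;
    uint32_t total_samples;  /* may be 0xFFFFFFFF */
    uint32_t block_index;    /* offset in samples for this block */
    uint32_t block_samples;  /* samples in this block */
    uint32_t flags;
    uint32_t crc;
} WvBlockHeader;

/* Little-endian readers, so the code does not depend on host byte order. */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or lacks the 'wvpk' tag. */
static int wv_parse_header(const uint8_t *buf, size_t len, WvBlockHeader *h)
{
    if (len < WV_HEADER_SIZE || memcmp(buf, "wvpk", 4) != 0)
        return -1;
    h->block_size    = rd_le32(buf + 4);
    h->version       = rd_le16(buf + 8);
    h->track_no      = buf[10];
    h->track_sub     = buf[11];
    h->total_samples = rd_le32(buf + 12);
    h->block_index   = rd_le32(buf + 16);
    h->block_samples = rd_le32(buf + 20);
    h->flags         = rd_le32(buf + 24);
    h->crc           = rd_le32(buf + 28);
    return 0;
}
```

Reading the fields byte by byte avoids relying on struct packing or on the host being little-endian.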

Flags meaning:

 bits  0- 1 - bytes per sample minus one
 bit      2 - sound is monaural
 bit      3 - hybrid profile (lossy compression)
 bit      4 - joint stereo coding scheme
 bit      5 - cross-decorrelation scheme is used
 bit      6 - shaping for hybrid profile is present
 bit      7 - floating point data present
 bit      8 - int32 mode
 bits  9-10 - hybrid profile flags
 bits 11-12 - multi-channel start and end blocks
 bits 13-17 - shift parameter?
 bits 18-22 - scaling parameter?
 bits 23-26 - sampling rate index
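
The bit fields listed above can be pulled out of the 32-bit flags word with shifts and masks. The following is only a sketch: the struct and function names are hypothetical, and the "shift" and "scale" fields are extracted as raw numbers because their meaning is marked as uncertain above.

```c
#include <stdint.h>

/* Hypothetical unpacked view of the flags word. */
typedef struct {
    unsigned bytes_per_sample;   /* bits 0-1, stored minus one */
    int mono;                    /* bit 2 */
    int hybrid;                  /* bit 3 */
    int joint_stereo;            /* bit 4 */
    int cross_decorr;            /* bit 5 */
    int hybrid_shaping;          /* bit 6 */
    int float_data;              /* bit 7 */
    int int32_mode;              /* bit 8 */
    unsigned hybrid_flags;       /* bits 9-10 */
    unsigned multichannel_flags; /* bits 11-12 */
    unsigned shift;              /* bits 13-17 */
    unsigned scale;              /* bits 18-22 */
    unsigned srate_index;        /* bits 23-26 */
} WvFlags;

static WvFlags wv_unpack_flags(uint32_t f)
{
    WvFlags r;
    r.bytes_per_sample   = (f & 3) + 1;
    r.mono               = (f >> 2) & 1;
    r.hybrid             = (f >> 3) & 1;
    r.joint_stereo       = (f >> 4) & 1;
    r.cross_decorr       = (f >> 5) & 1;
    r.hybrid_shaping     = (f >> 6) & 1;
    r.float_data         = (f >> 7) & 1;
    r.int32_mode         = (f >> 8) & 1;
    r.hybrid_flags       = (f >> 9) & 3;
    r.multichannel_flags = (f >> 11) & 3;
    r.shift              = (f >> 13) & 0x1F;
    r.scale              = (f >> 18) & 0x1F;
    r.srate_index        = (f >> 23) & 0x0F;
    return r;
}
```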

Metadata

Each metadata block consists of three parts: ID, length and data. Every metadata block has an even length. The data size is stored as a count of 16-bit words, in either one or three bytes depending on the "data size is large" flag in the ID.

Flags for ID:

 0x20 - decoder may ignore data contained here
 0x40 - data size is odd
 0x80 - data size is large

IDs:

 * 0x01 - encoder info
 * 0x02 - decorrelation terms
 * 0x03 - decorrelation weights
 * 0x04 - decorrelation samples
 * 0x05 - entropy info
 * 0x0A - packed samples
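
As a rough sketch of how the ID byte and length could be parsed, the following C function splits the flags from the ID and converts the word count into bytes. It rests on several assumptions that are not spelled out above: the base ID is taken to be the low 5 bits, the three-byte size is taken to be little-endian like the rest of the format, and the "odd" flag is taken to mean that the last byte of the padded data is unused. All names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical parsed form of a metadata block's ID and length. */
typedef struct {
    uint8_t id;           /* ID with the flag bits removed (assumed: low 5 bits) */
    int     optional;     /* 0x20: decoder may ignore this data */
    size_t  header_bytes; /* 2 or 4: ID byte plus 1- or 3-byte size */
    size_t  padded_bytes; /* stored size, always even */
    size_t  data_bytes;   /* payload size; one less than padded if 0x40 is set (assumed) */
} WvMetaHeader;

/* Returns 0 on success, -1 if the buffer is too short or the size is inconsistent. */
static int wv_parse_meta_header(const uint8_t *buf, size_t len, WvMetaHeader *m)
{
    size_t words;

    if (len < 2)
        return -1;
    m->id       = buf[0] & 0x1F;
    m->optional = (buf[0] & 0x20) != 0;
    if (buf[0] & 0x80) {             /* large: size takes three bytes */
        if (len < 4)
            return -1;
        words = (size_t)buf[1] | ((size_t)buf[2] << 8) | ((size_t)buf[3] << 16);
        m->header_bytes = 4;
    } else {                         /* small: size takes one byte */
        words = buf[1];
        m->header_bytes = 2;
    }
    m->padded_bytes = words * 2;     /* size is counted in 16-bit words */
    m->data_bytes   = m->padded_bytes;
    if (buf[0] & 0x40) {             /* odd size: drop the padding byte */
        if (words == 0)
            return -1;
        m->data_bytes--;
    }
    return 0;
}
```

A reader would then skip header_bytes + padded_bytes to reach the next metadata block, and dispatch on id (for example 0x0A for packed samples).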

Decorrelation terms

Each decorrelation term is stored in one byte: the lower 5 bits indicate the predictor type and the high 3 bits contain the delta value.

Possible predictor values:

 0-5   - predictors for stereo; only predictors 2-4 are implemented
 6-12  - predictor uses 1-7 samples for prediction
 13-16 - reserved
 17-18 - predictor uses two samples for prediction
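
Splitting a term byte into its two fields is a one-line mask and shift each. The sketch below also maps a predictor value to the number of previous samples it uses, following the table above. The names are hypothetical, and the stereo predictors (0-5) and reserved values return -1 because the table does not give a sample count for them.

```c
#include <stdint.h>

/* Hypothetical unpacked decorrelation term. */
typedef struct {
    unsigned predictor; /* lower 5 bits */
    unsigned delta;     /* high 3 bits */
} WvDecorrTerm;

static WvDecorrTerm wv_unpack_term(uint8_t b)
{
    WvDecorrTerm t;
    t.predictor = b & 0x1F;
    t.delta     = (b >> 5) & 0x07;
    return t;
}

/* Number of previous samples used by a predictor, per the table above;
 * -1 where the table gives no count (stereo predictors, reserved values). */
static int wv_term_history(unsigned predictor)
{
    if (predictor >= 6 && predictor <= 12)
        return (int)predictor - 5;   /* 6 -> 1 sample ... 12 -> 7 samples */
    if (predictor == 17 || predictor == 18)
        return 2;
    return -1;
}
```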

Decorrelation weights

Each decorrelation term should have one or two weights, depending on the number of channels. Each weight is packed into one byte and can be restored in this way:

 n = getchar() << 3;
 if(n > 0) n += (n + 64) >> 7;
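
The same restoration step, written as a self-contained C function. This sketch assumes the stored byte is interpreted as a signed 8-bit value (the getchar() in the pseudocode above does not say either way), and it multiplies by 8 instead of shifting so that negative inputs stay well-defined in C. The function name is hypothetical.

```c
#include <stdint.h>

/* Expand a packed weight byte (assumed signed) to its full-range value. */
static int wv_restore_weight(int8_t packed)
{
    int n = packed * 8;      /* same effect as << 3, but defined for negatives */
    if (n > 0)
        n += (n + 64) >> 7;  /* small correction so that 127 maps to 1024 */
    return n;
}
```

With this correction the largest positive byte, 127, restores to 1024, while -128 restores to -1024.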

Decorrelation samples

Each decorrelation term may have up to 16 samples, depending on its value. Each sample is a 32-bit value stored in 16 bits: the lower 8 bits are the mantissa and the high 8 bits are the exponent, biased by 9. That is, if the exponent is less than 9 the mantissa is shifted right, otherwise it is shifted left.

Entropy info

This section contains one or two sets of medians used for decoding samples. Each median is log-packed into 16 bits in the same mantissa/exponent form described under Decorrelation samples.

Samples coding

Samples are stored in the metadata block with ID=0x0A and are packed with modified Golomb codes. The decoding process is specified below, where get_unary() is a function that returns the length of a run of '1' bits (e.g. 111110b = 5, 10b = 1). The codeset is adaptively divided into four sets, and every code has a unary prefix (possibly escaped) that defines the interval of the code, plus a mantissa part as in a Golomb code.

 if(last_zero){
   n = 0;
   last_zero = 0;
 }else{
   n = get_unary();
   if(n == 16){
     n2 = get_unary();
     if(n2 < 2) n += n2;
     else n += (1 << (n2-1)) | getbits(n2-1);
   }
   if(last_one){             /* the +1 depends on the previous value of last_one */
     last_one = n & 1;
     n = (n>>1) + 1;
   }else{
     last_one = n & 1;
     n = n >> 1;
   }
   last_zero = !last_one;
 }
 if(n == 0){
   base = 0;
   add = median[0] - 1;
   decrease median[0];
 } else if(n == 1){
   base = median[0];
   add = median[1] - 1;
   increase median[0];
   decrease median[1];
 } else {
   base = median[0] + median[1] + median[2] * (n - 2);
   add = median[2] - 1;
   increase median[0];
   increase median[1];
   if(n == 2) decrease median[2];
   else increase median[2];
 }
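 k = log2(add);            /* here: number of bits needed to hold add, i.e. floor(log2(add)) + 1 */
 ex = (1 << k) - add - 1;  /* number of short (k - 1 bit) codes */
 t2 = getbits(k - 1);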
 if(t2 >= ex)
   t2 = t2 * 2 - ex + getbit();
 sign = getbit();
 if(sign==0) result = base + t2;
 else result = ~(base + t2);
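
The helpers used in the pseudocode above can be sketched in C as follows. This is an illustration under stated assumptions, not the reference implementation: the BitReader type and all function names are hypothetical, bits are assumed to be read LSB-first within each byte, reads past the end of the buffer return 0, and log2(add) is taken to mean the number of bits needed to hold add. get_tail() covers only the last step (the k, ex, t2 lines), which is a truncated binary code for a value in the range 0..add.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical bit reader; LSB-first bit order is an assumption. */
typedef struct {
    const uint8_t *buf;
    size_t size; /* buffer size in bytes */
    size_t pos;  /* current position in bits */
} BitReader;

static unsigned get_bit(BitReader *br)
{
    unsigned bit = 0;
    if (br->pos < br->size * 8)
        bit = (br->buf[br->pos >> 3] >> (br->pos & 7)) & 1;
    br->pos++;
    return bit;  /* past the end of the buffer this yields 0 */
}

static unsigned get_bits(BitReader *br, unsigned n)
{
    unsigned v = 0, i;
    for (i = 0; i < n; i++)
        v |= get_bit(br) << i;
    return v;
}

/* Length of the run of '1' bits; the terminating '0' bit is consumed. */
static unsigned get_unary(BitReader *br)
{
    unsigned n = 0;
    while (get_bit(br))
        n++;
    return n;
}

/* Number of bits needed to represent v (0 for v == 0). */
static unsigned bit_length(unsigned v)
{
    unsigned k = 0;
    while (v) {
        k++;
        v >>= 1;
    }
    return k;
}

/* Read the mantissa part: a value in 0..add; add is assumed to fit in 31 bits. */
static unsigned get_tail(BitReader *br, unsigned add)
{
    unsigned k, ex, t2;

    if (add == 0)
        return 0;                      /* only one possible value, no bits stored */
    k  = bit_length(add);
    ex = (1u << k) - add - 1;          /* this many values use the short code */
    t2 = get_bits(br, k - 1);
    if (t2 >= ex)
        t2 = t2 * 2 - ex + get_bit(br); /* long code: one extra bit */
    return t2;
}
```

For example, with add = 5 there are six possible values; k is 3, so two values are coded in 2 bits and the remaining four in 3 bits.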