DSP Group Truespeech

From MultimediaWiki

* Format tag: 0x22
* Company: [[DSP Group]]
* Description and binaries: http://www.rarewares.org/rrw/truespeech.php
* Samples: http://samples.mplayerhq.hu/A-codecs/truespeech/
 
[[Category: Audio Codecs]]

Latest revision as of 10:00, 25 October 2017

This codec was developed by DSP Group. It operates on 8000 Hz, 16-bit mono PCM and supports several fixed bitrates; the Windows version supports only 15:1 compression (about 8.5 kbit/s, i.e. roughly 1 KB/s).
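
The compression figures can be cross-checked against the frame layout given under "Frame format" (eight 32-bit words per 240-sample frame). The following is only a small arithmetic sketch; the constant names are illustrative and not taken from any decoder.

```python
# Values taken from this page: sample rate, frame length and frame layout.
SAMPLE_RATE_HZ = 8000
SAMPLES_PER_FRAME = 240
WORDS_PER_FRAME = 8          # the frame format lists eight 32-bit words
BITS_PER_WORD = 32
PCM_BITS_PER_SAMPLE = 16

frame_bits = WORDS_PER_FRAME * BITS_PER_WORD                       # 256 bits (32 bytes)
pcm_bits = SAMPLES_PER_FRAME * PCM_BITS_PER_SAMPLE                 # 3840 bits of PCM per frame
compression_ratio = pcm_bits / frame_bits                          # 15.0
bitrate_bps = frame_bits * SAMPLE_RATE_HZ / SAMPLES_PER_FRAME      # about 8533 bit/s
byte_rate = bitrate_bps / 8                                        # about 1067 bytes/s
```

Under these assumptions the 15:1 ratio corresponds to about 8.5 kbit/s, or a little over 1 KB per second.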

Like almost every speech codec, it employs linear predictive coding (LPC).

Bitstream format

Each frame (240 samples) is divided into 4 subframes of 60 samples each.

The packed frame data holds an 8th-order filter used in synthesis, pulse positions and values for 4 subframes of up to 7 pulses each, and offsets into previously decoded data.

All data is stored in 32-bit little-endian words, and bits are read starting from the LSB.
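
A minimal sketch of a bit reader following the rule above: the bytes are interpreted as 32-bit little-endian words and fields are taken starting at the least significant bit of each word. The class name `LsbBitReader` and its interface are hypothetical and chosen for illustration only; the sketch also assumes that no field crosses a word boundary, which holds for the frame layout on this page.

```python
import struct


class LsbBitReader:
    """Reads bit fields from 32-bit little-endian words, LSB first (illustrative)."""

    def __init__(self, data: bytes):
        if len(data) % 4:
            raise ValueError("data length must be a multiple of 4 bytes")
        # "<" selects little-endian, "I" is an unsigned 32-bit integer.
        self.words = struct.unpack("<%dI" % (len(data) // 4), data)
        self.word_index = 0
        self.bit_pos = 0

    def read(self, nbits: int) -> int:
        """Return the next nbits of the current word as an unsigned integer."""
        if self.bit_pos + nbits > 32:
            raise ValueError("field crosses a 32-bit word boundary")
        word = self.words[self.word_index]
        value = (word >> self.bit_pos) & ((1 << nbits) - 1)
        self.bit_pos += nbits
        if self.bit_pos == 32:
            # Current word is exhausted; continue with the next one.
            self.word_index += 1
            self.bit_pos = 0
        return value


# Bytes 01 00 00 80 form the little-endian word 0x80000001.
reader = LsbBitReader(bytes([0x01, 0x00, 0x00, 0x80]))
lowest_bit = reader.read(1)    # bit 0 of the word
middle_bits = reader.read(30)  # bits 1..30
highest_bit = reader.read(1)   # bit 31
```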

Frame format

 // 1st word
 1 bit - previous filter selection mode
 the remaining 31 bits hold 8 filter coefficients with the following bit allocation: 5, 5, 4, 4, 4, 3, 3, 3
 
 // 2nd word
 4x7 bits - offset2 values 0-3
 4 bits - high 4 bits of offset1[0]
 
 // 3rd word
 2x14 bits - pulse values 0 and 1
 4 bits - low 4 bits of offset1[1]
 
 // 4th word
 2x14 bits - pulse values 2 and 3
 4 bits - high 4 bits of offset1[1]
 
 // 5th word
 4 bits - pulse offset 0
 27 bits - pulse position 0
 1 bit - offset1[0] bit 0
 
 // 6th word
 4 bits - pulse offset 1
 27 bits - pulse position 1
 1 bit - offset1[0] bit 1
 
 // 7th word
 4 bits - pulse offset 2
 27 bits - pulse position 2
 1 bit - offset1[0] bit 2
 
 // 8th word
 4 bits - pulse offset 3
 27 bits - pulse position 3
 1 bit - offset1[0] bit 3
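
The layout above can be turned into an unpacking routine roughly as follows. This is a sketch under assumptions, not a reference implementation: the function name `unpack_frame` and the dictionary keys are hypothetical, the filter fields are returned as raw packed values (how they map to actual coefficients is not described here), and `offset1[0]` is assumed to be an 8-bit value built from the high nibble in the 2nd word and one low bit from each of words 5 to 8.

```python
import struct

FILTER_BITS = (5, 5, 4, 4, 4, 3, 3, 3)  # bit allocation from the 1st word


def split_fields(word: int, widths) -> list:
    """Split one 32-bit word into fields of the given widths, LSB first."""
    fields = []
    for width in widths:
        fields.append(word & ((1 << width) - 1))
        word >>= width
    return fields


def unpack_frame(data: bytes) -> dict:
    """Unpack one 32-byte frame into its raw fields (illustrative sketch)."""
    if len(data) != 32:
        raise ValueError("a frame is eight 32-bit words (32 bytes)")
    words = struct.unpack("<8I", data)  # eight little-endian 32-bit words

    flag, *filter_raw = split_fields(words[0], (1,) + FILTER_BITS)
    *offset2, off1_0_high = split_fields(words[1], (7, 7, 7, 7, 4))
    val0, val1, off1_1_low = split_fields(words[2], (14, 14, 4))
    val2, val3, off1_1_high = split_fields(words[3], (14, 14, 4))

    # Assumption: the high and low nibbles combine into 8-bit offsets.
    offset1 = [off1_0_high << 4, (off1_1_high << 4) | off1_1_low]

    pulse_offset, pulse_position = [], []
    for i in range(4):
        off, pos, low_bit = split_fields(words[4 + i], (4, 27, 1))
        pulse_offset.append(off)
        pulse_position.append(pos)
        offset1[0] |= low_bit << i  # words 5..8 carry bits 0..3 of offset1[0]

    return {
        "flag": flag,
        "filter_raw": filter_raw,
        "offset1": offset1,
        "offset2": offset2,
        "pulse_value": [val0, val1, val2, val3],
        "pulse_offset": pulse_offset,
        "pulse_position": pulse_position,
    }
```

Every word's field widths add up to 32 bits (1 + 31, 28 + 4, 28 + 4, and 4 + 27 + 1), so each word can be split independently of the others.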

Previous filter selection mode flag: when set to one, the first subframe uses the weighted average 2/3*prev_filt[] + 1/3*cur_filt[] and the second subframe uses 1/3*prev_filt[] + 2/3*cur_filt[]; otherwise the previous filter coefficients are used for both. The third and fourth subframes always use the current filter.
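
The filter selection rule can be sketched as below. The function name `subframe_filters` is hypothetical, and the sketch uses floating-point arithmetic for clarity; an actual decoder may well use fixed-point arithmetic with its own rounding, which this page does not specify.

```python
def subframe_filters(prev_filt, cur_filt, flag):
    """Return the four per-subframe filters for one frame (illustrative sketch).

    prev_filt and cur_filt are sequences of 8 coefficients; flag is the
    "previous filter selection mode" bit from the 1st word.
    """
    if flag:
        # Subframe 1 leans towards the previous filter, subframe 2 towards the current one.
        first = [(2 * p + c) / 3 for p, c in zip(prev_filt, cur_filt)]
        second = [(p + 2 * c) / 3 for p, c in zip(prev_filt, cur_filt)]
    else:
        # Flag cleared: both early subframes reuse the previous filter unchanged.
        first = list(prev_filt)
        second = list(prev_filt)
    # Subframes 3 and 4 always use the current filter.
    return [first, second, list(cur_filt), list(cur_filt)]


# Example: previous coefficients all 3.0, current coefficients all 0.0.
filters = subframe_filters([3.0] * 8, [0.0] * 8, flag=1)
```

With the flag set, the filters move from the previous coefficients towards the current ones in thirds over the first two subframes.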

Technical details (for standard scheme)

Decoding flow:

 unpack frame data
 reconstruct packed filter
 merge this filter with the previous one to create 4 filters, one for each subframe
 for each subframe {
   apply two-point filter to some saved data
   place pulses (at most 7 pulses per subframe)
   update saved filter
   apply main filter to subframe data (mostly pulses)
 }