DSP Group Truespeech
Latest revision as of 10:00, 25 October 2017
- Format tag: 0x22
- Company: DSP Group
- Description and binaries: http://www.rarewares.org/rrw/truespeech.php
- Samples: http://samples.mplayerhq.hu/A-codecs/truespeech/
This codec was developed by DSP Group. It operates on 8000 Hz, 16-bit mono PCM and has several fixed bitrates; the Windows version supports only 15:1 compression (about 8.5 kbit/s, roughly 1 KB/s).
This codec employs LPC (Linear Predictive Coding) as almost every speech codec does.
Bitstream format
Each frame (240 samples) is divided into 4 subframes of 60 samples each.
The packed frame data holds the 8th-order filter used in synthesis, the pulse positions and values (4 subframes with up to 7 pulses each), and offsets into the previously decoded frame.
All data is stored in 32-bit little-endian words and bits are read from the LSB.
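The reading convention above can be sketched as follows. This is a minimal illustration, not code from any existing decoder: the names `frame_words` and `LsbBitReader` are hypothetical, and the 32-byte frame size is an assumption derived from the eight words listed in the frame format.

```python
import struct


def frame_words(frame_bytes):
    """Split one packed frame into eight 32-bit little-endian words.

    Assumption: a frame is exactly the eight words listed in the
    frame format, i.e. 32 bytes.
    """
    if len(frame_bytes) != 32:
        raise ValueError("expected a 32-byte frame")
    return struct.unpack("<8I", frame_bytes)


class LsbBitReader:
    """Reads bit fields from a single 32-bit word, starting at the LSB."""

    def __init__(self, word):
        self.word = word & 0xFFFFFFFF
        self.pos = 0  # number of bits already consumed

    def read(self, nbits):
        if self.pos + nbits > 32:
            raise ValueError("read past the end of the word")
        value = (self.word >> self.pos) & ((1 << nbits) - 1)
        self.pos += nbits
        return value
```

For example, reading 1 bit and then 5 bits from the word 0xB5 yields the lowest bit first and then the next five bits above it.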
Frame format
  // 1st word
  1 bit - previous filter selection mode
  the rest are 8 filter coefficients with the following bit allocation: 5, 5, 4, 4, 4, 3, 3, 3
  // 2nd word
  4x7 bits - offset2 values 0-3
  4 bits - high 4 bits of offset1[0]
  // 3rd word
  2x14 bits - pulse values 0 and 1
  4 bits - low 4 bits of offset1[1]
  // 4th word
  2x14 bits - pulse values 2 and 3
  4 bits - high 4 bits of offset1[1]
  // 5th word
  4 bits - pulse offset 0
  27 bits - pulse position 0
  1 bit - offset1[0] bit 0
  // 6th word
  4 bits - pulse offset 1
  27 bits - pulse position 1
  1 bit - offset1[0] bit 1
  // 7th word
  4 bits - pulse offset 2
  27 bits - pulse position 2
  1 bit - offset1[0] bit 2
  // 8th word
  4 bits - pulse offset 3
  27 bits - pulse position 3
  1 bit - offset1[0] bit 3
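The layout above can be unpacked roughly as shown below. This is a sketch under stated assumptions: fields are taken LSB-first in the order they are listed, item 0 of each group sits in the lowest bits, and the function name `unpack_frame` and the dictionary keys are made up for illustration. The filter values are returned as raw bit fields; turning them into actual coefficients is not covered here.

```python
import struct

# Bit widths of the 8 packed filter coefficients (from the 1st word).
FILTER_BITS = (5, 5, 4, 4, 4, 3, 3, 3)


def split_fields(word, widths):
    """Cut a 32-bit word into fields of the given widths, LSB first."""
    out, pos = [], 0
    for width in widths:
        out.append((word >> pos) & ((1 << width) - 1))
        pos += width
    return out


def unpack_frame(frame_bytes):
    """Unpack one 32-byte frame into its raw fields (assumed field order)."""
    words = struct.unpack("<8I", frame_bytes)

    flag, *filter_raw = split_fields(words[0], (1,) + FILTER_BITS)
    *offset2, off1_0_high = split_fields(words[1], (7, 7, 7, 7, 4))
    pv0, pv1, off1_1_low = split_fields(words[2], (14, 14, 4))
    pv2, pv3, off1_1_high = split_fields(words[3], (14, 14, 4))

    pulse_offsets, pulse_positions = [], []
    off1_0_low = 0
    for i in range(4):
        offset, position, bit = split_fields(words[4 + i], (4, 27, 1))
        pulse_offsets.append(offset)
        pulse_positions.append(position)
        off1_0_low |= bit << i  # words 5-8 carry bits 0-3 of offset1[0]

    return {
        "flag": flag,
        "filter_raw": filter_raw,
        "offset2": offset2,
        "offset1": [(off1_0_high << 4) | off1_0_low,
                    (off1_1_high << 4) | off1_1_low],
        "pulse_values": [pv0, pv1, pv2, pv3],
        "pulse_offsets": pulse_offsets,
        "pulse_positions": pulse_positions,
    }
```

Note that each word adds up to exactly 32 bits (for instance 1 + 5+5+4+4+4+3+3+3 in the first word), so no padding is needed between fields.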
Previous filter selection mode flag: when set to one, the first subframe uses the weighted average 2/3*prev_filt[] + 1/3*cur_filt[] and the second subframe uses 1/3*prev_filt[] + 2/3*cur_filt[]; otherwise the previous filter coefficients are used for both. The third and fourth subframes always use the current filter.
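The per-subframe filter selection described above can be illustrated like this. It is a floating-point sketch that follows the text literally; the function name `subframe_filters` is hypothetical, and an actual decoder would most likely use fixed-point arithmetic with its own rounding.

```python
def subframe_filters(prev_filt, cur_filt, flag):
    """Return the 4 filters (one per subframe) for the current frame.

    prev_filt, cur_filt: sequences of 8 coefficients each.
    flag: the "previous filter selection mode" bit from the 1st word.
    """
    if flag:
        # Interpolate between the previous and the current filter.
        first = [(2 * p + c) / 3 for p, c in zip(prev_filt, cur_filt)]
        second = [(p + 2 * c) / 3 for p, c in zip(prev_filt, cur_filt)]
    else:
        # Reuse the previous filter for the first two subframes.
        first = list(prev_filt)
        second = list(prev_filt)
    # The third and fourth subframes always use the current filter.
    return [first, second, list(cur_filt), list(cur_filt)]
```

With `prev_filt` all 3.0 and `cur_filt` all 6.0, setting the flag gives coefficients of 4.0 for the first subframe and 5.0 for the second, while the last two subframes stay at 6.0.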
Technical details (for standard scheme)
Decoding flow:
  unpack frame data
  reconstruct packed filter
  merge this filter with the previous one to create 4 filters, one per subframe
  for each subframe {
      apply twopoint filter on some saved data
      place pulses (at most 7 pulses per subframe)
      update saved filter
      apply main filter to subframe data (mostly pulses)
  }