# Voxware Metavoice

- Format tags: 0x62, 0x69, 0x70, 0x71, 0x72, 0x73, 0x74, 0x76, 0x77, 0x78, 0x79, 0x7A, 0x7B, 0x81, 0x181C
- Samples: https://samples.mplayerhq.hu/A-codecs/VoxwareRT24-speechCodec/ https://samples.mplayerhq.hu/A-codecs/format-0x70-8klernouthausplecelp.wav https://samples.mplayerhq.hu/A-codecs/format-0x72-lernouthausplesbc12k.wav

These are several families of audio codecs licensed or created by Voxware.

## L&H CELP 4.8kbps

- Format tag: 0x70

This is a typical CELP codec that expands 12 bytes of input into 160 samples divided into three subframes (54/53/53 samples).

Transmitted data includes an order-10 LPC filter for the whole frame; each subframe additionally carries pitch and pulse information (scale and position).
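The frame arithmetic above can be checked directly. This is a minimal sketch that assumes an 8 kHz sampling rate (suggested by the "8k" in the sample file name, but not stated explicitly here); all names are illustrative.

```python
# Frame geometry as described above; SAMPLE_RATE is an assumption.
FRAME_BYTES = 12
SUBFRAME_SAMPLES = (54, 53, 53)
SAMPLE_RATE = 8000  # Hz, assumed


def frame_samples() -> int:
    """Total samples produced per frame (sum of the three subframes)."""
    return sum(SUBFRAME_SAMPLES)


def bitrate_bps(sample_rate: int = SAMPLE_RATE) -> float:
    """Bits per second implied by one 12-byte frame per frame_samples() samples."""
    return FRAME_BYTES * 8 * sample_rate / frame_samples()


print(frame_samples())  # 160
print(bitrate_bps())    # 4800.0
```

Under that assumption the result agrees with the 4.8kbps figure in the codec name.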

## L&H SBC codecs 8/12/16kbps

- Format tags: 0x71, 0x72, 0x73

This is a family of variable-bitrate sub-band coders. The codecs seem to divide audio into 8 sub-bands, not all of which need to be coded (the first frame contains flags signalling which sub-bands are coded).

## RT24, RT28, RT29, VR12, VR18

- Format tags: 0x76, 0x77, 0x78, 0x181C
- Bitrates: 2.4kbps, 2.8kbps, 2.98kbps, 1.2kbps (variable), 1.8kbps (variable)

This is a family of order-10 LPC-based speech codecs sharing the same core. The VR codecs have a variable bitrate because each frame uses one of four possible modes, and the modes take different numbers of bits.

RT24 frame:

- 4 bits - some scale
- 8 bits - pitch
- 6 bits - pitch scale
- 35 bits - filter coefficients (3 or 4 bits per coefficient in randomised order)
- 1 bit - padding

RT28 frame:

- 4 bits - some scale
- 8 bits - pitch
- 41 bits - filter coefficients (3-5 bits per coefficient)
- 4 bits - unknown
- 6 bits - pitch scale
- 1 bit - padding

RT29 frame:

- 4 bits - some scale
- 8 bits - pitch
- 41 bits - filter coefficients (3-5 bits per coefficient)
- 3x3 bits - unknown
- 5 bits - pitch scale
- 1 bit - padding
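The three RT layouts above can be tabulated to get the frame size and the bit offset of each field. This is only a sketch: the offsets assume the fields are stored in the listed order, which the description does not confirm, and the field names are made up for illustration.

```python
# Field widths in bits, in the order they are listed above.
RT_LAYOUTS = {
    "RT24": [("scale", 4), ("pitch", 8), ("pitch_scale", 6),
             ("filter", 35), ("padding", 1)],
    "RT28": [("scale", 4), ("pitch", 8), ("filter", 41),
             ("unknown", 4), ("pitch_scale", 6), ("padding", 1)],
    "RT29": [("scale", 4), ("pitch", 8), ("filter", 41),
             ("unknown", 3 * 3), ("pitch_scale", 5), ("padding", 1)],
}


def frame_bits(layout):
    """Total number of bits in one frame."""
    return sum(width for _, width in layout)


def field_offsets(layout):
    """Map field name -> (start_bit, width), assuming listed order is storage order."""
    offsets, pos = {}, 0
    for name, width in layout:
        offsets[name] = (pos, width)
        pos += width
    return offsets


for name, layout in RT_LAYOUTS.items():
    print(name, frame_bits(layout))
```

With the widths as listed, the frames come to 54, 64 and 68 bits respectively.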

VR12 frame:

- 2 bits - mode
- mode 0:
  - 7 bits - pitch
  - 6x4 bits - filter coefficients
  - 5 bits - pitch scale
- mode 1:
  - 3 bits - some scale
  - 7 bits - pitch
  - 6x4 bits - filter coefficients
  - 5 bits - pitch scale
- mode 2:
  - 4x4 bits - filter coefficients
  - 5 bits - pitch scale
- mode 3: nothing else (silence?)

VR18 frame:

- 2 bits - mode
- mode 0:
  - 7 bits - pitch
  - 8x4 bits - filter coefficients
  - 6 bits - pitch scale
- mode 1:
  - 3 bits - some scale
  - 7 bits - pitch
  - 8x4 bits - filter coefficients
  - 6 bits - pitch scale
- mode 2:
  - 6x4 bits - filter coefficients
  - 5 bits - pitch scale
- mode 3: nothing else (silence?)
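Since the frame size depends on the 2-bit mode, the total size of a stream depends on which modes occur. The sketch below tabulates the two VR layouts above and sums the bits for a sequence of modes; the table names (`VR12_MODES`, `VR_HIGH_MODES`) and field names are illustrative, not taken from any decoder.

```python
MODE_BITS = 2  # every frame starts with a 2-bit mode field

# Per-mode payload fields (widths in bits) for the first layout above.
VR12_MODES = {
    0: [("pitch", 7), ("filter", 6 * 4), ("pitch_scale", 5)],
    1: [("scale", 3), ("pitch", 7), ("filter", 6 * 4), ("pitch_scale", 5)],
    2: [("filter", 4 * 4), ("pitch_scale", 5)],
    3: [],  # nothing after the mode field
}

# Per-mode payload fields for the second, higher-rate layout above.
VR_HIGH_MODES = {
    0: [("pitch", 7), ("filter", 8 * 4), ("pitch_scale", 6)],
    1: [("scale", 3), ("pitch", 7), ("filter", 8 * 4), ("pitch_scale", 6)],
    2: [("filter", 6 * 4), ("pitch_scale", 5)],
    3: [],
}


def frame_bits(modes, mode):
    """Bits used by one frame of the given mode, including the mode field."""
    return MODE_BITS + sum(width for _, width in modes[mode])


def stream_bits(modes, mode_sequence):
    """Total bits used by a sequence of frames with the given modes."""
    return sum(frame_bits(modes, m) for m in mode_sequence)


print([frame_bits(VR12_MODES, m) for m in range(4)])     # [38, 41, 23, 2]
print([frame_bits(VR_HIGH_MODES, m) for m in range(4)])  # [47, 50, 31, 2]
```

A run of mode 3 frames costs only 2 bits each, which is consistent with the guess that this mode marks silence.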

## SC3, SC6

- Format tags: 0x7A, 0x7B
- Bitrates: 3kbps, 6kbps

Common SC3 and SC6 frame data:

- 8 bits - pitch
- 6 bits - pitch scale
- 4 bits - some scale
- 44 bits - filter coefficients (5/5/5/4/5/3/5/2/5/0/5 bits per coefficient)
- 1 bit - padding
- 1 bit - additional data present flag

In SC6 the additional data has the following format:

- 8x7 bits - additional filter coefficients
- 2x3 bits - unknown
- 1 bit - padding
- 1 bit - another flag (seems to be ignored)
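The flag-driven layout above could be read roughly as follows. This is a hedged sketch, not the actual decoder: it assumes MSB-first bit packing, that fields are stored in the listed order, and that "2x3 bits" means two 3-bit fields. `BitReader`, `parse_sc_frame` and the dictionary keys are hypothetical names.

```python
class BitReader:
    """Reads unsigned fields from a byte string, MSB first (an assumption)."""

    def __init__(self, data: bytes):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, nbits: int) -> int:
        if nbits > self.remaining:
            raise ValueError("not enough bits")
        self.remaining -= nbits
        return (self.value >> self.remaining) & ((1 << nbits) - 1)


# Bits per filter coefficient in the common part (sums to 44).
COEFF_BITS = (5, 5, 5, 4, 5, 3, 5, 2, 5, 0, 5)


def parse_sc_frame(data: bytes) -> dict:
    br = BitReader(data)
    frame = {
        "pitch": br.read(8),
        "pitch_scale": br.read(6),
        "scale": br.read(4),
        "coeffs": [br.read(n) for n in COEFF_BITS],
    }
    br.read(1)  # padding
    frame["has_extra"] = bool(br.read(1))
    if frame["has_extra"]:
        # Additional data block described for SC6.
        frame["extra_coeffs"] = [br.read(7) for _ in range(8)]
        frame["unknown"] = [br.read(3) for _ in range(2)]
        br.read(1)  # padding
        frame["extra_flag"] = br.read(1)  # seems to be ignored
    return frame
```

With these widths the common part takes 64 bits and the additional block another 64, so a frame with the flag set occupies 16 bytes.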