Voxware Metavoice

Metavoice covers several families of audio codecs licensed or created by Voxware.

L&H CELP 4.8kbps

  • Format tag: 0x70

This is a typical CELP codec that expands 12 bytes of input into 160 samples divided into three subframes (54/53/53 samples).

The transmitted data includes an order-10 LPC filter for the whole frame, plus pitch and pulse information (scale and position) for each subframe.
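The frame figures above can be checked against the nominal bitrate with a little arithmetic. The sketch below is in Python; the 8 kHz sample rate is an assumption (the usual rate for narrowband speech codecs) and is not stated in the description.

```python
# Figures taken from the description above.
FRAME_BYTES = 12
SUBFRAME_SAMPLES = (54, 53, 53)

# Assumption, not stated in the text: 8 kHz mono input.
ASSUMED_SAMPLE_RATE = 8000

frame_bits = FRAME_BYTES * 8            # bits per coded frame
frame_samples = sum(SUBFRAME_SAMPLES)   # samples produced per frame

# bits per second = bits per frame * frames per second
bitrate = frame_bits * ASSUMED_SAMPLE_RATE / frame_samples

print(frame_bits, frame_samples, bitrate)
```

Under that assumption a frame lasts 20 ms and the result is 4800 bits per second, which agrees with the 4.8kbps in the codec name.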

L&H SBC codecs 8/12/16kbps

  • Format tags: 0x71, 0x72, 0x73

This is a family of variable-bitrate sub-band coders. The codec seems to divide audio into 8 sub-bands, not all of which have to be coded (the first frame contains flags signalling which sub-bands are coded).

RT24, RT28, RT29, VR12, VR18

  • Format tags: 0x76, 0x77, 0x78, 0x181C
  • Bitrates: 2.4kbps, 2.8kbps, 2.98kbps, 1.2kbps (variable), 1.8kbps (variable)

This is a family of order-10 LPC-based speech codecs sharing the same core. The VR codecs can have a variable bitrate because each frame uses one of four possible modes, and the modes take different numbers of bits.

RT24 frame:

  4 bits - some scale
  8 bits - pitch
  6 bits - pitch scale
 35 bits - filter coefficients (3 or 4 bits per coefficient in randomised order)
  1 bit  - padding

RT28 frame:

  4 bits - some scale
  8 bits - pitch
 41 bits - filter coefficients (3-5 bits per coefficient)
  4 bits - unknown
  6 bits - pitch scale
  1 bit  - padding

RT29 frame:

   4 bits - some scale
   8 bits - pitch
  41 bits - filter coefficients (3-5 bits per coefficient)
 3x3 bits - unknown
   5 bits - pitch scale
   1 bit  - padding
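
The three RT layouts above can be written down as tables of field widths, which makes it easy to total the frame sizes and to split a raw frame into fields. This is only a sketch in Python: the field names are made up for illustration, and the MSB-first bit order and the dropping of unused trailing bits are assumptions, since the listings give field widths only. The filter field is returned as one raw value because the per-coefficient split is not fully specified.

```python
# Field widths in bits, in the order listed above.
# Field names are illustrative, not taken from any decoder.
RT_LAYOUTS = {
    "RT24": [("scale", 4), ("pitch", 8), ("pitch_scale", 6),
             ("filter", 35), ("padding", 1)],
    "RT28": [("scale", 4), ("pitch", 8), ("filter", 41),
             ("unknown", 4), ("pitch_scale", 6), ("padding", 1)],
    "RT29": [("scale", 4), ("pitch", 8), ("filter", 41),
             ("unknown", 3 * 3), ("pitch_scale", 5), ("padding", 1)],
}


def frame_bits(layout):
    """Total number of bits described by a layout."""
    return sum(width for _, width in layout)


def split_fields(frame, layout):
    """Split a frame into raw field values.

    Assumption: fields are packed MSB-first starting at the first byte,
    and any bits past the end of the layout are unused.
    """
    total = frame_bits(layout)
    available = len(frame) * 8
    if available < total:
        raise ValueError("frame is shorter than the layout")
    # Treat the frame as one big integer and drop unused trailing bits.
    value = int.from_bytes(frame, "big") >> (available - total)
    fields = {}
    remaining = total
    for name, width in layout:
        remaining -= width
        fields[name] = (value >> remaining) & ((1 << width) - 1)
    return fields


for name, layout in RT_LAYOUTS.items():
    print(name, frame_bits(layout), "bits")
```

The totals come to 54 bits for RT24, 64 bits for RT28 and 68 bits for RT29.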

VR12 frame:

 2 bits - mode
 mode 0:
     7 bits - pitch
   6x4 bits - filter coefficients
     5 bits - pitch scale
 mode 1:
     3 bits - some scale
     7 bits - pitch
   6x4 bits - filter coefficients
     5 bits - pitch scale
 mode 2:
   4x4 bits - filter coefficients
     5 bits - pitch scale
 mode 3: nothing else (silence?)

VR15 frame:

 2 bits - mode
 mode 0:
     7 bits - pitch
   8x4 bits - filter coefficients
     6 bits - pitch scale
 mode 1:
     3 bits - some scale
     7 bits - pitch
   8x4 bits - filter coefficients
     6 bits - pitch scale
 mode 2:
   6x4 bits - filter coefficients
     5 bits - pitch scale
 mode 3: nothing else (silence?)
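
Because the frame contents depend on the 2-bit mode, the frame length in bits varies from frame to frame. The Python sketch below totals the field widths from the two listings above for each mode; the field names are illustrative, and nothing is assumed here about how frames are aligned or padded in the stream.

```python
MODE_BITS = 2  # every frame starts with a 2-bit mode

# Field widths per mode, copied from the listings above.
VR_MODE_FIELDS = {
    "VR12": {
        0: [("pitch", 7), ("filter", 6 * 4), ("pitch_scale", 5)],
        1: [("scale", 3), ("pitch", 7), ("filter", 6 * 4), ("pitch_scale", 5)],
        2: [("filter", 4 * 4), ("pitch_scale", 5)],
        3: [],  # nothing follows the mode bits
    },
    "VR15": {
        0: [("pitch", 7), ("filter", 8 * 4), ("pitch_scale", 6)],
        1: [("scale", 3), ("pitch", 7), ("filter", 8 * 4), ("pitch_scale", 6)],
        2: [("filter", 6 * 4), ("pitch_scale", 5)],
        3: [],
    },
}


def vr_frame_bits(codec, mode):
    """Bits in one frame, including the mode field itself."""
    return MODE_BITS + sum(width for _, width in VR_MODE_FIELDS[codec][mode])


for codec in VR_MODE_FIELDS:
    print(codec, [vr_frame_bits(codec, mode) for mode in range(4)])
```

This gives 38, 41, 23 and 2 bits for the four modes of the first layout, and 47, 50, 31 and 2 bits for the second.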

SC3, SC6

  • Format tags: 0x7A, 0x7B
  • Bitrates: 3kbps, 6kbps

Common SC3 and SC6 frame data:

  8 bits - pitch
  6 bits - pitch scale
  4 bits - some scale
 44 bits - filter coefficients (5/5/5/4/5/3/5/2/5/0/5 bits per coefficient)
  1 bit  - padding
  1 bit  - additional data present flag

In SC6 the additional data has the following format:

 8x7 bits - another set of filter coefficients
 2x3 bits - unknown
   1 bit  - padding
   1 bit  - another flag (seems to be ignored)
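
As a consistency check on the two listings above, the Python sketch below totals the field widths. The field names are illustrative only.

```python
# Per-coefficient widths as listed for the common frame data.
SC_COEFF_BITS = (5, 5, 5, 4, 5, 3, 5, 2, 5, 0, 5)

# Common SC3/SC6 frame data.
SC_COMMON = [
    ("pitch", 8),
    ("pitch_scale", 6),
    ("scale", 4),
    ("filter", sum(SC_COEFF_BITS)),
    ("padding", 1),
    ("extra_present", 1),
]

# Additional data described for SC6.
SC6_EXTRA = [
    ("filter2", 8 * 7),
    ("unknown", 2 * 3),
    ("padding", 1),
    ("flag", 1),
]

common_bits = sum(width for _, width in SC_COMMON)
extra_bits = sum(width for _, width in SC6_EXTRA)

print(sum(SC_COEFF_BITS), common_bits, extra_bits, common_bits + extra_bits)
```

The coefficient widths add up to the stated 44 bits, the common part is 64 bits, and the SC6 additional data is another 64 bits. A doubling from 64 to 128 bits would match the doubling from 3kbps to 6kbps if both codecs use the same frame duration, which is an assumption rather than something the listings state.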