ACT

From MultimediaWiki

ACT files are created by Chinese-made portable MP3 players when recording from the internal microphone.


ACT files use the G.729A speech compression codec. A player can record to this format in three different modes:

  • Fine Rec
  • Long Rec
  • Long Vox: in this mode the player records only when activated by voice.

ACT File Format

Bitstream

These three modes map to two bitstream formats, Fine and Long.

Name  Sample rate (Hz)  Frame size (bytes)  Frames per chunk  Chunk size (bytes)  Padding bytes
Fine  8000              10                  51                510                 2
Long  4400              22                  23                506                 6


Files are always aligned to a 512-byte boundary and consist of a 512-byte header followed by chunks of equal size. Each chunk consists of a fixed number of frames and padding bytes, which together occupy 512 bytes (510 + 2 for Fine, 506 + 6 for Long).
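The layout above can be turned into simple offset arithmetic. The following is a minimal sketch in C, not a reference implementation: the names act_layout, act_frame_offset and act_frame_count are hypothetical, and it assumes that the padding bytes sit at the end of each chunk, which the description above does not state explicitly.

```c
#include <stddef.h>

/* Values copied from the bitstream table above. */
struct act_layout {
    unsigned sample_rate;       /* Hz */
    unsigned frame_size;        /* bytes per ACT frame */
    unsigned frames_per_chunk;
    unsigned padding;           /* padding bytes per chunk */
};

static const struct act_layout ACT_FINE = { 8000, 10, 51, 2 };
static const struct act_layout ACT_LONG = { 4400, 22, 23, 6 };

enum { ACT_HEADER_SIZE = 512, ACT_BLOCK_SIZE = 512 };

/*
 * Byte offset, from the start of the file, of frame number n (0-based).
 * Assumption: frames come first in each 512-byte block, padding last.
 */
static size_t act_frame_offset(const struct act_layout *l, size_t n)
{
    size_t chunk = n / l->frames_per_chunk;
    size_t index = n % l->frames_per_chunk;
    return ACT_HEADER_SIZE + chunk * ACT_BLOCK_SIZE + index * l->frame_size;
}

/* Number of frames in a file of the given size, or 0 if the size is not
 * a header plus a whole number of 512-byte blocks. */
static size_t act_frame_count(const struct act_layout *l, size_t file_size)
{
    if (file_size < ACT_HEADER_SIZE || file_size % ACT_BLOCK_SIZE != 0)
        return 0;
    return (file_size - ACT_HEADER_SIZE) / ACT_BLOCK_SIZE * l->frames_per_chunk;
}
```

For example, act_frame_offset(&ACT_FINE, 51) gives 1024, the start of the second chunk, because the first chunk holds frames 0 to 50.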

Header

Name              Offset  Bytes  Description
riff tag          0       4      'RIFF'
riff size         4       4      data_size + 36
wave tag          8       4      'WAVE'
fmt tag           12      4      'fmt '
wave header size  16      4      Always 0x10
wave format       20      2      0x01 (PCM; this collides with the value for regular 16-bit PCM)
channels          22      2      0x01 (mono)
sample rate       24      4      8000 or 4400
bit rate          28      4      16000 or ?
block align       32      2      Always 0x02
bits per sample   34      2      Always 16
data_tag          36      4      'data'
data_size         40      4      chunks_count * 8 * chunk_size (chunk_size is taken from the bitstream table)
reserved          44      212    Zero-filled
act tag           256     1      Always 0x84
millisec          257     2      Milliseconds part of the duration
seconds           259     1      Seconds part of the duration
minutes           260     4      Minutes part of the duration
reserved          264     248    Zero-filled
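
As an illustration of the header table above, the sketch below checks the fixed tags and reads the sample rate and duration from a 512-byte buffer. The names act_duration and act_parse_header are hypothetical. Byte order is an assumption: RIFF fields are little-endian, and the sketch assumes the same for the millisec and minutes fields, which the table does not specify.

```c
#include <stdint.h>
#include <string.h>

struct act_duration {
    uint32_t minutes;
    uint8_t  seconds;
    uint16_t millisec;
};

/* Read little-endian integers from a byte buffer (assumed byte order). */
static uint16_t rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Check the fixed fields of a 512-byte ACT header and extract the sample
 * rate and duration. Returns 0 on success, -1 if the header does not match.
 */
static int act_parse_header(const uint8_t hdr[512], uint32_t *sample_rate,
                            struct act_duration *dur)
{
    if (memcmp(hdr + 0, "RIFF", 4) || memcmp(hdr + 8, "WAVE", 4) ||
        memcmp(hdr + 12, "fmt ", 4) || memcmp(hdr + 36, "data", 4))
        return -1;
    if (hdr[256] != 0x84)            /* act tag */
        return -1;

    *sample_rate = rd32le(hdr + 24);
    if (*sample_rate != 8000 && *sample_rate != 4400)
        return -1;

    dur->millisec = rd16le(hdr + 257);
    dur->seconds  = hdr[259];
    dur->minutes  = rd32le(hdr + 260);
    return 0;
}
```

Because the wave format field carries the ordinary PCM value, the act tag at offset 256 and the unusual layout are the only fields in this sketch that distinguish an ACT file from a plain WAV file.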


Frame format

The first half of an ACT frame contains the odd bytes of the encoded G.729 frame, and the second half contains the even bytes.

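/* De-interleave one ACT frame into G.729 byte order (frame_size is even). */
for (i = 0; i < frame_size / 2; i++)
{
    g729_data[2*i + 1] = act_data[i];                   /* first half: odd bytes */
    g729_data[2*i]     = act_data[frame_size / 2 + i];  /* second half: even bytes */
}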
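
For testing a decoder it can help to have the inverse mapping as well. The sketch below wraps the loop above in a function and adds the reverse direction; the function names act_deinterleave and act_interleave are hypothetical, and the inverse is simply derived from the mapping above rather than taken from any player firmware.

```c
#include <stddef.h>
#include <stdint.h>

/* ACT frame -> G.729 byte order (same mapping as the loop above). */
static void act_deinterleave(const uint8_t *act, uint8_t *g729, size_t frame_size)
{
    size_t half = frame_size / 2;
    for (size_t i = 0; i < half; i++) {
        g729[2 * i + 1] = act[i];         /* first half holds the odd bytes */
        g729[2 * i]     = act[half + i];  /* second half holds the even bytes */
    }
}

/* G.729 byte order -> ACT frame (inverse of the mapping above). */
static void act_interleave(const uint8_t *g729, uint8_t *act, size_t frame_size)
{
    size_t half = frame_size / 2;
    for (size_t i = 0; i < half; i++) {
        act[i]        = g729[2 * i + 1];
        act[half + i] = g729[2 * i];
    }
}
```

Running act_interleave on the output of act_deinterleave should reproduce the original ACT frame byte for byte, which makes a convenient sanity check.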

G.729A Codec parameters

Table of transmitted parameter indices. For each parameter, the Most Significant Bit (MSB) is transmitted first.

Symbol  Description                                   Bits (Fine)  Bits (Long)
L0      Switched MA predictor of LSP quantizer        1            1
L1      First stage vector of quantizer               7            7
L2      Second stage lower vector of LSP quantizer    5            5
L3      Second stage higher vector of LSP quantizer   5            5
P1      Pitch delay first subframe                    8            8
P0      Parity bit for Pitch delay                    1            1
C1      Fixed codebook first subframe                 13           17
S1      Signs of fixed-codebook pulses 1st subframe   4            4
GA1     Gain codebook (stage 1) 1st subframe          3            3
GB1     Gain codebook (stage 2) 1st subframe          4            4
P2      Pitch delay second subframe                   5            5
C2      Fixed codebook 2nd subframe                   13           17
S2      Signs of fixed-codebook pulses 2nd subframe   4            4
GA2     Gain codebook (stage 1) 2nd subframe          3            3
GB2     Gain codebook (stage 2) 2nd subframe          4            4
Total                                                 80           88

In Long mode, one 22-byte ACT frame holds two G.729 frames (2 × 88 bits = 22 bytes) and thus 40 ms of audio data.
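
The parameter table can be read as a list of bit widths consumed MSB first. The sketch below unpacks the 15 indices from one de-interleaved G.729 frame. The names get_bits and act_unpack_params are hypothetical, and the sketch assumes the parameters are packed back to back in table order with no gaps. How a 22-byte Long frame is divided into its two 88-bit frames is not described here, so the function takes a single frame that has already been separated.

```c
#include <stddef.h>
#include <stdint.h>

enum { ACT_NUM_PARAMS = 15 };

/* Bit widths in transmission order, copied from the table above. */
static const uint8_t ACT_BITS_FINE[ACT_NUM_PARAMS] =
    { 1, 7, 5, 5, 8, 1, 13, 4, 3, 4, 5, 13, 4, 3, 4 };   /* 80 bits */
static const uint8_t ACT_BITS_LONG[ACT_NUM_PARAMS] =
    { 1, 7, 5, 5, 8, 1, 17, 4, 3, 4, 5, 17, 4, 3, 4 };   /* 88 bits */

/* Read `count` bits (count <= 32) starting at bit *pos, MSB first. */
static uint32_t get_bits(const uint8_t *buf, size_t *pos, unsigned count)
{
    uint32_t value = 0;
    while (count--) {
        unsigned bit = (buf[*pos / 8] >> (7 - *pos % 8)) & 1u;
        value = (value << 1) | bit;
        (*pos)++;
    }
    return value;
}

/*
 * Unpack the 15 parameter indices (L0, L1, ..., GB2) of one frame.
 * Assumption: parameters are packed consecutively, without padding.
 */
static void act_unpack_params(const uint8_t *frame, const uint8_t *widths,
                              uint32_t out[ACT_NUM_PARAMS])
{
    size_t pos = 0;
    for (int i = 0; i < ACT_NUM_PARAMS; i++)
        out[i] = get_bits(frame, &pos, widths[i]);
}
```

Pass ACT_BITS_FINE with a 10-byte frame or ACT_BITS_LONG with an 11-byte frame; out[0] then holds L0 and out[14] holds GB2.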