ACT
- Samples: http://samples.mplayerhq.hu/A-codecs/act/
- Working decoder: http://www.netac.com/Download/C635/C635%20Driver&Tools.zip
- Unfinished FFmpeg patch: http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2008-February/042714.html
ACT files are created by Chinese-made portable MP3 players when recording from the internal microphone.
ACT files use the G.729A speech compression codec. A player can record to this format in three different modes:
- Fine Rec
- Long Rec
- Long Vox: in this mode the player records when voice-activated.
ACT File Format
Bitstream
These three modes map to two bitstream formats: Fine and Long.
Name | Sample rate (Hz) | Frame size (bytes) | Frames per chunk | Chunk size (bytes) | Padding bytes |
---|---|---|---|---|---|
Fine | 8000 | 10 | 51 | 510 | 2 |
Long | 4400 | 22 | 23 | 506 | 6 |
Files are always aligned to a 512-byte boundary and consist of a 512-byte header followed by chunks of equal size. Each chunk consists of a fixed number of frames and padding bytes; in both formats the frames plus padding total 512 bytes.
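The layout above can be turned into a frame-offset calculation. The following is a minimal sketch, not taken from any existing decoder: the names `ActLayout`, `ACT_BLOCK_SIZE` and `act_frame_offset` are hypothetical, and it assumes the padding bytes sit at the end of each chunk, after the frames.

```c
#include <assert.h>

/* Sizes from the bitstream table: 512-byte header, then 512-byte blocks
 * (chunk payload plus padding). */
enum { ACT_HEADER_SIZE = 512, ACT_BLOCK_SIZE = 512 };

typedef struct {
    int frame_size;        /* bytes per ACT frame */
    int frames_per_chunk;  /* frames stored in one chunk */
} ActLayout;

static const ActLayout ACT_FINE = { 10, 51 };  /* 51 * 10 + 2 padding = 512 */
static const ActLayout ACT_LONG = { 22, 23 };  /* 23 * 22 + 6 padding = 512 */

/* Byte offset in the file of frame number n (0-based).
 * Assumption: padding follows the frames inside each chunk. */
static long act_frame_offset(const ActLayout *l, long n)
{
    assert(l && n >= 0);
    long chunk = n / l->frames_per_chunk;
    long index = n % l->frames_per_chunk;
    return ACT_HEADER_SIZE + chunk * ACT_BLOCK_SIZE + index * l->frame_size;
}
```

For example, frame 51 of a Fine file is the first frame of the second chunk, so under these assumptions it starts at offset 1024.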
Header
Name | Offset | Bytes | Description |
---|---|---|---|
riff tag | 0 | 4 | 'RIFF' |
riff size | 4 | 4 | data_size + 36 |
wave tag | 8 | 4 | 'WAVE' |
fmt tag | 12 | 4 | 'fmt ' |
wave header size | 16 | 4 | Always 0x10 |
wave format | 20 | 2 | 0x01 (PCM)<ref name="pcm">This collides with the value used for regular 16-bit PCM</ref> |
channels | 22 | 2 | 0x01(mono) |
sample rate | 24 | 4 | 8000 or 4400 |
bit rate | 28 | 4 | 16000 or ? |
block align | 32 | 2 | Always 0x02 |
bits per sample | 34 | 2 | Always 16 |
data_tag | 36 | 4 | 'data' |
data_size | 40 | 4 | chunks_count * 8 * chunk_size<ref name="chunk_size"> chunk_size is determined from the bitstream table.</ref> |
reserved | 44 | 212 | Filled with zeros |
act tag | 256 | 1 | Always 0x84 |
millisec | 257 | 2 | milliseconds part of duration value |
seconds | 259 | 1 | seconds part of duration value |
minutes | 260 | 4 | minutes part of duration value |
reserved | 264 | 248 | Filled with zeros |
<references/>
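A minimal sketch of reading the fields from the header table follows. The names `ActHeader` and `act_parse_header` are hypothetical. RIFF fields are little-endian; treating the ACT-specific duration fields at offsets 257 to 263 as little-endian as well is an assumption, as is combining them into a single millisecond count.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sample_rate;  /* offset 24: 8000 or 4400 */
    uint32_t data_size;    /* offset 40 */
    uint32_t duration_ms;  /* derived from the fields at offsets 257..263 */
} ActHeader;

static uint32_t rd_le16(const uint8_t *p) { return p[0] | (uint32_t)p[1] << 8; }
static uint32_t rd_le32(const uint8_t *p) { return rd_le16(p) | rd_le16(p + 2) << 16; }

/* Returns 0 on success, -1 if the buffer does not look like an ACT header. */
static int act_parse_header(const uint8_t buf[512], ActHeader *h)
{
    assert(buf && h);
    if (memcmp(buf, "RIFF", 4) || memcmp(buf + 8, "WAVE", 4) ||
        memcmp(buf + 12, "fmt ", 4) || memcmp(buf + 36, "data", 4))
        return -1;
    if (buf[256] != 0x84)  /* act tag */
        return -1;
    h->sample_rate = rd_le32(buf + 24);
    h->data_size   = rd_le32(buf + 40);
    /* Assumption: millisec, seconds and minutes are little-endian integers. */
    h->duration_ms = rd_le16(buf + 257) + 1000u * buf[259]
                   + 60000u * rd_le32(buf + 260);
    return 0;
}
```

Because the wave format field collides with plain PCM, this sketch relies on the 0x84 tag at offset 256 to tell an ACT file from an ordinary WAV file.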
Frame format
The first half of an ACT frame contains the odd-numbered bytes (counting from zero) of the encoded G.729 frame, and the second half contains the even-numbered bytes.
for (i = 0; i < frame_size / 2; i++) {
    g729_data[2*i+1] = act_data[i];               /* first half: odd bytes */
    g729_data[2*i]   = act_data[frame_size/2 + i]; /* second half: even bytes */
}
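As a worked example of the reordering, the same loop can be wrapped in a self-contained function. The name `act_deinterleave` is hypothetical; the logic is only the mapping described above.

```c
#include <assert.h>
#include <stdint.h>

/* Reorder one ACT frame into G.729 byte order.
 * frame_size is 10 (Fine) or 22 (Long) and must be even. */
static void act_deinterleave(const uint8_t *act, uint8_t *g729, int frame_size)
{
    assert(frame_size > 0 && frame_size % 2 == 0);
    int half = frame_size / 2;
    for (int i = 0; i < half; i++) {
        g729[2 * i + 1] = act[i];         /* first half -> odd positions */
        g729[2 * i]     = act[half + i];  /* second half -> even positions */
    }
}
```

With a 10-byte Fine frame whose bytes are numbered 0 to 9 in file order, the output order is 5 0 6 1 7 2 8 3 9 4.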
G.729A Codec parameters
Table of transmitted parameter indices. For each parameter, the Most Significant Bit (MSB) is transmitted first.
Symbol | Description | Bits (Fine) | Bits (Long) |
---|---|---|---|
L0 | Switched MA predictor of LSP quantizer | 1 | 1 |
L1 | First stage vector of quantizer | 7 | 7 |
L2 | Second stage lower vector of LSP quantizer | 5 | 5 |
L3 | Second stage higher vector of LSP quantizer | 5 | 5 |
P1 | Pitch delay first subframe | 8 | 8 |
P0 | Parity bit for Pitch delay | 1 | 1 |
C1 | Fixed codebook first subframe | 13 | 17 |
S1 | Signs of fixed-codebook pulses 1st subframe | 4 | 4 |
GA1 | Gain codebook (stage 1) 1st subframe | 3 | 3 |
GB1 | Gain codebook (stage 2) 1st subframe | 4 | 4 |
P2 | Pitch delay second subframe | 5 | 5 |
C2 | Fixed codebook 2nd subframe | 13 | 17 |
S2 | Signs of fixed-codebook pulses 2nd subframe | 4 | 4 |
GA2 | Gain codebook (stage 1) 2nd subframe | 3 | 3 |
GB2 | Gain codebook (stage 2) 2nd subframe | 4 | 4 |
Total | | 80 | 88 |
In Long mode, one ACT frame holds two G.729 frames and thus 40 ms of audio data.
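The parameter table can be read as a list of bit-field widths. The sketch below unpacks one already deinterleaved G.729 frame (10 bytes in Fine mode, 11 bytes in Long mode) into its 15 indices. The names `get_bits` and `act_unpack_params` are hypothetical, and it assumes bits within each byte are consumed MSB first. How the two G.729 frames are arranged inside a 22-byte Long ACT frame is not covered here.

```c
#include <assert.h>
#include <stdint.h>

enum { ACT_NUM_PARAMS = 15 };

/* Field widths in transmission order (L0, L1, L2, L3, P1, P0, C1, S1,
 * GA1, GB1, P2, C2, S2, GA2, GB2), from the table above. */
static const uint8_t act_bits_fine[ACT_NUM_PARAMS] =
    { 1, 7, 5, 5, 8, 1, 13, 4, 3, 4, 5, 13, 4, 3, 4 };  /* 80 bits */
static const uint8_t act_bits_long[ACT_NUM_PARAMS] =
    { 1, 7, 5, 5, 8, 1, 17, 4, 3, 4, 5, 17, 4, 3, 4 };  /* 88 bits */

/* Read n bits, MSB first, starting at bit position *pos.
 * Assumption: the first transmitted bit is bit 7 of byte 0. */
static uint32_t get_bits(const uint8_t *buf, int *pos, int n)
{
    assert(n >= 0 && n <= 32);
    uint32_t v = 0;
    while (n-- > 0) {
        v = v << 1 | (buf[*pos >> 3] >> (7 - (*pos & 7)) & 1);
        (*pos)++;
    }
    return v;
}

/* Unpack one G.729 frame into parameter indices; returns bits consumed. */
static int act_unpack_params(const uint8_t *frame, const uint8_t *widths,
                             uint32_t params[ACT_NUM_PARAMS])
{
    int pos = 0;
    for (int i = 0; i < ACT_NUM_PARAMS; i++)
        params[i] = get_bits(frame, &pos, widths[i]);
    return pos;
}
```

Keeping the widths in a table means the Fine and Long variants share one unpacking routine and differ only in the two fixed-codebook entries.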