ACT

From MultimediaWiki


ACT files are created by Chinese-made portable MP3 players when recording from the internal microphone.


ACT files use the G.729A speech compression codec. A player offers three recording modes for this format:

  • Fine Rec
  • Long Rec
  • Long Vox: in this mode the player records only when voice-activated.


ACT File Format

Bitstream

These three modes map to two bitstream formats: Fine and Long.

Name  Sample rate (Hz)  Frame size (bytes)  Frames per chunk  Chunk size (bytes)  Padding bytes
Fine  8000              10                  51                510                 2
Long  4400              22                  23                506                 6


Files are always aligned to a 512-byte boundary. A file contains a 512-byte header followed by chunks of the same size. Each chunk consists of a fixed number of frames and padding bytes.
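
As a rough illustration of this layout, the following C sketch computes the file offset of a given frame. It assumes (this is not stated above) that the padding bytes come after the frames inside each chunk, so that frames plus padding fill exactly one 512-byte block. The names `act_layout`, `ACT_FINE`, `ACT_LONG` and `act_frame_offset` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ACT_HEADER_SIZE 512u
#define ACT_BLOCK_SIZE  512u

/* Hypothetical description of one bitstream format; values taken from the table above. */
struct act_layout {
    size_t frame_size;        /* bytes per ACT frame */
    size_t frames_per_chunk;  /* frames stored in one chunk */
    size_t padding;           /* padding bytes per chunk */
};

static const struct act_layout ACT_FINE = { 10, 51, 2 };
static const struct act_layout ACT_LONG = { 22, 23, 6 };

/* Byte offset of frame number `index` (0-based) from the start of the file.
 * Assumption: each chunk is laid out as frames first, padding last. */
size_t act_frame_offset(const struct act_layout *m, size_t index)
{
    /* Frames plus padding are expected to fill one 512-byte block exactly. */
    assert(m->frame_size * m->frames_per_chunk + m->padding == ACT_BLOCK_SIZE);

    size_t chunk = index / m->frames_per_chunk;
    size_t frame_in_chunk = index % m->frames_per_chunk;
    return ACT_HEADER_SIZE + chunk * ACT_BLOCK_SIZE + frame_in_chunk * m->frame_size;
}
```

With these values, frame 51 of a Fine file (the first frame of the second chunk) would start at offset 1024, skipping the 2 padding bytes at the end of the first chunk.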

Header

Name              Offset  Bytes  Description
riff tag          0       4      'RIFF'
riff size         4       4      data_size + 36
wave tag          8       4      'WAVE'
fmt tag           12      4      'fmt '
wave header size  16      4      Always 0x10
wave format       20      2      PCM's 0x01 [1]
channels          22      2      0x01 (mono)
sample rate       24      4      8000 or 4400
bit rate          28      4      16000 or ?
block align       32      2      Always 0x02
bits per sample   34      2      Always 16
data_tag          36      4      'data'
data_size         40      4      chunks_count * 8 * chunk_size [2]
reserved          44      212    filled with zeros
act tag           256     1      Always 0x84
millisec          257     2      milliseconds part of duration value
seconds           259     1      seconds part of duration value
minutes           260     4      minutes part of duration value
reserved          264     248    filled with zeros
  1. This collides with the value used for regular 16-bit PCM.
  2. chunk_size is taken from the bitstream table.
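
The header table can be turned into a small parser. The sketch below is only an illustration: it assumes all multi-byte fields are little-endian (the usual RIFF convention; the table does not say so for the ACT-specific fields), and the names `act_header`, `rd_le16`, `rd_le32` and `act_parse_header` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for a few of the header fields listed above. */
struct act_header {
    uint32_t sample_rate;  /* offset 24: 8000 (Fine) or 4400 (Long) */
    uint32_t data_size;    /* offset 40 */
    uint16_t millisec;     /* offset 257 */
    uint8_t  seconds;      /* offset 259 */
    uint32_t minutes;      /* offset 260 */
};

/* Read little-endian integers byte by byte, so host endianness does not matter. */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse a 512-byte ACT header. Returns 0 on success, -1 if a fixed field does not match. */
int act_parse_header(const uint8_t *buf, struct act_header *h)
{
    assert(buf != NULL && h != NULL);

    if (memcmp(buf + 0, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0 ||
        memcmp(buf + 12, "fmt ", 4) != 0 || memcmp(buf + 36, "data", 4) != 0)
        return -1;
    if (buf[256] != 0x84)  /* act tag */
        return -1;

    h->sample_rate = rd_le32(buf + 24);
    h->data_size   = rd_le32(buf + 40);
    h->millisec    = rd_le16(buf + 257);
    h->seconds     = buf[259];
    h->minutes     = rd_le32(buf + 260);

    /* Only the two sample rates from the table are accepted. */
    if (h->sample_rate != 8000 && h->sample_rate != 4400)
        return -1;
    return 0;
}
```

The sample rate is the natural field for choosing between the Fine and Long bitstream formats, since the wave format value alone cannot distinguish an ACT file from plain PCM.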

Frame format

The first half of an ACT frame contains the odd-numbered bytes of the encoded G.729 frame, while the second half contains the even-numbered bytes.

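/* act_data: one ACT frame; g729_data: output buffer; frame_size: bytes per frame (even) */
for (int i = 0; i < frame_size / 2; i++)
{
    g729_data[2 * i + 1] = act_data[i];                  /* first half holds the odd bytes */
    g729_data[2 * i]     = act_data[frame_size / 2 + i]; /* second half holds the even bytes */
}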

G.729A Codec parameters

The table below lists the transmitted parameter indices. For each parameter, the Most Significant Bit (MSB) is transmitted first.

Symbol  Description                                   Bits (Fine)  Bits (Long)
L0      Switched MA predictor of LSP quantizer        1            1
L1      First stage vector of quantizer               7            7
L2      Second stage lower vector of LSP quantizer    5            5
L3      Second stage higher vector of LSP quantizer   5            5
P1      Pitch delay first subframe                    8            8
P0      Parity bit for pitch delay                    1            1
C1      Fixed codebook 1st subframe                   13           17
S1      Signs of fixed-codebook pulses 1st subframe   4            4
GA1     Gain codebook (stage 1) 1st subframe          3            3
GB1     Gain codebook (stage 2) 1st subframe          4            4
P2      Pitch delay second subframe                   5            5
C2      Fixed codebook 2nd subframe                   13           17
S2      Signs of fixed-codebook pulses 2nd subframe   4            4
GA2     Gain codebook (stage 1) 2nd subframe          3            3
GB2     Gain codebook (stage 2) 2nd subframe          4            4
Total                                                 80           88
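
To show how the MSB-first rule applies, here is a minimal C sketch that unpacks the parameter indices from one de-interleaved frame. It assumes the fields are packed back to back in the order of the table, with no gaps; the names `BITS_FINE`, `BITS_LONG` and `g729_unpack` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define G729_NUM_PARAMS 15

/* Field widths in table order: L0 L1 L2 L3 P1 P0 C1 S1 GA1 GB1 P2 C2 S2 GA2 GB2 */
static const uint8_t BITS_FINE[G729_NUM_PARAMS] = { 1, 7, 5, 5, 8, 1, 13, 4, 3, 4, 5, 13, 4, 3, 4 };
static const uint8_t BITS_LONG[G729_NUM_PARAMS] = { 1, 7, 5, 5, 8, 1, 17, 4, 3, 4, 5, 17, 4, 3, 4 };

/* Unpack MSB-first bit fields from `data` (already de-interleaved) into `out`.
 * Assumption: fields are packed contiguously in table order.
 * Returns the total number of bits consumed. */
size_t g729_unpack(const uint8_t *data, size_t size, const uint8_t *widths, uint32_t *out)
{
    size_t pos = 0;  /* current bit position from the start of the frame */

    for (int p = 0; p < G729_NUM_PARAMS; p++) {
        uint32_t v = 0;
        for (int b = 0; b < widths[p]; b++, pos++) {
            assert(pos / 8 < size);  /* the frame must be long enough for every field */
            int bit = (data[pos / 8] >> (7 - pos % 8)) & 1;
            v = (v << 1) | (uint32_t)bit;  /* earlier bits end up more significant */
        }
        out[p] = v;
    }
    return pos;
}
```

Called with `BITS_FINE` on a 10-byte frame it consumes 80 bits; with `BITS_LONG` on an 11-byte frame it consumes 88 bits, matching the totals in the table.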

In Long mode, one ACT frame forms two G.729 frames and thus holds 40 ms of audio data.
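
A 22-byte Long frame therefore carries two 88-bit (11-byte) G.729 frames. The sketch below shows one plausible way to separate them. It assumes the whole 22-byte frame is de-interleaved as described under Frame format and that the two G.729 frames then sit back to back in the result; the text above does not spell this out, and `act_long_split` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACT_LONG_FRAME_SIZE  22  /* bytes per ACT frame in Long mode */
#define G729_LONG_FRAME_SIZE 11  /* 88 bits per G.729 frame in Long mode */

/* De-interleave one Long ACT frame and split it into two G.729 frames.
 * Assumption: after de-interleaving, the first 11 bytes are the first
 * G.729 frame and the last 11 bytes are the second one. */
void act_long_split(const uint8_t *act, uint8_t *first, uint8_t *second)
{
    uint8_t tmp[ACT_LONG_FRAME_SIZE];

    assert(act != NULL && first != NULL && second != NULL);

    for (size_t i = 0; i < ACT_LONG_FRAME_SIZE / 2; i++) {
        tmp[2 * i + 1] = act[i];                            /* odd bytes come from the first half */
        tmp[2 * i]     = act[ACT_LONG_FRAME_SIZE / 2 + i];  /* even bytes come from the second half */
    }
    memcpy(first, tmp, G729_LONG_FRAME_SIZE);
    memcpy(second, tmp + G729_LONG_FRAME_SIZE, G729_LONG_FRAME_SIZE);
}
```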
