ACT: Difference between revisions

From MultimediaWiki
(Guesses and info)
m (Supported since over two years)
 
(8 intermediate revisions by 4 users not shown)
* Samples: http://samples.mplayerhq.hu/A-codecs/act/
* Working decoder: http://www.netac.com/Download/C635/C635%20Driver&Tools.zip
* Unfinished FFmpeg patch: http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2008-February/042714.html


ACT is used as a codec by [http://en.wikipedia.org/wiki/Chinese_MP4/MTV_Player '''Chinese-made portable MP3 players'''] when recording from the internal microphone.
ACT files are created by [http://en.wikipedia.org/wiki/Chinese_MP4/MTV_Player '''Chinese-made portable MP3 players'''] when recording from the internal microphone.




ACT is a low complexity speech compression codec. There are 3 different ways to record to this codec from a player.
ACT files use the [http://en.wikipedia.org/wiki/G.729a G.729A] speech compression codec. There are 3 different ways to record to this format from a player.


* Fine Rec
*Fine Rec
* Long Rec
*Long Rec
* Long Vox, in this mode the player will just record if there is anything to record.
*Long Vox, in this mode the player will record when voice activated.


It is unknown whether there are several bitstream formats. The software that is supposed to convert the files to WAV audio includes 2 or 3 command-line programs. When running one of them, the word G729 is mentioned along with the bitrate 4.4k. So an initial guess is that the 3 recording formats are 8 kbit/s, 4.4 kbit/s, and 4.4 kbit/s with voice activation, and that the codec is actually G.729.
==ACT File Format==
 
=== Bitstream ===
 
These 3 modes map to 2 bitstream formats, the Fine and Long bitstream formats.
 
{| border="1" cellpadding="5" style="border-collapse: collapse; border-style: dashed; border-color: #2f6fab;"
|- bgcolor="#f0f0f0" |
! Name!! Sample rate!! Frame size!! Frames per chunk !! Chunk size !! Padding bytes
|-
|Fine||8000||10||51||510||2
|-
|Long||4400||22||23||506||6
|}
 
 
Files are always aligned to a 512-byte boundary and contain a 512-byte header followed by chunks of the same size (512 bytes). Each chunk consists of a fixed number of frames and padding bytes.
 
=== Header ===
{| border="1" cellpadding="5" style="border-collapse: collapse; border-style: dashed; border-color: #2f6fab;"
|- bgcolor="#f0f0f0" |
! Name !! Offset !! Bytes !! Description
|-
|riff tag||0||4||'RIFF'
|-
|riff size||4||4||data_size + 36
|-
|wave tag||8||4||'WAVE'
|-
|fmt tag||12||4||'fmt '
|-
|wave header size||16||4||Always 0x10
|-
|wave format||20||2||PCM's 0x01<ref name="pcm">This collides with the values for regular 16bits PCM</ref>
|-
|channels||22||2||0x01(mono)
|-
|sample rate||24||4||8000 or 4400
|-
|bit rate||28||4||16000 or ?
|-
|block align||32||2||Always 0x02
|-
|bits per sample||34||2||Always 16
|-
|data_tag||36||4||'data'
|-
|data_size||40||4||chunks_count * 8 * chunk_size<ref name="chunk_size"> chunk_size is determined from the bitstream table.</ref>
|-
|reserved||44||212||filled by zero
|-
|act tag||256||1||Always 0x84
|-
|millisec||257||2||milliseconds part of duration value
|-
|seconds||259||1||seconds part of duration value
|-
|minutes||260||4||minutes part of duration value
|-
|reserved||264||248||filled by zero
|}
 
<references/>
 
=== Frame format ===
 
The first half of an ACT frame contains the odd-indexed bytes of the encoded G.729 frame, while the second half contains the even-indexed bytes (counting from 0).


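/* Rebuild the G.729 frame: the first half of act_data holds the odd-indexed
   bytes, the second half holds the even-indexed bytes. */
for (i = 0; i < frame_size / 2; i++)
{
    g729_data[2 * i + 1] = act_data[i];
    g729_data[2 * i]     = act_data[frame_size / 2 + i];
}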
 
=== G.729A Codec parameters ===
 
Table of transmitted parameter indices. For each parameter, the Most Significant Bit (MSB) is transmitted first.
 
{| border="1" cellpadding="5" style="border-collapse: collapse; border-style: dashed; border-color: #2f6fab;"
|- bgcolor="#f0f0f0" |
! Symbol !! Description !! Bits (Fine) !! Bits (Long)
|-
| L0 || Switched MA predictor of LSP quantizer || 1 || 1
|-
| L1 || First stage vector of quantizer || 7 || 7
|-
| L2 || Second stage lower vector of LSP quantizer || 5 || 5
|-
| L3 || Second stage higher vector of LSP quantizer || 5 || 5
|-
| P1 || Pitch delay first subframe || 8 || 8
|-
| P0 || Parity bit for Pitch delay || 1 || 1
|-
| C1 || Fixed codebook first subframe || 13 || 17
|-
| S1 || Signs of fixed-codebook pulses 1st subframe || 4 || 4
|-
| GA1 || Gain codebook (stage 1) 1st subframe || 3 || 3
|-
| GB1 || Gain codebook (stage 2) 1st subframe || 4 || 4
|-
| P2 || Pitch delay second subframe || 5 || 5
|-
| C2 || Fixed codebook 2nd subframe || 13 || 17
|-
| S2 || Signs of fixed-codebook pulses 2nd subframe || 4 || 4
|-
| GA2 || Gain codebook (stage 1) 2nd subframe || 3 || 3
|-
| GB2 || Gain codebook (stage 2) 2nd subframe || 4 || 4
|-
! !! Total !! 80 !! 88
|}


The first 512 bytes seem to be a regular RIFF header. The function at 0x00401000 is in the call tree when converting act files.
In Long mode, one ACT frame forms two G.729 frames and thus 40 ms of audio data.


[[Category:Audio Codecs]]
[[Category:Undiscovered Audio Codecs]]

Latest revision as of 03:21, 2 April 2014

ACT files are created by Chinese-made portable MP3 players when recording from the internal microphone.


ACT files use the G.729A speech compression codec. There are 3 different ways to record to this format from a player.

  • Fine Rec
  • Long Rec
  • Long Vox, in this mode the player will record when voice activated.

ACT File Format

Bitstream

These 3 modes map to 2 bitstream formats, the Fine and Long bitstream formats.

Name   Sample rate (Hz)   Frame size (bytes)   Frames per chunk   Chunk size (bytes)   Padding bytes
Fine   8000               10                   51                 510                  2
Long   4400               22                   23                 506                  6


Files are always aligned to a 512-byte boundary and contain a 512-byte header followed by chunks of the same size (512 bytes). Each chunk consists of a fixed number of frames and padding bytes.
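
The layout described above can be turned into a small offset calculation. The following is only a sketch: the `ActMode` type and the `act_frame_offset` name are hypothetical, and it assumes that the padding bytes sit at the end of each chunk, which the description suggests but does not state outright.

```c
#include <stddef.h>

/* Sizes taken from the description: 512-byte header, 512-byte chunks. */
enum { ACT_HEADER_SIZE = 512, ACT_CHUNK_SIZE = 512 };

/* Hypothetical helper type, not taken from any real decoder. */
typedef struct {
    int frame_size;        /* bytes per ACT frame */
    int frames_per_chunk;  /* frames stored in one chunk */
} ActMode;

static const ActMode ACT_FINE = { 10, 51 };  /* 51 * 10 + 2 padding = 512 */
static const ActMode ACT_LONG = { 22, 23 };  /* 23 * 22 + 6 padding = 512 */

/* Byte offset of frame n (0-based) from the start of the file.
   Assumes the padding bytes follow the frames inside each chunk. */
static size_t act_frame_offset(const ActMode *m, size_t n)
{
    size_t chunk = n / (size_t)m->frames_per_chunk;
    size_t index = n % (size_t)m->frames_per_chunk;
    return ACT_HEADER_SIZE + chunk * ACT_CHUNK_SIZE
         + index * (size_t)m->frame_size;
}
```

With these numbers, the first frame of either mode starts at byte 512, and the first frame of the second chunk starts at byte 1024.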

Header

Name              Offset  Bytes  Description
riff tag          0       4      'RIFF'
riff size         4       4      data_size + 36
wave tag          8       4      'WAVE'
fmt tag           12      4      'fmt '
wave header size  16      4      Always 0x10
wave format       20      2      PCM's 0x01<ref name="pcm">This collides with the values for regular 16bits PCM</ref>
channels          22      2      0x01(mono)
sample rate       24      4      8000 or 4400
bit rate          28      4      16000 or ?
block align       32      2      Always 0x02
bits per sample   34      2      Always 16
data_tag          36      4      'data'
data_size         40      4      chunks_count * 8 * chunk_size<ref name="chunk_size"> chunk_size is determined from the bitstream table.</ref>
reserved          44      212    filled by zero
act tag           256     1      Always 0x84
millisec          257     2      milliseconds part of duration value
seconds           259     1      seconds part of duration value
minutes           260     4      minutes part of duration value
reserved          264     248    filled by zero

<references/>
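
As an illustration of the header table, here is a minimal sketch that checks the fixed tags and reads a few fields. The `ActHeader` struct and the function names are hypothetical. RIFF fields are little-endian; that the ACT duration fields at offsets 257 to 263 are little-endian as well is an assumption, since the table does not give their byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the fields read below. */
typedef struct {
    uint32_t sample_rate;  /* offset 24: 8000 or 4400 */
    uint32_t data_size;    /* offset 40 */
    uint16_t millisec;     /* offset 257 */
    uint8_t  seconds;      /* offset 259 */
    uint32_t minutes;      /* offset 260 */
} ActHeader;

/* Read an n-byte little-endian unsigned integer (n <= 4). */
static uint32_t rd_le(const uint8_t *p, int n)
{
    uint32_t v = 0;
    for (int i = n - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Returns 0 on success, -1 if a fixed tag does not match the table. */
static int act_parse_header(const uint8_t hdr[512], ActHeader *out)
{
    if (memcmp(hdr, "RIFF", 4) || memcmp(hdr + 8, "WAVE", 4) ||
        memcmp(hdr + 12, "fmt ", 4) || memcmp(hdr + 36, "data", 4) ||
        hdr[256] != 0x84)
        return -1;
    out->sample_rate = rd_le(hdr + 24, 4);
    out->data_size   = rd_le(hdr + 40, 4);
    out->millisec    = (uint16_t)rd_le(hdr + 257, 2);  /* assumed little-endian */
    out->seconds     = hdr[259];
    out->minutes     = rd_le(hdr + 260, 4);            /* assumed little-endian */
    return 0;
}
```

Because the wave format field collides with regular 16-bit PCM, the 0x84 byte at offset 256 is the field this sketch relies on to tell an ACT file apart from a plain WAV file.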

Frame format

The first half of an ACT frame contains the odd-indexed bytes of the encoded G.729 frame, while the second half contains the even-indexed bytes (counting from 0).

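/* Rebuild the G.729 frame: the first half of act_data holds the odd-indexed
   bytes, the second half holds the even-indexed bytes. */
for (i = 0; i < frame_size / 2; i++)
{
    g729_data[2 * i + 1] = act_data[i];
    g729_data[2 * i]     = act_data[frame_size / 2 + i];
}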

G.729A Codec parameters

Table of transmitted parameter indices. For each parameter, the Most Significant Bit (MSB) is transmitted first.

Symbol  Description                                   Bits (Fine)  Bits (Long)
L0      Switched MA predictor of LSP quantizer        1            1
L1      First stage vector of quantizer               7            7
L2      Second stage lower vector of LSP quantizer    5            5
L3      Second stage higher vector of LSP quantizer   5            5
P1      Pitch delay first subframe                    8            8
P0      Parity bit for Pitch delay                    1            1
C1      Fixed codebook first subframe                 13           17
S1      Signs of fixed-codebook pulses 1st subframe   4            4
GA1     Gain codebook (stage 1) 1st subframe          3            3
GB1     Gain codebook (stage 2) 1st subframe          4            4
P2      Pitch delay second subframe                   5            5
C2      Fixed codebook 2nd subframe                   13           17
S2      Signs of fixed-codebook pulses 2nd subframe   4            4
GA2     Gain codebook (stage 1) 2nd subframe          3            3
GB2     Gain codebook (stage 2) 2nd subframe          4            4
        Total                                         80           88
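
The Fine column of the table can be read with a simple MSB-first bit reader. This is a sketch rather than a reference implementation: `fine_bits` and `g729_unpack_fine` are hypothetical names, and it assumes that the parameters are packed back to back in table order, with bit 7 of each byte coming first, in a 10-byte frame that has already been de-interleaved.

```c
#include <stdint.h>

/* Bit widths of the Fine parameters in table order:
   L0 L1 L2 L3 P1 P0 C1 S1 GA1 GB1 P2 C2 S2 GA2 GB2 (80 bits in total). */
static const int fine_bits[15] = { 1, 7, 5, 5, 8, 1, 13, 4, 3, 4, 5, 13, 4, 3, 4 };

/* Unpack one de-interleaved 10-byte frame into 15 parameter indices.
   Assumes MSB-first packing with no gaps between parameters. */
static void g729_unpack_fine(const uint8_t frame[10], uint32_t params[15])
{
    int pos = 0;  /* current bit position within the frame */
    for (int p = 0; p < 15; p++) {
        uint32_t v = 0;
        for (int b = 0; b < fine_bits[p]; b++, pos++)
            v = (v << 1) | ((frame[pos >> 3] >> (7 - (pos & 7))) & 1u);
        params[p] = v;
    }
}
```

The Long layout is not covered here: the table gives 88 bits per G.729 frame, but how two such frames share one 22-byte ACT frame is not described.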

In Long mode, one ACT frame forms two G.729 frames and thus 40 ms of audio data.