FFmpeg technical

From MultimediaWiki

Demuxer stuff

How to use demuxer with raw data

Write a parser. It has two main functions, parse and split: parse reconstructs exactly one frame from raw packets, and split extracts extradata from the same stream.

Parser declarations will look like this:

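 typedef struct MyParseContext{
    ParseContext pc; /* always include this first */
    /* ... other parser-specific data ... */
 } MyParseContext;
 AVCodecParser some_parser = {
    { CODEC_ID_1 [, CODEC_ID_2, ...] },
    sizeof (MyParseContext),
    NULL,           /* parser_init: usually not needed */
    my_parse,       /* parse function (placeholder name) */
    ff_parse_close, /* again, use the standard close function */
    my_split,       /* split function (placeholder name), or NULL if unused */
 };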

Here is the code for a parser that splits frames on a marker:

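 static int xxx_find_frame_end(XXXParseContext *pc1, const uint8_t *buf,
                               int buf_size) {
    int start_found, i;
    uint32_t state;
    ParseContext *pc = &pc1->pc;
    start_found= pc->frame_start_found;
    state= pc->state;
    i= 0;
    if(!start_found){
        /* look for the marker that opens the current frame */
        for(i=0; i<buf_size; i++){
            state= (state<<8) | buf[i];
            if(state == MARKER){
                i++; /* do not feed the same byte into state twice */
                start_found= 1;
                break;
            }
        }
    }
    if(start_found){
        /* look for the next marker, which ends the current frame */
        for(; i<buf_size; i++){
            state= (state<<8) | buf[i];
            if(state == MARKER){
                pc->frame_start_found= 0;
                pc->state= -1;
                return i-3; /* offset of the first marker byte */
            }
        }
    }
    pc->frame_start_found= start_found;
    pc->state= state;
    return END_NOT_FOUND;
 }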
 static int xxx_parse(AVCodecParserContext *s,
                      AVCodecContext *avctx,
                      uint8_t **poutbuf, int *poutbuf_size,
                      const uint8_t *buf, int buf_size)
 {
    XXXParseContext *pc1 = s->priv_data;
    ParseContext *pc = &pc1->pc;
    int next;
    if(s->flags & PARSER_FLAG_COMPLETE_FRAMES){
        /* input is already split into frames */
        next= buf_size;
    }else{
        next= xxx_find_frame_end(pc1, buf, buf_size);
        if (ff_combine_frame(pc, next, (uint8_t **)&buf, &buf_size) < 0) {
            /* frame is not complete yet, ask for more data */
            *poutbuf = NULL;
            *poutbuf_size = 0;
            return buf_size;
        }
    }
    *poutbuf = (uint8_t *)buf;
    *poutbuf_size = buf_size;
    return next;
 }
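
The rolling-state search can be tried in isolation. The following is a minimal standalone sketch that does not use libavcodec; DEMO_MARKER, DEMO_NOT_FOUND, DemoScan and the demo_* functions are hypothetical names invented for illustration. It shows why the last four bytes are carried in a 32-bit state between calls: a marker that straddles two input buffers is still found, and the returned offset is then negative.

```c
#include <stdint.h>

#define DEMO_MARKER    0x000001B6u  /* hypothetical 4-byte frame marker */
#define DEMO_NOT_FOUND (-100)       /* stand-in for END_NOT_FOUND */

typedef struct DemoScan {
    uint32_t state;        /* last four bytes seen, carried across calls */
    int      start_found;  /* 1 once the current frame's marker was seen */
} DemoScan;

static void demo_scan_init(DemoScan *sc)
{
    sc->state = 0xFFFFFFFFu;
    sc->start_found = 0;
}

/* Returns the offset, relative to buf, at which the next frame's marker
 * begins (negative if that marker started in an earlier buffer), or
 * DEMO_NOT_FOUND if more data is needed. */
static int demo_find_frame_end(DemoScan *sc, const uint8_t *buf, int buf_size)
{
    int i = 0;
    uint32_t state = sc->state;

    if (!sc->start_found) {
        for (; i < buf_size; i++) {
            state = (state << 8) | buf[i];
            if (state == DEMO_MARKER) {
                i++;                 /* continue after the marker */
                sc->start_found = 1;
                break;
            }
        }
    }
    if (sc->start_found) {
        for (; i < buf_size; i++) {
            state = (state << 8) | buf[i];
            if (state == DEMO_MARKER) {
                sc->start_found = 0;
                sc->state = 0xFFFFFFFFu;
                return i - 3;        /* first byte of the next marker */
            }
        }
    }
    sc->state = state;
    return DEMO_NOT_FOUND;
}
```

Feeding the bytes 00 00 01 B6 AA BB 00 00 01 B6 in one call yields 6; feeding the first eight bytes and then the last two yields DEMO_NOT_FOUND followed by -2.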

Decoder stuff

How to use delta frames

There are two ways: either keep your own buffer for frame data, or re-acquire the previous frame and modify it in place. The latter is done by calling reget_buffer() instead of get_buffer(). Usual code:

   if(avctx->get_buffer(avctx, &c->pic) < 0){
       av_log(avctx, AV_LOG_ERROR, "get_buffer() failed\n");
       return -1;
   }

New code:

   c->pic.reference = 1;
   if(avctx->reget_buffer(avctx, &c->pic) < 0){
       av_log(avctx, AV_LOG_ERROR, "reget_buffer() failed\n");
       return -1;
   }
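
To illustrate why keeping the old frame matters, here is a standalone sketch of delta-frame decoding into a buffer that persists between calls. It does not use libavcodec: DemoDecoder, demo_decode_delta and the one-flag-per-block bitstream layout are all invented for this example. Skipped blocks are simply left untouched, which only works because the previous picture is still in the buffer.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_W     8
#define DEMO_H     8
#define DEMO_BLOCK 4

typedef struct DemoDecoder {
    /* kept between calls, playing the role of the re-acquired frame */
    uint8_t frame[DEMO_H][DEMO_W];
} DemoDecoder;

/* Hypothetical delta format: for each 4x4 block in raster order, one flag
 * byte (0 = skip, nonzero = fill) followed by a fill value if the flag is
 * set. Returns the number of bytes consumed, or -1 on truncated input. */
static int demo_decode_delta(DemoDecoder *d, const uint8_t *buf, int buf_size)
{
    int pos = 0;

    for (int by = 0; by < DEMO_H; by += DEMO_BLOCK) {
        for (int bx = 0; bx < DEMO_W; bx += DEMO_BLOCK) {
            if (pos >= buf_size)
                return -1;
            if (buf[pos++] == 0)
                continue;            /* skipped: previous pixels stay */
            if (pos >= buf_size)
                return -1;
            uint8_t v = buf[pos++];
            for (int y = 0; y < DEMO_BLOCK; y++)
                memset(&d->frame[by + y][bx], v, DEMO_BLOCK);
        }
    }
    return pos;
}
```

Decoding a frame that fills all four blocks and then a second frame that updates only one block leaves the other three blocks with the values from the first frame.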

How to use bitstream

There are two bitstream reading modes: MSB-first (the default) and 32-bit little-endian words. To use the second, #define BITSTREAM_READER_LE before including bitstream.h.
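
The difference between the two bit orders can be sketched without libavcodec. The helpers below, demo_read_msb and demo_read_le, are hypothetical and read bit by bit for clarity; they only illustrate which bits each mode is assumed to return for the same bytes, not how the library implements it.

```c
#include <stdint.h>

/* MSB-first: bit 7 of byte 0 is the first bit of the stream, and the
 * first bit read becomes the most significant bit of the result. */
static unsigned demo_read_msb(const uint8_t *buf, int pos, int n)
{
    unsigned v = 0;
    for (int i = 0; i < n; i++, pos++)
        v = (v << 1) | ((buf[pos >> 3] >> (7 - (pos & 7))) & 1u);
    return v;
}

/* Little-endian: bit 0 of byte 0 is the first bit of the stream, and the
 * first bit read becomes the least significant bit of the result. */
static unsigned demo_read_le(const uint8_t *buf, int pos, int n)
{
    unsigned v = 0;
    for (int i = 0; i < n; i++, pos++)
        v |= ((buf[pos >> 3] >> (pos & 7)) & 1u) << i;
    return v;
}
```

For the bytes A5 0F, reading 12 bits from position 0 gives 0xA50 in MSB-first order and 0xFA5 in little-endian order.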

Simple bitstream writing:

 PutBitContext pb;
 init_put_bits(&pb, buffer, max_bytes); // max_bytes is buffer size to avoid out-of-bounds write
 put_bits(&pb, 13, value); // Note that the second argument is value size in bits, not actual value
 size = put_bits_count(&pb); // get the number of bits written so far
 flush_put_bits(&pb); // call this to finish bit writing so bits won't be lost in buffer

Simple bitstream reading:

 GetBitContext gb;
 init_get_bits(&gb, buffer, buffer_size * 8); // init_get_bits() expects buffer size in _bits_
 a = get_bits(&gb, 16); // read some bits from stream
 b = get_sbits(&gb, 15); // read some bits from stream as signed integer
 c = get_bits_long(&gb, 18); // get_bits() is guaranteed to work only for bits <= 17, for greater values use get_bits_long()
 d = show_bits(&gb, 5); // peek bits but don't change position in bitstream, also has show_bits_long() counterpart
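
For readers who want to experiment without linking libavcodec, here is a minimal MSB-first write/read round trip. DemoBits and the demo_* functions are hypothetical stand-ins written for this page; they mimic the argument order of put_bits(), get_bits() and get_sbits() but work one bit at a time, so they are a sketch of the behaviour and not the real implementation.

```c
#include <assert.h>
#include <stdint.h>

typedef struct DemoBits {
    uint8_t *buf;
    int      size_bits; /* capacity in bits */
    int      pos;       /* current position in bits */
} DemoBits;

static void demo_bits_init(DemoBits *b, uint8_t *buf, int size_bytes)
{
    b->buf = buf;
    b->size_bits = size_bytes * 8;
    b->pos = 0;
}

/* Write the low n bits of value, most significant bit first (1 <= n <= 32). */
static void demo_put_bits(DemoBits *b, int n, uint32_t value)
{
    assert(n >= 1 && n <= 32 && b->pos + n <= b->size_bits);
    for (int i = n - 1; i >= 0; i--, b->pos++) {
        uint8_t mask = (uint8_t)(0x80 >> (b->pos & 7));
        if ((value >> i) & 1)
            b->buf[b->pos >> 3] |= mask;
        else
            b->buf[b->pos >> 3] &= (uint8_t)~mask;
    }
}

/* Read n bits as an unsigned value (1 <= n <= 32). */
static uint32_t demo_get_bits(DemoBits *b, int n)
{
    uint32_t v = 0;
    assert(n >= 1 && n <= 32 && b->pos + n <= b->size_bits);
    for (int i = 0; i < n; i++, b->pos++)
        v = (v << 1) | ((b->buf[b->pos >> 3] >> (7 - (b->pos & 7))) & 1u);
    return v;
}

/* Read n bits and sign-extend from bit n-1. */
static int32_t demo_get_sbits(DemoBits *b, int n)
{
    uint32_t v = demo_get_bits(b, n);
    uint32_t sign = 1u << (n - 1);
    return (int32_t)((v ^ sign) - sign);
}
```

Writing the 13-bit value 0x1ABC followed by the 5-bit value 0x1F and reading them back with demo_get_bits(&r, 13) and demo_get_sbits(&r, 5) returns 0x1ABC and -1.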

Reading variable-length codes is also quite simple.

 VLC vlc;
 init_vlc(&vlc, MAX_BITS, codes_size,
            bits, bits_skip, sizeof(bits[0]),
            codes, codes_skip, sizeof(codes[0]), flags);
 value = get_vlc2(&gb, vlc.table, MAX_BITS, wrap);

Notes on VLC reading:

  • MAX_BITS is the maximum code length, but no greater than 9 (e.g. if the maximum codeword length is 5 then MAX_BITS=5, but if the maximum codeword length is 15 then MAX_BITS=9)
  • *_skip is the distance in bytes between consecutive entries: it equals sizeof() of the element type if bits and codes are stored in separate tables, and ((uint8_t*)&bits[1])-((uint8_t*)&bits[0]) in general. For example, if you store everything in struct { char bits; short value;} codes[100] you should call
 init_vlc(&vlc, MAX_BITS, 100, &codes[0].bits, sizeof(codes[0]), 1, &codes[0].value, sizeof(codes[0]), 2, 0);
  • flags may be:
    • INIT_VLC_USE_STATIC - init codes statically (free_vlc() is not needed)
    • INIT_VLC_LE - use the alternative (little-endian) VLC reading mode
  • wrap = ceil(REAL_MAX_BITS/MAX_BITS) and should not be greater than 3. For example, if the maximum codeword size is 15 then wrap = ceil(15/9) = 2
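
The arithmetic in these notes can be written down as a few helpers. This is only a sketch: demo_vlc_table_bits, demo_vlc_depth, DemoCode and demo_bits_stride are hypothetical names, and nothing here calls init_vlc() itself. It computes MAX_BITS and wrap from the longest codeword length, and shows that the skip value for an interleaved table equals the size of one table entry.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_TABLE_BITS 9  /* cap on MAX_BITS quoted in the notes above */

/* MAX_BITS: the longest code length, capped at 9 bits. */
static int demo_vlc_table_bits(int longest_code)
{
    return longest_code < DEMO_MAX_TABLE_BITS ? longest_code
                                              : DEMO_MAX_TABLE_BITS;
}

/* wrap: ceil(longest_code / table_bits), computed in integer arithmetic. */
static int demo_vlc_depth(int longest_code, int table_bits)
{
    return (longest_code + table_bits - 1) / table_bits;
}

/* Interleaved layout matching the struct in the example above. */
typedef struct DemoCode {
    char  bits;
    short value;
} DemoCode;

/* Byte distance between the same field of two consecutive entries; this is
 * the value to pass as *_skip when bits and codes share one table. */
static ptrdiff_t demo_bits_stride(const DemoCode *codes)
{
    return (const uint8_t *)&codes[1].bits - (const uint8_t *)&codes[0].bits;
}
```

For a longest codeword of 15 bits this gives a table width of 9 and a depth of 2; for 5 bits it gives 5 and 1.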

How to use frame reordering

The simplest way is to adapt decode_frame() from h263dec.c to your needs (this also requires including MpegEncContext in your codec context).

Some example decoders

The simplest decoder, which does not even use bitstream reading, is the ATI VCR1 decoder (file libavcodec/vcr1.c).

For a simple codec with VLC reading, look at WNV1 (file libavcodec/wnv1.c).

For a simple delta-frame codec, look at RPZA (file libavcodec/rpza.c).


I would like to know how to do a windowed MDCT properly; for now, this is borrowed from wmadec.c and wmaenc.c.

To perform an N-point transform you need these variables:

   MDCTContext mdct;
   DSPContext dsp;
   float window[N];
   float output[2*N];
   float saved[2][N];
   float *prev = saved[0];
   float *curr = saved[1];
   float coefs[N]; // result of MDCT will be stored here

Forward windowed MDCT:

   memcpy(output, saved, sizeof(float)*N);
   for (i = 0; i < N; i++){
       output[i+N] = audio[i] / (N/2) * window[N - i - 1];
       saved [i]   = audio[i] / (N/2) * window[i];
   }
   ff_mdct_calc(&mdct, coefs, output, tmp);

Backward windowed MDCT:

   mdct.imdct_half(&mdct, curr, coefs);
   dsp.vector_fmul_window(out, prev, curr, window, N/2);
   FFSWAP(prev, curr);