FFmpeg audio API: Difference between revisions

From MultimediaWiki
Revision as of 13:39, 21 December 2007

This page is for discussing the rework of the FFmpeg audio API to meet the requirements of today's audio codecs.

Features needed

  • Generalized channel mixing (SIMD optimized) - users should be able to set their own channel mixing coefficients.
  • Codec-alterable channel mixing coefficients - the codec should be able to set and update the channel mixing coefficients at runtime (DCA supports this feature, maybe AC-3 also).
  • Output channel request function - specify the number of output channels; by default, streams with more than 2 channels should be mixed down to 2 channels.
  • Channel reordering - currently there are different orders depending on the codec.
  • SIMD optimized interleaving
  • Allow planar output - don't duplicate the interleaving code in every codec
  • Support bit depths other than 16-bit - 8-bit/24-bit/32-bit/float
  • Channel selection - ability to access one channel from a multichannel stream
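
As an illustration of the planar output and interleaving items above, here is a minimal scalar sketch of a shared interleaving routine that codecs could call instead of each carrying their own copy. It assumes planar float input and interleaved signed 16-bit output; the names float_to_s16 and interleave_float_to_s16 are hypothetical and not part of any existing FFmpeg API. A SIMD version would replace the inner loops.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert one float sample in [-1.0, 1.0] to signed
 * 16-bit with clipping. The fractional part is truncated toward zero.
 * NaN input is not handled. */
static int16_t float_to_s16(float s)
{
    float v = s * 32768.0f;
    if (v > 32767.0f)
        return 32767;
    if (v < -32768.0f)
        return -32768;
    return (int16_t)v;
}

/* Hypothetical shared routine: interleave `channels` planar float buffers
 * of nb_samples each into one 16-bit buffer laid out as
 * ch0[0], ch1[0], ..., ch0[1], ch1[1], ... */
static void interleave_float_to_s16(int16_t *dst, const float *const *src,
                                    unsigned int channels, size_t nb_samples)
{
    for (size_t n = 0; n < nb_samples; n++)
        for (unsigned int c = 0; c < channels; c++)
            dst[n * channels + c] = float_to_s16(src[c][n]);
}
```

Keeping the decoder output planar and doing the conversion in one place like this is also where support for other bit depths (8-bit/24-bit/32-bit/float) could be added without touching the codecs.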

Feature wish list

  • Dolby Pro Logic Surround Sound decoding.
  • Add a better FFT routine. (Would the KISS implementation be a good candidate?)
  • Fixed point MDCT/FFT implementations
  • Custom audio filter support. (Basing it on the video filter API ideas?)
  • Proper API for enabling SIMD optimized code.

Current ideas

Threads with previous discussions on the subject:


Mixing templates

struct codec_mix_struct

/** This struct holds the possible stream channel configurations and the possible output configurations.
 *  The code will have a table of these structs to define all the channel configurations it supports.
 *  This table will be passed to the ff_mix_init function, and the init will search through the table
 *  for a matching configuration and load the appropriate mixing coeffs.
 */
typedef struct av_codec_mix_struct {
    unsigned int inchannels;            ///< number of channels in the input stream
    unsigned int outchannels;           ///< number of channels in the requested output stream
    unsigned int stream_channel_mask;   ///< channel mask for the input stream
    unsigned int out_channel_mask;      ///< channel mask for the output data
    int8_t* mixing_coeff_index_matrix;  ///< mixing matrix that corresponds to the mixing configuration
                                        ///< Table with inchannels*outchannels index elements; a negative index means that the mixing coeff should be negated.
                                        ///< For example (simplified) [1,2] would mean coeff[1]+coeff[2] while [1,-2] would mean coeff[1]-coeff[2].
} av_codec_mix_struct;
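
To make the signed-index convention concrete, here is a minimal sketch of how a mixer might apply a mixing_coeff_index_matrix. Several things here are assumptions, not part of the proposal: the matrix is taken to be row-major with one row of inchannels indices per output channel, samples are planar floats, coeffs[0] is reserved as 0.0 (index 0 cannot carry a sign, so it is used to mean "channel not mixed in"), and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: resolve one signed index into a coefficient.
 * A negative index selects the same table entry with its sign flipped.
 * Assumes coeffs[0] is reserved (e.g. 0.0f). */
static float resolve_coeff(const float *coeffs, int8_t idx)
{
    return idx < 0 ? -coeffs[-idx] : coeffs[idx];
}

/* Hypothetical mixer: for every output channel, sum all input channels,
 * each scaled by the coefficient its index refers to.
 * index_matrix is assumed row-major: index_matrix[o * inchannels + i]. */
static void mix_planar_float(float *const *out, const float *const *in,
                             const int8_t *index_matrix, const float *coeffs,
                             unsigned int inchannels, unsigned int outchannels,
                             size_t nb_samples)
{
    for (unsigned int o = 0; o < outchannels; o++) {
        for (size_t n = 0; n < nb_samples; n++) {
            float acc = 0.0f;
            for (unsigned int i = 0; i < inchannels; i++)
                acc += resolve_coeff(coeffs, index_matrix[o * inchannels + i]) * in[i][n];
            out[o][n] = acc;
        }
    }
}
```

With coeffs = {0, 1.0, 0.5} and two input channels, the row [1,2] yields 1.0*in0 + 0.5*in1 and the row [1,-2] yields 1.0*in0 - 0.5*in1, which matches the [1,2] / [1,-2] example in the struct comment. Storing indices rather than coefficients lets the codec update the coefficient table at runtime without rebuilding the matrix.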

struct AVMIXContext

/** Main AVMIX context
 *
 */
typedef struct AVMIXContext {
    unsigned int inchannels;            ///< number of channels in the input stream
    unsigned int outchannels;           ///< number of channels in the requested output stream
    void* inchannel[MAX_MIX_CHANNELS];  ///< pointers to the inchannels in channel mask order
    void* outchannel[MAX_MIX_CHANNELS]; ///< pointers to the outchannels in channel mask order
} AVMIXContext;

function ff_mix_init

/** Initialization routine for the libavcodec multichannel audio mixer
 *
 * The multichannel mixer does not know the "position" of the speakers, and it does not need to. But
 * depending on the mixing matrix it will unknowingly reorder channels to the native order.
 *
 * @param[in,out] mix
 * This is the actual mixing context. It will hold all the information needed to perform mixing.
 * If the passed argument is NULL, a context will be allocated. If it is not NULL, the passed
 * context will be reinitialized. The mix context is of fixed size and is large enough to support
 * MAX_MIX_CHANNELS channels.
 *
 * @param[in] inchannels
 * Number of input channels. This is set by the input stream. This value will be stored in the mixing context.
 *
 * @param[in] outchannels
 * Number of output channels. This is set by the user. This value will be stored in the mixing context.
 *
 * @param[in] stream_channel_mask
 * This parameter describes the channel configuration of the codec. This info is taken from
 * the input stream and converted to a channel mask.
 *
 * @param[in] out_channel_mask
 * This mask contains the output channel configuration selected by the user.
 *
 * @param[in] mix_table
 * Table of av_codec_mix_structs.
 *
 * @param[in] mixing_coeffs_table
 * Table with mixing coeffs. This is the table the mixing_coeff_index_matrix refers to. It is declared as void* to
 * make a future addition of fixed point mixing possible.
 *
 * @return
 * The init will look up a matching mixing configuration with the help of the in and out channel masks.
 * Returns 0 if there is no matching configuration, 1 otherwise.
 */
int ff_mix_init(AVMIXContext* mix, unsigned int inchannels, unsigned int outchannels, unsigned int stream_channel_mask,
                unsigned int out_channel_mask, av_codec_mix_table* mix_table, void* mixing_coeffs_table);


select_mixing_matrix

/** Function to get the appropriate mixing_coeff_index_matrix.
 *
 * @param[in] inchannels
 * Number of input channels. This is set by the input stream. This value will be stored in the mixing context.
 *
 * @param[in] outchannels
 * Number of output channels. This is set by the user. This value will be stored in the mixing context.
 *
 * @param[in] stream_channel_mask
 * This parameter describes the channel configuration of the codec. This info is taken from
 * the input stream and converted to a channel mask.
 *
 * @param[in] out_channel_mask
 * This mask contains the output channel configuration selected by the user.
 *
 * @param[in] mix_table
 * Table of av_codec_mix_structs.
 *
 * @return
 * A mixing_coeff_index_matrix if the configuration could be found in the mix_table, NULL if not.
 */
int8_t* select_mixing_matrix(unsigned int inchannels, unsigned int outchannels, unsigned int stream_channel_mask,
                             unsigned int out_channel_mask, av_codec_mix_table* mix_table);
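
The lookup described above could be sketched as a linear scan over the table. This is only an illustration under stated assumptions: the prototype does not say how the table length is known, so the sketch assumes the table ends with a sentinel entry whose inchannels is 0. The type example_mix_entry mirrors av_codec_mix_struct, and both it and example_select_mixing_matrix are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of av_codec_mix_struct, used only for this sketch. */
typedef struct example_mix_entry {
    unsigned int inchannels;
    unsigned int outchannels;
    unsigned int stream_channel_mask;
    unsigned int out_channel_mask;
    const int8_t *mixing_coeff_index_matrix;
} example_mix_entry;

/* Scan the table for an entry matching all four keys.
 * Assumption: the table is terminated by an entry with inchannels == 0.
 * Returns the matching index matrix, or NULL if nothing matches. */
static const int8_t *example_select_mixing_matrix(unsigned int inchannels,
                                                  unsigned int outchannels,
                                                  unsigned int stream_channel_mask,
                                                  unsigned int out_channel_mask,
                                                  const example_mix_entry *mix_table)
{
    if (!mix_table)
        return NULL;
    for (const example_mix_entry *e = mix_table; e->inchannels != 0; e++) {
        if (e->inchannels == inchannels &&
            e->outchannels == outchannels &&
            e->stream_channel_mask == stream_channel_mask &&
            e->out_channel_mask == out_channel_mask)
            return e->mixing_coeff_index_matrix;
    }
    return NULL; /* configuration not implemented for this codec */
}
```

An init routine built on this could map a NULL result to its "no matching configuration" return value, leaving the caller to decide how to handle an unsupported request.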