FFmpeg Wishlist

Temporary FFmpeg wish/todo list:

FFmpeg module refactoring

Multimedia programs tend to be highly modular in design and FFmpeg is no exception. However, it does not make the best use of independent modules. The major task in refactoring FFmpeg modules will be to reorganize code so that each individual codec or muxer/demuxer module can be easily enabled and disabled at compile time. This task also entails creating a test suite that can automatically enable each module, one at a time, and validate that FFmpeg still builds and works.

Decoders

H.264 (MPEG-4 AVC) decoder improvements/enhancements:
- Add PAFF to the existing H.264 decoder
- Assembly optimizations (like SIMD for 3DNow, MMX/MMX2, SSE/SSE2/SSE3 and AltiVec)
VC-1 (a.k.a. Microsoft Windows Media Video 3 or 9) decoder improvements/enhancements:
- Assembly optimizations (like SIMD for 3DNow, MMX/MMX2, SSE/SSE2/SSE3 and AltiVec)
LGPL'ed LC-AAC and HE-AAC (Advanced Audio Coding) decoder, based on the emerging open specification Wiki document.
- Also add a aac parser so -acodec copy to mp4/mov will work
- Assembly optimizations (like SIMD for 3DNow, MMX/MMX2, SSE/SSE2/SSE3 and AltiVec)
aacPlus (a.k.a. AAC+) decoder Note: aacPlus v1 is HE-AAC + SBR, aacPlus v2 is HE-AAC + SBR + PS.
mp3PRO decoder (Note: mp3PRO is MP3 + SBR. Standard MP3 decoders can decode mp3PRO encoded files/streams but without SBR you do not get the full quality.
MPEG Surround decoder/parser (for all audio but especially MP3/mp3PRO and AAC/aacPlus as those are in use today).
ALAC decoder improvements/enhancements:
- Clean up the existing alac decoder code
ffsvq3 (FFmpeg SVQ3) decoder improvements/enhancements:
- add b-frame support to the ffsvq3 decoder
amr decoder
integrate speex (glue code or native)
g723.1/rtp decoder
g729/rtp decoder
Monkey's Audio decoder (APE)
- [1] (original C++ SDK sources)
- [2] (original sources port for non-win32 platforms)
- [3]LGPLed Java implementation
JPEG2000 decoder
Dirac decoder (Dirac is a audio codec developed by BBC as an open standard, shares shares some features with Snow)
GSM decoder
QCELP decoder spec is c.s0020 and source is c.r0020
AMV decoder, http://scrub50187.com/ has the creator. wikipedia has articles about the format also.
integer only vorbis decoder (to replace tremor)
Fix "[rv20 @ 009C8BF0]unknown bit3 set" in rv20 decoder
Add j-type picture support to the existing wmv8 decoder
MLP decoder
Indeo 4 decoder and Indeo 5 decoder
XEB - the RatDVD video codec (stored in XVO container format)
VNC decoder, files created by vncrec. Re-use code from VMware Video decoder http://www.sodan.org/~penny/vncrec/
Additional game formats support:
- Gremlin Digital Video
- ARMovie/RPL
- ESCAPE
- M95
- XMV / FMV (Xbox Media Video) decoder/demuxer (for the first Microsoft Xbox game-console, based on WMV8)
  - Working legal source code for a decoder/demuxer can be found here but it is copyrighted and without open source licence: http://sourceforge.net/tracker/index.php?func=detail&aid=1097094&group_id=53761&atid=471491 also look at http://thread.gmane.org/gmane.comp.video.ffmpeg.devel/25207/focus=25224 and http://www.maxconsole.net/?mode=news&newsid=411 for hints/tips

Encoders

MP3 encoder (a simple fixedpoint implementation using existing infrastructure in ffmpeg when possible)
Implement a good psychoacoustic model
- Support the usage of this psychoacoustic model from the AC3, MP2 and other audio encoders
Snow - the FFmpeg's projects own video codec
- Write a formal specification of the codec and document the implementation.
  - DocBook, rst, or doxygen for the code documentation steps
  - A complete roadmap will be decided together with the candidates
- Improve the optimizations and refine the implementation in order to have FFmpeg playing Snow in more constrained environments
  - Avoid cache trashing and vectorize the code
  - AltiVec, MMX/SSE, VIS or any other vector extension assembly/C intrinsics for your favourite arch
  - AltiVec would preferably be written as intrinsics
  - SSE/MMX/3dNow! code as inline assembly.
- multiple reference frames improvements
  - decide which frames to keep (e.g. long-term refs)
  - some changes to the mv prediction code
- non-translational motion-compensation
  - estimate non translational parameters per block by using surrounding motion vectors
  - add a ac coded bit per block to switch between translational and non-translational MC
  - borrow the non translational MC code from libmpcodecs/vf_perspective.c
  - some changes to the encoder to decide between translational and non t.
- Trellis quantization (select quantized coefficient so as to minimize the rate distrortion
- 4x4 sized block support (we have 16x16 and 8x8 currently)
- 1/8 pel motion compensation / estimation support (pretty much just encoder changes needed which in case of the iterative me should be trivial)
- improve the intra color decision

Demuxers

iff demuxer (with anim and sound decoding)
- xine has a demuxer/decoder for iff
g723.1 / rtp demuxer
g729 / rtp demuxer
VIVO demuxer, look at the mplayer vivo demuxer for reference
XMV / FMV (Xbox Media Video) demuxer (from Microsoft and based on WMV8)
- Open source and legal demuxer/decoder but copyrighted source/specification:

http://sourceforge.net/tracker/index.php?func=detail&aid=1097094&group_id=53761&atid=471491 also look at http://thread.gmane.org/gmane.comp.video.ffmpeg.devel/25207/focus=25224 and http://www.maxconsole.net/?mode=news&newsid=411 for hints/tips

AMV demuxer, http://scrub50187.com/ has the creator. wikipedia has articles about the format also.
FluxDVD / RatDVD demuxer for XVO files (Note! RatDVD is the predecesor of FluxDVD)
NUT demuxer (and container format specifications enhancements/improvements)
- improve the documentation available for FFmpeg's NUT implementation
  - Study the current specification and clarify it, putting it in a more verbose and understandable form.
  - Conversion to rst or docbook isn't really required, but would be greatly appreciated.
- Update the demuxer using libnut-produced files as testcases
  - Testcase 1: it should demux the complete file correctly as sequential reads
  - Testcase 2: it should seek correctly in the complete file
  - Testcase 3: same as 1 but with corrupted file
  - Testcase 4: same as 2 but with corrupted file
- Update the muxer using libnut and ffnut demuxer to validate the produced files
  - Make sure that the interleaving rules are respected.

Muxers

DVB (MPEG-TS) muxer inside DVB containers
- MPEG-1/2 video-streams inside DVB containers
- MPEG-4 ASP video-streams inside DVB containers
- MPEG-4 AVC (H.264) video-streams inside DVB containers
- AC3 audio-streams inside DVB containers
  - Mutiple AC3 audio-streams inside DVB containers
- MP3 audio-streams inside DVB containers
  - Mutiple MP3 audio-streams inside DVB containers
NSV muxer
NSA muxer
NUT muxer (and container format specifications enhancements/improvements)
- improve the documentation available for FFmpeg's NUT implementation
  - Study the current specification and clarify it, putting it in a more verbose and understandable form.
  - Conversion to rst or docbook isn't really required, but would be greatly appreciated.
- Update the demuxer using libnut-produced files as testcases
  - Testcase 1: it should demux the complete file correctly as sequential reads
  - Testcase 2: it should seek correctly in the complete file
  - Testcase 3: same as 1 but with corrupted file
  - Testcase 4: same as 2 but with corrupted file
- Update the muxer using libnut and ffnut demuxer to validate the produced files
  - Make sure that the interleaving rules are respected.

libavformat API improvements/enhancements

The libavformat API is the interface of FFmpeg that is responsible for splitting apart encoded audio and video data from multimedia files (demuxing) and putting it together in new multimedia files (muxing). While libavcodec, (the FFmpeg component that encodes and decodes audio and video data), enjoys widespread use among an impressive array of multimedia projects libavformat has not seen the same level of adoption. These tasks should entail investigating how to improve the libavformat API, how it interacts with client applications and input layers, developing proof of concept code for a new API and working to port existing muxers and demuxers to the new API. Reorganizing and refactoring FFmpeg libavformat module code so that each individual muxer/demuxer module can be easily enabled and disabled at compile time.

Study the current specification and clarify what the issues are, putting it in a verbose and understandable form.
Write a formal specification and roadmap of the new API and document the implementation.
DocBook, rst, or doxygen for the code documentation steps

Microsoft DirectShow and DirectX and MediaFoundation

Native DirectShow support

Option to build FFmpeg decoder/encoder/demuxer/muxer and post-processing filters for the DirectShow API for Windows by Microsoft, (the native DirectX 8/9 Direct3D overlay for video playback), so that FFmpeg has native support to be compiled for DirectShow and thus be used directly by players that use DirectShow. It should be noted that there are already one very popular fork of FFmpeg available that has native DirectShow support and that is FFdshow, (a other DirectShow implementation of FFmpeg is available as "drffmpeg" which is part of the DrDivX project).

DirectShow headers and compiling support http://en.wikipedia.org/wiki/DirectShow
- Specifications: http://msdn2.microsoft.com/en-us/library/ms783323.aspx
- http://msdn2.microsoft.com/en-us/library/ms788119.aspx
- Sources: http://sourceforge.net/projects/ffdshow-tryout/
Native GUI for codecs/filter configuration like FFDshow
- http://en.wikipedia.org/wiki/Ffdshow
- Sources: http://sourceforge.net/projects/ffdshow-tryout/ (FFdshow),

http://sourceforge.net/projects/drdivx/ (drffmpeg)

Native MediaFoundation support

Microsoft Media Foundation API usage for optimized digital media playback on Microsoft Windows Vista™
- http://msdn2.microsoft.com/en-us/library/ms694197.aspx
Multimedia Class Scheduler Service (MMCSS) class
Enhanced Video Renderer (EVR) class
Streaming Audio Renderer (SAR) class

DirectX Video Acceleration (DXVA)

DirectX Video Acceleration (DXVA) 1.0 AND 2.0 support, (for GPU accelerated video decoding under Windows).
Note! Native DirectShow support in FFmpeg is before DirectX VA (DXVA) video decoding support can be added!

DXVA 1.0 (DirectX SDK) specifications: http://msdn2.microsoft.com/en-us/library/ms798379.aspx
- http://en.wikipedia.org/wiki/DXVA
DXVA 2.0 (Windows SDK) specifications: http://msdn2.microsoft.com/en-us/library/ms788119.aspx

http://download.microsoft.com/download/5/b/9/5b97017b-e28a-4bae-ba48-174cf47d23cd/MED134_WH06.ppt

Features

Create a new audio API system
radix-4 fft routines
Grabbing from video devices under windows
- Apply this VFW capture patch http://lists.mplayerhq.hu/pipermail/ffmpeg-user/2006-December/005607.html
- Create a DirectShow patch
-[h|v]flip options for ffplay
Improved exition documentation and add additional means to document
- Web
- WIKI
- manpage
Add XING and/or VBRI header parsing support to the MP3 decoder/parser (for VBR encoded audio files)
- Possibly port code from this MPlayer patch: http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2007-March/050609.html
Add GAIN (MP3Gain) header parsing support to the MP3 decoder/parser
- Also add GAIN (AACGain) header parsing support to the AAC decoder/parser

Subtitles

Create a common 'subtitles parser library' (and/or an API system for adding support for additional subtitle formats?) - a common sub-library to FFmpeg with all subtile decoders/demuxers/parsers gathered (similar to the libpostproc and libavutils). Call it "libsubs" (or "libsub", "libsubtitles" or whatever). Move FFmpeg's existing VobSub and DVBsub code there, so no matter if they are bitmap or text-based subs all existing and future subtile code is collected there. This will help reduce future code replication by sharing common code, thus making it easier to add support for additional subtitles.
- Maybe use MPlayer's recently added "libass" (SSA/ASS subtile reader) as a base for such a common library?
Support for advanced SSA/ASS rendering
- Possible source are libass or the asa library
Support bold, italic, underline, RGB colors, size changes and font changes for a whole line or part of one line
Line 23 signal (a.k.a. "Wide-screen signal") detecting and use for DVD-Video (VobSub)
Support for the subtitles HTML tags
Capability of displaying subtitles with no video enabled (for example for audio-books)
Support for Karaoke subtitles (for kar and cdg, etc.)
Dual-subtitle-display (display two subtitles/languages at the same time, one at the bottom as normal plus one at the top of the screen)
Capability of moving the subtitles in the picture (freetype renderer)
Support more subtitle formats (text and bitmap-based):
- Closed captioning (CC) subtile support - (Closed captions for the deaf and hard of hearing, also known as "Line 21 captioning", uses VobSub bitmaps)
  - xine have a SPU decoder for subpictures and Closed Captions software decoding
- DirectVobSub (VSFilter) - standard VobSubs (DVD-Video subtitles) embedded in AVI containers
- DivX Subtitles (XSUB) display/reader/decoder (Note: bitmap based subtitle, similar to VobSub)
- SubRip (.srt) subtile support (Note: simple text-based based subtitle with timestamp)
- Subviewer (.sub) subtile support (Note: simple text-based based subtitle with timestamp)
- MicroDVD (.sub) subtile support (Note: simple text-based based subtitle with timestamp
- Sami (.smi) subtile support (Note: simple text-based based subtitle with timestamp)
- SubStation Alpha (.ssa+.ass) subtile support (Note: advanced text-based based subtitle with timestamps and XY location on screen)
- RealText (.rt) subtile support
- PowerDivx (.psb) subtile support
- Universal Subtitle Format (.usf) subtile support
- Structured Subtitle Format (.ssf) subtile support

Misc

Add a aac parser so -acodec copy to mp4/mov will work
Clean up the h263 rtp patch found on this page: http://www.salyens.com/downloads/index.html#ffmpeg-0.4.7

Streaming Media Network Protocols

Streaming Media Network Protocols (client and server-side) improvements/enhancements and related ideas for new features/functions.

Create a common 'stream demuxer/parser library' for the client-side (and/or API for adding support for additional streaming formats?) - a LGPL'ed sub-library in FFmpeg with all stream demuxers/parsers gathered (similar to the libpostproc and libavutil). Call it "libstream" (or "stream" or whatever). Move FFmpeg's existing stream code there like HTTP and RTSP/RTP. This will help reduce future code replication by sharing common code, thus making it easier to add support for additional streaming formats. All togther making it super easy for audio/video players using FFmpeg to add all-in-one streaming support to their player.
- Maybe use either MPlayer's "stream" library structure, LIVE555, or probebly the better libnms (from NeMeSi) as a base for such a common library?
Add support for additional streaming protocols (on the client side) and improve/enhance support for existing protocols:
- HTTP (Hypertext Transfer Protocol) client
- UDP (User Datagram Protocol) client
- RTSP - Real-Time Streaming Protocol (RFC2326) client
- RTP/RTCP - Real-Time Transport Protocol/RTP Control Protocol (RFC3550) client
- RTP Profile for Audio and Video Conferences with Minimal Control (RFC3551) client
- RealMedia RTSP/RDT (Real Time Streaming Protocol / Real Data Transport) client
- SDP (Service Discovery Protocol) / SSDP (Simple Service Discovery Protocol) client
- MMS (Microsoft Media Services) client
FFServer updating (and improving)
- FFServer code hasn't been update for quite a while.
- Streaming to clients like WMP 9, 10 and 11 is broken.
- MMS server streaming support in FFServer, (especially for Linux).
  - Note that al3x has gotten something working with ffserver, you might want to ask him what needs to be done as well :) --Compn 14:22, 19 March 2007 (EDT)
  - You should also take a look at the FENG (RTSP Streaming Server) code, and [NetEmbryo (Embedded Open Media Streaming Library)] --Gamester17 11:20, 29 March 2007 (GMT+1)

Audio and video (pre-process/post-process) filters

FFmpeg's already well-known libavcodec module has become the de facto standard library for video decoding and encoding in free software projects. Unfortunately, no similar standard library has surfaced for audio/video filtering and otherwise working with audio/video stream once it has been decoded. Various multimedia projects (such as MPlayer, Xine, GStreamer, VirtualDub, etc.) have implemented their own filter systems to various degrees of success. What is needed is a high quality audio and video filter API - efficient, flexible enough to meet all the requirements which have led various projects to invent their own filter system, and yet easy to use or develop new filters with. This proposal is to implement a high quality audio/video filter library for FFmpeg, where it can be easily used by other multimedia-related software projects.

Mentor: A'rpi (has expressed interest of possibly helping with implementing a filter API in FFmpeg, he also volunteering to help porting the MPlayer filters too if a such API becomes available http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2007-April/051164.html)

Adopt MPlayer's A/V filter system or create a new API 'from scratch' for pre-process and post-process audio/video filters:
- See http://article.gmane.org/gmane.comp.video.ffmpeg.devel/39130 for michaelni's idea of what to do.
  - Also read this discussion thread on MPlayer's mailing-list:
    - http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2007-April/051142.html
      - http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2007-April/thread.html#51142
- Take a look at other eixsting players API for filter plugins, like for example;
  - MPlayer (libmpcodecs vf_*.c filters), Xine, FFdshow, VLC, VirtualDub, GStreamer, foobar, and XMMS
- Decide on name of a such A/V filter API.
  - libavfilter (conflicts with LAVF)? libavmunge?
Create (or port) additional pre-process and post-process video filters to FFmpeg:
- General post-proc sources are MPlayer (libmpcodecs vf_*.c filters), Xine, FFdshow, VLC, VirtualDub, GStreamer, foobar, and XMMS
- More image scaling methods:
- Croping
- SSP (Statistical Post-Processing)
- DeBlocking
- DeRinging
- IVTC
- Sharpen / UnSharpen (Soften)
- ReQuantization
- Auto-Luminance
- Blurring / DeNoising / Spatial Blur / Temporal Blur
- Deinterlace (weave AND bob) filters
  - Possible sources: DScaler or y4mscale?
- 2:3 pull-down / ivtc (inverse telecine) for 24 progressive-frames on 30 FPS TV's
  - Possible sources: DScaler or y4mscale?
- NTSC => PAL, and PAL => NTSC frame-rate (FPS) adjust and reclock filter for NTSC <=> PAL conversion
  - NTSC <=> PAL frame-rate adjust FPS ratios?: 23.97 <=> 25, 24 <=> 25, 30 <=> 25, 25 <=> 30
Create (or port) additional pre-process and post-process audio filters:
- Audio re-sampler (sample rate converter) filter
  - Possible source is SRC (Secret Rabbit Code)
Create a SDK (Software Development Kit) with templates for the a/v filter API

FFmpeg Wishlist

Contents

FFmpeg module refactoring

Decoders

Encoders

Demuxers

Muxers

libavformat API improvements/enhancements

Microsoft DirectShow and DirectX and MediaFoundation

Native DirectShow support

Native MediaFoundation support

DirectX Video Acceleration (DXVA)

Features

Subtitles

Misc

Streaming Media Network Protocols

Audio and video (pre-process/post-process) filters

See Also

Navigation menu

FFmpeg Wishlist

FFmpeg module refactoring

Decoders

Encoders

Demuxers

Muxers

libavformat API improvements/enhancements

Microsoft DirectShow and DirectX and MediaFoundation

Native DirectShow support

Native MediaFoundation support

DirectX Video Acceleration (DXVA)

Features

Subtitles

Misc

Streaming Media Network Protocols

Audio and video (pre-process/post-process) filters

See Also

Navigation menu

Search