FFmpeg Wishlist
Temporary FFmpeg wish/todo list:
FFmpeg module refactoring
Multimedia programs tend to be highly modular in design and FFmpeg is no exception. However, it does not make the best use of independent modules. The major task in refactoring FFmpeg modules will be to reorganize code so that each individual codec or muxer/demuxer module can be easily enabled and disabled at compile time. This task also entails creating a test suite that can automatically enable each module, one at a time, and validate that FFmpeg still builds and works.
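The "one module at a time" test suite described above could start from a generated build matrix. The following is only a sketch: the helper name `single_module_configs` is hypothetical, and it assumes a configure script that accepts `--disable-everything` together with `--enable-KIND=NAME` switches (later FFmpeg configure scripts do; whether a given tree does is an assumption). It only builds the command lines and does not run them.

```python
from typing import Dict, List


def single_module_configs(modules: Dict[str, List[str]]) -> List[List[str]]:
    """Return one configure command per module, enabling only that module.

    `modules` maps a component kind (e.g. "decoder", "demuxer") to the
    module names of that kind. The flag spelling is an assumption about
    the configure script, not something taken from the wishlist.
    """
    commands = []
    for kind, names in modules.items():
        for name in names:
            commands.append([
                "./configure",
                "--disable-everything",       # start from an empty build
                f"--enable-{kind}={name}",    # then enable exactly one module
            ])
    return commands


# Example matrix: each printed line is one build to configure, compile and test.
for cmd in single_module_configs({"decoder": ["h264", "vc1"], "demuxer": ["nut"]}):
    print(" ".join(cmd))
```

A driver script would run each command, build, and then run a small decode or demux check, recording which modules fail in isolation.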
Decoders
- H.264 (MPEG-4 AVC) decoder improvements/enhancements:
- Add PAFF to the existing H.264 decoder
- Assembly optimizations (like SIMD for 3DNow, MMX/MMX2, SSE/SSE2/SSE3 and AltiVec)
- VC-1 (a.k.a. Microsoft Windows Media Video 3 or 9) decoder improvements/enhancements:
- Assembly optimizations (like SIMD for 3DNow, MMX/MMX2, SSE/SSE2/SSE3 and AltiVec)
- LGPL'ed LC-AAC and HE-AAC (Advanced Audio Coding) decoder, based on the emerging open specification Wiki document.
- Also add an AAC parser so that -acodec copy to mp4/mov works
- Assembly optimizations (like SIMD for 3DNow, MMX/MMX2, SSE/SSE2/SSE3 and AltiVec)
- aacPlus (a.k.a. AAC+) decoder (Note: aacPlus v1 is LC-AAC + SBR, also called HE-AAC; aacPlus v2 is LC-AAC + SBR + PS.)
- mp3PRO decoder (Note: mp3PRO is MP3 + SBR. Standard MP3 decoders can decode mp3PRO encoded files/streams, but without SBR you do not get the full quality.)
- MPEG Surround decoder/parser (for all audio but especially MP3/mp3PRO and AAC/aacPlus as those are in use today).
- ALAC decoder improvements/enhancements:
- Clean up the existing ALAC decoder code
- ffsvq3 (FFmpeg SVQ3) decoder improvements/enhancements:
- Add B-frame support to the ffsvq3 decoder
- AMR decoder
- Integrate Speex (glue code or native)
- G.723.1/RTP decoder
- G.729/RTP decoder
- Monkey's Audio decoder (APE)
- JPEG2000 decoder
- Dirac decoder (Dirac is a video codec developed by the BBC as an open standard; it shares some features with Snow)
- GSM decoder
- QCELP decoder (the spec is C.S0020 and the source is C.R0020)
- AMV decoder; http://scrub50187.com/ has the creator. Wikipedia also has articles about the format.
- Integer-only Vorbis decoder (to replace Tremor)
- Fix "[rv20 @ 009C8BF0]unknown bit3 set" in the rv20 decoder
- Add J-type picture support to the existing WMV8 decoder
- MLP decoder
- Indeo 4 decoder and Indeo 5 decoder
- XEB - the RatDVD video codec (stored in XVO container format)
- VNC decoder for files created by vncrec (http://www.sodan.org/~penny/vncrec/). Re-use code from the VMware Video decoder.
- Additional game formats support:
- Gremlin Digital Video
- ARMovie/RPL
- ESCAPE
- M95
- XMV / FMV (Xbox Media Video) decoder/demuxer (for the first Microsoft Xbox game-console, based on WMV8)
- Working, legal source code for a decoder/demuxer is available, but it is copyrighted and has no open source licence: http://sourceforge.net/tracker/index.php?func=detail&aid=1097094&group_id=53761&atid=471491. Also look at http://thread.gmane.org/gmane.comp.video.ffmpeg.devel/25207/focus=25224 and http://www.maxconsole.net/?mode=news&newsid=411 for hints/tips.
Encoders
- MP3 encoder (a simple fixed-point implementation using existing infrastructure in FFmpeg when possible)
- Implement a good psychoacoustic model
- Support the usage of this psychoacoustic model from the AC3, MP2 and other audio encoders
- Snow - the FFmpeg project's own video codec
- Write a formal specification of the codec and document the implementation.
- DocBook, rst, or doxygen for the code documentation steps
- A complete roadmap will be decided together with the candidates
- Improve the optimizations and refine the implementation in order to have FFmpeg playing Snow in more constrained environments
- Avoid cache thrashing and vectorize the code
- AltiVec, MMX/SSE, VIS or any other vector extension assembly/C intrinsics for your favourite arch
- AltiVec would preferably be written as intrinsics
- SSE/MMX/3dNow! code as inline assembly.
- multiple reference frames improvements
- decide which frames to keep (e.g. long-term refs)
- some changes to the mv prediction code
- non-translational motion-compensation
- estimate non-translational parameters per block by using surrounding motion vectors
- add an AC-coded bit per block to switch between translational and non-translational MC
- borrow the non-translational MC code from libmpcodecs/vf_perspective.c
- some changes to the encoder to decide between translational and non-translational MC
- Trellis quantization (select quantized coefficients so as to minimize the rate-distortion cost)
- 4x4 sized block support (we have 16x16 and 8x8 currently)
- 1/8 pel motion compensation / estimation support (pretty much just encoder changes needed, which in the case of the iterative ME should be trivial)
- improve the intra color decision
- QTRLE encoder
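The trellis quantization item in the Snow list above asks for quantized coefficients chosen to minimize the rate-distortion cost. The sketch below is not a trellis search and not Snow's code: it is a greedy per-coefficient simplification that only illustrates the cost function D + lambda * R. The rate model `bit_cost` and the function name `rd_quantize` are hypothetical.

```python
import math
from typing import List, Sequence


def bit_cost(level: int) -> float:
    """Hypothetical rate model: larger magnitudes cost more bits."""
    return 1.0 + 2.0 * math.log2(abs(level) + 1)


def rd_quantize(coeffs: Sequence[float], qstep: float, lam: float) -> List[int]:
    """Pick, per coefficient, the level minimizing distortion + lam * rate.

    Candidates are the nearest level, the next level toward zero, and zero.
    A real trellis search would also model dependencies between coefficients.
    """
    levels = []
    for c in coeffs:
        nearest = int(round(c / qstep))
        step = 1 if nearest > 0 else -1 if nearest < 0 else 0
        candidates = sorted({nearest, nearest - step, 0}, key=abs)

        def cost(level: int) -> float:
            distortion = (c - level * qstep) ** 2   # squared reconstruction error
            return distortion + lam * bit_cost(level)

        levels.append(min(candidates, key=cost))
    return levels
```

With `lam` set to 0 the result matches plain rounding; raising `lam` pushes small coefficients toward zero, which is the trade-off the encoder would tune.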
Demuxers
- IFF demuxer (with ANIM and sound decoding)
- Mark Cox (melbournemark at gmail dot com) is currently working on this.
- xine has a demuxer/decoder for IFF. Manfred Tremmel, the author of the xine code, has agreed to relicense it as LGPL to allow easier use in FFmpeg. Thanks Manfred.
- Xanim also contains code to support IFF; it may handle some formats not available in xine.
- g723.1 / rtp demuxer
- g729 / rtp demuxer
- VIVO demuxer; look at the MPlayer VIVO demuxer for reference
- XMV / FMV (Xbox Media Video) demuxer (from Microsoft and based on WMV8)
- Legal source code for a demuxer/decoder is available, but the source/specification is copyrighted: http://sourceforge.net/tracker/index.php?func=detail&aid=1097094&group_id=53761&atid=471491. Also look at http://thread.gmane.org/gmane.comp.video.ffmpeg.devel/25207/focus=25224 and http://www.maxconsole.net/?mode=news&newsid=411 for hints/tips.
- AMV demuxer; http://scrub50187.com/ has the creator. Wikipedia also has articles about the format.
- FluxDVD / RatDVD demuxer for XVO files (Note: RatDVD is the predecessor of FluxDVD)
- NUT demuxer (and container format specifications enhancements/improvements)
- improve the documentation available for FFmpeg's NUT implementation
- Study the current specification and clarify it, putting it in a more verbose and understandable form.
- Conversion to rst or docbook isn't really required, but would be greatly appreciated.
- Update the demuxer using libnut-produced files as testcases
- Testcase 1: it should demux the complete file correctly as sequential reads
- Testcase 2: it should seek correctly in the complete file
- Testcase 3: same as 1 but with corrupted file
- Testcase 4: same as 2 but with corrupted file
- Update the muxer using libnut and ffnut demuxer to validate the produced files
- Make sure that the interleaving rules are respected.
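Testcases 3 and 4 in the NUT list above need corrupted input files. One way to produce them reproducibly is sketched below. The helper `corrupt` and its parameters are hypothetical; it knows nothing about the NUT format and simply damages bytes at seeded pseudo-random positions, so a failing case can be regenerated from the same seed.

```python
import random


def corrupt(data: bytes, n_errors: int, seed: int = 0, protect: int = 0) -> bytes:
    """Return a copy of `data` with up to `n_errors` bytes damaged.

    `seed` makes the damage reproducible. `protect` leaves the first
    `protect` bytes untouched (for example, to keep the file header intact).
    """
    rng = random.Random(seed)
    out = bytearray(data)
    if len(out) <= protect:
        return bytes(out)
    for _ in range(n_errors):
        pos = rng.randrange(protect, len(out))
        # XOR with a non-zero value always changes the byte at this position.
        out[pos] ^= rng.randrange(1, 256)
    return bytes(out)
```

The demuxer under test would then be run on both the clean and the damaged file, checking that it neither crashes nor loops and that it resynchronizes after the damaged region.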
Muxers
- DVB (MPEG-TS) muxer
- MPEG-1/2 video-streams inside DVB containers
- MPEG-4 ASP video-streams inside DVB containers
- MPEG-4 AVC (H.264) video-streams inside DVB containers
- AC3 audio-streams inside DVB containers
- Multiple AC3 audio-streams inside DVB containers
- MP3 audio-streams inside DVB containers
- Multiple MP3 audio-streams inside DVB containers
- NSV muxer
- NSA muxer
- NUT muxer (and container format specifications enhancements/improvements)
- improve the documentation available for FFmpeg's NUT implementation
- Study the current specification and clarify it, putting it in a more verbose and understandable form.
- Conversion to rst or docbook isn't really required, but would be greatly appreciated.
- Update the demuxer using libnut-produced files as testcases
- Testcase 1: it should demux the complete file correctly as sequential reads
- Testcase 2: it should seek correctly in the complete file
- Testcase 3: same as 1 but with corrupted file
- Testcase 4: same as 2 but with corrupted file
- Update the muxer using libnut and ffnut demuxer to validate the produced files
- Make sure that the interleaving rules are respected.
libavformat API improvements/enhancements
The libavformat API is the part of FFmpeg responsible for splitting encoded audio and video data out of multimedia files (demuxing) and putting it together into new multimedia files (muxing). While libavcodec (the FFmpeg component that encodes and decodes audio and video data) enjoys widespread use among an impressive array of multimedia projects, libavformat has not seen the same level of adoption. These tasks entail investigating how to improve the libavformat API and how it interacts with client applications and input layers, developing proof-of-concept code for a new API, and porting existing muxers and demuxers to the new API. They also include reorganizing and refactoring the libavformat module code so that each individual muxer/demuxer module can be easily enabled and disabled at compile time.
- Study the current specification and clarify what the issues are, putting it in a verbose and understandable form.
- Write a formal specification and roadmap of the new API and document the implementation.
- DocBook, rst, or doxygen for the code documentation steps
Microsoft DirectShow and DirectX and MediaFoundation
Native DirectShow support
Option to build FFmpeg decoders, encoders, demuxers, muxers and post-processing filters for Microsoft's DirectShow API for Windows (the native DirectX 8/9 Direct3D overlay for video playback), so that FFmpeg can be compiled natively for DirectShow and used directly by players that use DirectShow. Note that there is already one very popular fork of FFmpeg with native DirectShow support, FFdshow; another DirectShow implementation of FFmpeg is available as "drffmpeg", which is part of the DrDivX project.
- DirectShow headers and compiling support http://en.wikipedia.org/wiki/DirectShow
- Native GUI for codecs/filter configuration like FFDshow
- http://en.wikipedia.org/wiki/Ffdshow
- Sources: http://sourceforge.net/projects/ffdshow-tryout/ (FFdshow), http://sourceforge.net/projects/drdivx/ (drffmpeg)
Native MediaFoundation support
Microsoft Windows Vista™ has new video (EVR) and audio (SAR) renderers that FFmpeg needs output modules for
- Microsoft Media Foundation API usage for optimized digital media playback on Microsoft Windows Vista
- Multimedia Class Scheduler Service (MMCSS) class
- Enhanced Video Renderer (EVR) class
- Streaming Audio Renderer (SAR) class
DirectX Video Acceleration (DXVA)
DirectX Video Acceleration (DXVA) 1.0 and 2.0 support (for GPU-accelerated video decoding under Windows).
Note: native DirectShow support in FFmpeg is required before DirectX VA (DXVA) video decoding support can be added.
- DXVA 1.0 (DirectX SDK) specifications: http://msdn2.microsoft.com/en-us/library/ms798379.aspx
- DXVA 2.0 (Windows SDK) specifications: http://msdn2.microsoft.com/en-us/library/ms788119.aspx
http://download.microsoft.com/download/5/b/9/5b97017b-e28a-4bae-ba48-174cf47d23cd/MED134_WH06.ppt
Features
- Create a new audio API system
- radix-4 fft routines
- Grabbing from video devices under Windows
- Apply this VFW capture patch http://lists.mplayerhq.hu/pipermail/ffmpeg-user/2006-December/005607.html
- Create a DirectShow patch
- -[h|v]flip options for ffplay
- Improve the existing documentation and add additional means of documentation
- Web
- WIKI
- manpage
- Add XING and/or VBRI header parsing support to the MP3 decoder/parser (for VBR encoded audio files)
- Possibly port code from this MPlayer patch: http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2007-March/050609.html
- Add GAIN (MP3Gain) header parsing support to the MP3 decoder/parser
- Also add GAIN (AACGain) header parsing support to the AAC decoder/parser
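The Xing header item in the list above can be illustrated with a small parser. This is a sketch under stated assumptions, not FFmpeg's parser: the function name `parse_xing` is hypothetical, and it searches the first frame for the "Xing"/"Info" tag rather than computing the tag offset from the MPEG version and channel mode as a real parser would. The field layout used (a 32-bit flags word followed by optional frame count, byte count, 100-byte TOC and quality, all big-endian) is the commonly documented one. VBRI and the gain headers are not covered.

```python
import struct
from typing import Optional


def parse_xing(frame: bytes) -> Optional[dict]:
    """Return the fields of a Xing/Info header found in `frame`, or None."""
    pos = -1
    for tag in (b"Xing", b"Info"):
        pos = frame.find(tag)   # simplification: search instead of fixed offset
        if pos >= 0:
            break
    if pos < 0:
        return None
    pos += 4
    info = {}
    try:
        (flags,) = struct.unpack_from(">I", frame, pos)
        pos += 4
        if flags & 0x1:         # total number of frames
            (info["frames"],) = struct.unpack_from(">I", frame, pos)
            pos += 4
        if flags & 0x2:         # total stream size in bytes
            (info["bytes"],) = struct.unpack_from(">I", frame, pos)
            pos += 4
        if flags & 0x4:         # 100-entry seek table
            toc = frame[pos:pos + 100]
            if len(toc) < 100:
                return None
            info["toc"] = toc
            pos += 100
        if flags & 0x8:         # encoder quality indicator
            (info["quality"],) = struct.unpack_from(">I", frame, pos)
    except struct.error:        # header truncated
        return None
    return info
```

With the frame and byte counts available, a demuxer can report an accurate duration and average bitrate for VBR files instead of guessing from the first frame.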
Subtitles
- Create a common 'subtitles parser library' (and/or an API system for adding support for additional subtitle formats?) - a common sub-library of FFmpeg gathering all subtitle decoders/demuxers/parsers (similar to libpostproc and libavutil). Call it "libsubs" (or "libsub", "libsubtitles" or whatever). Move FFmpeg's existing VobSub and DVBsub code there, so that all existing and future subtitle code is collected in one place, whether the subtitles are bitmap or text-based. This will help reduce future code duplication by sharing common code, making it easier to add support for additional subtitle formats.
- Maybe use MPlayer's recently added "libass" (SSA/ASS subtitle reader) as a base for such a common library?
- Support for advanced SSA/ASS rendering
- Possible sources are libass or the asa library
- Support bold, italic, underline, RGB colors, size changes and font changes for a whole line or part of one line
- Line 23 signal (a.k.a. "Wide-screen signal") detection and use for DVD-Video (VobSub)
- Support for the subtitles HTML tags
- Capability of displaying subtitles with no video enabled (for example for audio-books)
- Support for Karaoke subtitles (for kar and cdg, etc.)
- Dual-subtitle-display (display two subtitles/languages at the same time, one at the bottom as normal plus one at the top of the screen)
- Capability of moving the subtitles in the picture (freetype renderer)
- Support more subtitle formats (text and bitmap-based):
- Closed captioning (CC) subtitle support - (Closed captions for the deaf and hard of hearing, also known as "Line 21 captioning", uses VobSub bitmaps)
- xine has an SPU decoder for subpictures and Closed Captions software decoding
- DirectVobSub (VSFilter) - standard VobSubs (DVD-Video subtitles) embedded in AVI containers
- DivX Subtitles (XSUB) display/reader/decoder (Note: bitmap based subtitle, similar to VobSub)
- SubRip (.srt) subtitle support (Note: simple text-based subtitle with timestamps)
- Subviewer (.sub) subtitle support (Note: simple text-based subtitle with timestamps)
- MicroDVD (.sub) subtitle support (Note: simple text-based subtitle with timestamps)
- Sami (.smi) subtitle support (Note: simple text-based subtitle with timestamps)
- SubStation Alpha (.ssa+.ass) subtitle support (Note: advanced text-based subtitle with timestamps and XY location on screen)
- RealText (.rt) subtitle support
- PowerDivx (.psb) subtitle support
- Universal Subtitle Format (.usf) subtitle support
- Structured Subtitle Format (.ssf) subtitle support
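For the simple text-based formats listed above, SubRip is probably the easiest starting point. The sketch below shows what a minimal reader could look like; the names `parse_srt` and `_to_ms` are hypothetical, and it assumes the usual .srt layout (an index line, a "start --> end" timing line with HH:MM:SS,mmm stamps, then one or more text lines, with blocks separated by blank lines). Markup inside the text is passed through untouched.

```python
import re
from typing import List, Tuple

_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_ms(stamp: str) -> int:
    """Convert an HH:MM:SS,mmm stamp to milliseconds."""
    m = _TIME.match(stamp.strip())
    if m is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, m.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(text: str) -> List[Tuple[int, int, str]]:
    """Return (start_ms, end_ms, text) for each cue in SubRip-formatted text."""
    cues = []
    normalized = text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # A cue needs at least an index line and a timing line.
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->", 1)
        cues.append((_to_ms(start), _to_ms(end), "\n".join(lines[2:])))
    return cues
```

A shared subtitle library would expose cues like these through one interface regardless of the source format, which is the point of the common library proposed at the top of this section.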
Misc
- Add an AAC parser so that -acodec copy to mp4/mov works
- Clean up the h263 rtp patch found on this page: http://www.salyens.com/downloads/index.html#ffmpeg-0.4.7
Streaming Media Network Protocols
Streaming Media Network Protocols (client and server-side) improvements/enhancements and related ideas for new features/functions.
- Create a common 'stream demuxer/parser library' for the client side (and/or an API for adding support for additional streaming formats?) - an LGPL'ed sub-library of FFmpeg gathering all stream demuxers/parsers (similar to libpostproc and libavutil). Call it "libstream" (or "stream" or whatever). Move FFmpeg's existing stream code, such as HTTP and RTSP/RTP, there. This will help reduce future code duplication by sharing common code, making it easier to add support for additional streaming formats. Altogether, this would make it much easier for audio/video players using FFmpeg to add all-in-one streaming support.
- Add support for additional streaming protocols (on the client side) and improve/enhance support for existing protocols:
- HTTP (Hypertext Transfer Protocol) client
- plus SSL (Secure Sockets Layer) client support for HTTPS
- UDP (User Datagram Protocol) client
- RTSP - Real-Time Streaming Protocol (RFC2326) client
- RTP/RTCP - Real-Time Transport Protocol/RTP Control Protocol (RFC3550) client
- RTP Profile for Audio and Video Conferences with Minimal Control (RFC3551) client
- RealMedia RTSP/RDT (Real Time Streaming Protocol / Real Data Transport) client
- SDP (Session Description Protocol) / SSDP (Simple Service Discovery Protocol) client
- MMS (Microsoft Media Services) client
- FFServer (streaming server) updating and improving:
- FFServer code hasn't been updated for quite a while
- Support for RTSP interleaved RTP media
- RTSP over HTTP tunneling
- SSL (Secure Sockets Layer) support
- TLS (Transport Layer Security) support
- SCTP (Stream Control Transmission Protocol) support
- including tunneling SCTP over UDP
- Per-asset accounting options
- Profiling and performance improvements of the RTSP, HTTP and RTP server code
- Streaming to clients like WMP 9, 10 and 11 is broken
- MMS server streaming support in FFServer, (especially for Linux).
- Note that al3x has gotten something working with ffserver, you might want to ask him what needs to be done as well :) --Compn 14:22, 19 March 2007 (EDT)
- You should also take a look at the FENG (RTSP Streaming Server) code, NetEmbryo (Embedded Open Media Streaming Library), and also cURL --Gamester17 11:20, 29 March 2007 (GMT+1)
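As a concrete reference point for the RTP client item above, the fixed 12-byte RTP header defined in RFC 3550 can be decoded as follows. This is an illustrative sketch, not FFmpeg's RTP code: the names `RtpHeader` and `parse_rtp_header` are hypothetical, and CSRC entries, header extensions and padding are not handled.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed RTP header at the start of `packet`."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack_from("!BBHII", packet)
    version = b0 >> 6
    if version != 2:
        raise ValueError(f"unsupported RTP version {version}")
    return RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

A client would use the sequence number to detect loss and reordering, and the timestamp together with the payload type's clock rate to schedule playback.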
Audio and video (pre-process/post-process) filters
FFmpeg's already well-known libavcodec module has become the de facto standard library for video decoding and encoding in free software projects. Unfortunately, no similar standard library has surfaced for audio/video filtering and otherwise working with audio/video streams once they have been decoded. Various multimedia projects (such as MPlayer, Xine, GStreamer, VirtualDub, etc.) have implemented their own filter systems with varying degrees of success. What is needed is a high quality audio and video filter API - efficient, flexible enough to meet all the requirements which have led various projects to invent their own filter systems, and yet easy to use and to develop new filters with. This proposal is to implement a high quality audio/video filter library for FFmpeg, where it can be easily used by other multimedia-related software projects.
Mentor: A'rpi (has expressed interest in possibly helping to implement a filter API in FFmpeg; he has also volunteered to help port the MPlayer filters if such an API becomes available: http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2007-April/051164.html)
- Adopt MPlayer's A/V filter system or create a new API 'from scratch' for pre-process and post-process audio/video filters:
- See http://article.gmane.org/gmane.comp.video.ffmpeg.devel/39130 for michaelni's idea of what to do.
- Also read this discussion thread on MPlayer's mailing-list:
- And, this one:
- Take a look at other existing players' APIs for filter plugins, for example:
- Decide on a name for such an A/V filter API.
- libavfilter (conflicts with LAVF)? libavmunge?
- Create (or port) additional pre-process and post-process video filters to FFmpeg:
- General post-proc sources are MPlayer (libmpcodecs vf_*.c filters), Xine, FFdshow, VLC, VirtualDub, GStreamer, foobar, and XMMS
- More image scaling methods:
- Cropping
- SSP (Statistical Post-Processing)
- DeBlocking
- DeRinging
- IVTC
- Sharpen / UnSharpen (Soften)
- ReQuantization
- Auto-Luminance
- Blurring / DeNoising / Spatial Blur / Temporal Blur
- Deinterlace (weave AND bob) filters
- 2:3 pull-down / IVTC (inverse telecine) for 24 progressive frames on 30 FPS TVs
- NTSC => PAL, and PAL => NTSC frame-rate (FPS) adjust and reclock filter for NTSC <=> PAL conversion
- NTSC <=> PAL frame-rate adjust FPS ratios?: 23.976 <=> 25, 24 <=> 25, 30 <=> 25, 25 <=> 30
- Create (or port) additional pre-process and post-process audio filters:
- Psychoacoustic audio processing
- Artificial reverberation
- Dolby Pro Logic II decoding
- Audio re-sampler (sample rate converter) filter
- Possible source is SRC (Secret Rabbit Code)
- Create an SDK (Software Development Kit) with templates for the A/V filter API
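The 2:3 pull-down item in the video filter list above is easy to state in code. The sketch below shows only the cadence, using frame labels instead of pixel data; the function name `pulldown_2_3` is hypothetical, and a real filter would split each frame into top and bottom fields and weave them. An inverse telecine filter has to detect this pattern and undo it.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def pulldown_2_3(frames: Sequence[T]) -> List[Tuple[T, T]]:
    """Map progressive frames to interlaced (first field, second field) pairs.

    Frames alternately contribute 2 and 3 fields, so every 4 input frames
    become 10 fields, i.e. 5 interlaced output frames (24 -> 30 per second).
    """
    fields: List[T] = []
    for index, frame in enumerate(frames):
        repeat = 2 if index % 2 == 0 else 3
        fields.extend([frame] * repeat)
    # Pair consecutive fields; a trailing unpaired field is dropped.
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


# Four film frames A, B, C, D become five video frames, two of them mixed.
print(pulldown_2_3("ABCD"))
```

The mixed pairs (B, C) and (C, D) are the frames that show combing on progressive displays, which is why inverse telecine is preferable to plain deinterlacing for film sources.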
See Also
- FFmpeg's Google SoC (Summer of Code) 2007 list of tasks for more suggestions/requests (ideas for developers).
- FFmpeg's Google SoC (Summer of Code) 2006 list of tasks for more suggestions/requests (ideas for developers).
- FFmpeg bugs for bugs in FFmpeg (codecs) that you can help fix or add additional information/samples to.
- Category:Formats_missing_in_FFmpeg for formats not implemented in FFmpeg yet