Difference between revisions of "FFmpeg Wishlist"

From MultimediaWiki
m (Decoders: Dirac is video, not audio)
(Decoders)
Line 16: Line 16:
 
* [http://www.codingtechnologies.com/products/mp3pro.htm mp3PRO] decoder (Note: mp3PRO is MP3 + [http://www.codingtechnologies.com/products/sbr.htm SBR]. Standard MP3 decoders can decode mp3PRO encoded files/streams but without [http://www.codingtechnologies.com/products/sbr.htm SBR] you do not get the full quality.  
 
* [http://www.codingtechnologies.com/products/mp3pro.htm mp3PRO] decoder (Note: mp3PRO is MP3 + [http://www.codingtechnologies.com/products/sbr.htm SBR]. Standard MP3 decoders can decode mp3PRO encoded files/streams but without [http://www.codingtechnologies.com/products/sbr.htm SBR] you do not get the full quality.  
 
* [http://www.mpegsurround.com MPEG Surround] decoder/parser (for all audio but especially MP3/mp3PRO and AAC/aacPlus as those are in use today).
 
* [http://www.mpegsurround.com MPEG Surround] decoder/parser (for all audio but especially MP3/mp3PRO and AAC/aacPlus as those are in use today).
* ALAC decoder improvements/enhancements:
+
* [[ALAC]] decoder improvements/enhancements:
 
**Clean up the existing alac decoder code
 
**Clean up the existing alac decoder code
 
* ffsvq3 (FFmpeg SVQ3) decoder improvements/enhancements:
 
* ffsvq3 (FFmpeg SVQ3) decoder improvements/enhancements:
 
** add b-frame support to the ffsvq3 decoder
 
** add b-frame support to the ffsvq3 decoder
* amr decoder
+
* [[AMR]] decoder
 
* integrate speex (glue code or native)  
 
* integrate speex (glue code or native)  
 
* g723.1/rtp decoder
 
* g723.1/rtp decoder
Line 29: Line 29:
 
**[http://jmac.sourceforge.net/]LGPLed Java implementation
 
**[http://jmac.sourceforge.net/]LGPLed Java implementation
 
* JPEG2000 decoder
 
* JPEG2000 decoder
* [[Dirac]] decoder ([[Dirac]] is a video codec developed by [[BBC]] as an open standard, shares shares some features with [[Snow]])
+
* [[Dirac]] decoder ([[Dirac]] is a video codec developed by [[BBC]] as an open standard, shares some features with [[Snow]])
 
* [[GSM]] decoder
 
* [[GSM]] decoder
 
* QCELP decoder [http://www.3gpp2.org/Public_html/specs/alltsgscfm.cfm spec] is c.s0020 and source is c.r0020
 
* QCELP decoder [http://www.3gpp2.org/Public_html/specs/alltsgscfm.cfm spec] is c.s0020 and source is c.r0020
Line 35: Line 35:
 
* integer only vorbis decoder (to replace tremor)
 
* integer only vorbis decoder (to replace tremor)
 
* Fix "[rv20 @ 009C8BF0]unknown bit3 set" in rv20 decoder
 
* Fix "[rv20 @ 009C8BF0]unknown bit3 set" in rv20 decoder
* Add j-type picture support to the existing wmv8 decoder
+
* Add [[XINTRA8|j-type]] picture support to the existing wmv8 decoder
* MLP decoder
+
* [[MLP]] decoder
 
* Indeo 4 decoder and Indeo 5 decoder  
 
* Indeo 4 decoder and Indeo 5 decoder  
 
* [[xeb|XEB]] - the [[RatDVD]] video codec (stored in [[xvo|XVO]] container format)
 
* [[xeb|XEB]] - the [[RatDVD]] video codec (stored in [[xvo|XVO]] container format)
 
* VNC decoder, files created by vncrec. Re-use code from [[VMware Video]] decoder http://www.sodan.org/~penny/vncrec/
 
* VNC decoder, files created by vncrec. Re-use code from [[VMware Video]] decoder http://www.sodan.org/~penny/vncrec/
 +
* RealAudio [[RealAudio sipr|Sipro]] decoder
 +
* Fix distortion in [[QDM2]] decoder
 
* Additional game formats support:
 
* Additional game formats support:
 
** [[Gremlin Digital Video]]
 
** [[Gremlin Digital Video]]

Revision as of 10:17, 26 August 2007

Temporary FFmpeg wish/todo list:

FFmpeg module refactoring

Multimedia programs tend to be highly modular in design and FFmpeg is no exception. However, it does not make the best use of independent modules. The major task in refactoring FFmpeg modules will be to reorganize code so that each individual codec or muxer/demuxer module can be easily enabled and disabled at compile time. This task also entails creating a test suite that can automatically enable each module, one at a time, and validate that FFmpeg still builds and works.
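As a rough illustration of the one-module-at-a-time test suite described above, the sketch below only generates the configure command lines such a harness might run. The module names are placeholders, and the `--disable-everything` and `--enable-decoder=NAME` style flags are those found in later FFmpeg configure scripts; whether the build system of this revision accepts them is an assumption.

```python
# Hypothetical module lists; a real harness would read these from the build system.
MODULES = {
    "decoder": ["alac", "svq3"],
    "demuxer": ["nut"],
}

def configure_commands(modules):
    """Yield one configure command line per module, enabling only that module."""
    for kind, names in modules.items():
        for name in names:
            # Flags assumed from later FFmpeg configure scripts.
            yield ["./configure", "--disable-everything", f"--enable-{kind}={name}"]

commands = list(configure_commands(MODULES))
for cmd in commands:
    print(" ".join(cmd))
```

Each generated command would then be followed by a build and a short smoke test, so that a module which fails to compile or run in isolation is reported by name.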

Decoders

Encoders

  • MP3 encoder (a simple fixedpoint implementation using existing infrastructure in ffmpeg when possible)
  • Implement a good psychoacoustic model
  • Snow - the FFmpeg project's own video codec
    • Write a formal specification of the codec and document the implementation.
      • DocBook, rst, or doxygen for the code documentation steps
      • A complete roadmap will be decided together with the candidates
    • Improve the optimizations and refine the implementation in order to have FFmpeg playing Snow in more constrained environments
      • Avoid cache thrashing and vectorize the code
      • AltiVec, MMX/SSE, VIS or any other vector extension assembly/C intrinsics for your favourite arch
      • AltiVec would preferably be written as intrinsics
      • SSE/MMX/3dNow! code as inline assembly.
    • multiple reference frames improvements
      • decide which frames to keep (e.g. long-term refs)
      • some changes to the mv prediction code
    • non-translational motion-compensation
      • estimate non translational parameters per block by using surrounding motion vectors
      • add an ac coded bit per block to switch between translational and non-translational MC
      • borrow the non translational MC code from libmpcodecs/vf_perspective.c
      • some changes to the encoder to decide between translational and non-translational MC
    • Trellis quantization (select quantized coefficients so as to minimize the rate distortion)
    • 4x4 sized block support (we have 16x16 and 8x8 currently)
    • 1/8 pel motion compensation / estimation support (pretty much just encoder changes needed which in case of the iterative me should be trivial)
    • improve the intra color decision
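The trellis quantization item above amounts to choosing quantized levels that minimize a rate-distortion cost rather than simply rounding. The sketch below is a deliberately simplified version of that idea, not Snow's quantizer: each coefficient is decided independently, whereas a real trellis search also tracks how one choice changes the coding cost of its neighbours. The function name and the bit-cost model are hypothetical.

```python
def rd_quantize(coeffs, qstep, lam):
    """Pick, per coefficient, the level minimising distortion + lam * rate.

    Simplification: coefficients are treated independently; a full trellis
    would also model inter-coefficient effects on the entropy coder.
    """
    levels = []
    for c in coeffs:
        nearest = int(round(c / qstep))
        # Also try the neighbouring level that is one step closer to zero.
        if nearest > 0:
            toward_zero = nearest - 1
        elif nearest < 0:
            toward_zero = nearest + 1
        else:
            toward_zero = 0
        best = None
        # Sort by magnitude so that ties prefer the cheaper, smaller level.
        for level in sorted({nearest, toward_zero, 0}, key=abs):
            distortion = (c - level * qstep) ** 2
            rate = 1 + 2 * abs(level).bit_length()  # hypothetical bit-cost model
            cost = distortion + lam * rate
            if best is None or cost < best[0]:
                best = (cost, level)
        levels.append(best[1])
    return levels

plain = rd_quantize([10.0, 0.6, -7.4], qstep=2.0, lam=0.0)
aggressive = rd_quantize([10.0, 0.6, -7.4], qstep=2.0, lam=1000.0)
```

With `lam` set to zero the result is ordinary rounding; as `lam` grows, more coefficients are pulled toward zero because the saved bits outweigh the added distortion.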

Demuxers

  • iff demuxer (with anim and sound decoding)
    • mark cox (melbournemark at gmail dot com) is currently working on this.
    • xine has a demuxer/decoder for iff. Manfred Tremmel, the author of the xine code, has agreed to relicence his xine code as LGPL to allow easier use in ffmpeg. Thanks Manfred.
    • Werner Randelshofer, the author of multishow, has also granted permission for his iff code to be relicenced as LGPL. Thanks Werner.
    • Xanim also contains code to support iff; it may contain some formats not available in xine.
  • g723.1 / rtp demuxer
  • g729 / rtp demuxer
  • VIVO demuxer, look at the mplayer vivo demuxer for reference
  • XMV / FMV (Xbox Media Video) demuxer (from Microsoft and based on WMV8)
    • Open source and legal demuxer/decoder but copyrighted source/specification:

http://sourceforge.net/tracker/index.php?func=detail&aid=1097094&group_id=53761&atid=471491 also look at http://thread.gmane.org/gmane.comp.video.ffmpeg.devel/25207/focus=25224 and http://www.maxconsole.net/?mode=news&newsid=411 for hints/tips

  • AMV demuxer, http://scrub50187.com/ has the creator. Wikipedia also has articles about the format.
  • FluxDVD / RatDVD demuxer for XVO files (Note! RatDVD is the predecessor of FluxDVD)
  • NUT demuxer (and container format specifications enhancements/improvements)
    • improve the documentation available for FFmpeg's NUT implementation
      • Study the current specification and clarify it, putting it in a more verbose and understandable form.
      • Conversion to rst or docbook isn't really required, but would be greatly appreciated.
    • Update the demuxer using libnut-produced files as testcases
      • Testcase 1: it should demux the complete file correctly as sequential reads
      • Testcase 2: it should seek correctly in the complete file
      • Testcase 3: same as 1 but with corrupted file
      • Testcase 4: same as 2 but with corrupted file
    • Update the muxer using libnut and ffnut demuxer to validate the produced files
      • Make sure that the interleaving rules are respected.
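Testcases 3 and 4 above need corrupted input files. One way to produce them reproducibly is sketched below: a helper that damages a fixed number of bytes at seeded pseudo-random positions, so that a failing run can be repeated exactly. The helper name and its parameters are illustrative and not part of libnut or FFmpeg.

```python
import random

def corrupt(data: bytes, n_errors: int, seed: int = 0) -> bytes:
    """Return a copy of data with n_errors bytes changed at reproducible positions."""
    rng = random.Random(seed)  # seeded so the same damage can be regenerated
    out = bytearray(data)
    for pos in rng.sample(range(len(out)), min(n_errors, len(out))):
        out[pos] ^= 0xFF  # invert every bit so the byte is guaranteed to differ
    return bytes(out)

original = bytes(range(256))
damaged = corrupt(original, 8, seed=1)
```

The damaged buffer would be written to a temporary file and fed to the demuxer in place of the clean libnut-produced file, once for sequential reading and once for seeking.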

Muxers

  • DVB (MPEG-TS) muxer inside DVB containers
    • MPEG-1/2 video-streams inside DVB containers
    • MPEG-4 ASP video-streams inside DVB containers
    • MPEG-4 AVC (H.264) video-streams inside DVB containers
    • AC3 audio-streams inside DVB containers
      • Multiple AC3 audio-streams inside DVB containers
    • MP3 audio-streams inside DVB containers
      • Multiple MP3 audio-streams inside DVB containers
  • NSV muxer
  • NSA muxer
  • NUT muxer (and container format specifications enhancements/improvements)
    • improve the documentation available for FFmpeg's NUT implementation
      • Study the current specification and clarify it, putting it in a more verbose and understandable form.
      • Conversion to rst or docbook isn't really required, but would be greatly appreciated.
    • Update the demuxer using libnut-produced files as testcases
      • Testcase 1: it should demux the complete file correctly as sequential reads
      • Testcase 2: it should seek correctly in the complete file
      • Testcase 3: same as 1 but with corrupted file
      • Testcase 4: same as 2 but with corrupted file
    • Update the muxer using libnut and ffnut demuxer to validate the produced files
      • Make sure that the interleaving rules are respected.

libavformat API improvements/enhancements

The libavformat API is the interface of FFmpeg that is responsible for splitting apart encoded audio and video data from multimedia files (demuxing) and putting it together in new multimedia files (muxing). While libavcodec (the FFmpeg component that encodes and decodes audio and video data) enjoys widespread use among an impressive array of multimedia projects, libavformat has not seen the same level of adoption. These tasks should entail investigating how to improve the libavformat API and how it interacts with client applications and input layers, developing proof-of-concept code for a new API, and working to port existing muxers and demuxers to the new API. They also include reorganizing and refactoring the libavformat module code so that each individual muxer/demuxer module can be easily enabled and disabled at compile time.

  • Study the current specification and clarify what the issues are, putting it in a verbose and understandable form.
  • Write a formal specification and roadmap of the new API and document the implementation.
  • DocBook, rst, or doxygen for the code documentation steps

Microsoft DirectShow and DirectX and MediaFoundation

Native DirectShow support

Option to build FFmpeg decoder/encoder/demuxer/muxer and post-processing filters for Microsoft's DirectShow API for Windows (the native DirectX 8/9 Direct3D overlay for video playback), so that FFmpeg can be compiled natively for DirectShow and thus be used directly by players that use DirectShow. Note that there is already one very popular fork of FFmpeg with native DirectShow support, FFdshow (another DirectShow implementation of FFmpeg is available as "drffmpeg", which is part of the DrDivX project).

http://sourceforge.net/projects/drdivx/ (drffmpeg)

Native MediaFoundation support

Microsoft Windows Vista™ has new video (EVR) and audio (SAR) renderers that FFmpeg needs output modules for

DirectX Video Acceleration (DXVA)

DirectX Video Acceleration (DXVA) 1.0 and 2.0 support (for GPU-accelerated video decoding under Windows).
Note! Native DirectShow support in FFmpeg is needed before DirectX VA (DXVA) video decoding support can be added!

http://download.microsoft.com/download/5/b/9/5b97017b-e28a-4bae-ba48-174cf47d23cd/MED134_WH06.ppt

Features

Subtitles

  • Create a common 'subtitles parser library' (and/or an API system for adding support for additional subtitle formats?) - a common sub-library to FFmpeg with all subtitle decoders/demuxers/parsers gathered (similar to libpostproc and libavutil). Call it "libsubs" (or "libsub", "libsubtitles" or whatever). Move FFmpeg's existing VobSub and DVBsub code there, so that all existing and future subtitle code is collected there, whether the subs are bitmap or text-based. This will help reduce future code replication by sharing common code, thus making it easier to add support for additional subtitles.
    • Maybe use MPlayer's recently added "libass" (SSA/ASS subtitle reader) as a base for such a common library?
  • Support for advanced SSA/ASS rendering
    • Possible sources are libass or the asa library
  • Support bold, italic, underline, RGB colors, size changes and font changes for a whole line or part of one line
  • Line 23 signal (a.k.a. "Wide-screen signal") detection and use for DVD-Video (VobSub)
  • Support for the subtitles HTML tags
  • Capability of displaying subtitles with no video enabled (for example for audio-books)
  • Support for Karaoke subtitles (for kar and cdg, etc.)
  • Dual-subtitle-display (display two subtitles/languages at the same time, one at the bottom as normal plus one at the top of the screen)
  • Capability of moving the subtitles in the picture (freetype renderer)
  • Support more subtitle formats (text and bitmap-based):
    • Closed captioning (CC) subtitle support - (Closed captions for the deaf and hard of hearing, also known as "Line 21 captioning", uses VobSub bitmaps)
      • xine has an SPU decoder for subpictures and Closed Captions software decoding
    • DirectVobSub (VSFilter) - standard VobSubs (DVD-Video subtitles) embedded in AVI containers
    • DivX Subtitles (XSUB) display/reader/decoder (Note: bitmap based subtitle, similar to VobSub)
    • SubRip (.srt) subtitle support (Note: simple text-based subtitle with timestamp)
    • Subviewer (.sub) subtitle support (Note: simple text-based subtitle with timestamp)
    • MicroDVD (.sub) subtitle support (Note: simple text-based subtitle with timestamp)
    • Sami (.smi) subtitle support (Note: simple text-based subtitle with timestamp)
    • SubStation Alpha (.ssa+.ass) subtitle support (Note: advanced text-based subtitle with timestamps and XY location on screen)
    • RealText (.rt) subtitle support
    • PowerDivx (.psb) subtitle support
    • Universal Subtitle Format (.usf) subtitle support
    • Structured Subtitle Format (.ssf) subtitle support
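To give a feel for how small a "simple text-based subtitle with timestamp" reader can be, here is a minimal sketch of a SubRip (.srt) parser. It is an illustration only, not the proposed libsubs API: the function names are hypothetical, and it handles only the basic `HH:MM:SS,mmm --> HH:MM:SS,mmm` cue layout, ignoring styling tags and any extra data after the end time.

```python
import re

TIME = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")

def to_ms(stamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to milliseconds."""
    h, m, s, ms = map(int, TIME.fullmatch(stamp.strip()).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms

def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) cues from SubRip text."""
    cues = []
    # Cues are separated by blank lines: index line, timing line, then text.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip malformed blocks instead of failing
        start, end = lines[1].split("-->")
        cues.append((to_ms(start), to_ms(end), "\n".join(lines[2:])))
    return cues

sample = """1
00:00:01,000 --> 00:00:04,000
Hello

2
00:01:00,500 --> 00:01:02,000
Second cue
with two lines
"""
cues = parse_srt(sample)
```

A shared library would expose cues from every supported format in one common structure like this tuple, so that renderers need not know which file format they came from.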


Misc

Streaming Media Network Protocols

Streaming Media Network Protocols (client and server-side) improvements/enhancements and related ideas for new features/functions.

  • Create a common 'stream demuxer/parser library' for the client-side (and/or API for adding support for additional streaming formats?) - an LGPL'ed sub-library in FFmpeg with all stream demuxers/parsers gathered (similar to libpostproc and libavutil). Call it "libstream" (or "stream" or whatever). Move FFmpeg's existing stream code there, like HTTP and RTSP/RTP. This will help reduce future code replication by sharing common code, thus making it easier to add support for additional streaming formats. Altogether, this makes it super easy for audio/video players using FFmpeg to add all-in-one streaming support to their player.
    • Maybe use either MPlayer's "stream" library structure, LIVE555, cURL, or probably the better libnms (from NeMeSi) as a base for such a common library?
  • Add support for additional streaming protocols (on the client side) and improve/enhance support for existing protocols:
    • HTTP (Hypertext Transfer Protocol) client
      • plus SSL (Secure Sockets Layer) client support for HTTPS
    • UDP (User Datagram Protocol) client
    • RTSP - Real-Time Streaming Protocol (RFC2326) client
    • RTP/RTCP - Real-Time Transport Protocol/RTP Control Protocol (RFC3550) client
    • RTP Profile for Audio and Video Conferences with Minimal Control (RFC3551) client
    • RealMedia RTSP/RDT (Real Time Streaming Protocol / Real Data Transport) client
    • SDP (Service Discovery Protocol) / SSDP (Simple Service Discovery Protocol) client
    • MMS (Microsoft Media Services) client
  • FFServer (streaming server) updating and improving:
    • FFServer code hasn't been updated for quite a while
    • Support for RTSP interleaved RTP media
    • RTSP over HTTP tunneling
    • SSL (Secure Sockets Layer) support
    • TLS (Transport Layer Security) support
    • SCTP (Stream Control Transmission Protocol) support
      • including tunnel SCTP over UDP
    • Per-asset accounting options
    • Profiling and performance improvements of the RTSP, HTTP and RTP server code
    • Streaming to clients like WMP 9, 10 and 11 is broken
    • MMS server streaming support in FFServer, (especially for Linux).
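For the RTP client item above, the first step any implementation has to perform is decoding the 12-byte fixed header defined in RFC 3550. The following sketch shows that step on its own; the function name and the returned dictionary layout are illustrative choices, and CSRC lists, header extensions and RTCP are deliberately left out.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp and SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": seq,
        "timestamp": ts,
        "ssrc": ssrc,
    }

# Hand-built example packet: version 2, dynamic payload type 96.
packet = bytes([0x80, 0x60]) + struct.pack("!HII", 1, 3000, 0xDEADBEEF)
header = parse_rtp_header(packet)
```

The payload starts after the fixed header plus four bytes per CSRC entry, which is why `csrc_count` is returned alongside the other fields.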

Audio and video (pre-process/post-process) filters

FFmpeg's already well-known libavcodec module has become the de facto standard library for video decoding and encoding in free software projects. Unfortunately, no similar standard library has surfaced for audio/video filtering and otherwise working with audio/video streams once they have been decoded. Various multimedia projects (such as MPlayer, Xine, GStreamer, VirtualDub, etc.) have implemented their own filter systems with varying degrees of success. What is needed is a high quality audio and video filter API - efficient, flexible enough to meet all the requirements which have led various projects to invent their own filter system, and yet easy to use or develop new filters with. This proposal is to implement a high quality audio/video filter library for FFmpeg, where it can be easily used by other multimedia-related software projects.

Mentor: A'rpi (has expressed interest in possibly helping with implementing a filter API in FFmpeg; he is also volunteering to help port the MPlayer filters if such an API becomes available http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2007-April/051164.html)

Where Michael Niedermayer wrote the following: "" anyway if whoever designs the filter API decides to have a single audio+video API then i certainly wont complain but until now _everyone_ has failed to design a good video filter API, so adding yet another interdependancy is probably not going to make that any easier

to repeat again some goals of a video filter API

  • well documented (see mplayer for how not to do it)
  • writing a filter should not require complete knowledge of all (undocumented) internals of the video filter system
  • direct rendering (useing a buffer provided by the next filter)
  • inplace rendering (for example adding some subtitles shouldnt need the whole frame to be read and written)
  • slices based rendering (improves cache locality, but there are issues with out of order decoding ...)
  • multiple inputs
  • multiple outputs (could always trivially be handled by several filters with just a single output each)
  • timestamps per frame
  • also th number of frames consumed by a filter does not have to match the number output ((inverse)telecine, ...)

also i suggest that whoever designs the filter system looks at mplayers video filters as they support a large number of the things above ""
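Two of the goals quoted above, timestamps per frame and filters whose output frame count differs from their input count, can be illustrated with a toy model in which a filter is simply a generator over frames. This is only a sketch of the idea under that assumption; `Frame`, `drop_every_nth` and `duplicate` are hypothetical names and say nothing about how an actual FFmpeg filter API would look.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class Frame:
    pts: int     # presentation timestamp travels with every frame
    data: bytes

def drop_every_nth(frames: Iterable[Frame], n: int) -> Iterator[Frame]:
    """Emit n - 1 frames for every n consumed, as a decimating filter would."""
    for index, frame in enumerate(frames):
        if (index + 1) % n:
            yield frame

def duplicate(frames: Iterable[Frame], half_step: int) -> Iterator[Frame]:
    """Emit two frames per input frame, the copy offset by half_step in time."""
    for frame in frames:
        yield frame
        yield Frame(frame.pts + half_step, frame.data)

source = [Frame(pts=i * 40, data=b"") for i in range(10)]
decimated = list(drop_every_nth(source, 5))
doubled = list(duplicate(drop_every_nth(source, 5), 20))
```

Because each stage pulls frames lazily from the previous one, chaining stages needs no knowledge of their internals, which is in the spirit of the "should not require complete knowledge of all internals" goal.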


See Also