Codecs with extended information
For these codecs, extra information exists besides the binaries. It can be binaries with debug information in them, SDKs or anything else that would help the reverse engineering process. This list doesn't cover codecs for which specifications (e.g. Windows Media Video 9/VC-1) or source code (e.g. Monkey's Audio) can be found.
VQF/TwinVQ
For this codec a Linux/win32 SDK was released, and some code from VQF in AAC was also found on the net. The TwinVQ entry has more details.
WMA Pro, WMA Lossless, WMA Speech
Linspire licensed some Windows Media codecs and bundled them with their distribution. They compiled the code into shared object files and loaded them from MPlayer. The glue code can be found in the MPlayer packages on the Linspire site; the relevant files are wmv3dec.c, wmv2dec.c and wma3dec.c in libavcodec/. To get the binaries, download the torrent file from the Linspire site. The codecs should reside in /usr/lib under the names libwm*.so. These files contain lots of debug information and do not seem to have been compiled with a high optimization level. Some work has been done on libwma3.so; this library has code to decode wmastdv1, wmastdv2, wmapro and wmalossless, but the glue code only activates wmapro and wmalossless. The libraries libwma2.so and libwma3.so seem to share some of the constants.
Multimedia Mike's blog mentions some libs with debug info in them; these also contain Windows Media Audio Voice. They seem to be based on the same source as the Linspire ones, and a rudimentary check supports that. The code also seems to be a bit more optimized. The separate objects in the library make it easy to tell the different codecs apart (both the tables and the code). Correlating all these sources would therefore probably yield the best result.
There is also a program called wmal2pcm.exe from Microsoft itself (http://www.microsoft.com/windows/windowsmedia/9series/encoder/utilities.aspx), which unpacks WMA Lossless to PCM. It does not require additional libraries, as it contains both demuxer and decoder code.
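Whether a given shared object still carries its full symbol table can be checked quickly before loading it into a disassembler. The following is a minimal sketch, not a replacement for tools such as readelf: it only reads the ELF section header table and looks for a section of type SHT_SYMTAB, which stripping normally removes. The function name has_symtab and the path in the usage note are illustrative assumptions, and the sketch ignores DWARF sections and malformed files.

```python
import struct

SHT_SYMTAB = 2  # full (non-dynamic) symbol table; normally removed by stripping


def has_symtab(data: bytes) -> bool:
    """Return True if the ELF image in `data` contains a SHT_SYMTAB section."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big endian
    if is64:
        # ELF64: e_shoff at 0x28, e_shentsize and e_shnum at 0x3A
        (shoff,) = struct.unpack_from(endian + "Q", data, 0x28)
        shentsize, shnum = struct.unpack_from(endian + "HH", data, 0x3A)
    else:
        # ELF32: e_shoff at 0x20, e_shentsize and e_shnum at 0x2E
        (shoff,) = struct.unpack_from(endian + "I", data, 0x20)
        shentsize, shnum = struct.unpack_from(endian + "HH", data, 0x2E)
    for i in range(shnum):
        # sh_type is the second 32-bit field of every section header
        (sh_type,) = struct.unpack_from(endian + "I", data, shoff + i * shentsize + 4)
        if sh_type == SHT_SYMTAB:
            return True
    return False
```

A call such as has_symtab(open("/usr/lib/libwma3.so", "rb").read()) would then indicate whether the library is unstripped; the exact file name here is only an example matching the libwm*.so pattern.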
Indeo 4 and Indeo 5
XAnim has external shared object (.so) libraries for these codecs on its website. Some version of these binaries contains a lot of debug information. There has also been some work on an Indeo 5 decoder.
Sipro, Atrac, RV30, RV40 and RALF
Real has in the past occasionally released versions whose binaries contain a lot of debug information. The versions in question are versions 5 and 7 and a Helix Linux beta version. Version 5 can be found here and 7 can be found here. This link [1] has downloads of unstripped object code.
- Sipro debug info can be found in versions 5 and 7 and in the debug build.
- Atrac debug info slipped into a Helixplayer beta version. The .so binary doesn't work, but the signatures can be transferred to the RealPlayer 10 atrac.dll. Other versions don't seem to use the same source code.
- RV30 debug info can be found in version 7 and the debug build.
- RV40 debug info can be found in the debug build.
- RALF debug info can be found in the Helix download.
Bink
RAD Game Tools released a player for Linux that contained a lot of debug information.
QCELP
Qualcomm shared its source for QCELP during the development of the codec. At some point during development Apple licensed the codec for QuickTime. The code doesn't match the binary completely but will help an eventual RE effort. Qualcomm also supplies an SDK (http://www.cdmatech.com/products/purevoice_download.jsp). This is for the standard QCELP and not for the QCELP found in mov files.
Smush
LucasArts released the trailer for Grim Fandango as a self-running executable. The executable contains lots of debug information, including a large number of strings. The trailer itself can easily be extracted with a tool like dd. The trailer can be found here.
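For readers without dd at hand, the extraction step can be sketched in a few lines of Python. This is only an illustration of copying a byte range out of a larger file: the function name extract_span is made up for this example, and the offset and length of the embedded trailer are not given here and would have to be determined by inspecting the executable.

```python
def extract_span(src_path, dst_path, offset, length=None, chunk=1 << 16):
    """Copy `length` bytes starting at `offset` from src_path to dst_path.

    If `length` is None, everything up to the end of the file is copied.
    Roughly what dd does with bs=1, skip=offset and count=length.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)
        remaining = length
        while remaining is None or remaining > 0:
            want = chunk if remaining is None else min(chunk, remaining)
            buf = src.read(want)
            if not buf:          # reached end of file early
                break
            dst.write(buf)
            if remaining is not None:
                remaining -= len(buf)
```

A call would look like extract_span("trailer.exe", "trailer.out", offset), where both file names and the offset are placeholders to be replaced with the actual values.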
MP3Pro
An XMMS plugin can be downloaded from http://www.all4mp3.com/tools/sw_ct_demo.html. The plugin has lots of debug info left in it.