Objdump
objdump is a standard component of the GNU binutils. It can extract many kinds of information from an ELF file. This page describes some of its more common reverse engineering applications.
(On Windows, dumpbin.exe, shipped with Visual Studio, offers similar functionality.)
Installation
If you have a standard C/C++ development environment set up on your Linux box, you ought to already have the GNU binutils installed. Type 'objdump' to find out. If it's not there, then you probably need to install the development toolchain for your system. This version of objdump will know how to take apart files built for your particular CPU architecture.
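As a quick check, the following sketch reports whether objdump is on the PATH and, if so, which release it is and which object formats and architectures it was built for. The helper name have_tool is made up for this illustration; --version and -i are standard objdump options.

```shell
# Report whether a tool is on PATH; returns 0 if found, non-zero otherwise.
# (have_tool is a hypothetical helper name used only in this example.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool objdump; then
    # First line of the version banner names the binutils release.
    objdump --version | sed -n '1p'
    # Beginning of the list of supported object formats and architectures.
    objdump -i | sed -n '1,20p'
else
    echo "objdump not found; install binutils or your system's development toolchain"
fi
```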
If you want to take apart ELF files compiled for a different architecture, you will need to compile a new copy of the binutils for a separate architecture target:
- get the official binutils distribution: http://www.gnu.org/software/binutils/
- unpack and enter binutils directory
- ./configure --target=<arch> --prefix=<directory> --program-prefix=<prefix>
- make && make install
About the configure options:
- <arch> is the architecture to build for. Examine the file bfd/config.bfd to get an idea of what targets are available. As an example of what the target should look like, the target for PowerPC processor code stored in an ELF file is powerpc-elf.
- <directory> is the base directory for the new binutils toolchain to be stored in. It helps to keep this separate from the native toolchain.
- <prefix> indicates the prefix string that should be prepended to each of the tools on installation. For example, if the program prefix is "powerpc-" then the built objdump tool will be named powerpc-objdump.
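The three options can be put together as in the sketch below. The install directory and the helper name binutils_configure_cmd are assumptions chosen for illustration; only powerpc-elf and the "powerpc-" prefix come from the text above. The function only prints the configure line so that it can be reviewed before being run inside the unpacked binutils directory.

```shell
# Hypothetical example values; substitute your own target and paths.
TARGET=powerpc-elf
PREFIX="$HOME/cross/powerpc"
PROGRAM_PREFIX=powerpc-

# Print the configure invocation for a target, install directory, and tool prefix.
binutils_configure_cmd() {
    printf './configure --target=%s --prefix=%s --program-prefix=%s\n' "$1" "$2" "$3"
}

binutils_configure_cmd "$TARGET" "$PREFIX" "$PROGRAM_PREFIX"

# After running the printed command followed by `make && make install`,
# the new tool would be available as "$PREFIX/bin/${PROGRAM_PREFIX}objdump".
```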
Common Usage
objdump requires at least one option. Here are some of the more interesting options for RE:
-d, --disassemble          Display assembler contents of executable sections
-D, --disassemble-all      Display assembler contents of all sections
-T, --dynamic-syms         Display the contents of the dynamic symbol table
-r, --reloc                Display the relocation entries in the file
-R, --dynamic-reloc        Display the dynamic relocation entries in the file
-C, --demangle[=STYLE]     Decode mangled/processed symbol names
                           The STYLE, if specified, can be `auto', `gnu', `lucid', `arm', `hp', `edg', `gnu-v3', `java' or `gnat'
-z, --disassemble-zeroes   Do not skip blocks of zeroes when disassembling
To disassemble an executable ELF file:
objdump -d <binary>
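The full disassembly of a large binary is long, so it often helps to cut it down to a single function. The sketch below is one possible filter, assuming the usual objdump -d layout in which each function starts with an "address <symbol>:" line and ends at a blank line. The name extract_function is invented for this example, and the symbol is used as a regular expression, so names containing regex metacharacters would need escaping.

```shell
# Print only the disassembly of one function from `objdump -d` output on stdin.
# $1 = symbol name (treated as a regular expression by awk).
extract_function() {
    awk -v sym="$1" '
        # A function header looks like: 00001139 <main>:
        $0 ~ ("^[0-9a-f]+ <" sym ">:$") { printing = 1 }
        # Functions are separated by a blank line; stop at the first one.
        printing && /^$/ { exit }
        printing { print }
    '
}

# Usage (hypothetical binary and symbol names):
#   objdump -d ./a.out | extract_function main
```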
To disassemble a shared object (.so) ELF file:
objdump -dR <library.so>
The -R option is invaluable for dealing with relocatable code. Without it, many calls appear to target a location inside their own instruction, e.g.:
5752: e8 fc ff ff ff    call 5753 <free@plt+0xb3>
5757: 89 c3             mov %eax,%ebx
The actual address will be patched in by the dynamic loader when the file is loaded. However, the -R option asks objdump to print the dynamic relocation entries alongside the disassembly:
5752: e8 fc ff ff ff    call 5753 <free@plt+0xb3>
        5753: R_386_PC32 malloc
5757: 89 c3             mov %eax,%ebx
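The odd-looking target 5753 follows directly from the instruction encoding: opcode e8 is a near call whose 32-bit displacement is relative to the next instruction, and the unrelocated displacement bytes fc ff ff ff are -4 in little-endian form. The sketch below redoes that arithmetic; call_target is a name made up for this illustration.

```shell
# Compute the target of an x86 near call (opcode e8 + 32-bit relative displacement).
# $1 = address of the call instruction
# $2 = displacement as an unsigned 32-bit value
call_target() {
    insn_addr=$(( $1 ))
    rel32=$(( $2 ))
    # Sign-extend the 32-bit displacement.
    if [ "$rel32" -ge 2147483648 ]; then
        rel32=$(( rel32 - 4294967296 ))
    fi
    # The displacement is relative to the next instruction, 5 bytes after the call.
    printf '%x\n' $(( insn_addr + 5 + rel32 ))
}

# Bytes fc ff ff ff read little-endian give 0xfffffffc, i.e. -4.
call_target 0x5752 0xfffffffc   # prints 5753
```

The result, 5753, is the address of the displacement field itself, which is exactly where the R_386_PC32 entry says the relocation will be applied.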
Another useful option available for x86-targeted builds of objdump is the -Mintel option. This asks objdump to use Intel ASM syntax instead of the default AT&T syntax:
objdump -dR -Mintel <library.so>
5752: e8 fc ff ff ff    call 5753 <free@plt+0xb3>
        5753: R_386_PC32 malloc
5757: 89 c3             mov ebx,eax
Note the difference in the mov instruction syntax: the operand order is reversed and the % register prefixes are gone.
To print relocation information while disassembling code from a static library (.a) rather than a shared library (.so), use the -r option instead of the -R option.
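The choice between -r and -R can be scripted. The following sketch picks the option from the file name alone, which is only a naming heuristic and not a check of the actual file format. The function name reloc_flag and the file names are hypothetical.

```shell
# Choose the relocation option from the file name (a heuristic, not a format check).
reloc_flag() {
    case "$1" in
        *.so|*.so.*) printf '%s\n' -R ;;   # shared object: dynamic relocations
        *.a|*.o)     printf '%s\n' -r ;;   # static library or object file: ordinary relocations
        *)           : ;;                  # unknown: no relocation option
    esac
}

# Print the command that would be used for each (hypothetical) file.
for f in libexample.so libexample.so.1 libexample.a; do
    echo "objdump -d $(reloc_flag "$f") $f"
done
```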
When dealing with code that was compiled from C++ source and still retains its symbols, those symbols will be mangled. For example:
_ZN7Decoder14parseBitStreamEll
To demangle, use the -C option (which accepts an optional STYLE argument for selecting a particular demangling convention). The above example is demangled to:
Decoder::parseBitStream(long, long)
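A single name can also be demangled outside of objdump with c++filt, a companion tool that normally ships with binutils. This is a minimal sketch that assumes c++filt may or may not be installed; the wrapper name demangle is made up for this example and falls back to printing the name unchanged.

```shell
# Demangle one symbol name with c++filt if it is available;
# otherwise print the name unchanged.
demangle() {
    if command -v c++filt >/dev/null 2>&1; then
        c++filt "$1"
    else
        printf '%s\n' "$1"
    fi
}

demangle _ZN7Decoder14parseBitStreamEll
```

In the mangled form, 7Decoder and 14parseBitStream are length-prefixed name components, and the trailing ll encodes the two long parameters.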
The standard -d option only disassembles sections of an ELF file that are expected to contain executable code, usually the .text section. To see other sections that might contain data (e.g., .rodata), use the -D option to disassemble all sections, regardless of whether they contain legitimate code. Often they do not, and the resulting disassembly is bogus, but the raw data bytes printed next to each bogus instruction can still be inspected. Further, use the -z option to print long blocks of zeros, which objdump otherwise skips by default:
objdump -Dz <library.a>
[...]
Disassembly of section .rodata:
00000000 <data_table>:
   0: 80 81 70 70 82 83 71    add BYTE PTR [ecx-2088603536],0x71
   7: 71 50                   jno 59 <gs_VLCDecodeTable+0x59>
   9: 50                      push eax
   a: 50                      push eax
[...]
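To see why such output is bogus, note that the large displacement in the first "instruction" is nothing more than four of the table's data bytes (70 70 82 83) read as a signed little-endian 32-bit integer. The sketch below reproduces that number; le32_signed is a name invented for this illustration.

```shell
# Interpret four bytes (hex, in file order) as a signed little-endian 32-bit integer.
le32_signed() {
    value=$(( 0x$1 | (0x$2 << 8) | (0x$3 << 16) | (0x$4 << 24) ))
    # Values with the top bit set are negative in two's complement.
    if [ "$value" -ge 2147483648 ]; then
        value=$(( value - 4294967296 ))
    fi
    printf '%d\n' "$value"
}

# Bytes 2 through 5 of the .rodata sample above.
le32_signed 70 70 82 83   # prints -2088603536
```

The disassembler has simply consumed table entries as an opcode, an addressing byte, a displacement, and an immediate, so the byte column is the part worth reading.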
To put it all together, this command line disassembles all sections of a static library, demangles C++ names, patches in relocation information, shows all blocks of zeros, and prints the disassembly using Intel-standard ASM syntax:
objdump -DCrz -Mintel <library.a>