Objdump

From MultimediaWiki

objdump is a standard component of the GNU binutils. It is useful for obtaining all kinds of information from an ELF file (or any other object file format it was built to support). This page describes some of its more common reverse engineering applications.

(If you work on the Win32 platform, the dumpbin.exe tool shipped with Visual Studio offers similar functionality.)

Installation

If you have a standard C/C++ development environment set up on your Linux box, you ought to already have the GNU binutils installed. Type 'objdump' at a shell prompt to find out; with no arguments it prints a usage summary. If it's not there, then you probably need to install the development toolchain for your system. The natively installed objdump will know how to take apart files built for your particular CPU architecture.
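
As a quick check, the following sketch tests whether objdump is on the PATH and, if so, prints its version banner. The helper name have_tool is made up for this example; objdump --version is a standard binutils option.

```shell
# Return success if the named tool can be found on PATH.
# (have_tool is a hypothetical helper, not part of binutils.)
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

if have_tool objdump; then
  # The first line of the banner names the installed binutils release.
  objdump --version | head -n 1
else
  echo "objdump not found; install your system's development toolchain" >&2
fi
```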

If you want to take apart ELF files compiled for a different architecture, you will need to build a separate copy of the binutils configured for that architecture target:

  • get the official binutils distribution: http://www.gnu.org/software/binutils/
  • unpack and enter binutils directory
  • ./configure --target=<arch> --prefix=<directory> --program-prefix=<prefix>
  • make && make install

About the configure options:

  • <arch> is the architecture to build for. Examine the file bfd/config.bfd to get an idea of what targets are available. As an example of what the target should look like, the target for PowerPC processor code stored in an ELF file is powerpc-elf.
  • <directory> is the base directory for the new binutils toolchain to be stored in. It helps to keep this separate from the native toolchain.
  • <prefix> indicates the prefix string that should be prepended to each of the tools on installation. For example, if the program prefix is "powerpc-" then the built objdump tool will be named powerpc-objdump.
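
To make the placeholders concrete, here is a minimal sketch that fills them in for the PowerPC example and prints the resulting configure line for review instead of running it. The install directory and the helper name configure_line are assumptions chosen for illustration; only the three configure options come from the steps above.

```shell
# Illustrative values; adjust them for your own setup.
TARGET=powerpc-elf               # target named in the text above
PREFIX="$HOME/cross/powerpc"     # hypothetical install directory
PROGRAM_PREFIX=powerpc-          # tools become powerpc-objdump, etc.

# Print the configure line rather than executing it.
# (configure_line is a hypothetical helper for this sketch.)
configure_line() {
  printf './configure --target=%s --prefix=%s --program-prefix=%s\n' "$1" "$2" "$3"
}

configure_line "$TARGET" "$PREFIX" "$PROGRAM_PREFIX"

# After running the printed line inside the unpacked binutils directory:
#   make && make install
# the new tool should then be available as:
#   "$PREFIX/bin/${PROGRAM_PREFIX}objdump"
```

Keeping the prefix directory outside the system paths means the cross tools can be removed again by deleting that one directory.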

Common Usage

objdump requires that you supply at least one option. Here are some of the more interesting options for RE:

 -d, --disassemble         Display assembler contents of executable sections
 -D, --disassemble-all     Display assembler contents of all sections
 -T, --dynamic-syms        Display the contents of the dynamic symbol table
 -r, --reloc               Display the relocation entries in the file
 -R, --dynamic-reloc       Display the dynamic relocation entries in the file
 -C, --demangle[=STYLE]    Decode mangled/processed symbol names
                            The STYLE, if specified, can be `auto', `gnu',
                            `lucid', `arm', `hp', `edg', `gnu-v3', `java'
                            or `gnat'
 -z, --disassemble-zeroes  Do not skip blocks of zeroes when disassembling

To disassemble an executable ELF file:

 objdump -d <binary>

To disassemble a shared object (.so) ELF file:

 objdump -dR <library.so>

The -R option is invaluable for dealing with relocatable code. Without it, many calls will appear to target a location inside the call instruction itself, e.g.:

   5752:       e8 fc ff ff ff          call   5753 <free@plt+0xb3>
   5757:       89 c3                   mov    %eax,%ebx

The actual address will be patched in by the dynamic linker when the file is loaded. The -R option asks objdump to annotate the disassembly with information about the dynamic relocation:

   5752:       e8 fc ff ff ff          call   5753 <free@plt+0xb3>
                       5753: R_386_PC32        malloc
   5757:       89 c3                   mov    %eax,%ebx
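
The odd-looking target 5753 follows directly from the instruction encoding. The operand bytes fc ff ff ff are a placeholder displacement of -4, and an x86 near call is relative to the address of the next instruction. For an R_386_PC32 relocation the linker computes S + A - P (symbol address, plus the in-place addend, minus the patched location), so the -4 addend is what turns the result into a displacement from the next instruction. The following sketch reproduces the arithmetic; call_target is a hypothetical helper name.

```shell
# call_target ADDR DISP32: where objdump thinks an x86 "e8" call lands.
# ADDR is the address of the e8 opcode; DISP32 is the four operand bytes
# read as a little-endian 32-bit value (fc ff ff ff -> 0xfffffffc).
call_target() {
  addr=$(( $1 ))
  disp=$(( $2 ))
  # Sign-extend the 32-bit displacement (0xfffffffc is -4).
  if [ "$disp" -ge 2147483648 ]; then
    disp=$(( disp - 4294967296 ))
  fi
  # The displacement is relative to the next instruction (opcode + 4 bytes).
  printf '%x\n' $(( addr + 5 + disp ))
}

call_target 0x5752 0xfffffffc   # prints 5753
```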

Another useful option available for x86-targeted builds of objdump is -Mintel, which asks objdump to use Intel ASM syntax instead of the default AT&T syntax:

 objdump -dR -Mintel <library.so>
   5752:       e8 fc ff ff ff          call   5753 <free@plt+0xb3>
                       5753: R_386_PC32        malloc
   5757:       89 c3                   mov    ebx,eax

Note the difference in the mov instruction: Intel syntax drops the % register prefixes and places the destination operand first.

To disassemble code from a static library (.a) rather than a shared library (.so) while printing relocation information, use the -r option instead of the -R option.
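
If you script this, the choice between the two flags can be made from the file name. The sketch below is one possible approach; reloc_flag is a hypothetical helper, and it assumes that anything not named like a shared object is a static library or plain object file.

```shell
# Pick the relocation flag from the file name.
# (reloc_flag is a hypothetical helper for this sketch.)
reloc_flag() {
  case "$1" in
    *.so|*.so.*) printf '%s\n' "-R" ;;   # shared object: dynamic relocations
    *)           printf '%s\n' "-r" ;;   # .a or .o: ordinary relocations
  esac
}

# Show the command lines that would be used for each kind of file.
echo "objdump -d $(reloc_flag libexample.a) libexample.a"
echo "objdump -d $(reloc_flag libexample.so) libexample.so"
```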

When dealing with code that was compiled from C++ source and still retains its symbols, those symbols will be mangled. For example:

 _ZN7Decoder14parseBitStreamEll

To demangle, use the -C option (which accepts an optional demangling style, GNU convention being the default). The above example is demangled to:

 Decoder::parseBitStream(long, long)
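
A single name can also be demangled outside of a disassembly with c++filt, another binutils tool that is not otherwise covered on this page. The sketch below wraps it in a hypothetical helper called demangle and assumes c++filt is installed alongside objdump.

```shell
# Demangle one symbol name with c++filt, if it is available.
# (demangle is a hypothetical helper for this sketch.)
demangle() {
  if command -v c++filt >/dev/null 2>&1; then
    c++filt "$1"
  else
    echo "c++filt not found" >&2
    return 1
  fi
}

# Should print the demangled form shown above when c++filt is present.
demangle _ZN7Decoder14parseBitStreamEll || true
```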

The standard -d option only disassembles sections of an ELF file that are expected to contain executable code, usually the .text sections. To see other sections that might contain data (e.g., .rodata sections), use the -D option to disassemble all sections, regardless of whether they contain legitimate code. Often they do not, and the resulting disassembly is bogus, but the raw data bytes in the hex column can still be inspected. Further, use the -z option to print long blocks of zeros, which objdump would otherwise omit:

 objdump -Dz -Mintel <library.a>
 [...]
 Disassembly of section .rodata:

 00000000 <data_table>:
   0:   80 81 70 70 82 83 71    add    BYTE PTR [ecx-2088603536],0x71
   7:   71 50                   jno    59 <gs_VLCDecodeTable+0x59>
   9:   50                      push   eax
   a:   50                      push   eax
 [...]
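
The first "instruction" in that listing shows how table data gets misread. The bytes 80 81 are taken as an opcode and ModRM byte, and the next four table bytes (70 70 82 83) are consumed as a little-endian 32-bit displacement, which is where the large negative number comes from. A small sketch of that decoding, using a hypothetical helper named le32_signed:

```shell
# le32_signed B0 B1 B2 B3: interpret four hex bytes, given in file order,
# as a signed little-endian 32-bit integer.
# (le32_signed is a hypothetical helper for this sketch.)
le32_signed() {
  # Little-endian: the last byte in the file is the most significant.
  v=$(( 0x$4$3$2$1 ))
  if [ "$v" -ge 2147483648 ]; then
    v=$(( v - 4294967296 ))
  fi
  printf '%s\n' "$v"
}

le32_signed 70 70 82 83   # prints -2088603536, the displacement shown above
```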

To put it all together, this command line disassembles all sections of a static library, demangles C++ names, annotates the output with relocation information, shows all blocks of zeros, and prints the disassembly using Intel ASM syntax:

 objdump -DCrz -Mintel <library.a>