IDA Pro
- Website: http://www.hex-rays.com/idapro/
- Demo version: http://www.hex-rays.com/idapro/idadowndemo.htm
- Freeware version (currently v4.9): http://www.hex-rays.com/idapro/idadownfreeware.htm
- Wingraph32 GPL source code: http://www.hex-rays.com/idapro/freefiles/wingraph32_src.zip
IDA Pro is a feature-rich disassembler and debugger that is very useful for reverse engineering.
It is proprietary software that runs on Windows and Linux; there is a zero-cost evaluation version.
Boosting the freeware version
The freeware version of IDA Pro is an invaluable tool. One important function is the ability to produce different flow graphs from the disassembly. These graphs can be quite messy and may need further processing. The version of wingraph32.exe that is included with the freeware version of IDA Pro cannot save the generated graphs. Replace it with the wingraph32.exe from a demo version of IDA Pro and the save option becomes available. Saved graphs have a .gdl suffix.
Converting gdl flow graphs to dot files
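Since there aren't any really good tools to display and edit gdl files, you can convert them to Graphviz dot files with the following script. It takes the gdl file as its only argument and writes the result next to it with .dot appended to the name: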
#!/usr/bin/perl
# Convert a gdl flow graph saved by wingraph32 into a dot file.
use strict;
use warnings;

@ARGV == 1 or die "Usage: $0 file.gdl\n";
my $FILE1 = $ARGV[0];

# read the whole gdl file into one string
open(my $in, '<', $FILE1) or die "Cannot read $FILE1: $!\n";
my $indata = do { local $/; <$in> };
close($in);

open(my $out, '>', "$FILE1.dot") or die "Cannot write $FILE1.dot: $!\n";

my @split = split(/node:/, $indata);
# everything before the first node is the graph header
my $graphname = shift @split;
$graphname =~ s/^.*title:[^"]*"([^"]*)".*$/$1/s;

print $out "digraph \"$graphname\" {\n";
print $out "\tgraph [\n";
print $out "\t]\n";
print $out "\tnode [\n";
print $out "\t\tshape = \"box\"\n";
print $out "\t]\n";
print $out "\tedge [\n";
print $out "\t]\n";

# convert nodes
foreach my $n (@split) {
    $n =~ s/}.*$//s;    # cut everything after the node body
    my $label = my $title = $n;
    $title =~ s/^.*title:[^"]*"([^"]*)".*$/$1/s;
    $label =~ s/^.*label:[^"]*"([^"]*)".*$/$1/s;
    $label =~ s/\n/\\n/sg;    # dot expects a literal \n inside labels
    print $out "\t\"$title\" [\n";
    print $out "\t\tlabel = \"$label\"\n";
    print $out "\t];\n";
}

@split = split(/edge:/, $indata);
shift @split;

# convert edges
foreach my $e (@split) {
    $e =~ s/}.*$//s;    # cut everything after the edge body
    my $color = my $label = my $source = my $target = $e;
    $source =~ s/^.*sourcename:[^"]*"([^"]*)".*$/$1/s;
    $target =~ s/^.*targetname:[^"]*"([^"]*)".*$/$1/s;
    print $out "\t\"$source\" -> \"$target\" [\n";
    # label and color are optional, only print them when present
    if ($label =~ s/^.*label:[^"]*"([^"]*)".*$/$1/s) {
        $label =~ s/\n/\\n/sg;
        print $out "\t\tlabel = \"$label\"\n";
    }
    if ($color =~ s/^.*color:[[:space:]]*([^ ]*)[[:space:]}].*$/$1/s) {
        print $out "\t\tcolor = $color\n";
    }
    print $out "\t];\n";
}

print $out "}\n";
close($out);
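To make the mapping from gdl to dot concrete, here is a minimal sketch that applies simplified versions of the title and label patterns to a single node. The sample node text is hand-written to match what the script above expects, not captured from wingraph32, and the helper name gdl_node_to_dot is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn the text of one gdl node into a dot node
# statement, using simplified versions of the patterns in the script above.
sub gdl_node_to_dot {
    my ($node) = @_;
    $node =~ s/\}.*$//s;                      # keep only the node body
    my ($title) = $node =~ /title:[^"]*"([^"]*)"/s;
    my ($label) = $node =~ /label:[^"]*"([^"]*)"/s;
    $label =~ s/\n/\\n/g;                     # real newline -> literal \n
    return "\"$title\" [ label = \"$label\" ];";
}

# Assumed sample input (hand-written, not from a real wingraph32 file).
my $sample = qq{ { title: "0" label: "push ebp\nmov ebp, esp" } };

# prints: "0" [ label = "push ebp\nmov ebp, esp" ];
print gdl_node_to_dot($sample), "\n";
```

The newline inside the gdl label becomes the two characters \n in the output, which is how dot represents a line break inside a node label.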
Recovering the function prototypes from the disassembly
Full recovery is not really possible, but a good hint can be obtained without much work. First load the binary you want to analyse; IDA Pro will process the file for some time. If you start browsing around you will see that IDA Pro makes a good guess at how many arguments a function has. That is one thing that can be extracted. The other is whether a function returns something or not. The usual x86 C calling conventions leave integer and pointer return values in the eax register. So if eax is read after a call, before anything overwrites it, the called function probably returned something; if eax is overwritten first, the return value (if any) was ignored. The following not-so-nice Perl script processes the asm file that IDA Pro produces with Produce->ASM-File (Alt+F10). It assumes tab-separated instruction lines and writes one line per function (return type guess, name, argument count) to the output file.
#!/usr/bin/perl
# Guess argument counts and return types from an IDA Pro asm listing.
use strict;
use warnings;

# Argument parser: the -f command is a dummy and may be omitted
shift @ARGV if @ARGV && $ARGV[0] eq "-f";
if (@ARGV != 2) {
    print "Usage:\n";
    print "./argumentcounter.pl [command] file.asm output.txt\n";
    print "-f dummy command\n";
    exit;
}
my ($FILE1, $FILE2) = @ARGV;

# File IO
open(my $in, '<', $FILE1) or die "Cannot read $FILE1: $!\n";
my @indata = <$in>;
close($in);
open(my $out, '>', $FILE2) or die "Cannot write $FILE2: $!\n";

my @functions;           # function names in order of appearance
my @arguments;           # number of arguments per function
my (%void, %nonvoid);    # votes per called function
my $current = -1;        # index of the function being parsed
my $f_start = 0;         # true between "proc near" and "endp"
my $callcheck = 0;       # true while inspecting the lines after a call
my $rowcounter = 0;      # lines seen since the last call
my $retfunction = "";    # name of the function that was called last

foreach my $rad (@indata) {
    $rowcounter++;
    if ($rad =~ m/^(.*?) proc near/) {
        push @functions, $1;
        push @arguments, 0;
        $current = $#functions;
        $f_start = 1;
    }
    # IDA lists stack arguments as arg_0, arg_4, ... inside the function
    if ($f_start && substr($rad, 0, 3) eq "arg") {
        $arguments[$current]++;
    }
    if ($rad =~ m/ endp/) {
        $f_start = 0;
    }

    # return type detector
    if ($rad =~ m/call\t(.*)/s) {
        $retfunction = $1;
        # drop the line ending and any trailing comment
        $retfunction =~ s/\s*(?:;.*)?\z//s;
        $rowcounter = 0;
        $callcheck = 1;
    }
    if ($callcheck && $rowcounter > 0) {
        # expected layout: <tab><tab>instruction<tab>operands
        my @tmpsplit = split /\t/, $rad;
        my $instruction = defined $tmpsplit[2] ? $tmpsplit[2] : "";
        my $operands    = defined $tmpsplit[3] ? $tmpsplit[3] : "";
        s/\s+$// for $instruction, $operands;
        my @argum = ("", "");
        if ($operands =~ m/, /) {
            @argum = split /, /, $operands;
        }
        # any instruction that touches eax ends the search
        if ($argum[0] eq "eax" || $argum[1] eq "eax") {
            $callcheck = 0;
        }
        # eax is the source operand: the return value is used
        if ($argum[1] eq "eax" && $instruction ne "xor") {
            $nonvoid{$retfunction}++;
        }
        # eax is overwritten before being read: the return value is ignored
        if ($argum[0] eq "eax" && $argum[1] ne "eax") {
            $void{$retfunction}++;
        }
        # stop searching when the current function returns
        if ($instruction =~ /retn/) {
            $callcheck = 0;
            $void{$retfunction}++;
        }
        # give up after 10 lines
        if ($rowcounter > 10) {
            $callcheck = 0;
        }
    }
}

for my $i (0 .. $#functions) {
    # returntype handling
    my $name = $functions[$i];
    my $v    = $void{$name}    || 0;
    my $nv   = $nonvoid{$name} || 0;
    my $returntype = "unknown";
    $returntype = "void" if $v > 0;
    $returntype = "int"  if $nv > 0;
    # conflicting evidence: print both vote counts
    $returntype = "void $v - $nv int" if $v > 0 && $nv > 0;
    print $out "$returntype $name $arguments[$i]\n";
}
close($out);
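The return-value heuristic is easier to follow in isolation. The sketch below is not the script above: it assumes a simplified listing with one instruction per line, it decides on the first instruction after each call that reads or overwrites eax, and it treats xor eax, eax as an overwrite. The helper name classify_calls, the function names and the sample listing are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: for every call in a listing, report whether eax is
# read ("int") or overwritten ("void") by the first instruction touching it.
sub classify_calls {
    my @lines = @_;
    my (%result, $pending);
    for my $line (@lines) {
        my ($instr, $ops) = $line =~ /^\s*(\w+)\s*(.*?)\s*$/ or next;
        if ($instr eq 'call') {
            $pending = $ops;              # start watching after this call
            next;
        }
        next unless defined $pending;
        my ($dst, $src) = split /,\s*/, $ops;
        $dst = '' unless defined $dst;
        $src = '' unless defined $src;
        if ($src =~ /\beax\b/ && $instr ne 'xor') {
            $result{$pending} = 'int';    # eax is read: a value came back
            undef $pending;
        } elsif ($dst eq 'eax' || $instr =~ /^ret/) {
            $result{$pending} = 'void';   # eax clobbered, or caller returned
            undef $pending;
        }
    }
    return %result;
}

# Made-up sample listing (assumption: not real IDA Pro output).
my @asm = (
    "\tcall\tget_value",
    "\tmov\t[ebp-4], eax",
    "\tcall\tlog_message",
    "\tmov\teax, 1",
    "\tretn",
);
my %types = classify_calls(@asm);
print "$_ returns $types{$_}\n" for sort keys %types;
```

On the sample listing this classifies get_value as int, because eax is stored right after the call, and log_message as void, because eax is overwritten before it is read. As with the full script, a "void" result only means the caller ignored the value at that call site, so votes from several call sites are more reliable than one.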