Sunday, August 28, 2005

ArchShadow Status and History

Well, I'm in the process of rewriting a large part of the core of my decompiler, ArchShadow, which is nearing its third year of development (though not on the same codebase).

ArchShadow originally started as a proof of concept decompiler that was fully standalone. It was written in PHP (please don't martyr me for that) and used its own, very poorly written, disassembler. It actually could decompile a number of large test binaries, but it was gcc-specific and very specific to certain ways of using constructs. It was a hack, by any definition of the word.

From there it grew into a pure C project, still implementing its own disassembler. This implementation didn't last long, as it simply wasn't worth the hassle for the limited return. It did make the disassembler core a decent bit cleaner, though.

After that, I worked on an implementation in Python, still using my own disassembler. This lasted for a while and let me get a lot of the SSA theory down. Eventually it was eliminated when the analysis work on the disassembler side (especially function detection) got to be too big to handle.

All of this taught me one big, very important lesson. If the option to use an existing disassembler is there, USE IT.

ArchShadow is now in Python, sitting on top of IDA. It reads a good bit of information from the IDA database and caches it, which lets you run it away from IDA as long as you don't change the code to the point that the information read from the database would be different. I'll eventually make it pull the entire IDA database so that's not such a big deal, but that's going to be a while in coming. The current version works well enough for now.
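To make the caching idea concrete, here is a rough sketch of how it could look. This is not ArchShadow's actual code: `load_snapshot`, the JSON file format, and the `extract_from_ida` callback are all made up for illustration, and the callback stands in for whatever code really talks to IDA.

```python
import json
import os


def load_snapshot(cache_path, extract_from_ida):
    """Return the cached database info, running the extractor only on a cache miss.

    `extract_from_ida` is a hypothetical callable that returns a
    JSON-serializable dict; it is the only part that needs IDA running.
    """
    if os.path.exists(cache_path):
        # Cache hit: everything after this point works away from IDA.
        with open(cache_path) as f:
            return json.load(f)

    # Cache miss: pull the data once and save it for later runs.
    snapshot = extract_from_ida()
    with open(cache_path, "w") as f:
        json.dump(snapshot, f)
    return snapshot
```

The catch is the same one described above: nothing here notices when the database has changed, so a stale cache file has to be deleted by hand.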

The problem with the current revision is that my SSA support for variables (used for detecting the modifications to different things over the course of a given function) is simply poor. It works, but to change a name from its SSA name (var_#) I have to do a string replace, which simply feels like a hack. I'm going to address this in my partial rewrite.
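One possible way around the string replace, sketched under my own assumptions rather than taken from the ArchShadow code: have statements hold references to variable objects, so a rename is a single attribute change. The `SSAVar` and `Assign` classes below are hypothetical names for illustration.

```python
class SSAVar:
    """One SSA version of a variable; the name is only a display label."""

    def __init__(self, index):
        self.index = index
        self.name = "var_%d" % index

    def __str__(self):
        return self.name


class Assign:
    """A statement `dest = lhs op rhs` that references SSAVar objects."""

    def __init__(self, dest, lhs, op, rhs):
        self.dest, self.lhs, self.op, self.rhs = dest, lhs, op, rhs

    def __str__(self):
        return "%s = %s %s %s" % (self.dest, self.lhs, self.op, self.rhs)


# var_1 and var_10 are inputs; var_2 and var_3 are each assigned exactly once.
v1, v2, v3, v10 = SSAVar(1), SSAVar(2), SSAVar(3), SSAVar(10)
stmts = [Assign(v2, v1, "+", v10), Assign(v3, v2, "*", v1)]

# Renaming touches one object; every statement that uses it picks up the change.
v1.name = "count"
rendered = [str(s) for s in stmts]

# The string-replace approach also rewrites the prefix of var_10.
naive = "var_2 = var_1 + var_10".replace("var_1", "count")
```

Here `rendered` uses `count` wherever `var_1` appeared and leaves `var_10` alone, while `naive` mangles `var_10` because `var_1` is a prefix of it.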

Anyway, enough history.

I'm considering building a system that can export data from an IDA database for use in an external interface. The main reason is that the IDA interface on Linux is very poor, and there's absolutely no way for me to run any sort of IDA interface natively on OS X as it stands. Another option is writing a network layer where tvision currently sits in the Linux version and building a GUI that works with that, but I'm not sure what is exposed to tvision.

Anyway, enough wasted time for now...

Take care.
Cody Brocious

1 comment:

Anonymous said...

Er, excuse this random post but found your pyMusique app and I just wish to say well done and keep up the good work! Will give it a try soon and, heavens!, even make a donation.