Disassemblers & Decompilers

Jun 25, 2022

Computer Science

To better understand what disassemblers and decompilers actually do, let’s recall what machine code, assembly code and assemblers are first.

Machine code is literally the 0s and 1s that encode CPU-specific instructions, which the CPU can execute directly. When you see it, it’s usually displayed in hexadecimal, but either way it’s practically unreadable to us.
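To make that concrete, here is a small sketch that prints the same six bytes of machine code in hexadecimal and in binary. The bytes are the x86-64 encoding of `mov eax, 1` followed by `ret`; the helper names `as_hex` and `as_bits` are made up for this illustration.

```python
# Six bytes of x86-64 machine code: `mov eax, 1` followed by `ret`.
# 0xB8 is "mov eax, imm32", the next four bytes are the little-endian
# immediate 1, and 0xC3 is "ret".
machine_code = bytes([0xB8, 0x01, 0x00, 0x00, 0x00, 0xC3])


def as_hex(code: bytes) -> str:
    """Render each byte as two hex digits, the way most tools display it."""
    return " ".join(f"{byte:02x}" for byte in code)


def as_bits(code: bytes) -> str:
    """Render each byte as eight binary digits: the actual 0s and 1s."""
    return " ".join(f"{byte:08b}" for byte in code)


print(as_hex(machine_code))   # b8 01 00 00 00 c3
print(as_bits(machine_code))  # 10111000 00000001 00000000 00000000 00000000 11000011
```

Neither form tells you much at a glance, which is exactly the problem assembly language solves.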

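Many executables (compiled programs, as opposed to scripts) consist mostly of machine code plus some metadata, such as headers and symbol tables. On Linux systems, where executables normally use the ELF format, you can inspect some of that metadata with readelf or objdump.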
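As a rough idea of what that metadata looks like, the sketch below decodes a few fields from the first 20 bytes of an ELF header, similar in spirit to the top of what `readelf -h` prints. The function name `describe_elf_ident` and the hand-built `sample_header` are assumptions for this example; it only covers a handful of fields and is not a full ELF parser.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def describe_elf_ident(header: bytes) -> dict:
    """Decode a few fields from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")

    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unrecognised ELF class or data encoding")

    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields right after the
    # 16-byte identification block.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)

    return {
        "bits": 32 if ei_class == 1 else 64,
        "endianness": "little" if ei_data == 1 else "big",
        "type": e_type,        # e.g. 2 = executable, 3 = shared object / PIE
        "machine": e_machine,  # e.g. 62 = x86-64
    }


# A hand-built header for illustration: 64-bit, little-endian,
# type 3 (shared object / PIE), machine 62 (x86-64).
sample_header = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
print(describe_elf_ident(sample_header))
```

On a Linux machine you could feed it the first 20 bytes of a real binary instead, read with `open(path, "rb").read(20)`.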

Assembly language is a human-readable programming language that is tied to a specific CPU architecture, so it’s not portable in the way that languages like C are. It exists as a way to reason about and work with CPU instructions using actual names and arguments.

Today, assembly language is really only written by engineers who have to implement or modify performance-critical algorithms, or who want access to specialised processor instructions that they otherwise couldn’t use.

Assemblers are programs that convert assembly language to machine code.
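Here is a toy assembler to show the idea. It is only a sketch: it understands three x86-64 instructions (`nop`, `ret` and `mov <reg>, <number>` for four 32-bit registers), and the names `assemble_line` and `assemble` are made up for this example. Real assemblers also handle labels, addressing modes, sections and much more.

```python
import struct

# Register numbers used by the x86 "mov r32, imm32" encoding (0xB8 + number).
REGISTERS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}


def assemble_line(line: str) -> bytes:
    """Translate one line of (very limited) assembly into machine code."""
    parts = line.replace(",", " ").split()
    if not parts:
        return b""  # blank line: nothing to emit

    mnemonic, operands = parts[0].lower(), parts[1:]

    if mnemonic == "nop" and not operands:
        return b"\x90"
    if mnemonic == "ret" and not operands:
        return b"\xc3"
    if mnemonic == "mov" and len(operands) == 2 and operands[0].lower() in REGISTERS:
        register = REGISTERS[operands[0].lower()]
        immediate = int(operands[1], 0) & 0xFFFFFFFF  # accepts 42 or 0x2a
        # One opcode byte, then the 32-bit value in little-endian order.
        return bytes([0xB8 + register]) + struct.pack("<I", immediate)

    raise ValueError(f"unsupported instruction: {line!r}")


def assemble(source: str) -> bytes:
    """Assemble a multi-line program by concatenating each line's bytes."""
    return b"".join(assemble_line(line) for line in source.splitlines())


program = """
mov eax, 1
ret
"""
print(assemble(program).hex(" "))  # b8 01 00 00 00 c3
```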

Many compilers will first convert source code to assembly, then let an assembler convert assembly to machine code. Doing it this way is generally easier for the person implementing the compiler.

Disassemblers are programs that go the other way, translating machine code back into assembly language. That makes them incredibly useful tools for reverse engineering: assembly language is a lot more readable than machine code, so you can derive a lot of insight from just the executable binary itself!
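A toy disassembler makes the direction of travel clear. This sketch only recognises three x86-64 encodings (`nop`, `ret` and `mov <reg>, imm32` for four registers) and gives up on anything else; the function name `disassemble` is made up for this example. Real disassemblers have to decode the full instruction set and decide which bytes are code and which are data.

```python
import struct

# Index in this list = register number in the "mov r32, imm32" opcode (0xB8 + n).
REGISTER_NAMES = ["eax", "ecx", "edx", "ebx"]


def disassemble(code: bytes) -> list[str]:
    """Turn a (very limited) subset of x86-64 machine code into assembly text."""
    lines = []
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        if opcode == 0x90:
            lines.append("nop")
            offset += 1
        elif opcode == 0xC3:
            lines.append("ret")
            offset += 1
        elif 0xB8 <= opcode <= 0xBB:
            # One opcode byte followed by a 32-bit little-endian immediate.
            if offset + 5 > len(code):
                raise ValueError(f"truncated instruction at offset {offset}")
            (immediate,) = struct.unpack_from("<I", code, offset + 1)
            lines.append(f"mov {REGISTER_NAMES[opcode - 0xB8]}, {immediate:#x}")
            offset += 5
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {offset}")
    return lines


for line in disassemble(bytes.fromhex("b801000000c3")):
    print(line)
# mov eax, 0x1
# ret
```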

There are lots of disassemblers used in CTFs and in industry. Two examples are GDB (a debugger with a built-in disassembler) and IDA.

Decompilers take it a step further than disassemblers. They are programs that translate machine code into more readable pseudo-source-code, essentially a reverse compiler! It’s an incredibly hard problem, partly because compiling throws away information such as variable names and comments. Still, there are tools we can take advantage of in CTFs, like the Hex-Rays decompiler for IDA, which do a reasonable job of ‘recovering’ the source code of executables. Sometimes, as attackers, we can use decompilers to find vulnerabilities to exploit.

Thanks for reading 🤓! Let me know if you found this interesting.
