To better understand what disassemblers and decompilers actually do, let’s first recall what machine code, assembly code and assemblers are.
Machine code is literally the 0s and 1s that encode CPU-specific instructions the CPU can execute directly. When you see it, it’s usually displayed in hexadecimal, but either way it’s unreadable to us. You can inspect a binary with tools such as readelf or objdump.
Assembly language is a human-readable programming language that is tied to a specific CPU architecture, so it’s not portable in the way that languages like C are. It exists so that we can reason about and work with CPU instructions using actual names and arguments.
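To make the contrast concrete, here is a small sketch that prints the same machine code as raw bits and as hexadecimal. The byte sequence is an illustrative example chosen for this sketch (a tiny x86-64 function that returns 42), and the helper names `as_bits` and `as_hex` are hypothetical.

```python
# Illustrative x86-64 machine code for a function that returns 42.
# The matching assembly is shown in the comments:
#   55              push rbp
#   48 89 e5        mov  rbp, rsp
#   b8 2a 00 00 00  mov  eax, 42
#   5d              pop  rbp
#   c3              ret
code = bytes([0x55, 0x48, 0x89, 0xE5, 0xB8, 0x2A, 0x00, 0x00, 0x00, 0x5D, 0xC3])


def as_bits(data: bytes) -> str:
    """Render each byte as eight binary digits."""
    return " ".join(f"{b:08b}" for b in data)


def as_hex(data: bytes) -> str:
    """Render each byte as two hexadecimal digits."""
    return " ".join(f"{b:02x}" for b in data)


print(as_bits(code))
print(as_hex(code))
```

Neither form tells a human much on its own; the assembly in the comments is what makes the bytes meaningful.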
Assemblers are programs that convert assembly language to machine code.
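As a rough illustration of the idea, the sketch below is a toy assembler that knows only a handful of x86-64 instructions. The instruction table and the function names `assemble_line` and `assemble` are assumptions made for this example; a real assembler handles far more instructions, operands, labels and file formats.

```python
import struct

# Toy table of instructions with fixed encodings (a tiny subset of x86-64).
FIXED = {
    "nop": b"\x90",
    "ret": b"\xc3",
    "push rbp": b"\x55",
    "pop rbp": b"\x5d",
    "mov rbp, rsp": b"\x48\x89\xe5",
}


def assemble_line(line: str) -> bytes:
    """Translate one line of assembly into machine code bytes."""
    text = " ".join(line.lower().split())  # normalise case and whitespace
    if text in FIXED:
        return FIXED[text]
    prefix = "mov eax, "
    if text.startswith(prefix):
        # Opcode 0xB8 followed by a 32-bit little-endian immediate.
        imm = int(text[len(prefix):], 0)
        return b"\xb8" + struct.pack("<I", imm & 0xFFFFFFFF)
    raise ValueError(f"unsupported instruction: {line!r}")


def assemble(source: str) -> bytes:
    """Assemble every non-empty line of the source text."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    return b"".join(assemble_line(ln) for ln in lines)


program = """
push rbp
mov rbp, rsp
mov eax, 42
pop rbp
ret
"""
print(assemble(program).hex())
```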
Disassemblers are programs that go the other way, translating machine code back into assembly language. That makes them incredibly useful tools for reverse engineering: assembly language is a lot more readable than machine code, so you can derive a lot of insight from just the executable binary itself!
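The following is a minimal sketch of what a disassembler does, assuming a tiny hand-picked subset of x86-64 opcodes. The table and the `disassemble` function are hypothetical and written for this example; real tools decode the full instruction set and the surrounding file format.

```python
import struct

# Single-byte opcodes this toy decoder recognises (a tiny subset of x86-64).
ONE_BYTE = {0x90: "nop", 0xC3: "ret", 0x55: "push rbp", 0x5D: "pop rbp"}


def disassemble(code: bytes) -> list[str]:
    """Walk the byte string and emit one assembly line per decoded instruction."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if op in ONE_BYTE:
            out.append(ONE_BYTE[op])
            i += 1
        elif code[i:i + 3] == b"\x48\x89\xe5":
            out.append("mov rbp, rsp")
            i += 3
        elif op == 0xB8 and i + 5 <= len(code):
            # 0xB8 is followed by a 32-bit little-endian immediate.
            (imm,) = struct.unpack_from("<I", code, i + 1)
            out.append(f"mov eax, {imm:#x}")
            i += 5
        else:
            # Unknown byte: emit it as raw data and move on.
            out.append(f".byte {op:#04x}")
            i += 1
    return out


for line in disassemble(bytes.fromhex("554889e5b82a0000005dc3")):
    print(line)
```

Note that instructions have different lengths, so the decoder has to know how many bytes each opcode consumes before it can find the start of the next one.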
There are lots of disassemblers used in CTFs and in industry. Two examples are GDB (a debugger with a built-in disassembler) and IDA.
Decompilers go a step further than disassemblers. They are programs that translate machine code into more readable pseudo-source-code, essentially a compiler in reverse! It’s an incredibly hard problem, but there are tools we can take advantage of in CTFs, such as the Hex-Rays decompiler for IDA, which do a reasonable job of ‘recovering’ the source code of executables. As attackers, we can sometimes use decompilers to find vulnerabilities to exploit.
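To give a feel for the idea, here is a deliberately simplistic sketch that "lifts" a few assembly lines into pseudo-C using hard-coded rules. The `lift` function and its rules are assumptions made for this example and bear no resemblance to how a real decompiler works internally; real ones rely on control-flow and data-flow analysis, type recovery and much more.

```python
# Instructions that only set up or tear down the stack frame.
FRAME_NOISE = {"push rbp", "mov rbp, rsp", "pop rbp", "nop"}


def lift(instructions: list[str]) -> list[str]:
    """Turn a list of assembly lines into pseudo-C lines (toy rules only)."""
    body = []
    ret_val = None
    for ins in instructions:
        if ins in FRAME_NOISE:
            continue  # no source-level meaning in this toy model
        if ins.startswith("mov eax, "):
            # eax carries the integer return value in common x86-64 calling conventions.
            ret_val = int(ins.split(", ")[1], 0)
        elif ins == "ret":
            body.append(f"return {ret_val};" if ret_val is not None else "return;")
        else:
            body.append(f"/* unhandled: {ins} */")
    return ["int func(void) {"] + ["    " + line for line in body] + ["}"]


asm = ["push rbp", "mov rbp, rsp", "mov eax, 0x2a", "pop rbp", "ret"]
print("\n".join(lift(asm)))
```

Even in this toy case the original function name, variable names and comments are gone for good, which is part of why decompiler output is only an approximation of the source.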