1 - Reverse Engineering x86 Processor Microcode

Microcoded decode units in a processor act as the finite state machine logic that decodes macroinstructions (the architected instructions visible to the operating system or application programmer) into microinstructions that the processor’s functional units can execute. Using microcode instead of hardwired logic allows changes late in the development cycle (fabricating hardware is time-consuming and expensive) and lets macroinstructions that would be too expensive to implement directly in hardware be built from a series of simpler microinstructions. Microcode is typically stored in on-chip ROM and is updated by writing patches to on-chip RAM during the early stages of booting the processor. When executing a microcoded instruction, the processor first checks RAM for an updated implementation before executing from ROM.
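The ROM-plus-patch-RAM lookup described above can be sketched as a toy model. This is only an illustration: the class name, the entry addresses, and the micro-op strings are all hypothetical, and real hardware does not expose anything like this interface.

```python
from typing import Dict, List

MicroOps = List[str]


class MicrocodeStore:
    """Toy model: a fixed ROM plus a volatile patch RAM (all names hypothetical)."""

    def __init__(self, rom: Dict[int, MicroOps]) -> None:
        self.rom = dict(rom)                      # fixed when the chip is fabricated
        self.patch_ram: Dict[int, MicroOps] = {}  # filled by an update early in boot

    def apply_update(self, entry: int, micro_ops: MicroOps) -> None:
        # An update can only override an entry point that already exists in ROM.
        if entry not in self.rom:
            raise KeyError(f"no ROM entry at {entry:#x}")
        self.patch_ram[entry] = list(micro_ops)

    def fetch(self, entry: int) -> MicroOps:
        # Patch RAM is consulted first; ROM is the fallback.
        if entry in self.patch_ram:
            return self.patch_ram[entry]
        return self.rom[entry]

    def reset(self) -> None:
        # Modelled as volatile: a reset discards the patch, so the ROM version returns.
        self.patch_ram.clear()


store = MicrocodeStore({0x10: ["rom_div_step_1", "rom_div_step_2"]})
store.apply_update(0x10, ["patched_div_step"])
print(store.fetch(0x10))
```

The point of the sketch is the precedence rule in fetch: the ROM contents never change, yet the behaviour of an instruction can.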

Koppe et al. took two approaches to reverse engineering the microcode on AMD’s K8 and K10 architectures: 1) executing microinstruction streams and inspecting processor state to identify instruction formats and 2) physically delayering the chip and inspecting its hardware structures to read the microcode stored in the ROM. To get their own microcode to execute on the processors, the researchers first had to reverse engineer the microcode update functionality. Once they could load their own code onto the chip, they could randomly inject data (what is code but a series of bits?) and inspect the processor state to infer what each instruction does (if the processor crashed, an invalid instruction had probably been executed). Reading the on-chip ROM mostly served to verify their findings.
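The inject-and-observe idea can be illustrated with a small simulation. Everything here is an assumption made for the sake of the example: the 8-bit encoding, the two-register state, and the function names are invented, and none of it reflects the real K8/K10 microinstruction format or the authors’ actual tooling.

```python
import random
from collections import defaultdict


class Crash(Exception):
    """Stands in for the processor hanging or faulting on an invalid encoding."""


def toy_cpu(word, regs):
    # The "hidden hardware". Hypothetical encoding: the top two bits select the operation.
    op = word >> 6
    r0, r1 = regs
    if op == 0:
        return ((r0 + r1) & 0xFF, r1)  # add
    if op == 1:
        return (r0 ^ r1, r1)           # xor
    if op == 2:
        return (r0, r0)                # copy r0 into r1
    raise Crash                        # op == 3 is an invalid instruction


def probe(word, regs=(0x0C, 0x0A)):
    # Run one candidate from a known state and record what can be observed.
    try:
        return toy_cpu(word, regs)
    except Crash:
        return "crash"


def infer_classes(samples=64, seed=0):
    # Inject random bit patterns and group them by their observable effect.
    rng = random.Random(seed)
    classes = defaultdict(list)
    for word in rng.sample(range(256), samples):
        classes[probe(word)].append(word)
    return classes


for effect, words in infer_classes().items():
    print(effect, sorted({w >> 6 for w in words}))
```

In this toy, every group of words with the same observed effect shares the same top two bits, which is the kind of regularity that lets an analyst guess where the opcode field sits.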

Their findings allow for fine-grained instrumentation (e.g. “how many times have we executed the div instruction?”) or installing a trojan (e.g. the div instruction could be hooked by a RAM update to execute arbitrary code whenever two magic numbers are divided). These findings seem like a very big deal(!), but are tempered by the fact that most modern processors require cryptographically signed microcode updates (K8 and K10 were chosen because they don’t). This means that microcode isn’t really available to any developers (boo!) or hackers (yay!). I admire Koppe et al.’s approach to automating the clunky job of figuring out what-bits-do-what in a microinstruction, and their use of patents and previous research to inform and hone their analysis. Unfortunately, since these chips are dated (K8 dates to 2003 and K10 to 2007), I see their work primarily as a template for further research, should it prove possible to get past the signed code updates on today’s processors.
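Both uses mentioned above, counting executions and triggering on magic operands, amount to wrapping the original div behaviour. The sketch below models that at a high level in Python rather than in microcode; the magic values, the class name, and the payload are hypothetical and are not taken from the paper.

```python
from typing import Callable

# Hypothetical trigger operands, chosen only for illustration.
MAGIC_DIVIDEND = 0xDEADBEEF
MAGIC_DIVISOR = 0x1337


def rom_div(dividend: int, divisor: int) -> int:
    """Stands in for the unmodified div implementation in ROM."""
    return dividend // divisor


class PatchedDiv:
    """Toy stand-in for a RAM patch that hooks div (not real microcode)."""

    def __init__(self, original: Callable[[int, int], int],
                 payload: Callable[[], None]) -> None:
        self.original = original
        self.payload = payload
        self.executions = 0  # instrumentation: how many times div has run

    def __call__(self, dividend: int, divisor: int) -> int:
        self.executions += 1
        # Trojan trigger: fires only for one specific operand pair.
        if (dividend, divisor) == (MAGIC_DIVIDEND, MAGIC_DIVISOR):
            self.payload()
        # The correct result is still returned, so callers see nothing unusual.
        return self.original(dividend, divisor)


log = []
div = PatchedDiv(rom_div, lambda: log.append("payload ran"))
div(10, 2)
div(MAGIC_DIVIDEND, MAGIC_DIVISOR)
print(div.executions, log)
```

Because the hooked version still produces the right quotient, ordinary software has no obvious way to notice the difference, which is what makes the trojan case worrying.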