Technical Perspective: How Exploits Impact Computer Science Theory

Computing systems as we build them today tend to have a curious property: some combinations of inputs and external events cause them to behave against their builders’ intent repeatedly and reliably. Techniques to make them behave so are called exploits. We say that an exploitable system is vulnerable—and the exploit is a constructive proof of vulnerability. Distressingly, exploits appear to be ubiquitous in both software and hardware of our computing infrastructure.

Should exploits be a concern of computer science theory? Can they tell us about fundamental properties of computing rather than mere human errors of implementation? Or is there something about the fundamentals of computing that makes exploits endemic to our very models of computation?

The accompanying paper presents one of the finest pieces of evidence that says yes, they should, and yes, they can. It joins a growing body of examples of endemic exploits and of exploits having the expressive power of general-purpose programming, with computation models as deep as those of CPUs, ISAs, and ABIs/APIs. The authors introduce just such a model arising from the essential complexity of modern CPUs; the only obvious way to suppress it is to reject that essential complexity and the performance it delivers. With such results, we can see that exploitability is a deep computational property of the underlying system, calling for a comprehensive theory response.

That theory response is long overdue. Empirically, it is as if any implementation of the intended computing functionality invariably casts a long shadow of shocking yet repeatable emergent behaviors. These behaviors, rather than being sparse and fleeting, seem to inevitably form entire unintended but robust mechanisms that allow attackers to construct exploits despite multiple layers of security measures. It appears that no modern computing system ends up being only and exactly what it was meant to be.

In the not-so-distant past, exploits could be dismissed as crafty but ultimately ad hoc and idiosyncratic inventions, with no big lessons for computing theory or the natural science of its applications. In the 1990s and 2000s, the very existence of exploits seemed precarious, a mere unfortunate confluence of implementation details: for example, C/C++ function activation frames that held both call stack return addresses and arrays prone to being overwritten by naively implemented string copy functions, combined with executable stack memory on x86 CPUs. It seemed that a few well-placed changes to the hardware and the compilers, although still economically non-trivial, would destroy the space where most exploits lived. Exploits seemed too platform-bound and short-lived to need a theory; they were an engineering problem at best.
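The confluence described above can be illustrated with a toy model. The following Python sketch is not real C or x86 behavior; it only assumes a simplified frame layout (an 8-byte array immediately followed by a 4-byte saved return address) and a copy routine that never checks the destination size. The names `make_frame`, `naive_copy`, and `saved_return`, as well as the addresses, are hypothetical and chosen for illustration.

```python
import struct

BUF_SIZE = 8   # assumed size of the local array, in bytes
RET_SIZE = 4   # assumed size of the saved return address, in bytes

def make_frame(return_address: int) -> bytearray:
    """Toy activation frame: [ array (8 bytes) | return address (4 bytes) ]."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", return_address))

def naive_copy(frame: bytearray, data: bytes) -> None:
    """Copy like an unchecked string copy: data length is never compared to BUF_SIZE."""
    for i, byte in enumerate(data):
        if i >= len(frame):      # stop only at the edge of our toy memory
            break
        frame[i] = byte

def saved_return(frame: bytearray) -> int:
    """Read back the return address stored right after the array."""
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]

frame = make_frame(0x08048000)

# A short input stays inside the array; the return address is untouched.
naive_copy(frame, b"hello")
benign = saved_return(frame)

# A longer input runs past the array and replaces the return address.
naive_copy(frame, b"A" * BUF_SIZE + struct.pack("<I", 0xDEADBEEF))
hijacked = saved_return(frame)

print(hex(benign), hex(hijacked))
```

In this model the two facts that seemed accidental, data and control information sharing one frame and a copy that trusts its input length, are enough for input to decide where execution resumes.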

These times are now gone. Not only did the exploits demonstrate surprising portability and resilience, but it became clear their advanced techniques primarily reuse the target’s own mechanisms and behaviors as designed, rather than some random and curiously deviant behaviors. It turned out the most effective exploit techniques leveraged the systems’ own abstractions on levels well above the ultimate binary executable—gaining both portability to seemingly unrelated implementations and the reliability of already well-debugged and well-used code. Against these patterns of adversarial reuse of the target’s own computing models, no set of discrete countermeasures would suffice—at least, not without substantial theories that consider the designed-in though unintended interactions between multiple models and levels of computation.

You may wonder about the term “weird machines” in the paper’s title. It reflects the shift in the understanding of exploitability’s root cause, from a programmer’s error to an endemic property of the target that permits a masterful reuse of the target’s own mechanisms and features against it. Though important, the initial programmer’s error is only one of the many doors that unlock this bounty of emergent execution. If that door is closed, many others leading to the same emergent execution engine, the weird machine, will be found.

The road to this realization took several decades. It went from the naive 1990s understanding of stack buffer overflow exploits as needing native code payloads (indeed, Windows XP Service Pack 2 advertised non-executable stacks as the mitigation for buffer overflows) to the realization that the call stack machine embedded in every C/C++ program is Turing-complete on sequences of well-formed stack frames with no executable content whatsoever, a.k.a. return-oriented programming. It went from heap exploits specific to the in-band chunk metadata of Doug Lea’s malloc to all heaps and the “heap feng shui” co-optation across all major memory allocation algorithms and architectures. It went from Spectre and Meltdown being considered weird x86 bugs to 50+ families of emergent behaviors affecting all modern superscalar CPUs.
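The idea behind return-oriented programming, computing with nothing but a crafted stack, can be sketched in a few lines. The Python toy below is an illustration under stated assumptions, not a model of any real ABI: the "program" is a fixed table of small existing code fragments (`GADGETS`, with made-up addresses), and the attacker supplies only a list of addresses and data words. The dispatcher loop stands in for the repeated "return" instruction. This toy has no branching gadgets, so it is not itself Turing-complete; it shows only the composition mechanism.

```python
# Fragments that already exist in the hypothetical target; each one
# conceptually ends in a "return", which hands control to the next stack word.
def pop_a(m):
    m["a"] = m["stack"].pop()    # load the next stack word into register a

def pop_b(m):
    m["b"] = m["stack"].pop()    # load the next stack word into register b

def add_ab(m):
    m["a"] = m["a"] + m["b"]     # a := a + b

def double_a(m):
    m["a"] = m["a"] * 2          # a := 2 * a

# Hypothetical addresses for the fragments above.
GADGETS = {0x1000: pop_a, 0x1010: pop_b, 0x1020: add_ab, 0x1030: double_a}

def run(stack_words):
    """Execute a chain: each 'return' pops an address and jumps to that fragment."""
    machine = {"a": 0, "b": 0, "stack": list(reversed(stack_words))}
    while machine["stack"]:
        address = machine["stack"].pop()
        GADGETS[address](machine)
    return machine["a"]

# The attacker's input holds only addresses and data, no executable content.
chain = [0x1000, 3, 0x1010, 4, 0x1020, 0x1030]
result = run(chain)   # computes (3 + 4) * 2
print(result)
```

Every instruction that runs belongs to the target; the chain merely chooses their order, which is why marking the stack non-executable does not remove this style of computation.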

You now witness the next step in this succession: from understanding the transient space of microarchitectural optimizations as a locus of side channels to a general-purpose execution environment of its own, a weird machine par excellence. Read on and join the new age of emergent behavior exploration. More information and bibliography can be found at https://weirdmachines.gitlab.io

Footnotes

a Distribution Statement “A” (Approved for Public Release, Distribution Unlimited).

December 2024 Issue

Communications of the ACM (CACM) is now a fully Open Access publication.
