Technical Persepctive: Large-Scale Sound and Precise Program Analysis

You are given a program. Will it crash? Is it subject to a spoofing, buffer overflow, or injection attack? Is this part of it dead code? Can I replace that code fragment with a more efficient one? Which exceptions can arise from calling this function in that context? Does this variable always contain the same value? All of these questions require program analysis: We want to know whether program code (the input to the analysis) has a specified property.

Program analysis is inherently difficult. Take the following ingredients:

A Turing-complete programming language such as C, Java, Haskell;^a
A nontrivial behavioral program property; for example, does a program de-reference a null pointer, is a particular fragment dead code; and
Require an exact answer: yes if the property holds, no if it doesn’t.

The requirements look reasonable: Given an input program we want to check whether it has some desired "runtime" (behavioral) property. But they constitute an explosive mix: Any such problem is undecidable—no program can solve it.¹ Consequently, we need to give up at least one of these requirements to escape undecidability.

Remove Turing-completeness: Languages of restricted expressive power may have decidable nontrivial behavioral properties; for example, equivalence of regular expressions, which allows guaranteed optimization of regular expression matching.
Make the property trivial; that is, make it (or its negation) hold for all programs. In other words, build it into the programming language. This is what static type systems and emerging technologies such as proof-carrying code can provide. They constitute a cunning turning of the tables: Instead of program analysis fighting an uphill battle against arbitrary programs thrown its way it forces the programmer to provide what amounts to a checkable proof that the program has the desired property. Very often we need to analyze general-purpose programs without the privilege of foresight provided by the programmer and enforced by the programming language, however. This leaves only the third possibility:
Allow approximate answers; that is, allow the program analysis to return "don’t know" answers or to run forever (which is practically tantamount to "don’t know"). Since a "don’t know" answer constitutes a false positive whenever the (desirable) property really holds, a key challenge is designing a program analysis to be sufficiently precise for its intended application so it does not drown its user (whether human or another system such as a compiler) in false positives: An analysis returning 1,000 warnings ("don’t know") of which only 10 turn out to be bonafide bugs may not be practically useful.

The following paper provides a powerful sound static analysis framework for C programs and nontrivial behavioral properties. The authors show that a precise compositional program analysis that is context-sensitive and path-sensitive is possible for huge code bases of central importance to our day-to-day computing infrastructure. In particular, they show that null derefence bugs, which require interprocedural data flow analysis, can be found without drowning them in false positives.

Their starting point is an abstract model of a program where infinite domains are finitely approximated and certain steps are replaced by nondeterministic steps resulting in unknown results; for example, the integers may be replaced by a single abstract value (representing all integers) and the result of an arithmetic comparison may be approximated by a nondeterministic choice of a Boolean value for the result. We can think of these steps as internal choices: the abstract model just performs the internal choices without external control. This abstraction is familiar from software model checking.

The authors contend it is important for practical precision to model the results of internal choices as unknowns: once a choice is performed the result is unknown, but every time we need the value it is the same. Their key insight is we are typically only interested in whether an abstracted function has a property for all possible sequences of internal choices in its body or for some particular sequence, or indeed both. This is captured by logical constraints for the input-output relation of a function where all the internal choices are universally, respectively existentially bound.

The paper outlines conceptually independent and generally applicable steps assembled into a solution pipeline: quantifier elimination, symbolic fixed-point computation to compute closed forms for recursive constraints, and finally constraint solving, bringing together and leveraging techniques ranging from the foundation of (Boolean) logic to state-of-the-art SAT-solving. Their execution requires care, as the authors explain aided by their selection of an illustrative fragment of C and a running example developed from source code to analysis result.

The challenge of bug hunting in large C code bases has seen the rise and application of techniques that are neither sound nor complete nor even stable: their result may depend on the machine load, other programs analyzed before or simultaneously, and other factors completely extrinsic to the program at hand. A bug is a bug, whichever way you catch it, the argument goes. This paper illustrates that the days of feasible, well-specified, sound, and sufficiently precise static program analysis for bug hunting even in huge code bases are not numbered.