As with claims about beauty, claims about system correctness are relative to expectations. Such expectations for a computing system are called specifications. And while claims about beauty are necessarily subjective ("in the eye of the beholder"), claims about correctness need not be—they can be conveyed as proofs in a formal logic, which can be communicated to anyone and then independently checked.
Whether a given correctness claim is useful will depend on how completely the specification characterizes a system’s behaviors. Obviously, we are limited by what can be said using our specification language and what specifications can be proved using our formal logic. Over the last half-century, a consensus has emerged that specifying sets of behaviors called trace properties is a sweet spot. Trace properties suffice for describing most of the important aspects of a system’s behaviors. Moreover, trace properties for a system are straightforward to deduce from trace properties for its components, so we can reason about trace properties compositionally.
The description of a trace property must, by definition, characterize a set of system behaviors, where whether a system behavior a is included in that set depends only on a and not on other system behaviors in the set or on other properties of the set. So trace properties formalize partial and total correctness, mutual exclusion, deadlock freedom, starvation freedom, among others.
Moreover, trace properties are easily specified. The text of a program specifies a trace property comprising the set of that program’s possible executions, since an execution might depend on what has come before but not on what occurs in a different execution. By modeling a system behavior as a sequence of states and/or actions, an automaton can be used to specify the trace property comprising sequences the automaton accepts. And formulas of linear time Temporal Logic or its derivatives like TLA are satisfied by individual sequences, so the set of sequences that satisfy a given formula also is a trace property.
In a 1977 paper, Leslie Lamport used the terms safety property ("bad things" don’t happen) and liveness property ("good things" do happen) in connection with reasoning about mutual exclusion protocols and other concurrent algorithms.
These two kinds of specifications seemed to cover the landscape. In the decade that followed, Alpern and I showed that Lamport was right. We proved that safety properties and liveness properties are basic building blocks for all trace properties; invariants suffice for verifying safety properties; variant functions are also needed for verifying liveness properties. Thus, not only are safety properties and liveness properties a natural way to specify program behaviors, but such a decomposition provides insight into the approach needed for constructing a proof.
That’s the theory. The following paper represents an important step forward for the practice. It describes mechanically checked proofs for two non-trivial distributed services: A Paxos-based library to support replication and a shared key-value store. Appropriate safety properties and liveness properties are proved for each. These are not the first mechanically checked proofs for non-trivial systems software. But prior work just proved safety properties, where invariants suffice; additional techniques (as noted above) must be employed for proving liveness properties.
IronFleet is interesting because it uses refinement and reduction to simplify proof construction.
IronFleet is also interesting because it uses refinement and reduction to simplify proof construction. Refinement is a way to show that some high-level specification is satisfied by systems satisfying a low-level specification; reduction allows reasoning about coarse-grained atomic actions to support inferences about systems having fine-grained actions. These techniques are not new, but with IronFleet we see them in action on real system software. For example, we see how difficulties with reduction for proving liveness properties can be avoided by restricting the internal design of each individual step that a service may take. We thus learn about an engineering effort that, by necessity, encompasses both system construction and proof construction.
The authors also report performance impact and the expansion in code size needed to accommodate annotations required for the mechanical proof checking.
We are now closer to having the capability for software systems to be accompanied by correctness claims that are grounded in mechanically checked proofs. The need is great; the progress reported herein should be an inspiration.