Research and Advances
Systems and Networking Review articles


Diverse examples depict how indistinguishability plays a central role in computer science.
illustration with blocks
  1. Introduction
  2. Key Insights
  3. Automata and Learning
  4. Sequential Reductions in Concurrent Programming
  5. Real-Time Indistinguishability
  6. Global Indistinguishability Structure
  7. Conclusion
  8. References
  9. Authors
  10. Footnotes
illustration with blocks

The properties commonly ascribed to any object are, in last analysis, names for its behavior.
—Judson Herrick, An Introduction to Neurology, 1918

Back to Top

Key Insights

  • Lack of knowledge by a computer system component about other components can formally be captured through the concept of indistinguishability. Whenever abstraction or interaction take place in a computer system, indistinguishability plays a critical role.
  • Indistinguishability is the source of many lower bounds and impossibility results in CS. It is also the essence behind abstraction techniques so important in computing theory and in the design of large complex systems.
  • indistinguishability has a topological nature: local states of components that do not distinguish between two system states induce a higher-dimensional simplicial complex, a structure with topological properties preserved as the system execution evolves.

Dost thou love me? I know thou wilt say “ay,”
And I will take thy word. Yet if thou swear’st
Thou mayst prove false. At lovers’ perjuries,
They say, Jove laughs.
—Shakespeare’s Romeo and Juliet, Act 2

Abstraction—allowing the details of lower-level components to be ignored—and interaction—allowing individual computing entities to cooperate—are key concepts in computer science. Many would argue that they play a crucial role in the success of computing: abstraction allows separate layers of the computing stack to be improved orthogonally, whereas interaction allows the abundance of computing power to be harnessed. This comes at a significant cost: each component of a computer system has limited knowledge about the state of other components. This happens either by choice, in the case of abstraction, or out of necessity, in the case of interaction.

From the perspective of an individual component, all other components, either other layers within the same computing entity or other computing entities, can be considered as an environment. Seen in this way, lack of knowledge about other components can formally be captured through the concept of indistinguishability, namely inability to tell apart different behaviors or states of the environment. Indistinguishability is therefore a consequence of the fact that computer systems are built of individual components, each with its own perspective of the system.

This article argues that because of its intimate relation with key issues in computing, indistinguishability, in its various flavors, plays a critical role in many computing areas. We explain this core concept and demonstrate some of its variants and applications, through four examples, trying to illustrate different, fundamental aspects of indistinguishable situations of abstraction and interaction.

Indistinguishability is at the core of the difficulty of constructing theoretical models for the behavior of a physical system. In our first example, we overview the role of indistinguishability in some of the most basic notions in computer science: state, automata, and learning. We will encounter both interaction (as means to reduce indistinguishability) and abstraction (captured by behavioral equivalence). Here, the environment is seen as a blackbox, implemented by an unknown automaton. What can an experimenter interacting with its environment through input/output symbols infer about the blackbox internals? The experimenter has an evolving mental model of the blackbox as an hypothesis automaton, which is indistinguishable from the actual automaton, given the current state of the interaction. The very notion of “state” is in terms of indistinguishability. In this example, indistinguishability has a “semantic” nature, related to computational complexity, namely the number of states in the automaton and the complexity of the learning algorithm.

Our second example demonstrates that indistinguishability is a powerful tool for deriving positive results. Examples abound, such as in artificial intelligence (for example, Turing’s test), cryptography (for example, pseudo-randomness), logic, and others. We consider the example of serializability in concurrent programming, where interaction is through shared variables, and locks permit the set of indistinguishable executions to be reduced. The correctness specification of a program is in terms of requiring that concurrent executions are indistinguishable from appropriate sequential executions. Abstraction is key, and indistinguishability becomes a powerful tool to design concurrent programs and prove their correctness, and in particular, to enable sequential reasoning.

We move in our third example to another very basic form of indistinguishability, related to time, and to the impossibility of observing realtime. An interaction among a set of computing entities can be seen as a partial order, representing causality relations between events happening in the system. Lamport’s seminal paper26 can be seen as using indistinguishability in two senses. First, it observed the relation to relativity theory, motivating the idea of reducing concurrent systems by indistinguishability to sequential thinking (by implementing a fault-tolerant distributed system as a replicated state machine). And second, it provided the framework for analyzing time-based algorithms, which depend on quantifying real-time indistinguishability. We illustrate this with a simple example showing how inherent limitations on clock synchronization can be derived through the impossibility of distinguishing the real-time occurrence of events in an execution up to given bounds on message transmission delays and clock drifts.

Prior examples consider a single execution, and analyze a set of executions that are indistinguishable from it, from the perspective of all the participating processes. Our final example considers how distributed computation is limited by the global indistinguishability structure of all possible executions. This structure is defined by a Kripke graph, where edges are labeled by processes that do not distinguish between the global states of the system represented by the two endpoints of the edge. It turns out that higher dimensional topological properties of this graph (more precisely, its dual, a simplicial complex) determine computability and the amount of interaction needed to distributively solve a problem.

Back to Top

Automata and Learning

We start with a simple scenario where a learner is trying to infer the internal construction of a blackbox. The learner knows that the blackbox is a deterministic finite automaton (DFA) accepting a language over an alphabet Σ, but does not know which specific automaton it is. Through a conversation, the learner and the blackbox exchange symbols, and there is a set of automata all indistinguishable with respect to the current conversation. As the interaction evolves, this set of indistinguishable automata shrinks. Eventually, the learner would like it to shrink until it captures the language accepted by the blackbox.

Indistinguishability is at the core of the difficulty of constructing theoretical models for the behavior of a physical system.

Moore’s theorem. Indistinguishability is at the core of the difficulty of constructing theoretical models for the behavior of a physical system. Ashby’s Cybernetics book3 from 1956 already includes a chapter called “The black-box.” At the same time, Moore12 proposed the problem of learning finite automata, and studied indistinguishability of deterministic finite state machines, stating (Theorem 2):

“Given any machine S and any multiple experiments performed on S, there exist other machines experimentally distinguishable from S for which the original experiment would have had the same outcome.”

Moore’s theorem shows an impossibility in the characterization of any physical system as a deterministic state machine on the basis of a finite number of observational outcomes. This is because after a finite interaction with the blackbox, approximately, if all words are at most of length k, the learner has explored only paths of length k in the automaton A of the blackbox.

This does not prevent the construction of theoretical models of the behavior of a system, but it does challenge the assumption that a system has only the behaviors that have been characterized by experimental observations, namely the assumption that any theoretical model is complete. Further discussion of the relation between Moore’s theorem and physics appears in Fields.16

The Myhill-Nerode theorem. If the interaction with the blackbox is only through input/output symbols, how can the learner know anything at all about its internal construction, even if it has any states at all? States are not directly observable, so what is a state, from the perspective of the learner? The Myhill-Nerode theorem, “one of the conceptual gems of theoretical computer science” according to Rosenberg,33 offers a complete mathematical characterization of the notion of state, via basic algebraic properties defined only on input/output behavior.

A string t ∈ Σ* distinguishes two strings u and v in a language L, if vtL and vtL. If there is a string t distinguishing u and v, then the state s = δ(q0, u) must be different from the state s’ = δ(q0, v), for any automaton M with transition function δ, recognizing L. Conversely, two strings x and y are indistinguishable (by L) if there is no string t ∈ Σ* that distinguishes them. We have the equivalence Nerode congruence on Σ*, defined by


Let [s]L be the set of all strings that are indistinguishable from s, and Q be the set of all corresponding equivalence classes. Thus, the essence of the notion of “state” is an indistinguishability equivalence class; define a DFA Z as follows:

  • the states Q are the equivalence classes of ≡L,
  • the initial state q0 is [ε]L, the equivalence class of the empty word,
  • δ([u]L, a) = [ua]L for all [u]LQ and a ∈ Σ, and
  • the accepting states are F = {[u]L: uL}.

Selecting a representative for each equivalence class of ≡L, we get a set of access strings S ⊂ Σ*. Starting in the initial state, if we follow the transitions as indicated by uS, it leads us to a state q that is uniquely identified by u. Figure 1 depicts an example of a DFA A, and then it is explicitly represented by access strings as H2.

Figure 1. Learning example.

The Myhill-Nerode theorem states that L is recognized by Z as defined earlier, and furthermore, Z is minimal: if a DFA M accepts L, then the equivalence relation ≡M is a refinement of the equivalence relation ≡L, where


and we say that x and y are indistinguishable to M.

Proofs that a given language cannot be recognized by a finite automaton can be viewed as indistinguishability arguments, based on the Myhill-Nerode theorem. Automata with infinitely many states can be viewed as abstractions of programs that can make infinitely many discriminations regarding the structure of a set of possible input strings.

Let λq(v) = 1 whenever Z accepts v ∈ Σ* starting at state q, and λq(v) = 0 otherwise. If q = q0, we may omit the sub-index, that is, L = {w : λ(w) = 1}. For learning, we will use the notion of a string t being a witness that two states are different. Notice that:

  • For any pair of distinct states q, q’ of Z, there is a distinguishing word t ∈ Σ* such that λq(t) ≠ λq’(t).

Learning automata. Following the classic approach of learning finite automata,36 three additional approaches have been studied: computational learning,25 model learning,37 and grammatical inference.34 We next describe automata learning algorithms with a minimally adequate teacher (MAT), demonstrating fundamental ideas that are relevant to all four learning branches.

Minimization algorithms related to the Myhill-Nerode theorem work by merging indistinguishable states of a DFA. We describe algorithms working in the opposite direction, splitting states when discovering a witness string t demonstrating they are distinguishable.

The learner poses membership queries to the blackbox to try to learn the language L it accepts: Does x ∈ Σ* belong to L? The learner starts with a hypothesis automaton H, that it updates during the conversation. The experimenter has no way of knowing when to stop asking questions, because there could be machines with more and more states, which return answers consistent with the current experiment. Even if the number of states of M is known to the experimenter, an exponential number of membership queries is required.2 To circumvent this, the MAT framework admits equivalence queries:

  • Does H correctly recognize L? If not, give me an example of a string x ∈ Σ* such that xL(H) − L(M) or xL(M) − L(H).

Using membership and equivalence queries, the experimenter can learn L with a number of queries that is polynomial in n, the number of states in Z, the Myhill-Nerode automaton for L, and in m the longest counterexample returned by the blackbox. (There are always counterexamples of length at most 2n.) The algorithm terminates with a DFA H that is isomorphic to Z. The MAT framework and the efficient algorithm, called L*, were introduced in a seminal paper of Angluin.1 We stress that this kind of learning algorithms can be extended to learn other types of blackboxes, for example, logical formulas.

We illustrate the ideas behind the MAT framework through an example (inspired by Isberner et al.24), to show how distinguishing is the basis of learning. Learning something new means splitting a state into two states (which are different, as evidenced by a new witness t).

Assume the blackbox is implemented by the DFA A in Figure 1. The learner maintains a set of prefix-closed access strings S ⊂ Σ*; recall that access strings are representatives of equivalence classes. Distinct access strings u, u’ correspond to distinct states of A that the learner has identified, and the learner has a witness of this fact, through a string t, such that λ(u · t) ≠ λ(u’ · t). The learner maintains this set of discriminating suffixes D ⊂ Σ*, that it has found through membership queries.

The basic data structure is the observation table, with two types of rows (in the figure, a horizontal line in a table divides the two types). Each row of the first type is identified by an access string uS, and each row of the second type identifies a transition of the hypothesis automaton. Each column is identified by a discriminating string t. The content of a cell in the table is λ(u·t) (where λ refers to the current hypothesis automaton). Each time the learner gets a counterexample, it extracts from it a discriminating suffix. Many algorithms have been proposed, differing in how they extract a discriminating suffix from a counterexample. Here we are only concerned with the fact that it is always possible to do so.

The learner initially has as hypothesis the DFA H0. It then learns that ε is discriminating ε and b, and hence splits state [ε] creating state [b]. In the table, the new row for access string b is added, and the transition for b is replaced by the two transitions ba, bb. Thus, the new hypothesis automaton is H1, and by following string b in this automaton, one “accesses” state [b], an equivalence class of strings indistinguishable from the representative of the class, b (for example, aab also belongs to [b]; it s indistinguishable from b and also accesses [b]). In H1, we have a (single) column identified by ε, witnessing that states [ε] and [b] are different, because ε concatenated with ε is in L, whereas ε concatenated with b is not. Then, H2 is produced when it learns that b discriminates ε and a, λ(ε · b) ε λ(a · b), and hence the state [ε] is split creating the state [a]. More generally, if w is a counterexample for H, then it has a suffix at, s.t. for two access strings u, u’U, ua and u’ reach the same state in H, but λ(ua · t) ≠ λ(u’t). Thus, u’U, ua is a transition in the observation table, and both rows are equal, and adding t to table distinguishes ua and u’, with ua being moved to the upper part of table.

Behavioral equivalences. Behavioral equivalences17 are based on the idea that two systems are equivalent whenever no external observation can distinguish between them. They are used to abstract from unwanted details; to formalize the idea that it is not the internal structure of a system which is of interest but its behavior with respect to the outside world.

Bisimulation, the strongest form, is a rich concept independently discovered in computer science, modal logic, and set theory, with applications to many areas,35 and we would have devoted much more space to it if it was not for lack of space. We touched on it, with the Myhill-Nerode theorem example, which is the basis for automata minimization algorithms modulo bisimilarity.23 Another typical application is to prove the correctness of an algorithm, with a big automaton representation M, by analyzing a smaller bisimilar model Z that captures its essence, as illustrated in Figure 2, where R is the bisimilar relation between states of Z and M. Intuitively, two systems are bisimilar if they match each other’s moves. Verifying the algorithm M using a model checking problem M |= ϕ is equivalent to solving the much smaller problem Z |= ϕ. From the in-distinguishability perspective, it is interesting to consider iterative abstraction-refinement, see Clarke et al.9

Figure 2. Schematic illustration of bisimulation.

Back to Top

Sequential Reductions in Concurrent Programming

A notable example of behavioral equivalence is the notion of serializability, utilized in most of the database systems (in various variants) since their early days in the 1970s. The notion is used in concurrency control of databases and in various transactional systems (processing, management, transactional memory, etc.), both centralized and distributed. A key challenge in the design and analysis of concurrent systems is dealing with all possible interleavings of concurrent processes. Indistinguishability is useful for defining the semantics of a concurrent program, in terms of the notion of serializability. It is also important in verification, as it can be exploited to verify a concurrent program by checking only its sequential executions.a

Serializability and two-phase locking. Serializability is studied in a setting where processes interact through shared variables. Two executions α1 and α2 are indistinguishable to a specific process, if the process accesses the same sequence of variables in both executions, and returns the same results. An execution is serializable8,39 if it is indistinguishable to all processes from a sequential execution, in which each process executes its procedure invocation to completion, without interleaving of any other process.

The classic way to ensure serializability is to protect shared variables with locks, using a locking protocol governing how locks are acquired and released. Thus, an execution of the system, α, is a sequence of events each taken by a single process; the events either access shared variables, or acquire and release locks on these shared variables. In two-phase locking (2PL),13 each process has a growing phase of lock acquisition (in some order), followed by a shrinking phase of lock release. Namely, once a process released a lock, it can no longer acquire any lock, even on another variable. For example, given shared variables X, Y, and two processes p1, p2:


Two-phase locking is a mechanism for enforcing indistinguishability from sequential executions, as demonstrated by the following geometric interpretation. An execution of the processes p1, p2 defines a particular interleaving of the order in which the processes acquire and release the locks. It can be represented as a path in a two-dimensional space (see Figure 3). If a lock is acquired or released by p1, the path moves one unit on the horizontal axis; similarly, when a lock is acquired or released by p2, the path moves one unit on the vertical axis. All paths start in (0, 0), when no operations have occurred, and they all end in (1, 1), where all operations have occurred, by both processes.

Figure 3. Geometric interpretation of all interleavings of two processes acquiring and releasing shared variables X, Y.

Each time two operations of an execution are swapped, in a way that is indistinguishable to both processes, the path is deformed. In Figure 3, two such paths are illustrated: P1 which is sequential (p1 then p2), and P2 where acq(Y) by p2 is swapped with rel(X) by p1.

There are two forbidden rectangles, where no execution path can go through: in the vertical (blue) one, Y would be acquired simultaneously by both, whereas in the horizontal rectangle (red), the same holds for X. Their union is the forbidden region where no execution enters. Notice that if both processes acquire X and Y (in either order), the protocol enters the deadlock region. The main point is that there are two classes C1, C2, of homotopic paths, that is, paths within a class can be deformed to each other. In one class, all paths go above the forbidden region and are indistinguishable from a sequential execution in which p2 goes first, whereas in the other class, all executions go below the forbidden region and are indistinguishable from a sequential execution where p1 goes first.

Notice that in a program where both processes acquire the locks in the same order, the forbidden region is a square, and hence no deadlocks can happen. Directed topology and the geometric theory of execution paths homotopy are studied in Fajstrup et al.,15 showing a direct representation of indistinguishability as continuous deformation of paths in an n-dimensional space (for n processes).

Verifying two-phase locking. Because indistinguishable executions can be substituted for each other, it means that checking whether one execution satisfies a particular property informs us whether all indistinguishable executions satisfy this property. Therefore, indistinguishability facilitates the verification of concurrent programs. When a program is serializable certain properties can be verified by considering only sequential (noninterleaved) executions of the program. This is equivalent to reasoning assuming a sequential setting.

But how can we prove that a program is serializable? Obviously, if we prove that it follows the two-phase locking protocol, then it is serializable. However, in reality, we are not given an execution example, but a program, possibly including conditional and repeat statements. Thus, we need to consider all its possible executions, to see if each one satisfies the two-phase locking regime. It turns out that we can ensure that the program follows 2PL, by considering only its sequential executions. The next theorem holds provided the program has no nonterminating loops.

THEOREM 3.1. If any execution satisfies two-phase locking when events of different processes are not interleaved, then any interleaved execution also satisfies two-phase locking.

Proving the theorem goes through showing that every execution that violates 2PL is indistinguishable from a noninterleaved execution in which the protocol is also violated. This implies that if we check (manually or mechanically) all noninterleaved executions of the protocol without finding a violation of 2PL, then all executions of the protocol do not violate 2PL.

Toward a contradiction, assume the claim does not hold and let α = α'(it, e) be the shortest execution that violates 2PL for which there is no indistinguishable noninterleaved execution; see Figure 4. Note that (it, e) is an event of process pi, that violates 2PL, that is, acquires a lock after releasing a lock, or accesses an unlocked location. As α is the shortest such execution, we know that for prefix α’ of α there is an indistinguishable noninterleaved execution cacm6305_a.gif , (where αij, contains events by pij only).

Figure 4. Moving the event (it, e) to after pit‘s events.

We argue that moving the event (it, e) to after pit‘s events in cacm6305_b.gif , will still cause pit to take the offending event. Intuitively, this happens because the event depends only on information that is local to the process pit or locked by it, and pit does not distinguish between the original execution and the noninterleaved execution. Namely, pit has the same state at the end of α and at the end of αi1 …αit. Therefore, the event can be moved to appear after the events αit of the same process. Hence, pit will make the same offending event (it, e), implying that the noninterleaved execution αi1 … αit(it, e), also violates 2PL.

The reduction holds for any noncentralized locking protocol, such as commonly used ones like two-phase, handover-hand, tree, and dynamic graph locking. It allows sequential reasoning, whether manual or automated, about concurrent programs both in verifying that they adhere to a locking protocol and in development of algorithms for them. The reduction enables simpler and more efficient verification algorithms of a class of properties, called transaction-local. It justifies the use of sequential Hoare Logic or sequential type systems or sequential abstract interpretation to verify that the program adheres to a locking protocol. Programmers wishing to add, for example, a new procedure to swap two adjacent elements in a list to a program that uses hand-over-hand locking, do not have to worry about concurrent interleaving with other procedures. More details are in Attiya et al.,6 such as the case of nonterminating loops.

Indistinguishability is also used to prove a theorem that shows that if serializability is ensured in a program with two processes and two variables, it is ensured in any program, provided the implementation satisfies certain structural properties, one of them being symmetry.19 The proof goes by contradiction, taking an execution of the larger system that violates serializability and perturbing it into a bad execution for a system with two processes and two variables; a key step relies on an indistinguishability argument using symmetry.

Back to Top

Real-Time Indistinguishability

The previous examples describe asymmetric interactions, where one party interacts with another party, whose semantics (internal details) are hidden or abstracted away. Our next example ignores the semantics of the interactions, concentrating only on their timing.

The fundamental problem is estimating distant simultaneity—the time difference between the occurrence of two spatially separated (at different processes) events. This is behind many real-time applications in computer science that depend on clock synchronization, such as synchronizing cellphone communications, positioning systems (for example, GPS), failure detection, efficient use of resources (for example, releasing a connection), timestamping events and timeouts, and so on.

Computer clocks are typically based on inexpensive oscillator circuits and quartz crystals that can easily drift seconds per day. However, atomic clock time, so ubiquitous and integral to modern life, trickles down to the clocks we use daily, distributed through the Network Time Protocol and other means. Atomic clocks are so precise that if such a clock existed when Earth began, about 4.5 billion years ago, it would be off by only 30 s today.

How precise the time of an atomic clock can be estimated depends on the transmission delay bounds along communication paths from the atomic clock to the local computer, and on the drift bounds of the clocks of the computers along such paths. In other words, when a computer gets a message with the time of some atomic clock, the actual moment when the clock reading took place could have occurred at any moment within some range B, and from the computer’s perspective, it is indistinguishable which exact moment within B is the actual one. Thus, the computer’s best estimate of the atomic clock time is based on |B|/2. Indeed, selecting the mid-point is hedging the bets, because anything else leaves open the possibility of a bigger mistake. We now explain in more detail how to compute B.

Consider a process p1 trying to synchronize its clock with an atomic reference clock, assumed to give real-time exactly, located in p0. The basic interaction is when p1 has a direct link to p0, as illustrated in Figure 5. Process p1 sends a message to p0 and gets back a response. The send event cacm6305_c.gif by p1 occurs at real-time 1, the event of p0 receiving it, cacm6305_d.gif , occurs at real-time 6 (to simplify the example, we assume p0 responds immediately, in the same event), and p1 receives the response in event cacm6305_e.gif at real-time 12. Real-time is not directly observable, instead, each event occurs at some local time, which the process can observe. The precise meaning of real-time not being observable is through indistinguishability. Namely, suppose that, although the first message delay was 5 time units, it is known that it must have taken at least 4 time units; also, assume the return message cannot take more than 9 time units. As for the local clock of p1, suppose its drift is bounded, such that between the sending and the receiving events, at most 12 time units could have passed.

Figure 5. p0 sends and p1 responds.

What is the latest time that cacm6305_e.gif could have occurred with respect to cacm6305_d.gif ? Answering also what is the earliest it could have occurred, would yield the desired indistinguishability interval B, where cacm6305_e.gif could have occurred, and selecting the midpoint would be used to compute the optimal correction to the local clock time of p1. The crucial insight is that to compute how late cacm6305_e.gif could have occurred with respect to cacm6305_d.gif we have to shift to the right as much as possible the point of occurrence of cacm6305_e.gif , subject to two constraints: (1) the maximum delay of the second message (9 units) and (2) the minimum delay of the first message plus the minimum length of the time interval from cacm6305_c.gif to cacm6305_e.gif (the fastest that p1‘s clock could have been running). In the example, the latest that cacm6305_e.gif can happen is at real-time 14 determined by the fastest delay of the first message and the slowest clock drift of p1, and not by the largest delay of the second message (which could have been delivered at 15).

More generally, p1 may be further away from the process p0 with an atomic reference clock, and an arbitrary execution α is used to synchronize p1‘s clock, where many more message exchanges take place, along different paths between p1 and p0. The goal is to estimate the indistinguish-ability interval of an event e at process p1, with respect to an event e0 in p0. The previous example hints that the task at hand has to do with computing distances, on paths formed by indistinguishability intervals, formalized as follows.

The execution α is represented by a weighted directed graph G = (V, E, r, l). Each vertex of V is an event of α, either a send or a receive event. The ith event happening in process j is denoted cacm6305_f.gif . The directed edges E are causal relationships: there is a directed edge between two consecutive events in the same process, cacm6305_g.gif , and there is a directed edge cacm6305_h.gif , whenever cacm6305_i.gif is a send event and cacm6305_j.gif is the corresponding receive event. The weight functions r, l timestamp the events. For each eV, real(e) is the real-time occurrence of event e, and local(e) is the time according to the clock of the process where e happens. Since the clock of p0 is perfect, for all events cacm6305_k.gif in p0, we have cacm6305_l.gif .

For each pair of events e1, e2 joined (in either direction) by a directed edge of G, bounds on the relative real-time occurrence of two events can be estimated,


both when the edge represents a message transmission delay, and when it represents the time it takes a process to execute consecutive computational events. Then, define local(e1, e2) = local(e1) − local(e2), and let w(e1, e2) = B(e1, e2) − local(e1, e2). These weights w can be positive or negative, but summing them along a cycle always gives a nonnegative value (the telescopic sum of local(ei, ei+1) along a cycle is 0). Thus, for a pair of events e1 and e2, the distance d(e1, e2) with respect to these weights is well defined. Interestingly, observe that d(e, e’) = 0, for any two events in p0. It is not hard to show30 that the indistinguishability interval of an event e at some process p1, with respect to an event e0 in p0 is as follows.

THEOREM 4.1. real(e) ∈ [−d(e0, e), d(e, e0)]

The meaning of this theorem is that e might have occurred at any time in this interval. Furthermore, for each such time, there is an execution indistinguishable to all processes.

Indistinguishability is useful for defining the semantics of a concurrent program in terms of the notion of serializability.

These results are based on Patt-Shamir and Rajsbaum,30 a follow up of,5, 20 which studied how closely in terms of real-time processes can be guaranteed to perform a particular action, in a failure-free environment. The possibility of failures affects the size of the indistinguishability interval, providing a very interesting topic from the indistinguishability perspective. The standard technique is to consider several clock reference values, and taking the average after disregarding the most extreme values. There are many papers on clock synchronization algorithms, see, for example, Attiya and Ellen4 for references on the more theoretical perspective, and the book28 from the more practical perspective.

Back to Top

Global Indistinguishability Structure

The previous examples of indistinguish-ability have a local flavor: we look at a single execution α and the executions indistinguishable from α to all processes. It turns out that studying executions that are indistinguishable to a subset of processes lead to understanding the global indistinguishability structure of all executions. This uncovers an intimate relation between indistinguishability and higher-dimensional topological properties. The overview presented here is very informal; for a more precise description, see Herlihy et al.22

Initial indistinguishability structure. Consider three processes b, g, w (black, gray, white) that communicate with each other to solve some task. When the computation begins, each process receives an input value. In the binary consensus task, the set of input values is {0, 1}. In certain renaming tasks, processes start with distinct input values taken from the set {0, 1, 2, 3}. Initially each process knows only its own input. An initial state


is a set of three initial local states, each one consisting of a pair of values. Two initial states, I1 and I2 are indistinguishable to a process, if the process has the same input value in both states, that is, if I1I2 contains its initial local state. If we draw an initial state as a triangle, whose vertices are the local initial states, I1 and I2 share an edge if they are indistinguishable to two processes, and they share only a vertex if only one process does not distinguish between them. Figure 6 shows the input complex for consensus looks like a triangulated sphere, and the one for renaming looks like a triangulated torus. Each one is a simplicial complex because it consists of a family of sets closed under containment (each edge of a triangle is a set of two local states, and each vertex is a singleton set).

Figure 6. Consensus and renaming input complexes.

How indistinguishability evolves. As processes communicate with each other, learning about each other’s input values, the structure of indistinguishability evolves. Suppose that the processes publicly announce their input values, but each process may miss hearing either or both of the other processes’ announcements, as determined by a communication pattern, namely a directed graph G on the vertices b, g, w; an arrow vv’ signifies that v’ hears the input from v. Thus, v’ hears inputs from the set N(v’) of processes which have an arrow toward vertex v’. Which input value v hears from v depends on which initial state I is G applied to. Applying G to an initial state I, produces a new state, {(b, view(b)), (g, view(g)), (w, view(w))}, where the local state of p, view(p), is the subset of I of processes N(p).

Figure 7 illustrates the IS-patterns (immediate snapshot or block executions), a subset of all possible communication patterns. An IS-pattern for a set of processes P is defined by an ordered partition S1, …, Sk of P (1 ≤ k ≤ |P|), specifying that processes in Si hear the values from all processes in Sj, ji. Consider, for instance, the IS-pattern {b, g, w} consisting of the trivial partition of {b, g, w}, which corresponds to the center triangle, where all processes hear from each other. The arrows gw belong also to the top triangle, corresponding to the partition {b}, {g, w} where the only difference is that b does not hear from the other two processes.

Figure 7. IS-communication patterns.

IS-patterns are important because when applied to an input complex, I, the resulting protocol complex P is a subdivision of I. In Figure 8, IS-pat-terns are applied to two consensus input simplexes. One can see that b and w with input 0 belong to two input triangles, and this edge is subdivided into three edges in P, which belong to both the blue and the yellow subdivided triangles, due to IS-patterns where b and w do not hear from g (and hence cannot tell if its input is 0 or 1).

Figure 8. Two input triangles, application of IS-patterns on them, and the requirement to produce consensus outputs.

In the same way that we applied each IS-pattern to each initial state to get P, we can again apply each IS-pattern, but now to each state of P, obtaining a subdivision of P, and so forth. Each time the processes communicate once more through an IS-pattern, the input complex is subdivided more and more finely. Indeed, a fundamental discovery is that there are topological invariants, preserved no matter how many times the processes communicate, and no matter what they tell each other each time they communicate. In the case of any unreliable asynchronous communication by either message passing or read/write shared-memory, P “looks like” (is homotopic to) the input complex I.

Remarkably, topological invariants determine the computational power of the model. In other, more reliable models of computation (for example, at most t out of n, t < n − 1 processes can fail, or synchronous models, or shared-memory primitives stronger than read/write registers), P preserves weaker topological invariants, and “holes” are created, giving the model its additional computability power.

Specifications as indistinguishability requirements. Suppose that after communicating through IS-patterns, each process produces an output value. Let (p, view(p)) be the local state of a process p in the protocol complex P, after an IS-pattern. Hence, the output value produced by p is a function of its view, δ(p, view(p)). Namely, if p does not distinguish between two triangles of P, then it must decide the same value in both.

A simplicial complex defined by triangles labeled with output values is used to specify the task that the decision values should satisfy. For binary consensus, the output complex, in Figure 8, consists of two disjoint triangles, one labeled with 0 output values in all its three vertices, and another labeled with 1 in all its three vertices. Thus, a taskI, O, Δ〉 consists of an input complex I, an output complex O, and a relation Δ specifying for each input triangle σ ∈ I, which output of O, Δ(σ), represent valid outputs for the task.

Finally, Figure 8 is meant to represent that the decision function δsolves the task, if for any triangle σ’ in P, δ(σ’) is a triangle τ ∈ O, such that τ ∈ Δ(σ), where σ is the input triangle for σ’.

To summarize, a new indistinguishability global structure (represented by P) is generated after communication, and a task specifies a target indistinguishability structure (represented by O). The question is whether P can be (simplicially) mapped to O respecting Δ. This is a topological question with deep implications to distributed task computability in various models (message-passing and shared memory, synchronous and asynchronous, with crash and Byzantine failures).

This formalization can be interpreted as a question of gaining knowledge, as explained in Goubault et al.,18 where it is described how the simplicial complexes described in this section have an equivalent representation as Kripke models. Roughly speaking, each triangle is a state of the Kripke graph, and if two triangles share a vertex of process p, then the two corresponding states are connected by an edge labeled p. Indeed, there is an intimate relation between indistinguishability and the theory of reasoning about knowledge for distributed computing described in Fagin et al.14

Back to Top


Indistinguishability plays a central role in computer science. Examples from different areas (automata theory, learning, specification, verification, distributed computing and epistemic logic) demonstrate how different levels of abstraction entail distinct notions of indistinguishable observations, and different uses of indistinguishability (to show computability and complexity limitations, and also to design solutions). Some examples should be treated in more depth, and there are many additional application areas.

One application area is computational learning and related complexity topics, as recently reviewed in Wigderson.40 Many subareas can be viewed through the lenses of probabilistic indistinguishability, for example, PAC learning,38 cryptography, communication complexity, indistinguishability despite errors,32 and coding theory.

Indistinguishability plays a role in artificial intelligence, for example, in Turing’s test, and more generally, Turing-like tests for other applications, such as Go simulators10 and writing a program simulating a living organism.21

We discussed formal methods, another area where indistinguishability is a key, notably in behavioral equivalences.11 And we discussed logic, where the longstanding connection between modal logic and topology goes back to McKinsey and Tarski,27 and up to today, with a topological semantics for belief.7 Another interesting example from logic is Ehrenfeucht-Fraïssé games.31

Distributed computing is all about interactions, with abundant instances where indistinguishability is a key. Examples include labeling schemes, synchronizers, mutual exclusion, anonymity and symmetry, and partitioning. Many impossibility results are discussed in Attiya and Ellen.4

Finally, indistinguishability cuts across topics. Multi-agent epistemic logic relies on Kripke models to represent indistinguishability.14 These in turn, can be considered as the dual of simplicial complexes,18 and we described how the indistinguishability structure evolves as interaction occurs preserving topological properties. Also, having knowledge means being able to distinguish between situations, so the same action must be taken in indistinguishable setups.29 We discussed the duality between indistinguishability and knowledge also in the context of learning automata.

Acknowledgments. We would like to thank Hans van Ditmarsch, Jérémy Ledent, Arnold Rosenberg, Jennifer Welch, and the reviewers for helpful comments. Supported by grants from UNAM-PAPIIT IN106520 and ISF 380/18.

    1. Angluin, D. Learning regular sets from queries and counterexamples. Inf. Comput. 2, 75 (1987), 87—106.

    2. Angluin, D. Queries and concept learning. Mach. Learn. 2, 4 (1988), 319—342.

    3. Ross Ashby, W. An Introduction to Cybernetics. John Wiley, New York, 1956.

    4. Attiya, H., Ellen, F. Impossibility Results for Distributed Computing. Morgan & Claypool, 2014, 162.

    5. Attiya, H., Herzberg, A., Rajsbaum, S. Optimal clock synchronization under different delay assumptions. SIAM J. Comput. 25, 2 (1996), 369—389. DOI:10.1137/S0097539794266328

    6. Attiya, H., Ramalingam, G., Rinetzky, N. Sequential verification of serializability. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages POPL, ACM, 2010, 31—42

    7. Baltag, A., Bezhanishvili, N., Özgün, A., Smets, S. A topological approach to full belief. J. Philos. Logic 48, 2 (2019), 205—244.

    8. Bernstein, P.A., Hadzilacos, V., Goodman, N. Concurrency Control and Recovery in Database Systems. Addison-Wesley Pub. Co. Inc., Reading, MA, 1987.

    9. Clarke, E., Grumberg, O., Jha, S., Lu, Y., Veith, H. Counterexample-guided abstraction refinement. In LNCS-Computer Aided Verification. E. Allen Emerson and A.P. Sistla, eds. Volume 1855. Springer Berlin, Heidelberg, Berlin, Heidelberg, 2000, 154—169.

    10. Coquidé, C., Georgeot, B., Giraud, O. Distinguishing humans from computers in the game of go: A complex network approach. Europhys. Lett. 4, 119 (2017), 48001.

    11. De Nicola, R. Behavioral Equivalences. Springer US, Boston, MA, 2011, 120—127. DOI: 10.1007/978-0-387-09766-4_517

    12. Edward, M. Gedanken-experiments on sequential machines. In Automata Studies. C.E. Shannon and J. McCarthy, eds. Number 34 in Annals of Mathematics Studies. Princeton University Press, Princeton, 1958, 129—153.

    13. Eswaran, K.P., Gray, J.N., Lorie, R.A., Traiger, I.L. The notions of consistency and predicate locks in a database system. Commun. ACM 19, 11 (1976), 624—633.

    14. Fagin, R., Halpern, J.Y., Moses, Y., Vardi, M.Y. Reasoning About Knowledge. MIT Press, Cambridge, MA, USA, 2003.

    15. Fajstrup, L., Goubault, E., Haucourt, E., Mimram, S., Raussen, M. Directed Algebraic Topology and Concurrency. Springer, Switzerland, 2016.

    16. Fields, C. Bell's Theorem from Moore's Theorem. Cambridge, MA, 2012. arXiv 1201.3672v6.

    17. van Glabbeek, R.J. The Linear Time—Branching Time Spectrum I. Elsevier, 2001, 3—99.

    18. Goubault, E., Ledent, J., Rajsbaum, S. A simplicial complex model for dynamic epistemic logic to study distributed task computability. In Proceedings 9th International Symposium on Games, Automata, Logics, and Formal Verification (GandALF) (EPTCS). A. Orlandini and M. Zimmermann, eds. Volume 277, Electronic Proceedings in Theoretical Computer Science, 2018, 73—87.

    19. Guerraoui, R., Henzinger, T.A., Vasu, S. Model checking transactional memories. Distrib. Comput. 22, 3 (2010), 129—145.

    20. Halpern, J.Y., Megiddo, N., Munshi, A.A. Optimal precision in the presence of uncertainty. J. Complex. 1, 2 (1985), 170—196. DOI: 10.1016/0885-064X(85)90010-X

    21. Harel, D. A grand challenge for computing: Towards full reactive modeling of a multi-cellular animal. Bull. EATCS, 81 (2003), 226—235.

    22. Herlihy, M., Kozlov, D., Rajsbaum, S. Distributed Computing Through Combinatorial Topology, 1st edn. Elsevier-Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2013.

    23. Hopcroft, J.E., Motwani, R., Ullman, J.D. Introduction to Automata Theory, Languages, and Computation, 3rd edn. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2006.

    24. Isberner, M., Howar, F., Steffen, B. The TTT algorithm: A redundancy-free approach to active automata learning. In Runtime Verification (LNCS). B. Bonakdarpour and S.A. Smolka, eds. Volume 8734. Springer International Publishing, Cham, 2014, 307—322.

    25. Kearns, M.J., Vazirani, U. An Introduction to Computational Learning Theory. MIT Press, Boston, MA, USA, 1994.

    26. Lamport, L. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (1978), 558—565.

    27. McKinsey, J.C.C., Tarski, A. The algebra of topology. Ann. Math. 45, 1 (1944), 141—191.

    28. Mills, D.L. Computer Network Time Synchronization: The Network Time Protocol on Earth and in Space, 2nd edn. CRC Press, Florida, USA, 2016.

    29. Moses, Y. Relating knowledge and coordinated action: The knowledge of preconditions principle. In Proceedings of the Fifteenth Conference on Theoretical Aspects of Rationality and Knowledge, TARK 2015, Carnegie Mellon University, Pittsburgh, USA, June 4—6, 2015, 231—245.

    30. Patt-Shamir, B., Rajsbaum, S. A theory of clock synchronization (extended abstract). In Proceedings of the Twenty-Sixth Annual ACM Symposium on Theory of Computing (STOC '94). 1994, 810—819.

    31. Poizat, B. A Course in Model Theory—An Introduction to Contemporary Mathematical Logic. Springer, NewYork, NY, USA, 2000.

    32. Ron, D., Rubinfeld, R. Learning fallible deterministic finite automata. Mach. Learn. 2—3, 18 (1995), 149—185. DOI: 10.1007/BF00993409

    33. Rosenberg, A.L. The Pillars of Computation Theory: State, Encoding, Nondeterminism, 1st edn. Springer Publishing Company, Incorporated, 2009.

    34. Sakakibara, Y. Recent advances of grammatical inference. Theor. Comput. Sci. 1, 185 (1997), 15—45.

    35. Sangiorgi, D. Introduction to Bisimulation and Coinduction. Cambridge University Press, New York, NY, USA, 2011.

    36. Trakhtenbrot, B.A., Barzdin, Y.M. Finite Automata, Behavior and Synthesis. North Holland, Amsterdam, 1973.

    37. Vaandrager, F. Model learning. Commun. ACM 2, 60 (2017), 86—95.

    38. Valiant, L. Probably Approximately Correct: Nature's Algorithms for Learning and Prospering in a Complex World. Basic Books, Inc., New York, 2013.

    39. Weikum, G., Vossen, G. Transactional Information Systems: Theory, Algorithms, and The Practice of Concurrency Control and Recovery. Elsevier, 2001.

    40. Wigderson, A. Mathematics and Computation. Princeton University Press, Princeton, USA, 2018. To appear.

    a. Indistinguishability is central also in other consistency conditions like linearizability.

Join the Discussion (0)

Become a Member or Sign In to Post a Comment

The Latest from CACM

Shape the Future of Computing

ACM encourages its members to take a direct hand in shaping the future of the association. There are more ways than ever to get involved.

Get Involved

Communications of the ACM (CACM) is now a fully Open Access publication.

By opening CACM to the world, we hope to increase engagement among the broader computer science community and encourage non-members to discover the rich resources ACM has to offer.

Learn More