
Indistinguishability

Diverse examples depict how indistinguishability plays a central role in computer science.

Posted May 1 2020

Introduction
Key Insights
Automata and Learning
Sequential Reductions in Concurrent Programming
Real-Time Indistinguishability
Global Indistinguishability Structure
Conclusion
References
Authors
Footnotes

The properties commonly ascribed to any object are, in last analysis, names for its behavior.
—Judson Herrick, An Introduction to Neurology, 1918

Key Insights

Lack of knowledge by a computer system component about other components can formally be captured through the concept of indistinguishability. Whenever abstraction or interaction takes place in a computer system, indistinguishability plays a critical role.
Indistinguishability is the source of many lower bounds and impossibility results in CS. It is also the essence behind abstraction techniques so important in computing theory and in the design of large complex systems.
Indistinguishability has a topological nature: local states of components that do not distinguish between two system states induce a higher-dimensional simplicial complex, a structure with topological properties preserved as the system execution evolves.

Dost thou love me? I know thou wilt say “ay,”
And I will take thy word. Yet if thou swear’st
Thou mayst prove false. At lovers’ perjuries,
They say, Jove laughs.
—Shakespeare’s Romeo and Juliet, Act 2

Abstraction—allowing the details of lower-level components to be ignored—and interaction—allowing individual computing entities to cooperate—are key concepts in computer science. Many would argue that they play a crucial role in the success of computing: abstraction allows separate layers of the computing stack to be improved orthogonally, whereas interaction allows the abundance of computing power to be harnessed. This comes at a significant cost: each component of a computer system has limited knowledge about the state of other components. This happens either by choice, in the case of abstraction, or out of necessity, in the case of interaction.

From the perspective of an individual component, all other components, either other layers within the same computing entity or other computing entities, can be considered as an environment. Seen in this way, lack of knowledge about other components can formally be captured through the concept of indistinguishability, namely inability to tell apart different behaviors or states of the environment. Indistinguishability is therefore a consequence of the fact that computer systems are built of individual components, each with its own perspective of the system.

This article argues that because of its intimate relation with key issues in computing, indistinguishability, in its various flavors, plays a critical role in many computing areas. We explain this core concept and demonstrate some of its variants and applications, through four examples, trying to illustrate different, fundamental aspects of indistinguishable situations of abstraction and interaction.

Indistinguishability is at the core of the difficulty of constructing theoretical models for the behavior of a physical system. In our first example, we review the role of indistinguishability in some of the most basic notions in computer science: state, automata, and learning. We will encounter both interaction (as a means to reduce indistinguishability) and abstraction (captured by behavioral equivalence). Here, the environment is seen as a blackbox, implemented by an unknown automaton. What can an experimenter interacting with its environment through input/output symbols infer about the blackbox internals? The experimenter has an evolving mental model of the blackbox as a hypothesis automaton, which is indistinguishable from the actual automaton, given the current state of the interaction. The very notion of “state” is defined in terms of indistinguishability. In this example, indistinguishability has a “semantic” nature, related to computational complexity, namely the number of states in the automaton and the complexity of the learning algorithm.

Our second example demonstrates that indistinguishability is a powerful tool for deriving positive results. Examples abound, such as in artificial intelligence (for example, Turing’s test), cryptography (for example, pseudo-randomness), logic, and others. We consider the example of serializability in concurrent programming, where interaction is through shared variables, and locks permit the set of indistinguishable executions to be reduced. The correctness specification of a program is in terms of requiring that concurrent executions are indistinguishable from appropriate sequential executions. Abstraction is key, and indistinguishability becomes a powerful tool to design concurrent programs and prove their correctness, and in particular, to enable sequential reasoning.

In our third example we move to another very basic form of indistinguishability, related to time, and to the impossibility of observing real time. An interaction among a set of computing entities can be seen as a partial order, representing causality relations between events happening in the system. Lamport’s seminal paper²⁶ can be seen as using indistinguishability in two senses. First, it observed the relation to relativity theory, motivating the idea of reducing concurrent systems by indistinguishability to sequential thinking (by implementing a fault-tolerant distributed system as a replicated state machine). And second, it provided the framework for analyzing time-based algorithms, which depend on quantifying real-time indistinguishability. We illustrate this with a simple example showing how inherent limitations on clock synchronization can be derived through the impossibility of distinguishing the real-time occurrence of events in an execution up to given bounds on message transmission delays and clock drifts.

Prior examples consider a single execution, and analyze a set of executions that are indistinguishable from it, from the perspective of all the participating processes. Our final example considers how distributed computation is limited by the global indistinguishability structure of all possible executions. This structure is defined by a Kripke graph, where edges are labeled by processes that do not distinguish between the global states of the system represented by the two endpoints of the edge. It turns out that higher dimensional topological properties of this graph (more precisely, its dual, a simplicial complex) determine computability and the amount of interaction needed to distributively solve a problem.

Automata and Learning

We start with a simple scenario where a learner is trying to infer the internal construction of a blackbox. The learner knows that the blackbox is a deterministic finite automaton (DFA) accepting a language over an alphabet Σ, but does not know which specific automaton it is. Through a conversation, the learner and the blackbox exchange symbols, and there is a set of automata all indistinguishable with respect to the current conversation. As the interaction evolves, this set of indistinguishable automata shrinks. Eventually, the learner would like it to shrink until it captures the language accepted by the blackbox.


Moore’s theorem. Indistinguishability is at the core of the difficulty of constructing theoretical models for the behavior of a physical system. Ashby’s Cybernetics book³ from 1956 already includes a chapter called “The black-box.” At the same time, Moore¹² proposed the problem of learning finite automata, and studied indistinguishability of deterministic finite state machines, stating (Theorem 2):

“Given any machine S and any multiple experiments performed on S, there exist other machines experimentally distinguishable from S for which the original experiment would have had the same outcome.”

Moore’s theorem shows that no physical system can be characterized as a deterministic state machine on the basis of a finite number of observational outcomes. Roughly, after a finite interaction with the blackbox in which all words have length at most k, the learner has explored only paths of length at most k in the automaton A of the blackbox.

This does not prevent the construction of theoretical models of the behavior of a system, but it does challenge the assumption that a system has only the behaviors that have been characterized by experimental observations, namely the assumption that any theoretical model is complete. Further discussion of the relation between Moore’s theorem and physics appears in Fields.¹⁶

The Myhill-Nerode theorem. If the interaction with the blackbox is only through input/output symbols, how can the learner know anything at all about its internal construction, or even whether it has any states at all? States are not directly observable, so what is a state, from the perspective of the learner? The Myhill-Nerode theorem, “one of the conceptual gems of theoretical computer science” according to Rosenberg,³³ offers a complete mathematical characterization of the notion of state, via basic algebraic properties defined only on input/output behavior.

A string t ∈ Σ^* distinguishes two strings u and v in a language L if ut ∈ L and vt ∉ L, or vice versa. If there is a string t distinguishing u and v, then the state s = δ(q₀, u) must be different from the state s’ = δ(q₀, v), for any automaton M with transition function δ, recognizing L. Conversely, two strings x and y are indistinguishable (by L) if there is no string t ∈ Σ^* that distinguishes them. This gives an equivalence relation, the Nerode congruence on Σ^*, defined by

x ≡_L y if and only if, for every t ∈ Σ^*, xt ∈ L ⟺ yt ∈ L.

Let [s]_L be the set of all strings that are indistinguishable from s, and Q be the set of all corresponding equivalence classes. Thus, the essence of the notion of “state” is an indistinguishability equivalence class; define a DFA Z as follows:

the states Q are the equivalence classes of ≡_L,
the initial state q₀ is [ε]_L, the equivalence class of the empty word,
δ([u]_L, a) = [ua]_L for all [u]_L ∈ Q and a ∈ Σ, and
the accepting states are F = {[u]_L: u ∈ L}.

Selecting a representative for each equivalence class of ≡_L, we get a set of access strings S ⊂ Σ^*. Starting in the initial state, if we follow the transitions as indicated by u ∈ S, it leads us to a state q that is uniquely identified by u. Figure 1 depicts an example of a DFA A, and then it is explicitly represented by access strings as H₂.

Figure 1. Learning example.

The Myhill-Nerode theorem states that L is recognized by Z as defined earlier, and furthermore, Z is minimal: if a DFA M accepts L, then the equivalence relation ≡_M is a refinement of the equivalence relation ≡_L, where

x ≡_M y ⟺ δ(q₀, x) = δ(q₀, y),

and we say that x and y are indistinguishable to M.

Proofs that a given language cannot be recognized by a finite automaton can be viewed as indistinguishability arguments, based on the Myhill-Nerode theorem. Automata with infinitely many states can be viewed as abstractions of programs that can make infinitely many discriminations regarding the structure of a set of possible input strings.
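As a small illustration of such an indistinguishability argument, the sketch below uses the textbook language {aⁿbⁿ : n ≥ 0}, which is not an example from this article, and checks that the strings aⁱ are pairwise distinguishable. The function names are hypothetical and chosen only for this sketch.

```python
def in_language(w: str) -> bool:
    """Membership in L = { a^n b^n : n >= 0 } (an assumed example language)."""
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "a" * n + "b" * n


def distinguishes(t: str, u: str, v: str) -> bool:
    """True if suffix t tells u and v apart: exactly one of ut, vt is in L."""
    return in_language(u + t) != in_language(v + t)


def pairwise_distinguishable(bound: int) -> bool:
    """Check that a^i and a^j (i != j, both below bound) are separated by b^i."""
    return all(
        distinguishes("b" * i, "a" * i, "a" * j)
        for i in range(bound)
        for j in range(bound)
        if i != j
    )
```

Because the check succeeds for every bound, the relation ≡_L has infinitely many equivalence classes for this language, so no finite automaton can recognize it.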

Let λ_q(v) = 1 whenever Z accepts v ∈ Σ^* starting at state q, and λ_q(v) = 0 otherwise. If q = q₀, we may omit the sub-index, that is, L = {w : λ(w) = 1}. For learning, we will use the notion of a string t being a witness that two states are different. Notice that:

For any pair of distinct states q, q’ of Z, there is a distinguishing word t ∈ Σ^* such that λ_q(t) ≠ λ_q’(t).
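A witness of this kind can be found mechanically. The following is a minimal sketch, assuming a complete DFA encoded as a dictionary from (state, symbol) pairs to states; it searches pairs of states breadth-first, so the word it returns is a shortest one. The encoding, the function name, and the three-state example automaton are assumptions made for illustration.

```python
from collections import deque


def distinguishing_word(delta, accepting, q1, q2, alphabet):
    """Return a shortest t with lambda_q1(t) != lambda_q2(t), or None.

    delta: dict mapping (state, symbol) -> state (a complete DFA is assumed).
    accepting: set of accepting states.
    """
    start = (q1, q2)
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (p, q), word = queue.popleft()
        # Exactly one of the two states accepts: 'word' is a witness.
        if (p in accepting) != (q in accepting):
            return word
        for a in alphabet:
            nxt = (delta[(p, a)], delta[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + a))
    return None


# Hypothetical example: states 0, 1, 2 count the number of b's modulo 3;
# 'a' leaves the state unchanged and state 0 is the only accepting state.
EXAMPLE_DELTA = {(q, "a"): q for q in range(3)}
EXAMPLE_DELTA.update({(q, "b"): (q + 1) % 3 for q in range(3)})
EXAMPLE_ACCEPTING = {0}
```

In the example automaton, states 1 and 2 are told apart by the single symbol b, while a state compared with itself yields no witness.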

Learning automata. Following the classic approach of learning finite automata,³⁶ three additional approaches have been studied: computational learning,²⁵ model learning,³⁷ and grammatical inference.³⁴ We next describe automata learning algorithms with a minimally adequate teacher (MAT), demonstrating fundamental ideas that are relevant to all four learning branches.

Minimization algorithms related to the Myhill-Nerode theorem work by merging indistinguishable states of a DFA. We describe algorithms working in the opposite direction, splitting states when discovering a witness string t demonstrating they are distinguishable.

The learner poses membership queries to the blackbox to try to learn the language L it accepts: Does x ∈ Σ^* belong to L? The learner starts with a hypothesis automaton H, that it updates during the conversation. The experimenter has no way of knowing when to stop asking questions, because there could be machines with more and more states, which return answers consistent with the current experiment. Even if the number of states of M is known to the experimenter, an exponential number of membership queries is required.² To circumvent this, the MAT framework admits equivalence queries:

Does H correctly recognize L? If not, give me an example of a string x ∈ Σ^* such that x ∈ L(H) − L(M) or x ∈ L(M) − L(H).

Using membership and equivalence queries, the experimenter can learn L with a number of queries that is polynomial in n, the number of states in Z, the Myhill-Nerode automaton for L, and in m, the length of the longest counterexample returned by the blackbox. (There are always counterexamples of length at most 2n.) The algorithm terminates with a DFA H that is isomorphic to Z. The MAT framework and the efficient algorithm, called L^*, were introduced in a seminal paper of Angluin.¹ We stress that this kind of learning algorithm can be extended to learn other types of blackboxes, for example, logical formulas.

We illustrate the ideas behind the MAT framework through an example (inspired by Isberner et al.²⁴), to show how distinguishing is the basis of learning. Learning something new means splitting a state into two states (which are different, as evidenced by a new witness t).

Assume the blackbox is implemented by the DFA A in Figure 1. The learner maintains a prefix-closed set of access strings S ⊂ Σ^*; recall that access strings are representatives of equivalence classes. Distinct access strings u, u’ correspond to distinct states of A that the learner has identified, and the learner has a witness of this fact, through a string t, such that λ(u · t) ≠ λ(u’ · t). The learner maintains the set D ⊂ Σ^* of these discriminating suffixes, which it has found through membership queries.

The basic data structure is the observation table, with two types of rows (in the figure, a horizontal line in a table divides the two types). Each row of the first type is identified by an access string u ∈ S, and each row of the second type identifies a transition of the hypothesis automaton. Each column is identified by a discriminating string t. The content of a cell in the table is λ(u·t) (where λ refers to the current hypothesis automaton). Each time the learner gets a counterexample, it extracts from it a discriminating suffix. Many algorithms have been proposed, differing in how they extract a discriminating suffix from a counterexample. Here we are only concerned with the fact that it is always possible to do so.

The learner initially has the DFA H₀ as its hypothesis. It then learns that ε discriminates ε and b, and hence splits state [ε], creating state [b]. In the table, a new row for access string b is added, and the transition for b is replaced by the two transitions ba, bb. Thus, the new hypothesis automaton is H₁, and by following string b in this automaton, one “accesses” state [b], an equivalence class of strings indistinguishable from the representative of the class, b (for example, aab also belongs to [b]; it is indistinguishable from b and also accesses [b]). In H₁, we have a (single) column identified by ε, witnessing that states [ε] and [b] are different, because ε concatenated with ε is in L, whereas ε concatenated with b is not. Then, H₂ is produced when the learner finds that b discriminates ε and a, that is, λ(ε · b) ≠ λ(a · b), and hence the state [ε] is split, creating the state [a]. More generally, if w is a counterexample for H, then it has a suffix at such that, for two access strings u, u’ ∈ S, ua and u’ reach the same state in H, but λ(ua · t) ≠ λ(u’ · t). Thus, u’ ∈ S, ua is a transition in the observation table, and both rows are equal; adding t to the table distinguishes ua and u’, with ua being moved to the upper part of the table.
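The loop of splitting states on new witnesses can be sketched in code. The version below is a simplified learner in the MAT style, not the exact algorithm of the papers cited: on a counterexample it simply adds every suffix of it as a column, a crude but safe way of making sure a discriminating suffix enters the table. The teacher is simulated with a known target DFA, and all names (lstar, make_teacher, and so on) are hypothetical.

```python
from collections import deque


def make_teacher(t_delta, t_start, t_accepting, alphabet):
    """Simulated teacher answering membership and equivalence queries."""
    def member(w):
        q = t_start
        for a in w:
            q = t_delta[(q, a)]
        return q in t_accepting

    def find_counterexample(h_delta, h_accepting):
        # Search the product of hypothesis and target for a disagreement.
        start = ("", t_start)  # the hypothesis starts in access string ""
        queue, seen = deque([(start, "")]), {start}
        while queue:
            (h, q), w = queue.popleft()
            if (h in h_accepting) != (q in t_accepting):
                return w
            for a in alphabet:
                nxt = (h_delta[(h, a)], t_delta[(q, a)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, w + a))
        return None

    return member, find_counterexample


def lstar(alphabet, member, find_counterexample):
    S = [""]  # access strings, one per state discovered so far
    D = [""]  # discriminating suffixes (columns of the table)

    def row(u):
        return tuple(member(u + t) for t in D)

    while True:
        # Close the table: each transition row must equal some access row.
        changed = True
        while changed:
            changed = False
            rows = {row(s) for s in S}
            for u in list(S):
                for a in alphabet:
                    if row(u + a) not in rows:
                        S.append(u + a)  # a new state has been witnessed
                        rows.add(row(u + a))
                        changed = True
        # Build the hypothesis automaton from the table.
        index = {row(s): s for s in S}
        delta = {(s, a): index[row(s + a)] for s in S for a in alphabet}
        accepting = {s for s in S if member(s)}
        w = find_counterexample(delta, accepting)
        if w is None:
            return S, delta, accepting
        for i in range(len(w) + 1):  # add all suffixes of the counterexample
            if w[i:] not in D:
                D.append(w[i:])
```

Access strings in S always have pairwise distinct rows, because a string is added only when its row is new and adding columns can only refine rows. For a target that counts the symbol b modulo 3, the learner ends with three access strings.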

Behavioral equivalences. Behavioral equivalences¹⁷ are based on the idea that two systems are equivalent whenever no external observation can distinguish between them. They are used to abstract from unwanted details; to formalize the idea that it is not the internal structure of a system which is of interest but its behavior with respect to the outside world.

Bisimulation, the strongest form, is a rich concept independently discovered in computer science, modal logic, and set theory, with applications to many areas,³⁵ and we would have devoted much more space to it were it not for lack of space. We touched on it with the Myhill-Nerode theorem example, which is the basis for automata minimization algorithms modulo bisimilarity.²³ Another typical application is to prove the correctness of an algorithm, with a big automaton representation M, by analyzing a smaller bisimilar model Z that captures its essence, as illustrated in Figure 2, where R is the bisimulation relation between states of Z and M. Intuitively, two systems are bisimilar if they match each other’s moves. Verifying the algorithm M using a model checking problem M |= ϕ is equivalent to solving the much smaller problem Z |= ϕ. From the indistinguishability perspective, it is interesting to consider iterative abstraction-refinement; see Clarke et al.⁹

Figure 2. Schematic illustration of bisimulation.
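To make “matching each other’s moves” concrete, here is a minimal sketch of a naive partition-refinement computation of bisimilarity on a finite labelled transition system. The encoding as (source, label, target) triples, the function name, and the tiny systems M and Z are assumptions for illustration; practical tools use far more efficient algorithms.

```python
def bisimulation_classes(states, transitions):
    """Map each state to a block id of the coarsest bisimulation partition.

    transitions: set of (source, label, target) triples.
    """
    block = {s: 0 for s in states}  # start with all states in one block
    while True:
        # Signature: current block plus the (label, block) moves available.
        signature = {
            s: (
                block[s],
                frozenset((a, block[t]) for (p, a, t) in transitions if p == s),
            )
            for s in states
        }
        ids = {}
        new_block = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        if len(ids) == len(set(block.values())):
            return new_block  # no block was split: the partition is stable
        block = new_block


# Hypothetical example: M duplicates a state of the smaller model Z.
STATES = ["m0", "m1", "m2", "z0", "z1"]
TRANSITIONS = {
    ("m0", "a", "m1"), ("m0", "a", "m2"),
    ("m1", "b", "m0"), ("m2", "b", "m0"),
    ("z0", "a", "z1"), ("z1", "b", "z0"),
}
```

In this example m0 ends up in the same block as z0, and m1, m2 share a block with z1, so the two-state model Z is bisimilar to the three-state system M.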

Sequential Reductions in Concurrent Programming

A notable example of behavioral equivalence is the notion of serializability, used in several variants by most database systems since their early days in the 1970s. The notion is used in concurrency control of databases and in various transactional systems (processing, management, transactional memory, etc.), both centralized and distributed. A key challenge in the design and analysis of concurrent systems is dealing with all possible interleavings of concurrent processes. Indistinguishability is useful for defining the semantics of a concurrent program, in terms of the notion of serializability. It is also important in verification, as it can be exploited to verify a concurrent program by checking only its sequential executions.^a

Serializability and two-phase locking. Serializability is studied in a setting where processes interact through shared variables. Two executions α₁ and α₂ are indistinguishable to a specific process if the process accesses the same sequence of variables in both executions, and returns the same results. An execution is serializable⁸,³⁹ if it is indistinguishable to all processes from a sequential execution, in which each process executes its procedure invocation to completion, without interleaving of any other process.

The classic way to ensure serializability is to protect shared variables with locks, using a locking protocol governing how locks are acquired and released. Thus, an execution of the system, α, is a sequence of events each taken by a single process; the events either access shared variables, or acquire and release locks on these shared variables. In two-phase locking (2PL),¹³ each process has a growing phase of lock acquisition (in some order), followed by a shrinking phase of lock release. Namely, once a process has released a lock, it can no longer acquire any lock, even on another variable. For example, given shared variables X, Y, and two processes p₁, p₂:

Two-phase locking is a mechanism for enforcing indistinguishability from sequential executions, as demonstrated by the following geometric interpretation. An execution of the processes p₁, p₂ defines a particular interleaving of the order in which the processes acquire and release the locks. It can be represented as a path in a two-dimensional space (see Figure 3). If a lock is acquired or released by p₁, the path moves one unit on the horizontal axis; similarly, when a lock is acquired or released by p₂, the path moves one unit on the vertical axis. All paths start in (0, 0), when no operations have occurred, and they all end in (1, 1), where all operations have occurred, by both processes.

Figure 3. Geometric interpretation of all interleavings of two processes acquiring and releasing shared variables X, Y.

Each time two operations of an execution are swapped, in a way that is indistinguishable to both processes, the path is deformed. In Figure 3, two such paths are illustrated: P₁ which is sequential (p₁ then p₂), and P₂ where acq(Y) by p₂ is swapped with rel(X) by p₁.

There are two forbidden rectangles, where no execution path can go through: in the vertical (blue) one, Y would be acquired simultaneously by both, whereas in the horizontal rectangle (red), the same holds for X. Their union is the forbidden region where no execution enters. Notice that if both processes acquire X and Y (in either order), the protocol enters the deadlock region. The main point is that there are two classes C₁, C₂, of homotopic paths, that is, paths within a class can be deformed to each other. In one class, all paths go above the forbidden region and are indistinguishable from a sequential execution in which p₂ goes first, whereas in the other class, all executions go below the forbidden region and are indistinguishable from a sequential execution where p₁ goes first.

Notice that in a program where both processes acquire the locks in the same order, the forbidden region is a square, and hence no deadlocks can happen. Directed topology and the geometric theory of execution paths homotopy are studied in Fajstrup et al.,¹⁵ showing a direct representation of indistinguishability as continuous deformation of paths in an n-dimensional space (for n processes).
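The grid of interleavings can also be explored mechanically. The sketch below assumes two hypothetical lock programs, since the concrete program behind Figure 3 is not reproduced here, and uses grid coordinates that count executed lock operations rather than the normalized coordinates of the figure. It reports the reachable grid points and those in which neither process can move.

```python
def held_after(program, k):
    """Locks held by a process after executing its first k operations."""
    locks = set()
    for action, var in program[:k]:
        if action == "acq":
            locks.add(var)
        else:
            locks.discard(var)
    return locks


def explore(prog1, prog2):
    """Return (reachable grid points, deadlocked grid points)."""
    final = (len(prog1), len(prog2))
    reachable, frontier, deadlocks = {(0, 0)}, [(0, 0)], set()
    while frontier:
        i, j = frontier.pop()
        successors = []
        if i < len(prog1):
            action, var = prog1[i]
            # An acquire is enabled only if the other process lacks the lock.
            if action == "rel" or var not in held_after(prog2, j):
                successors.append((i + 1, j))
        if j < len(prog2):
            action, var = prog2[j]
            if action == "rel" or var not in held_after(prog1, i):
                successors.append((i, j + 1))
        if not successors and (i, j) != final:
            deadlocks.add((i, j))
        for point in successors:
            if point not in reachable:
                reachable.add(point)
                frontier.append(point)
    return reachable, deadlocks


# Hypothetical programs: p1 locks X then Y; p2 locks Y then X.
P1 = [("acq", "X"), ("acq", "Y"), ("rel", "X"), ("rel", "Y")]
P2 = [("acq", "Y"), ("acq", "X"), ("rel", "Y"), ("rel", "X")]
```

With opposite acquisition orders the only deadlocked point is (1, 1), where each process holds one lock and waits for the other; when both processes run P1, no reachable point is deadlocked. Points inside a forbidden rectangle never appear among the reachable ones.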

Verifying two-phase locking. Because indistinguishable executions can be substituted for each other, checking whether one execution satisfies a particular property informs us whether all indistinguishable executions satisfy this property. Therefore, indistinguishability facilitates the verification of concurrent programs. When a program is serializable, certain properties can be verified by considering only sequential (noninterleaved) executions of the program. This is equivalent to reasoning assuming a sequential setting.

But how can we prove that a program is serializable? Obviously, if we prove that it follows the two-phase locking protocol, then it is serializable. However, in reality, we are not given an execution example, but a program, possibly including conditional and repeat statements. Thus, we need to consider all its possible executions, to see if each one satisfies the two-phase locking regime. It turns out that we can ensure that the program follows 2PL, by considering only its sequential executions. The next theorem holds provided the program has no nonterminating loops.

THEOREM 3.1. If any execution satisfies two-phase locking when events of different processes are not interleaved, then any interleaved execution also satisfies two-phase locking.

Proving the theorem goes through showing that every execution that violates 2PL is indistinguishable from a noninterleaved execution in which the protocol is also violated. This implies that if we check (manually or mechanically) all noninterleaved executions of the protocol without finding a violation of 2PL, then all executions of the protocol do not violate 2PL.
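A mechanical check of a single execution against 2PL might look as follows. This is a minimal sketch under an assumed event encoding, triples of process, action, and variable; it flags exactly the two kinds of violation considered in the proof, acquiring a lock after releasing one and accessing an unlocked variable, and it does not model mutual exclusion between processes.

```python
def violates_2pl(events):
    """events: list of (process, action, variable) triples, with action in
    {"acq", "rel", "access"} (a hypothetical encoding of an execution)."""
    held = {}          # process -> set of variables it currently locks
    shrinking = set()  # processes that have already released some lock
    for proc, action, var in events:
        locks = held.setdefault(proc, set())
        if action == "acq":
            if proc in shrinking:
                return True  # acquires a lock after releasing a lock
            locks.add(var)
        elif action == "rel":
            locks.discard(var)
            shrinking.add(proc)
        elif action == "access":
            if var not in locks:
                return True  # accesses an unlocked location
    return False
```

For instance, a process that acquires X, releases it, and then acquires Y is reported as a violation, whereas acquiring X and Y, accessing both, and then releasing both is not.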

Toward a contradiction, assume the claim does not hold and let α = α’(i_t, e) be the shortest execution that violates 2PL for which there is no indistinguishable noninterleaved execution; see Figure 4. Note that (i_t, e) is an event of process p_i_t that violates 2PL, that is, it acquires a lock after releasing a lock, or accesses an unlocked location. As α is the shortest such execution, we know that for the prefix α’ of α there is an indistinguishable noninterleaved execution α_i₁ α_i₂ …, where each α_i_j contains events by p_i_j only.

Figure 4. Moving the event (i_t, e) to after p_i_t’s events.

We argue that moving the event (i_t, e) to after p_i_t’s events in this noninterleaved execution will still cause p_i_t to take the offending event. Intuitively, this happens because the event depends only on information that is local to the process p_i_t or locked by it, and p_i_t does not distinguish between the original execution and the noninterleaved execution. Namely, p_i_t has the same state at the end of α and at the end of α_i₁ … α_i_t. Therefore, the event can be moved to appear after the events α_i_t of the same process. Hence, p_i_t will make the same offending event (i_t, e), implying that the noninterleaved execution α_i₁ … α_i_t(i_t, e) also violates 2PL.

The reduction holds for any noncentralized locking protocol, such as commonly used ones like two-phase, hand-over-hand, tree, and dynamic graph locking. It allows sequential reasoning, whether manual or automated, about concurrent programs both in verifying that they adhere to a locking protocol and in development of algorithms for them. The reduction enables simpler and more efficient verification algorithms of a class of properties, called transaction-local. It justifies the use of sequential Hoare Logic or sequential type systems or sequential abstract interpretation to verify that the program adheres to a locking protocol. Programmers wishing to add, for example, a new procedure to swap two adjacent elements in a list to a program that uses hand-over-hand locking, do not have to worry about concurrent interleaving with other procedures. More details are in Attiya et al.,⁶ such as the case of nonterminating loops.

Indistinguishability is also used to prove a theorem that shows that if serializability is ensured in a program with two processes and two variables, it is ensured in any program, provided the implementation satisfies certain structural properties, one of them being symmetry.¹⁹ The proof goes by contradiction, taking an execution of the larger system that violates serializability and perturbing it into a bad execution for a system with two processes and two variables; a key step relies on an indistinguishability argument using symmetry.

Real-Time Indistinguishability

The previous examples describe asymmetric interactions, where one party interacts with another party, whose semantics (internal details) are hidden or abstracted away. Our next example ignores the semantics of the interactions, concentrating only on their timing.

The fundamental problem is estimating distant simultaneity—the time difference between the occurrence of two spatially separated (at different processes) events. This is behind many real-time applications in computer science that depend on clock synchronization, such as synchronizing cellphone communications, positioning systems (for example, GPS), failure detection, efficient use of resources (for example, releasing a connection), timestamping events and timeouts, and so on.

Computer clocks are typically based on inexpensive oscillator circuits and quartz crystals that can easily drift seconds per day. However, atomic clock time, so ubiquitous and integral to modern life, trickles down to the clocks we use daily, distributed through the Network Time Protocol and other means. Atomic clocks are so precise that if such a clock had existed when Earth began, about 4.5 billion years ago, it would be off by only 30 s today.

How precisely the time of an atomic clock can be estimated depends on the transmission delay bounds along communication paths from the atomic clock to the local computer, and on the drift bounds of the clocks of the computers along such paths. In other words, when a computer gets a message with the time of some atomic clock, the actual moment when the clock reading took place could have occurred at any moment within some range B, and from the computer’s perspective, it is indistinguishable which exact moment within B is the actual one. Thus, the computer’s best estimate of the atomic clock time is based on |B|/2. Indeed, selecting the midpoint is hedging the bets, because anything else leaves open the possibility of a bigger mistake. We now explain in more detail how to compute B.

Consider a process p₁ trying to synchronize its clock with an atomic reference clock, assumed to give real-time exactly, located in p₀. The basic interaction is when p₁ has a direct link to p₀, as illustrated in Figure 5. Process p₁ sends a message to p₀ and gets back a response. The send event by p₁ occurs at real-time 1, the event of p₀ receiving it occurs at real-time 6 (to simplify the example, we assume p₀ responds immediately, in the same event), and the event of p₁ receiving the response occurs at real-time 12. Real-time is not directly observable; instead, each event occurs at some local time, which the process can observe. The precise meaning of real-time not being observable is through indistinguishability. Namely, suppose that, although the first message delay was 5 time units, it is known that it must have taken at least 4 time units; also, assume the return message cannot take more than 9 time units. As for the local clock of p₁, suppose its drift is bounded, such that between the sending and the receiving events, at most 12 time units could have passed.

Figure 5. p₁ sends and p₀ responds.

What is the latest time that the receive event of p₁ could have occurred with respect to the event at p₀? Answering also what is the earliest it could have occurred would yield the desired indistinguishability interval B, within which the event could have occurred, and its midpoint would be used to compute the optimal correction to the local clock time of p₁. The crucial insight is that to compute how late the receive event could have occurred with respect to the event at p₀, we have to shift its point of occurrence to the right as much as possible, subject to two constraints: (1) the maximum delay of the second message (9 units) and (2) the minimum delay of the first message plus the minimum length of the time interval from the send event to the receive event of p₁ (the fastest that p₁’s clock could have been running). In the example, the latest that the receive event can happen is at real-time 14, determined by the fastest delay of the first message and the slowest clock drift of p₁, and not by the largest delay of the second message (which could have been delivered at 15).
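The arithmetic behind the values 14 and 15 can be spelled out. The sketch below is one reading of the two constraints, in which the event at p₀ is held fixed at real-time 6 and the question is how late p₁ could have received the response; the variable names are chosen here for illustration only.

```python
t_p0 = 6              # real time of the event at p0, held fixed
min_delay_first = 4   # lower bound on the delay of the first message
max_delay_second = 9  # upper bound on the delay of the second message
max_elapsed_p1 = 12   # most real time separating p1's send and receive

# Constraint (1): the response takes at most its maximum delay.
latest_by_second_message = t_p0 + max_delay_second

# Constraint (2): p1 sent at least min_delay_first before the event at p0,
# and at most max_elapsed_p1 time units separate its send from its receive.
latest_send = t_p0 - min_delay_first
latest_by_first_message = latest_send + max_elapsed_p1

# Both constraints must hold, so the tighter one decides.
latest_receive = min(latest_by_second_message, latest_by_first_message)
```

Under this reading the first constraint alone would allow real-time 15, but the second limits the receive event to real-time 14, in agreement with the values quoted in the text.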

More generally, p₁ may be further away from the process p₀ with an atomic reference clock, and an arbitrary execution α is used to synchronize p₁’s clock, where many more message exchanges take place, along different paths between p₁ and p₀. The goal is to estimate the indistinguishability interval of an event e at process p₁, with respect to an event e₀ in p₀. The previous example hints that the task at hand has to do with computing distances, on paths formed by indistinguishability intervals, formalized as follows.

The execution α is represented by a weighted directed graph G = (V, E, r, l). Each vertex of V is an event of α, either a send or a receive event, identified by the process where it happens and by its position among the events of that process. The directed edges E are causal relationships: there is a directed edge between two consecutive events in the same process, and there is a directed edge from e to e’ whenever e is a send event and e’ is the corresponding receive event. The weight functions r, l timestamp the events. For each e ∈ V, real(e) is the real-time occurrence of event e, and local(e) is the time according to the clock of the process where e happens. Since the clock of p₀ is perfect, for all events e in p₀, we have real(e) = local(e).

For each pair of events e₁, e₂ joined (in either direction) by a directed edge of G, bounds on the relative real-time occurrence of the two events can be estimated,

real(e₁) − real(e₂) ≤ B(e₁, e₂),

both when the edge represents a message transmission delay, and when it represents the time it takes a process to execute consecutive computational events. Then, define local(e₁, e₂) = local(e₁) − local(e₂), and let w(e₁, e₂) = B(e₁, e₂) − local(e₁, e₂). These weights w can be positive or negative, but summing them along a cycle always gives a nonnegative value (the telescopic sum of local(e_i, e_i+1) along a cycle is 0). Thus, for a pair of events e₁ and e₂, the distance d(e₁, e₂) with respect to these weights is well defined. Interestingly, observe that d(e, e’) = 0, for any two events in p₀. It is not hard to show³⁰ that the indistinguishability interval of an event e at some process p₁, with respect to an event e₀ in p₀, is as follows.

THEOREM 4.1. real(e) ∈ [−d(e₀, e), d(e, e₀)]

The meaning of this theorem is that e might have occurred at any time in this interval. Furthermore, for each such time, there is an execution indistinguishable to all processes.
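The distance computation behind Theorem 4.1 can be sketched in Python. Because the weights w may be negative while every cycle has nonnegative total weight, a Bellman-Ford style relaxation is one suitable way to obtain d. That choice, the function names, and the two-event toy instance (one message from p₀ to p₁ with a delay between 1 and 4 time units) are assumptions made for illustration and are not taken from the cited work. The sketch also assumes that B(e₁, e₂) is an upper bound on real(e₁) − real(e₂).

```python
from math import inf

def edge_weights(bounds, local):
    """w(e1, e2) = B(e1, e2) - (local(e1) - local(e2)) for every bounded pair."""
    return {(e1, e2): b - (local[e1] - local[e2])
            for (e1, e2), b in bounds.items()}

def distances_from(source, events, weights):
    """Bellman-Ford distances from source; weights may be negative."""
    dist = {e: inf for e in events}
    dist[source] = 0
    for _ in range(len(events) - 1):
        for (u, v), w in weights.items():
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Any further improvement means a negative cycle, i.e. inconsistent bounds.
    for (u, v), w in weights.items():
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle: bounds are inconsistent")
    return dist

def interval(e, e0, events, weights):
    """The pair (-d(e0, e), d(e, e0)) that appears in Theorem 4.1."""
    return (-distances_from(e0, events, weights)[e],
            distances_from(e, events, weights)[e0])

# Hypothetical instance: p0 sends at event "s", p1 receives at event "r".
local = {"s": 10, "r": 3}      # p1's clock is not yet synchronized
bounds = {("r", "s"): 4,       # real(r) - real(s) <= 4   (maximum delay)
          ("s", "r"): -1}      # real(s) - real(r) <= -1  (minimum delay)
events = ["s", "r"]
weights = edge_weights(bounds, local)
low, high = interval("r", "s", events, weights)
print(low, high)
```

In this toy instance the endpoints are 8 and 11. Since real(s) = local(s) = 10 and the delay lies between 1 and 4, the receive event happened between real times 11 and 14, that is, between 8 and 11 units ahead of the reading 3 on p₁’s clock.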


These results are based on Patt-Shamir and Rajsbaum,³⁰ a follow-up to earlier work⁵,²⁰ that studied how closely in real time processes can be guaranteed to perform a particular action in a failure-free environment. The possibility of failures affects the size of the indistinguishability interval, which makes it an interesting topic from the indistinguishability perspective. The standard technique is to consider several clock reference values and take the average after discarding the most extreme values. There are many papers on clock synchronization algorithms; see, for example, Attiya and Ellen⁴ for references from the more theoretical perspective, and the book²⁸ for the more practical perspective.

Global Indistinguishability Structure

The previous examples of indistinguishability have a local flavor: we look at a single execution α and the executions indistinguishable from α to all processes. It turns out that studying executions that are indistinguishable to a subset of processes leads to understanding the global indistinguishability structure of all executions. This uncovers an intimate relation between indistinguishability and higher-dimensional topological properties. The overview presented here is very informal; for a more precise description, see Herlihy et al.²²

Initial indistinguishability structure. Consider three processes b, g, w (black, gray, white) that communicate with each other to solve some task. When the computation begins, each process receives an input value. In the binary consensus task, the set of input values is {0, 1}. In certain renaming tasks, processes start with distinct input values taken from the set {0, 1, 2, 3}. Initially each process knows only its own input. An initial state

I = {(b, x_b), (g, x_g), (w, x_w)}

is a set of three initial local states, each one consisting of a pair of values: a process and its input. Two initial states I₁ and I₂ are indistinguishable to a process if the process has the same input value in both states, that is, if I₁ ∩ I₂ contains its initial local state. If we draw an initial state as a triangle whose vertices are the local initial states, I₁ and I₂ share an edge if they are indistinguishable to two processes, and they share only a vertex if only one process does not distinguish between them. Figure 6 shows that the input complex for consensus looks like a triangulated sphere, and the one for renaming looks like a triangulated torus. Each one is a simplicial complex because it consists of a family of sets closed under containment (each edge of a triangle is a set of two local states, and each vertex is a singleton set).

Figure 6. Consensus and renaming input complexes.
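The two input complexes can be built and inspected with a few lines of Python. This is a minimal sketch: the representation of a local state as a (process, input) pair, the helper names, and the use of the Euler characteristic (vertices minus edges plus triangles) as a sanity check are assumptions made for illustration. A value of 2 is consistent with a sphere and a value of 0 with a torus.

```python
from itertools import combinations, permutations, product

PROCESSES = ("b", "g", "w")

def input_complex(assignments):
    """One triangle per input assignment: a set of three (process, input) pairs."""
    return {frozenset(zip(PROCESSES, values)) for values in assignments}

def indistinguishable_to(state1, state2):
    """Processes whose initial local state belongs to both initial states."""
    return {p for (p, _) in state1 & state2}

def euler_characteristic(triangles):
    """#vertices - #edges + #triangles of the complex spanned by the triangles."""
    vertices = {v for t in triangles for v in t}
    edges = {frozenset(e) for t in triangles for e in combinations(t, 2)}
    return len(vertices) - len(edges) + len(triangles)

# Binary consensus: every process independently starts with 0 or 1.
consensus = input_complex(product((0, 1), repeat=3))
# Renaming: the three processes start with distinct values from {0, 1, 2, 3}.
renaming = input_complex(permutations(range(4), 3))

i1 = frozenset(zip(PROCESSES, (0, 0, 0)))
i2 = frozenset(zip(PROCESSES, (0, 1, 0)))
print(sorted(indistinguishable_to(i1, i2)))       # processes b and w
print(len(consensus), euler_characteristic(consensus))
print(len(renaming), euler_characteristic(renaming))
```

The states i1 and i2 differ only in the input of g, so they share the edge formed by the local states of b and w.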

How indistinguishability evolves. As processes communicate with each other, learning about each other’s input values, the structure of indistinguishability evolves. Suppose that the processes publicly announce their input values, but each process may miss hearing either or both of the other processes’ announcements, as determined by a communication pattern, namely a directed graph G on the vertices b, g, w; an arrow v → v’ signifies that v’ hears the input from v. Thus, v’ hears inputs from the set N^–(v’) of processes that have an arrow toward vertex v’. Which input value v’ hears from v depends on the initial state I to which G is applied. Applying G to an initial state I produces a new state, {(b, view(b)), (g, view(g)), (w, view(w))}, where the local state of p, view(p), is the subset of I restricted to the processes in N^–(p).

Figure 7 illustrates the IS-patterns (immediate snapshot or block executions), a subset of all possible communication patterns. An IS-pattern for a set of processes P is defined by an ordered partition S₁, …, S_k of P (1 ≤ k ≤ |P|), specifying that processes in S_i hear the values from all processes in S_j, j ≤ i. Consider, for instance, the IS-pattern {b, g, w} consisting of the trivial partition of {b, g, w}, which corresponds to the center triangle, where all processes hear from each other. The arrows g ↔ w belong also to the top triangle, corresponding to the partition {b}, {g, w} where the only difference is that b does not hear from the other two processes.

Figure 7. IS-communication patterns.
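The IS-patterns for three processes can be enumerated directly from the definition as ordered partitions. The following Python sketch does so and applies each pattern to one initial state; the generator, the helper names, and the sample inputs are hypothetical and serve only as an illustration. A process in block S_i is taken to hear every process in S_j with j ≤ i, which includes itself.

```python
PROCESSES = ("b", "g", "w")

def ordered_partitions(items):
    """Yield every ordered partition of items as a list of frozensets."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in ordered_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [partition[i] | {first}] + partition[i + 1:]
        # ... or into a new singleton block at any position.
        for i in range(len(partition) + 1):
            yield partition[:i] + [frozenset({first})] + partition[i:]

def apply_is_pattern(partition, state):
    """New state: each process paired with its view, a subset of the initial state."""
    heard = set()
    triangle = set()
    for block in partition:
        heard |= block                      # S_i hears all S_j with j <= i
        view = frozenset((q, state[q]) for q in heard)
        for p in block:
            triangle.add((p, view))
    return frozenset(triangle)

state = {"b": 0, "g": 1, "w": 0}            # sample initial state
patterns = list(ordered_partitions(PROCESSES))
triangles = {apply_is_pattern(pattern, state) for pattern in patterns}
vertices = {v for t in triangles for v in t}
print(len(patterns), len(triangles), len(vertices))
```

For three processes this yields 13 patterns, each producing a different triangle, over 12 distinct local states (four possible views per process).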

IS-patterns are important because, when applied to an input complex I, the resulting protocol complex P is a subdivision of I. In Figure 8, IS-patterns are applied to two consensus input simplexes. One can see that the edge joining b and w with input 0 belongs to two input triangles; this edge is subdivided into three edges in P, which belong to both the blue and the yellow subdivided triangles, due to IS-patterns where b and w do not hear from g (and hence cannot tell whether its input is 0 or 1).

Figure 8. Two input triangles, application of IS-patterns on them, and the requirement to produce consensus outputs.

In the same way that we applied each IS-pattern to each initial state to get P, we can again apply each IS-pattern, but now to each state of P, obtaining a subdivision of P, and so forth. Each time the processes communicate once more through an IS-pattern, the input complex is subdivided more and more finely. Indeed, a fundamental discovery is that there are topological invariants, preserved no matter how many times the processes communicate, and no matter what they tell each other each time they communicate. In the case of any unreliable asynchronous communication by either message passing or read/write shared-memory, P “looks like” (is homotopic to) the input complex I.

Remarkably, topological invariants determine the computational power of the model. In other, more reliable models of computation (for example, at most t out of n, t < n − 1 processes can fail, or synchronous models, or shared-memory primitives stronger than read/write registers), P preserves weaker topological invariants, and “holes” are created, giving the model its additional computability power.

Specifications as indistinguishability requirements. Suppose that after communicating through IS-patterns, each process produces an output value. Let (p, view(p)) be the local state of a process p in the protocol complex P, after an IS-pattern. Hence, the output value produced by p is a function of its view, δ(p, view(p)). Namely, if p does not distinguish between two triangles of P, then it must decide the same value in both.

A simplicial complex defined by triangles labeled with output values is used to specify the task that the decision values should satisfy. For binary consensus, the output complex, in Figure 8, consists of two disjoint triangles, one labeled with 0 output values in all its three vertices, and another labeled with 1 in all its three vertices. Thus, a task 〈I, O, Δ〉 consists of an input complex I, an output complex O, and a relation Δ specifying, for each input triangle σ ∈ I, which output triangles of O, denoted Δ(σ), represent valid outputs for the task.

Finally, Figure 8 is meant to represent that the decision function δ solves the task if, for any triangle σ’ in P, δ(σ’) is a triangle τ ∈ O such that τ ∈ Δ(σ), where σ is the input triangle for σ’.
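The condition on δ can be checked mechanically for a given candidate. The Python sketch below tests two hypothetical decision rules for binary consensus after a single round of IS-patterns; the encoding of patterns as "levels", the rule names, and the helper functions are assumptions for illustration. For consensus, Δ(σ) is taken to contain the all-0 triangle if 0 is an input of σ and the all-1 triangle if 1 is an input of σ. The check covers only the candidates supplied and one round of communication, so it illustrates the definition rather than proving any impossibility result.

```python
from itertools import product

PROCESSES = ("b", "g", "w")

def one_round_triangles(inputs):
    """Triangles of P produced from one input triangle by all IS-patterns.

    A pattern is encoded by giving each process a level: p hears q exactly
    when level[q] <= level[p]. Different level assignments may encode the
    same ordered partition; the set removes such repetitions.
    """
    triangles = set()
    for levels in product(range(len(PROCESSES)), repeat=len(PROCESSES)):
        level = dict(zip(PROCESSES, levels))
        triangles.add(frozenset(
            (p, frozenset((q, inputs[q]) for q in PROCESSES
                          if level[q] <= level[p]))
            for p in PROCESSES))
    return triangles

def solves_consensus(delta):
    """True if delta maps every triangle of P to a valid consensus output."""
    for values in product((0, 1), repeat=3):
        inputs = dict(zip(PROCESSES, values))
        for triangle in one_round_triangles(inputs):
            outputs = {delta(p, view) for (p, view) in triangle}
            agreement = len(outputs) == 1        # a single triangle of O
            validity = outputs <= set(values)    # that triangle is in Delta(sigma)
            if not (agreement and validity):
                return False
    return True

def decide_min(p, view):
    """Candidate rule: output the smallest input value heard."""
    return min(value for (_, value) in view)

def decide_zero(p, view):
    """Candidate rule: always output 0."""
    return 0

print(solves_consensus(decide_min), solves_consensus(decide_zero))
```

Both candidates fail. With inputs b = 0, g = 1, w = 1 and the pattern {g}, {b, w}, the rule decide_min makes g output 1 while b and w output 0, so the outputs do not form a triangle of O. The rule decide_zero outputs 0 even when every input is 1.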

To summarize, a new indistinguishability global structure (represented by P) is generated after communication, and a task specifies a target indistinguishability structure (represented by O). The question is whether P can be (simplicially) mapped to O respecting Δ. This is a topological question with deep implications to distributed task computability in various models (message-passing and shared memory, synchronous and asynchronous, with crash and Byzantine failures).

This formalization can be interpreted as a question of gaining knowledge, as explained in Goubault et al.,¹⁸ where it is described how the simplicial complexes described in this section have an equivalent representation as Kripke models. Roughly speaking, each triangle is a state of the Kripke graph, and if two triangles share a vertex of process p, then the two corresponding states are connected by an edge labeled p. Indeed, there is an intimate relation between indistinguishability and the theory of reasoning about knowledge for distributed computing described in Fagin et al.¹⁴
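The rough correspondence just described can be made concrete on the consensus input complex. The following Python sketch builds the labeled edges of the Kripke graph from the triangles; the representation and the function name are hypothetical, and the sketch follows only the informal description given here, not the precise construction of the cited work.

```python
from itertools import combinations, product

PROCESSES = ("b", "g", "w")

def kripke_edges(triangles):
    """Edges (state1, state2, p): the two triangles share the vertex of process p."""
    edges = set()
    for t1, t2 in combinations(triangles, 2):
        for (p, _) in t1 & t2:
            edges.add((t1, t2, p))
    return edges

# States of the Kripke graph: the eight triangles of the consensus input complex.
triangles = [frozenset(zip(PROCESSES, values))
             for values in product((0, 1), repeat=3)]
edges = kripke_edges(triangles)
edges_labeled_b = [edge for edge in edges if edge[2] == "b"]
print(len(triangles), len(edges), len(edges_labeled_b))
```

An edge labeled p joins exactly those states in which p has the same input, so p cannot tell them apart.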

Conclusion

Indistinguishability plays a central role in computer science. Examples from different areas (automata theory, learning, specification, verification, distributed computing and epistemic logic) demonstrate how different levels of abstraction entail distinct notions of indistinguishable observations, and different uses of indistinguishability (to show computability and complexity limitations, and also to design solutions). Some examples should be treated in more depth, and there are many additional application areas.

One application area is computational learning and related complexity topics, as recently reviewed in Wigderson.⁴⁰ Many subareas can be viewed through the lens of probabilistic indistinguishability, for example, PAC learning,³⁸ cryptography, communication complexity, indistinguishability despite errors,³² and coding theory.

Indistinguishability plays a role in artificial intelligence, for example, in Turing’s test, and more generally, Turing-like tests for other applications, such as Go simulators¹⁰ and writing a program simulating a living organism.²¹

We discussed formal methods, another area where indistinguishability is a key, notably in behavioral equivalences.¹¹ And we discussed logic, where the longstanding connection between modal logic and topology goes back to McKinsey and Tarski,²⁷ and up to today, with a topological semantics for belief.⁷ Another interesting example from logic is Ehrenfeucht-Fraïssé games.³¹

Distributed computing is all about interactions, with abundant instances where indistinguishability is a key. Examples include labeling schemes, synchronizers, mutual exclusion, anonymity and symmetry, and partitioning. Many impossibility results are discussed in Attiya and Ellen.⁴

Finally, indistinguishability cuts across topics. Multi-agent epistemic logic relies on Kripke models to represent indistinguishability.¹⁴ These, in turn, can be considered as the dual of simplicial complexes,¹⁸ and we described how the indistinguishability structure evolves as interaction occurs, preserving topological properties. Also, having knowledge means being able to distinguish between situations, so the same action must be taken in indistinguishable setups.²⁹ We discussed the duality between indistinguishability and knowledge also in the context of learning automata.

Acknowledgments. We would like to thank Hans van Ditmarsch, Jérémy Ledent, Arnold Rosenberg, Jennifer Welch, and the reviewers for helpful comments. Supported by grants from UNAM-PAPIIT IN106520 and ISF 380/18.


Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from permissions@acm.org or fax (212) 869-0481.

DOI

10.1145/3376902

May 2020 Issue

Published: May 1, 2020

Vol. 63 No. 5

Pages: 90-99
