In the early 2000s, the multicore revolution began, when it became difficult to increase the clock speed of microprocessors, and manufacturers shifted to the approach of increasing performance through multiple processing cores per chip. The “free lunch” of relying on increasing clock speed to obtain faster programs was over. Unfortunately, developing multi-process programs can be dramatically more difficult. The difficulty of reasoning about many things happening at the same time is compounded by the fact that processes are executed in a highly asynchronous and unpredictable way, possibly even crashing, and furthermore, affected by low-level architectural details. An approach strongly advocated to cope with these difficulties was elegantly presented by Shavit:^{36}

### Key Insights

- While linearizability is the de facto correctness condition to reason about safety properties in concurrent systems, it deals only with sequential specifications.
- Reducing the complexity of reasoning about concurrency through sequential thinking has limitations, which can be dealt with through two natural, progressively more general generalizations of linearizability.
- Set-linearizability is a correctness condition used to reason about specifications of sets of operations happening at the same time. Interval-linearizability can be used for specifications about operations that can overlap in time in arbitrary ways. Both retain the properties of linearizability that have made it so popular: proving the correctness of individual objects implies the correctness of the whole system; it is possible to reason about nonblocking implementations; and they are state based.

*“It is infinitely easier and more intuitive for us humans to specify how abstract data structures behave in a sequential setting, where there are no interleavings. Thus, the standard approach to arguing the safety properties of a concurrent data structure is to specify the structure’s properties sequentially.”*

Providing the illusion of a sequential computation from the users’ perspective has been used since the early seminal works that paved the way for modern distributed systems, for example, by Lamport,^{27} as well as in the subsequent advances in defining general, practical correctness conditions, most notably, *linearizability* introduced by Herlihy and Wing.^{24} The successful history of reducing the complexity of concurrency through sequential thinking spans over half a century but may be reaching its limits.^{31}

*First limitation: Inherently concurrent problems.* It is not clear that providing users with the illusion of a sequential computation should go as far as implementing solutions only to sequential problems, because some problems are inherently concurrent. Consider for example a ticket reservation system, say for seats at a concert. There is nothing wrong with taking care of two reservations of different seats concurrently, and it may be more efficient than serializing the two reservations. Furthermore, it may be acceptable to let both reservations happen *at the same time* from the users’ perspective, namely, neither of them went first. Linearizability precludes such concurrent problems from being implemented in a distributed system simply because, by definition, linearizability is a tool to show that a concurrent algorithm implements a problem specified through a *sequential specification* (see sidebars 1 and 2). In the case of the ticket reservation example, a sequential version would be a queue. In what sense is a queue a “sequential version” of the data structure of the example? This is an interesting philosophical question in its own right; in any case, a sequential version may artificially change the semantics of the problem.

There are examples of distributed problems that do not have any sequential version that could even remotely mimic their behavior. Consider Java’s *Exchanger* object, which allows two processes (threads) to atomically exchange a value if invocations are concurrent. In a sequential version, the object would always return that the exchange failed to take place.

*Second limitation: The penalty of sequential specifications.* The second reason for doubting the axiom of illusion of sequentiality is that linearizable implementations of sequential specifications may be expensive or may not exist at all. A classic result is the impossibility of solving consensus by asynchronous processes that may crash, using only simple `read/write` primitives.^{15,28} It is impossible to build concurrent implementations of some of the classic sequential specifications (for example, sets, queues, stacks) that completely eliminate the use of expensive synchronization primitives.^{5} Finally, there are formal complexity lower bounds for some sequential objects.^{11} As a result, distributed system designers in some cases have to either give up the idealized goals of scalability and availability or relax the linearizability consistency requirement.

*Three benefits of linearizability.* Despite these limitations, programmers are afraid of relaxing consistency for very real and concrete technical reasons. Building large systems demands composability, and up to now linearizability is the de facto standard because it allows us to do so: it is sufficient to prove that each of the implementations of concurrent objects is linearizable to guarantee that the whole system of multiple objects is also linearizable. Additionally, linearizability is particularly attractive for reasoning about nonblocking implementations (see Sidebar 3). Finally, in any linearizable implementation there is a well-defined notion of the state of the system at any time, which in turn facilitates writing correctness proofs.^{22}

*The universe of concurrent object specifications.* Numerous correctness conditions have been proposed over the years. More recently, algorithms implementing concurrent objects have been adapted to cope with multicore processors with relaxed memory architectures, requiring new correctness conditions. For example, Viotti and Vukolic^{39} present a formal framework for defining correctness conditions for multicore architectures, covering both standard conditions for totally ordered memory and newer conditions for relaxed memory. Yet, the sequential paradigm is so entrenched that correctness of concurrent implementations is understood in terms of conditions that determine relationships between concurrent executions of an implementation and sequential executions of the object being implemented. Even recently proposed correctness conditions for recoverable objects in multicore architectures with durable persistent memory are based on sequential specifications, for example, strict linearizability,^{1} recoverable linearizability,^{4} and durable linearizability.^{25}

*Concurrent specifications.* The desire for truly concurrent semantics is old, going back to Lamport,^{27} where a specification of a concurrent object is simply the set of all the concurrent executions that are considered correct. Reasons for the desire of concurrent specifications have been argued since at least the seminal work of Montanari:^{29} concurrent specifications are more informative, testing sequences defining partial orderings may carry the same information as an exponentially larger number of interleaving traces. Another reason is that in truly concurrent models the existing fine parallelism of the application is fully specified. In some situations, truly concurrent semantics is the most natural, as in Petri nets. Other important works presenting concurrent semantics have been proposed.^{14}

*Linearizability-based concurrent specifications.* The Linearizability framework (see sidebars 1 and 2) can be progressively extended to specify more concurrent objects, as illustrated in Figure 1. The automaton used in a sequential specification is modified to specify set-sequential objects, where a transition can be labeled with more than one operation taking place at the same time. Further, *interval-sequential* objects are defined with an automaton where operations overlap in arbitrary ways. The aim is to define concurrent specifications while preserving the notion of state. *Set-linearizability* and *interval-linearizability*, the associated correctness conditions, define a way of associating a concurrent execution with a concurrent specification, either of a set-sequential or an interval-sequential object. Intuitively, we move from *linearization points* to *linearization sets* of points, and more generally to *linearization intervals*, representing the overlapping in time among concurrently executed operations.

**Figure 1. A linearizability-based hierarchy.**

The goal of this article is to describe set-linearizability and interval-linearizability, stressing that the benefits of linearizability are kept: both are composable, state-based, and nonblocking conditions. We first present set-linearizability,^{20,30} where more than one operation can be linearized at the same point, and then interval-linearizability,^{8} where operations can overlap in arbitrary ways. In fact, interval-linearizable specifications have been shown to be the most expressive ones.^{17}

### Specifying and Implementing a Concurrent Object

The *lattice agreement* problem has been actively investigated recently. Unlike consensus, it is implementable in an asynchronous system where processes can fail. Several papers have presented replicated state machine implementations based on lattice agreement,^{12,40} instead of the usual consensus-based implementations. Recently, Kuznetsov, Rieutord, and Tucci-Piergiovanni^{26} showed how to implement re-configurable lattice agreement and explained how it can directly be used to obtain re-configurable versions of several sequential types such as max-register, conflict detector, and in fact, any state-based commutative abstract data type (ADT).

A *join-semilattice* is a tuple (*L*, ⊑), where *L* is a set partially ordered by the binary relation ⊑, such that for all elements *x*, *y* ∈ *L*, there exists a least upper bound, called the *join.* For the purposes of this exposition, we take *L* to be the set of all finite sets of natural numbers and ⊑ to be the subset relation, with the join of *x*, *y* being *x* ∪ *y.*
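The concrete lattice used in this exposition is small enough to sketch directly. The helper names `leq` and `join` below are illustrative choices, not taken from the cited works; the sketch only assumes that elements of *L* are represented as Python sets of natural numbers.

```python
def leq(x, y):
    """The order ⊑ on L: x is below y when x is a subset of y."""
    return frozenset(x) <= frozenset(y)


def join(x, y):
    """Least upper bound of x and y in this lattice: their union."""
    return frozenset(x) | frozenset(y)


x, y = {1, 2}, {2, 5}
j = join(x, y)

# j is an upper bound of both x and y ...
assert leq(x, j) and leq(y, j)
# ... and it lies below any other upper bound, for example {1, 2, 5, 9}.
assert leq(j, {1, 2, 5, 9})
# x and y themselves are incomparable, so ⊑ is only a partial order.
assert not leq(x, y) and not leq(y, x)
```

Representing elements as immutable `frozenset` values keeps the join free of side effects, which matches the mathematical definition.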

*Specifying lattice agreement.* The lattice agreement abstraction is presented in^{26} in the style that has been traditionally used in distributed computing: a list of requirements that operations must satisfy. An operation *propose* (*x*) invoked by process *p* with input *x* ∈ *L*, returns a value *v’* ∈ *L*, such that:

**Validity.** If a *propose* (*v*) operation returns *v’* then *v’* is the join of some proposed values including *v* and all values returned by previous operations.

**Consistency.** The values returned are totally ordered by ⊑.

In addition, there is a progress requirement that will not be central here; a common one is that if a process invokes a propose operation and does not fail, then the operation eventually returns (see Sidebar 3).

This specification is not very formal. The idea of using this style of specification started with the goal of describing *objects*, by which one usually means a sequential specification (Sidebar 1). Consider an automaton defining our lattice agreement example as a sequential object. A transition from *s* to *s’* would be labeled with *propose* (*v*) → *v’*, meaning that if the object is in state *s*, a process that invokes *propose* (*v*) could get back *v’*, in which case the object moves to state *s’.*

The first challenge is to come up with such a sequential automaton, which would be a formal specification of the above informal list of requirements. A natural one would identify each state with a pair of subsets of *L*, to remember all the elements that have been proposed and those that have been returned so far. The initial state is *s*_{0} = (∅, ∅), and for the transition from *s* = (*s*_{1}, *s*_{2}) to *s’*, we would have that *s’* = (*s*_{1} ∪ *v*, *s*_{2} ∪ *v’*). The reader can easily complete the formal specification of the sequential automaton.

*Identifying correct implementations.* The second challenge is to check the correctness of an execution of the object against the sequential automaton specification, following the linearizability definition (Sidebar 2). In Figure 2, an example of an execution for three processes, *p*_{1}, *p*_{2} and *p*_{3}, is represented, where one can see that operation calls (from invocation to response) overlap in time. Linearizability requires finding a linearization point between each invocation and its response, so that these points define a valid sequential execution of the lattice agreement automaton. In the example of Figure 2, the linearization points do not produce a valid sequential execution because the operation call by *p*_{1} cannot possibly return the value 2, which has not yet been proposed. Furthermore, from a linearizability point of view, the execution is incorrect, because there are no linearization points that satisfy the automaton specification: any way we order the operations by *p*_{1} and *p*_{2} would give an incorrect response for one of them. The same problem occurs with respect to any sequential specification of lattice agreement.^{a}

**Figure 2. Example of a non-linearizable lattice agreement execution.**
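The argument that no linearization exists can be checked by brute force on a small history. The sketch below is an illustration under stated assumptions: the timestamps are invented, the proposals and responses ({1} → {1, 2}, {2} → {1, 2}, {3} → {1, 2, 3}) are inferred from the discussion of Figure 2 rather than read off the figure, and `legal_sequential` is one plausible reading of the validity requirement for a sequential execution, not the authors' automaton.

```python
from itertools import permutations


def respects_real_time(order, history):
    """If a responds before b is invoked, a must come first in the order."""
    pos = {op: k for k, op in enumerate(order)}
    return all(pos[a] < pos[b]
               for a in history for b in history
               if history[a]["resp"] < history[b]["inv"])


def legal_sequential(order, history):
    """Validity, read sequentially: each response is a join of values
    proposed so far, including the caller's own value and everything
    returned by earlier operations."""
    proposals = []           # values proposed so far, in order
    returned = frozenset()   # join of the values returned so far
    for op in order:
        v, out = history[op]["arg"], history[op]["out"]
        proposals.append(v)
        # Largest join of already proposed values that fits inside `out`.
        covered = frozenset().union(*[p for p in proposals if p <= out])
        if not (v <= out and returned <= out and covered == out):
            return False
        returned |= out
    return True


def linearizations(history):
    """All sequential orders that are both real-time consistent and legal."""
    return [order for order in permutations(history)
            if respects_real_time(order, history)
            and legal_sequential(order, history)]


def op(inv, resp, arg, out):
    return {"inv": inv, "resp": resp,
            "arg": frozenset(arg), "out": frozenset(out)}


# Assumed shape of the execution of Figure 2: p1 and p2 overlap, p3 is later.
figure2 = {"p1": op(0, 3, {1}, {1, 2}),
           "p2": op(1, 4, {2}, {1, 2}),
           "p3": op(5, 7, {3}, {1, 2, 3})}

# Same intervals, but p1 answers {1}: now p1, p2, p3 is a legal order.
sequential_friendly = dict(figure2, p1=op(0, 3, {1}, {1}))

print(linearizations(figure2))
print(linearizations(sequential_friendly))
```

Whichever of *p*_{1} or *p*_{2} is placed first would have to return a value containing a proposal that has not yet been made, so the first history admits no legal order, while the second admits exactly one.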

But what is wrong with the execution of Figure 2? It certainly satisfies the lattice agreement consistency requirement stated above. The problem is the validity requirement—the execution would violate it with respect to any sequential specification of lattice agreement. Validity seems to assume a priori that no operations are invoked concurrently. But the whole point of the lattice agreement state machine replication idea was to avoid using consensus to order operations!

There are several lattice agreement implementations.^{26,40} For illustration, consider the simple one-shot lattice agreement implementation using `read/write` primitives on a shared memory in Figure 3 (adapted from Castañeda et al.^{8}); one-shot means that each process invokes the *propose* operation only once.^{b} In the algorithm, each process first writes its proposal in a dedicated entry of the shared memory (Line 1), and then repeatedly reads the whole memory and computes the join of the proposals that have been written so far, until it sees no new relevant proposal (Lines 3 to 7). The execution of Figure 2 can be produced by this algorithm, if *p*_{1} and *p*_{2} write their proposals (in Line 1, in any order), and then both execute the loop of Line 4 twice, both returning {1, 2}, before *p*_{3} starts executing its code.

**Figure 3. A one-shot lattice agreement implementation based on read/write primitives (code of process p_{i}).**
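The behavior described above can be simulated in a few lines. This is a sketch of the description in the text, not a transcription of Figure 3, so the line numbers cited in the text do not apply to it. Each operation is modeled as a Python generator that yields after every shared-memory step, which lets a hand-written schedule play the role of the asynchronous scheduler; `propose`, `finish`, and `mem` are illustrative names.

```python
def propose(mem, i, v):
    """One-shot propose by process i; yields after each shared-memory step."""
    mem[i] = frozenset(v)            # write the proposal in the dedicated entry
    yield
    previous = None
    while True:
        seen = frozenset()
        for j in range(len(mem)):    # read the whole memory, entry by entry
            seen |= mem[j]
            yield
        if seen == previous:         # no new relevant proposal since last pass
            return seen
        previous = seen


def finish(op):
    """Run an operation without interruption until it returns its result."""
    while True:
        try:
            next(op)
        except StopIteration as done:
            return done.value


# Schedule matching the discussion of Figure 2 (values are assumptions):
mem = [frozenset()] * 3
p1 = propose(mem, 0, {1})
p2 = propose(mem, 1, {2})
next(p1)                             # p1 writes its proposal
next(p2)                             # p2 writes its proposal
r1 = finish(p1)                      # p1 runs the loop twice
r2 = finish(p2)                      # p2 runs the loop twice
r3 = finish(propose(mem, 2, {3}))    # p3 starts only after both returned
print(r1, r2, r3)
```

Because every entry is written at most once, two consecutive passes that produce the same join imply that the memory held exactly that join at some instant between them, which is why the returned values are always comparable.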

*From sequential objects to truly concurrent objects.* What is lattice agreement, given that it has no sequential specification? What problem is the algorithm of Figure 3 solving? One encounters publications with similar situations: a list of requirements is used to specify a problem with no sequential specification, and for lack of a name for such an entity, researchers have either used different ad hoc names such as “abstraction” or “problem” or “type,” or simply called it an “object with no sequential specification,” without further explanation of what this might be.

Finally, in 1994, Neiger^{30} came up with an idea largely overlooked in the literature. So overlooked, that some 20 years later Hemed, Rinetzky and Vafeiadis^{20} independently rediscovered it—the idea of a *set-sequential* specification (called *concurrency-aware* specification^{20}). The transitions of the automaton specification are labeled with sets of operation invocations, each one together with its response, as in Figure 4.

**Figure 4. Part of a set-sequential automaton.**

The corresponding correctness condition is *set-linearizability.* Its aim is to allow for the *simultaneity* of some operations: one can put linearization points grouping together several operations at the same moment of time in a set. Figure 5 shows a set-linearization of the execution we have been considering. It can be tested against the set-sequential automaton illustrated in Figure 4 to show that it is a correct execution, that is, set-linearizable.

**Figure 5. A set-linearizable lattice agreement execution.**

But now let us consider the execution of Figure 6. Again, it seems to satisfy the consistency requirement of lattice agreement, and it seems correct with respect to an intuitive interpretation of the validity requirement. Furthermore, again there is an algorithm that can produce it, namely the one in Figure 3. As before, *p*_{1} and *p*_{2} execute their write operations (Line 1) concurrently, but now *p*_{1} executes alone the loop of Line 4 twice, while *p*_{2} is delayed, then *p*_{3} executes its write operation, and finally both *p*_{2} and *p*_{3} execute the loop of Line 4 twice.

**Figure 6. An interval-linearizable execution of lattice agreement that is not set-linearizable.**

However, the execution of Figure 6 is not only not linearizable but also is not set-linearizable. The reason is the operations by *p*_{1} and *p*_{3} are not concurrent, and hence they cannot be set-linearized together. But the operation of *p*_{2} must be set-linearized with both because its proposed value has been returned by the operation of *p*_{1}, and it has returned the value proposed by the operation of *p*_{3}. Namely, the operation of *p*_{2} could not have taken effect at a single point of time.
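The contrast between the two executions can be checked mechanically by enumerating every ordered partition of the operations into concurrency classes. The sketch below makes several assumptions: the timestamps are invented, the proposals and responses are inferred from the text, and the set-sequential specification used is the simplest plausible one, in which every operation in a class returns the join of all values proposed in that class and in earlier classes. Figure 4 may describe a more permissive automaton.

```python
from itertools import combinations


def ordered_partitions(ops):
    """Yield every sequence of non-empty, disjoint classes covering ops."""
    ops = list(ops)
    if not ops:
        yield []
        return
    for size in range(1, len(ops) + 1):
        for first in combinations(ops, size):
            rest = [op for op in ops if op not in first]
            for tail in ordered_partitions(rest):
                yield [set(first)] + tail


def set_linearization(history):
    """Return a valid sequence of concurrency classes, or None.

    history maps an operation name to (invocation, response, proposed, returned).
    """
    for classes in ordered_partitions(history):
        index = {op: k for k, cls in enumerate(classes) for op in cls}
        # Real-time order: if a ends before b starts, a's class comes first.
        if not all(index[a] < index[b]
                   for a in history for b in history
                   if history[a][1] < history[b][0]):
            continue
        proposed = frozenset()
        legal = True
        for cls in classes:
            for op in cls:
                proposed |= history[op][2]
            if any(history[op][3] != proposed for op in cls):
                legal = False
                break
        if legal:
            return classes
    return None


# p1 and p2 overlap and both return {1, 2}; p3 runs afterwards.
figure5 = {"p1": (0, 3, {1}, {1, 2}),
           "p2": (1, 4, {2}, {1, 2}),
           "p3": (5, 7, {3}, {1, 2, 3})}

# p2 overlaps both p1 and p3, while p1 ends before p3 starts.
figure6 = {"p1": (0, 3, {1}, {1, 2}),
           "p2": (1, 8, {2}, {1, 2, 3}),
           "p3": (4, 7, {3}, {1, 2, 3})}

print(set_linearization(figure5))
print(set_linearization(figure6))
```

In the second history, the response of *p*_{1} forces *p*_{2} into the class of *p*_{1} or an earlier one, while the response of *p*_{2} forces it into the class of *p*_{3} or a later one, and real-time order places *p*_{1} strictly before *p*_{3}; the search therefore finds nothing.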

This type of example motivated us to propose in Castañeda et al.^{8} one further generalization of linearizability—*interval-linearizability.* The corresponding generalization of a set-sequential object is an interval-sequential object. It is defined in terms of an automaton whose transitions are labeled with sets of operation invocations, but each such invocation is not necessarily matched with a response; the response can appear later in another transition. The interval of each operation is now marked with either one linearization point (in case it appears to be executed instantaneously) or with two linearization points (in case it overlaps with at least two other non-overlapping operations). An example of part of such an automaton for lattice agreement is presented in Figure 7. This automaton validates the execution of Figure 6. Notice that the operation of *p*_{2} is invoked in the first transition, and its corresponding response appears in the second transition, concurrently with the operation of *p*_{3} (a single point).

**Figure 7. Part of an interval-sequential automaton.**

Hence, interval-linearizability extends set-linearizability by allowing *time-ubiquity* of operations—an operation can appear as being executed concurrently with several consecutive, non-overlapping operations.
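An interval-sequential execution can be represented as a sequence of transitions, each carrying a set of invocations and a set of responses. The validator below is a sketch under an assumed specification: a response is legal when it equals the join of every value proposed up to and including the transition in which the response appears. The automaton of Figure 7 is not reproduced here, and the function and variable names are illustrative.

```python
def valid_interval_execution(transitions):
    """Check a sequence of (invocations, responses) transitions.

    invocations maps an operation name to its proposed value; responses
    maps an operation name to its returned value. An operation may be
    invoked in one transition and respond in a later one.
    """
    proposed = frozenset()
    pending = set()
    for invocations, responses in transitions:
        for op, value in invocations.items():
            pending.add(op)
            proposed |= frozenset(value)
        for op, value in responses.items():
            # A response must belong to a pending operation and equal the
            # join of everything proposed so far.
            if op not in pending or frozenset(value) != proposed:
                return False
            pending.remove(op)
    return not pending               # every invocation needs a response


# p2 is invoked in the first transition and responds in the second one,
# together with p3, as in the discussion of Figure 6 and Figure 7.
figure6_run = [
    ({"p1": {1}, "p2": {2}}, {"p1": {1, 2}}),
    ({"p3": {3}}, {"p2": {1, 2, 3}, "p3": {1, 2, 3}}),
]

# Forcing p2 to respond in the first transition fails: 3 is not proposed yet.
squeezed = [
    ({"p1": {1}, "p2": {2}}, {"p1": {1, 2}, "p2": {1, 2, 3}}),
    ({"p3": {3}}, {"p3": {1, 2, 3}}),
]

print(valid_interval_execution(figure6_run))
print(valid_interval_execution(squeezed))
```

Letting the interval of *p*_{2} span two transitions is exactly what a single concurrency class cannot express.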

*Keeping the benefits of linearizability.* We stress that the extensions of linearizability to set-linearizability and interval-linearizability are not done at the price of losing any of its three properties, as proved in Castañeda et al.^{8} First, both are state-based specifications, which is useful for documentation and correctness proofs. Second, they are composable: one can safely use several linearizable, set-linearizable, or even interval-linearizable object implementations together because their composition will maintain the corresponding property. Third, the nonblocking property of linearizability is also preserved (see Sidebar 3).

We proceed now to explore in more detail set-linearizability and then interval-linearizability, with additional examples.

### Set-Linearizability

As discussed, the idea is to allow pre-defined subsets of operations to be seen as occurring simultaneously; such a set of operations is called a *concurrency class.* Hence, set linearizability is associated with operation simultaneity. A *set-sequential* object is specified by an automaton whose transitions are labeled with concurrency classes. It defines a set of valid *set-sequential* executions, each one consisting of a sequence of concurrency classes. The corresponding correctness notion, set-linearizability, allows several operation calls to be linearized at the same linearization point, namely, all the operations belong to the same concurrency class.

Observe that when each concurrency class consists of a single operation, *set-linearizability* boils down to linearizability (recall Figure 1). Moreover, the containment is strict, since there are set-sequential objects with no sequential specification, such as the two set-sequential objects discussed next.

The *exchanger object.* The Java documentation provides the following specification:

*“A synchronization point at which threads can pair and swap elements within pairs. Each thread presents some object on entry to the exchange method, matches with a partner thread, and receives its partner’s object on return.”*

Clearly there is no sequential specification of an exchanger. Such a specification is outside the domain of linearizability simply because linearizability rules out concurrency. An exchanger however can be specified as a set-sequential object whose set-sequential executions contain a concurrency class for each pair of operation calls exchanging elements, and a concurrency class for every operation call that is not able to exchange its element. Figure 8 depicts an example of a set-linearizable execution of a concurrent exchanger implementation.

**Figure 8. Example of a set-linearizable exchanger execution.**
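The set-sequential specification of the exchanger described above translates into a short validity check on a sequence of concurrency classes. In this sketch an operation call is a hypothetical triple `(thread, offered, received)`, and a call that fails to exchange is modeled by `received` being `None`; that encoding is an assumption made for illustration and is not how the Java API reports failure.

```python
def valid_exchanger_execution(classes):
    """Check a set-sequential exchanger execution.

    Each class holds either two calls that swap their elements or a
    single call that could not exchange its element.
    """
    for cls in classes:
        if len(cls) == 1:
            (_, _, received), = cls
            if received is not None:     # a lone call must have failed
                return False
        elif len(cls) == 2:
            (_, a_offered, a_received), (_, b_offered, b_received) = cls
            # Each partner receives exactly what the other one offered.
            if a_received != b_offered or b_received != a_offered:
                return False
        else:
            return False                 # exchanges only happen in pairs
    return True


swap_then_fail = [
    [("t1", "a", "b"), ("t2", "b", "a")],   # t1 and t2 exchange
    [("t3", "c", None)],                    # t3 finds no partner
]
lone_success = [[("t1", "a", "b")]]         # a swap needs a partner

print(valid_exchanger_execution(swap_then_fail))
print(valid_exchanger_execution(lone_success))
```

A sequential specification corresponds to allowing only classes of size one, in which case every call fails, as noted earlier for a sequential version of the object.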

Exchangers are useful synchronization objects that have been used in several concurrent implementations;^{23,35,37} however, the lack of a sequential specification of an exchanger makes correctness proofs intricate. As a concrete example, consider the scalable and linearizable *elimination backoff* stack implementation of Hendler, Shavit, and Yerushalmi.^{23} Very briefly, the idea in this stack implementation is as follows: whatever the state of the stack, two concurrent `push(x)` and `pop()` invocations can be “eliminated” if the `pop()` operation returns *x*, since the pop operation can be linearized right after `push(x)`; if an operation does not find a concurrent operation to be eliminated with, it uses a “slower” stack implementation to complete its invocation. The elimination scheme is implemented through an array of exchanger objects where operations try to exchange elements.

Having a formal specification of the exchanger object is important for developing modular verification techniques for concurrent implementations. The set-sequential specification of the exchanger object has been exploited in Hemed et al.^{20} to obtain a modular proof of the elimination backoff stack. Namely, the exchanger objects used in the elimination scheme are independently shown to be correct (that is, set-linearizable) and then the elimination back-off stack is shown to be linearizable assuming the elimination scheme is made of set-linearizable exchanger objects. Thus, the correctness of the elimination back-off stack does not rely on any implementation of the elimination scheme.

*A relaxed queue.* Despite its benefits, linearizability has drawbacks beyond the fact that there are inherently concurrent objects without a sequential specification. There are impossibility results for several concurrent objects, such as sets, stacks, queues, and work-stealing, showing that any linearizable implementation must use synchronization mechanisms that can be implemented only through expensive instructions of current multicore architectures. These synchronization mechanisms are the read-modify-write primitives, such as `fetch&inc`, `swap`, and `compare&swap`, and the read-after-write synchronization pattern (also known as the *flag principle*^{22,32}), in which a process writes in a shared variable *A* and then reads another shared variable *B.*^{c}

Herlihy^{21} showed that any linearizable nonblocking implementation of a queue or stack cannot use only the simple `read/write` primitives; it must use more powerful read-modify-write primitives. In the same direction, Attiya et al.^{5} proved that any linearizable implementation of a set, stack, queue, or work stealing with minimal progress guarantees must use either read-after-write synchronization patterns or read-modify-write primitives.

Recently, set-linearizability has been used to define relaxations of queues and stacks that admit set-linearizable implementations using only `read/write` primitives and no read-after-write synchronization patterns, hence evading the impossibility results. Intuitively, in a queue with *multiplicity*,^{9} the usual definition of a sequential queue is relaxed so that distinct dequeue operation calls can return the same item, but this can happen only if they are concurrent. Then, dequeue operations returning the same item belong to the same concurrency class. In all other cases, the object behaves like a usual sequential queue. The expressiveness of set-linearizability allows one to specify precisely that the relaxation can happen *only* in case of concurrency.

As an example of a set-linearizable implementation, consider the simple implementation of a single-enqueuer queue with multiplicity in Figure 9. *Single-enqueuer* means there is one distinguished process, called *the enqueuer*, that can invoke the enqueue operation. As we will explain, the implementation uses only `read/write` primitives and is devoid of read-after-write synchronization patterns. (For clarity, we consider here the single-enqueuer case, but it has been shown there are set-linearizable implementations for the multi-enqueuer case with similar properties.^{9}) It merits mention that the impossibility results in Attiya et al.^{5} and Herlihy^{21} also apply to the single-enqueuer case. The implementation in Figure 9 is derived from the implementations in Castañeda et al.,^{7} where work-stealing with multiplicity is studied and shown to be useful for deriving relaxed work-stealing implementations with better performance than classic (that is, non-relaxed) work-stealing solutions when solving problems such as parallel spanning tree.

**Figure 9. A set-linearizable implementation of a single-enqueuer queue with multiplicity (code for process p_{i}).**

The single-enqueuer implementation in Figure 9 uses a shared array *ITEMS*, where items are stored and from which they are removed, and two shared integers, *TAIL* and *HEAD*, to store the current tail and head of the queue. *ITEMS* and *TAIL* are manipulated through simple `read/write` primitives, while *HEAD* is manipulated by dequeuers through the linearizable operations `max_read` and `max_write`: `max_read` returns the maximum value written so far in *HEAD*, and `max_write` writes a new value in *HEAD* only if it is greater than the largest value that has been written so far. Aspnes, Attiya, and Censor-Hillel have proposed wait-free (see Sidebar 3) linearizable implementations of `max_read` and `max_write` that use only `read/write` primitives and are devoid of read-after-write synchronization patterns,^{2} and thus the implementation in Figure 9 possesses these properties too.

In the implementation, whenever the enqueuer wants to enqueue an item, it first reads the current value *t* of *TAIL*, then stores its item *x* in *ITEMS*[*t*], and finally increments *TAIL* by one (Lines 9 to 11). A dequeue operation first reads the current value *h* of *HEAD* using `max_read` and then reads the value *x* in *ITEMS*[*h*] (Lines 13 and 14). If *x* is distinct from ⊥, then *x* is an item that has been enqueued, and the operation returns *x* after incrementing *HEAD* by one using `max_write`, which logically “marks” the item in position *ITEMS*[*h*] as taken (Lines 16 and 17). Otherwise *x* is equal to ⊥, which means that the queue is empty as *HEAD* has “surpassed” *TAIL*, and hence the operation returns empty (Line 19).

Two or more concurrent dequeue operation calls can return the same item. For example, the operations can read one after the other, in some arbitrary order, the same value *h* from *HEAD* in Line 13 and then read one after the other, again in some order, the value in *ITEMS*[*h*] in Line 14. Namely, all these operations read the item in *ITEMS*[*h*] before the first of them “marks” the item in *ITEMS*[*h*] as taken by updating *HEAD* using `max_write` in Line 16. Due to the semantics of `max_read` and `max_write`, *HEAD* only “moves forward,” hence a “slow” dequeue operation cannot write a small value in *HEAD* that could cause another (possibly non-concurrent) dequeue operation to return an item that has already been dequeued. Therefore, dequeue operation calls that return the same item can be linearized at the same linearization point, that is, in the same concurrency class.
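The scenario just described can be replayed in a small simulation. This is a sketch of the description in the text rather than a transcription of Figure 9, so the line numbers cited above do not apply to it. It makes several simplifying assumptions: `MaxRegister` is a sequential stand-in for the linearizable `max_read` and `max_write` operations, a dictionary stands in for the unbounded array *ITEMS*, indices start at 0, and each dequeue is a generator that yields between shared-memory steps so that a hand-written schedule can interleave two calls.

```python
EMPTY = None   # plays the role of ⊥ in an entry that was never written


class MaxRegister:
    """Sequential stand-in for the linearizable max_read / max_write object."""

    def __init__(self, initial=0):
        self._value = initial

    def max_read(self):
        return self._value

    def max_write(self, v):
        if v > self._value:          # smaller values never overwrite
            self._value = v


class MultiplicityQueue:
    def __init__(self):
        self.items = {}              # unbounded array ITEMS
        self.tail = 0                # TAIL, written only by the enqueuer
        self.head = MaxRegister(0)   # HEAD, shared by the dequeuers

    def enqueue(self, x):
        t = self.tail                # read TAIL
        self.items[t] = x            # store the item
        self.tail = t + 1            # advance TAIL

    def dequeue(self):
        """Generator: yields between consecutive shared-memory steps."""
        h = self.head.max_read()
        yield
        x = self.items.get(h, EMPTY)
        yield
        if x is EMPTY:
            return "empty"
        self.head.max_write(h + 1)   # mark position h as taken
        return x


def finish(op):
    """Run an operation without interruption until it returns its result."""
    while True:
        try:
            next(op)
        except StopIteration as done:
            return done.value


q = MultiplicityQueue()
q.enqueue("a")
q.enqueue("b")

d1, d2 = q.dequeue(), q.dequeue()
next(d1)                             # d1 reads HEAD = 0
next(d2)                             # d2 reads HEAD = 0 as well
first = finish(d1)                   # both concurrent calls take "a"
second = finish(d2)
third = finish(q.dequeue())          # a later, non-concurrent call takes "b"
fourth = finish(q.dequeue())         # nothing is left
print(first, second, third, fourth)
```

The late `max_write(1)` issued by the second call has no effect because *HEAD* already holds 1, which is the “moves forward” property that keeps a non-concurrent dequeue from seeing `"a"` again.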

### Interval-Linearizability

Set-sequential specifications are more expressive than sequential ones, but there are situations where an operation appears as being executed concurrently with a sequence of several sequentially executed operations, as discussed previously. An interval-sequential specification^{8} defines a sequence of concurrency classes, with the possibility that an operation has its invocation in one concurrency class and its response in a later concurrency class, implying that it executed over an interval of time instead of a single point. Figure 10 takes the view of a poset determined by the operation intervals, and the corresponding order diagram, for the three types of specifications.^{d}

**Figure 10. Interval poset vs. order diagram.**

*The batched counter.* The rapid increase of data production calls for parallelizing computations in big data processing systems to achieve timely responsiveness. A common task in such systems is counting events in batches (for example, the number of queries to a website) for later statistical analysis. The task is modeled by the *batched counter*, a sequential object that stores an integer *R* initialized to 0 and provides two operations: `update(x)`, which increments *R* by *x*, and `query()`, which returns the current value of *R.* One would like to have a concurrent implementation that allows several processes to update and query the counter concurrently and rapidly. It turns out that linearizability precludes efficient implementations of the batched counter that use only the simple `read/write` primitives. Rinberg and Keidar^{33} proved that for any wait-free linearizable `read/write` implementation of the batched counter for *n* processes, the update operation (arguably the most frequently called operation in a big data processing system) has step complexity^{e} Ω(*n*). This lower bound calls for well-defined relaxations of the object that admit efficient implementations.

Indeed, Rinberg and Keidar proposed a relaxation of the batched counter that has a wait-free `read/write` implementation with constant step complexity of its update operation. The implementation appears in Figure 11. The relaxation is formally defined through the *intermediate value* (IV) linearizability formalism introduced in Rinberg et al.,^{33} an extension of linearizability for quantitative data processing sequential objects. Loosely speaking, IV-linearizability allows query operation calls to return a value that approximates the correct response; the sequential specification of the object we seek to implement defines the correct responses. An execution is IV-linearizable if the output of every query operation lies in an *interval* defined by two sequential executions of the object.

**Figure 11. An interval-linearizable implementation of a batched counter (code for process p_{i}).**

The relaxed batched counter can be specified by an interval-sequential automaton. In each valid interval-sequential execution, update operations happen atomically and sequentially: each such operation spans a single concurrency class, and no more than one update operation appears in a concurrency class. A query operation, however, can span several concurrency classes, in which case its output value lies in the interval defined by the contents of the counter when the operation starts and when it terminates. A query operation can also appear in a single concurrency class, denoting that it is not concurrent with any other operation; in this case it must return the current value of the counter.
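The validity condition for a query in this specification can be illustrated with a small checker. This is only a sketch under simplifying assumptions: the update operations are given as a list of amounts in their sequential order, and a query is described by the hypothetical parameters `first` and `last`, the number of updates already applied when it starts and when it terminates. The function names are illustrative and do not come from the cited work.

```python
from itertools import accumulate


def counter_values(updates):
    """Counter contents before any update and after each update, in order."""
    return [0, *accumulate(updates)]


def query_is_valid(updates, first, last, output):
    """Check one query against the relaxed batched counter specification.

    updates: amounts of the update operations, in their sequential order
    first:   number of updates already applied when the query starts
    last:    number of updates already applied when the query terminates
    output:  value returned by the query
    """
    if not 0 <= first <= last <= len(updates):
        raise ValueError("need 0 <= first <= last <= len(updates)")
    values = counter_values(updates)
    start_value, end_value = values[first], values[last]
    # min/max keeps the check meaningful even if some amounts are negative
    # (an assumption; the text only discusses increments).
    low, high = min(start_value, end_value), max(start_value, end_value)
    return low <= output <= high
```

When `first == last` the query spans a single concurrency class, the interval collapses to one value, and the check reduces to the sequential specification.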

Figure 12 depicts an interval-linearizable execution of the relaxed batched counter. The first query operation by *p*_{3} returns value 10 because it is implemented by reading the `update`(2) of *p*_{1} and then the two update operations of *p*_{2}. This `query()` → 10 operation cannot be linearized at a single point: `update`(1) of *p*_{1} must happen before the `query()` → 8 of *p*_{2}, which in turn happens before the `update`(3) of *p*_{2}.

**Figure 12. An interval-linearizable execution of a batched counter object.**

The implementation of the batched counter in Figure 11 from Rinberg et al.^{33} is indeed interval-linearizable. In an interval-linearization of an execution of the implementation, the update operation calls are sequentially ordered according to the moment they execute Line 21 (each operation has its own concurrency class), while each query operation call is interval-linearized to the interval of that sequence that spans the update operations that are concurrent to it.
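Since the figure is not reproduced here, the following is a minimal sketch of a counter in the same spirit, not a transcription of Figure 11. It assumes the structure suggested by the surrounding text: a shared array with one entry per process, where an update performs a single write to the caller's own entry and a query sums all *n* entries. The class and helper names are hypothetical.

```python
import threading


class RelaxedBatchedCounter:
    """Read/write-only counter: one shared entry per process (assumed design)."""

    def __init__(self, n):
        self.shared = [0] * n  # shared[i] is written only by process i
        self.local = [0] * n   # local[i] is private to process i

    def update(self, i, x):
        # Constant number of steps: one local addition, one shared write.
        self.local[i] += x
        self.shared[i] = self.local[i]

    def query(self):
        # Linear number of steps: one shared read per process.
        total = 0
        for j in range(len(self.shared)):
            total += self.shared[j]
        return total


def worker(counter, i, amounts):
    for x in amounts:
        counter.update(i, x)


def run_demo(n=4, per_process=1000):
    """Run n updaters concurrently with one query; return (mid, final)."""
    counter = RelaxedBatchedCounter(n)
    threads = [
        threading.Thread(target=worker, args=(counter, i, [1] * per_process))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    mid = counter.query()  # may observe only some of the concurrent updates
    for t in threads:
        t.join()
    return mid, counter.query()
```

In `run_demo`, the intermediate query can return any value between the counter contents at its start and at its end, whereas the final query, which is concurrent with no update, returns the exact total.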

To conclude our example, we observe that the implementation in Figure 11 has a drawback—the step complexity of its query operation is linear in the number of processes *n.* This drawback can be solved as follows.

First, we note that using the atomic read-modify-write `fetch&inc` primitive, it is easy to obtain a linearizable implementation of the batched counter with constant step complexity in both its update and query operations. The `fetch&inc`(*R*, *d*) primitive atomically returns the current value of *R* and adds *d* to *R*. In a simple linearizable implementation of the batched counter, there is a shared variable *A* initialized to zero; `update`(*x*_{i}) simply performs `fetch&inc`(*A*, *x*_{i}), while `query()` simply reads *A*, that is, performs `read`(*A*). Despite the good theoretical properties of this simple implementation, it does not perform well in practice because all processes work on the shared variable *A*, which becomes a bottleneck and creates high contention in real multicore architectures.
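The single-variable implementation can be sketched as follows. Python has no hardware `fetch&inc`, so the register below simulates the primitive with a lock; this is purely an assumption made for illustration and says nothing about the performance of a real atomic instruction. The class names are hypothetical.

```python
import threading


class FetchAndIncRegister:
    """Simulated fetch&inc register; a lock stands in for hardware atomicity."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_inc(self, d):
        # Atomically return the current value and add d to it.
        with self._lock:
            old = self._value
            self._value = old + d
            return old

    def read(self):
        with self._lock:
            return self._value


class LinearizableBatchedCounter:
    """Batched counter built on a single shared register A."""

    def __init__(self):
        self.A = FetchAndIncRegister(0)

    def update(self, x):
        self.A.fetch_and_inc(x)  # one primitive step

    def query(self):
        return self.A.read()     # one primitive step
```

Every operation touches the same register, which is exactly the contention point described above.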

An intermediate solution between the linearizable solution and the one in Figure 11 consists of having an array *A* of length *K* (instead of length *n* as in the implementation in Figure 11), where *K* is a system-dependent constant (or maybe a sublinear function of *n*); `update`(*x*_{i}) first randomly picks an entry *A*[*k*_{i}] of *A* and performs `fetch&inc`(*A*[*k*_{i}], *x*_{i}), while `query()` returns the sum of the *K* entries of *A*, similarly to the query operation in Figure 11. The idea is to randomly spread the contention over the distinct components of *A*. The implementation has good properties: it retains wait-freedom and interval-linearizability, and has constant step complexity in both operations when *K* is a constant.
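A minimal sketch of this intermediate solution follows. As an assumption for illustration, `fetch&inc` on each entry is simulated with a per-entry lock, and the random choice uses Python's `random.Random`; the class name and the optional `rng` parameter are hypothetical.

```python
import random
import threading


class StripedBatchedCounter:
    """Batched counter spread over K entries to reduce contention."""

    def __init__(self, K, rng=None):
        self.cells = [0] * K
        # One lock per entry simulates an atomic fetch&inc on that entry.
        self.locks = [threading.Lock() for _ in range(K)]
        self.rng = rng or random.Random()

    def _fetch_and_inc(self, k, d):
        with self.locks[k]:
            old = self.cells[k]
            self.cells[k] = old + d
            return old

    def update(self, x):
        k = self.rng.randrange(len(self.cells))  # pick a random entry
        self._fetch_and_inc(k, x)

    def query(self):
        # K reads, independent of the number of processes.
        total = 0
        for k in range(len(self.cells)):
            total += self.cells[k]
        return total
```

With `K = 1` this degenerates to the single-variable implementation; larger values of `K` trade a longer query for less contention among updates.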

*Completeness results for interval-linearizability.* It is known that interval-sequential specifications are *complete* in the sense that they are powerful enough to specify any concurrent object given by a set of concurrent executions, that is, sequences of invocations and responses. We will say that such specifications are *set-based.* Arguably, set-based specifications, proposed by Lamport,^{27,f} are the most general way to define a concurrent object. Such a set specifies all the concurrent behaviors that are considered valid for any concurrent algorithm implementing the object. For example, the set-based specification of a FIFO queue contains all executions that are linearizable, while the set-based specification of a FIFO queue with multiplicity contains all executions that are set-linearizable.

It turns out that interval-sequential specifications can model any set-based specification having some reasonable properties, like non-emptiness, prefix-closure, and well-formedness (that is, each process alternates between issuing invocations and responses, starting with an invocation). The result was originally proved in^{8} under some assumptions, and later generalized by Goubault, Ledent, and Mimram.^{17} Furthermore, they prove that in any reasonable computational shared memory model, every algorithm for a given set-based specification must satisfy all of these properties. Therefore, in a formal sense, interval-sequential specifications are fully general.
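The properties named above can be made concrete on toy data. The sketch below assumes a hypothetical encoding in which an execution is a tuple of `(process, kind)` events with `kind` being `"inv"` or `"res"`, and a specification is a finite set of such tuples; real set-based specifications are in general infinite, so this only illustrates the definitions.

```python
def is_well_formed(execution):
    """Each process alternates invocations and responses, invocation first."""
    pending = {}  # process -> True while an invocation awaits its response
    for process, kind in execution:
        if kind == "inv":
            if pending.get(process, False):
                return False  # two invocations in a row by the same process
            pending[process] = True
        elif kind == "res":
            if not pending.get(process, False):
                return False  # response without a pending invocation
            pending[process] = False
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return True


def is_prefix_closed(spec):
    """Every proper prefix of every execution in spec is also in spec."""
    return all(e[:k] in spec for e in spec for k in range(len(e)))
```

Note that a non-empty, prefix-closed specification necessarily contains the empty execution.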

### Conclusion

This article has presented two known extensions of linearizability, namely set-linearizability (which captures simultaneity) and interval-linearizability (which captures time-ubiquity), together with the corresponding formalisms for defining more general concurrent objects: set-sequential automata and interval-sequential automata. This extended linearizability framework preserves the benefits of composability, nonblockingness, and the notion of state. We surveyed recent work that has already been taking advantage of this approach, but there seem to be many more opportunities.

There is a very active trend among practitioners to move away from sequential specifications, both because of their performance limitations and for the sake of simplicity; here it would be interesting to explore the use of the extended linearizability framework. Notable is the history of blockchain technology, which started with Bitcoin and its paradigm of sequentializing all monetary transactions in the system via tremendously energy-consuming consensus mining algorithms, and has moved toward recent efforts to allow cooperation among concurrent ledgers,^{38} and toward ledgers restricted to monetary transactions that do not need consensus.^{6,10} The CALM project of Hellerstein and Alvaro focuses on the class of programs that can achieve distributed consistency without the use of coordination.^{19} Conflict-free replicated data types^{34} provide another interesting direction for future work.^{16,26} Their benefits of commutativity have been extended to composable libraries and languages, enabling programmers to reason about the correctness of whole programs in languages like Bloom.^{19} In the context of distributed storage systems, large-fragmented objects with relaxed read operations, which admit efficient implementations, have been introduced in Fernández et al.^{13} Another recent trend is relaxed specifications.^{18} There have been several studies on relaxation in the shared-memory context, focusing on SkipLists, log-structured merge trees, and other sequential data structures. Another line of research consists of looking for possible links between the presented linearizability hierarchy and the notions of strict linearizability,^{1} durable linearizability,^{25} and recoverable linearizability.^{4}

### Acknowledgments

This work has been partially supported by the French projects BYBLOS (ANR-20-CE25-0002-01) and PriCLeSS (ANR-10-LABX-07-81) devoted to the design of modular distributed computing building blocks, and UNAM-PAPIIT projects IN106520 and IN108720.
