Credit: Alicia Kubista / Andrij Borys Associates
The transaction abstraction encapsulates the mechanisms used to synchronize accesses to data shared by concurrent processes, dating to the 1970s when proposed in the context of databases to ensure consistency of shared data.7 This consistency was determined with respect to a sequential behavior through the concept of serializability;25 concurrent accesses must behave as if executing sequentially or be atomic. More recently, researchers have derived other variants (such as opacity13 and isolation30) applicable to different transactional contexts.
The transaction abstraction was first considered as a programming language construct in the form of guards and actions by Liskov and Scheifler more than 30 years ago,22 then adapted to various programming models, including Eden,1 ACS,12 and Argus.21 The first hardware support for a transactional construct was proposed in 1986 by Tom Knight,19 basically introducing parallelism in functional languages by providing synchronization for multiple memory words. Later, the notion of transactional memory was proposed in the form of hardware support for concurrent programming to remedy the trickiness and subtleties of using locks (such as priority inversion, lock-convoying, and deadlocks)18 (see Figure 1).
Since the advent of multicore architectures approximately 10 years ago, the very notion of transactional memory has become an active topic of research (http://www.cs.wisc.edu/trans-memory/biblio/list.html). Hardware implementations of transactional systems18 turned out to be limited by specific constraints programmers could "abstract away" only through unbounded hardware transactions. However, purely hardware implementations are complex solutions most industrial developers no longer explore. Rather, a hybrid approach was adopted through a best-effort hardware component that must be complemented by software transactions.4
The following letter was published in the Letters to the Editor of the June 2014 CACM (http://cacm.acm.org/magazines/2014/6/175169).
--CACM Administrator
Vincent Gramoli's and Rachid Guerraoui's article "Democratizing Transactional Programming" (Jan. 2014) suggested transaction theory was first formalized in 1976. As with many early developments in computing, the concepts of transaction management for data storage and retrieval developed in engineering practice before emerging in research papers or industry standards. Even so, the article postponed the theorization of transactions by at least a couple of years. Its Figure 1 ("History of Transactions") pinpointed an article by Eswaran et al. (including Jim Gray) called "Notions of Consistency and Predicate Locks in a Database System" in Communications (Nov. 1976) as the earliest published work on transactions. In fact, Eswaran et al. had condensed their own substantively identical IBM Research technical report (Dec. 30, 1974) On the Notions of Consistency and Predicate Locks in a Data Base System.
Reflecting the considerable development of industrial data management systems by the early 1970s, at least some essential features of the transaction concept had already been presented in an earlier published work the CODASYL Data Base Task Group April 71 Report which was indeed the first industry standard in concurrent database management. It included the now ubiquitous terms "data definition language" and "data manipulation language" to enable the declaration of a schema (and views upon it) and specified the statements needed to retrieve, insert, and update data elements.
In 1974, Eswaran et al. wrote: "While fairly static locking schemes are acceptable in a conventional operating system, a particular data base transaction may lock an arbitrary logical subset of the data base." A transaction that establishes priority claims over access to a set of shared related data items enables execution of concurrent programs affecting consistent (application-meaningful) state transitions.
Although the CODASYL report did not include the term "transaction," it did include rules for managing locking and contention between different "run-units," or independent program executions accessing a single data store. The CODASYL data manipulation language verbs OPEN and CLOSE demarcated units of work in which record sets and individual records could be locked, and competing run-units allowed degrees of access or completely excluded until the run-unit closed. The CODASYL report also contemplated rollback semantics in the face of failures.
Maurice Wilkes, a famous early pioneer of computing, used the term "transaction" when describing "transaction journals" as recoverable records of "a condensed summary of the transaction recorded immediately before the update is made" in his 1972 article "On Preserving the Integrity of Data Bases" in Computer Journal. Two years later, R.A. Davenport's "Design of Transaction-Oriented Systems Employing a Transaction Monitor" in Proceedings of the ACM Annual Conference recalled the role of early transaction processing monitors in crystallizing the transaction abstraction.
Notwithstanding these efforts, Eswaran et al.'s 1974 technical report took the transaction concept from semi-codified expression of practical knowledge to rigorous formalization, intersecting relational database theory in its early years. It stands today as a landmark publication in the history of automated data management.
Alastair Green
London, U.K.
Displaying 1 comment