Computing Applications Research highlights

Technical Perspective: Data Races Are Evil with No Exceptions

Posted Nov 1 2010

Article
References
Author
Footnotes

Exploiting parallelism has become the primary means to higher performance. Shared memory is a pervasively used programming model, where parallel tasks or threads communicate through a global address space. Popular languages, such as C, C++, and Java have (or will soon have) standardized support for shared-memory threads. Unfortunately, shared-memory programs are notoriously prone to subtle bugs, often due to data races.

Languages that allow data races obfuscate true communication and synchronization. Since any load or store may communicate or synchronize with another thread through a data race, it becomes difficult to reason about sections of code in isolation. Data races also create non-determinism since the ordering between two racing accesses is timing-dependent. Racy programs therefore require reasoning about many interleavings of individual memory accesses and their executions are difficult to reproduce. Data races have therefore been widely considered as symptoms of bugs and there is much research to automatically detect them.

An arguably more fundamental problem with data races concerns semantics. Every programming language must specify what value a load can return, also called the memory model. It has been surprisingly difficult to specify an acceptable model that balances ease-of-use and performance for languages that allow data races.¹ Programmers usually expect a sequential interleaving-based model called sequential consistency. For data-race-free programs, it is straightforward to provide sequential consistency and high performance. With racy code, however, routine compiler and hardware optimizations can result in surprising behaviors. Consequently, the upcoming C and C++ memory models specify "undefined" behavior for programs with data races. This gives freedom to implementers but poses challenges for debugging racy code. Java’s safety requirements preclude the use of "undefined" behavior. The Java memory model, therefore, specifies very weak semantics for racy programs, but is extremely complex and currently has known bugs.

Despite the debugging and semantics difficulties, some prior work refers to certain data races (for example, some unsynchronized reads) as benign and useful for performance. Unfortunately, reasonable semantics for programs with any type of data race remain elusive; C, C++, and Java do not provide usable semantics with any (benign or not) data race.

A natural conclusion is to strive for languages that eliminate data races. Although there is much progress, such general-purpose languages,² are not yet commercially available. The following two papers on Goldilocks and Fast-Track suggest alternative solutions that use always-on data race detection for current languages.

Goldilocks was the first work to propose a data race be treated as a language-level exception, just like null pointer dereferences. This insight cleans up the most difficult part of the memory models mess—executions are either sequentially consistent or throw an exception. No complicated semantics of Java and no unsafe behavior of C/C++!

The challenge, compared to much prior work on race detection, is that an always-on language-level exception mechanism must be both fast and precise. The well-established race detection algorithm based on vector clocks is precise, but incurs orders-of-magnitude slowdown. Faster algorithms (for example, based on a lockset approach) exist, but produce false positives, which are unacceptable for enforcing language-level semantics.

Goldilocks extends the lockset approach to make it precise, at a much faster speed than the previous precise vector clock algorithm. FastTrack followed up by optimizing the vector clock approach to make it run faster, without losing precision and performing even better than Goldilocks. For the first time, these papers made it possible to believe that "data race as an exception" may be a viable solution to the vexing debugging and semantics problems of shared memory.

The absolute slowdown of both techniques is still quite significant, but they expose an exciting research agenda. An immediate challenge is to improve performance. Another concerns mapping races detected in compiler-generated code to source code. Others are exploring hardware for similar goals.³

More broadly, these papers encourage a fundamental rethinking of programming models, languages, compilers, and hardware. Should languages be designed to eliminate data races by design?² Or should the runtime automatically detect races? Or should the hardware?³ Can similar techniques be used for enforcing even stronger properties such as determinism and atomicity? Does this impact how we view shared memory as a programming model? The answer to all these questions is probably "yes."

It is unlikely that any of language, runtime, or hardware techniques alone will strike the best balance between ease-of-use, generality, performance, and complexity, but designing systems that combine the strengths of such techniques remains challenging. It is also unclear what high-level language properties such systems will finally ensure; for example, race elimination is certainly a critical factor in ensuring determinism and atomicity. What is clear is these papers provide key insights to shape the final solutions and are important steps toward making parallel programming amenable to the masses.

Submit an Article to CACM

CACM welcomes unsolicited submissions on topics of relevance and value to the computing community.

You Just Read

Technical Perspective: Data Races Are Evil with No Exceptions

View in the ACM Digital Library

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

DOI

10.1145/1839676.1839697

November 2010 Issue

Published: November 1, 2010

Vol. 53 No. 11

Page: 84

Table of Contents

Join the Discussion (0)

Become a Member or Sign In to Post a Comment

The Latest from CACM

Explore More

BLOG@CACM Apr 26 2024

Optimizing Energy Efficiency in Datacenters with Advanced Cooling Technologies

Alex Williams

Architecture and Hardware

Credit: Getty Images Servers in snowy setting.

News Apr 23 2024

Maximizing Power Grid Security

R. Colin Johnson

Security and Privacy

News Apr 18 2024

Keeping AI Out of Elections

Bennie Mols

Artificial Intelligence and Machine Learning

Shape the Future of Computing

ACM encourages its members to take a direct hand in shaping the future of the association. There are more ways than ever to get involved.

Get Involved

Communications of the ACM (CACM) is now a fully Open Access publication.

By opening CACM to the world, we hope to increase engagement among the broader computer science community and encourage non-members to discover the rich resources ACM has to offer.

Learn More

Technical Perspective: Data Races Are Evil with No Exceptions

DOI

November 2010 Issue

Related Reading

Join the Discussion (0)

Become a Member or Sign In to Post a Comment

Shape the Future of Computing

Communications of the ACM (CACM) is now a fully Open Access publication.