An old proverb observes that every long journey begins with a single step. A modern philosopher, Woody Allen, more recently noted that eighty percent of life is showing up. The first speaks to the importance of deciding and beginning, even if one’s goal seems remote. The second emphasizes the pragmatic reality that much of any endeavor is ordinary and seemingly mundane, yet necessary to reaching the goal. As any calculus student knows, Achilles does catch the tortoise.
Yet there is a cautionary counterpoint to these aphorisms. Sometimes, incrementalism births only failure. One cannot leap a canyon in two jumps, even if Wile E. Coyote and The Road Runner might suggest otherwise. When circumstances necessitate, one must marshal resources and innovate disruptively. In mathematical terms, it is a point of discontinuity.
At this point, the gentle reader may be wondering if the writer’s perambulations will converge into a coherent treatise or diverge like a random walk in n-space. Fear not, the topic is at hand.
High-performance computing has been international news in recent weeks. China’s Tianhe-2 system now ranks first on the June 2013 Top500 list of the world’s fastest computers. It contains 32,000 12-core Intel Xeon processors and 48,000 Xeon Phi coprocessors (57 cores each), for a total of 3,120,000 cores. Tianhe-2 overtook Oak Ridge National Laboratory’s Titan system, which contains nearly 300,000 AMD Opteron cores (~18,000 16-core nodes) and nearly 20,000 NVIDIA Tesla accelerators.
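As a quick sanity check on the arithmetic above, the cited total follows directly from the per-chip core counts (a minimal sketch; the chip and board counts are those given in the text):

```python
# Tianhe-2 core count, from the figures cited above.
xeon_chips = 32_000   # 12-core Intel Xeon processors
phi_boards = 48_000   # 57-core Xeon Phi coprocessors

total_cores = xeon_chips * 12 + phi_boards * 57
print(f"{total_cores:,}")  # 3,120,000
```

The accelerator cores dominate: the Xeon Phis contribute 2,736,000 of the 3,120,000 cores.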
The global race is on to build ever-faster supercomputers, fueled by a combination of scientific and engineering needs to simulate phenomena with greater resolution and fidelity, continued advances in semiconductor capabilities, and economic and political competition. In the minds of many, the mantra is "exascale or bust," pitting the United States, China, Japan and the European Union against one another in a competition for bragging rights for the first system capable of sustaining an exaflop: one thousand petaflops, or one quintillion floating point operations per second.
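The unit conversion is worth making explicit (a trivial sketch; "quintillion" is used here in the short-scale sense, 10^18):

```python
# One exaflop expressed in the units named above.
petaflop = 10**15            # floating point operations per second
exaflop = 1_000 * petaflop   # one thousand petaflops

assert exaflop == 10**18     # one quintillion flop/s
```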
As with all ambitious goals, the exascale journey is as important as the destination, for valuable technical lessons must be learned along the way. Today, we face limits on aggregate chip energy dissipation and an inability to concurrently activate all transistors on a chip, so-called dark silicon. In turn, this has led to multicore designs and systems-on-a-chip (SoCs) that increasingly embody heterogeneous cores (both performance and functional heterogeneity) and specialized accelerators. In this environment, compilers, runtime systems, libraries and application developers must now manage multiple execution models and options while balancing often-conflicting optimization goals.
All these challenges are convolved with the massive scale of high-performance computing systems and their concomitant issues of systemic resilience and energy efficiency. In addition, high-performance computing remains bedeviled by two other, longstanding problems – ease of programming and performance scalability.
It seems increasingly doubtful that we can reach the exascale destination by incrementalism. Instead, radical innovations in semiconductor processes, computer architecture, system software and programming systems are needed. Simply put, we are facing a chasm of challenges and opportunities that cannot be bridged in small steps.
We need a catastrophe – in the mathematical sense – a discontinuity triggered by a sustained research and development program that combines academic, industry and government expertise.
It is time to take the leap. The research insights will have unexpected benefits, technically and economically.