The articles presented in our Special Issue on parallel processing on the supercomputing scale reflect, to some extent, splits in the community developing these machines. There are several schools of thought on how best to implement parallel processing at both the hardware and software levels. Controversy exists over the wisdom of aiming for general- or special-purpose parallel machines, and over what architectures, networks, and granularities would serve these best. The efficiency of processing elements that are loosely or tightly coupled is still in question. And, in the software amphitheatre, there is debate over whether new languages should be written from scratch, whether old languages should be modified into parallel processing dialects, or whether investments in old, sequential programs should be leveraged by recompiling them into parallel programs.
These issues were readily apparent at this year's International Conference on Parallel Processing (ICPP), as they have been during the 15 years since the annual conference first met. Few expect resolutions to these questions within the next 15 years, which is not to say that these subfields are not progressing rapidly; quite the contrary. As we now see, an outpouring of recent commercial unveilings of both super and supermini parallel processors represents the expected potpourri of design philosophies.
Related to the general- versus special-purpose issue is the direct, applications-oriented question: What do parallel processor architects expect their machines to be used for? And, taking this a step further, what would they like to see next-generation machines, with one hundred times more speed and memory, do? I asked ICPP attendees and other computer scientists these questions. Many answered straightforwardly that they see simulation as the main application now and in the future. Others deflected the question. Said one such architect quite flatly, "I just build the things. You'd be better off asking users." Yes and no; I wanted to know what was on architects' minds. Why were their imaginations so dry? But then others plunged in quite daringly, answering at a completely different level as reported below.

Perhaps the range of all responses reflects differences in priorities that are to be expected in a young field in flux. Some of their thoughts convey that they are somewhat overwhelmed by new choices and freedoms. It seems that the advent of parallelism may be more than just the beginning of a new era within computer science.

"Historically, computer scientists have been unable to predict many of the uses of faster machines," says DARPA/Information Science Technology Office Program Manager Stephen H. Kaisler, "because each such machine opens up new regimes of computing to explore." Indeed, parallel computing itself—not just its promise of improved speed—is all at once exposing new, unforeseen possibilities: a wide vista of architectures, languages, and operating systems. "To date, we've been playing around with a very limited range of these, but now many novel, promising combinations are within our grasp," says George Adams, a research scientist at the Research Institute for Advanced Computer Science who is studying parallel processors for NASA Ames Research Center.
Until recently, for example, the technology was not advanced enough to create a machine with large numbers of processing elements.
Today, says Adams, cheap chips and improved communications permit running new permutations of languages and operating systems on such machines. These combinations were never before testable.
According to Adams, the balance between CPU capability and memory should be of prime concern for next-generation parallel processors and supercomputers, which he predicts are on a collision course and will become one and the same. Scientists at Ames are most often limited not by CPU speed, he says, but by memory size. Because data space for problems involving simulation of physical systems cannot be held entirely in memory, portions must reside on disk. This causes wall-clock problem solution time to "suffer drastically," says Adams. Since disk access is typically 100,000 times slower than memory access, users prefer not to wait for a value to be retrieved. Instead, they often recalculate values even if hundreds of mathematical operations are involved. So, if a parallel supercomputer with two orders of magnitude more CPU speed and memory suddenly appeared, "these scientists would likely cheer," says Adams. "Then they would ask for (still) more memory."
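The arithmetic behind that preference is stark. The following sketch uses the roughly 100,000-to-1 disk-to-memory gap Adams cites; the specific timing figures are our illustrative assumptions, not measurements from Ames:

```python
# Back-of-the-envelope comparison of fetching a stored value from disk
# versus recomputing it, given the ~100,000x access-time gap Adams cites.
# All concrete numbers here are illustrative assumptions.

MEMORY_ACCESS_NS = 100                         # assumed memory access time
DISK_ACCESS_NS = MEMORY_ACCESS_NS * 100_000    # ~100,000x slower

def cheaper_to_recompute(n_operations, op_cost_ns=MEMORY_ACCESS_NS):
    """True if redoing n_operations beats a single disk access."""
    return n_operations * op_cost_ns < DISK_ACCESS_NS

# Even a value that takes 500 arithmetic operations to rebuild
# is far cheaper to recompute than to fetch from disk:
print(cheaper_to_recompute(500))        # True
print(cheaper_to_recompute(1_000_000))  # False: now the fetch wins
```

Under these assumptions, anything short of about 100,000 operations is worth recomputing rather than retrieving, which is why "hundreds of mathematical operations" is an easy trade for Adams's scientists.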
For Paul Castleman, president and CEO of BBN Advanced Computers, there is really no reason to limit CPU and memory increases for the next generation of general-purpose parallel processors to two orders of magnitude: "That's thinking relatively conservatively" for the next decade, he says. "We have in the works a prototype processor that is 1000 times faster than today's, and we are beginning work on configurations that are tens of thousands of times faster." But a product's usability is the final determinant, says Castleman, "not the macho of how many more MIPS you can get . . . not whether you've souped up your sports car to run that much faster, but whether it feels comfortable to use." The solution is in the software development environment, which is why DEC and IBM have done so well, he says. Consequently, BBN Advanced Computers is now putting most of its effort into software tools for both scientific and nonscientific users. Graphics, for example, can further a user's understanding of many simultaneous processes—each using information from a common database—with graphs of processing elements' (PEs') results. An economist may watch one PE number crunching dollars flowing into consumption and another PE measuring capital accumulation; or a physical plant operator may observe calculations of the pressure causing a tank to overflow while another PE handles variables affecting a chemical reaction. If graphics tools are provided as well, each user's applications programmer can employ these utilities to generate the aggregate graphics an application needs.
But DARPA's Kaisler says that, in exploiting the first wave of commercially available parallel processors, little effort has been expended toward using these machines for research and development in computer science. "What is needed is a new effort, a new push to open up new avenues of algorithm development," he says, "beginning from first principles about what constitutes an algorithm and how to map it to the computational model provided by a specific parallel processor."
The impact of commercial unveilings prompts different, if not diametrically opposed, conclusions, however. A cautionary note about "hardware revelations" comes from David Gelernter, associate professor of computer science at Yale University. Hardware designers thus stricken will build machines from "the bottom up" that no one will know how to program, he says. Already, a dozen models have a dozen different low-level parallel programming systems. Moreover, Gelernter bemoans the fact that "machine-independent methods for parallel programming have been slow to emerge, and that consequently programmers have been forced to accommodate themselves to the machines rather than vice versa." He proposes that researchers imagine new kinds of programs before they imagine new kinds of machines. Once again the computer science equivalent of the age-old chicken-and-egg question arises. How far can hard- and software developments proceed independently? When should they be combined? Parallelism seems to bring these matters to the surface with particular urgency.
The first article in our Special Issue is "Data Parallel Algorithms," by W. Daniel Hillis and Guy L. Steele, Jr. These authors, from Thinking Machines Corporation, discuss algorithms and a new programming style for fine-grained single instruction multiple data (SIMD) parallel processors like the Connection Machine®. They cite examples such as parsing and finding the ends of linked lists—problems that they had assumed were inherently sequential—as milestones in their transition from "serial to parallel thinkers."
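The linked-list example they cite yields to the classic "pointer jumping" technique, in which every cell simultaneously replaces its successor pointer with its successor's successor, so an n-cell list collapses to its end in O(log n) steps. A minimal sequential simulation of the data-parallel idea (the structure and names are ours, not the authors' code):

```python
# Pointer jumping ("pointer doubling"): each cell repeatedly replaces its
# successor with its successor's successor.  On a SIMD machine all cells
# update at once, so the end of an n-cell list is found in O(log n) rounds.
# This is a sequential simulation of that data-parallel step.

def find_list_ends(next_ptr):
    """next_ptr[i] is the successor of cell i, or None at the list's end.
    Returns a dict mapping every cell to the index of its list's last cell."""
    end = {i: (nxt if nxt is not None else i) for i, nxt in next_ptr.items()}
    changed = True
    while changed:                       # O(log n) rounds for an n-cell list
        changed = False
        new_end = {}
        for i in end:                    # conceptually simultaneous updates
            new_end[i] = end[end[i]]     # jump: follow two hops in one step
            if new_end[i] != end[i]:
                changed = True
        end = new_end
    return end

# List 0 -> 1 -> 2 -> 3: every cell ends up pointing at cell 3.
print(find_list_ends({0: 1, 1: 2, 2: 3, 3: None}))
```

Each round doubles the distance every pointer spans, which is what lets a seemingly sequential pointer chase finish in logarithmic time on enough processors.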
The next article, "Advanced Compiler Optimizations for Supercomputers," is by David A. Padua and Michael J. Wolfe, of the University of Illinois and Kuck and Associates, respectively. They represent those who believe sequential algorithms should be recompiled to accommodate vector, concurrent, and multiprocessor architectures. In a discussion of data dependence testing in loops, they show how parallelism in sequential codes for operations, statements, and iterations can be automatically detected for vector supercomputers. Further, they discuss improving data dependence graphs and optimizing code for parallel computers.
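The kind of loop-carried dependence such compilers test for can be seen in two small loops; this illustration is ours, not the authors' code:

```python
# Two loops that differ only in their data dependences.
# In independent(), a[i] depends only on b[i]: every iteration stands
# alone, so a vectorizing compiler may run all iterations at once.
# In carried(), a[i] reads a[i-1], a value written by the previous
# iteration (a loop-carried dependence), so the order must be preserved.

def independent(b):
    a = [0] * len(b)
    for i in range(len(b)):      # vectorizable: no cross-iteration reads
        a[i] = b[i] * 2
    return a

def carried(b):
    a = [0] * len(b)
    a[0] = b[0]
    for i in range(1, len(b)):   # not vectorizable: reads a[i-1]
        a[i] = a[i - 1] + b[i]   # running sum carries data across iterations
    return a

print(independent([1, 2, 3]))    # [2, 4, 6]
print(carried([1, 2, 3]))        # [1, 3, 6]
```

Dependence testing is the compiler's way of proving, iteration by iteration, that a loop belongs in the first category before emitting vector or concurrent code for it.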
In a more theoretical vein, Robert Thomas and Randall Rettberg of BBN Advanced Computers discuss contention, or "hot spots," the phenomenon that some have predicted might cripple certain parallel processors. In their article, "Contention Is No Obstacle to Shared-Memory Multiprocessing," the authors describe engineering approaches to controlling the backup of data in networks and switches. Besides reporting specific methods used to control contention, they offer benchmarks on their own machine, the Butterfly™.
Two applications-oriented articles complete the set. "Toward Memory-Based Reasoning," by Craig Stanfill and David Waltz, suggests that memory of specific events, rather than of rules like those used in expert systems, is the right foundation for intelligent machines. In "Parallel Free-Text Search on the Connection Machine System," Craig Stanfill and Brewster Kahle harness unique properties of massive parallelism to implement a successful document-retrieval search paradigm. They use simple queries and relevance feedback techniques to produce intriguing results on a Reuters news database of 16,000 stories.
Karen A. Frenkel
Evaluating two massively parallel machines
Two radically different parallel computers prompt a debate about the best parallel architectures and may mark the commercial viability of parallelism on the supercomputer scale.
Computers, complexity, and the Statue of Liberty restoration
Twentieth-century techniques such as computer-aided engineering and finite-element analysis were used to restore the nineteenth-century monument.
To illustrate the "remarkable extent to which complexity theory operates by means of analogs from computability theory," Richard Karp created this conceptual map or jigsaw puzzle. To lay out the puzzle in the plane, he used a "graph planarity algorithm." The more distantly placed parts might not at first seem related, "but in the end, the theory of NP-completeness does bring them all together," Karp says.
The upper right portion of the puzzle shows concepts related to combinatorial explosions and the notion of a "good" or "efficient" algorithm. In turn, "Complexity" connects these concepts to the upper left portion, which represents the concerns of early computability theorists.
The traveling salesman problem is closer to the upper right corner because it is probably intractable. It therefore borders on "NP-completeness" and "Combinatorial explosion."
To some extent, however, certain divisions blur. "Linear programming," for example, has an anomalous status—the most widely used algorithms for solving such problems in practice are not good in the theoretical sense, and those that are good in the theoretical sense are often not good in practice. One example is the ellipsoid method that was the object of so much attention six years ago. It ran in polynomial time, but the polynomial was of such a high degree that the method proved good in the technical sense, but ineffective in practice. "The reason is that our notion of polynomial-time algorithms doesn't exactly capture the notion of an intuitively efficient algorithm," Karp explains. "When you get up to n^5 or n^6, then it's hard to justify saying that it is really efficient. So Edmonds's concept of a good algorithm isn't quite a perfect formal counterpart of good in the intuitive sense." Further, the simplex algorithm is good in every practical sense, Karp says, but not good according to the standard paradigm of complexity theory. The most recent addition to linear programming solutions, an algorithm devised by Narendra Karmarkar that some think challenges the simplex algorithm, is good in the technical sense and also appears to be quite effective in practice, says Karp.
The good algorithm segment is adjacent to "Heuristics" because a heuristic algorithm may work well, but lack a theoretical pedigree. Some heuristic algorithms are always fast, but sometimes fail to give good solutions. Others always give an optimal solution, but are not guaranteed to be fast. The simplex algorithm is of the latter type.
"Undecidability," "Combinatorial explosion," and "Complexity" are on the same plane because they are analogs of one another; undecidability involves unbounded search, whereas combinatorial explosions are by definition very long but not unbounded searches. Complexity theory bridges the gap because, instead of asking whether a problem can be solved at all, it poses questions about the resources needed to solve a problem.
The lower left region contains the segments Karp has been concerned with most recently and that contain open-ended questions. "Randomized algorithm," for example, is situated opposite "Probabilistic analysis" because both are alternatives to worst-case analyses of deterministic algorithms. Randomized algorithms might be able to solve problems in polynomial time that deterministic ones cannot, and that could mean an extension of the notion of good algorithms. Perhaps through software designs for non-von Neumann machines, algorithms can be made more efficient in practice through parallelism. Finally, some parts of the puzzle are not yet defined. Says Karp, "They correspond to the unknown territory that remains to be explored in the future."
Complexity and parallel processing: an interview with Richard Karp
In the following interview, which took place at ACM 85 in Denver, Karp discusses the relation of his work to leading-edge computing topics like parallel processing and artificial intelligence. Tracing his experience as a pioneer in highly theoretical computer science, Karp describes how the decision to go against established wisdom led to the work for which he is best known and how a colleague's findings led him to see links between two previously unrelated areas. Throughout, he stresses the exchange of ideas with colleagues that helped yield fundamental insights.
Report on the Microelectronics and Computer Technology Corporation conference
MCC, the consortium of companies headed by Admiral B. R. Inman that some consider America's reply to Japan's Fifth Generation initiative, invited members and nonmembers to brainstorm on software design. Also on the agenda was the announcement of LEONARDO, MCC's software research project.
Toward automating the software-development cycle
Knowledge-intensive rather than labor-intensive processes are being advanced to spur programming productivity.