The theory of computation is one of the crown jewels of the computer science curriculum. It stretches from the discovery of mathematical problems, such as the halting problem, that cannot be solved by computers, to the most celebrated open problem in computer science today: the P vs. NP question. Since the founding of our discipline by Church and Turing in the 1930s, the theory of computation has addressed some of the most fundamental questions about computers: What does it mean to compute the solution to a problem? Which problems can be solved by computers? Which problems can be solved efficiently, in theory and in practice?
Yet computational theory occupies an ambiguous role in the undergraduate curriculum. It is a required core course for the computer science major at many institutions, whereas at many others it is an upper-level elective. And whether required or not, the theory course can have a reputation as an austere and perhaps even irrelevant niche, disconnected from the skills and ideas that comprise computer science. This is not a new phenomenon, and in recent decades the CS community has worked diligently to improve the accessibility and perceived relevance of the theory course. Notable contributions include the JFLAP software for experimentation with automata,8 and various efforts to promote "NP-completeness for all" via visualizations and practical laboratory exercises.1
This Viewpoint discusses two specific suggestions for continuing to improve the accessibility of the theory course while maintaining its rigor: first, emphasizing search problems rather than decision problems in certain parts of the course; and second, employing computer programs written in a real programming language as one of the standard computational models, complementing the use of automata such as Turing machines. The suggestions here apply specifically to an undergraduate course in which students encounter theory of computation for the first time. The content of such courses varies widely, and the following suggestions are most applicable to introductory courses that incorporate both computability and complexity theory.
The theory of computation is usually phrased in terms of decision problems: questions with a single-bit yes/no response. In other areas of computer science, however, we are usually interested in search problems, whose solutions consist of more than a single bit. As an example, consider the problem of finding a Hamilton cycle in a graph—that is, a route visiting every vertex exactly once. Theory courses usually discuss the decision problem, "Does the graph G contain a Hamilton cycle?" But if the answer is yes, we still do not know the route of a Hamilton cycle in G. It is more natural and useful to consider the related search problem, "Find and output a Hamilton cycle of G, if one exists."
Once they have finished their first theory course, computer science undergraduates will recognize the connections between search and decision problems. But search problems are more familiar and more immediately applicable, so there are good reasons to teach the more elementary parts of the theory course with an emphasis on search problems. It is worth noting that Knuth prize winner Oded Goldreich is an advocate of this approach2; his books are among the few modern textbooks3,6 that adopt search problems as a primary paradigm. But search problems can easily be incorporated into more traditional approaches that retain decision problems as the standard model, and I do recommend this as a means of connecting the theory course more closely to other parts of the undergraduate CS curriculum.
The advantages of, and techniques for, teaching CS theory via search problems have been discussed in detail elsewhere.5 One interesting result, based on a survey of CS undergraduates, is that search problems are perceived as significantly more useful than decision problems. Because perceived relevance is known to be a factor in achieving good learning outcomes, this provides indirect evidence the approach is beneficial.
Another technique for increasing student engagement and connections with other parts of the CS curriculum is to employ code in a real programming language. This can provide a beneficial supplement to the automata and grammars that typically dominate a course in theory of computation. Formal models such as Turing machines are of course essential, especially for providing a rigorous definition of computation itself. However, it is possible to teach a mathematically rigorous theory course using a programming language as the primary model of computation. In this approach, the program model is layered over Turing machines as an underlying model, and Turing machines are still employed when required in certain proofs and definitions. A strong majority of CS theory textbooks do not employ a programming language as the primary computational model, but several authors have done so, for example using Python,6 Ruby,9 and a variant of LISP.4,7 As an example of the approach, consider the Python program shown in the figure here.
This code provides the basis for a proof by contradiction, demonstrating that a certain computational problem is undecidable. Specifically, it proves the undecidability of the following question: "Given a Python function
P() and input string
P return the value '
yes' when invoked with input
I?" A detailed explanation of the proof is outside the scope of this Viewpoint; here, I focus on the potential advantages for undergraduate students who are encountering this type of material for the first time. Note that, in practice, the code shown in the figure would be presented in class only after exposure to and experimentation with prerequisite concepts, such as Python functions that take the source code of other Python functions as input and analyze them or transform them in some way. Nevertheless, for concreteness and compactness in this Viewpoint, I describe the potential benefits to students directly as they appear in this proof.
First, note the undecidability result itself can be described in terms of Python programs: "It is impossible to write a Python program that determines whether other Python programs will output '
yes' on a given input." From one point of view, this is a purely cosmetic change from the equivalent statement in terms of Turing machines: "there does not exist a Turing machine that determines whether other Turing machines will accept a given input." After all, students in any theory course must come to understand the equivalence between Turing machines and computer programs. Nevertheless, the practice of discussing results in terms of computer programs that have clear connections to other areas of computer science provides the instructor with opportunities for increased engagement; this has certainly been my own experience.
Second, there are some steps in theory proofs that are surprisingly subtle when expressed in terms of Turing machines, but become obvious and familiar in a programming language. One example in the figure is the trick in which a single parameter
P is duplicated and passed on in two separate roles to the two-parameter function
yesOnStr (P,P). In class, this can be further explicated by stepping through the program in a debugger and showing the dual roles of
P: as source code in the first parameter and as a text string in the second parameter. (Even Alan Turing recognized the challenges inherent in proofs based on automata. In the seminal 1936 paper that introduced Turing machines, he sympathized with readers who might feel "there must be something wrong" in his first such proof.10)
Third, students can build an intuitive understanding of code-based proofs by active experimentation with the code. In the example in the figure, one can provide an approximate version of
yesOnStr () that works correctly on a limited class of inputs. Students can then predict the output of
diagYesOnStr () on various inputs, and check their answers by running the code. They can construct variants of the code, discussing which variants produce the desired contradiction and which do not. By implementing
yesOnStr () via simulation, students can discover an important extension to this result: we can in fact write a Python program that always terminates correctly on positive instances of this problem, so the problem is recognizable but not decidable.
Many theory courses could benefit from making more explicit connections to other parts of the computer science curriculum.
Fourth, some students may find the programming approach transfers more easily to novel problems. In recent years I have taught three different approaches for undecidability proofs to all students: traditional reductions employing prose descriptions of Turing machines; explicit Python programs (similar to the example in the figure here) supplemented by a prose explanation of the desired contradiction; and the application of Rice's theorem. In tests and exams, students may choose which proof method to use, and there is an approximately even split among these three proof techniques. In particular, a significant fraction of students choose to write out a Python program as part of their exam answer. This provides empirical evidence that the programming approach is beneficial for some students, and it is plausible all students gain improved understanding from seeing multiple approaches.
Over a period of eight years, I have experimented with techniques for making the undergraduate theory course more accessible and engaging. This Viewpoint suggests two possibilities: emphasizing search problems and employing real computer programs. I do not advocate the universal or complete adoption of these suggestions. I have backed away from some aspects of the approach myself. For example, after experimenting with teaching NP-completeness based on search problems, I concluded this part of the course works better when taught with the traditional focus on decision problems. Similarly, I found there are some technical results about polynomial-time verifiers that can be proved more instructively—for the target audience of novice undergraduates—using Turing machines rather than computer programs.
Every instructor and every group of students is different; instructors must adopt a style of teaching that is authentic to themselves, achieves the goals of the students, and is based realistically on the students' level of preparedness. I do believe that many theory courses could benefit from making more explicit connections to other parts of the computer science curriculum, and it is possible to do this incrementally. If decision problems and Turing machines are retained as the central paradigms, search problems can be still be mentioned when relevant, and snippets of code can be used to illustrate subtleties.
Whether or not the specific ideas suggested here are adopted, it seems important that we continue to strive for accessibility and engagement in the undergraduate theory course. The theory of computation is a profound and important cornerstone of computer science; I hope that in the years ahead, an ever-growing number of students will appreciate both its beauty and its significant connections to the rest of the computer science curriculum.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2020 ACM, Inc.
I believe this is a common transition that has been going on for at least 20 years.
For a complexity class for computer scientists, one could even take this a step further and not even talk about classes P or NP (or the hierarchies beyond). Instead just postulate that SAT is a difficult (exponential time) problem, show the standard reductions to 3SAT, Vertex Cover, CLIQUE, etc. using standard programming languages as you have done.
Then spend a little time on the reductions between search problems and optimization problems (usually easy one way and requiring a binary search approach the other way) illustrating the computational relationship between these.
For the majority of CS students, the mathematical beauty of completeness arguments and problem classification has limited interest, they are much more interested in differentiating between what is hard and what is easy. Doing this using language of programming concepts they are already familiar with and going directly to the core of the complexity problem simplifies the learning task considerably.
MacCormick's viewpoint is a timely discussion of the value of the Theory
of Computation and how one can offer it in innovative ways. I would like
to point out two developments that might be of widespread interest.
The first is a tool called Automaton Tutor that automatically grades problems
and also empowers one to generate new
problem instances (see https://link.springer.com/chapter/10.1007/978-3-030-53291-8_1).
It has been widely used, with the above article including many testimonials.
The second is a tool called Jove (https://github.com/ganeshutah/Jove.git) that
provides Jupyter notebooks covering all traditional Theory of Computation topics
plus many more timely additions (e.g., Binary Decision Diagrams used in Formal
Methods - viewable as minimal DFA). It introduces context-free parsing in a hands-on
manner through a calculator, asking students to extend it with new operators.
Jove does not require any software installation: it can be run directly off its
Github page on Google's Colab.
Jove includes the best of JFLAP (multiple animation modes directly within notebook
cells) and a REPL interface
(scriptable executions that construct and display large automata). Its largely
declarative code adds to one's understanding. One can verify machine constructions
through animations, scripted executions, and overall rigorous checks (e.g., isomorphism
of minimized DFA).
Jove includes important topics lost in the mists of time: Brzozowski's DFA
minimization which is one line of Python code ("Reverse; Determinize; Reverse;
Determinize") and Brzozowski's Derivatives to deal with extended regular expressions.
The incremental rule-based nature of Derivatives is reminiscent of calculus rules,
and also fits directly into how one writes Structural Operational Semantic rules.
An entire undergraduate course in the form of weekly Jupyter notebook-based exercises
is available and has especially helped during the distance-mode of education.
In conclusion, I wholeheartedly agree with MacCormick; this subject is far from
being stodgy; rather, it is a springboard for many a future teaching adventure.
Automatically learning DFA from examples and Markov Models are a small step away!
Professor, School of Computing, University of Utah
Displaying all 2 comments