Computing Applications Viewpoint

Let’s Be Honest

Seeking to rectify the two mutually exclusive ways of comparing computational power—encoding and simulation.

Posted May 1 2021

We have a serious problem with how we have been teaching computability theory, a central component of the ACM/IEEE computer science curriculum.

Let me explain. For a fair number of years, I taught a computability course. Following the standard curriculum (such as described by Hopcroft and Ullman¹⁴), and in concert with my colleagues in the field, I made claims on countless occasions that one model of computation is more powerful than another or that two models have the same power of computation. In some cases the argument appealed to ordinary set inclusion, while at other times it involved a notion of simulation via encodings. Imagine my chagrin when I came to realize these two methods of comparison are in fact incompatible!

When two models work with the same entities, simple set inclusion of formal languages or sets of functions is employed naturally by everyone. We teach that finite-state automata recognize the same languages as defined by regular expressions but are strictly weaker than pushdown automata, and we bring palindromes or non-square words as proof positive.¹³ Similarly, we assert that primitive recursion (or, equivalently, looping via bounded for loops only) is weaker than general recursion (with while loops, too) because of the two models, only the latter can compute the Ackermann function.¹⁴
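For readers who want the function in front of them, the following is a minimal sketch of the common two-argument (Ackermann-Péter) variant; the choice of variant and the name `ackermann` are assumptions for illustration, not something the cited texts prescribe. The doubly nested recursion in the last line is what pushes its growth beyond what bounded for loops alone can capture.

```python
def ackermann(m: int, n: int) -> int:
    """Two-argument Ackermann-Peter function on natural numbers.

    Total and computable by general recursion, but it eventually
    outgrows every primitive recursive function.
    """
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    # The inner call is itself an argument of the outer call.
    return ackermann(m - 1, ackermann(m, n - 1))
```

Only very small arguments are practical: the values, and the recursion depth needed to reach them, explode quickly as m grows.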

When, on the other hand, the domains of the models under consideration differ, encodings are required before they can be compared. For example, Alan Turing, in an appendix to his profound landmark 1936 paper, showed that the lambda-computable functions and the functions that can be computed using his Turing machines are of equivalent computational power. “Standard” machine descriptions (lists of quintuples) were turned into decimal numbers, which in turn were expressed in the lambda calculus as Church numerals. Turing also proved that his machines and general recursion are equipotent.²⁴ To show that Turing machines can compute all general recursive functions, numbers are normally (and wastefully) represented on the machine tape as a sequence of tally marks in unary.¹⁴

Unfortunately, the preceding two methods of comparison, namely inclusion and simulation, can yield mutually exclusive outcomes. The simplest example is the counter machine (a.k.a. Minsky machine, abacus model). Each counter holds a natural number that can be incremented, decremented, and tested for zero (see the sidebar). With only two counters, the model is not even powerful enough to square its input²,²⁰ or recognize primes.¹⁵ However, if we agree to represent a number n as the exponential 2ⁿ (a simpler encoding than Gödel numbering of expressions), then, courtesy of an ingenious proof by the late Marvin Minsky,¹⁴,¹⁷ we find that two counters suffice to compute every computable function. This is why one encounters statements such as:

It is well known that a finite-state automaton equipped with two counters is Turing-complete.⁹
[Minsky proved that a] two-counter machine is universal, and hence has an undecidable halting problem.¹⁶

Such claims of completeness or universality would be blatantly false were one to subscribe to the set-inclusion sense, whereas they are manifestly true in the simulation sense, which is indeed what Minsky proved. As Rich Schroeppel expressed it: “Any counter machine can be simulated by a 2CM, provided an obscure [sic!] coding is accepted for the input and output.”²⁰ So, I take issue with a statement like this: “The surprising result about counter machines is that two counters are enough to simulate a Turing machine and therefore to accept every recursively enumerable language.”¹³ Two-counter machines do simulate Turing machines, but they do not “accept” all recursively enumerable languages in the usual “as-is,” unencoded sense, primes being a prime example.

The point is that it behooves teachers to be forthright and forthcoming and to address this inconsistency. We cannot carry on oblivious to the fact that, by one of the methods of comparison that we use in our lectures, 2-counter machines are strictly weaker than (the complete) 3-counter machines, while by a second method that we also endorse, the two are to be deemed equivalent.

The Solution

All is not lost, thankfully. We can eat our proverbial cake and still have it, provided we invest extra effort.

To begin with, it would be an unmitigated disaster to abandon simulations, since the idea that all our traditional unrestrained models are of equivalent power, despite operating with different entities, stands at the core of computability theory, as enshrined in the Church-Turing thesis. Consequently, as painful as it may seem, we are obliged to give up the inclusion notion for paradigms that compute functions, such as general recursion and primitive recursion—though not for formal languages, as I will explain later.

By “simulation” one usually means there is an (injective, 1–1) encoding c from the domain of the simulated model M into the domain of the simulating model M’ such that every function f computed by the former is mirrored by a function f’ computed by the simulator M’, in the sense that f’(c(x₁),…,c(xₙ)) = c(f(x₁,…,xₙ)) for all inputs x₁,…,xₙ coming from the domain of M. The following are textbook quotations:

To show two models are equivalent, we simply need to show that we can simulate one by the other … Any two computational models that satisfy certain reasonable requirements can simulate one another and hence are equivalent in power.²²
Computability relative to a coding is the basic concept in comparing the power of computation models … Thus, we can compare the power of computation models using the concept “incorporation relative to some suitable coding.”²³

But what should be deemed “reasonable” or “suitable” lies in the eyes of the beholder. If we do take this simulation route, and I believe we must, and if we are to have a mathematically satisfying theory of computation, then we are in dire need of a formal definition of allowable encodings c.
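To make the simulation condition concrete, here is a minimal sketch that tests it on a finite range of inputs. It assumes the exponential encoding c(n) = 2ⁿ mentioned earlier and takes addition as the simulated function, which multiplication mirrors on the encoded values. All the names (`encode`, `add`, `add_mirror`, `respects_encoding`) are hypothetical, and a bounded check of this kind is only a sanity test, not a proof.

```python
from itertools import product


def encode(n: int) -> int:
    """Assumed encoding c(n) = 2**n; injective on the natural numbers."""
    return 2 ** n


def add(x: int, y: int) -> int:
    """The function f computed by the simulated model M."""
    return x + y


def add_mirror(a: int, b: int) -> int:
    """Candidate mirror f' in the simulating model M'; it sees only codes."""
    return a * b


def respects_encoding(f, f_mirror, c, arity: int, bound: int) -> bool:
    """Check f'(c(x1),...,c(xn)) == c(f(x1,...,xn)) for all inputs below bound."""
    for xs in product(range(bound), repeat=arity):
        if f_mirror(*(c(x) for x in xs)) != c(f(*xs)):
            return False
    return True
```

Calling `respects_encoding(add, add_mirror, encode, 2, 8)` succeeds because 2ˣ · 2ʸ = 2ˣ⁺ʸ, whereas passing plain addition as the mirror fails already at x = y = 0.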

Hartley Rogers elucidated, “The coding is chosen so that it is itself given by an informal algorithm in the unrestricted sense.”¹⁹ This requirement, however valid, is at the same time too informal and potentially too generous. And it is circular, since our goal is to demarcate the limits of effective computation. As Richard Montague complained: “The natural procedure is to restrict consideration to those correspondences which are in some sense ‘effective’ … But the notion of effectiveness remains to be analyzed, and would indeed seem to coincide with computability.”¹⁸ The only way around its informality would be to agree somehow on a uniform, formal notion of “effective” algorithm that crosses domains (such as Boker and Dershowitz⁴). Still, an “unrestricted” encoding could conceivably enlarge the set of functions that can be computed by the simulating model, as we saw with counter machines and an effective exponential encoding.

What other restrictions, then, should be imposed on encodings? Obviously, we need one and the same encoding c to work for all simulated functions f. Were one to examine lone functions, then it would be easy to come up with a bespoke “deviant encoding” that makes a single uncomputable function appear computable. For another thing, we must insist that the same encoding c be used both for the inputs xᵢ and for the output of f, or else everything can easily go belly-up (pace Butterfield et al.⁸). Ideally, the restrictions would ensure that (unlike for counter machines, or the lambda calculus, for that matter) no allowed encoding can expand the class of computed functions. Specifically, we must preclude the endowing of Turing’s machines or the recursive functions with superpowers (what is termed “hyper-computation”). How can we guarantee this? Shapiro²¹ has submitted that for an encoding of (Platonic) numbers to be “acceptable,” the encoded successor function should also be computable by M’. In other words, M’ must include a function s’: c(n) ↦ c(n + 1) simulating successor. I agree. But why successor?

Mercifully, encoding turns out not to be a problem for the usual use cases. Indeed, no encoding whatsoever can break the Turing barrier. Specifically, one can prove that there is no (injective) encoding that allows one to simulate all the computable functions plus an incomputable one like halting.³ It turns out, in fact, that simulating the successor function effectively is necessary and sufficient to guarantee that Turing machines cannot simulate (under a single-valued encoding) anything unexpected.⁴ And this is all we need for the big picture to remain intact.

On the other hand, with just a bit of effort, one can devise a recursive function (a modification of Ackermann’s function) that cannot be simulated by primitive recursion regardless of the encoding.³ So it thankfully remains true that primitive recursion is strictly weaker than recursion—in the very strong sense that no (injective) encoding whatsoever would endow primitive recursion with full Turing power. Were it not likewise provable that 1-counter machines cannot simulate all recursive functions, statements like “Combining these simulations, we see that two-counter machines are as powerful as arbitrary Turing machines (one-counter machines are strictly less powerful)”¹² would be indefensible.

Turning to formal languages (sets of words over some alphabet), the situation is reversed. Encodings are bad; inclusion is good. Homomorphic mappings may preserve the relative power of most language models (with their purely local impact on the structure of strings), but more general injections or bijections do not. In fact, there is a nefarious bijection between the words of any (nonsingular) alphabet with the disconcerting property that all the regular plus all the context-free languages can be recognized by mere finite-state automata. The situation is actually infinitely more intolerable: one can at the same time also recognize countably many arbitrary undecidable languages with vanilla finite automata via such a mischievous bijection.¹⁰

In the case of languages, then, we are compelled to adhere to straightforward inclusion and ban (even computable) mappings of input strings when comparing the power of language models. Earlier, when dealing with (all) the computable functions, we did have the flexibility of simulating via mappings, but that was because the same mapping is also applied to the full range of possible function outputs.⁵

Related Concerns

There is another ubiquitous use of encodings that similarly requires extra caution. Oftentimes, one wishes to compute a function on objects other than strings or numbers, such as graphs, logical formulae, or computer programs. For that purpose, one must somehow represent those objects in the input/output language of the computational model that is to manipulate them, typically strings or numerals. To quote: “A necessary preliminary to applying our work on computability … is to code expressions by numbers …. There are many reasonable ways to code finite sequences, and it does not really matter which one we choose.”⁶

To be “reasonable,” however, one needs to be sure that the encoding does not do anything beyond faithfully representing the input.

For example, it is a trivial matter to concoct an encoding of Turing machines that turns an undecidable problem about machines into a readily computable one. Let W₀, W₁ … be an enumeration of all binary strings (over the alphabet {0, 1}), and let M₀, M₁, … be some enumeration of all Turing machines (over that input alphabet). The following are four typical decidability questions:

T(i,j): machine M_i halts on input W_j.
H(i): machine M_i halts on input W₀.
U(i): machine M_i halts on all inputs W₀, W₁ …
D(i): machine M_i halts on input W_i.

Halting on a single particular input (like the empty word) is just the parity problem if one reorders a standard enumeration of machines so that the odd-numbered ones halt on that input while the even ones do not. The snag with such an encoding of Turing machines is that it also makes ordinary tasks incomputable. Specifically, one could not modify the code of a given machine to act in some related but different way, because one would need to ascertain the termination behavior of the modified machine. So whether problem H is decidable or not actually depends on exactly how machines are encoded.
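The parity trick can be illustrated with a toy sketch. It assumes a tiny, hand-labeled list of stand-in "machines" whose halting behavior on the empty input is simply given as data; for genuine Turing machines no algorithm could supply those labels, which is exactly why the reordered enumeration is not itself computable. The names `STANDARD`, `deviant_enumeration`, and `halts_on_empty` are hypothetical.

```python
from itertools import chain

# Toy stand-ins for machines: (name, halts_on_empty_input).
# The halting flags are ASSUMED to be known here; for real Turing
# machines they cannot be computed in general.
STANDARD = [
    ("loop_forever", False),
    ("halt_now", True),
    ("count_to_ten", True),
    ("spin", False),
]


def deviant_enumeration(machines):
    """Reorder so even indices hold non-halters and odd indices hold halters."""
    halters = [m for m in machines if m[1]]
    loopers = [m for m in machines if not m[1]]
    # zip stops at the shorter list; the toy list is balanced on purpose.
    return list(chain.from_iterable(zip(loopers, halters)))


def halts_on_empty(index: int) -> bool:
    """'Decide' H(i) under the deviant numbering by looking at parity alone."""
    return index % 2 == 1
```

Under the reordered list, `halts_on_empty(i)` agrees with the stored flag at every index, yet all of the difficulty has merely been moved into constructing the enumeration.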

Consider an assertion such as the following:

One of Turing’s key insights was the Halting Problem H (which takes an integer n and outputs H(n) = 1 if and only if n = <P> is an encoding of a valid [self-contained] program P and P terminates) is “undecidable.”⁷

For your usual, straightforward encodings of machine descriptions, the problem is indeed undecidable, but for any number of alternative encodings it becomes decidable. The same goes for the “universal halting” problem U.

On the other hand, the more basic halting problem, T(i,j), which asks about the behavior of the i-th machine on the j-th possible input, is undecidable regardless of how machines are encoded or how inputs are enumerated. Nevertheless, one must be careful how pairs <i,j> are encoded for models of computation, such as run-of-the-mill Turing machines, that allow only a single input representing both i and j. Similarly, the “diagonal” language D, consisting of the indices of those machines that halt when the input string has the same index in its enumeration as does the machine in its encoding, is not computable for any and all machine encodings and string enumerations. So it is true that “no encoding [of all Turing machines as numbers] can represent a TM M such that L(M) = L_d [the diagonal language],” as claimed in Hopcroft et al.,¹³ but the same immunity to encoding does not hold true for the collection of machines that accept nothing (L_e).¹³ Regrettably, no textbook I have seen clarifies which encodings of machines are valid and for what purpose and why. Nothing in the following remark, for example, precludes a representation from incorporating a finite amount of uncomputable information about the represented machine, such as whether it always terminates or halts on a specific input:

The details of the representation scheme of Turing machines as strings are immaterial [as long as]: (1) We can represent every Turing machine as a string. (2) Given the string representation of a Turing machine M and an input X, we can simulate M’s execution on the input X.¹

The standard part of a malicious string encoding would allow one to simulate execution as usual, while tacked-on extras can allow an algorithm to decide otherwise undecidable questions about them. Overzealous encoding is not, however, a problem in programming languages that pass unadulterated programs as arguments, sans encoding.

As a final comment, when it comes to complexity comparisons, everyone realizes that representation is an issue to be taken into account, but the requirements remain vague: “The intractability of a problem turns out to be essentially independent of the particular encoding scheme … used for determining time complexity … It would be difficult to imagine a ‘reasonable’ encoding scheme for a problem that differs more than polynomially from the standard ones … What we mean here by ‘reasonable’ cannot be formalized …”¹¹

It would seem to me that standard string and image compression schemes are perfectly reasonable encodings, despite reducing size exponentially in many cases. In any event, a formal, principled definition of “reasonableness” is still sorely lacking for the theory of complexity. (But see Boker and Dershowitz⁵ for one proposal.)

Takeaway

To recapitulate the main points of the problem raised here:

Every single course in automata or computability utilizes set inclusion as the means of comparing the computational power of different formalisms for language definition.
Virtually every such course claims equivalence of a wide variety of models of computation in support of the Church-Turing thesis, an equivalence that is based on mutual simulations.
These two notions are logically incompatible as we have witnessed.
No textbook nor any instructor I have encountered recognizes, let alone addresses, this fundamental inconsistency.

At a bare minimum, then, we must make the following changes in the manner this subject is traditionally taught:

One should use set inclusion only as a means to compare classes of formal languages, such as in the demonstration that context-free grammars are a strictly more inclusive formalism than are regular expressions.
We should never use set inclusion to compare the power of primitive recursion with general recursion, or for-loop programs with while-loop ones, or one-counter machines with two counters, without mentioning that it has in fact been demonstrated that the one also cannot simulate all of the other.
Instructors ought to emphasize that one must always be careful with encodings, as they easily alter computational power, while pointing out it has been proved this is not an issue for the usual use case of Turing-level computability.
One should definitely avoid using halting-on-empty-tape, or empty-language acceptance, or similar problems as fundamental examples of undecidability, as their decidability is encoding-dependent. Instead, we need to explicate the subtle role of input encodings when reducing the standard two-input halting problem to those other problems.
We should be cautious to never say or imply that two-counter machines recognize all recursively enumerable languages (they do not), nor that they compute (as opposed to simulate) all Turing-computable functions.
One should not choose the lambda calculus as a primary exemplar of a fully empowered computational model (since it simulates more than it computes).

Sidebar: Counter Machines

Counter machines are one of the very simplest models of computation.

Think of a collection of bowls of marbles, alongside a heap containing an unlimited supply of more marbles.

An n-counter machine comes with n bowls.

A program consists of a list of instructions of the following five simple types:

Place a marble taken from the pile into bowl X, where X is a particular bowl.
Remove a marble from bowl X, and return it to the pile; do nothing if there is nothing in the bowl.
Check if there are no marbles in bowl X; if so, continue with instruction K, where K is the number or label of one of the instructions in the program.
Continue with instruction K, unconditionally.
Halt.

The colors and sizes of the marbles do not matter; only the quantity does.

Initially, the bowls have some given number of marbles as input. When and if a program halts, the number of marbles in a designated bowl is the program’s output.

For example, the following is a 4-counter program for multiplying the quantities initially in bowls A and B. Bowls C and D start out empty. The product of A and B will be in D at the end. Bowl C serves as a holding area.

S: If A is empty, continue at H.
Remove a marble from A.
L: If B is empty, continue at R.
Remove a marble from B.
Place a marble in C.
Place a marble in D.
Continue at L.
R: If C is empty, continue at S.
Remove a marble from C.
Place a marble in B.
Continue at R.
H: Halt.

It is a fact that three bowls suffice to compute any computable single-argument function over the natural numbers, but to compute them all with only two bowls is only possible with an encoding such as 2ⁱ for i.
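The instruction set and the multiplication program above can be sketched as a small interpreter. This is a minimal illustration under stated assumptions: the tuple format for instructions, the operation names (`inc`, `dec`, `jz`, `jmp`, `halt`), and the `max_steps` guard are all hypothetical conveniences and not part of the model itself.

```python
def run(program, bowls, output, max_steps=100_000):
    """Run a counter-machine program and return the count in bowl `output`.

    program: list of (label_or_None, op, *args) tuples.
    bowls:   dict mapping bowl name to its initial number of marbles.
    max_steps is only a safety guard for this sketch.
    """
    labels = {ins[0]: i for i, ins in enumerate(program) if ins[0] is not None}
    bowls = dict(bowls)  # do not mutate the caller's dict
    pc = 0
    for _ in range(max_steps):
        _, op, *args = program[pc]
        if op == "halt":
            return bowls[output]
        if op == "inc":
            bowls[args[0]] += 1
        elif op == "dec":
            # Removing from an empty bowl does nothing.
            bowls[args[0]] = max(0, bowls[args[0]] - 1)
        elif op == "jz":
            if bowls[args[0]] == 0:
                pc = labels[args[1]]
                continue
        elif op == "jmp":
            pc = labels[args[0]]
            continue
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    raise RuntimeError("step budget exhausted")


# The sidebar's 4-counter multiplication program, line for line.
MULTIPLY = [
    ("S", "jz", "A", "H"),
    (None, "dec", "A"),
    ("L", "jz", "B", "R"),
    (None, "dec", "B"),
    (None, "inc", "C"),
    (None, "inc", "D"),
    (None, "jmp", "L"),
    ("R", "jz", "C", "S"),
    (None, "dec", "C"),
    (None, "inc", "B"),
    (None, "jmp", "R"),
    ("H", "halt"),
]


def multiply(a: int, b: int) -> int:
    """Multiply by running MULTIPLY with bowls C and D initially empty."""
    return run(MULTIPLY, {"A": a, "B": b, "C": 0, "D": 0}, "D")
```

For instance, `multiply(3, 4)` empties bowl A over three passes, each pass adding four marbles to D and then restoring B from C.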

DOI: 10.1145/3431281
May 2021 Issue, Vol. 64 No. 5, Pages 37-41
Published: May 1, 2021
