Opinion

Computer Science: the Science of and About Information and Computation

New computing paradigms present serious challenges for system architecture.

Information is unique in that it is simultaneously contextual and arbitrary. Contextual, because it never exists in a vacuum but always represents some computational process or function; arbitrary, because its symbols have representational power only by virtue of some previously agreed-upon meaning. We suggest the latter property has given us a powerful tool for expanding and exploring the space of scientific possibility, and has made possible exciting new computing paradigms such as optical, biomolecular, and quantum computing.

These new paradigms, however, present serious challenges for new system architectures. To this end, we further suggest that taking into account the contextual nature of information will lead to better, more useful, and efficient computing systems. We call this approach “information engineering.”

The History of Information and Information Science

In their book Information Ages: Literacy, Numeracy and the Computer Revolution, Michael Hobart and Zachary Schiffman compellingly relate the evolution of information throughout history. Their main message is simply this: as our knowledge and technology have grown in complexity and sophistication, information, which initially represented the immediate flux and flow of daily life and was intimately tied to its medium of conveyance, has grown steadily more abstract, so that we now find ourselves in an age when information is represented by arbitrarily assigned symbols conveyed by whatever means human ingenuity and nature provide.

More specifically, Hobart and Schiffman divide the history of information into three distinct periods, the first of which they call the Classical Age. While physicists use the term “information” to denote a measure of order in physical systems, and as such refer to an entity dating back to the very beginnings of the universe, Hobart and Schiffman pick up the narrative with prehistoric human language and the oral tradition.

We refer to Hobart and Schiffman for the details, but here we will briefly explain the progression to alphabetic literacy. First, emblems or pictographs were used to stand for physical objects. Cuneiform, the earliest true form of writing, developed as the increasingly sophisticated Mesopotamian civilizations began to break free from the constraints of accounting and to model writing upon their language rather than the physical objects of everyday life. During the Classical Age the alphabet was developed, representing in the abstract the basic units of speech, such as vowels and consonants. Here we see that, through the alphabet, writing became a means of encoding any sort of information in piecemeal fashion; that is to say, writing was no longer restricted to a 1:1 reference between language and concrete experience.

Around 1450 A.D., Johannes Gutenberg invented the printing press and the Modern Age began. Descartes described it as “the analytical vision of knowledge” (in contrast to the classifying vision of knowledge typical of Aristotle’s age), wherein physical phenomena were ultimately reduced to mathematical equations. The now-familiar litany of giants in the field of mathematics characterizes this era: Descartes’ analytic geometry; the calculus of Newton and Leibniz; Riemann’s non-Euclidean geometry; and Hamilton’s dynamics. In short, mathematics was a new and powerful technology that replaced the written word as the most efficient means of dealing with increasingly abstract information.

Sadly (or happily, depending on one’s point of view), the analytic vision of knowledge and the agenda of scientific determinism were eventually shown to be castles in the air; quantum mechanics, Russell’s Class Paradox, and Gödel’s Incompleteness Theorem marked the collapse. And so we come to what Hobart and Schiffman call the Contemporary Age, an age in which we still manipulate mathematical symbols by fixed rules, but do so with the knowledge that the symbols we use are in some deep sense arbitrary, deriving their meaning only through previously agreed-upon referents.

How, then, is the Contemporary Age different in some meaningful way from the Modern Age? Precisely in the power and opportunities for play enabled by the combination of our open knowledge system and the modern equivalent of the printing press—the computer. The power of the computer, coupling as it does logical operations with electronic circuitry, is well known. However, the computer-mediated play over arbitrary symbols that typifies our time is less clear and may be best explained by example.

A case in point is a virtual reality application developed by a team from the University of North Carolina at Chapel Hill called the nanoManipulator (nM). The nM provides a VR interface to a scanning-probe microscope capable of imaging and manipulating materials at the atomic level.

Information technology in the Contemporary Age has expanded the space of possibility. It has opened a new realm of inquiry allowing scientists from all disciplines to explore the impossible and to investigate synthetic and/or previously unimagined worlds. It is information, hence it can be combined, modified, and transformed. We are thus bounded only by our imaginations.

The Analog/Digital Distinction

Given the preceding discussion, it is a trivial observation that as technology has progressed, so has information processing. That is to say, technological advances are put to use as they become available for better, faster, and more efficient computational processes. From cuneiform on clay, to the printed word on paper, to digital bits in electronic circuits, the transfer and processing of information have continually improved.

Today, one has three general choices when it comes to computing and information representation: analog, digital, and hybrid systems. An analog computer represents information as a continuous function of some parameter, and the processing or computation is performed on the whole function at once. A digital computer, on the other hand, segments the same informational content into discrete units, and computational operations are applied selectively to small portions of the data, down to a single digit. The tradeoff is that while analog computation may be faster, digital computation is more flexible and provides access to local portions of the data. Hybrid systems, as the name implies, are a combination of digital and analog subsystems interconnected via converters. The lesson to be learned is that the freedom afforded by processing over symbols with arbitrarily assigned content also requires us to make principled choices among these three models of computation.
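
To make the distinction concrete, here is a minimal sketch (our own illustration in Python; the signal, sample rate, and bit depth are arbitrary choices, not taken from the text) of turning a continuous signal into the discrete units a digital computer operates on:

    import numpy as np

    def analog_signal(t):
        # A "continuous" signal: defined for every real-valued time t.
        return np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 11 * t)

    def digitize(signal, duration=1.0, sample_rate=100, bits=8):
        # Sample at discrete instants, then quantize each sample to one of
        # 2**bits levels: the digital representation a computer can address
        # and operate on piece by piece.
        t = np.arange(0.0, duration, 1.0 / sample_rate)
        samples = signal(t)
        levels = 2 ** bits
        lo, hi = samples.min(), samples.max()
        quantized = np.round((samples - lo) / (hi - lo) * (levels - 1))
        return t, quantized.astype(int)

    t, digital = digitize(analog_signal)
    print(digital[:10])  # the first few discrete, quantized samples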


The expression “analog process” or “analog computation” denotes a continuous process or computation, as opposed to a discrete one. The actual signal can take such forms as an electrical current or voltage as a function of time, the waveform of an optical signal, a mechanical force, mass, or velocity, or the heat, pressure, or concentration of chemical compounds, again as functions of time.

Hence, we see that in analog computation there is a circular relationship between the physical phenomena or process being modeled, the mathematical model of the physical process, and the physical substrate implementing the mathematical model (for example, mechanical gears and shafts, electrical resistors and conductors, chemical compounds, optical lenses, and mirrors). We shall return to this important point shortly.
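
To make that circularity concrete, consider a damped mass-spring system (a sketch of our own; the constants are arbitrary). The physical phenomenon is captured by a differential equation, and a classical analog computer would implement that equation directly, wiring integrators built from gears or op-amps so that the machine's own continuous behavior mirrors the system it models. The code below can only mimic that continuous evolution with very small discrete steps:

    # Damped mass-spring system: m*x'' + c*x' + k*x = 0.
    # On an analog computer, two integrators and a summer would be wired so the
    # circuit's voltages evolve exactly as x and x' do; here we approximate the
    # continuous evolution with small discrete steps.
    m, c, k = 1.0, 0.2, 4.0   # mass, damping, stiffness (illustrative values)
    x, v = 1.0, 0.0           # initial displacement and velocity
    dt = 1e-4                 # tiny step standing in for continuous time

    for _ in range(int(2.0 / dt)):       # simulate two "seconds"
        a = -(c * v + k * x) / m         # acceleration from the model
        v += a * dt                      # first integration: acceleration -> velocity
        x += v * dt                      # second integration: velocity -> displacement

    print(f"x(2.0) ~= {x:.4f}, v(2.0) ~= {v:.4f}")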

The familiar debate between analog and digital enthusiasts typically hinges on the relative benefits of speed versus precision. Although digital computers cannot calculate over the set of real numbers as analog computers can, they can effectively compute numbers with as much precision as is practically necessary. When this fact is combined with system considerations, such as the greater susceptibility of analog computers to noise and error, digital computing seems to be the clear winner. However, there are certain applications and classes of problems for which analog computing has definite advantages, such as robotic control systems.
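
As a small aside on the precision point (an example of our own, not the authors'): a digital machine cannot hold the real number that is the square root of 2, but it can approximate it to whatever precision a task demands, for instance with Python's arbitrary-precision decimal arithmetic:

    from decimal import Decimal, getcontext

    getcontext().prec = 50        # request 50 significant digits
    root_two = Decimal(2).sqrt()  # a discrete approximation, not the real number itself
    print(root_two)               # the square root of 2 to 50 significant digits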

Information Engineering

The issue is this: if one thinks of computation not in the abstract but as a physically realized process connecting some input information to some output information, then the computational process, and how the information is internally represented, may perhaps be optimized along continuous/discrete lines in a way that is most efficient for a given system architecture, task, and the nature of the input/output information. Biology has plenty of examples to this effect. Consider a visual sensor. For certain tasks it may be connected directly to a motor controller. Or, when there is a need for decision making of any sort (from the simplest case, such as producing an inhibitory signal, to a more complex decision, such as enabling an ensemble of neural circuits for pattern recognition), the system produces a discrete decision. These discrete decisions can themselves be modeled by continuous functions, such as the so-called logistic function.
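
As a toy illustration of that last point (our own sketch; the gain and threshold are arbitrary), a logistic function is continuous and smooth, yet for most inputs its output sits so close to 0 or 1 that thresholding it yields an effectively discrete decision:

    import math

    def logistic(u, gain=10.0):
        # Continuous, smooth function that saturates toward 0 or 1.
        return 1.0 / (1.0 + math.exp(-gain * u))

    def decide(sensor_value, threshold=0.5):
        # A discrete decision (fire / don't fire) driven by a continuous signal.
        activation = logistic(sensor_value)   # continuous in sensor_value
        return activation > threshold         # discrete output

    for s in (-0.4, -0.05, 0.05, 0.4):
        print(s, round(logistic(s), 3), decide(s))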

The question, then, is: How does one optimize this process? To answer this question, we must first make explicit that we are only interested in processes or functions known to be computable. We leave to others the question of whether the universe is itself computable. This being the case, we assume:

  • Computational processes or functions can be abstractly expressed by some mathematical or algorithmic formalism; and
  • Mathematical or algorithmic formalisms describe a computational process or function at a specific level of granularity or resolution.

From these assumptions it follows that the granularity of the computational process or function is a determining factor in whether the formalism ought to be in the continuous or discrete domain, or both. Said differently, the representation should be appropriate to the computational process or function being described. Put even more simply, the choice of mathematical model must take into account the task or problem to be solved at the given scale of resolution.
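
A toy illustration of the granularity point (our own construction, not from the text): population growth at the scale of a handful of individuals is naturally modeled with a discrete, per-individual formalism, while at the scale of millions a continuous differential equation describes the same process more economically.

    import random

    def grow_discrete(population, birth_prob=0.1, generations=10):
        # Discrete formalism: count individual birth events; natural at small
        # scales, where randomness and integer counts matter.
        for _ in range(generations):
            births = sum(1 for _ in range(population) if random.random() < birth_prob)
            population += births
        return population

    def grow_continuous(population, rate=0.1, time=10.0, dt=0.01):
        # Continuous formalism: treat the population as a real-valued quantity
        # obeying dN/dt = r*N; appropriate when N is huge and individual events
        # wash out.
        n = float(population)
        for _ in range(int(time / dt)):
            n += rate * n * dt
        return n

    print(grow_discrete(20))                   # a few individuals: discrete model
    print(round(grow_continuous(2_000_000)))   # millions: the smooth model suffices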

Considering the granularity of the process or function being modeled is not a sufficient condition by itself, however, for determining the appropriate nature of the information representation. This is because computation is always implemented in hardware, either organic or synthetic. Let us elaborate by adding two additional assumptions:

  • Computation is ultimately physical, in the sense that the mathematical or algorithmic formalisms describing computational processes or functions are executed upon, or implemented in, a physical substrate; and
  • The physical substrate, and the controlled behavior thereof, may exhibit different levels of granularity or resolution. That is, a physical substrate exhibits continuous, discrete, or hybrid behavior depending upon which behavior is of interest. (Two familiar examples are light and electricity, which may take the form of discrete photons and electrons, respectively, or may be continuous electromagnetic waves.)

When we take into account the real-world nature of computation, two things should be immediately clear. First, the controllable behavior of the physical substrate is a determining factor in whether the formalism ought to be in the continuous or discrete domain, or both. In other words, the representation should be appropriate to the underlying physical substrate. This is trivially true. Second, and equally important, the converse also holds: the formalism being implemented, and by extension the computational process or function being described, are determining factors in whether the physical substrate (system architecture) ought to be in the analog domain, digital domain, or both.

Keeping in mind the preceding discussion, in Figure 1 we see that the physical process being formally modeled and the physical substrate implementing the model both play an important role when choosing the suitable mathematical model, or more generally the optimum information representation scheme. However, the relationship among the three is no longer circular. Why? Because the computer-mediated play over arbitrary symbols that is a defining feature of the Contemporary Information Age breaks the chain of analogous processes. The computational process or function being modeled remains a given, of course, determined by the task-oriented nature of computation. It directly and indirectly affects the choice of physical substrate and formalism. At the same time, the formalism and physical substrate must be optimally matched to each other, because computation is physically realized in hardware. We depict this model in Figure 2. (Note again, however, that the causal relationship between the physical substrate and the computational process or function no longer exists.)

What, then, are the practical implications of this model? Briefly, we are suggesting that the traditional computer science subdisciplines of architecture, software, and theory need to recognize the contextual nature of information. That is to say, computer scientists should recognize that information does not exist in a vacuum, but instead always represents some computational process or function (real-world data or cognitive constructs) whenever it flows into a computer, is acted upon via some formalism, and is then output back into the real world. If this is so, then each of the arrows in Figure 2 represents a question of engineering, the answer to which must be balanced against the others via some cost functions:

  • A. Would a continuous, discrete, or hybrid formalism best represent the information and task at hand?
  • B. Would an analog, digital, or hybrid architecture be best suited for the information and task at hand?
  • C. Would a continuous, discrete, or hybrid formalism be best suited for the system architecture?
  • D. Would an analog, digital, or hybrid architecture be best suited for the formalism used?

Possible cost functions that may be used to guide one’s decisions include reliability, robustness, programmability, flexibility in the range of tasks, energy consumption, precision, and speed. This is an information engineering approach, as opposed to the more limited system engineering approach, which typically asks only questions C and D.
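
One way to picture this balancing act (a purely hypothetical sketch; the candidate pairings, weights, and scores below are ours, invented for illustration) is as a search over formalism/architecture pairs scored against such cost functions:

    # Candidate (formalism, architecture) pairings scored, 0 to 1, against a few
    # of the cost functions named above. All numbers are illustrative only.
    candidates = {
        ("discrete formalism", "digital architecture"):
            {"reliability": 0.9, "programmability": 0.9, "energy": 0.5, "speed": 0.6},
        ("continuous formalism", "analog architecture"):
            {"reliability": 0.6, "programmability": 0.3, "energy": 0.9, "speed": 0.9},
        ("hybrid formalism", "hybrid architecture"):
            {"reliability": 0.7, "programmability": 0.6, "energy": 0.7, "speed": 0.8},
    }

    # The weights encode what matters for the task at hand; questions A-D trade off here.
    weights = {"reliability": 0.4, "programmability": 0.3, "energy": 0.1, "speed": 0.2}

    def score(costs):
        return sum(weights[name] * value for name, value in costs.items())

    best = max(candidates, key=lambda pair: score(candidates[pair]))
    print("best balance for this task:", best)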

Now, some may protest that with all this talk of continuous functions and analog and hybrid system architectures, we have given short shrift to digital general-purpose computing. One may point out that considerations of economy dictate use of a uniform representation at an elementary level and a physical substrate usable for the largest number of tasks and models, which implies universality of computation. (Computer scientists have focused on this almost exclusively.) And haven’t digital systems served us quite well for the last 50 years? We have no argument here. Indeed, our point is that for many cases the information engineering approach will suggest that digital general-purpose computing using discrete units for information representation is in fact the optimal solution. However, it should be obvious that digital computers are not best suited for all tasks or representations. In certain special-purpose cases, analog and hybrid systems may be the best solutions with regard to robustness and efficiency, but they clearly aren’t as flexible or easy to program. These are the tradeoffs one must make, the balance one must strike, when recognizing the contextual nature of information.

From an information engineering point of view, biological systems have evolved the optimal balance of answers to the questions for their specific purposes. For example, we suggest that the brain is not a general-purpose computer (like a universal Turing machine) but a highly evolved special-purpose machine that by sheer dint of complexity is capable of a great many tasks. Depending on the task and type of information being transmitted, networks of neurons automatically employ discrete, continuous, and hybrid computation whenever necessary. And perhaps this is the ultimate lesson to be learned from the most powerful computational machine in nature with the given task of survival: it seems that the brain is capable of automatic architecture customization for certain well-defined computational tasks.

Perhaps someday it will be commonplace for computers to likewise automatically tune their discrete and continuous representations depending on the context of the information and the task. In the near term, however, as new revolutionary computing paradigms emerge, it is vital at the very least to recognize and account for this increasingly salient, symbiotic relationship between the contextual representation of information and computation. Information engineering is a good first step in that direction.

Figures

Figure 1. The circular nature of rudimentary analog computing.

Figure 2. A model of computation in the Information Age.
