BLOG@CACM
Artificial Intelligence and Machine Learning

AI as (an Ersatz) Natural Science?

In many ways, we are living in quite a wondrous time for AI, with every week bringing some awe-inspiring feat in yet another tacit-knowledge task that we were sure would be out of reach of computers for quite some time to come. Of particular recent interest are the large learned systems based on transformer architectures that are trained with billions of parameters over massive Web-scale multimodal corpora. Prominent examples include large language models like GPT-3 and PaLM that respond to free-form text prompts, and language/image models like DALL-E and Imagen that can map text prompts to photorealistic images (and even models with claims to general behaviors, such as Gato).

The emergence of these large learned models is also changing the nature of AI research in fundamental ways. Just the other day, some researchers were playing with DALL-E and thought that it seemed to have developed a secret language of its own which, if we could master it, might allow us to interact with it better. Other researchers found that GPT-3's responses to reasoning questions can be improved by adding certain seemingly magical incantations to the prompt, the most prominent of these being "Let's think step by step." It is almost as if large learned models like GPT-3 and DALL-E are alien organisms whose behavior we are trying to decipher.

This is certainly a strange turn of events for AI. Since its inception, AI has existed in the no-man's land between engineering (which aims at designing systems for specific functions) and "science" (which aims to discover the regularities in naturally occurring phenomena). The science part of AI came from its original pretensions to provide insights into the nature of (human) intelligence, while the engineering part came from a focus on intelligent function (get computers to demonstrate intelligent behavior) rather than on insights about natural intelligence.

This situation is changing rapidly, especially as AI is becoming synonymous with large learned models. Some of these systems are reaching a point where we not only do not know how the models we trained are able to show specific capabilities, but are very much in the dark even about what capabilities they might have (PaLM's alleged capability of "explaining jokes" is a case in point). Often, even their creators are caught off guard by things these systems seem capable of doing. Indeed, probing these systems to get a sense of the scope of their "emergent behaviors" has become quite a trend in AI research of late.

Given this state of affairs, it is increasingly clear that at least part of AI is straying firmly away from its "engineering" roots. It is increasingly hard to consider large learned systems as "designed" in the traditional sense of the word, with a specific purpose in mind. After all, we don't go around saying we are "designing" our kids (seminal work and gestation notwithstanding). Besides, engineering disciplines do not typically spend their time celebrating emergent properties of their designed artifacts (you never see a civil engineer jumping for joy because the bridge they designed to withstand a category-five hurricane has also been found to levitate on alternate Saturdays!).

Increasingly, the study of these large trained (but un-designed) systems seems destined to become a kind of natural science, even if an ersatz one: observing the capabilities they seem to have, doing a few ablation studies here and there, and trying to develop at least a qualitative understanding of the best practices for getting good performance out of them.

Modulo the fact that these are going to be studies of in vitro rather than in vivo artifacts, such efforts share the grand goal of biology: to "figure out" what is going on while being content to get by without proofs or guarantees. Indeed, machine learning is replete with research efforts focused more on understanding why a system is doing what it is doing (sort of "fMRI studies" of large learned systems, if you will) than on proving that we designed the system to do so. The knowledge we glean from such studies might allow us to intervene to modulate the system's behavior a little (as medicine does). The in vitro part does, of course, allow for far more targeted interventions than in vivo settings do.

AI's turn to natural science also has implications for computer science at large, given the outsized impact AI seems to be having on almost all areas of computing. The "science" suffix of computer science has sometimes been questioned and caricatured; perhaps not any longer, as AI becomes an ersatz natural science studying large learned artifacts. Of course, there might be significant methodological resistance and reservations to this shift. After all, CS has long been used to the "correct by construction" holy grail, and from there it is quite a shift to get used to living with systems that are at best incentivized ("dog trained") to be sort of correct—sort of like us humans! Indeed, in a 2003 lecture, Turing laureate Leslie Lamport sounded alarms about the very possibility of the future of computing belonging to biology rather than logic, saying it would lead us to living in a world of homeopathy and faith healing! To think that his angst was mostly about complex software systems that were still human-coded, rather than about these even more mysterious large learned models!

As we go from being a field focused primarily on intentionally designed artifacts and "correct by construction" guarantees toward one trying to explore and understand an existing (un-designed) artifact, it is perhaps worth thinking aloud about the methodological shifts this will bring. After all, unlike biology, which (mostly) studies organisms that exist in the wild, AI will be studying artifacts that we created (although not "designed"), and there will certainly be ethical questions about what ill-understood organisms we should be willing to create and deploy. For one, large learned models are unlikely to support provable capability-relevant guarantees—be it regarding accuracy, transparency, or fairness. This brings up critical questions about best practices for deploying these systems. While humans also cannot provide iron-clad proofs about the correctness of their decisions and behavior, we do have legal systems in place for keeping us in line with penalties: fines, censure, or even jail time. What would be the equivalent for large learned systems?

The aesthetics of computing research will no doubt change, too. A dear colleague of mine used to preen that he rates papers—including his own—by the ratio of theorems to definitions. As our objectives become more like those of natural sciences such as biology, we will certainly need to develop new methodological aesthetics (as a zero-to-zero theorems-to-definitions ratio won't be all that discriminative!). There are already indications that computational complexity analyses have taken a back seat in AI research.

 

Subbarao Kambhampati is a professor in the School of Computing & AI at Arizona State University, and a former president of the Association for the Advancement of Artificial Intelligence. He studies fundamental problems in planning and decision making, motivated in particular by the challenges of human-aware AI systems. He can be followed on Twitter @rao2z.
