There has been more than a little sensational speculation in recent months regarding the ability of artificial intelligence (AI) to attain sentience, but Bruce McNaughton, distinguished professor of neurobiology and behavior at the University of California at Irvine, does not waste much time pondering the possibility.
In fact, McNaughton doesn't subscribe much to the idea of sentience, period. He discourages his students from using the term, as well as "consciousness."
"I discourage them from using the term because as a scientist, I think of the brain as a physical system that obeys the laws of physics," McNaughton said. "We just didn't understand that implementation of the laws of physics could become so incredibly complex through the laws of evolution, and I think most people have only the fuzziest idea of how complex it really is in the brain."
Konstantinos Voudouris, a psychologist and graduate student researcher at the Leverhulme Centre for the Future of Intelligence of the U.K.'s University of Cambridge, shares McNaughton's sentiments to a fair degree. "It's beguiling almost to be anthropomorphic about how these systems are behaving," said Voudouris, first author of a recent study that directly compared the cognitive abilities of AI agents and children aged 6 to 10. "It's almost a quality of human psychology to anthropomorphize things, but the psychologist can come in and scientifically evaluate that hypothesis against the many alternatives that exist."
Neither McNaughton nor Voudouris is a computer scientist or engineer, yet their recent work on artificial intelligence is emblematic of a surge in multidisciplinary research. The mechanics of human cognition, which have served as the theoretical underpinnings of AI development for decades, are receiving greater attention as AI systems face ever-greater expectations of what they will be able, and will need, to do. The influence runs in the other direction as well.
McNaughton said cognitive science and artificial intelligence increasingly inform each other because the principles, the mathematics, and the fundamental approaches of machine learning address the same problem the brain confronts: how to build efficient, generalizable knowledge that is flexible enough to be used in different situations.
"In order to capture the statistical structure of the world as we experience it or as an artificial network experiences it, it takes many, many trials in order to gain a good statistical representation of the domain of the data," he said. "The brain has that problem, and artificial neural networks have that problem."
For instance, he said, cognitive scientists are still searching for how the brain actually does the equivalent of backpropagation if, indeed, it doesn't do backpropagation itself: "That is a central problem in computational neuroscience. But that's where neuroscientists who have some ability to follow the machine learning literature can gain insight from that field."
Jay McClelland, director of Stanford University's Center for Mind, Brain, Computation, and Technology, who has published numerous influential explorations of cognitive science and AI with McNaughton, said he also sees a burgeoning dialogue between machine learning scholars and cognitive scientists.
"We do have a two-way street, in the sense that work from AI is at least leading to ways in which people in neuroscience can see how they can engage in the discussion with other people in other fields," he said. "And computer science can offer hypotheses and alternative ways of thinking about exactly what the brain is doing and how it is solving problems, or raise questions we need to answer as brain scientists."
Lifelong learning in machines and humans
One of the major factors driving this computer science/cognitive science dialogue is the Lifelong Learning Machines (L2M) research program launched by the U.S. Defense Advanced Research Projects Agency (DARPA) in 2017, with University of Massachusetts AI expert Hava Siegelmann as the project's first program manager.
The L2M program's core goal was to create AI systems that could take new data, leverage previously learned information, and learn on the fly. Traditional AI architectures, Siegelmann said, fall far short of that. "If you train a network to separate cats and dogs, then you use the same network to separate elephants from tigers, if you use just regularizers, your system won't be able to separate elephants from cats, because it was never a task that it learned."
Siegelmann convened a cross-disciplinary pool of computer scientists, neuroscientists, biologists, and others to fundamentally deepen the interaction between research into machine cognition and research into human cognition.
"They really stepped outside the box and tried to incorporate a range of ideas and thinking in the field," McNaughton said, "and part of that was a subset of neuroscientists who were also interested in these problems. Suffice to say that connection has been strengthened, at least from the perspective of the interests of the machine learning community in neuroscience."
"I wanted the biologists and neuroscientists to tell me the mechanism of how learning works in the brain," Siegelmann said. "I didn't want them to just tell me it goes from the hippocampus to the cortex. We know that. I wanted them to give me a mechanism in such detail I could actually write equations and program them."
The latest research by McNaughton's lab, published in the Proceedings of the National Academy of Sciences, hewed closely to Siegelmann's stipulations by addressing a persistent problem in artificial neural networks. Termed catastrophic interference or catastrophic forgetting, it is the rapid loss of previously acquired knowledge when new information is introduced too quickly, essentially because the new information re-weights the network to such an extent that the system virtually forgets what it has previously learned. Traditionally, artificial network architectures try to alleviate this by re-introducing everything the system has learned as new information is introduced, but this approach becomes impractical in both time and computing resources, especially if a system is expected to function successfully on the fly.
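The interference described above can be seen even in a toy setting. What follows is a minimal sketch, not the lab's experiment: a two-weight linear model trained by plain gradient descent on one hypothetical "old" example, then on one hypothetical "new" example. The data, learning rate, and function names are all assumptions chosen for illustration.

```python
def predict(w, x):
    """Linear model: weighted sum of the inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sgd(w, examples, epochs=500, lr=0.1):
    """Plain stochastic gradient descent on squared error; returns new weights."""
    w = list(w)
    for _ in range(epochs):
        for x, y in examples:
            err = predict(w, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def loss(w, examples):
    """Mean squared error over a list of (input, target) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in examples) / len(examples)

# Hypothetical data: the two inputs overlap, so both tasks share the same weights.
old_task = [((1.0, 0.5), 1.0)]
new_task = [((0.5, 1.0), -1.0)]

w_old = sgd([0.0, 0.0], old_task)             # learn the old item first
w_seq = sgd(w_old, new_task)                  # then train on the new item alone
w_mix = sgd(w_old, old_task + new_task)       # or interleave old with new

print("old-task loss after old training:     ", loss(w_old, old_task))
print("old-task loss after new-only training:", loss(w_seq, old_task))
print("old-task loss after interleaving:     ", loss(w_mix, old_task))
```

Training on the new item alone drags the shared weights away from the old solution, so the old-task error jumps; interleaving keeps both errors low, at the cost of revisiting old data on every pass. That cost is what grows impractical as the store of old items gets large.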
McNaughton's group, led by the study's first author, graduate student Rajat Saxena, refined a learning system introduced by McNaughton and McClelland in 2020 called Similarity Weighted Interleaved Learning (SWIL). The SWIL theory suggests that learning in artificial networks can be made more efficient by introducing only a subset of old items that share substantial representational similarity with the new information: "By using such similarity-weighted interleaved learning, artificial neural networks can learn new information rapidly with a similar accuracy level and minimal interference, while using a much smaller number of old items presented per epoch," the group concluded.
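The selection idea can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the group's implementation: it assumes each class is summarized by a single representation vector, uses cosine similarity as the similarity measure (the article does not specify one), and draws old items for replay with probability proportional to that similarity. The class names, vectors, and function names are hypothetical.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def replay_weights(new_rep, old_reps):
    """Give each old class a sampling probability proportional to its similarity."""
    sims = {name: max(cosine(new_rep, rep), 0.0) for name, rep in old_reps.items()}
    total = sum(sims.values())
    if total == 0:  # nothing is similar: fall back to uniform sampling
        return {name: 1.0 / len(sims) for name in sims}
    return {name: s / total for name, s in sims.items()}

def sample_replay(old_items, weights, budget, rng):
    """Draw `budget` old items, choosing classes according to their weights."""
    classes = list(weights)
    picks = rng.choices(classes, weights=[weights[c] for c in classes], k=budget)
    return [(c, rng.choice(old_items[c])) for c in picks]

# Hypothetical class representations (assumed, e.g., averaged hidden activations).
old_reps = {
    "sneaker": (0.9, 0.1, 0.0),
    "sandal": (0.8, 0.3, 0.1),
    "shirt": (0.0, 0.2, 0.9),
}
new_rep = (1.0, 0.2, 0.0)  # a hypothetical new class resembling footwear
old_items = {c: [f"{c}_{i}" for i in range(5)] for c in old_reps}

weights = replay_weights(new_rep, old_reps)
batch = sample_replay(old_items, weights, budget=8, rng=random.Random(0))
print(weights)
print(batch)
```

In this sketch the footwear classes receive most of the replay budget because they resemble the new class, while the dissimilar class is rarely revisited. The replayed items would then be mixed with the new examples in each training epoch.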
In their original paper, McNaughton, McClelland, and Andrew Lampinen concluded that SWIL performed similarly to networks that interleaved every old item with the new ones to be learned, while using 40% fewer items. They did not find, though, that it scaled beyond a simple neural network.
The latest paper successfully scaled SWIL to work on traditional classification datasets (Fashion-MNIST, CIFAR10, and CIFAR100) as well as or better than existing schemes such as Fully Interleaved Learning (FIL), Focused Learning (FoL), and Equally Weighted Interleaved Learning (EqWIL). The team concluded that SWIL's future, at least in terms of AI, probably lies in complementing other learning techniques, such as generative replay or elastic weight consolidation. And, while McNaughton called the latest SWIL research the "evolution" of a breakthrough concept rather than a breakthrough in itself, he did say it stimulated questions about human cognition.
"Once we realized this was going to be beneficial in the machine learning process, we began to ask ourselves how would the brain do this?" he said. "So we have a conceptual model outlined in the paper that Rajat is actually working on for his Ph.D. thesis."
That model employed in the brain, McNaughton said, may ultimately lead to behavioral therapies, as well as possible neurochemical or neurophysiological interventions to help people with cognitive impairments – but he also cautioned, "we are a ways from that."
Voudouris and his colleagues are also working on several projects funded through another DARPA grant, RECoG-AI (Robust Evaluation of Cognitive Capabilities and Generality in Artificial Intelligence). The work aims to improve AI evaluation by providing a framework and benchmarks for measuring the capabilities of AI systems. However, Voudouris said, insights into human cognition need to become more granular in some instances, in order to accurately evaluate AI's capabilities – or, at least, to accurately compare them to those of humans. On that point, Voudouris, just starting on his career path, concurs with McNaughton and McClelland, who have spent decades pondering the similarities and differences of human and machine cognition.
"Trying to implement and build agents that behave like we do throws up all sorts of subtle details that aren't really considered in cognitive science, because the kinds of theories that are proposed there tend to be posed at a higher level, without the deeper implementational detail," Voudouris said.
For instance, he said, it is taken as almost axiomatic in developmental and comparative psychology that humans and animals can individuate objects — the more interesting question to him is how we then reason about the objects we have identified. Yet in building AI that can do complex physical reasoning, the problem of having a robust sense of objecthood itself becomes vital, and those implementational details are crucial.
"That's because it's almost like the first step that needs to be conquered before we can build 'robots with physical common sense'," Voudouris said. "So, in this way, I suppose our work with direct human-AI comparison is highlighting the lacunae and blurriness in cognitive science, motivating us to develop more precisely specified theories about human cognition using the formal tools of machine learning."
Gregory Goth is an Oakville, CT-based writer who specializes in science and technology.