Conversations with AI

The fields of artificial intelligence and computational linguistics have witnessed a breakthrough with the advent of large language models (LLMs) such as ChatGPT, which can predict the continuation of a piece of text. On the one hand, LLMs can perform practical tasks such as text generation, text comprehension, summarization, and translation. On the other hand, they are also of theoretical interest to linguists, both as subjects of study and as tools for studying language use.

At the 2024 American Association for the Advancement of Science (AAAS) Annual Meeting in Denver in February, three computational linguists discussed some of the promises and pitfalls of LLMs for the study of languages.

To illustrate an important difference between LLMs and the human brain in learning and using language, James Martin of the University of Colorado at Boulder offered an interesting example. Take the following piece of a news story: “The plants were found during the search of a warehouse near Ashbourne on Saturday morning. Police said they were in ‘an elaborate grow house.’ A man in his late 40s was arrested at the scene.”

Asked to summarize this text, an LLM produced the following: “Police have arrested a man in his late 40s after cannabis plants worth an estimated £100,000 were found in a warehouse near Ashbourne.”

“When we humans reason about narratives,” Martin explained, “we use things that are not in the text. We build mental models of what’s happening based on our background knowledge, the context, and our linguistic competence.” He wondered about the extent to which an LLM can do the same.

In the summary in the example, the LLM makes some perfectly plausible inferences, Martin said. “That cannabis is involved seems plausible based on the words ‘grow house’. That the police arrested the man also seems logical, and that the arrest took place after the search does also. But where does the figure of £100,000 come from? Why 100,000? Why an amount in British pounds? Why mention an estimated value at all, as there is nothing in the original text?”

This result is an example of a phenomenon related to LLMs that has become known as ‘hallucination’: making stuff up. Martin was able to show that the amount of £100,000 did not come completely out of the blue; a simple Google search turned up many news articles about similar real-life stories that mention some monetary amount, and a particular BBC News story even mentions a similar amount. “LLMs produce text very differently from humans,” Martin concluded, “but the theory and methods of linguists can be very helpful to understand these differences.”

Kyle Mahowald, an assistant professor of linguistics at the University of Texas at Austin, elaborated on the extent to which a model that is good at language is also good at thinking, something we assume is the case in humans. Martin’s example already showed that being good at language doesn’t automatically mean one is good at thinking. “But at the same time,” Mahowald said, “getting the form of language right is actually an interesting result. Whether formal linguistic competence can emerge from a pure statistical approach to language is a major question in linguistics. So, if LLMs get the form of language right, that is scientifically informative.”

Humans can have a sense of what is grammatical even if they have never seen a certain grammatical construction before, said Mahowald. Mahowald’s own research has shown that LLMs, too, can learn constructions that are rare in the training data, like the construction ‘a beautiful five days’ instead of the more standard form ‘five beautiful days’. How, exactly, LLMs learn such rare constructions is a subject of current linguistic research, but it is clear LLMs can be used to answer relevant theoretical linguistic questions, Mahowald concluded.

While LLMs are trained on hundreds of billions of words of digitized text, 99% of the more than 7,000 languages used in the world are not represented by large samples of digitized text. That raises the question Sarah Moeller of the University of Florida asked: What is the influence of LLMs on “smaller” languages? (In fact, many of these “smaller” languages, more accurately described as less digitally represented languages, have many millions of speakers.)

“Language science is plagued by a long-recognized pitfall,” Moeller said, “which is that we know too little about so many languages. So, we need to recognize that a hypothesis based on English or based on an LLM is likely going to be biased towards English or towards other dominant languages used in LLMs.”

Might it be that the languages of successful LLMs put pressure on minorities to give up their own language? It is known that the extinction of minority languages correlates with high rates of suicide, depression, and addiction. On the other hand, communities that actively maintain their original, local language have better health statistics.

“Before the dawn of this new age of AI, we had already estimated that 30% to 90% of the world’s languages will disappear by the end of this century,” said Moeller, “and that hasn’t really changed because of AI. So, the answer to whether LLMs endanger ‘small’ languages depends more on the people who use and study them.”

The really important question that we should be asking, according to Moeller, is how scientists can capitalize on the potential promises of LLMs for the benefit of the study and protection of “smaller” languages. One option is to use computational methods to develop educational materials and new technology to keep endangered languages alive.

Bennie Mols is a science and technology writer based in Amsterdam, the Netherlands.