
LLM Hallucinations: A Bug or A Feature?

Researchers are taking a multitude of approaches to deal with AI hallucinations.

Credit: AI-generated image of dogs in a classroom (Shutterstock)

Large Language Models (LLMs) can do astonishing things, like summarize complex data or generate creative content in seconds. Unfortunately, they also make things up, a behavior euphemistically referred to as hallucinating. Hallucinations happen when a model outputs plausible-sounding but inaccurate, even nonsensical, responses to prompts, which can undermine LLMs' trustworthiness and real-world deployment.

Palo Alto, CA-based AI company Vectara maintains a Hallucination Leaderboard on GitHub. It currently shows that popular LLMs, including versions of GPT, Llama, Gemini, and Claude, hallucinate between 2.5% and 8.5% of the time; for some models, that figure exceeds 15%. Hallucinations can be comical, but they may also have serious consequences, especially in specialist sectors such as healthcare, finance, and law, where hallucination rates are even higher. Matthew Dahl, Varun Magesh, Mirac Suzgun, and Daniel E. Ho, researchers at Stanford University's RegLab, showed that hallucinations are "pervasive and disturbing" in response to verifiable legal queries, with rates ranging from 69% to 88%.

The issue is a fundamental one. LLMs' core function is to predict the next most likely output in a string of text or code; they do not inherently know fact from fiction. As Brent Mittelstadt, director of research at the Oxford Internet Institute (OII) of the University of Oxford, U.K., explained, "There is no absolute requirement for them to be factual, accurate, or line up with some sort of ground truth."

The long-term societal consequences of unleashing terabytes of credible-sounding but false information into the world are hard to predict. Said Mittelstadt, "It's really difficult to measure, in the same way that it's difficult to measure the impact of misinformation/disinformation, what impact that has on society and the beliefs of individuals over time."

The impact is already upending politics. A survey by the U.S.-based Pew Research Center found that “about four in ten [American] adults have not too much or no trust in the election information that comes from ChatGPT.” A report by the U.K. think tank Demos and University College London stated that “Synthetic content produced by generative AI [including text, video, and audio content] poses risks to core democratic values of truth, equality, and non-violence.”

Frenzied and varied efforts are underway to reduce—if not eliminate—LLM hallucinations. Some researchers see them as a bug to be fixed, others as a feature to accept or even embrace.

A host of solutions, and perspectives

Causes of hallucinations include incomplete, biased, or contradictory training data; source-reference divergence (such as when a model generates responses that are not grounded in source material); jailbreak prompts, in which users deliberately try to side-step a model's guidelines or exploit its vulnerabilities; and overfitting, in which models are too closely aligned with training data and fail to generate accurate outputs using new data.

The resulting hallucinations can be intrinsic, directly contradicting source material, or extrinsic, in that the model’s output may or may not be accurate, but cannot be verified by the source material. An extrinsic hallucination can be true, said Ariana Martino, a data scientist for New York-based digital presence platform Yext; “It just isn’t true in the context of the information that the language model has to work with.” This matters in enterprise use-cases, she said; “When a company is using AI for a particular business purpose, it’s really important that the information isn’t just true, but it’s true in the particular context that they’re working in.”

A common approach to reducing hallucinations is to fine-tune a model's output using human responses, but this can be time-consuming and costly. It also raises transparency concerns, said Mittelstadt, who explained, "You're essentially giving a lot of power to the providers of these systems to determine what is true or what is not true, or maybe more relevantly, what is appropriate to discuss versus what is not."

The search for new solutions is markedly varied; there is no consensus on the best way forward. Some teams are focused on model improvement. Researchers at Facebook AI Research (now Meta), New York University, and University College London have proposed a fine-tuning technique using Retrieval-Augmented Generation (RAG) for knowledge-intensive NLP tasks, which "combines pre-trained parametric and non-parametric memory for language generation."
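In outline, a RAG pipeline retrieves passages relevant to a query and conditions the model on them, so answers are grounded in retrieved source text rather than in the model's parameters alone. The Python sketch below is only illustrative; the `embed` and `generate` callables stand in for a real embedding model and LLM call, and the published RAG architecture trains retriever and generator jointly rather than simply stuffing retrieved text into a prompt.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rag_answer(query, passages, embed, generate, k=3):
    """Toy retrieval-augmented generation loop.

    embed:    callable mapping text -> vector (e.g., a sentence-embedding model)
    generate: callable mapping a prompt string -> text (e.g., an LLM API call)
    """
    q_vec = embed(query)
    # Rank source passages by similarity to the query (the non-parametric memory).
    ranked = sorted(passages, key=lambda p: cosine(embed(p), q_vec), reverse=True)
    context = "\n".join(ranked[:k])
    # Ground the answer in retrieved text rather than in the model's parameters alone.
    prompt = (
        "Answer using only the context below. If the answer is not in the context, "
        f"say you do not know.\n\nContext:\n{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```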

Pascale Fung’s group at the Center for Artificial Intelligence Research (CAiRE) at the Hong Kong University of Science and Technology has developed a self-reflection methodology that uses model knowledge acquisition and answer generation to reduce LLM hallucinations. Meanwhile, a team from Microsoft Responsible AI has put forward a hallucination detection and reduction method via post-editing using a Chain-of-Verification (CoVe) framework.
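The Chain-of-Verification idea is to have the model draft an answer, generate fact-checking questions about that draft, answer them independently, and then revise the draft in light of the answers. The following is a simplified, hypothetical Python sketch of that loop, assuming a generic `generate(prompt)` call to an LLM; the published framework's prompts and decoding details differ.

```python
def chain_of_verification(question, generate):
    """Simplified Chain-of-Verification-style loop.

    generate: callable mapping a prompt string to model text (e.g., an LLM API call).
    """
    # 1. Draft an initial (possibly hallucinated) answer.
    draft = generate(f"Answer the question.\nQuestion: {question}")

    # 2. Plan short fact-checking questions about the draft.
    plan = generate(
        "List short fact-checking questions, one per line, for this answer.\n"
        f"Question: {question}\nDraft answer: {draft}"
    )
    checks = [q.strip() for q in plan.splitlines() if q.strip()]

    # 3. Answer each verification question independently of the draft,
    #    so errors in the draft do not contaminate the checks.
    verifications = [(q, generate(f"Answer briefly: {q}")) for q in checks]

    # 4. Produce a revised answer that is consistent with the verified facts.
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in verifications)
    return generate(
        "Revise the draft so it is consistent with the verified facts below.\n"
        f"Question: {question}\nDraft: {draft}\nVerified facts:\n{evidence}"
    )
```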

Other researchers suggested a rethinking of how we perceive hallucinations. Computing experts at the National University of Singapore (NUS), Ziwei Xu, Sanjay Jain, and Mohan Kankanhalli, director of the NUS’ AI Institute, used learning theory to argue that hallucinations are inevitable and cannot be eliminated. LLMs will always hallucinate as they “cannot learn all of the computable functions,” suggested the team. At Peking University, researchers Jia-Yu Yao, Kun-Peng Ning, Zhen-Hui Liu, Mu-Nan Ning, and Li Yuan proposed revisiting hallucinations and viewing them as “adversarial examples,” rather than bugs to be fixed.

Another tack is to develop sector-specific approaches. Mittelstadt and OII co-authors Sandra Wachter and Chris Russell argued that as hallucinations have the potential to “degrade science,” LLMs should be approached as “zero-shot translators” for converting source material into various forms. Said Mittelstadt, “Rather than using a language model as a knowledge repository, or to retrieve some sort of knowledge, you are using it as a system to translate data from one type to another or from one format to another.”
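Used as a translator, the model is given all of the relevant facts in the prompt and asked only to change their form, not to recall anything. A minimal, hypothetical illustration of that framing (the record and wording here are invented for the example):

```python
import json

def translate_record_to_prose(record, generate):
    """Use the model as a format translator: every fact is supplied in the
    prompt, and the model is asked only to restate it, not to recall anything.

    generate: callable mapping a prompt string to model text.
    """
    return generate(
        "Rewrite the following record as one plain-English sentence. "
        "Do not add any information that is not in the record.\n\n"
        + json.dumps(record, indent=2)
    )

# Hypothetical usage:
# translate_record_to_prose(
#     {"study": "Trial A", "participants": 120, "outcome": "no significant effect"},
#     generate=my_llm_call,
# )
```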

Targeting retail, Martino and Yext colleagues Michael Iannelli and Coleen Truong have shown that LLMs' responses to online customer reviews can be improved by creating prompts using a Knowledge Injection (KI) framework that draws on contextual data about specific retail locations. The work also makes a case for building smaller, fine-tuned models for specific scenarios, said Martino, rather than taking the "giant approach" of relying on more-generalized API-based models for specialist tasks.
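In spirit, knowledge injection means building the prompt from verified, location-specific facts so the model does not have to recall (or invent) them. The sketch below is hypothetical; the field names and prompt wording are illustrative and are not Yext's actual framework.

```python
def build_review_reply_prompt(review_text, location):
    """Assemble a reply prompt from verified, location-specific facts so the
    model does not have to recall (or invent) them. Field names are illustrative."""
    facts = "\n".join(f"- {key}: {value}" for key, value in location.items())
    return (
        "You are replying to a customer review for the store described below.\n"
        "Use only the listed facts; do not invent hours, services, or policies.\n\n"
        f"Store facts:\n{facts}\n\n"
        f"Customer review:\n{review_text}\n\n"
        "Write a brief, courteous reply."
    )

# Hypothetical usage with illustrative data:
prompt = build_review_reply_prompt(
    "Great service, but are you open on Sundays?",
    {"name": "Main St. branch",
     "hours": "Mon-Sat 9am-6pm, closed Sun",
     "services": "repairs, returns, curbside pickup"},
)
```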

Keeping up or going in circles?

Research is happening at high speed, but public use of LLMs is growing faster. This means future model iterations may be trained on their own, possibly flawed, outputs, a problem of spiraling degradation that needs attention, said Mittelstadt. "Assuming you have hallucinations, or careless speech, in the current iterations of these models, the impact of those things can be greatly amplified over time: they end up eating their own tail."

LLMs' popularity, driven by the rapid uptake of generative AI, took the technology industry by surprise. The models are already in wide circulation, and a perception that they are impressive but make things up may already be baked in; users may instinctively adapt. Nonetheless, new techniques for tackling hallucinations are likely to keep coming thick and fast.

Karen Emslie is a location-independent freelance journalist and essayist.
