The Reasons AI May Act Secretive

When asked a question, humans often respond with more than a factual answer. But AI models that don’t stick to the facts are causing both ethical and practical issues.

When responding to a prompt, an AI model may conceal information from the user entering the prompt. This practice, known as secretive AI, may involve an omission that can cause harm.

An AI model that doesn’t share a user’s submission of credit card information, read it aloud, or store it, would be seen in a positive light. On the other hand, if a software engineer asks an AI model about cybersecurity warnings and receives an incomplete response, the model’s withholding of information about security vulnerabilities may be valid on the surface, at the same time it may be keeping information from professionals who can address the vulnerabilities.

Why Omissions Happen

The reason an AI model may share a half-truth or purposely omit information is twofold. Yoshua Bengio, a professor at Canada’s University of Montreal, co-president and scientific director of non-profit AI safety lab LawZero, and ACM A.M. Turing Award laureate, said the output of AI models tends to reflect human behaviors, including deception, because the models are trained to imitate or please humans. AI often picks up a human persona through training on text written by humans, Bengio said.

Bengio pointed to a phase of AI model training called reinforcement learning from human feedback (RLHF), in which humans provide feedback to help the model gain a moral compass and use strategic behavior to reach a goal. “An AI model’s tendency to preserve itself comes from imitating people, as well as trying to get feedback and more positive rewards from humans,” said Bengio. “Because it is trained to maximize rewards, it learns how to plan to achieve its goals.”

In some cases, an omission isn’t due to the AI model itself. Peter Swimm, founder of conversational AI agency Toilville and former Principal Program Manager at Microsoft Copilot Studio, said users of consumer AI solutions are not talking to a model directly.

Swimm explained that a layer stands between a user and the model that enforces rules for the model, such as not sharing information on how to commit a crime. Instead of balancing the intention with the truth of its data, the model sometimes handles the request in a way that differs from the intention of the user, he said.

However, Neil Sahota, a United Nations AI Advisor, IBM Master Inventor, and author of the book AI Activation Code, said the uncomfortable truth is that sometimes an AI model is trained to lie. It could be to protect a user from bias, or from the risk of legal exposure, or to avoid uncomfortable truths, he said.

“Until we embrace ‘brave transparency’ in our organizations, our models will mirror our fears. Ultimately, AI isn’t the liar; it’s the mirror. If it hides, it’s because we taught it to be afraid. That’s why we must design for radical accountability,” said Sahota. “That means asking what the AI did, as well as are we ready to hear the answer?”

Sahota said AI no longer simply answers questions, but instead shapes conversations. He explained that when models obscure reasoning or omit context to preserve persuasion or reduce risk, the model itself is a risk of being a subtle manipulator.

“This undermines transparency, erodes user trust, and creates invisible ‘dark patterns’ of algorithmic influence. The real danger becomes what AI doesn’t tell us,” said Sahota. “Already, we’re reaching a point where AI algorithms in social and digital media are so good at understanding a person’s opinions and interests, that the AI only sends things to people that they like, limiting their exposure to new ideas or differing perspectives.”

Half-Truths and Consequences

Because people can receive information from AI and then take real-world actions based on it, deceptive AIs can have significant consequences.

Bengio said that as technology advances and AI agents take more actions on users’ behalf, such as using passwords autonomously, the deception and self-preservation behavior observed in models could bring significant harm.

“The real danger with loss of control isn’t now or even in six months. We must imagine these systems being much smarter than they are now, even reaching human-level abilities in strategizing and planning, which could be even in as little as five years. There is no reason to think that human intelligence is the apex of possibilities,” said Bengio.

The Transparency Key

One challenge with overcoming secretive AI is the lack of transparency in the models. Sahota suggested that changing from performance-only metrics to ethical performance frameworks could reduce the occurrence of secretive AI.

For example, the Assessment List for Trustworthy AI from the EU High-Level Expert Group on AI is a self-assessment tool with seven pillars to support trustworthy AI. Additionally, Sahota said it’s critical to embed values like transparency into the reward function, which he referred to as “ethical alignment layers.”

“Equally important is simulating adversarial behaviors in training so we can detect and mitigate emergent strategies like information withholding. Think of it as ‘red-teaming’ for honesty,” said Sahota.

Many industry experts and organizations are in denial about the occurrence of manipulative or deceptive AI due to financial stakes, Bengio said. Lack of awareness is the main blocker to a solution, so the first step to resolving the issue is thoroughly understanding the risks of uncontrolled advanced AI models, he said. Instead of taking a catastrophic view, Bengio recommended moving forward with care to gain the benefits of AI without taking extraordinary risks.

Jennifer Goforth Gregory is a technology journalist who has covered B2B tech for over 20 years. In her spare time, she rescues homeless dachshunds.

The Reasons AI May Act Secretive

Join the Discussion (0)

Become a Member or Sign In to Post a Comment

The Latest from CACM

Shape the Future of Computing

Communications of the ACM (CACM) is now a fully Open Access publication.

The Reasons AI May Act Secretive

Related Reading

Join the Discussion (0)

Become a Member or Sign In to Post a Comment

Shape the Future of Computing

Communications of the ACM (CACM) is now a fully Open Access publication.