
Verification Systems Face an Identity Crisis

AI has researchers and security experts rethinking verification systems.


Determining who is human on the Internet has always been tricky. Bots, crooks, and a motley crew of troublemakers have consistently found ways to outsmart systems.

For decades, the most popular defense tool has been CAPTCHA, which relies on simple puzzles to distinguish humans from machines. However, artificial intelligence (AI) is now rendering these digital gatekeepers obsolete.

The upshot? Researchers and security experts are rethinking verification systems that require a human to manually click on images or line up objects. They’re exploring ways to incorporate AI, behavioral biometrics, cryptographic proofs, and blockchain-based identities into verification.

“This isn’t a new problem; it’s an ongoing battle,” said Grant Ho, an assistant professor of computer science at the University of Chicago. However, AI is rapidly redrawing the digital battle lines. “It’s becoming increasingly difficult to differentiate humans and bots,” he stated.

Bad Images

Verification challenges are rooted in the open nature of the Internet. “There’s no intrinsic way to verify who you are when you connect to the Internet,” said Kevin Gosschalk, founder and CEO of Arkose Labs, a company that sells commercial verification systems.

The idea of establishing a digital checkpoint originated in 1997, when the search engine AltaVista introduced the first Web verification tool. It presented users with blurred text and squiggly letters. A human had to decipher the images and type in the correct data to gain access to the site. A few years later, a research lab at Carnegie Mellon University gave the concept a formal name: CAPTCHA (which stands for Completely Automated Public Turing test to tell Computers and Humans Apart).

Over the years, a variety of puzzles have emerged. Some ask humans to click on similar images such as stairs, dogs, buses, or traffic signals. Others require a user to solve basic math problems (what is 2+5, for example), respond to an audio message, or align objects such as chairs or cars.

The goal? Block automated bots from obtaining free email accounts, posting fake comments, spreading spam, stealing content, and bypassing various controls, such as restrictions on the number of concert tickets a buyer can purchase.

Nevertheless, bad actors have continually found ways to bypass these controls. One of the earliest methods was CAPTCHA farms, staffed by low-wage workers who solve puzzles manually. Over the last decade, artificial intelligence, including machine vision, machine learning, and generative AI, has automated puzzle-solving tasks at scale.

Neither simpler nor more sophisticated tools have solved the problem. For example, some systems ask a user to check a box saying “I’m human” while analyzing mouse movements and typing patterns in the background. “The problem is that machines can now pass a Turing Test. AI can insert data that emulates the natural keyboard and mouse movements of humans,” said Mauro Migliardi, an associate professor at the University of Padua in Italy.
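One signal such background analysis can examine is timing rhythm: human input is irregular, while naive bots emit events at near-constant intervals. The toy sketch below illustrates the idea only; the event format, threshold, and function names are illustrative assumptions, not any vendor's actual detector.

```python
# Toy illustration of one behavioral signal: flagging suspiciously
# uniform inter-keystroke timings. NOT a production bot detector;
# the 0.15 cutoff is an arbitrary assumed threshold.
from statistics import mean, stdev

def looks_scripted(timestamps_ms, cv_threshold=0.15):
    """Return True if inter-event intervals are unusually uniform."""
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if len(intervals) < 2:
        return False  # too little data to judge either way
    # Coefficient of variation: spread of intervals relative to their mean.
    cv = stdev(intervals) / mean(intervals)
    return cv < cv_threshold

bot = [0, 100, 200, 300, 400, 500]      # fires every 100 ms exactly
human = [0, 130, 210, 420, 515, 760]    # jittery, human-like rhythm
```

Real behavioral biometrics combine many such features (mouse curvature, dwell times, scroll dynamics) precisely because, as Migliardi notes, any single signal can now be emulated by AI.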

As a result, it’s now nearly impossible to ascertain who or what is on the other side of any given online interaction. “AI-fueled attacks have become common,” Gosschalk said. “Bad actors are incredibly smart and technically adept. They consistently find ways to reverse-engineer verification systems and gain access through new paths.”

Moreover, complex and inconvenient verification methods often present more roadblocks for humans than for machines. It isn’t unusual for a user to encounter several questions or rounds of puzzles to gain access to a site or service. Switching on a virtual private network (VPN) or using privacy-preserving measures magnifies the problems. “If you want to be privacy sensitive, you clear everything—and then you look like a machine,” Migliardi said.

Not surprisingly, the problems multiply for individuals with language limitations and sensory or cognitive impairments. Many of these systems don’t account for neurodivergent behaviors and slow response times. Even more advanced methods, such as Google’s reCAPTCHA v3 (which generates a risk score based on behavior), can be confusing and difficult for users with disabilities.

The result? CAPTCHAs, while still widely used, are no longer “feasible either technically or in terms of user experience,” observed Ari Juels, a professor of computer science at Cornell University.

Puzzling Behavior

Verification problems aren’t likely to go away in the foreseeable future. Agentic AI and emerging AI browsers, which generate both human and machine output, represent the next battleground.

Agentic AI “will likely become indistinguishable from humans,” Ho said. Generative AI is already introducing new ways to break into accounts, hijack services, and exploit online resources. Security risks include prompt-injection attacks, Denial-of-Service (DoS) attacks, impersonation and identity spoofing, and highly automated social engineering techniques that can trick users into revealing credentials or sensitive data.

As a result, researchers are now exploring ways to incorporate more sophisticated behavioral biometrics, cryptographic verification, and AI-based anomaly detection into verification systems. This includes some mix of digital identity and security technologies, such as anonymized passkeys, hardware security mechanisms, and behavioral analytics, Ho said.

Yet, these systems also face limitations, Juels added. For example, “Receiving a fingerprint scan on a server when a remote user logs in doesn’t guarantee the scan is legitimate or there was any oversight during the scanning process,” he warned. “Additionally, large databases of biometric data are vulnerable to compromise.”

One possible approach is to have users create entirely new machine IDs based on government-issued IDs. Using blockchain technologies, this would allow a system to verify that a user is associated with a real-world identity—even while anonymizing their identity and protecting their privacy. It could thwart users attempting to spawn multiple identities, Juels said.

Mind Games

In 2020, Juels and a group of researchers proposed such a system, CanDID, which incorporates private keys and tokens. “It’s possible to assign users unique identities in privacy protected ways using existing documents like a driver’s license,” Juels explained. “Although this does not ensure that the user’s activities are user-driven rather than AI, it does create a higher level of accountability,” he said.

Another experimental approach is DECO Sandbox, a commercially available tool that also taps zero-knowledge proofs rather than conventional verification methods such as passwords, tokens, or biometrics. DECO verifies identity—such as a person being above a certain age or having a right to access—without revealing any underlying data.
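DECO's construction is beyond the scope of a short sketch, but the core idea of proving knowledge of a secret without revealing it can be illustrated with a classic Schnorr identification protocol. This is a minimal teaching sketch with deliberately tiny toy parameters, not DECO's actual mechanism, and the verifier's challenge is simulated locally rather than exchanged interactively.

```python
# Schnorr identification sketch: prove knowledge of a secret x whose
# public key is y = g^x mod p, without revealing x. The tiny primes
# below are toy values for illustration only -- never use in practice.
import secrets

p, q, g = 23, 11, 4   # p = 2q + 1; g generates the order-q subgroup of Z_p*

def keygen():
    x = secrets.randbelow(q - 1) + 1   # secret key
    return x, pow(g, x, p)             # (secret, public key y = g^x mod p)

def prove(x):
    r = secrets.randbelow(q - 1) + 1   # one-time random nonce
    t = pow(g, r, p)                   # prover's commitment
    c = secrets.randbelow(q)           # verifier's challenge (simulated here)
    s = (r + c * x) % q                # response; r masks x
    return t, c, s

def verify(y, t, c, s):
    # Accept iff g^s == t * y^c (mod p), which holds exactly when
    # s = r + c*x, i.e. the prover really knows x.
    return pow(g, s, p) == (t * pow(y, c, p)) % p

x, y = keygen()
t, c, s = prove(x)
```

The verifier learns that the prover knows x (such as a credential bound to an age or access right) but never sees x itself, which is the property systems like DECO generalize to statements about private data.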

Yet even these systems come with concerns, caveats, and limitations. Using the same credential across sites could gradually create a behavioral fingerprint that traces back to a specific individual, Migliardi noted. This could undermine anonymity while enabling exploitation by malicious bots and AI.

Although researchers and security experts continue to search for more effective ways to verify humans, there’s a growing recognition that eradicating spoofing and fake identities is next-to-impossible. “The reality is that you can train a machine to do anything that a human can do,” Gosschalk concluded. “Rather than determining whether traffic is human, we should focus on identifying and preventing bad behavior.”

Samuel Greengard is an author and journalist based in West Linn, OR, USA.
