
Fooling You, Fooling Me

Generative AI models creating optical illusions are providing insight into human perception.

[Figure: the rotating snake illusion]

In recent years, generative artificial intelligence systems such as Stable Diffusion and DALL-E have demonstrated an impressive ability to create high-quality, original images in a range of styles when prompted with a text description.

As a result, some researchers have been curious to push the limits of the technology by investigating whether such systems can also generate optical illusions, images that appear to be different from reality due to quirks in the way the brain processes visual information. For example, the rotating snake illusion, an arrangement of snake-like circles, seems to spin in different directions even though it is a static image.

“You could say that visual illusions are a type of failure of perception in the human visual system,” said Lana Sinapayen, an associate researcher at Sony Computer Science Laboratories and associate professor at the National Institute for Basic Biology in Japan. “If an AI is failing in the same way, it is quite interesting and important in terms of finding mechanisms that are similar in AI and in the human brain.”

In a study, Sinapayen and Eiji Watanabe, also an associate professor at Japan’s National Institute for Basic Biology, examined whether a generative AI model could create visual illusions that would fool humans. That research built on previous work by Watanabe and his team that showed a deep neural network could perceive the rotating snake illusion in a similar way to humans.  

“The other direction would be [that] the AI creates an illusion and we show the illusion to humans and the humans can also see the illusion,” said Sinapayen.

In their experiment, Sinapayen and Watanabe used an AI system skilled at generating intricate patterns to create candidate illusions that seem to move, similar to the rotating snake image. Each candidate was then fed to a second AI model, trained on video to predict the next frame in a sequence, which evaluated the strength of the illusion; those scores were fed back to the first model so it could improve its candidates, until a final set of illusion images was generated. Two versions of the judging model were used, one trained on a color video and another on the same video in greyscale, to evaluate both color and black-and-white brain tricks. The illusions were then shown to a small group of humans to gauge whether they could perceive them.
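
The overall structure of such a generate-score-feedback loop can be sketched in a few lines of Python. The sketch below is a minimal, hypothetical version only: the pattern generator is a toy radial grating, the scoring function is a simple image statistic standing in for the study's video-prediction judge, and the loop is basic hill climbing rather than the study's actual optimization.

```python
import numpy as np

rng = np.random.default_rng(0)

def render_pattern(params, size=96):
    """Toy generator: a radial grating whose angular frequency, radial
    frequency, and phase come from the parameter vector (a stand-in for
    the study's far more expressive pattern generator)."""
    y, x = np.mgrid[0:size, 0:size] - size / 2
    angle = np.arctan2(y, x)
    radius = np.hypot(x, y)
    freq_a, freq_r, phase = params
    return np.sin(freq_a * angle + freq_r * radius + phase)

def illusion_strength(image):
    """Placeholder judge. The real study scored each image by feeding it
    to a next-frame prediction network and measuring how much motion it
    'predicted'; here we use a trivial statistic so the loop runs."""
    return float(np.abs(np.diff(image, axis=1)).mean())

# Feedback loop: generate a candidate, score it, keep it if it beats the
# current best, and perturb the best to propose the next candidate.
best = rng.uniform(0, 8, size=3)
best_score = illusion_strength(render_pattern(best))
for generation in range(50):
    candidate = best + rng.normal(0, 0.5, size=3)
    score = illusion_strength(render_pattern(candidate))
    if score > best_score:
        best, best_score = candidate, score
print("best parameters:", best, "score:", round(best_score, 4))
```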

The team found the system was able to generate convincing visual illusions. The greyscale illusions it produced were generally more effective than the color ones. In comparison, color illusions designed by humans tended to induce a stronger feeling of illusory motion, perhaps because they were less likely to contain color gradients. The system also produced some failed brain tricks.

Sinapayen was surprised by how strong some of the illusions were. Many were new images exhibiting illusory motion, while some were very similar to illusions humans had discovered decades earlier.

“When people create illusions by themselves, it’s often a bit random because we don’t know why some illusions work or not,” said Sinapayen. “You just have to try to think of something and then modify your image by hand until you’re satisfied, but the AI was able to rediscover some of these illusions by itself.”

Even though there were some discrepancies between the model and human perception, Sinapayen thinks their results provide insight into why humans perceive such illusions. The predictive model they used to judge the strength of illusions had not been trained on any illusions, but simply on a video of Disneyland captured by a head-mounted camera, similar to what a human would see when looking around. The result seems to support the theory that the brain makes sense of what it sees by making predictions based on prior knowledge of similar scenes observed in the past. Those predictions are sometimes wrong; in the case of illusory images, a static image that shares some features of a moving scene creates a perception of motion.
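
To make the judging idea concrete, here is a minimal, self-contained sketch of how a next-frame predictor can score a static image: show it the same frame several times, then measure how far its predicted frame has drifted from the input. The predictor here is a hypothetical stub (it simply shifts the frame one pixel, so the demo reports nonzero motion); the readout uses phase correlation, a classical way of estimating the translation between two images.

```python
import numpy as np

def predict_next_frame(frames):
    """Hypothetical stand-in for the video-prediction network; the real
    judge was a predictive model trained on first-person video. This stub
    returns the last frame shifted one pixel to the right."""
    return np.roll(frames[-1], shift=1, axis=1)

def estimated_shift(a, b):
    """Estimate the translation between two frames by phase correlation."""
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    corr = np.fft.ifft2(cross / (np.abs(cross) + 1e-9)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map wrapped peak positions back into the range [-size/2, size/2).
    return tuple(p - s if p > s // 2 else p for p, s in zip(peak, corr.shape))

# A static image presented as a short 'video': every frame is identical.
rng = np.random.default_rng(1)
static = rng.random((64, 64))
clip = [static] * 4

# If the predictor expects motion where there is none, its predicted next
# frame is displaced relative to the input; that displacement is a simple
# illusion-strength readout.
predicted = predict_next_frame(clip)
print("motion read out of a static image (dy, dx):",
      estimated_shift(static, predicted))
```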

Another team has been trying to create illusions that involve ambiguous images using AI. Andrew Owens, an associate professor of computer science at Cornell Tech in New York, and his team have been taking an approach that harnesses the properties of diffusion-based generative models, which are state of the art for image generation.    

“I think that illusions are also an interesting way to study compositional generation, and to understand how to create new machine learning methods for generating images that satisfy multiple different constraints that may contradict each other,” said Owens.

Diffusion models start with random noise and iteratively remove it, step by step, to arrive at a finished image. In recent work, Owens and his colleagues tested the limits of this process by combining the noise estimates produced for multiple text prompts, creating images that change appearance when flipped, rotated, or rearranged like a jigsaw puzzle, for example.
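
The mechanism can be sketched compactly. Below is a minimal, hypothetical version in Python/NumPy: the noise-predicting model is replaced by a toy stub, and a simplified update rule stands in for a real diffusion sampler, but the structure, computing a noise estimate for each transformed view under its own prompt, mapping each estimate back with the inverse transform, and averaging, follows the approach described above. The prompts and functions are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)

def noise_estimate(x, prompt):
    """Hypothetical stand-in for a diffusion model's text-conditioned
    noise prediction; a real implementation would call a pretrained
    text-to-image model here."""
    return 0.1 * x + 0.01 * len(prompt)  # toy stand-in, not a real model

# Two 'views' of the same canvas: the identity, and a 180-degree rotation
# (which is its own inverse). Each view gets its own prompt, so the
# finished image should read differently right-side up and upside down.
rot180 = lambda im: np.rot90(im, 2)
views = [
    (lambda im: im, lambda im: im, "an oil painting of an old man"),
    (rot180, rot180, "an oil painting of a mountain village"),
]

# Reverse-diffusion sketch: start from noise and repeatedly subtract a
# combined noise estimate. Each view's estimate is computed in that view's
# frame, mapped back, and averaged, so one image must satisfy both
# prompts at once.
x = rng.normal(size=(64, 64))
for step in range(50):
    estimates = [inv(noise_estimate(t(x), prompt)) for t, inv, prompt in views]
    x = x - np.mean(estimates, axis=0)  # toy update; real samplers differ
print("final canvas statistics:", round(x.mean(), 4), round(x.std(), 4))
```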

Owens was astonished by how many different types of these illusions they could produce using a pre-trained model with no additional training. “We were really surprised by just how far you could push this idea and by how general purpose these images were,” said Owens.

So far, though, AI models have typically only been able to create new examples of known types of visual illusions. Alexandra Gomez-Villa, a post-doctoral researcher at Spain’s Autonomous University of Barcelona, and her colleagues recently showed that a diffusion model perceives certain color and brightness illusions in a similar way to humans by examining its latent space, the compressed internal representation from which it generates images. Using this knowledge, they then prompted the model to generate new images that illustrate how the brain is tricked by certain combinations of color and brightness.

Coming up with entirely new illusions, however, is more difficult.

“The tricky part was to discover the initial phenomenon,” said Gomez-Villa, “so really learning new rules of visual perception.”

Gomez-Villa said she is striving to use AI models to gain new insight into how the human visual system works. She thinks new developments, such as using a more efficient technique called flow matching to train generative AI, could lead to more powerful models that can better replicate visual perception, and perhaps discover completely novel brain tricks.  

Optical illusions are also considered a window into the mental shortcuts the brain takes to perceive its surroundings more quickly. A brain that is always on the lookout for motion cues, for example, can rapidly spot a dangerous animal lurking in a busy visual environment, even if that means occasionally mistaking something static for something moving.

If AI comes up with previously unknown brain tricks, it could also help improve current models, for example by making them more efficient.

“If we understand why these phenomena are happening and how visual information is processed, we can put that optimality in models,” said Gomez-Villa. “For me, the ultimate goal is trying to replicate that.”

Sandrine Ceurstemont is a freelance science writer based in London, U.K.
