Augmented Reality Gets Real

Formidable optical challenges are yielding to intensive research, development.
AR projection of Princess Leia

“This is our most desperate hour,” said the flickering blue image. “Help me, Obi-Wan Kenobi. You’re my only hope.” In 1977, in a classic moment in cinematic history, the Star Wars movie epic gave the public a preview of augmented reality (AR).

But it was only a preview. The three-dimensional (3D) hologram of Princess Leia standing on a real table had been simulated by special effects artists at Lucasfilm, but it was easy to imagine a day when consumers could bring actual 3D virtual images seamlessly into their physical environments and interact with them in real time. That is now possible in a variety of consumer, medical, and industrial applications, generally through the use of head-mounted displays (HMDs).

AR, sometimes loosely called “mixed reality,” combines virtual reality (VR) with the physical world. A recent application, offered by Schell Games, uses technology from Disney and Lenovo to bring Darth Vader into your living room, your actual living room. In the game Jedi Challenges, users with a smartphone-enabled headset, a lightsaber controller, and a tracking beacon can engage the life-sized movie villain in a lightsaber battle.

The AR game has received enthusiastic reviews for its realism, yet no one would for an instant think the fallen Jedi Knight is actually standing between the sofa and the coffee table. Darth Vader has the same unreal blue hue as the Princess Leia avatar; he is semi-transparent, pixelated, subject to ghosting, and lacking in detail.

Yet the market for AR headsets is potentially huge, and the optical challenges in AR are the subject of intense research by companies and universities around the world. MarketWatch, published by Dow Jones, forecasts the AR market will grow at a compound annual rate of 75% to reach $50 billion by 2024, just five years from now. Most analysts look for the greatest growth in retail, automotive, and medical applications, although some predict eventually consumer AR devices will replace all conventional displays on laptops and smartphones.

Figure. Battling a virtual Kylo Ren in the augmented reality game Star Wars: Jedi Challenges.

The optical challenges when combining VR and AR are complex, and they have yielded ground only grudgingly. Depending on the application and the technology for it, images seen by users are often primitive: blurred, low in resolution, slow to refresh, subject to a narrow field of view, or generally unrealistic in look and behavior. Moreover, user gear can be expensive, aesthetically unpleasing, and uncomfortable to use, especially for a prolonged period of time.

There are myriad technical approaches to each of these problems, but they tend to solve one problem by creating others. A solution that is adequate for a living room game may be a complete failure in an operating room or on a construction site.

Problems, Trade-Offs

Martin Banks, a professor of optometry and vision science at the University of California, Berkeley, and director of the Center for Innovation in Vision and Optics, says the problems with the optical components in AR systems fall into two broad categories: how good the image is as seen by the eye, and how well the “vergence-accommodation conflict” is addressed.

The optical quality issue presents classic trade-offs. Diffractive lenses, often used in AR systems because they are small and light, suffer from quality issues—such as blur and color fringing—compared with larger, heavier refractive lenses. “To get good image quality, you generally need a refractive lens, and there goes your form-factor improvement,” Banks says.

Optical quality also depends on another difficult trade-off: the quest for a realistically wide field of view. “You want a large field of view, but once you spread out a fixed number of pixels, you can see them,” Banks says. Similarly, although the human eye can detect very fine detail (up to 50 alternating light and dark stripes per degree of vision), HMDs today typically can deliver just seven to 12 lines per degree, a limitation of the lens, the display, and the computer power available to process images.
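The field-of-view trade-off Banks describes comes down to simple arithmetic, sketched below. The function names and the 1,440-pixel, 100-degree headset are hypothetical illustrations rather than figures from any product, and the sketch assumes pixels are spread evenly across the view and that one light/dark stripe pair needs at least two pixels.

```python
def pixels_per_degree(horizontal_pixels: int, fov_degrees: float) -> float:
    """Average angular pixel density, assuming pixels are spread evenly across the view."""
    if fov_degrees <= 0:
        raise ValueError("field of view must be positive")
    return horizontal_pixels / fov_degrees


def stripe_pairs_per_degree(ppd: float) -> float:
    """Finest light/dark stripe pairs per degree, assuming two pixels per pair."""
    return ppd / 2.0


def pixels_needed(stripe_pairs_per_deg: float, fov_degrees: float) -> float:
    """Horizontal pixels needed to show a given stripe density across the whole view."""
    return stripe_pairs_per_deg * 2.0 * fov_degrees


# Hypothetical headset: 1,440 horizontal pixels spread over a 100-degree field of view.
ppd = pixels_per_degree(1440, 100.0)
print(ppd, stripe_pairs_per_degree(ppd))

# Widening the view to 140 degrees with the same pixel count lowers the density.
print(pixels_per_degree(1440, 140.0))
```

Under these assumptions, widening the field of view without adding pixels always lowers the angular density, which is why individual pixels become visible.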

The other major problem, the vergence-accommodation conflict, is even more difficult. The eyes converge (turn inward) to see objects that are near, but diverge to see far-away objects. That alignment of the two eyes is known as “vergence,” and when it fails, the viewer sees double images. A separate system, called “accommodation,” changes the shape of the lenses in the eyes so a person sees images in sharp focus. “The brain has a circuit that links these two systems so that you nearly always converge and accommodate to the same distance,” Banks says.

However, AR systems often present images for which those distances are not the same. The efforts of the eyes and brain to deal with that conflict can cause eye discomfort, and even nausea. Some users can’t cope at all, and just see doubled or blurry images. The conflict is the subject of intense research at companies worldwide, at Berkeley and Stanford in the U.S., and at universities in China, Korea, Canada, England, and Holland, Banks says.
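One common way to quantify the conflict is in diopters (the reciprocal of distance in meters), comparing the distance at which the eyes converge with the distance at which the display's optics are focused. The sketch below assumes that convention; the function names, the 63 mm interpupillary distance, and the example distances are illustrative assumptions, not values from the article.

```python
import math


def vergence_angle_deg(distance_m: float, ipd_m: float = 0.063) -> float:
    """Angle between the two eyes' lines of sight when fixating at distance_m.

    ipd_m is an assumed interpupillary distance, used only for illustration.
    """
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))


def diopters(distance_m: float) -> float:
    """Optical distance expressed in diopters (1 / meters)."""
    return 1.0 / distance_m


def conflict_diopters(vergence_distance_m: float, focal_distance_m: float) -> float:
    """Mismatch between where the eyes converge and where the image is focused."""
    return abs(diopters(vergence_distance_m) - diopters(focal_distance_m))


# Hypothetical case: a virtual object drawn at 0.5 m on a display focused at 2.0 m.
print(vergence_angle_deg(0.5))      # eyes turn inward for the near object
print(conflict_diopters(0.5, 2.0))  # 1.5 diopters of mismatch
print(conflict_diopters(2.0, 2.0))  # 0.0 when both distances agree
```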

There are currently two ways to address the vergence-accommodation problem, and they apply equally to virtual reality and augmented reality systems. In “varifocal” devices, a focus-adjustable lens sits in front of each eye; an eye-tracker estimates where the user is looking, and the system adjusts the focus of the lens to make vergence and accommodation consistent with each other. Switzerland’s Optotune is a leading maker of such “focus-tunable lenses.”

The varifocal technique works, Banks says, but it adds cost, weight, complexity, and power consumption. A related approach is found in multi-focal AR systems. With varifocal lenses, the focal length can change continuously, but in multi-focal systems, digital circuits choose between two fixed focal planes.

The need to make wearable computers and their power supplies small and lightweight constrains the amount of compute power available for high-quality, real-time rendering of 3D images. A possible solution is a new technique called foveated rendering or gaze-contingent foveation, which uses an eye-tracker to put the best image quality only in front of the eye’s fovea, where visual acuity is greatest. “If we can track your eye, not your head, we can tell what you are looking at,” says Tom Corbett, an instructor in entertainment technology at Carnegie Mellon University. “Then we can render in high resolution only what you are looking at, and all else we can do at a slower frame rate.”
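The idea behind foveated rendering can be sketched as a function that maps the angular distance between a screen region and the tracked gaze point to a rendering quality. This is a minimal illustration, not any vendor's implementation: the function names, the 5- and 15-degree radii, and the scale factors are hypothetical.

```python
import math


def eccentricity_deg(gaze: tuple[float, float], tile: tuple[float, float]) -> float:
    """Angular distance between the gaze point and a tile center, both in degrees."""
    return math.hypot(tile[0] - gaze[0], tile[1] - gaze[1])


def resolution_scale(
    gaze: tuple[float, float],
    tile: tuple[float, float],
    foveal_radius_deg: float = 5.0,   # assumed size of the full-quality region
    mid_radius_deg: float = 15.0,     # assumed size of the medium-quality ring
) -> float:
    """Fraction of full resolution to render for this tile (illustrative tiers)."""
    ecc = eccentricity_deg(gaze, tile)
    if ecc <= foveal_radius_deg:
        return 1.0    # where the user is looking: full resolution
    if ecc <= mid_radius_deg:
        return 0.5    # near periphery: half resolution
    return 0.25       # far periphery: quarter resolution


gaze_point = (0.0, 0.0)
for tile_center in [(3.0, 4.0), (10.0, 0.0), (30.0, 0.0)]:
    print(tile_center, resolution_scale(gaze_point, tile_center))
```

The same tiers could just as well select a lower update rate for peripheral tiles, in the spirit of Corbett's remark.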

How They Work

Wearable devices for consumer AR typically employ one of two basic designs, each of which has variants. The first, the “curved mirror,” uses a semi-reflective, semi-transparent concave mirror placed in front of the eyes and connected to an off-axis projection system. An example is the Lenovo-Disney Jedi Challenges headset. Distortion and ghosting can be a problem with this method, requiring optical or electronic correction that adds weight and cost to the system, while reducing resolution.

The second basic approach, “waveguides,” uses diffractive, reflective, holographic, or polarizing optics to guide uni-directional waves of light from a side-mounted source to create an image in front of the eyes. Waveguide technology is used by a number of companies, including Microsoft in its HoloLens 2 HMD, Magic Leap in its headset, and Akonia Holographics (purchased last year by Apple). Waveguide-based HMDs are smaller and lighter, and offer good optical quality, but they are (so far) more expensive and offer more limited fields of view than curved mirrors.

Jesse Schell, CEO of Schell Games and a professor at Carnegie Mellon’s Entertainment Technology Center, predicts waveguides and curved mirrors will ultimately be dominated by a third approach which, so far, “no one likes to talk about.” Instead of looking at the real world through a piece of transparent glass, users wear a headset with built-in cameras that show them video of the real world blended with digital virtual images. One drawback of that method is the time lag between when a camera captures an image and when it is presented to the eye, Schell says. This latency, often about 100 milliseconds, can cause nausea as the brain struggles to adjust. Solving the problem will require low-latency cameras driven by greater computer power for faster image refresh.

“With this video pass-through method, you don’t have the struggle with all the tricky optics, there is no field of view problem, no brightness problem, and the image quality will get better as low-latency cameras appear,” Schell says. “It will become a strikingly good experience, and very affordable.”

Schell predicts video pass-through will capture 80% of the market for industrial and military applications within five years.

Solving the Vergence-Accommodation Problem

Magic Leap uses a multi-focal approach to the vergence-accommodation conflict, in which the system dynamically chooses between two focal planes, selecting whichever reduces the conflict more. That ensures the user does not see a conflict that exceeds a small “mismatch budget,” says Michael Klug, vice president for advanced photonics. He says the company has built HMDs that can shrink that budget to an arbitrarily small amount by continuously sweeping across many focal planes. However, Klug says such devices at present are too large, too complex, and use too much computer power.
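The selection step in a multi-focal system can be sketched as picking the fixed plane whose focal distance, measured in diopters, lies closest to the distance of the virtual object. This is only an illustration of the general idea, not Magic Leap's method: the function names, the plane distances of 0.5 m and 2.0 m, and the budget value are all hypothetical.

```python
def mismatch_diopters(target_distance_m: float, plane_distance_m: float) -> float:
    """Focus mismatch between a virtual object and a focal plane, in diopters."""
    return abs(1.0 / target_distance_m - 1.0 / plane_distance_m)


def choose_focal_plane(target_distance_m: float, plane_distances_m: tuple[float, ...]) -> float:
    """Return the fixed focal plane that leaves the smallest mismatch."""
    return min(plane_distances_m, key=lambda p: mismatch_diopters(target_distance_m, p))


def within_budget(target_distance_m: float, plane_distance_m: float, budget_diopters: float) -> bool:
    """Check the remaining mismatch against an assumed mismatch budget."""
    return mismatch_diopters(target_distance_m, plane_distance_m) <= budget_diopters


PLANES_M = (0.5, 2.0)  # hypothetical near and far focal planes

for target in (0.6, 3.0):
    plane = choose_focal_plane(target, PLANES_M)
    print(target, plane, within_budget(target, plane, budget_diopters=0.5))
```

Adding more planes to the tuple shrinks the worst-case mismatch, which mirrors the many-plane devices Klug describes, at the hardware cost he notes.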

Banks at UC Berkeley says the “holy grail” solution to the vergence-accommodation conflict is a complicated and experimental technology called “light field.” A pixel in most displays today has two dimensions: a vertical position and a horizontal position. A light-field display adds two more (vertical and horizontal direction), so vectors of light are sent to the eye. Banks says some companies today incorrectly use the term “light field” to describe their products, but they don’t in fact “create a reasonable approximation to the light field we experience in everyday life.”

Ronald Azuma, Augmented Reality Team Leader at Intel Labs, declined to be interviewed for this article but provided a video of a presentation to an industry group last year in which he said that Intel had shown experimentally that workable light-field displays are possible, but are not yet commercially practical.

Thanks to robust research efforts, Azuma predicts progress will be made on that and on many other fronts in AR. In fact, he predicts consumer AR will become “as ubiquitous and invaluable as smartphones.”

In the longer term, Azuma says, “AR could become the dominant platform and interface, moving us away from fixating on display screens.”

Communications of the ACM (CACM) is now a fully Open Access publication.

By opening CACM to the world, we hope to increase engagement among the broader computer science community and encourage non-members to discover the rich resources ACM has to offer.

Learn More