
The Eyes Have It

Biometric interfaces continue to advance, and will leap beyond smartphones; homes, smart TVs, and even automobiles could incorporate gaze and gesture controls.

One of the things that makes biometrics so powerful is that it links humans and machines in a natural way. Instead of tapping out cumbersome passwords or PINs, a physical characteristic such as a face scan or fingerprint becomes the method of authentication. It's convenient, fast, and typically secure.

Yet biometrics is emerging as more than an authentication tool. Researchers are exploring ways to build it into device interfaces, particularly on smartphones. The ability to read eye movements, facial expressions, and other physical characteristics may lead to significant advances in interfaces in the coming years.

"It's possible to greatly enhance the way we interact with devices through biometrics," says Karan Ahuja, a doctoral student at Carnegie Melon University (CMU) and  volunteer co-editor-in-chief of XRDS: Crossroads, the ACM magazine for students. "Embedding the technology into user interfaces could make many actions more natural and instinctive."

Ahuja and a team of researchers at CMU have developed a gaze-tracking tool called EyeMU that allows users to control their devices without having to lift, or apply, a finger. Meanwhile, another group in Japan has developed a system called ReflecTouch that reads light reflected from the user's pupils and adjusts the interface automatically.

"The integration of biometrics into experiences isn't just about improving interfaces, but human experiences in general," says Eugenio Santiago, senior vice president of user research at Brooklyn, NY-based digital design and consulting firm Key Lime Interactive. "It introduces an opportunity for greater personalization and contextual relevance."

Hands Off

Despite the powerful capabilities of today's smartphones, the functionality and usability of these devices remain somewhat stunted. For example, voice features like Siri are of no help when a person is scrolling through photos or text messages; there is no way to avoid swiping and flicking each image or item to reach the next one.

Larger devices and increasingly complex apps have also made it difficult to manage tasks—especially if a person has only one hand available. "There's a need to provide a user interface that is easy to use in a variety of situations. Biometrics can serve as a trigger and the sensors built into phones can capture the necessary information," says Xiang Zhang, a graduate student and researcher at Keio University in Yokohama, Japan.

The ReflecTouch system that Zhang and fellow researchers developed uses the selfie camera on a standard iPhone to detect light reflecting from a user's pupils. As the data flows through a machine learning algorithm, the phone automatically adjusts the interface to accommodate the task at hand. Depending on how a user is holding the phone in relation to his or her eyes, buttons and alignment may change.

"Our method…uses a combination of classical computer vision techniques and machine learning, and then detects grasp posture using a CNN (convolutional neural network)," Zhang says.

The EyeMU system combines gaze control and hand gestures to simplify navigation. It relies on various sensors built into the phone, along with eye tracking, to anticipate what a user wants to do. For instance, if the camera detects a person gazing at a notification or alert for a few seconds, a hand gesture—in this case a flick to the right—snoozes it, while a flick to the left dismisses it.

"The goal is to make it possible to navigate through actions in an intuitive way," Ahuja says. "Using gaze along with other motions, it's possible to review emails and text messages, news, photos, and much more."

A Vision and a View

While biometric interfaces aren't quite ready for prime time, they are inching closer to commercial reality. The CMU group, for example, reported achieving gesture classification accuracy of 97.3%, while the group from Keio University, Tokyo University of Technology, and Yahoo Japan Corporation achieved 85% accuracy.

Ultimately, an accuracy level above 99% is needed, Ahuja says. "The user experience has to be fluid and natural. People have very little tolerance for errors." On a technical level, there's a need for further advances in cameras—including depth capabilities—and improvements in processing speed and algorithms. "Any system must work across dramatically different situations, including lighting," he says.

Not surprisingly, there are some concerns about biometric interfaces, many of which revolve around security and privacy, and how the technology could be used and abused by advertisers and others. "If you are interested in understanding how people react to an experience or an interface, biometrics certainly can provide a window into that," Santiago says.

Nevertheless, biometric interfaces continue to advance, and they will leap beyond smartphones. Homes, smart TVs, and even automobiles—some of which already detect when a driver is dozing off—could incorporate gaze and gesture controls. "We can personalize things today, but 'work' is required for us to set it up," Santiago says. "Biometrics opens the possibility for personalization to 'just happen.'"

Samuel Greengard is an author and journalist based in West Linn, OR, USA.


The Eyes Have It

Eye-tracking control for mobile phones might lead to a new era of context-aware user interfaces.
Figure: "People tend not to point with their eyes," notes Roel Vertegaal, associate professor of human-computer interaction at Queen's University, who studies eye communication.


Human-computer interfaces (HCIs) controlled by the eyes are not a novel concept. Research and development of systems that enable people incapable of operating keyboard- or mouse-based interfaces to use their eyes to control devices goes back at least to the 1970s. However, mass adoption of such interfaces has thus far been neither necessary nor pursued by system designers with any particular ardor.

“I’m a little bit of a naysayer in that I think it will be very, very hard to design a general-purpose input that’s superior to the traditional keyboard,” says Michael Holmes, associate director for insight and research at the Center for Media Design at Ball State University. “When it comes to text, it’s hard to beat the speed of a keyboard.”

However, one class of user interfaces (UIs) in particular has proven to be problematic in the creation of comfortably sized keyboards—the interfaces on mobile phones, which are becoming increasingly capable computational platforms as well as communications devices. It might stand to reason that eye-based UIs on mobile phones could provide users with more options for controlling their phones’ applications. In fact, Dartmouth College researchers led by computer science professor Andrew Campbell recently demonstrated with their EyePhone project that they could modify existing general-purpose HCI algorithms to operate a Nokia N810 smartphone using only the device’s front-facing camera and computational resources. The new algorithms’ accuracy rates, however, also demonstrated that the science behind eye-based mobile control needs more refinement before it is ready for mass consumption.

Eyes As an Input Device

Perhaps the primary scientific barrier to reaching a consensus approach to eye-controlled mobile interfaces is that designing such an interface runs counter to the natural purpose of the eye, according to Roel Vertegaal, associate professor of human-computer interaction at Queen’s University.

“One of the caveats with eye tracking is the notion you can point at something,” Vertegaal says. “We didn’t really like that. People tend not to point with their eyes. The eyes are an input device and not an output device for the body.”

Vertegaal says this basic incompatibility between the eyes’ intended function and the demands of using them as output controllers presents issues including the Midas Touch, postulated by Rob Jacob in a seminal 1991 paper entitled “The Use of Eye Movements in Human-Computer Interaction Techniques: What You Look At is What You Get.”

“At first, it is empowering to be able simply to look at what you want and have it happen, rather than having to look at it (as you would anyway) and then point and click it with the mouse or otherwise issue a command,” Jacob wrote. “Before long, though, it becomes like the Midas Touch. Everywhere you look, another command is activated; you cannot look anywhere without issuing a command.”

Another issue caused by trying to make the eyes perform a task for which they are ill-suited is the lack of consensus on how best to approach designing an eye-controlled interface. For example, one of the most salient principles of mainstream UI design, Fitts’s Law, essentially states that the time to move a hand toward a target is affected by both the distance to the target and the size of the target. However, Fitts’s Law has not won universal acceptance among eye-tracking researchers; many contend the natural accuracy limitations of the eye in pointing to a small object, such as a coordinate on a screen, limit its applicability. This lack of consensus on the scientific foundation of eye control has led to disagreement on how best to approach discrete eye control of a phone. The Dartmouth researchers, for example, used blinks to control the phone in their experiment. However, Munich-based researcher Heiko Drewes found that designing a phone that follows gaze gestures—learned patterns of eye movement that trigger specific applications, rather than blinks—resulted in more accurate responses from the phone.
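
For reference, the article states Fitts’s Law only informally; the Shannon formulation standard in the HCI literature is

    MT = a + b log2(D/W + 1)

where MT is the movement time, D the distance to the target, W the target’s width, and a and b constants fitted empirically for a given input device; smaller or more distant targets take longer to acquire.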

“I tried the triggering of commands by blinking, but after several hundred blinks my eye got nervous—I had the feeling of tremor in my eyelid,” Drewes says. “I did no study on blinking, but in my personal opinion I am very skeptical that blinking is an option for frequent input. Blinking might be suitable for occasional input like accepting an incoming phone call.”

However, Drewes believes even gaze gestures will not provide sufficient motivation for mass adoption of eye-controlled mobile phones. “The property of remote control and contact-free input does not bring advantage for a device I hold in my hands,” he says. “For these reasons I am skeptical regarding the use of gaze gestures for mobile phones.

“In contrast, I see some chances for controlling a TV set by gaze gestures. In this case the display is in a distance that requires remote control. In addition, the display is big enough that the display corners provide helping points for large-scaled gesture, which are separable from natural eye movements.”
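
Gaze gestures of the kind Drewes describes are typically recognized by quantizing successive gaze positions into coarse strokes and matching the resulting sequence against known patterns. The sketch below illustrates that general idea only; it is not Drewes’ recognizer, and the movement threshold and gesture-to-command mappings are invented for the example.

    # Minimal sketch (not Drewes' recognizer): quantize gaze samples into
    # compass-direction strokes and match the stroke string against known
    # gaze-gesture patterns. Thresholds and mappings are assumptions.
    import math

    MIN_STROKE_PX = 80                                       # ignore small eye movements
    GESTURES = {"RDLU": "open_mail", "LRLR": "accept_call"}  # hypothetical mappings

    def to_strokes(points):
        """Turn gaze samples [(x, y), ...] into a string of 'U', 'D', 'L', 'R' strokes."""
        strokes = []
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            dx, dy = x1 - x0, y1 - y0
            if math.hypot(dx, dy) < MIN_STROKE_PX:
                continue
            if abs(dx) >= abs(dy):
                stroke = "R" if dx > 0 else "L"
            else:
                stroke = "D" if dy > 0 else "U"              # screen y grows downward
            if not strokes or strokes[-1] != stroke:         # collapse repeated strokes
                strokes.append(stroke)
        return "".join(strokes)

    def recognize(points):
        """Return the mapped command if the strokes match a known gesture, else None."""
        return GESTURES.get(to_strokes(points))

    # A clockwise sweep of the eyes (right, down, left, up) triggers "open_mail".
    print(recognize([(0, 0), (200, 0), (200, 200), (0, 200), (0, 0)]))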

Steady Progress

Vertegaal believes the most profound accomplishment of the Dartmouth EyePhone work may be in the researchers’ demonstration of a mobile phone’s self-contained imaging and processing power in multiple realistic environments, instead of conducting experiments on a phone tethered to a desktop in a static lab setting. Dartmouth’s Campbell concurs to a large degree.

“We did something extremely simple,” Campbell says. “We just connected an existing body of work to an extremely popular device, and kind of answered the question of what do we have to do to take these algorithms and make them work in a mobile environment. We also connected the work to an application. Therefore, it was quite a simple demonstration of the idea.”

Specifically, Campbell’s group used eye-tracking and eye-detection algorithms originally developed for desktop machines and USB cameras. In detecting the eye, the original algorithm produced a number of false positive results for eye contours, due to the slight movement of the phone in the user’s hand; interestingly, the false positives, all of which were based on coordinates significantly smaller than true eye contours, seemed to closely follow the contours of the user’s face. To overcome these false positives, the Dartmouth researchers created a filtering algorithm that identified the likely size of a legitimate eye contour. The new eye-detection algorithm achieved accuracy rates ranging from 60% when a user was walking in daylight to 99% when the phone was held steady in daylight. The blink-detection algorithm’s accuracy rate ranged from 67% to 84% in daylight.
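
The filtering step described above amounts to rejecting candidate eye regions whose size is implausible. As a rough sketch of that idea only (not the EyePhone code), one could run an off-the-shelf eye detector on each camera frame and keep just the candidates whose bounding boxes fall within an expected size range; the pixel thresholds here are illustrative assumptions.

    # Rough sketch of size-based false-positive filtering (not the EyePhone
    # implementation): detect candidate eye regions with a stock Haar cascade
    # and discard boxes outside a plausible size range.
    import cv2

    MIN_EYE_PX, MAX_EYE_PX = 40, 120   # assumed plausible eye-box widths, in pixels

    eye_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_eye.xml")

    def detect_eyes(frame_bgr):
        """Return candidate eye boxes (x, y, w, h) whose size looks legitimate."""
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        candidates = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        # Reject contours far smaller (or larger) than a true eye would appear
        # at arm's length from the front-facing camera.
        return [(x, y, w, h) for (x, y, w, h) in candidates
                if MIN_EYE_PX <= w <= MAX_EYE_PX]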

Campbell believes the steady progress in increasing camera resolution and processing capabilities on mobile phones will lead to more accuracy over time. “Things like changes in lighting and movement really destroy some of these existing algorithms,” he says. “Solving some of these context problems will allow these ideas to mature, and somebody’s going to come along with a really smart idea for it.”


However, veterans of eye-tracking research do not foresee a wave of eyes-only mobile device control anytime soon, even with improved algorithms. Instead, the eye-tracking capabilities on mobile devices might become part and parcel of a more context-aware network infrastructure. A phone with eye-gaze context awareness might be able to discern things such as the presence of multiple pairs of eyes watching its screen and provide a way to notify the legitimate user of others reading over his or her shoulder. An e-commerce application might link a user’s gaze toward an LED-enabled store window display to a URL of more information about a product or coupon for it on the phone. Campbell says one possible use for such a phone might be in a car, such as a dash-mounted phone that could detect the closing of a drowsy driver’s eyes.

Ball State’s Holmes says such multimodal concepts are far more realistic than an either/or eye-based input future. “Think about how long people have talked about voice control of computers,” he says. “While the technology has gotten better, context is key. In an open office, you don’t want to hear everybody talk to their computer. Voice command is useful for things like advancing slide show, but for the most part voice control is a special tool. And while I can see similar situations for eye gaze control, the notion that any one of these alternative input devices will sweep away the rest isn’t going to happen. On the other hand, what is exciting is we are moving into a broader range of alternatives, and the quality of those alternatives is improving, so we have more choices.”



 Further Reading

Drewes, H.
Eye Gaze Tracking for Human Computer Interaction. Dissertation, Ludwig-Maximilians-Universität, Munich, Germany, 2010.

Jacob, R. J. K.
The use of eye movements in human-computer interaction techniques: what you look at is what you get. ACM Transactions on Information Systems 9, 2, April 1991.

Majaranta, P. and Räihä, K.-J.
Twenty years of eye typing: systems and design issues. Proceedings of the 2002 Symposium on Eye Tracking Research & Applications, New Orleans, LA, March 25–27, 2002.

Miluzzo, E., Wang, T. and Campbell, A. T.
EyePhone: activating mobile phones with your eyes. Proceedings of the Second ACM SIGCOMM Workshop on Networking, Systems, and Applications on Mobile Handhelds, New Delhi, India, August 30, 2010.

Smith, J. D., Vertegaal, R., and Sohn, C.
ViewPointer: lightweight calibration-free eye tracking for ubiquitous handsfree deixis. Proceedings of the 18th Annual ACM Symposium on User Interface Software and Technology, Seattle, WA, Oct. 23–26, 2005.

