
Can You Hear Me Now?

The software uses artificial intelligence to zoom in on the speech of a single person in a crowded room.

Researchers at Columbia University have developed prototype hearing aid software that uses artificial intelligence (AI) to zoom in on the speech of a single person in a crowded room.

Essentially, the breakthrough enables the AI system and the human brain to work together to isolate speech in a room, and then automatically amplify what's being said for easier listening.

"It almost sounds like science fiction," says Mario A. Svirsky, a professor at New York University who specializes in hearing science. "What if a hearing aid was able to read your mind while you're trying to follow a conversation at a noisy cocktail party, determine the talker you are paying attention to, and then selectively amplify that talker's voice?"

In fact, this cocktail party effect is a common problem that has long plagued most conventional hearing aids. The technology traditionally used in such products tends to amplify all sounds in a crowded room, producing a sound image blurred by noise. That kind of noisy amplification makes it especially difficult for a hearing aid user trying to engage in conversation with a nearby person.

Fortunately, the Columbia researchers were able to overcome the problem by leveraging an intriguing scientific observation by researchers at Columbia's Zuckerman Institute: when two people talk with one another, their brain waves begin to resemble one another. Seizing on that insight, they developed AI software that searches a crowd for the person whose brain waves most closely resemble those of the listener. Once the match is made, the AI automatically amplifies that person's voice, so the hearing aid wearer can follow the conversation much more easily.
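
As a rough illustration of that selection step, the sketch below shows how a speech envelope decoded from the listener's brain signals might be compared against each separated voice, with the best match then amplified. The function and variable names, the simple Pearson-correlation score, and the fixed gain are illustrative assumptions, not the Columbia team's actual implementation.

```python
import numpy as np

def select_and_amplify(decoded_envelope, speaker_envelopes, speaker_audio, gain=4.0):
    """Pick the speaker whose speech envelope best matches the envelope
    decoded from the listener's brain signals, then boost that voice.

    decoded_envelope  : 1-D array, speech envelope reconstructed from neural data
    speaker_envelopes : list of 1-D arrays, envelope of each separated voice
    speaker_audio     : list of 1-D arrays, waveform of each separated voice
    """
    # Score each candidate by how well its envelope correlates with the
    # envelope decoded from the listener's brain activity.
    scores = [np.corrcoef(decoded_envelope, env)[0, 1] for env in speaker_envelopes]
    attended = int(np.argmax(scores))  # the voice the listener appears to be attending to

    # Re-mix the scene: boost the attended voice, keep the others at their original level.
    background = sum(a for i, a in enumerate(speaker_audio) if i != attended)
    return gain * speaker_audio[attended] + background, attended
```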

Under the hood, the researchers used "PyTorch, the deep learning platform, together with Nvidia graphics cards for model training," says Yi Luo, a member of the research team. "The best reported model took around two days for training on a single Nvidia Titan Xp graphics card," Luo adds.
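
For context, a compact training loop of the kind Luo describes might look like the following. This is a generic PyTorch sketch with a placeholder separation network and random stand-in data; the Columbia team's actual architecture, loss function, and dataset are not described in this article.

```python
import torch
import torch.nn as nn

# Placeholder speech-separation network; the real system's architecture,
# loss function, and training data are not detailed in this article.
class TinySeparator(nn.Module):
    def __init__(self, n_speakers=2, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.masks = nn.Linear(hidden, n_speakers)   # one mask per speaker

    def forward(self, mixture):                      # mixture: (batch, time, 1)
        feats, _ = self.rnn(mixture)
        return torch.sigmoid(self.masks(feats))      # (batch, time, n_speakers)

device = "cuda" if torch.cuda.is_available() else "cpu"  # use the GPU when available
model = TinySeparator().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Random stand-ins for mixture / clean-speech training pairs.
mixtures = torch.randn(16, 200, 1, device=device)
clean = torch.randn(16, 200, 2, device=device)

for step in range(100):                              # a real model trains for days, not a few steps
    optimizer.zero_grad()
    masks = model(mixtures)
    estimates = masks * mixtures                     # apply the predicted masks to the mixed signal
    loss = loss_fn(estimates, clean)
    loss.backward()
    optimizer.step()
```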

The system relies as heavily on the listener's brain as it does on the AI software. The listener must focus on the speech of a specific person in the room, in order for the AI software to detect brain waves that closely mirror those of the listener.

The Columbia researchers stress their new system is an early prototype. The AI software they have developed can compare brain waves and amplify a specific voice to the wearer of the hearing aid, but running that AI software requires a full-sized lab server.

Also, the system currently requires electrodes to be inserted into a listener's brain, in order to monitor the brain waves of the listener and enable the system to identify a nearby person generating similar brain waves.

For experimental subjects, the researchers found a willing group of epilepsy patients at the Northwell Health Institute for Neurology and Neurosurgery in New York City.  Those patients were undergoing surgical treatment, and already had electrodes inside their brains.

"These patients volunteered to listen to different speakers while we monitored their brain waves directly via electrodes implanted in the patients' brains," said Nima Mesgarani, principal investigator at Columbia's Zuckerman Institute, a research facility devoted to the study of mind/brain behavior, and lead author of the study. "We then applied the newly developed algorithm to that data."

Ultimately, the Columbia researchers are looking to commercialize their system, although they need to find a way to miniaturize it to the size of a conventional hearing aid.

Even so, their advance has created a stir in the scientific community.  "This is definitely where the hearing aid industry is going," says Esther Janse, an associate professor at Radboud University Nijmegen in the Netherlands who specializes in speech and hearing research.  "After years of not being able to improve listening in noise much further, beyond having directional microphones, or inventing the hearing glasses which would also boost the person's voice whom you turn your head towards, this brain wave technique is the new way forward."

Roger Miller, program director for neural prosthesis development at the National Institute on Deafness and Other Communication Disorders of the U.S. National Institutes of Health, agrees. "This is an intriguing study that builds upon years of research by Dr. Mesgarani." The work "provides a significant advance in our understanding of what might be possible, because it demonstrates users could 'steer' the speech enhancement from one speaker to another," Miller says.

Miller says his institute is funding "a substantial amount of research" seeking to demystify how the brain represents sounds. Much of that funding goes to AI, along with a raft of other technological tools.

"What distinguishes Mesgarani's research from others is the 'intelligence' employed to not only record brain signals in the hearing aid user, but also use those signals to selectively amplify the desirable signal," says Fan-Gang Zeng, a professor at University of California, Irvine, and director of its Center for Hearing Research. He described Masgarani's research as "the first meaningful application of brain-computer interface to hearing aids."

Not all hearers are alike

Commercialization of the AI system will also require the researchers to accommodate hearing aid users whose ability to focus on the spoken words of a specific person may not be as strong as most people's.

"One of my research findings is that not all listeners are equally well able to control their attention," says Radboud University's Janse. "Older adults are generally poorer in terms of their attentional abilities than younger adults, with huge individual differences in attentional abilities at any age. So the new technique rests on listeners being able to steer their attention to the target talker, but what happens if they are not able to control the focus of their attention?"

NYU's Svirsky also notes that the researchers' algorithm needs to be improved so the brain/machine interface can handle rapid conversation in a very noisy room. "The algorithms tested in Mesgarani et al.'s work still aren't fast enough or accurate enough to warrant implementation in a real hearing aid," Svirsky says.

Observes Rachel Ellis, a senior lecturer specializing in hearing loss among the elderly at Linköping University, "The reported results are based on listening to multiple voices in a quiet environment. However, it is noisy environments that typically pose a problem for people with hearing loss. How this technology would perform in such environments is not yet clear."

Adds Morten Kolbæk, a postdoctoral researcher in the Department of Electronic Systems at Aalborg University, "Systems we have seen in academia are fairly narrow in scope and do not perform that well in acoustical environments that are much different from the environments used when the systems were designed."

Bottom line: Granted, a tough set of obstacles stands in the way of commercializing this potential boon for the deaf. However, as NYU's Svirsky observes, "If anyone can do it, it's the Mesgarani Lab."

Joe Dysart is an Internet speaker and business consultant based in Manhattan, NY, USA. 
