Researchers at Columbia University have created artificial intelligence (AI) software that can broadcast aloud, in a computerized voice, what the human brain is 'hearing.'
"Our ultimate goal is to develop technologies that can decode the internal voice of a patient who is unable to speak, such that it can be understood by any listener," says neuroengineer Nima Mesgarani, senior author of the research study and a principal investigator at Columbia University.
"In this study, we combined the state of the art in artificial intelligence with advanced speech processing algorithms to reconstruct sounds from the brain that were much more intelligible compared to previous research," says Columbia's Mesgarani. "This is a huge milestone."
The Columbia researchers used experimental AI software trained to recognize digital representations of spoken sentences. Those representations were captured by electrodes implanted in the brains of epilepsy patients, which recorded neural activity as the patients listened to sentences spoken to them.
The electrodes were a lucky break for the researchers; they had already been implanted in the patients' brains as part of a treatment program for their condition.
Explained Hassan Akbari, a doctoral student at Columbia who was part of the research team, "We used stereo-electroencephalographic depth arrays, and high-density grid sensors implanted in the patients' brains; we acquired the signals from the sensors using a data acquisition module.
"Once the recordings were saved and processed, we used GPUs (graphics processing units) for training our models using Tensorflow API." Other GPUS used in the project, Akhbari said, included nVidia's Titan X, Titan XP, Tesla K40, and Tesla K80.
Once the Columbia researchers were confident their AI software had been sufficiently trained, they applied it to brain signals recorded from the same epilepsy patients as those patients listened to the numbers one through nine being read aloud.
The result: the software reconstructed what the patients had heard and spoke it aloud in a computerized voice that listeners could understand 75% of the time.
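The final step, turning the decoded representation back into audible speech, relies on speech-synthesis (vocoder) techniques. The sketch below substitutes the generic Griffin-Lim routine from the open-source librosa library purely as an illustration, not as the method the Columbia team used; the digit-labeling tally at the end mirrors the kind of listener test behind the 75% figure, with hypothetical responses.

```python
# Illustrative sketch only: invert a predicted magnitude spectrogram to a
# waveform with Griffin-Lim (a generic method, not the study's vocoder),
# then tally how often listeners identify the correct digit.
import numpy as np
import librosa
import soundfile as sf

SR = 16000  # assumed sampling rate

def spectrogram_to_audio(mag_spec):
    """mag_spec: (n_fft/2 + 1, frames) linear-magnitude spectrogram."""
    return librosa.griffinlim(mag_spec, n_iter=60)

# Predicted spectrogram for one heard digit (placeholder values; a real
# system would take this from the trained decoder's output).
predicted = np.abs(np.random.randn(513, 80)).astype("float32")
audio = spectrogram_to_audio(predicted)
sf.write("reconstructed_digit.wav", audio, SR)

# Intelligibility as reported in the article: listeners hear the
# reconstructions and name the digit; accuracy is correct / total.
listener_answers = ["7", "3", "9", "1"]   # hypothetical responses
true_digits      = ["7", "3", "2", "1"]
accuracy = np.mean([a == t for a, t in zip(listener_answers, true_digits)])
print(f"listener intelligibility: {accuracy:.0%}")
```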
"This new work is brilliant," says Edward Chang, a neurosurgeon and professor of neurological surgery at the University of California, San Francisco, who is working in the same space. "It represents a breakthrough, and the researchers should be lauded for their ingenuity and accomplishment."
Christian Herff, a postdoctoral researcher in neurosurgery at Maastricht University in the Netherlands, said the study "is a great step into the right direction to translate brain activity into understandable speech. The quality of the speech output is extremely high."
"It is important to bear in mind that in this study, brain activity associated with speech perception is translated into speech," says Maastricht's Herff. "That means that the participants listened to speech audio and that the same speech audio could then be reconstructed from their brain activity.
"That is a long way from decoding what the persons were trying to say, and even further from translating thought."
UCSF's Chang agrees the project was not focused on thoughts: "It was based on the brain's auditory responses to speech sounds. We're still way off from carrying out a sophisticated conversation by thought alone."
Similar work is being done by Maastricht's Herff, whose team published a paper in April on a study that used deep neural networks to translate brain signals into audible speech. Said Herff of that study, "The output is of very high quality, too."
A team of UCSF researchers last November posted a pre-print (not peer-reviewed) paper on their research, in which they successfully translated brain signals into computer-generated speech. Said Herff, "This group went one step further and showed that the process is also possible when only mouthing the words; moving the lips, but not producing speech."
The Columbia researchers hope eventually to create a product that will give people who can no longer speak the opportunity to communicate easily and fluently using a computer-generated voice. Said Mesgarani, "Ultimately, we hope the system could be part of an implant, similar to those worn by some epilepsy patients, that translates the wearer's imagined voice directly into words."
Ironically, evolving the technology to the point where a person can use AI to carry on a sophisticated conversation relying only on brain signals may not require higher-level AI, says Bradley Greger, a neuroengineer and associate professor at Arizona State University. "The bottleneck is not in computing power or in the AI software. It is in acquiring signals from the brain that are sufficiently rich in information for the AI software to decode. This will likely be achieved by using large numbers of micro-electrodes, or optogenetic techniques."
Joe Dysart is an Internet speaker and business consultant based in Manhattan, NY, USA.