
Earbud Translators: Not Perfect, Still Handy

Earbud translators are far from perfect, but useful for travelers seeking short answers to short questions.

Earbud translators are finally popping up on the market.

Far from perfect, these handy devices are still fairly reliable for travelers looking for very short answers to very short questions.

"While I understand the limitations, you can now travel to practically any place in the world and manage to communicate in the local language with the help of a small device," says Mikel Artetxe, inventor of the free translation app Mitzuli and a Ph.D. candidate at the University of the Basque Country in Spain.

Some researchers trying to work out the kinks in the technology are betting on statistical machine translation (SMT), in which translations are generated from statistical models whose parameters are derived from the analysis of bilingual text corpora. Artetxe says SMT is akin to a game of chess, a limited universe in which only a finite number of moves is possible: SMT calculates all the possible 'moves' to find the best translation. According to Artetxe, it does this by looking at six-word groupings in the source language and then searching the target language for instances where the corresponding six-word groupings occur.
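The idea Artetxe describes can be illustrated with a minimal sketch, assuming a toy phrase table. The Spanish-English phrase pairs below are hypothetical stand-ins for pairs extracted from a bilingual corpus, and the greedy longest-match decoder is a simplification: production SMT systems search over many possible segmentations and also weigh a language model. The max_len default of six simply mirrors the six-word groupings mentioned above.

```python
from collections import Counter, defaultdict

# Hypothetical phrase-pair observations, standing in for pairs extracted
# from a word-aligned bilingual corpus (Spanish -> English).
OBSERVED_PAIRS = [
    ("buenos días", "good morning"),
    ("buenos días", "good morning"),
    ("buenos días", "good day"),
    ("dónde está", "where is"),
    ("dónde está", "where is"),
    ("la estación", "the station"),
    ("la estación", "the season"),
    ("la estación", "the station"),
    ("la", "the"),
    ("estación", "station"),
]


def build_phrase_table(pairs):
    """Estimate P(target | source) by relative frequency of observed pairs."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        counts[src][tgt] += 1
    table = {}
    for src, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        table[src] = {tgt: n / total for tgt, n in tgt_counts.items()}
    return table


def translate(sentence, table, max_len=6):
    """Greedy decoding: take the longest known phrase, emit its likeliest target."""
    words = sentence.lower().split()
    out = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in table:
                out.append(max(table[phrase], key=table[phrase].get))
                i += n
                break
        else:
            out.append(words[i])  # unknown word: pass it through unchanged
            i += 1
    return " ".join(out)


table = build_phrase_table(OBSERVED_PAIRS)
print(translate("Dónde está la estación", table))  # where is the station
```

In this toy table, "la estación" was seen twice as "the station" and once as "the season," so the more frequent rendering wins.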

Other researchers working in translation are designing their chips for a newer approach, neural machine translation (NMT), in which the artificial intelligence (AI) is designed to make translations via relentless trial and error. "NMT uses a neural network, a machine learning system that is explicitly trained to score translation candidates; this is done by teaching it to give a high score to the reference translations produced by humans," Artetxe says.
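The training objective Artetxe describes, scoring candidates so that the human reference comes out on top, can be sketched in a few lines. This is an illustration under loud assumptions: a simple linear scorer over word-pair features stands in for the neural network, the German-English data is a hypothetical toy set, and real NMT systems generate their own candidates word by word rather than choosing from a fixed list.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (source sentence, candidate translations).
# By convention in this sketch, candidates[0] is the human reference.
TRAINING = [
    ("guten morgen", ["good morning", "good evening", "bad morning"]),
    ("guten abend", ["good evening", "good morning", "bad evening"]),
]


def features(source, candidate):
    """Binary features: every (source word, target word) pairing."""
    return {(s, t) for s in source.split() for t in candidate.split()}


def score(weights, source, candidate):
    """Linear score; a neural network would replace this function."""
    return sum(weights[f] for f in features(source, candidate))


def train(data, epochs=300, lr=0.1):
    """Gradient ascent on the log-probability of the reference candidate."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for source, candidates in data:
            scores = [score(weights, source, c) for c in candidates]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]  # stable softmax
            total = sum(exps)
            for i, cand in enumerate(candidates):
                prob = exps[i] / total
                target = 1.0 if i == 0 else 0.0
                for f in features(source, cand):
                    grad[f] += target - prob
        for f, g in grad.items():
            weights[f] += lr * g
    return weights


def best(weights, source, candidates):
    """Return the highest-scoring candidate for a source sentence."""
    return max(candidates, key=lambda c: score(weights, source, c))


weights = train(TRAINING)
print(best(weights, "guten abend", ["bad evening", "good morning", "good evening"]))
```

Each training pass nudges the weights so that the reference translation receives more of the probability mass than its competitors, which is the "trial and error" at a very small scale.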

One of the primary advocates of NMT is Google. "Google's recent Neural Machine Translation is state of the art, and the introduction of deep learning machine translation models has even allowed translation to happen on the phone, which is amazing and great," says Christopher Manning, Thomas M. Siebel Professor in Machine Learning in the departments of computer science and linguistics at Stanford University in Stanford, CA.

Some earbuds, like Waverly Labs' $249 Pilot Translating Earpiece, use a Wi-Fi connection to make the translations with the help of cloud servers, while others, like Logbar's ili handheld offline translator, do translations locally.

In the Pilot, says Waverly Labs founder Andrew Ochoa, "Words are passed to the cloud where [the message] is processed through speech recognition, machine translation, and speech synthesis, before it is sent back to the user. This happens within minimal delay, usually in milliseconds." The company claims the Pilot can work in person-to-person and group conversations, and is capable of translating speech between English, French, Italian, Portuguese, Spanish, Arabic, Mandarin Chinese, German, Greek, Hindi, Japanese, Korean, Polish, Russian, Turkish, and other languages.
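The three stages Ochoa names can be pictured as a simple chain in which each stage feeds the next. The sketch below is not Waverly Labs' implementation: every function is a hypothetical stub (the "audio" is just UTF-8 text, and the translator is a one-entry phrase book), and it exists only to show how the stages compose and where per-stage delay would be measured.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical phrase book standing in for a real translation model.
PHRASES: Dict[str, str] = {"¿dónde está la estación?": "where is the station?"}


def recognize_speech(audio: bytes) -> str:
    """Stub recognizer: pretends the audio bytes are already UTF-8 text."""
    return audio.decode("utf-8").strip().lower()


def translate_text(text: str) -> str:
    """Stub translator: phrase-book lookup that echoes unknown input."""
    return PHRASES.get(text, text)


def synthesize_speech(text: str) -> bytes:
    """Stub synthesizer: returns the text as bytes instead of audio."""
    return text.encode("utf-8")


STAGES: List[Tuple[str, Callable]] = [
    ("speech recognition", recognize_speech),
    ("machine translation", translate_text),
    ("speech synthesis", synthesize_speech),
]


def run_pipeline(audio: bytes):
    """Run the stages in order and record how long each one takes."""
    timings = {}
    data = audio
    for name, stage in STAGES:
        start = time.perf_counter()
        data = stage(data)
        timings[name] = time.perf_counter() - start
    return data, timings


output, timings = run_pipeline("¿Dónde está la estación?".encode("utf-8"))
print(output.decode("utf-8"))
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.3f} ms")
```

In a cloud-based product the total delay would also include the network round trip to and from the servers, which this local sketch does not model.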

Early adopters of these translation devices say you get the best results by sticking with the short and sweet. Explains Stanford's Manning, "If you use extremely short phrases like, 'What's today's weather?' then it's fine—you take a second or so to say that, and the translation is completed in less than two seconds. But as soon as you start trying to have any sort of real conversation," lag time between speech and translation can be 12 seconds or longer.

Researchers working in the space acknowledge there is a long way to go in optimizing these systems. Translation technologies based on AI still make significant errors that a human would never make, such as omitting words, getting stumped by proper names, puzzling over rare words, or serving up translations that are out of context. "Humans just don't make the fundamental, and sometimes crazy, errors that machine translation systems still make," Manning says.

Adds Artetxe, "In spite of the recent progress, machine translation still makes significant errors," even with major languages, for which machine learning programs have plenty of training data. "The challenge is even bigger for the vast majority of less-resourced languages."

Still, Manning is cautiously optimistic about the outlook for these products. "I'm sure machine translation will be considerably better in six years' time. Neural systems have been a great technology for providing continuous improvement by relentlessly refining models and exploiting more data."

However, Manning adds, "I am quite confident that the really hard problems of translation that require significant context won't be solved" in less than six years, because dealing with every subtlety inherent in each language is proving a tough nut to crack.

Take varying treatment of pronouns among various languages, for example. "Some languages like Finnish, Tagalog, and Turkish, don't distinguish gender on pronouns," Manning says. "Others, like Chinese and Japanese, can freely drop pronouns."

So if a person says in Finnish or Chinese the equivalent of "Sarah is studying at UC Davis," followed by a sentence about her starting work as a veterinarian next year, current machine translation systems invariably translate the passage into English as, "Sarah is studying at UC Davis. He is starting work as a veterinarian next year," according to Manning. "I believe cases like this still won't be solved in 2024," he says.

Adds Tim Bajarin, an analyst who monitors earbud translators for San Jose, CA-based market research firm Creative Strategies: "Mobile processors are getting much faster, but they will be taxed by having to use a lot of AI-based software techniques that will be difficult to process in real time.

"If the translation is done on a laptop or desktop workstation, then we could see enough breakthroughs that by 2024 could deliver more accurate translations."


Translator Products

Apple AirPods
https://www.apple.com/airpods/, $159.
AirPods are standard earbuds that can translate by connecting to a smartphone equipped with Google Translate or similar software.

Bragi Dash Pro, from Munich, Germany-based Bragi
https://www.bragi.com/thedashpro, $426.
These wireless earbuds connect to smartphones via Bluetooth and can translate between 30+ languages using the iTranslate app.

ili from Logbar Inc.
https://iamili.com/, $249.
This handheld translator needs no wireless connectivity because it performs its translations locally.

Google Pixel Buds
https://store.google.com/us/product/google_pixel_buds?hl=en-US, $159.
Google's earbuds translate between 40 languages by connecting via Bluetooth to a smartphone running Google Translate.

Pilot Translating Earpiece, from Waverly Labs
https://www.waverlylabs.com/pilot-translation-kit/, $249.
The Pilot App uses speech recognition, machine translation, and speech synthesis to translate spoken languages. The earpiece comes with free access to English and four Romance languages (French, Italian, Portuguese, and Spanish), and users can purchase additional languages.

TranslateOne2One, from Lingmo International
https://lingmo.global/translate-one2one/, $179.
These wearable translation earpieces run on the IBM Cloud, and translate between nine languages in real time via SIM card, without the need for a Wi-Fi or Bluetooth connection.

Joe Dysart is an Internet speaker and business consultant based in Manhattan, NY. 
