Communications of the ACM

ACM Opinion

Rest in Peas: The Unrecognized Death of Speech Recognition



The accuracy of computer speech recognition flat-lined in 2001, before reaching human levels. The funding plug was pulled, but no funeral, no text-to-speech eulogy followed. Words never meant very much to computers--which made them ten times more error-prone than humans. Humans expected that computer understanding of language would lead to artificially intelligent machines, inevitably and quickly. But the mispredicted words of speech recognition have rewritten that narrative. We just haven’t recognized it yet.

After a long gestation period in academia, speech recognition bore twins in 1982: the suggestively named Kurzweil Applied Intelligence and sibling rival Dragon Systems. Kurzweil’s software, by age three, could understand all of a thousand words--but only when spoken one painstakingly articulated word at a time. Two years later, in 1987, the computer’s lexicon reached 20,000 words, entering the realm of human vocabularies, which range from 10,000 to 150,000 words. But recognition accuracy was horrific: 90% wrong in 1993. Another two years, however, and the error rate pushed below 50%. More importantly, in 1997 Dragon Systems unveiled its NaturallySpeaking software, which recognized normal human speech. Years of talking to the computer like a speech therapist seemingly paid off.

However, the core language machinery that crushed sounds into words actually dated to the 1950s and '60s and had not changed. Progress mainly came from freakishly faster computers and a burgeoning profusion of digital text.

From Robert Fortner's Posterous