“The Babel fish is small, yellow, leech-like, and probably the oddest thing in the Universe. It feeds on brainwave energy received not from its own carrier, but from those around it. It absorbs all unconscious mental frequencies from this brainwave energy to nourish itself with. It then excretes into the mind of its carrier a telepathic matrix formed by combining the conscious thought frequencies with nerve signals picked up from the speech centres of the brain which has supplied them. The practical upshot of all this is that if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language.” – Extract, The Hitchhiker’s Guide to the Galaxy.
Yes, Microsoft has created its own digital version of a Babel fish (suck it, Google Translate). Teams of researchers at the software giant have managed to shove it into Skype, and it looks set to change the way we communicate forever. Jetzt können Sie Deutsch sprechen, ohne zu wissen, wie man Deutsch spricht! (Now you can speak German without knowing how to speak German!)
Peter Lee, corporate vice president and head of Microsoft Research, introduces the technology in the short video below and walks through a hands-on example of how the system works. Much like a Babel fish, Skype pipes the audio to servers for speech recognition and adds punctuation and grammar rules. It then translates the result into the destination language, complete with that language's own grammar, spelling and punctuation, and has the Skype software read it out in a very generic, robot-like voice that can switch between male and female to match the speaker's gender. It's not real-time, but it follows a second or two after the fact.
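For the curious, the chain of steps described above can be sketched in a few lines of Python. This is only a toy illustration of the order of the stages, not Microsoft's implementation: every function name, the `PHRASEBOOK` table and the idea of passing "audio" as a list of word tokens are assumptions made up for this example, and each stage is a stand-in for what would really be a server-side model.

```python
from dataclasses import dataclass

# Toy phrasebook standing in for a real translation model (hypothetical data).
PHRASEBOOK = {
    ("de", "en"): {"Guten Morgen.": "Good morning."},
}


@dataclass
class SpokenOutput:
    """What the final text-to-speech stage would read out, and in which voice."""
    text: str
    voice: str


def recognize_speech(audio_tokens):
    """Stand-in for speech recognition: returns raw, unpunctuated text."""
    return " ".join(audio_tokens)


def add_punctuation(raw_text):
    """Stand-in for the punctuation step: capitalize and close the sentence."""
    text = raw_text.strip()
    if not text:
        return text
    text = text[0].upper() + text[1:]
    return text if text[-1] in ".?!" else text + "."


def translate(text, source, target):
    """Stand-in for translation: phrasebook lookup, falling back to the input."""
    table = PHRASEBOOK.get((source, target), {})
    return table.get(text, text)


def synthesize(text, speaker_gender):
    """Stand-in for text-to-speech: pick a generic voice matching the speaker."""
    voice = "female" if speaker_gender == "female" else "male"
    return SpokenOutput(text=text, voice=voice)


def translate_call_audio(audio_tokens, source, target, speaker_gender):
    """Run the stages in the order described: recognize, punctuate, translate, speak."""
    raw = recognize_speech(audio_tokens)
    punctuated = add_punctuation(raw)
    translated = translate(punctuated, source, target)
    return synthesize(translated, speaker_gender)


result = translate_call_audio(["guten", "Morgen"], "de", "en", "female")
print(result.voice, result.text)
```

Because each stage has to finish before the next one can start, a chain like this would naturally lag a little behind the speaker, which fits the second-or-two delay mentioned above.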
“The problem of having a machine understand human speech has been around in Research for a very long time,” says Lee in the video. “For over a decade we tried to develop the means for improving speech recognition using Gaussian-mixture models. All of that changed when our researchers decided to take a new look at the use of deep neural nets, applied to speech recognition, now the new gold standard in speech.”
“Research is a long game. By getting smarter every day, and sticking with the understanding of the value of basic research, we eventually get to the point where wonderful things can happen. Imagine being able to speak in German and have your message conveyed grammatically and semantically in English. That future is here,” adds Lee.
Skype Translator isn’t ready for prime time just yet, but this is a big step forward for how humans interact with each other. Sure, there are privacy concerns here, and it’s a given that the NSA will be snooping on it at some point. But overcoming the language barrier? That’s one hell of a thing to do. I can even see uses in helping people learn a new language. Pouvez-vous parler français? (Can you speak French?)