Google throws its weight behind voice

The company's size helps in speech recognition development, a researcher says

Google is taking advantage of its cloud infrastructure and the huge volume of typed search queries to refine its Voice Search function, part of a massive research effort in voice that spans both mobile devices and the Web.

Voice Search, introduced about 18 months ago, lets mobile users search the Web by speaking into their phones rather than typing in a query. It's available on the iPhone, BlackBerry, Nokia Series 60 devices and some Android phones.

Accuracy is a major factor for success, driving useful results that cause users to return to the service, said Michael Cohen, manager of speech technology at Google, in a speech Thursday at the Mobile Voice Conference in San Francisco. The company strives to make Voice Search a "frictionless" experience for the user, with correct results obtained easily. Making speech recognition more accurate has been a decadeslong effort, and Google is applying its massive scale to the problem, Cohen said.

Voice Search is based on "language models," which are statistical models of what sequences of words are most likely to occur. For example, a good language model would know that it's more likely a speaker would say "the dog barked" than "the dog talked."
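To make the idea concrete, here is a minimal sketch of a bigram language model, which scores a sentence by how often each word follows the one before it. The tiny corpus, the add-one smoothing and the function names are all assumptions made for illustration; nothing here reflects how Google's models are actually built.

```python
from collections import Counter

# Toy training corpus, made up for illustration (not Google's data).
corpus = [
    "the dog barked",
    "the dog barked loudly",
    "the dog ran",
    "the man talked",
]

unigrams = Counter()  # how often each word appears
bigrams = Counter()   # how often each adjacent word pair appears
for sentence in corpus:
    words = ["<s>"] + sentence.split()  # <s> marks the sentence start
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

vocab_size = len(unigrams)

def bigram_prob(prev, word):
    """Estimate P(word | prev) with add-one smoothing (an assumed choice)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def sentence_prob(sentence):
    """Multiply the bigram probabilities along the sentence."""
    words = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        prob *= bigram_prob(prev, word)
    return prob

likely = sentence_prob("the dog barked")
unlikely = sentence_prob("the dog talked")
print(likely > unlikely)
```

On this toy corpus "the dog barked" scores higher than "the dog talked" because "barked" follows "dog" in the training sentences and "talked" never does. A recognizer can use scores like these to choose between word sequences that sound alike.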

Google is constantly "training" new language models for its speech recognition engine, Cohen said. In doing so, it taps into the search terms that users type into Google.com. From 230 billion words typed in search requests at Google.com, researchers have compiled the 1 million most-frequently used unique words to form a vocabulary with which to train the voice system. Both numbers are arbitrary, and 230 billion does not represent the total number of words entered at Google in any given period, Cohen said. AskOxford.com, from the publisher of the Oxford English Dictionary, estimates that there are at least 250,000 words in the English language; Cohen said the 1 million unique words include plurals and other versions of words.
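The vocabulary step Cohen describes amounts to counting words across typed queries and keeping the most frequent ones. The sketch below shows that idea at toy scale; the sample queries, the `VOCAB_SIZE` value and the whitespace tokenization are assumptions for illustration, standing in for the 230 billion words and 1 million-word vocabulary cited above.

```python
from collections import Counter

# Hypothetical typed queries, standing in for real search logs.
queries = [
    "weather san francisco",
    "san francisco pizza",
    "san francisco weather",
    "pizza near me",
]

VOCAB_SIZE = 3  # illustrative cutoff; the article cites 1 million

# Count every word across all queries (simple whitespace tokenization).
counts = Counter(word for q in queries for word in q.lower().split())

# Keep only the most frequently used unique words.
vocabulary = [word for word, _ in counts.most_common(VOCAB_SIZE)]
print(vocabulary)
```

Words that fall below the cutoff, such as "near" and "me" here, are left out of the vocabulary.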

It takes 70 "CPU years" -- the amount of work one CPU can perform in a year -- to process those 230 billion words from Google.com and train a new language model, Cohen said. Google trains new language models constantly as part of its research.

"There are huge computational demands as we're taking on lots and lots of data (and) bigger and bigger models," Cohen said. "Luckily, we have a lot of compute power we can apply to that. And there are demands on infrastructure, and luckily, Google has a very well-designed software infrastructure, so we can do things like quickly parallelize something," running it on thousands of computers at the same time, he said.
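A rough calculation shows why parallelization matters for a 70 CPU-year job. The sketch below assumes the work divides evenly across machines with no overhead, which is an idealization, and the figure of 1,000 CPUs is a hypothetical round number, not one Cohen gave.

```python
CPU_YEARS = 70       # figure quoted in the article
DAYS_PER_YEAR = 365

def wall_clock_days(cpu_years, num_cpus):
    """Elapsed days if the work splits perfectly across num_cpus (idealized)."""
    return cpu_years * DAYS_PER_YEAR / num_cpus

# One CPU would need 70 years; 1,000 CPUs (hypothetical) would need about 25.55 days.
print(wall_clock_days(CPU_YEARS, 1))
print(wall_clock_days(CPU_YEARS, 1000))
```

Real jobs rarely scale this cleanly, so the result is best read as a lower bound on elapsed time.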

A cloud infrastructure offers other advantages in speech recognition, he said. For one thing, Google can rapidly test and refine its speech recognition software, sending out new versions while consumers are using it in the field. In addition, as consumers use Voice Search, Google learns from how the service is used in the real world.

In addition to making speech recognition easier to use, Google wants to make it ubiquitously available. A big step in that direction was a feature included in the Nexus One handset that gives the user the option of speaking instead of typing every time the keyboard pops up on the phone's screen, Cohen said.

Speech recognition is also a big part of Google Voice, powering its voicemail transcription feature. But Google's interest in voice goes beyond mobile phones, Cohen said. Voice is the biggest group in Google Research, and findings in this area can be useful in many areas, he said. The company wants to be able to understand and deliver spoken content on the Web as well as the written information it finds now through its search engine. One recent move was the addition of a closed-caption option for YouTube videos. Using that capability, Google is also beginning to offer foreign-language subtitles through text-to-text translation of those captions.

Cohen was a co-founder of Nuance Communications and has been working on speech recognition for 25 years. In that time, "It's come a long way, but it has a long way to go," he said.

Microsoft is also developing voice search capabilities for its Bing search engine.

Stephen Lawson

IDG News Service