Google's Alfred Spector on voice search, hybrid intelligence and beyond

14th Feb 2010 | 08:00

Inside Google Labs

Search by voice and machine translation

Google has always been tight-lipped about products that haven't launched yet. It's no secret, however, that thanks to the company's bottom-up culture, its engineers are working on tons of new projects at the same time.

Following the mantra of 'release early, release often', the speed at which the search engine giant is churning out tools is staggering. At the heart of it all is Alfred Spector, Google's Vice President of Research and Special Initiatives.

One of the areas Google is making significant advances in is voice search. Spector is astounded by how rapidly it's come along.

The Google Mobile App features 'search by voice' capabilities that are available for the iPhone, BlackBerry, Windows Mobile and Android. All versions understand English (including US, UK, Australian and Indian-English accents) but the latest addition, for Nokia S60 phones, even introduces Mandarin speech recognition, which – because of its many different accents and tonal characteristics – posed a huge engineering challenge.

Mandarin is the most spoken language in the world, but as it isn't exactly keyboard-friendly, voice search could become immensely popular in China.

Technology challenge

"Voice is one of these grand technology challenges in computer science," Spector explains. "Can a computer understand the human voice? It's been worked on for many decades and what we've realised over the last couple of years is that search, particularly on handheld devices, is amenable to voice as an input mechanism.

"It's very valuable to be able to use voice. All of us know that no matter how good the keyboard, it's tricky to type exactly the right thing into a searchbar, while holding your backpack and everything else."

To get a computer to take account of your voice is no mean feat, of course. "One idea is to take all of the voices that the system hears over time into one huge pan-human voice model. So, on the one hand we have a voice that's higher and with an English accent, and on the other hand my voice, which is deeper and with an American accent. They both go into one model, or it just becomes personalised to the individual; voice scientists are a little unclear as to which is the best approach."

Machine translation

The research department is also making progress in machine translation. Google Translate already features 51 languages, including Swahili and Yiddish. The latest version introduces instant, real-time translation, phonetic input and text-to-speech support (in English).

"We're able to go from any language to any of the others, and there are 51 times 50, so 2,550 possibilities," Spector explains.
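The arithmetic behind that figure can be checked with a short sketch. Only the count of 51 comes from the article; the language identifiers below are placeholders, not Google's actual list, and the function name is made up for illustration.

```python
from itertools import permutations


def count_directed_pairs(n_languages: int) -> int:
    """Count ordered (source, target) pairs where source != target."""
    return n_languages * (n_languages - 1)


# Placeholder identifiers; the article does not list the real languages.
languages = [f"lang{i}" for i in range(51)]

# Every ordered pair of two distinct languages is one translation direction.
pairs = list(permutations(languages, 2))

print(count_directed_pairs(51))  # 2550
print(len(pairs))                # 2550
```

Each direction counts separately (English to French is not the same pair as French to English), which is why the total is 51 times 50 rather than half that.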

"We're focusing on increasing the number of languages because we'd like to handle even those languages where there's not an enormous volume of usage. It will make the web far more valuable to more people if they can access the English- or Chinese-language web, for example.

"But we also continue to focus on quality because almost always the translations are valuable but imperfect. Sometimes it comes from training our translation system over more raw data, so we have, say, EU documents in English and French and can compare them and learn rules for translation. The other approach is to bring more knowledge into translation.

"For example, we're using more syntactic knowledge today and doing automated parsing with language. It's been a grand challenge of the field since the late 1950s. Now it's finally achieved mass usage."

The team, led by scientist Franz Josef Och, has been collecting data for more than 100 languages, and the Google Translator Toolkit, which makes use of the 'wisdom of the crowds', now even supports 345 languages, many of which are minority languages.

The editor enables users to translate text, correct the automatic translation and publish it. Spector thinks that this approach is the future. As computers become even faster, handling more and more data – a lot of it in the cloud – machines learn from users and thus become smarter. He calls this concept 'hybrid intelligence'.

"It's very difficult to solve these technological problems without human input," he says. "It's hard to create a robot that's as clever, smart and knowledgeable of the world as we humans are. But it's not as tough to build a computational system like Google, which extends what we do greatly and gradually learns something about the world from us, but that requires our interpretation to make it really successful.

"We need to get computers and people communicating in both directions, so the computer learns from the human and makes the human more effective."

Google Suggest and conceptual search

Examples of 'hybrid intelligence' are Google Suggest, which instantly offers popular searches as you type a search query, and the 'did you mean?' feature in Google search, which corrects you when you misspell a query in the searchbar. The more you use it, the better the system gets.
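The idea of a system that improves as people use it can be sketched in a few lines. This is a toy illustration, not how Google Suggest or 'did you mean?' actually work: the class name, the method names and the sample queries are all hypothetical, and the spelling correction simply leans on Python's standard difflib module.

```python
from collections import Counter
from difflib import get_close_matches


class SuggestSketch:
    """Toy query log: every recorded search improves later suggestions."""

    def __init__(self):
        self.counts = Counter()

    def record(self, query: str) -> None:
        # Normalise and count each query a user submits.
        self.counts[query.strip().lower()] += 1

    def suggest(self, prefix: str, limit: int = 3) -> list:
        # Offer the most popular past queries that start with the prefix.
        prefix = prefix.strip().lower()
        matches = [(q, n) for q, n in self.counts.items() if q.startswith(prefix)]
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [q for q, _ in matches[:limit]]

    def did_you_mean(self, query: str):
        # Return the closest previously seen query, or None if nothing is close.
        close = get_close_matches(query.strip().lower(), list(self.counts), n=1)
        return close[0] if close else None


log = SuggestSketch()
for q in ["google maps", "google maps", "google translate",
          "google labs", "google maps", "google translate"]:
    log.record(q)

print(log.suggest("goo"))              # ['google maps', 'google translate', 'google labs']
print(log.did_you_mean("gogle maps"))  # google maps
```

The more queries are recorded, the better the popularity ranking and the larger the pool of known spellings, which is the feedback loop the article describes.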

Training computers to become seemingly more intelligent poses major hurdles for Google's engineers. "Computers don't train as efficiently as people do," Spector explains.

"Let's take the chess example. If a Kasparov was the educator, we could count on almost anything he says as being accurate. But if you tried to learn from a million chess players, you learn from my children as well, who play chess but they're 10 and eight. They'll be right sometimes and not right other times. There's noise in that, and some of the noise is spam. One also has to have careful regard for privacy issues."
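One common way to cope with that kind of noise is to trust an answer only when enough contributors agree on it. The sketch below is an assumption about how such filtering might look, not a description of Google's method; the function name, the 60 per cent threshold and the sample chess moves are all invented for illustration.

```python
from collections import Counter


def majority_label(votes, min_agreement=0.6):
    """Return the most common vote if enough contributors agree, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Reject the answer when the crowd is too divided to be trusted.
    return label if count / len(votes) >= min_agreement else None


# Hypothetical move suggestions from ten players of mixed ability.
strong_consensus = ["Nf3"] * 8 + ["a4", "h4"]
# Hypothetical suggestions where nobody agrees.
no_consensus = ["Nf3", "a4", "h4", "e4"]

print(majority_label(strong_consensus))  # Nf3
print(majority_label(no_consensus))      # None
```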

By collecting enormous amounts of data, Google hopes to create a powerful database that eventually will understand the relationship between words (for example, 'a dog is an animal' and 'a dog has four legs').

The challenge is to try to establish these relationships automatically, using tons of information, instead of having experts teach the system. This database would then improve search results and language translations because it would have a better understanding of the meaning of the words.
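A database of word relationships like the one described can be pictured as a set of simple facts. The following is a minimal sketch under stated assumptions: the class, the relation names ('is_a', 'has') and the extra 'living thing' fact are hypothetical, and the facts are entered by hand here, whereas the article's point is that Google wants to extract them automatically.

```python
from collections import defaultdict


class RelationStore:
    """Toy store of (subject, relation, object) facts."""

    def __init__(self):
        # Maps (subject, relation) to the set of objects it points at.
        self.facts = defaultdict(set)

    def add(self, subject, relation, obj):
        self.facts[(subject, relation)].add(obj)

    def is_a(self, subject, target):
        """Follow 'is_a' links transitively from subject towards target."""
        seen, stack = set(), [subject]
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.facts.get((node, "is_a"), ()))
        return False


store = RelationStore()
store.add("dog", "is_a", "animal")           # from the article's example
store.add("dog", "has", "four legs")         # from the article's example
store.add("animal", "is_a", "living thing")  # hypothetical extra fact

print(store.is_a("dog", "living thing"))  # True
print(store.is_a("animal", "dog"))        # False
```

Chaining facts this way is what would let a search or translation system infer that a query about dogs is also a query about animals.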

Conceptual search

There's also a lot of research around 'conceptual search'. "Let's take a video of a couple in front of the Empire State Building. We watch the video and it's clear they're on their honeymoon. But what is the video about? Is it about love or honeymoons, or is it about renting office space? It's a fundamentally challenging problem."

One example of conceptual search is Google Image Swirl, which was added to Labs in November. Enter a keyword and you get a list of 12 images; clicking on each one brings up a cluster of related pictures. Click on any of them to expand the 'wonder wheel' further.

Google notes that they're not just the most relevant images; the algorithm determines the most relevant group of images with similar appearance and meaning.

To improve the world's data, Google continues to focus on the importance of the open internet. Another Labs project, Google Fusion Tables, facilitates data management in the cloud. It enables users to create tables, filter and aggregate data, merge it with other data sources and visualise it with Google Maps or the Google Visualisation API.

The data sets can then be published, shared or kept private and commented on by people around the world. "It's an example of open collaboration," Spector says.

"If it's public, we can crawl it to make it searchable and easily visible to people. We hired one of the best database researchers in the world, Alon Halevy, to lead it."

Google is aiming to make more information available more easily across multiple devices, whether it's images, videos, speech or maps, no matter which language we're using.

Spector calls the impact "totally transparent processing – it revolutionises the role of computation in day-to-day life. The computer can break down all these barriers to communication and knowledge. No matter what device we're using, we have access to things. We can do translations, there are books or government documents, and some day we hope to have medical records. Whatever you want, no matter where you are, you can find it."

------------------------------------------------------------------------------------------------------

First published in .net Issue 198
