What will it take to make AI sound human?

'It's a matter of being personalized,' says CMU professor Alan Black

Pepper the robot appears on stage with a SoftBank executive at CES in Las Vegas on Jan. 7, 2016. Credit: James Niccolai

Conversation fillers such as "hmm" and "uh-huh" may seem like insignificant parts of human conversation, but they're critical to improving communication between humans and artificial intelligence.

So argues Alan Black, a professor in the Language Technologies Institute at the Carnegie Mellon School of Computer Science, who specializes in speech synthesis and ways to make artificially intelligent speech sound more real.

Both Siri and Cortana incorporate aspects of Black's work, he says. But for the most part, such technologies still boil down to a pretty simple pattern: The human speaks, then the machine processes that speech and answers.

"It's not really how humans interact," Black said in an interview on Friday. "It's a stilted kind of interaction."

Key to making such conversations more natural are pauses, fillers, laughs and the ability of speakers to anticipate and complete each other's sentences -- all of which help build rapport and trust.

"Laughing is part of communication," he said. "Machines don't do that -- if they did, it would be unbelievably creepy -- but ultimately they should."

Black and his students are working on those areas.

"You need mm-hmm, back channels, hesitations and fillers, and so far our speech synthesizers can't do that," Black said. "If a system does say 'uh-huh,' it sounds like a robot."

Technologies using synthetic voices typically use speech recorded by humans "in a little room reading sentences," he explained. That, in turn, is "why they sound bored."

Working with students, Black is experimenting with using voices recorded in dialog, so that even if you just capture and use one side, it's clear the speakers are engaged. The idea is to model and incorporate the variance in human responses rather than using the same response all the time -- otherwise, humans can tell it's fake, Black said.

Ultimately, good AI will also know your views on certain topics, such as which candidate you support or oppose in a political race, so it won't say something offensive.

"On a higher level, it's a matter of being personalized," Black said. "That can be creepy, but it can also be appropriate, and it's important for trust. It's all about building this thing that's close to what humans expect and makes it easier to have this conversation."

Looking ahead, another big issue is how to get people to learn to do new things with their devices. There's basic interaction happening now with technologies like Siri and Cortana, but the next challenge is to get users to turn to AI first for answers, Black said.

Some users have been embarrassed to talk to their phones but are more comfortable talking to Amazon Echo, because all they have to do is speak out loud in their homes. "People are treating it differently," he said. "It's there in the room with you."

Katherine Noyes

IDG News Service