Siri’s voice to be more human in iOS 11

[This short story from iLounge summarizes and links to a detailed post by Apple about how the company is improving the “naturalness, personality, and expressivity of Siri’s voice” for iOS 11; click through to the bottom of that post to listen to several audio clips of Siri in iOS 9, 10 and 11 to hear the impressive differences. For other recent presence-related Siri news, see “AI Programs Are Learning to Exclude Some African-American Voices” from MIT Technology Review and “To Win The AI Assistant Wars, Apple Is Melding Siri With Its Other Services” from Fast Company. –Matthew]

Apple publishes details of deep learning improvements to Siri voice quality in iOS 11

By Jesse Hollington
August 24, 2017

Apple’s latest post on its fledgling Machine Learning Journal provides some interesting insights into how the company is using artificial intelligence technology not only to improve Siri’s speech recognition capabilities, but also to make the personal assistant’s voice sound more natural and smoother, and to give it more personality. The article offers technical details about the deep learning work used to improve Siri behind the scenes, describing the basics of speech synthesis, the differing approaches used to produce sampled versus “parametric” speech, and how the previously lower-quality parametric approach is being improved through deep learning.
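
Since the summary above stays at a high level, the following is a minimal, hypothetical Python sketch (not Apple’s code; every name and value is illustrative) contrasting the two approaches the article mentions: “parametric” synthesis, where a vocoder-style renderer produces audio from predicted acoustic parameters, and sampled (unit-selection) synthesis, where recorded snippets are chosen from a database and concatenated. Deep learning can improve either approach, for example by predicting better acoustic parameters or by scoring candidate units.

```python
# Conceptual illustration only -- not Apple's implementation.
import numpy as np

def parametric_synthesis(f0_per_frame, sample_rate=16000):
    """'Parametric' synthesis: render audio from predicted acoustic parameters.
    Flexible and compact, but historically sounds buzzy or muffled.
    A pitch-driven sine wave stands in here for a real vocoder."""
    frame_len = int(0.005 * sample_rate)  # 5 ms frames
    audio, phase = [], 0.0
    for pitch in f0_per_frame:            # fundamental frequency per frame, in Hz
        t = np.arange(frame_len)
        audio.append(np.sin(phase + 2 * np.pi * pitch * t / sample_rate))
        phase += 2 * np.pi * pitch * frame_len / sample_rate
    return np.concatenate(audio)

def unit_selection_synthesis(target_features, unit_database):
    """Sampled (unit-selection) synthesis: pick recorded speech snippets that
    best match the target and join them. Sounds natural when good units exist;
    a learned model can supply the cost function that scores candidates."""
    chosen = []
    for target in target_features:
        # Pick the stored unit whose features best match the target
        # (a real system also scores how smoothly adjacent units join).
        best = min(unit_database,
                   key=lambda u: np.linalg.norm(u["features"] - target))
        chosen.append(best["audio"])
    return np.concatenate(chosen)
```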

Interestingly, in the post Apple also reveals that the company is using a new female voice talent for iOS 11, “with the goal of improving the naturalness, personality, and expressivity of Siri’s voice,” following an evaluation of hundreds of potential candidates. The article then explains how engineers and developers recorded over 20 hours of speech in a professional studio to build a new text-to-speech voice using the company’s newest deep learning-based technology, working from scripts that included audio book passages, navigation instructions, prompted answers, and witty jokes. The result is a U.S. English Siri voice that sounds significantly better than the one used in prior iOS versions — at the bottom of the article, Apple provides a series of sample audio files comparing Siri in iOS 9, iOS 10, and iOS 11, along with several cross-references to academic research papers the company has published detailing its efforts in speech synthesis and deep learning.
