Why Amazon Polly is a potential game changer for the publishing industry

Esra Celebi

The way users interact with content is changing. We are in the midst of a paradigm shift. Users are already consuming more video and audio content than text. One of the most exciting developments is text-to-speech.

Text-to-speech is not exactly a new technology; it has been around for more than two decades. However, the long-awaited breakthrough failed to materialize because natural, realistic voice modulation was not yet available.

Users see it the same way. For example, news briefings on smart speakers are not very popular because they rely on artificial voices.

However, text-to-speech technology has taken a big leap forward thanks to one company in particular: Amazon.

Amazon Polly is changing everything because it finally comes close to the sound of the human voice when reading texts aloud. Small and large publishers should pay attention to Amazon Polly as it offers them many exciting opportunities.

Why audio is a great opportunity for publishers

Compared to other industries, publishers tend to pick up on trends somewhat later. According to Paul DeHart, CEO of Blue Toad, "Publishers are slowly evolving into media companies." In today's environment, publishers "must continue to explore new ways to create and deliver content to their readers."

One of these new, exciting ways to deliver content is audio. According to the recent Infinite Dial study, U.S. users consume an average of 17 hours of audio per week.

Average time spent listening to audio, according to the Infinite Dial study

Granted, audio won't immediately overtake text or video consumption. However, it does provide a convenient way for users to consume content anywhere. One of the most exciting trends in audio is the rise of smart speakers.

According to a Reuters Institute report titled "The Future of Voice and the Implications for News," voice-activated speakers powered by assistants such as Amazon Alexa and Google Assistant are growing faster than smartphones and tablets did at a similar stage. Smart speaker use in the U.S., U.K. and Germany has also roughly doubled in the past year.

"Voice could become an important portal for media in the future."

Nic Newman, senior research associate at the Reuters Institute

Publishers, of course, also have monetization strategies for audio in mind. One way could be to include a sponsor message at the beginning of the listening experience. Such a 15- or 30-second ad could be a quick and easy way to make money from audio content.

Amazon Polly: What's so special?

Voice, then, is a tool that is becoming increasingly important to publishers. However, some users have been reluctant to use text-to-speech technology because the pseudo-human voice seemed completely inhuman. Indeed, until now, the listening experiences have not been pleasant at all.

Amazon Polly is the most exciting development in text-to-speech. Amazon Polly, the company itself writes, "is a text-to-speech service that uses Deep Learning to transform text into lifelike speech." Essentially, TTS is the generation of artificial speech from text.

It allows users, including publishers, to "build applications that talk and create entirely new categories of speech-enabled products." Amazon offers dozens of lifelike voices in a variety of languages. These include English, Danish, French, Japanese, Spanish and even Mandarin.
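As a rough illustration of what such a "talking" application could look like, the following Python sketch prepares a request for Polly's synthesize_speech operation as exposed by the boto3 SDK. The helper name build_polly_request, the voice "Joanna", the sample sentence, and the output file name are assumptions chosen for illustration. The actual call needs AWS credentials, so it is shown only as a comment.

```python
def build_polly_request(text, voice_id="Joanna", engine="neural"):
    """Return keyword arguments for boto3's polly.synthesize_speech call.

    build_polly_request is a hypothetical helper, not part of any SDK.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "Text": text,
        "OutputFormat": "mp3",  # Polly can also return ogg_vorbis or pcm
        "VoiceId": voice_id,    # assumed voice; pick one for your language
        "Engine": engine,       # "standard" or "neural"
    }


request = build_polly_request("Audio is a great opportunity for publishers.")

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   polly = boto3.client("polly")
#   response = polly.synthesize_speech(**request)
#   with open("article.mp3", "wb") as f:
#       f.write(response["AudioStream"].read())
```

Keeping the request parameters in a plain dictionary makes it easy to swap the voice or engine per article without touching the code that talks to AWS.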

Languages that are included in Amazon Polly

In addition to the standard TTS voices, Amazon Polly also has two newer features that show how far text-to-speech technology has come.

#1 Neural Text-to-Speech (NTTS)

The first is Neural Text-to-Speech (NTTS). NTTS improves speech quality by capturing the differences between speaking styles, making speech sound expressive and lifelike.

NTTS learns to speak by listening to recorded human speech and then copying it. Basically, the tool is designed to learn to speak the same way children do.

According to Julien Simon, global tech chief evangelist at Amazon, NTTS is a game changer "because it increases naturalness and expressiveness." It brings us closer and closer to automated voices that sound like real people.

Currently, NTTS is available for eleven voices that support U.S. and U.K. English.

#2 Newscaster Style

Amazon Polly has taken a big step forward with its so-called Newscaster style.

Amazon Polly's NTTS supports a read-aloud style that is tailored to different types of narration. This is perfect for publishers looking for new ways to present their content.

Amazon Polly's Newscaster style makes narration sound very realistic. It is so advanced that the voice modulates depending on whether a newscast, sportscast, or even a college lecture is being read aloud.
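For readers who want to try the Newscaster style, the sketch below shows one way to request it: the article text is wrapped in SSML using Polly's amazon:domain tag with the name "news". The helper name newscaster_ssml and the sample sentence are assumptions for illustration. The resulting string would be sent to synthesize_speech with TextType="ssml", the neural engine, and a voice that supports the style.

```python
from xml.sax.saxutils import escape


def newscaster_ssml(text):
    """Wrap plain text in SSML that requests Polly's news speaking style.

    newscaster_ssml is a hypothetical helper. escape() protects characters
    such as & and < that would otherwise break the SSML markup.
    """
    return (
        '<speak><amazon:domain name="news">'
        + escape(text)
        + "</amazon:domain></speak>"
    )


ssml = newscaster_ssml("Markets rose today, led by tech & energy stocks.")
print(ssml)
```

Escaping the text matters in practice because headlines and article bodies often contain ampersands or angle brackets.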

Amazon Polly users can also take advantage of Amazon Translate. Working together, the two services translate a publisher's content into the user's preferred language and read it aloud. What does this mean for you? You can make your content available to a much larger audience.
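A minimal sketch of how the two services might be chained is shown below. The function translate_and_speak and the two stand-in client classes are hypothetical; the stand-ins only mimic the shape of the boto3 calls translate_text and synthesize_speech so the example can run without an AWS account. The source language "en" and the voice "Celine" are likewise assumptions.

```python
import io


def translate_and_speak(text, target_lang, voice_id, translate_client, polly_client):
    """Translate English text, then synthesize the translation as MP3 bytes."""
    translated = translate_client.translate_text(
        Text=text,
        SourceLanguageCode="en",  # assumed source language
        TargetLanguageCode=target_lang,
    )["TranslatedText"]
    response = polly_client.synthesize_speech(
        Text=translated,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    return translated, response["AudioStream"].read()


class FakeTranslate:
    """Offline stand-in for boto3.client("translate"); it does not translate."""

    def translate_text(self, Text, SourceLanguageCode, TargetLanguageCode):
        return {"TranslatedText": f"[{TargetLanguageCode}] {Text}"}


class FakePolly:
    """Offline stand-in for boto3.client("polly"); returns the text as bytes."""

    def synthesize_speech(self, Text, OutputFormat, VoiceId):
        return {"AudioStream": io.BytesIO(Text.encode("utf-8"))}


translated, audio = translate_and_speak(
    "Hello readers.", "fr", "Celine", FakeTranslate(), FakePolly()
)
```

In a real setup, the stand-ins would be replaced by boto3.client("translate") and boto3.client("polly"), and the voice would be chosen to match the target language.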

Which publishers are already using Amazon Polly?

Amazon Polly is extremely exciting for large and small publishers alike.

Although the technology is still very young, some pioneers are already using Amazon Polly. They include Gannett, The Globe and Mail, Ringier, Success Magazine, TIM Media, Encyclopedia Britannica, and CommonLit.

Audio Now from The Globe and Mail

The Globe and Mail is one of Canada's most widely read print and digital newspapers. The Globe and Mail has used Amazon Polly specifically to increase customer engagement.

According to Greg Doufas, chief technical and digital officer at The Globe and Mail, the newspaper has used Amazon Polly to help users get better access to The Globe and Mail's award-winning journalism.

The in-house product is called Audio Now and uses the Amazon Polly Newscaster. Doufas says Audio Now is a first for Canada.

Globe and Mail readers can access Audio Now by simply clicking on an article that interests them. Because Canada is a multilingual country, The Globe and Mail offers Audio Now in English and French, each with male and female voices.

The newspaper attracts a worldwide audience. Articles can even be read aloud in Mandarin. Audio Now from The Globe and Mail is a pioneer. It's already changing the way The Globe and Mail readers consume content.

Gannett deploys Amazon Polly

In addition to The Globe and Mail, Gannett uses Amazon Polly.

"Services like Amazon Polly and Newscaster Voice help us deliver timely and original news quickly and seriously, in line with our brand."

Scott Stein, VP of Content Ventures at Gannett

Amazon Polly is extremely useful in the world of news: with news changing by the second, journalists simply don't have the time to go into a recording booth and record a voiceover of an article.

With Amazon Polly, that situation is changing. Journalists can spend more time actually reporting the news without having to give up an audio version of their articles.


We are still in the early days of technologies like Amazon Polly. So it's very likely that there's more innovation ahead.

I think it's worth checking out Amazon Polly right now. Audio is a proven way to consume content. Whether we're commuting to work or just enjoying a relaxing evening at home, audio provides a great way to be entertained while doing all of these everyday things.

Audio content consumption is steadily increasing, and this trend will continue for some time. So what's the bottom line? Simple. Audio as a publishing channel is here to stay. That's why publishers should definitely take this channel seriously.
