@Ignatia Webs: Big Question: How to use Text-to-Speech in eLearning and when

Thursday, 2 September 2010

In the Big Question launched by Tony Karrer this month, he asks how we use Text-to-Speech (TTS) in our courses. He also reflects on budget and on combining TTS with on-screen text, but I will only look at when and how I have been using TTS.

Tony Karrer and Joel Harband (from Tuval software) got together and wrote three posts on embedding Text-to-Speech in your eLearning courses. If you have not used TTS in your courses yet, have a look. Once you have gone through these three posts, you will know how to embed TTS in your eLearning courses (e.g. integrate it into Articulate or Lectora, or use it in Adobe Captivate...). The three posts focus on different angles of TTS:

Text-to-Speech Overview and NLP Quality: this post focuses on what TTS is and what the quality of the voices is when using these technologies (how natural do they sound?)
The following TTS post focuses on the Digital Signal Processor (yes, for you techy audio people out there)
And in the last post, Tony and Joel look at how TTS can be embedded in eLearning authoring software.

But Tony's Big Question of this month is: when do you use TTS in your courses? Or when do you use it to keep up with some learning activities yourself?

When the audio is only a dry representation of the content => TTS fits
Of course, what do I call 'dry content'? And can you - as a trusted eLearning developer - allow yourself to distribute dry content? Preferably not, but sometimes content is not that flashy, and in these cases TTS might cut audio costs. You would not want your learners to listen to lengthy TTS audio files anyway: any learning should be delivered bite-size, certainly when the topic is dry in itself.
Producing an audio file for those who want one is quick with TTS. TTS can greatly benefit all your learners who are visually impaired in any way, but also people who are learning in an environment that is not favorable to reading (on a bumpy bus ride, while driving a car, while riding along with a co-worker at night, at night during a power cut...). Or it can simply indulge those learners who have a better audio memory than reading memory. So, if you address a variety of learners and you want to offer them an audio file, TTS can really make it easy (and cheap) to do this.

I use TTS for a variety of reasons, sometimes for professional reasons, sometimes personal.
Personal use: I use OdioGo (a TTS service that turns RSS feeds into podcasts) to turn my blog posts into audio files. By linking them to iTunes, all my posts can easily be downloaded as podcasts. So if anyone wants to listen to them, they can.

Professionally I do use it, but with a limitation. With two limitations actually (a very small audio studio is one, which I will not go into). When the content is dry, TTS works fine, but when the content is sensitive or in need of nuance, I tend to go for a voice-over.
The real voice is preferred to get closer to the real learning context. To enhance our animations we sometimes use real voices
In the past I have used voice-overs for animation. For instance, with Moviestorm (an animation software), I recorded my own voice over the animation. This was also done with other eLearning movies that were based on animations (Spanish example below, the second half of the 2 min video features the animation).

Why did we go for the voice-over of a real person? It was a matter of keeping the quality of the voice in line with the content. Many TTS packages do not always match the feel of a text, and the same phrase voiced with a different intonation can really alter the message of a sentence. For instance, if you want to sound ironic, you will use a basic sentence (e.g. Do you?), and only add an extra intonation to give it the ironic edge.

Adding a real voice can sometimes also add to the cultural background and fit the learning relationships of certain cultures. For example, in some parts of Africa the grandmother knows her medicine, which means that if you are ill, you go to grandma to get cured. So when we prepare a course to increase medical awareness, we tend to add the real voice of an older African woman to the course. This way it also feels more relevant and truthful from a cultural perspective.

It is the same with GPS voices. These are all smooth, clean voices, but that does not add to their credibility. When a GPS has a 'dialect' voice that fits a certain area (e.g. a Texan voice), it immediately comes closer to the heart of some people.

In short: for very dry material, I would go for TTS and give it as an option to learners who prefer to learn via audio. But for more sensitive and context-related content, I still prefer a voice-over to add a more human feel to a course.