e-Learning Narration (TTS vs In-House vs Professional Voice-Over)

As an e-Learning developer, I always work hard to improve my development workflow when designing courses for my clients. Over the years, I’ve used Text-to-Speech (TTS), internal staff, and professional voice-over artists to narrate scripts. Here’s the information I’ve curated, along with my personal comments on these options.

When to use Text-to-Speech in E-Learning

Nicole Legault provides some key reasons to use TTS in her article, “When to Use Text-to-Speech in E-Learning.”

“Using text-to-speech (TTS) is a great option when you can’t add professional narration to your courses. Simply type up a script, and the system will automatically generate audio clips based on that text. You can pick the language and gender of the voice, as well as choose from different voices to find the one you like best.”
Nicole Legault at https://community.articulate.com/articles/when-to-use-tts

She goes on to provide a few examples of situations where it makes the most sense:

During the storyboarding process
When you don’t have a real person or voice actor
Increasing accessibility for learners with visual impairment
Simplifying content maintenance when one or two words change in the script

I like to use TTS during the storyboarding process to provide my clients with a feel for the flow of the course. TTS also has the added advantage of helping me fine-tune my script. As helpful as the TTS-generated voices are, I prefer not to use them in my final e-Learning courses. Here’s why:

“Even the best commercially-available concatenated speech systems do not even attempt to conquer the problem of emphasis. In normal speech, we convey emotions through a range of tricks – pauses, the timing of syllables, tone. Even in the lab, the best attempts at putting emotions like anger and fear in synthesized speech successfully convey these feelings only about 60% of the time, and the numbers are even worse for joy.”
MIT Technology Review at https://www.technologyreview.com/2010/08/24/262187/why-synthesized-speech-sounds-so-awful/

I’ve used the TTS features included with e-Learning applications, as well as online TTS generation services, for my initial course builds. Over time, these have gotten much better but still fall short, in my opinion. Once I am confident I have a solid script approved by my client, I prefer to seek out a professional voice-over artist to elevate my client’s content.

What do I look for in a professional narrator?

Style

The topics covered determine whether the narration needs to be conversational or formal. Once I know the style I am looking for, I reach out to my professional voice-over network and ask for sample reads of a small section of the client’s script. For me, the client needs to hear the narrator read their script and not some unrelated content.

High-Quality Audio

I want the final product to have a crisp, clear sound, which requires a professional recording studio. The pace and volume need to be consistent, especially when I have to go back and have a section re-recorded. I don’t want to spend large amounts of time editing dead space, fixing pacing problems, adjusting audio levels, or removing background noise.
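As a rough illustration of the volume-consistency point, here is a minimal Python sketch that measures the average (RMS) level of each recording and flags any file that sits noticeably louder or quieter than the rest. It is only a sketch under stated assumptions: it handles 16-bit PCM WAV files only, the function names are hypothetical, and the 3 dB tolerance is an arbitrary starting value rather than an industry standard.

```python
import math
import statistics
import sys
import wave
from array import array


def rms_dbfs(samples, full_scale=32768):
    """Return the RMS level of 16-bit PCM samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure digital silence
    return 20 * math.log10(math.sqrt(mean_square) / full_scale)


def read_wav_samples(path):
    """Read a 16-bit PCM WAV file and return its samples as a list of ints.

    Stereo files are returned interleaved, so the level covers both channels.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples)


def flag_inconsistent(levels, tolerance_db=3.0):
    """Return the names whose level is more than tolerance_db from the median.

    `levels` maps a file name to its RMS level in dBFS. The 3 dB default
    is an assumption for illustration only.
    """
    median = statistics.median(levels.values())
    return [name for name, level in levels.items()
            if abs(level - median) > tolerance_db]
```

In practice, I would run `rms_dbfs(read_wav_samples(path))` on the original read and on each re-recorded section, then pass the results to `flag_inconsistent` to see which clips need level adjustments before they go into the course. A check like this only catches volume differences; pacing and room noise still have to be judged by ear.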

Turnaround Time

I want narrators to be responsive. I try to plan 5-6 days of lead time in my production schedule for the initial read. For revisions, I look for individuals who can turn them around in 24-48 hours.

What about using internal staff to record audio narration?

Using internal narrators can be a time-saver, but it can create more work in the end.

“After all the effort and money invested in a proper LMS system, snazzy graphics, 3D animations, script preparation and approvals, music selection, and any other mechanics that go into the project, the last thing you want is to do is have the voice delivering the message to be one of the reasons people DON’T want to ‘be subjected’ to the training. Voice-over is much more than just ‘reading it out loud into a microphone’.”
David Gilbert at https://www.lmspulse.com/2021/why-hire-a-professional-voice-over-for-your-elearning-project-the-3-key-reasons/

David Gilbert’s article lists several reasons why using internal staff might not be in your best interest. Here are some of the key points I found that made the most sense from my personal experience:

Internal staff narrators don’t usually have the formal training to control their pitch, volume, and energy.
Professionals know how to connect with the script, which in turn, helps the learner connect.
Internal staff narrators don’t usually have access to a recording room with professional-level equipment. They will usually record in the first empty room they can find. Each room introduces distracting acoustic issues such as echoes, background office noise, and street sounds. It’s almost impossible to get consistent quality recordings when moving from one room to another.
Without the proper equipment and the knowledge to use it correctly, internal narrators produce plosives, mouth noises, clicks, and breath noises, and they speak too close to or too far from the microphone. These mistakes result in the need to re-record, and they sometimes end up in the developer’s production queue, where the developer has to fix them.
If the internal staff narrator leaves the company, someone else has to take over, which introduces the problem of multiple narrators with varying degrees of competency and engagement over time. Numerous narrators result in multiple retakes and can delay the e-Learning project.

Can you hear the difference?

I think the best way to hear the value of professional narration is to compare it side by side with TTS. Here are three sample audio narration tracks for a recent demo project I completed. I created the first two samples using Storyline 360’s TTS feature and an online TTS generator. David Kaplan from Voice On the Run, Inc. provided the third sample, which I used in my final demo.

Storyline 360 TTS

Online TTS Generator

Professional Voice Over (David Kaplan)