The Rise of Synthetic Voices in Global Multimedia Projects
It’s 1979. The Buggles’ hit song “Video Killed the Radio Star” dominates the airwaves, forecasting the decline of radio as a medium:
“They took the credit for your second symphony
Rewritten by machine on new technology
And now I understand the problems you can see”
Fast forward to 2024, and a similar gloom hangs over human voiceovers, with Artificial Intelligence taking the stage.
Will AI kill the voiceover star?
AI Takes a Seat in Multilingual Multimedia Projects
As a boutique localization agency where client needs tend to shape research and product development, espell has been experimenting with AI voiceover technology to explore its potential. According to Pál Szirmai, a localization engineer at espell (watch him speak in the video below), AI voiceover has matured, but it’s not ready to replace human talent entirely.
“We’re in the era of AI voiceover,” Szirmai explains, “but it’s not yet poised to dethrone the radio star.”
espell’s exploration began with a familiar challenge: tight deadlines.
“We had been working with this particular client regarding various projects, among them transcreation of videos and everything else it entails: casting, dubbing services, studio work, video and sound engineering, and so on.”
This long-standing client sought faster turnaround times for their international voiceover projects. Together, we began testing synthetic voice technology to speed up our processes: we wanted to find a solution that would help our clients localize a larger number of videos in a shorter period of time, while keeping the budget down as much as possible.
When should we use synthetic voices? Watch our interview above.
Cutting Turnaround Time
Human voiceover is the gold standard for multimedia projects. The downside? It’s time-intensive. From booking studio sessions and coordinating voice artists to editing and quality assurance, the process can stretch out for weeks. Any revisions can further delay the delivery, sometimes by weeks, depending on the availability of the voiceover artists and the infrastructure.
While justified for projects prioritizing originality and natural-sounding voices, this lengthy process can be inefficient for others. Enter synthetic voices.
Using synthetic voices greatly shortens the time to delivery. Translation, transcreation, and editing remain just as meticulous as they are with human voices, and the linguistic checks and QA processes all stay in place, but generating the voices in-house slashes production delays.
“We can create voices in-house much faster than booking studios and artists,” says Szirmai. “This saves us days, even weeks.”
Complement, not Replacement
Despite its advantages, AI hasn’t made human voiceovers obsolete. espell continues to collaborate with skilled voiceover artists, especially for projects demanding top-tier quality, such as advertising campaigns.
“For ads, movies, or any creative work where the human touch is essential, AI can’t compete,” states Szirmai.
espell’s multimedia services can draw on both approaches. In our full-service model, we connect directly with the client’s content developers, agencies, or creative departments and produce multilingual versions of their creative works, videos, podcasts, and print and web content on the fly. Along the way, we make whatever cultural adaptations the material needs and may employ human voices, synthetic voices, or both, depending on the project’s attributes.
The Future of Voiceover
For now, AI voiceover occupies a niche. It’s ideal for projects where speed and efficiency outweigh artistic requirements, such as software feature demos or educational content.
“Campaigns and films that need the nuanced performance of trained actors won’t be replaced by AI anytime soon,” Szirmai reassures.
Still, AI is evolving. It’s already shouldering some of the workload in international voiceover jobs, complementing rather than replacing human talent.
So, while synthetic voices are here to stay, it might still be too soon to sing along with The Buggles:
“In my mind and in my car
We can't rewind we've gone too far
Pictures came and broke your heart
Put the blame on VCR”