Text-to-speech has already become a significant part of many teams’ workflows, saving them time and money by replacing recorded voice overs. Now, with more realistic, human-sounding voices, it’s an even more powerful tool.
How has text-to-speech changed?
If you haven’t checked out text-to-speech tech in the last few months, you might be surprised by just how good it’s gotten.
In fact, we recently ran a quick experiment where we recorded the voices of our marketing team and played them alongside a single AI-generated voice. We asked our followers a simple question: Which voice was AI?
Commenters were all over the place. Some held fierce conviction that the (real) voices from our team must be AI; others asserted that, in fact, every voice was a human. One thing was for sure: AI voices now sound so natural that it's hard to tell the difference.
Realistic, high-quality text-to-speech voices
We’ve added high-quality, realistic voices in nine languages to our existing text-to-speech feature, built on ElevenLabs’ powerful speech API.
ElevenLabs is at the forefront of exciting new developments in realistic text-to-speech; their research powers the natural, human-like tones and speaking cadences of the new AI voices we’ve added to Kapwing.
Save time with AI voice overs
Recording voice overs manually takes more time and effort than it should – getting the audio levels just right, getting through the script without flubbing a take, getting last-minute changes to the script that mean rerecording the whole thing. And that’s before you get to the editing process, removing filler words and background noise.
Text-to-speech lets you skip all that.
Kapwing’s newly available AI voices allow you to convert a script or document into a video without ever having to press record.
With professional-sounding text-to-speech, you can:
- Create engaging social media videos
- Create polished, engaging video ads
- Add consistent voice overs to new training videos
- Update old training videos, instead of rerecording
Reach new audiences with translated voice overs
Only 5% of the global population are native English speakers, which is why localization is pivotal for scaling the reach of your content. With AI-powered voice overs, you can confidently convert your videos for global audiences sooner than you thought possible.
In addition to three English accents (US, UK, and Australian), Kapwing now offers premium text-to-speech voices across eight other languages: Arabic, French, German, Hindi, Italian, Polish, Portuguese, and Spanish (both Castilian and Mexican).
Here's a sample using one of the new Italian voices:
Translate your video script or voiceover transcript to one of our supported languages, then let our text-to-speech tool handle the rest.
Finish editing your video all in one place
Just like our other AI-powered tools, these new AI voices are integrated right into the Kapwing editor. After creating your AI voice over, fine-tune the rest of your video with a full video creation suite at your fingertips.
- Quickly rough cut with Smart Cut or our text-based editor
- Generate and customize word-by-word subtitles
- Share your edits and get real-time feedback from your team
Get better AI-generated voice overs
Voice overs have always been integral to creating great videos; they literally give voice to your ideas. Text-to-speech voices have made it easier than ever to create those voice overs. But until recently, they weren’t very convincing. You knew you were listening to an AI voice over.
Now, truly realistic voices are just a few keystrokes away – right from the place where you already make your videos.
The new voices are live in Kapwing now; give them a try.