How to use Text-to-Speech in Kapwing

Image of Kit waving with the words "Hello world" in a text box and waveform

You've seen viral videos on TikTok and Instagram with an automated voice that reads text, and you want to know how it's done. On Kapwing, you can automatically add a synthetic voiceover to a project with the Text-to-Speech feature! This tutorial explains what the feature is and how to use it, so you can create and share more stories.

What is Text-to-Speech?
How do you use Text-to-Speech?
Subtitles with Text-to-Speech
Voice Cloning
What languages does Kapwing TTS support?

What is Text-to-Speech?

Text-to-Speech lets users generate audio from the text in their projects and choose which voice to use for that audio (e.g., American male or female).

How do you use Text-to-Speech?

1. Create a project on Kapwing.

2. Upload a video or image using the "Media" tab, URL upload, or drag and drop.

3. Add text to the project by selecting the "Text" tab on the left. Edit and customize the text's style, color, and font in this stage. Please note that there is a 1000-character limit for text boxes you want to use Text-to-Speech on.

4. While the text is still selected, select the "Text-to-Speech" button under Effects in the right sidebar.

Alternatively, you can go directly to the "Audio" tab in the left sidebar and click the "Text-to-Speech" button. This option does not require adding the text layer described in step 3. You can use either option to generate Text-to-Speech audio.

5. Choose a voice option for the text. To change the voice, select an alternative voice.

6. Continue editing your project by using the many other features Kapwing has to offer, such as Subtitles and Elements.

7. Once your project is to your liking, you can export it by selecting the "Export" button at the top right of the page, confirming the export settings, and clicking "Export" again at the bottom of the side panel.

Note: The text and audio are not attached. If you move the text, you will also need to move the audio on the timeline; otherwise, they will not align. You can also delete the text and keep only the audio.
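If your script is longer than the 1000-character limit mentioned in the steps above, you will need to split it across several text boxes. The Python sketch below is one possible way to prepare the text before pasting it into Kapwing. It does not call any Kapwing API; the function name `split_script` and the constant `MAX_CHARS` are hypothetical, and it assumes the limit counts characters the same way Python's `len` does, which may not match Kapwing exactly.

```python
import re

# Assumption: the per-text-box limit from the steps above, counted with len().
MAX_CHARS = 1000


def split_script(script: str, limit: int = MAX_CHARS) -> list[str]:
    """Split a script into chunks of at most `limit` characters,
    breaking at sentence boundaries where possible."""
    # Split on whitespace that follows ".", "!" or "?".
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at hard boundaries.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            # The sentence still fits in the chunk being built.
            current = candidate
        else:
            # Start a new chunk with this sentence.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    long_script = "This is one sentence of a longer voiceover script. " * 40
    for number, chunk in enumerate(split_script(long_script), start=1):
        print(f"Text box {number}: {len(chunk)} characters")
```

Each chunk can then be pasted into its own text box, and Text-to-Speech generated for each one in turn.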

Subtitles with Text-to-Speech

Sometimes you might want an automated voice with subtitles attached to the audio instead of a separate text layer:

1. Create the audio you want by using the Text-to-Speech feature.

2. Make sure the "Automatically generate subtitles" option is selected.

3. Then select "Generate the audio layer". This will create both the audio and the subtitles linked to it.

Note: The subtitles will move with the audio wherever you place it on the timeline. If you do not want the subtitles, you will have to regenerate the audio with the "Automatically generate subtitles" option not selected.

Need ideas for what to do with Text-to-Speech? Have an automated voice read the subtitles for a video, create a verbal description of the images or video in your project, make a viral video with the automated voice trend, or delete everything but the audio and export an audio file.

Voice Cloning

Kapwing lets you save a clone of your voice so that you can create a Text-to-Speech layer using your own voice model. We've enabled Voice Cloning in partnership with ElevenLabs.

To add a voice clone, you must be a Business plan customer. Business plan customers can save up to 2 voice clones in their Brand Kit. Once you've upgraded to the Business plan, click the "Add new Voice" button in the Text-to-Speech dropdown menu. You'll be prompted to upload an example of the speaker whose voice you want to clone. Note that customers MUST have the rights to clone a speaker's voice, as noted in Kapwing's terms of service.

To delete a voice clone, go to your Brand Kit and scroll down to the saved voice clones. Hover over a voice model icon and click the delete icon that appears in the upper corner.

What languages does Kapwing TTS support?

Kapwing supports the same 30 languages for Text-to-Speech as it does for dubbing. See the full list of supported languages below.

Supported Language List

English (US)
English (UK)
English (AUS)
Arabic (Multi-Region)
Chinese (Mandarin)
Filipino (Tagalog)
Portuguese (Brazil)
Portuguese (Portugal)
Spanish (Spain)
Spanish (Mexico)

* we do not support voice cloning in this language

Looking for more help?

Check our Release Notes for tutorials on how to use the latest Kapwing features or contact us.