How to use Text-to-Speech in Kapwing

You've seen viral videos on TikTok and Instagram with an automated voice that reads text, and you want to know how it's done. On Kapwing, you can automatically add a synthetic voiceover to a project with the Text-to-Speech (TTS) feature. This tutorial explains what the feature is and how to use it, so you can create and share more stories.

What is Text-to-Speech?
How do you use Text-to-Speech?
Subtitles with Text-to-Speech
Voice Cloning
What languages does Kapwing TTS support?

What is Text-to-Speech?

Text-to-Speech (TTS) lets users generate audio from the text in their projects and choose which voice to use for that audio (e.g., American male or female).

How do you use Text-to-Speech?

1. Create a project on Kapwing

2. Upload a video or image from the "Media" tab, by URL upload, or by drag and drop.

3. Add text to the project by selecting the "Text" tab on the left. You can edit the text's style, color, and font at this stage. Note that any text box you want to use with Text-to-Speech is limited to 1000 characters.

4. While the text is still selected, click the "Text-to-Speech" button under Effects in the right sidebar.

Alternatively, go directly to the "Audio" tab in the left sidebar and click the "Text-to-Speech" button. This option does not require the text layer added in step 3. Either option generates Text-to-Speech audio.

5. Choose a voice for the text. To change the voice, select a different one from the voice options.

6. Keep each Text-to-Speech audio box to 1000 characters or fewer. The limit helps ensure accuracy and a smooth experience. You cannot exceed it in a single box, but you can create multiple boxes: if the first one isn't enough, click anywhere on the timeline outside the audio file and type in the now-empty box to generate more TTS audio.

7. To edit existing TTS audio, click the audio file in the timeline and edit the text in the text box. Then click re-generate so the update is fully processed.

8. Once the project is to your liking, export it by selecting the "Export" button at the top right of the page, confirming the export settings, and clicking "Export" again at the bottom of the side panel.

Note: The text and audio are not linked, so if you move the text, you also need to move the audio on the timeline; otherwise they will not align. You can also delete the text and keep only the audio.
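If your script is longer than the 1000-character limit mentioned in the steps above, it can help to split it into box-sized pieces before pasting each piece into its own TTS box. The following is a minimal Python sketch of one way to do that outside of Kapwing. The function name `split_for_tts` and the constant `MAX_CHARS` are hypothetical names chosen for illustration; this is not part of any Kapwing tool or API, and it assumes that breaking at sentence ends is acceptable for your script.

```python
import re

MAX_CHARS = 1000  # per-box limit stated in this tutorial


def split_for_tts(script: str, limit: int = MAX_CHARS) -> list[str]:
    """Split a script into chunks of at most `limit` characters,
    breaking at sentence ends where possible."""
    # Split after ".", "!", or "?" when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in this chunk
        else:
            chunks.append(current)  # chunk is full, start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    long_script = "This is one sentence of a longer voiceover script. " * 40
    for number, chunk in enumerate(split_for_tts(long_script), start=1):
        print(f"TTS box {number}: {len(chunk)} characters")
```

Each returned chunk can then be pasted into a separate TTS box in order, and the resulting audio files arranged one after another on the timeline.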

Your Text-to-Speech limits renew on the first of each month, regardless of your billing date. For example, if you use up your limit by June 15th and your billing date is June 20th, your limit renews on July 1st.
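The renewal rule above can be written as a small date calculation. This is only an illustrative Python sketch; the function name `next_tts_renewal` is hypothetical and not part of Kapwing. It deliberately takes no billing date, since the rule says the billing date does not affect renewal.

```python
from datetime import date


def next_tts_renewal(today: date) -> date:
    """Return the first day of the month after `today`."""
    if today.month == 12:
        # December rolls over to January of the next year.
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


if __name__ == "__main__":
    # Limit used up on June 15th: the next renewal is July 1st.
    print(next_tts_renewal(date(2024, 6, 15)))
```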

Subtitles with Text-to-Speech

Sometimes you may want an automated voice with subtitles attached to the audio instead of a separate text layer:

1. Create the audio you want by using the Text-to-Speech feature.

2. Make sure the "Automatically generate subtitles" option is selected.

3. Then select "Generate the audio layer". This creates both the audio and the subtitles linked to it.

Note: The subtitles move with the audio wherever you place it on the timeline. If you do not want the subtitles, you will have to regenerate the audio with the "Automatically generate subtitles" option deselected.

Need ideas for what to do with Text-to-Speech? Have an automated voice read the subtitles for a video, create a verbal description of the images or video in your project, make a viral video with the automated voice trend, or delete everything but the audio and export an audio file.

Voice Cloning

Kapwing lets you save a clone of your voice so that you can create a Text-to-Speech layer using your own voice model. We've enabled Voice Cloning in partnership with Eleven Labs.

To add a voice clone, you must be a Business plan customer. Business plan customers can save up to 2 voice clones in their Brand Kit. Once you've upgraded to the Business plan, click the "Add new Voice" button in the Text-to-Speech dropdown menu. You'll be prompted to upload an example of the speaker whose voice you want to clone. Note that customers MUST have the rights to clone a speaker's voice, as noted in Kapwing's terms of service.

To delete a voice clone, go to your Brand Kit and scroll down to the saved voice clones. Hover over a voice model icon and click the delete icon that appears in the upper corner.

What languages does Kapwing TTS support?

Kapwing supports the same 30 languages for Text-to-Speech as it does for dubbing. See the full list of supported languages below.

Supported Language List

English (US)
English (UK)
English (AUS)
Arabic (Multi-Region)
Chinese (Mandarin)
Filipino (Tagalog)
Portuguese (Brazil)
Portuguese (Portugal)
Spanish (Spain)
Spanish (Mexico)

* We do not support voice cloning in this language.

Looking for more help?

Check our Release Notes for tutorials on how to use the latest Kapwing features or contact us.