How to use Text-to-Speech in Kapwing

Image of Kit waving with the words "Hello world" in a text box and waveform

You've seen viral videos on TikTok and Instagram in which an automated voice reads the on-screen text, and you want to know how it's done. On Kapwing, you can automatically add a voiceover to a project with the Text-to-Speech feature! This tutorial explains what the feature is and how to use it so you can create and share more stories.

What is Text-to-Speech?

Text-to-Speech lets users generate audio from the text in their projects and choose which default voice to use for that audio (e.g., American male or female).

Where is Text-to-Speech in Kapwing?

Once you add text to the project and select the text, there are a couple of ways to get to the Text-to-Speech feature:

1. In the right sidebar, under Edit, click "Text-to-Speech". This takes you to the same place as Option 2.

OR

2. Go there directly by clicking "Audio" in the left sidebar. You will see both a Text-to-Speech button and a Text-to-Speech tab; you can select either one.

How do you use Text-to-Speech?

1. Create a project on Kapwing.

2. Upload a video or image from the "Media" tab, by URL upload, or by drag and drop.

3. Add text to the project by selecting the "Text" tab on the left. Edit and customize the text's style, color, and font at this stage. Please note that there is a 1000-character limit for text boxes you want to use Text-to-Speech on.

4. While the text is still selected, select the "Text-to-Speech" button in the right sidebar under Edit.

5. Choose a voice option for the text. To change the voice, select an alternative voice.

6. Continue editing your project with the many other features Kapwing has to offer, such as Subtitles, Pinning, and Elements.

7. Once your project is to your liking, export it by selecting the "Export" button at the top right of the page, confirming the export settings, and clicking "Export" again at the bottom of the side panel.

Note: The text and audio are not attached to each other, so if you move the text, you will also need to move the audio on the timeline; otherwise they will no longer align. You can also delete the text and keep only the audio.
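If your script is longer than the 1000-character limit mentioned in the steps above, one option is to split it into several text boxes before generating audio. The snippet below is only a rough sketch of how you might prepare those pieces ahead of time. It is not part of Kapwing, the function name `split_for_tts` is made up for illustration, and it assumes the limit counts plain characters including spaces.

```python
def split_for_tts(text: str, limit: int = 1000) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking at spaces.

    Assumption: the limit counts every character, including spaces.
    Runs of whitespace (including line breaks) are collapsed to single spaces.
    """
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into hard slices.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this word would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    script = "Hello world, this is my voiceover script. " * 60
    for number, chunk in enumerate(split_for_tts(script), start=1):
        print(f"Text box {number}: {len(chunk)} characters")
```

Each resulting chunk can then be pasted into its own text box and converted separately. Because the text and audio are not attached, you would still need to line up the generated audio layers on the timeline by hand.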

Subtitles with Text-to-Speech

Sometimes you might want an automated voice accompanied by subtitles that are attached to the audio, rather than a regular text layer:

1. Create the audio you want by using the Text-to-Speech feature.

2. Make sure the "Automatically generate subtitles" option is selected.

3. Then select "Generate the audio layer". This will create both the audio and subtitles linked to the audio.

Note: The subtitles will move with the audio wherever you place it on the timeline. If you do not want the subtitles, you will have to regenerate the audio with the "Automatically generate subtitles" option not selected.

Need ideas for what to do with Text-to-Speech? Have an automated voice read the subtitles for a video, create a verbal description of the images or video in your project, make a viral video with the automated voice trend, or delete everything but the audio and export an audio file.

Looking for more help?

Check our Release Notes for tutorials on how to use the latest Kapwing features or contact us.