How to use Text-to-Speech in Kapwing

Image of Kit waving with the words "Hello world" in a text box and waveform

You've seen viral videos on TikTok and Instagram with an automated voice that reads text, and you want to know how it's done. On Kapwing, you can automatically add a voiceover to a project with the Text-to-Speech feature! This tutorial explains what the feature is and how to use it, so you can create and share more stories.

What is Text-to-Speech?

Text-to-Speech generates audio from the text in your project and lets you choose which voice reads it (e.g., American male or female).

Where is Text-to-Speech in Kapwing?

Once you add text to the project and select it, there are a couple of ways to get to the Text-to-Speech feature:

1. In the right sidebar, under Edit, click "Text-to-Speech". This takes you to the same place as option 2.

OR

2. Go there directly by clicking "Effects" in the right sidebar. The Text-to-Speech feature and its voice options appear there.

How do you use Text-to-Speech?

1. Create a project on Kapwing.

2. Upload a video or image using the "Media" tab, URL upload, or drag and drop.

3. Add text to the project by selecting the "Text" tab on the left. Edit and customize the text's style, color, and font at this stage. Note that text boxes used with Text-to-Speech are limited to 200 characters.

4. While the text is still selected, click the "Text-to-Speech" button under Edit in the right sidebar.

5. Choose a voice option for the text. To change the voice, select a different one. The text is then linked to an audio file that reads the written text aloud.

Note: Once Text-to-Speech is applied, the text's font and style are locked. To turn it back into a regular text layer, select the text and click "Revert".

6. Continue editing your project with Kapwing's other features, such as Subtitles, Pinning, and Elements.

7. Once the project is to your liking, export it: select the "Export" button at the top right of the page, confirm the export settings, and click "Export" again at the bottom of the side panel.
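Because of the 200-character limit on Text-to-Speech text boxes noted in the steps above, a longer script has to be spread across several text boxes. The following is a minimal Python sketch of one way to pre-split a script before pasting it into Kapwing. It is not part of Kapwing; the function name `split_for_text_boxes` and the constant `MAX_CHARS` are made up for illustration, and the sketch assumes the limit counts every character, including spaces.

```python
import textwrap

# Limit stated in this tutorial for Text-to-Speech text boxes.
# Assumption: spaces and punctuation count toward the limit.
MAX_CHARS = 200


def split_for_text_boxes(script: str, limit: int = MAX_CHARS) -> list[str]:
    """Split a script into pieces of at most `limit` characters.

    Pieces break on whitespace where possible, so words stay whole
    unless a single word is longer than the limit.
    """
    return textwrap.wrap(script, width=limit)


if __name__ == "__main__":
    script = (
        "Welcome to our channel. In this video we walk through the basics "
        "of adding an automated voiceover to a project, picking a voice, "
        "and exporting the result so it can be shared on social media. "
        "Stick around to the end for a few ideas on what to make next."
    )
    # Print each piece with its length so it can be pasted into its own text box.
    for number, piece in enumerate(split_for_text_boxes(script), start=1):
        print(f"Text box {number} ({len(piece)} characters): {piece}")
```

Each printed piece can then be pasted into its own text box and converted with Text-to-Speech separately.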

Using Just the Audio from Text-to-Speech

Sometimes you might want to have an automated voice but not the text on the screen. Here's a quick tutorial on how to do this:

1. Create the audio you want by using the Text-to-Speech feature.

2. Detach the audio from the text by selecting the text with audio in the timeline. Right-click and select "Detach Audio" in the dropdown menu.

3. Delete the text above the newly separated audio by selecting the text and pressing the Backspace key, or by right-clicking and selecting "Delete" in the dropdown menu. You can then use the Text-to-Speech audio as you would any other audio.

Need ideas for what to do with Text-to-Speech? Have an automated voice read the subtitles for a video, create a verbal description of the images or video in your project, make a viral video with the automated voice trend, or delete everything but the audio and export an audio file.

Looking for more help?

Check our Release Notes for tutorials on how to use the latest Kapwing features or contact us.