Dubbing Glossary — AI Terms Explained
AI dubbing can reduce dubbing costs by as much as 50%

As AI technologies continue to evolve, they offer greater access to advanced editing tools that were traditionally unavailable to most consumers. When used effectively, these tools can instantly boost the quality, impact, and reach of your digital content.
One of the most powerful tools now available is AI language dubbing. It uses artificial intelligence to translate spoken audio and generate a synchronized voiceover in a different language.
While the process is automated, AI dubbing can still feel unfamiliar to creators who are new to video translation or editing in general. In this article, we’ll explain what AI dubbing is, define the key terms to know, and show you how to get started, even if you have no prior experience.
Visual learner? Check out our video guide on how AI dubbing and lip syncing work.
What is AI Dubbing and How Does it Work?
To understand how AI dubbing works, it helps to first look at how the process is traditionally done. In manual dubbing, a human translator listens to a video, transcribes the dialogue, and translates it into the target language. While some translators are fluent in multiple languages, most specialize in only a few, meaning large-scale multilingual projects often require an entire team.
Once the script is translated, a voice actor is brought in to record the new dialogue. Just like translators, most actors have specific language and tone specialties. Finding the right voice across multiple languages can be time-consuming and expensive.
AI dubbing follows a similar structure, but automates the process using machine learning. AI models can detect speech, transcribe the audio, translate it into a chosen target language, and generate a new voiceover using pre-trained voices or voice cloning. The dubbed audio is then synced to match the timing of the original video.
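To make that sequence concrete, here is a minimal Python sketch of the translate, synthesize, and sync steps applied to an already transcribed video. It is an illustration under stated assumptions, not how any particular product works: `Segment`, `DubbedSegment`, `dub_segments`, and the toy translate and synthesize functions are hypothetical names, and a real tool would put speech recognition, machine translation, and text-to-speech models in their place.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Segment:
    """One transcribed line of speech with its timing in seconds."""
    start: float
    end: float
    text: str


@dataclass(frozen=True)
class DubbedSegment:
    """A translated line plus its generated audio, kept on the original timing."""
    start: float
    end: float
    text: str
    audio: bytes


def dub_segments(
    segments: List[Segment],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> List[DubbedSegment]:
    """Translate each segment, generate speech for it, and reuse its time slot."""
    dubbed = []
    for seg in segments:
        translated = translate(seg.text)
        audio = synthesize(translated)
        dubbed.append(DubbedSegment(seg.start, seg.end, translated, audio))
    return dubbed


# Toy stand-ins (assumptions): a lookup table instead of a translation model,
# and UTF-8 bytes instead of real synthesized audio.
TOY_DICTIONARY = {"Hello, world.": "Hola, mundo."}


def toy_translate(text: str) -> str:
    return TOY_DICTIONARY.get(text, text)


def toy_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


transcript = [Segment(0.0, 1.5, "Hello, world.")]
result = dub_segments(transcript, toy_translate, toy_synthesize)
```

Because each dubbed segment keeps the start and end times of the line it replaces, the new audio lands where the original speech was, which is the simplest form of syncing.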
Automation also lowers costs. Even conservative estimates show that AI dubbing typically costs 15–50% less than traditional dubbing services, making it a practical choice for creators working with limited budgets or tight timelines.

Modern AI dubbing tools have made this technology more accessible than ever. With support for dozens of languages and hundreds of AI voices, creators can dub videos at scale with minimal effort.

AI Dubbing Glossary
Whether you're new to AI dubbing or just need a refresher, understanding the terminology helps you build confidence and edit more efficiently. Below is a breakdown of essential terms you'll come across when using AI dubbing, specifically when creating or repurposing video content.
Brand Glossary
A custom list of words, names, or phrases that should always be translated or pronounced in a specific way. This ensures brand consistency across all dubbed projects.
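One common way to enforce a glossary like this is to swap protected terms for placeholder tokens before translation and restore the required wording afterwards. The Python sketch below shows that idea only; `protect_terms` and `restore_terms` are hypothetical helpers, the token format is an assumption, and the source does not say how any specific dubbing tool implements its glossary.

```python
import re
from typing import Dict, Tuple


def protect_terms(text: str, glossary: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Replace each glossary term with a placeholder token the translator will not alter."""
    tokens = {}
    for i, (term, rendering) in enumerate(glossary.items()):
        token = f"__TERM{i}__"
        # Whole-word, case-insensitive match so "kapwing" is also caught.
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        if pattern.search(text):
            text = pattern.sub(token, text)
            tokens[token] = rendering
    return text, tokens


def restore_terms(text: str, tokens: Dict[str, str]) -> str:
    """Put the required rendering of each term back in place of its token."""
    for token, rendering in tokens.items():
        text = text.replace(token, rendering)
    return text


glossary = {"Kapwing": "Kapwing", "Smart Tools": "Smart Tools"}
masked, tokens = protect_terms("Open kapwing and click Smart Tools.", glossary)

# Stand-in for a real translation step (assumption): only the unprotected words change.
translated = masked.replace("Open", "Abre").replace("and click", "y haz clic en")
final = restore_terms(translated, tokens)
```

In this example the brand name and feature name come out with their required spelling and capitalization even though the input used lowercase "kapwing".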
Custom Spelling
Allows you to define specific spelling preferences within Subtitles, such as American vs British English (e.g., “color” vs “colour”), or enforce consistent use of branded terms.

Dub Video
A command that initiates the AI dubbing process. It uses the transcript to generate translated speech and align it with the video timing.
Dubbing
Replacing a video’s original audio with speech in a new language. AI dubbing automates this process by translating dialogue and generating synced voice-overs, helping creators reach multilingual audiences without needing to re-record audio manually.
Find and Replace
A feature for quickly editing text or translated phrases in bulk, which is useful for refining subtitles or fixing consistent errors across a project.
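As a rough illustration of what a bulk edit does, the Python sketch below replaces a whole word across a list of subtitle lines and reports how many replacements were made. `replace_in_cues` is a hypothetical helper written for this example, not part of any editor, and it assumes the search term starts and ends with a letter or digit.

```python
import re
from typing import List, Tuple


def replace_in_cues(cues: List[str], find: str, replacement: str) -> Tuple[List[str], int]:
    """Replace whole-word matches of `find` in every cue and count the changes."""
    pattern = re.compile(rf"\b{re.escape(find)}\b")
    updated = []
    total = 0
    for cue in cues:
        # A function replacement keeps backslashes in `replacement` literal.
        new_cue, count = pattern.subn(lambda _match: replacement, cue)
        updated.append(new_cue)
        total += count
    return updated, total


cues = [
    "Welcome to Kapwng.",
    "Kapwng makes dubbing simple.",
    "Nothing to fix here.",
]
fixed, changes = replace_in_cues(cues, "Kapwng", "Kapwing")
```

Returning the count makes it easy to confirm that a recurring error was fixed everywhere it appeared.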
Lip Sync
The process of aligning dubbed audio with the mouth movements of the speaker in a video. Accurate lip syncing makes AI-dubbed content look more natural and believable.
Example of a dubbed video with lip syncing applied.
Localization
More than just translating text, localization involves adapting a video to a specific region or culture. This can include idioms, slang, date formats, and other cultural references to ensure the content feels native to the target audience.
Playback Speed / Speed Adjust
Tools for changing the speed of a video or layer within the Kapwing timeline. These features can help match dubbed audio to visual pacing.
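The arithmetic behind matching dubbed audio to its original time slot is simple: divide the length of the dubbed clip by the length of the slot it has to fill. The Python sketch below shows that calculation. `speed_factor` is a hypothetical helper, and the 0.8x to 1.25x limits are assumed values chosen to keep speech sounding natural, not settings taken from any tool.

```python
def speed_factor(
    dub_duration: float,
    slot_duration: float,
    min_speed: float = 0.8,
    max_speed: float = 1.25,
) -> float:
    """Playback speed that makes a dubbed clip fit its original time slot.

    Durations are in seconds. Values above 1.0 speed the clip up,
    values below 1.0 slow it down. The result is clamped to the given range.
    """
    if dub_duration <= 0 or slot_duration <= 0:
        raise ValueError("durations must be positive")
    ideal = dub_duration / slot_duration
    return max(min_speed, min(max_speed, ideal))


# A 6-second translated line that must fit a 5-second slot plays at 1.2x.
needed = speed_factor(6.0, 5.0)
```

When the ideal speed falls outside the allowed range, the clamped value will not fully close the gap, so shortening the translated line is usually the better fix.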
Pronunciation Rules
Controls for how specific words are pronounced in dubbed audio, especially useful for names, acronyms, or terms that text-to-speech systems often mispronounce.

SRT / VTT
Subtitle file formats used to display captions alongside video content. SRT is widely compatible, while VTT includes support for styling and web-based playback. Kapwing supports both formats for exporting and editing subtitles.
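The two formats are close relatives. A VTT file starts with a `WEBVTT` header and writes milliseconds after a period, while SRT writes them after a comma. The Python sketch below converts a simple SRT file to VTT on that basis. `srt_to_vtt` is a hypothetical helper for illustration, and it handles only plain cues, not styling or positioning.

```python
import re

# Matches an SRT timestamp such as 00:00:01,000 and captures both halves.
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT subtitle text to WebVTT."""
    lines = []
    for line in srt_text.strip().splitlines():
        # Only timing lines contain "-->", so commas in dialogue are left alone.
        if "-->" in line:
            line = _SRT_TIME.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


sample_srt = (
    "1\n"
    "00:00:01,000 --> 00:00:03,500\n"
    "Hola, mundo.\n"
    "\n"
    "2\n"
    "00:00:04,000 --> 00:00:06,000\n"
    "Bienvenidos.\n"
)
vtt = srt_to_vtt(sample_srt)
```

The numeric cue indexes from the SRT file are kept, since VTT accepts them as optional cue identifiers.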

Stock Voice
Pre-generated AI voices available for immediate use. These voices vary in language, accent, and tone, and are ideal for quick dubbing or voice-over projects.
Subtitles
Text displayed on-screen that matches the spoken dialogue in a video. Subtitles can be automatically generated, translated, or manually adjusted to improve accessibility and viewer comprehension.
Target Language
The language selected for translated subtitles or dubbed audio. Choosing the correct target language ensures accurate translation, pronunciation, and localization.
Translate Tool
An editing feature that allows users to apply subtitle translation or AI dubbing to a video. This tool typically serves as the first step in the dubbing workflow.
Transcript
A written version of the spoken audio in a video. This text is used to generate subtitles, translated voice-overs, and dubbing content.
Transcript Review
A process that lets you review and manually edit the AI-generated transcript and translation before finalizing the dubbed audio. This ensures higher dubbing accuracy.
Translation Rules
Guidelines that adjust how the AI translates certain words or phrases, helping maintain tone or fix recurring issues across multiple projects.

TTS (Text-to-Speech)
Technology that converts written text into spoken audio using AI-generated voices. TTS allows creators to add dialogue in multiple languages without hiring voice actors.
Voice Clone
A feature that replicates a speaker’s voice using AI. It helps preserve tone and identity when dubbing content into another language.
How to Dub Without Any Experience
Accessing an AI dubbing platform is easier than you might think. Because dubbing is integrated directly into video editing tools, you can dub a video in the same place you edit it.
1. Import Your File for Translation
Upload a video or audio file to the Kapwing editor. Video files are recommended because they make the dubbing process easier to visualize. Dubbed videos include automatically translated audio and automatically synchronized subtitles in the target language, making it easier to evaluate the result.
Upload or link any video file to begin automatically dubbing.
If you're looking to dub a video that’s already published on a platform like YouTube, simply paste the video link into the media tab. There's no need to download and re-upload the file manually.
2. Select the Translate Tool
Once your media is uploaded, navigate to the Translate tab in the left-hand sidebar.
Here, you’ll see options to either dub your video or add translated subtitles. For this walkthrough, choose the Dub video option.

3. Configure Your Translation
On the next screen, you’ll find the dubbing settings. For a basic translation, the most important setting is the Translate to dropdown, where you’ll select the target language for your dubbed content.
This dropdown provides access to over 100 language options, including multiple dialects for better localization. This is especially important when creating personalized content for international audiences.

By default, Kapwing uses voice cloning to replicate the speaker’s original voice. You can leave this as-is for your first project, but the platform also provides advanced settings you can customize as needed.
Optional Advanced Settings
A few advanced settings give you more editing control and help produce the most accurate dub possible. Here’s what each one allows you to do:

- SRT/VTT File Upload: Upload an existing subtitle file for greater translation accuracy. If you don’t upload a file, a transcript will automatically generate based on the original audio.
- Voice Selection: Choose from over 180 voice models. You can filter by voice age, gender, and intended use (narration, news, ASMR, etc.) to find the best fit.
- Advanced Dubbing Settings: Access the Brand Glossary and enable automatic timing adjustments in your translated audio. These settings help improve accuracy and flow, especially when localizing content for regions with unique dialects or phrasing.
- Transcript Review: If no subtitle file is available, you can review the AI-generated transcript before dubbing begins. This helps catch errors early, particularly in videos with poor audio quality or multiple speakers.
Once you’re ready, click the Dub Video button at the bottom to begin the dubbing process.
4. Review Your Dubbed Video
As the dubbing runs, a progress bar will display the current stage of the process. Most short videos are processed in just a few minutes, although longer projects may take additional time.

When the dub is finished, your timeline will contain four tracks:

- Subtitles: Automatically generated subtitles in your target language, timed to match the dubbed dialogue.
- Original Video: The original video file, now with its audio removed and moved to its own track.
- Background Audio: Isolated music or sound effects from the original video, with the vocals removed to preserve ambient sound.
- Translated Audio: The AI-generated voice track in the target language.
Each of these layers can be edited or deleted individually. To remove a track, right-click and select Delete, or press Backspace after selecting the layer.
At this point, your video is dubbed and ready to be exported. However, there are still a few dub-related changes you can make if needed. These aren't required for most projects, but can be useful depending on your content goals.
5. (Optional) Adjust Video Subtitles
Head over to the Subtitles tab in the left-hand sidebar. You’ll see the full transcript with timestamps, allowing you to adjust the length or timing of any subtitle block.
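Adjusting timing in the editor is done by hand, but the underlying arithmetic is easy to see in code. The Python sketch below shifts one subtitle block earlier or later by a number of milliseconds. It assumes SRT-style timestamps (`HH:MM:SS,mmm`), and `parse_ts`, `format_ts`, and `shift_cue` are hypothetical helpers written for this example.

```python
import re
from typing import Tuple

_TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def parse_ts(timestamp: str) -> int:
    """Convert an SRT-style timestamp to total milliseconds."""
    match = _TIMESTAMP.fullmatch(timestamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def format_ts(total_ms: int) -> str:
    """Convert milliseconds back to an SRT-style timestamp, never below zero."""
    total_ms = max(0, total_ms)
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_cue(start: str, end: str, offset_ms: int) -> Tuple[str, str]:
    """Move a subtitle block later (positive offset) or earlier (negative offset)."""
    return (
        format_ts(parse_ts(start) + offset_ms),
        format_ts(parse_ts(end) + offset_ms),
    )


# Delay a block by half a second so it lines up with the dubbed dialogue.
new_start, new_end = shift_cue("00:00:59,800", "00:01:02,000", 500)
```

Working in whole milliseconds avoids rounding errors and makes rollover across minute and hour boundaries automatic.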
You can also use the Smart Tools dropdown in the top-right corner to access these additional customization features:
- Translate: Translate your subtitle text without affecting the dubbed audio. This is helpful when you want the on-screen text to remain in the original language, or display it in a different one for clarity or accessibility.
- Lip Sync: Apply an AI lip sync filter to make the speaker's lips align with the dubbed content. This is useful in professional applications like advertisements or testimonials, where a natural appearance is important to the success of the video.
- Custom Spelling: Manually update word spellings to reflect brand preferences or correct unusual phrasing.
- Auto Emojis: Automatically add emojis that match your subtitle text. This can help communicate tone or emotion across language barriers, especially for audiences who respond well to visual cues.