Dubbing Glossary — Terms Explained

AI dubbing can reduce dubbing costs by as much as 50%

Learn how to use AI dubbing — along with key terms to understand every step of the process.

In 2025, video translation and localization continue to grow. As AI has brought new players into the market, the industry-specific vocabulary has expanded too. Before evaluating software vendors and translation platforms, marketing teams can familiarize themselves with this vocabulary so that they know what questions to ask, how to describe their requirements, and what to look for in different solutions.

What is AI Dubbing?

AI language dubbing uses artificial intelligence to translate spoken audio and generate a synchronized voiceover in a different language. The result looks and sounds as if the original speaker is speaking the target language. This approach has helped content and media teams scale video production.

While the process is automated, AI dubbing can still feel unfamiliar to creators who are new to video translation or editing in general. In this article, we’ll explain what AI dubbing is, define the key terms to know, and show you how to get started, even if you have no prior experience.

Visual learner? Check out our video guide on how AI dubbing and lip syncing work.

How Does AI Dubbing Work?

In manual dubbing, a human translator listens to a video, transcribes the dialogue, and translates it into the target language. Once the script is translated, a voice actor is brought in to record the new dialogue. Like translators, most voice actors specialize in particular languages and tones, so it might take a full team to translate a single commercial or video.

AI dubbing follows a similar structure but automates the process using machine learning. AI models detect speech, transcribe the audio, translate it into a chosen target language, and generate a new voiceover using pre-trained voices or voice cloning. The dubbed audio is then synced to match the timing of the original video.
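The stages described above can be sketched as a small pipeline. This is an illustration only: the `transcribe`, `translate`, and `synthesize` callables are hypothetical stand-ins for real models, and the `Segment` and `DubbedSegment` types are invented for this example. It does not reflect how Kapwing or any other vendor implements dubbing.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    start: float  # seconds into the original video
    end: float
    text: str     # transcribed speech in the source language


@dataclass
class DubbedSegment:
    start: float
    end: float
    text: str     # translated text
    audio: bytes  # synthesized speech for this segment


def dub(
    media: bytes,
    target_language: str,
    transcribe: Callable[[bytes], List[Segment]],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
) -> List[DubbedSegment]:
    """Run the transcribe -> translate -> synthesize stages in order.

    All three callables are hypothetical placeholders. Each dubbed
    segment keeps the original start and end times so the new audio
    can be placed at the same point in the video.
    """
    dubbed = []
    for seg in transcribe(media):
        translated = translate(seg.text, target_language)
        audio = synthesize(translated, target_language)
        dubbed.append(DubbedSegment(seg.start, seg.end, translated, audio))
    return dubbed
```

Keeping the timestamps attached to every segment is what makes the final sync step possible: the generated audio for each line is dropped back in where the original line was spoken.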

Modern AI dubbing tools have made this technology more accessible than ever. With support for dozens of languages and hundreds of AI voices, creators can dub videos at scale with minimal effort.

Graphic showing available AI voices and languages for automatic video dubbing. — Try dubbing your own video for free using Kapwing.

AI Dubbing Glossary

Whether you're new to AI dubbing or just need a refresher, understanding the terminology helps you build confidence and edit more efficiently. Below is a breakdown of essential terms you'll come across when using AI dubbing, specifically when creating or repurposing video content.

Brand Glossary

A custom list of words, names, or phrases that should always be translated or pronounced in a specific way. This ensures brand consistency across all dubbed projects.

Custom Spelling

Allows you to define specific spelling preferences within Subtitles, such as American vs British English (e.g., “color” vs “colour”), or enforce consistent use of branded terms. Defined within the brand glossary, these terms set rules that enable dubbing software to improve the quality of the translation.
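As a rough sketch of how a spelling rule might be applied to subtitle text, the function below replaces whole words according to a small dictionary. The `apply_glossary` name and the dictionary format are assumptions made for illustration, not Kapwing's actual glossary format.

```python
import re
from typing import Dict


def apply_glossary(text: str, glossary: Dict[str, str]) -> str:
    """Replace each glossary term with its preferred form.

    Matching is case-insensitive and limited to whole words, so a rule
    for "color" does not touch "colorful". The preferred form is
    inserted exactly as written in the glossary.
    """
    if not glossary:
        return text
    # Longest terms first so multi-word entries win over shorter ones.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {term.lower(): preferred for term, preferred in glossary.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()], text)
```

For example, `apply_glossary("The color of kapwing is colorful", {"color": "colour", "kapwing": "Kapwing"})` rewrites the standalone words and leaves "colorful" alone.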

Find and Replace

A feature for quickly editing text or translated phrases in bulk, which is useful for refining subtitles or fixing consistent errors across a project.

Lip Sync

The process of aligning dubbed audio with the mouth movements of the speaker in a video. Accurate lip syncing makes AI-dubbed content look more natural and believable because the mouth moves according to the translated words. Some lip sync vendors include Texels, Sync labs, and Sieve.

Example of a dubbed video with lip syncing applied.

Localization

More than just translating text, localization involves adapting a video to a specific region or culture. This can include idioms, slang, date formats, and other cultural references to ensure the content feels native to the target audience. The word "localization" is sometimes abbreviated as l10n.

Playback Speed

The pacing of the text-to-speech track, which is sometimes adjusted in the dubbing output so that each line stays synchronized with the matching moment in the video. Occasionally this leads to timing issues in a dubbed video, and you may need a video editor like Kapwing to adjust the timing of the generated audio.
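The arithmetic behind a speed adjustment is simple: divide the length of the generated speech by the time available for it. The sketch below assumes a hypothetical `playback_rate` helper and an arbitrary cap of 1.25x; real tools may use different limits or a different strategy altogether.

```python
def playback_rate(speech_seconds: float, slot_seconds: float,
                  max_rate: float = 1.25) -> float:
    """Return the speed-up needed to fit speech into its time slot.

    A rate of 1.0 means no change. The rate is capped at max_rate
    (an assumed limit) because heavily sped-up speech sounds unnatural;
    a line that is still too long after the cap needs manual editing.
    """
    if speech_seconds <= 0 or slot_seconds <= 0:
        raise ValueError("durations must be positive")
    needed = speech_seconds / slot_seconds
    # Never slow speech down here; only speed up when it overruns.
    return min(max(needed, 1.0), max_rate)
```

A translated line that runs 3.3 seconds in a 3.0-second slot needs a rate of about 1.1, while a 6-second line in the same slot hits the cap and would still overrun.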

Pronunciation Rules

Controls for how specific words are spoken, especially useful for names, acronyms, or terms that text-to-speech systems often mispronounce. Defined in a brand glossary, these rules give a translation agency or software system the guidance it needs to produce a higher-quality dub.

SRT / VTT

Subtitle file formats are used to display captions alongside video content. SRT is widely compatible, while VTT includes support for styling and web-based playback. Kapwing supports both formats for exporting and editing subtitles.

Guide showing how to download subtitles as an SRT or VTT file. — Download your subtitles, dubbed or not, as an SRT, VTT, or TXT file.
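For simple files, the two formats are close enough that converting between them takes only a few lines. The sketch below covers the basic differences: WebVTT files begin with a `WEBVTT` header, and their timestamps use a period rather than a comma before the milliseconds. The `srt_to_vtt` name is invented for this example, and styling, positioning, and byte-order marks are not handled.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
_SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT subtitle text to WebVTT.

    Only timing lines are rewritten, so commas inside the dialogue
    are left untouched. SRT's numeric counters are kept, since WebVTT
    allows an optional identifier line before each cue.
    """
    body = srt_text.replace("\r\n", "\n").strip()
    body = _SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
    return "WEBVTT\n\n" + body + "\n"
```

Going the other way, from VTT to SRT, generally means dropping the header and any styling, which is one reason SRT is the safer choice when compatibility matters most.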

Stock Voice

Pre-generated AI voices are available for immediate use. These voices vary in language, accent, and tone, and are ideal for quick dubbing or voice-over projects.

Subtitles

Text displayed on-screen that matches the spoken dialogue in a video. Subtitles can be automatically generated, translated, or manually adjusted to improve accessibility and viewer comprehension.

Target Language

The language selected for translated subtitles or dubbed audio. Choosing the correct target language ensures accurate translation, pronunciation, and localization.

Translate Tool

An editing feature that allows users to apply subtitle translation or AI dubbing to a video. This tool typically serves as the first step in the dubbing workflow. The side-by-side view of the original and target language allows a reviewer to flag and correct mis-translations.

Transcript

A written version of the spoken audio in a video. This text is used to generate subtitles, translated voice-overs, and dubbing content.

Transcript Review

A process that lets you review and manually edit the AI-generated transcript and translation before finalizing the dubbed audio. This ensures higher dubbing accuracy.

Translation Rules

Guidelines that adjust how the AI translates certain words or phrases, helping maintain tone or fix recurring issues across multiple projects.

TTS (Text-to-Speech)

Technology that converts written text into spoken audio using AI-generated voices. TTS allows creators to add dialogue in multiple languages without hiring voice actors.

Voice Clone

A feature that replicates a speaker’s voice using AI. It helps preserve tone and identity when dubbing content into another language.

How to Dub Without Any Experience

Accessing an AI dubbing platform is easier than you might think. In fact, integration into video editing tools is what makes dubbing so useful.

1. Import Your File for Translation

Upload a video or audio file to the Kapwing editor. Video files are recommended because they make the result easier to evaluate: a dubbed video includes automatically translated audio along with synchronized subtitles in the target language.

Upload or link any video file to begin automatically dubbing.

If you're looking to dub a video that’s already published on a platform like YouTube, simply paste the video link into the media tab — no need to download and reupload the file manually.

2. Select the Translate Tool

Once your media is uploaded, navigate to the Translate tab in the left-hand sidebar.

Here, you’ll see options to either dub your video or add translated subtitles. For this walkthrough, choose the Dub video option.

Guide showing how to dub a video online for free. — Select the **Translate** tab from the left-hand sidebar to access video dubbing within the editor.

3. Configure Your Translation

On the next screen, you’ll find the dubbing settings. For a basic translation, the most important setting is the Translate to dropdown, where you’ll select the target language for your dubbed content.

This dropdown provides access to over 100 language options, including multiple dialects for better localization. This is especially important when creating personalized content for international audiences.

Guide showing how to begin dubbing a video. — Select a target language to begin dubbing your video.

By default, Kapwing uses voice cloning to replicate the speaker’s original voice. You can leave this as-is for your first project, but the platform also provides advanced settings you can customize as needed.

Optional Advanced Settings

For additional editing options and to ensure the most accurate dub possible, there are a few advanced options to be aware of. Here’s what each option allows you to do:

Guide showing how to use advanced dubbing options. — Use these advanced options to ensure accuracy and customize your video dub.

SRT/VTT File Upload: Upload an existing subtitle file for greater translation accuracy. If you don’t upload a file, a transcript will automatically generate based on the original audio.
Voice Selection: Choose from over 180 voice models. You can filter by voice age, gender, and intended use (narration, news, ASMR, etc.) to find the best fit.
Advanced Dubbing Settings: Access the Brand Glossary and enable automatic timing adjustments in your translated audio. These settings help improve accuracy and flow, especially when localizing content for regions with unique dialects or phrasing.
Transcript Review: If no subtitle file is available, you can review the AI-generated transcript before dubbing begins. This helps catch errors early, particularly in videos with poor audio quality or multiple speakers.

Once you’re ready, click the Dub Video button at the bottom to begin the dubbing process — it only takes a few moments to complete.

4. Review Your Dubbed Video

As the dubbing runs, a progress bar will display the current stage of the process. Most short videos are processed in just a few minutes, although longer projects may take additional time.

Progress bar showing the status of a video dub. — Check the status of your video dub while it is in progress.

When the dub is finished, your timeline will feature four new tracks.

Guide showing an AI dubbed video project timeline. — Once dubbed, your project will include 4 tracks in the timeline for easier editing and customization.

Subtitles: Automatically generated subtitles in your target language, timed to match the dubbed dialogue.
Original Video: The original video file, now with its audio removed and moved to its own track.
Background Audio: Isolated music or sound effects from the original video, with the vocals removed to preserve ambient sound.
Translated Audio: The AI-generated voice track in the target language.

Each of these layers can be edited or deleted individually. To remove a track, right-click and select Delete, or press Backspace after selecting the layer.

At this point, your video is dubbed and ready to be exported. However, there are still a few dub-related changes you can make if needed. These aren't required for most projects, but can be useful depending on your content goals.

5. (Optional) Adjust Video Subtitles

Head over to the Subtitles tab in the left-hand sidebar. You’ll see the full transcript with timestamps, allowing you to adjust the length or timing of any subtitle block.

You can also use the Smart Tools dropdown in the top-right corner to access additional customization features:

Guide showing how to use Smart tools after dubbing a video with AI. — Use these **Smart tools** to make further adjustments to your video dub for better message clarity.

Here is a rundown of these tools:

Translate: Translate your subtitle text without affecting the dubbed audio. This is helpful when you want the on-screen text to remain in the original language, or display it in a different one for clarity or accessibility.
Lip Sync: Apply an AI lip sync filter to make the speaker's lips align with the dubbed content. This is useful in professional applications like advertisements or testimonials, where a natural appearance is important to the success of the video.
Custom Spelling: Manually update word spellings to reflect brand preferences or correct unusual phrasing.
Auto Emojis: Automatically add emojis that match your subtitle text. This can help communicate tone or emotion across language barriers, especially for audiences who respond well to visual cues.
