How to Transcribe Interviews Automatically: Turn Your Audio & Video Content Into Text

Interview transcripts make your content more accessible and discoverable, and they offer ways to get more from every interview. Here's how to create interview transcripts automatically with best-in-class tools.

Interviews are the lifeblood of good marketing. Interviews help you understand customers and the market at large, and they're also an important part of building a reliable content engine, especially if you rely on expert opinions, customer proof, and content repurposing. And these days, who doesn't?

Running a great interview is hard enough, so we don't want to make transcribing the interview itself any harder than it needs to be. Fortunately, we're in the age of AI-powered transcription software, and it has never been easier to get a highly accurate transcript in minutes. You just need the right tools.

Below, we'll share a step-by-step tutorial for automatically generating and revising an interview transcript, along with other more manual approaches, best practices for transcripts, and all the ways transcripts can be used to power your marketing. Let's start with defining what we mean and getting a transcript made quickly.

What's an interview transcript?

An interview transcript is a written record of a conversation where the dialogue from an audio or video recording is converted into text. These transcripts are often verbatim and include all of the questions and responses in a conversation, but they're sometimes lightly edited or summarized.

Interview transcripts are used in many different ways but are most commonly used for:

  • Accessibility. Transcripts are especially valuable to users with hearing or other disabilities, or for users who simply prefer to get the information via text.
  • Discoverability. A video’s metadata helps Google index, classify, and surface it to searchers. Transcripts and subtitles/captions add lots of metadata to videos that would otherwise not exist.
  • Repurposing. Want to turn your interview into a blog post, email, or other text content? Getting more value from interviews starts with getting the transcript.
  • Localization. Translating audio and videos also begins with a detailed transcript that can then be localized to the target region via subtitles, voice overs, or dubbing.
  • Reference. Transcripts create a detailed record of the conversation and are typically easier to store and archive than long video or audio interviews; they're also easier to search.

How to transcribe interviews automatically

Though we'll cover other, more manual ways to transcribe an interview, let's start with the fastest and easiest way: using an AI-powered tool to automatically create a word-by-word transcript. Here's how to transcribe an interview using this approach:

1. Play back your interview

The first step is actually to play back your interview and check the recording quality. If there's an excessive amount of background noise, or if the audio quality could simply be clearer, try a tool like Kapwing's Clean Audio or Adobe's speech enhancer before you begin transcribing.

Doing audio cleanup now will save valuable time as you begin to transcribe your interview because you'll have fewer manual edits to make—the clearer your audio, the better AI-powered transcribers are at returning accurate transcripts.

📚 Learn more: How to Remove Silences from Podcast Audio with AI

2. Upload your interview file

Head over to Kapwing's Transcription tool and click "Start transcribing." You'll then be taken into Kapwing, where you can upload your interview file, whether it's audio or video.

During this tutorial, I'm going to transcribe an interview we recently hosted with John Collins, a content marketing consultant.

Select the "Transcript" tab from the left sidebar in Kapwing, then click on "Trim with Transcript." You'll then be able to select your language—the language the speakers are talking in—and from there you can click on the big "Generate transcript" button. Your word-by-word transcript will start being created.

What's useful about Kapwing's transcript tool is that you can edit your transcript like a text document to edit your audio or video interview. Just remove a section in the transcript and that portion of the interview will be removed as well.

Make any edits you need, try advanced tools like Remove Filler Words, or apply more hands-on edits via Kapwing's traditional timeline editor before moving on to the next step.

3. Export your transcript

Now it's time to export your transcript and download it to any device. When you're ready, select the "Download transcript" option just above the transcript editor. Your transcript will download as a plain .txt file which can be used almost anywhere.

If you used your transcript along with our Subtitle Generator, you'll also be able to download subtitle files in .srt and .vtt formats.
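
If you're curious how a subtitle file differs from a plain .txt transcript, here's a minimal Python sketch of the .srt layout: numbered cues, each with a start and end timestamp followed by the spoken text. The segment timings, the sample lines, and the helper names (srt_timestamp, to_srt) are made up for illustration. This isn't how Kapwing builds its exports, just a picture of the format.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm stamp used in .srt files."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Turn (start_seconds, end_seconds, text) tuples into .srt cue blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical segments; a real transcript would supply its own timings.
example_segments = [
    (0.0, 2.5, "Thanks for joining us today."),
    (2.5, 6.0, "Happy to be here."),
]
print(to_srt(example_segments))
```

A plain .txt transcript keeps only the text, which is why the timed formats are the ones to reach for when you need captions.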

4. Proofread and publish

We've mentioned that although auto-generated transcripts are highly accurate, it's worth planning on at least one round of revisions to avoid any obvious mistakes and ensure your transcript is as accurate as possible. Depending on your industry or the context of the transcript, you may require additional reviews, too.

At the bare minimum, we recommend considering the following steps as you proofread your transcript and prepare it for publishing:

  • Let an editor proofread your transcript. The editor can just be someone else on your team. Fresh eyes on a transcript will often help catch obvious misspellings or mistakes that will stick out to anyone reading your transcript.
  • Review your transcript with AI. No time for review? Then head over to Grammarly or ProWritingAid for a quick AI-powered grammar check. These tools may not catch misuse of industry terms or other niche-specific mistakes, but they'll help you clean up the basics.
  • Add Custom Spelling for future interviews. "Ca-poing?" It's Kapwing! Transcription tools are known for misinterpreting brand names and terms, so we've built a Custom Spelling tool that will automatically catch and fix common misspellings you notice in your transcripts.
Custom Spelling in Kapwing lets you automatically flag and fix words that are commonly misspelled in transcripts.
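
If you're cleaning up a .txt transcript outside of Kapwing, the same idea can be approximated with a small find-and-replace pass. The Python sketch below is an assumption-laden illustration, not Kapwing's Custom Spelling implementation: the function name and the corrections dictionary are hypothetical, and it simply swaps known misspellings for the right term.

```python
import re


def apply_custom_spelling(transcript: str, corrections: dict) -> str:
    """Replace known misspellings (case-insensitive, whole words) with the correct term."""
    for wrong, right in corrections.items():
        pattern = re.compile(rf"\b{re.escape(wrong)}\b", flags=re.IGNORECASE)
        # A function replacement keeps any backslashes in `right` literal.
        transcript = pattern.sub(lambda _match, fixed=right: fixed, transcript)
    return transcript


# Hypothetical corrections list; build yours from mistakes you actually see.
corrections = {"Ca-poing": "Kapwing"}
print(apply_custom_spelling("We edit every video in ca-poing.", corrections))
```

Keeping the corrections in one place means each new interview benefits from the misspellings you caught in the last one.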

Other ways to transcribe interviews

The reason we recommend using automatic transcription software is that these tools have become much more accurate over recent years. Many popular providers of human-written transcription services have introduced AI-based transcription as part of their offering for this reason.

AI-powered transcription plus a human review and quick edit is the way to go for most creatives. But if you want to keep all of your options open, here are the two other ways to transcribe an interview:

1. Hire a transcription service

Human-written transcription services still exist, but they're going to cost you a pretty penny in comparison to software. In terms of the typical expense, Rev, a leading provider of human-written transcripts, starts pricing at $1.50 USD per minute of audio/video. That's a palatable $18 for a 12-minute YouTube video, so long as you don't add extras, but a tougher-to-justify $105 for a 70-minute webinar.
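
To budget for your own recordings, the per-minute math above is easy to reproduce. This short Python sketch assumes the $1.50 starting rate quoted here and ignores any extras or rush fees; the function name is ours, and real quotes from a provider may differ.

```python
RATE_CENTS_PER_MINUTE = 150  # $1.50 per minute, the starting rate quoted above


def transcription_cost(minutes: int, rate_cents: int = RATE_CENTS_PER_MINUTE) -> float:
    """Estimated cost in USD for a recording of the given length, before extras."""
    return minutes * rate_cents / 100


for length in (12, 70):
    print(f"{length}-minute recording: ${transcription_cost(length):.2f}")
```

Working in cents keeps the arithmetic exact, which matters more once you start totaling a backlog of interviews.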

So, what do you get in return for the cost? Most service providers we saw promised "98-99% accuracy" for human-written transcriptions and, from experience, are usually able to deliver. Specialty services can also accommodate transcription work that needs to meet requirements such as HIPAA compliance, or that must be structured in a specific or unique way, such as for legal services.

Human transcriptions are also useful for extraordinarily long interviews with multiple speakers. If that's the case for your interview, you should also know that there are usually a few levels of service that you can request from transcription providers:

  • Verbatim transcripts. Every word, sound, and expression ("uh, um") will be transcribed without edits. Useful if you need everything in an interview or prefer to make editorial decisions yourself.
  • Intelligent transcripts. Verbatim transcripts minus filler words and sections where a speaker stumbles over their words or repeats themselves.
  • Edited transcripts. Transcripts where editorial judgment is applied, including the removal of sections that are superfluous or simply not valuable to your audience.
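
To make the difference between the first two levels concrete, here's a rough Python sketch that turns a verbatim line into something closer to an "intelligent" one by stripping filler words. It's a simplified illustration under our own assumptions: the filler list and function name are hypothetical, and it only handles fillers, not stumbles or repeated phrases, which still need a human eye.

```python
import re

FILLERS = ("uh", "um")  # assumed filler list; extend it to match your speakers


def strip_fillers(verbatim: str, fillers=FILLERS) -> str:
    """Remove standalone filler words and the commas attached to them."""
    alternatives = "|".join(re.escape(word) for word in fillers)
    # Match an optional leading comma, the filler as a whole word,
    # and an optional trailing comma, then swap the lot for one space.
    pattern = re.compile(rf",?\s*\b(?:{alternatives})\b,?\s*", flags=re.IGNORECASE)
    cleaned = pattern.sub(" ", verbatim)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(strip_fillers("Um, so we, uh, launched in March."))
```

Words that merely contain a filler, like "umbrella" or "drum," are left alone because the pattern only matches whole words. Capitalization at the start of a cleaned sentence is left for the editing pass.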

Depending on which of these options (and more) you've selected, turnaround times can vary widely but typically range from 6-48 hours. Note that any human-written transcript that is word-by-word and guarantees 99% accuracy is going to take at least a few hours. Popular providers for these services include Rev, GoTranscript, TranscribeMe!, and Scribie.

2. Manually transcribe your interview

If you can avoid it, we strongly recommend not taking this approach. But, if you're resource-strapped or must manually transcribe your interview for other industry or company-specific reasons, let's look at the steps to take.

  1. Prepare to transcribe. All this really means is to grab a pair of headphones, sit in a (reasonably) quiet room, open a text editor, and then load up your audio or video interview.
  2. Listen from start to finish. If time allows, listen to the entire interview front to back to understand the full context before you begin transcribing. If there isn't time, skip to the next step.
  3. Break the interview into sections. Separate the audio into manageable segments that you can listen to in sequential order. You don't have to do this inside of your editing tool or word document; you can just pick stopping points, such as 15-minute segments for a 60-minute interview.
  4. Slow down the playback. If needed, use software or your playback device to slow down the audio without altering its pitch for clarity. This can often be faster than having to re-listen to segments over and over.
  5. Transcribe a verbatim OR intelligent draft. You can either transcribe everything that you hear including filler words and false starts or make small editorial decisions as you go such as removing "uhs" and "ums" in your first pass.
  6. Review and edit. Read through the entire transcript to correct obvious errors and mispronunciations that turned into misspellings, along with any other factual or grammatical errors.
  7. Proofread and publish. Listen to the recording again while reading the transcript to ensure accuracy. Double-check for spelling, grammar, and punctuation. Then, ship it!

Benefits of creating an interview transcript

As fast as it may be to transcribe an interview with software, there's still work involved in exporting, reviewing, and formatting a transcript so that it's ready to go live.

Since it takes time, there needs to be upside—and we think there's plenty. Here are just a few of the benefits of converting your audio to text, for interviews and more.

1. Caption your audio/video interview

Transcripts are a frequent first step toward getting word-by-word video subtitles, which are a must-have for videos published online. The number of people who use subtitles is surprisingly high and the preference to watch subtitled videos with sound off is only growing among younger viewers.

Captioning your interview is obviously useful for video podcasts or other video formats, but interview transcripts can also be used to create the popular podcast audiograms that are so frequently found on social media. With subtitles & captions or audiograms, interviews can help fill your social media feed with useful and engaging multimedia content.

2. Repurpose interviews into written content

Well-constructed interviews with an expert guest or interesting customer can be the source for multiple new blog articles, emails, or social media posts. Taking just one interview and generating a podcast transcript could, all by itself, lead to 2-3 new search-driven blog posts—all built from the advice and insight from a single conversation.

Interviews are particularly valuable in industries that are complex or especially technical, such as developer products or cybersecurity. For these topics, you frequently need to interview subject matter experts to help answer a challenge your customers have; interviews are the bedrock of your content strategy. And with interview transcripts, you can get lots of mileage from these conversations for all forms of marketing content.

📚 Learn more: How to Transcribe a Podcast and Turn It Into a Blog Post

3. Improve the SEO for your audio/video assets

Audio and video files, just like web pages, rely on metadata to be crawled and surfaced by search engines like Google. Transcripts provide lots of relevant text-based content for audio or video interviews that search engines are efficient at crawling, which makes it easier to understand and index the content.

And that's just for the audio/video asset itself; as covered above, transcripts can also be used to create companion blog posts or other text-based web pages that can rank on Google web search with the multimedia content embedded. For example, you could transcribe an interview you hosted on YouTube and turn the transcription into a blog post while also embedding the interview inside the blog post. If the blog article then ranks in search, that also helps with visibility for your interview.

4. Turn transcripts into testimonials & case studies

Customer success is abundant in any good product, but customer proof is contingent on getting those perfect quotes from happy customers during a spur-of-the-moment chat. Transcripts are critical for capturing and surfacing these hidden gems that might otherwise disappear after the call ends.

A single great testimonial can go a long way. For more in-depth customer proof points, like case studies, you definitely need the entire transcript to create a clear Problem → Solution → Outcome story that helps other users see how your product can make them successful.

5. Edit your transcript to edit your video

Tools like Trim with Transcript allow you to automatically generate a transcript and then edit the text to edit your video. This means you can edit a rough cut just by removing words, sentences, and sections from the transcript.

Text-based editing is one of the fastest ways to rough cut a video because it's so straightforward—it's also one of the easiest ways to edit a video or audio file for marketers or anyone who isn't that familiar with timeline-based editors.

📚 Learn more: How to Quickly Transcribe Audio to Text with AI Tools

Effective examples of interview transcripts

So, who's going the extra mile with interview transcripts? To help answer that question, we collected a pair of transcripts that go much further than "plain text pasted to a blog." Use these as your inspiration.

Shopify Masters podcast transcript

What's in this interview transcript? This transcript example comes from Shopify Masters, a podcast hosted by Shopify on running a successful ecommerce business. This particular transcript is based on an interview with David Gaylord, who is the CEO of a direct-to-consumer company called Bushbalm.

Why we like this interview transcript: Shopify's podcast is a good example of what I'd call a "Transcript Plus" for marketing content. The published story is based on a transcript, to be sure, but it's also highly edited and includes original photography, inline images, pull quotes, and more. The transcript was ultimately a foundation for a really good piece of editorial content, alongside the original interview.

Inside Intercom interview transcript

What's in this interview transcript? This transcript example comes from a roundtable discussion on Inside Intercom, an interview-style show produced by the customer support platform Intercom. In this particular interview, the Intercom team speaks to journalists, lecturers, and entertainers about the history of telephone support.

Why we like this interview transcript: This interview juggles multiple speakers quite well and is formatted in a clean and easy-to-navigate structure. Small editorial details like pull quotes and subheadings also make this transcript much easier to browse. Lastly, the conversations are organized in a coherent way so that ideas build on top of each other as new interviewees are introduced.

FAQs on Interview Transcripts

What is the easiest way to transcribe an interview?

Generating a word-by-word transcript with AI-powered transcription software is the fastest and easiest way to transcribe an interview. AI-generated transcripts are now highly accurate and require minimal editing. Tools like Kapwing, Rev, and Happy Scribe all offer software to automatically transcribe audio or video interviews.

What does an interview transcript look like?

Interview transcripts are typically plain text records of a conversation between two or more people. The text can either be unformatted entirely, or it can be lightly formatted and edited with speaker notes, slight revisions, or even inline assets, for example when the transcript is repurposed into a blog article.

Is there an app that transcribes interviews?

Yes, Kapwing features a transcript generator that converts any audio or video file into text. Transcripts are available in minutes and can be downloaded as a .txt file, turned into subtitles, or even used to edit your video itself with Kapwing's text-based video editor.
