Interactive Transcripts with YouTube & Descript

This article was originally shared on Substack.

A few weeks ago I added full transcripts to the STEAM Powered website (the 11 most recent episodes so far, from Shamini Bundell onwards). For these episodes, both the show notes and the transcripts are interactive and connected to the embedded YouTube video, making the video episodes fully navigable for optimum viewing and listening pleasure.

If you're interested, here's the video I shared on LinkedIn announcing this functionality:

And here's how I did it.

Requirements

  • Descript - To generate the timecoded transcripts. If you have subtitle files (.srt), you could do something similar with those instead, but I wanted to keep my formatting with markers and paragraphing for legibility.
  • A website running React - I'm using Next.js, for reference.
  • react-youtube - A React wrapper for the YouTube IFrame Player API.

If you don't use React, you'll have to do this the old-fashioned way with the player API.

Substack doesn’t support embedding individual files in gists, so you can find the gist with the code referenced in this article here.

Also, none of the code includes styling, so you can finesse to your heart's content.

Step 1. Timecoded transcripts

I use Descript to edit my recorded videos in text: cleaning up unwanted sounds, utterances, and cross-talk, and shortening silences to create a more natural flow to the conversation for the listener. [1] While I'm in there, I'll add in topic markers and correct transcription errors.

Once that's done, I use Descript's transcript export feature to create a markdown file with markers, speaker labels, and liberally applied timecodes: one every 3 seconds (approximately how long a subtitle stays on screen), as well as at all major points in the transcript.

[Screenshot: Descript transcript export settings]

Step 2. Parse the transcript

This is the horrible part: parsing markdown with timestamps into something programmatically iterable. I'm sorry.

You can invoke the parseTranscript function with the transcript markdown outside of Next.js and store the parsed result for later, or you can do this as part of the build process.
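To give a feel for what that parsing involves, here is a minimal sketch rather than the gist's actual code. It assumes the export looks like markdown headings for markers, bold "**Name:**" speaker labels, and "[hh:mm:ss]" timecodes inline; your export may differ depending on the settings you choose. The item shapes (kind, speaker, start, paragraph) and the sample dialogue are made up for illustration.

```typescript
// Hypothetical item shapes; the gist's real parser may structure these differently.
type HeadingItem = { kind: "heading"; text: string; start: number };
type SegmentItem = { kind: "segment"; speaker: string; text: string; start: number; paragraph: number };
type TranscriptItem = HeadingItem | SegmentItem;

// Matches "[mm:ss]" or "[hh:mm:ss]"; the capture group keeps the timecode when splitting.
const TIMECODE = /\[(\d{1,2}(?::\d{2}){1,2})\]/;

// Convert "hh:mm:ss" or "mm:ss" to whole seconds.
function timecodeToSeconds(timecode: string): number {
  return timecode.split(":").reduce((total, part) => total * 60 + Number(part), 0);
}

function parseTranscript(markdown: string): TranscriptItem[] {
  const items: TranscriptItem[] = [];
  let pendingHeadings: HeadingItem[] = []; // markers waiting for the next timecode
  let speaker = "";
  let paragraph = -1;
  let start = 0;

  for (const rawLine of markdown.split(/\r?\n/)) {
    let line = rawLine.trim();
    if (line === "") continue;

    // Assumption: markers are exported as markdown headings.
    const heading = /^#+\s+(.+)$/.exec(line);
    if (heading) {
      const item: HeadingItem = { kind: "heading", text: heading[1] ?? "", start };
      pendingHeadings.push(item);
      items.push(item);
      continue;
    }

    // Assumption: speaker labels look like "**Name:** text".
    // Paragraphs without a label keep the previous speaker.
    const label = /^\*\*(.+?):\*\*\s*(.*)$/.exec(line);
    if (label) {
      speaker = label[1] ?? "";
      line = label[2] ?? "";
    }
    paragraph += 1;

    // split() with a capture group yields [text, timecode, text, timecode, text, ...].
    const parts = line.split(TIMECODE);
    for (let i = 0; i < parts.length; i += 2) {
      if (i > 0) {
        start = timecodeToSeconds(parts[i - 1] ?? "0");
        // A marker takes the time of the first timecode that follows it.
        for (const pending of pendingHeadings) pending.start = start;
        pendingHeadings = [];
      }
      const text = (parts[i] ?? "").trim();
      if (text !== "") items.push({ kind: "segment", speaker, text, start, paragraph });
    }
  }
  return items;
}

// Made-up sample in the assumed export format.
const sampleMarkdown = [
  "## Introduction",
  "",
  "**Host:** [00:00:03] Welcome to the show. [00:00:06] Today we have a guest.",
  "",
  "## Early career",
  "",
  "**Guest:** [00:01:05] Thanks for having me.",
].join("\n");

const parsedItems = parseTranscript(sampleMarkdown);
```

Since the result is plain data, it can be serialised to JSON and stored alongside the episode, or generated at build time.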

This is where SRT files would have been much easier: their timestamps are granular to the millisecond, whereas this method only resolves to the second. There are also many SRT parsers about, so you don't have to roll your own. But then, I've done the hard work for you, so…
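If you do go the SRT route, the timing side is simple. Here is a small sketch of converting SRT cue timings (hh:mm:ss,mmm) into fractional seconds suitable for seeking. The function names are hypothetical, and a real SRT parser library would handle the rest of the file format for you.

```typescript
// Convert an SRT timestamp such as "00:01:02,500" to fractional seconds.
function srtTimeToSeconds(timestamp: string): number {
  const match = /^(\d{2,}):(\d{2}):(\d{2})[,.](\d{3})$/.exec(timestamp.trim());
  if (!match) throw new Error(`Not an SRT timestamp: ${timestamp}`);
  const [, hours, minutes, seconds, millis] = match;
  return Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds) + Number(millis) / 1000;
}

// Parse a cue timing line such as "00:00:01,000 --> 00:00:04,250".
function parseSrtTiming(line: string): { start: number; end: number } {
  const [from, to] = line.split("-->");
  if (from === undefined || to === undefined) {
    throw new Error(`Not an SRT timing line: ${line}`);
  }
  return { start: srtTimeToSeconds(from), end: srtTimeToSeconds(to) };
}
```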

Step 3. Set up the YouTube player

Next, set up the YouTube player in React, along with a function that controls seeking and can be passed to your notes or transcript components. I've included a ShowNotes component and a Transcript component, which are what I've got on the site, but you can make whatever you want navigable.
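The seek function itself can be tiny. Below is a sketch, not the gist's code, that keeps it separate from React so it is easy to reason about. The helper name createSeekTo and the SeekablePlayer type are assumptions for illustration; seekTo(seconds, allowSeekAhead) and playVideo() are methods of the YouTube IFrame player, which react-youtube hands you as event.target in onReady. The component props in the trailing comment are hypothetical.

```typescript
// The subset of the YouTube IFrame player used here.
type SeekablePlayer = {
  seekTo(seconds: number, allowSeekAhead: boolean): void;
  playVideo(): void;
};

// Returns a seekTo(seconds) callback that is safe to call before the player is ready.
function createSeekTo(getPlayer: () => SeekablePlayer | null): (seconds: number) => void {
  return (seconds) => {
    const player = getPlayer();
    if (!player) return; // player not ready yet, so ignore the click
    player.seekTo(Math.max(0, seconds), true); // true: allow seeking beyond what is buffered
    player.playVideo();
  };
}

// Possible wiring with react-youtube (illustrative only, not executed here):
//   const playerRef = useRef<SeekablePlayer | null>(null);
//   const seekTo = createSeekTo(() => playerRef.current);
//   <YouTube videoId={videoId} onReady={(event) => { playerRef.current = event.target; }} />
//   <ShowNotes items={items} seekTo={seekTo} />
//   <Transcript items={items} seekTo={seekTo} />
```

Passing a getter rather than the player itself means the same callback keeps working if the player is recreated.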

Step 4. Interactive show notes

The show notes are the markers I mentioned earlier. They are exported as part of the transcript and are identified as heading items in the parsed transcript from Step 2.

This provides a topic summary which is lovely for SEO, and gives people an idea of where to find the bits they care about. I've always had this, but now it's helpfully clickable.

[Screenshot: Show Notes]

The code effectively extracts the heading items from the parsed transcript array, displays them, and calls the seekTo function on click, which moves the YouTube playhead.
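As a rough sketch of that extraction, assuming each parsed item carries a kind, its text, and a start time in seconds (hypothetical field names, not necessarily what the gist uses):

```typescript
// Hypothetical shapes for illustration.
type ParsedItem = { kind: "heading" | "segment"; text: string; start: number };
type ShowNote = { text: string; start: number; label: string };

// Format seconds as "m:ss", or "h:mm:ss" once past the hour.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const pad = (value: number) => (value < 10 ? "0" : "") + value;
  return hours > 0
    ? hours + ":" + pad(minutes) + ":" + pad(seconds)
    : minutes + ":" + pad(seconds);
}

// Keep only the heading items and attach a display label for each.
function getShowNotes(items: ParsedItem[]): ShowNote[] {
  return items
    .filter((item) => item.kind === "heading")
    .map((item) => ({ text: item.text, start: item.start, label: formatTimestamp(item.start) }));
}

// Rendering is then a list of buttons (illustrative only, not executed here):
//   {getShowNotes(items).map((note) => (
//     <button key={note.start} onClick={() => seekTo(note.start)}>
//       {note.label} {note.text}
//     </button>
//   ))}
```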

Step 5. Interactive transcript

Second verse, same as the first.

This renders the lot, formatted with headings and paragraphs as well as the current speaker name, which is easily done because the parser breaks it all down for just this purpose.
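Highlighting the active segment comes down to finding the last segment that starts at or before the current playback time. Here is one way to sketch it, assuming segments are in playback order with a start time in seconds (the TimedSegment type and the function name are hypothetical). The player's getCurrentTime() method is part of the YouTube IFrame API, and polling it on a timer is one common approach.

```typescript
// Hypothetical shape for illustration.
type TimedSegment = { text: string; start: number };

// Returns the index of the segment being spoken at currentTime,
// or -1 if playback has not reached the first segment yet.
// Assumes segments are sorted by ascending start time.
function findActiveIndex(segments: TimedSegment[], currentTime: number): number {
  let active = -1;
  for (const segment of segments) {
    if (segment.start > currentTime) break;
    active += 1;
  }
  return active;
}

// Possible wiring in a component (illustrative only, not executed here):
//   useEffect(() => {
//     const timer = setInterval(() => {
//       const player = playerRef.current;
//       if (player) setActiveIndex(findActiveIndex(segments, player.getCurrentTime()));
//     }, 250);
//     return () => clearInterval(timer);
//   }, [segments]);
```

Each rendered segment can then compare its own index with the active one to decide whether to apply a highlight style, and call seekTo with its start time on click.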

[Screenshot: Transcript with active segment]

Conclusion

And there you have it. Interactive transcripts with YouTube using Descript.

If you end up using this for your own stuff, I'd love to see it. Let me know by leaving a comment here or on the gist.

I also won't object to appreciation being shown by follows, subscribes, or patronage through the STEAM Powered Patreon and this Substack.

Stay curious,

— Michele


Footnotes

  1. Aurally, this sounds fine as a podcast, but in video, it creates jump cuts that I find rather jarring, and truth be told, makes me very sad. But this is a trade-off I have had to accept to speed up editing and keep the audio tight(er). Before this, I spent far too much time trying to create smooth transitions using DaVinci Resolve by trying to mask the cuts between breaths because I didn't have B-roll to work with. This feels like I'm trying to justify my workflow to you, but I'm just justifying the compromise to myself.

Published December 28, 2023