Convert DeepGram to autoEdit transcript format
There isn’t a way to add custom STT adapters to autoEdit, but you can modify transcripts.json directly.
Here’s a fragment that generates an autoEdit-compatible transcript from a DeepGram transcript using jq.
You will need to modify this gist to complete it for your own setup, but the command below pulls out the data you need, and you can fill in the rest with variables.
Just remember to back up your current transcripts.json file before you modify or replace it.
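The backup step can be sketched as a small shell function. This is only an illustration: backup_file is a hypothetical helper name, and the path in the usage line is a placeholder for wherever your install keeps its db folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of autoEdit): copy a file to a
# timestamped sibling and print the path of the copy.
backup_file() {
  local src="$1"
  # Stop early if the source file does not exist.
  [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
  local dest
  dest="${src}.$(date +%Y%m%d-%H%M%S).bak"
  # -p keeps the original timestamps and permissions on the copy.
  cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}
```

Usage would look like backup_file "/path/to/digital-paper-edit-electron/db/transcripts.json", run before any command that rewrites the file.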
# WARNING: This is not a complete script.
# You could update the hardcoded values, but practically, it will need to
# be modified to programmatically process your source files and append the
# result to the autoEdit transcripts.json array.
# This is the internal autoEdit project ID.
# Once you've created the project in autoEdit you can get the ID from
# /path/to/digital-paper-edit-electron/db/projects.json
# on macOS this is /Users/username/Library/Application Support/digital-paper-edit-electron
project_id="1234567890"
title="Title"
description="Description"
# This can be anywhere you keep your source files
src_video="/path/to/original/video.mp4"
# autoEdit generates these files when you manually add a new piece of media.
# You can process these to be smaller to save space.
# I've had these point to the source files and it has worked fine in playback.
ae_audio="/path/to/digital-paper-edit-electron/media/audio.wav"
ae_video="/path/to/digital-paper-edit-electron/media/video.mp4"
# File name of the source video without its directory
vbase=$(basename -- "$src_video")
# Build the autoEdit transcript object from the Deepgram response.
jq -r --arg project_id "$project_id" --arg title "$title" --arg description "$description" \
  --arg ae_audio "$ae_audio" --arg src_video "$src_video" --arg ae_video "$ae_video" --arg vbase "$vbase" '
  (.results.channels[0].alternatives[0].words | to_entries
    | map({id: .key, text: .value.punctuated_word, start: .value.start, end: .value.end})) as $w
  | (.results.channels[0].alternatives[0].paragraphs.paragraphs | to_entries
    | map({id: .key, start: .value.start, end: .value.end, speaker: ("SPEAKER_" + (.value.speaker|tostring))})) as $p
  | {
      projectId: $project_id, title: $title, description: $description,
      path: $src_video, url: $ae_audio, status: "done",
      "_id": .metadata.request_id, id: .metadata.request_id,
      metadata: {
        filePathName: $src_video, filename: $vbase, date: "NA", reelName: "NA", timecode: "NA",
        "r_frame_rate": "0/0", fps: "NA", duration: .metadata.duration, sampleRate: 16000
      },
      videoUrl: $ae_video,
      transcript: { words: $w, paragraphs: $p },
      clipName: $title, sttEngine: "Deepgram", audioUrl: $ae_audio
    }' deepgram.json
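The script's opening comment notes that the result still has to be appended to the autoEdit transcripts.json array. A minimal sketch of that step follows, assuming the file holds a single top-level JSON array (as that comment suggests) and that the jq output above was saved to a file. append_transcript and both file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one transcript object to a JSON array file.
# Assumes "$1" contains a top-level JSON array and "$2" a single JSON object.
append_transcript() {
  local db="$1" new="$2" tmp
  tmp=$(mktemp) || return 1
  # --slurpfile binds $t to an array of the JSON values found in "$new",
  # so `. + $t` concatenates the new object onto the existing array.
  if jq --slurpfile t "$new" '. + $t' "$db" > "$tmp"; then
    # Replace the db file only after jq has succeeded.
    mv -- "$tmp" "$db"
  else
    rm -f -- "$tmp"
    return 1
  fi
}
```

For example, redirect the jq command above to new-transcript.json, then run append_transcript "/path/to/digital-paper-edit-electron/db/transcripts.json" new-transcript.json. If the db file is not an array, jq exits with an error and the file is left untouched.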
Published November 2, 2022