This tutorial is written with macOS in mind. The same workflow should be possible on Windows, but the exact steps will differ.
If you don’t already have Homebrew (a package manager for installing open-source software on macOS), follow these instructions to install it.
Install yt-dlp and ffmpeg by opening a Terminal window and running: brew install yt-dlp ffmpeg
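To confirm that the installation worked before continuing, a small check along these lines may help. It is only a sketch: the function name missing_tools is made up for this example, and it relies on nothing beyond the shell's built-in `command -v`.

```shell
# Hypothetical helper: print the name of every given command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Prints nothing if both tools are installed.
missing_tools yt-dlp ffmpeg
```

If a name is printed, re-run the brew install command for that tool and open a new Terminal window.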
Create a plain text file (e.g. in VS Code) that lists the URLs of your YouTube videos, one per line. E.g.:
https://www.youtube.com/watch?v=asdf1
https://www.youtube.com/watch?v=asdf2
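Alternatively, generate the list for an entire channel (each line will contain a video's URL followed by its title) by running: yt-dlp --flat-playlist --print "%(url)s %(title)s" "CHANNEL_URL_GOES_HERE" >~/Downloads/videos.txt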
Save the plain text file in your Downloads directory using the filename videos.txt.
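A line in videos.txt may carry extra text after the URL, for instance a title when the list came from the channel command. If you would rather keep exactly one bare URL per line, the following sketch reduces each line to its first field. The function name urls_only and the output filename videos.clean.txt are hypothetical and not part of the original workflow.

```shell
# Hypothetical helper: print only the first whitespace-separated field (the URL)
# of each non-empty line, skipping lines that start with '#'.
urls_only() {
  awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# Example usage (assumed filenames):
# urls_only ~/Downloads/videos.txt > ~/Downloads/videos.clean.txt
```

Writing to a second file rather than overwriting videos.txt keeps the original list, including titles, available for reference.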
In the Terminal window, run the following command: cd ~/Downloads && yt-dlp -a videos.txt -S +size,+br --extract-audio --audio-format wav --postprocessor-args "ffmpeg:-ar 16000"
This should download the videos one after another and convert each to a WAV audio file in your Downloads directory. WAV audio is uncompressed, so the files can take up a lot of disk space.
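To get a rough idea of the space needed, the size of uncompressed audio can be estimated as duration × channels × sample rate × bytes per sample. The sketch below assumes 16-bit samples (2 bytes each, ffmpeg's default for WAV) and the 16000 Hz rate from the command above. The channel count depends on the source video, so it is a parameter. The function name wav_bytes is made up for this example.

```shell
# Hypothetical helper: estimated size in bytes of uncompressed 16-bit PCM audio.
# Arguments: duration in seconds, channel count (default 1), sample rate (default 16000).
wav_bytes() {
  seconds=$1
  channels=${2:-1}
  rate=${3:-16000}
  echo $(( seconds * channels * rate * 2 ))
}

wav_bytes 3600      # one hour, mono:   115200000 bytes (about 115 MB)
wav_bytes 3600 2    # one hour, stereo: 230400000 bytes (about 230 MB)
```

The small WAV header is ignored here, so actual files will be marginally larger.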
Whisper is an automatic speech recognition (ASR) model created by OpenAI. We can install an optimized open-source version on our computers:
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
bash ./models/download-ggml-model.sh base.en
make
./main -otxt ~/Downloads/*.wav
cat ~/Downloads/*.wav.txt >~/Downloads/transcript.txt
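A plain cat gives no indication of where one video's transcript ends and the next begins. As an optional variation, the sketch below prints a heading before each transcript. It assumes that whisper.cpp names each transcript after its input file with .txt appended (for example talk.wav.txt). The function name combine_transcripts and the heading format are made up for this example.

```shell
# Hypothetical helper: concatenate all transcripts in a directory,
# printing a heading with the source name before each one.
combine_transcripts() {
  dir=$1
  for f in "$dir"/*.wav.txt; do
    [ -e "$f" ] || continue            # no matches: the glob pattern stays literal
    name=$(basename "$f" .wav.txt)     # strip the directory and the .wav.txt suffix
    printf '== %s ==\n' "$name"
    cat "$f"
    printf '\n'
  done
}

# Example usage (assumed paths):
# combine_transcripts ~/Downloads > ~/Downloads/transcript.txt
```

Because the pattern matches only files ending in .wav.txt, neither videos.txt nor an existing transcript.txt is pulled into the output.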