This is a utility/application for creating Subtitles (i.e., Captions) for existing videos on YouTube.
I created this utility mainly out of necessity and selfishness.
The necessity is that adding Subtitles (i.e., Captions) to your YouTube videos provides:
- individuals who are hard of hearing the ability to enjoy content on YouTube
- accurate subtitles, and thereby better indexing of your YouTube content
The selfishness is that I produce a fair amount of content, and creating subtitles meant going through this time-consuming manual process:
- find the mp4 or video that I wanted to create subtitles for
- generate subtitles/captioning by submitting the mp4 to a Speech-to-Text service like Deepgram
- navigate to YouTube, find the video and then upload the Subtitles to said video
There is a fair amount of setup to this project, so the complexity and time are front-loaded. Once configured, this utility will:
- download your video from YouTube
- convert your video to mp3 (audio only) to reduce the upload time to Deepgram
- submit the mp3 to Deepgram to obtain the transcription
- convert the transcription to SRT subtitles
- upload and publish the SRT subtitles to your video
This utility does all that from a single command:
python caption_youtube_video.py --url "<your videos link in youtube>"
Depending on the length of your video, in about 30-60 seconds, you will have published subtitles on your video.
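To make the "convert the transcription to SRT subtitles" step more concrete, here is a minimal sketch of what an SRT file looks like and how one could be built from word-level timestamps. This is an illustration only: in this utility the Deepgram Python SDK produces the SRT, and the `format_timestamp` and `words_to_srt` helpers, the `word`/`start`/`end` dictionary keys, and the `max_words` grouping rule are assumptions made for the example.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=8):
    """Group timed words into numbered SRT cues.

    `words` is assumed to be a list of dicts with "word", "start" and
    "end" keys (times in seconds). Each cue holds at most `max_words`
    words; this grouping rule is hypothetical, chosen for simplicity.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = format_timestamp(chunk[0]["start"])
        end = format_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        # An SRT cue is: index, time range, text, then a blank line.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

Calling `words_to_srt(words)` on a list of timed words returns a string that can be written to a `.srt` file.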
This utility makes use of:
- youtube-dl - an open source library for downloading public YouTube videos (I have been using it for years)
- Deepgram Python SDK - this generates the SRT subtitles on the Deepgram Platform
- YouTube Data API - specifically Google's Python YouTube Data SDK
youtube-dl is used widely, but since Google doesn't like individuals downloading YouTube videos, this project is unsanctioned by Google. Google occasionally changes things that break this project, but after a few days, things start working again.
This does mean that you need to git clone this project and do a developer install of this project into pip. After cloning, change directory into the repo on disk and then run pip install -e . to make this library available to Python.
Sign up for a Deepgram account and get $200 in Free Credit (up to 45,000 minutes), absolutely free. No credit card needed!
We encourage you to explore Deepgram by checking out the following resources:
Create an API key in the Deepgram Console. Then set your API key as an environment variable.
If using bash, this could be done in your ~/.bash_profile like so (note that bash does not allow spaces around the = sign):
export DEEPGRAM_API_KEY="YOUR_DEEPGRAM_API_KEY"
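If you want to confirm from Python that the variable is visible before running the utility, a small check along these lines may help. This is a sketch, not code from this repo: the `get_deepgram_api_key` helper and its `env` parameter are hypothetical names introduced for the example.

```python
import os


def get_deepgram_api_key(env=None):
    """Return the Deepgram API key from the environment.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to test. Raises RuntimeError if the key is missing
    or blank. (Hypothetical helper, not part of this project.)
    """
    env = os.environ if env is None else env
    key = env.get("DEEPGRAM_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DEEPGRAM_API_KEY is not set; export it in your shell first."
        )
    return key
```

Remember that a change to ~/.bash_profile only takes effect in new shell sessions, or after running source ~/.bash_profile in the current one.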
You need to have a Google Cloud account, but if you are using YouTube, you probably don't even know you actually have one. You might need to click a button that says "Try for Free", but that would be about it.
If you have the YouTube Data API enabled already, you can skip this step. Otherwise, navigate to your APIs & Services Dashboard, select the project you want to use for the YouTube Data API, click + Enable APIs & Services, search for YouTube Data API v3, and then click Enable to enable the API.
In order to access and manage your YouTube content, you need to create an OAuth Client ID. In APIs & Services, click Credentials and then click Create Credentials.
For your OAuth Client ID settings, select Desktop app as the Application type and then enter a Name for your OAuth Client ID.
Then you need to give your OAuth Client ID access to your YouTube content. First, click OAuth consent screen.
Access is granted by email address. Scroll down, select + ADD USERS, and then enter the email address associated with your YouTube account/content.
The last thing you will need is ffmpeg and ffprobe installed and accessible on your PATH. A typical location to install something like this would be /usr/local/bin. If the binaries don't have execute permissions, don't forget to run chmod +x ./ffmpeg and chmod +x ./ffprobe.
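A quick way to verify that both binaries are reachable is to look them up on your PATH from Python. The sketch below also shows one plausible ffmpeg invocation for extracting an mp3 from a video. The `missing_tools` and `build_mp3_command` helpers are hypothetical, and the ffmpeg options shown are an assumption; the options this utility actually uses may differ.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names of required tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def build_mp3_command(video_path, audio_path):
    """Build an ffmpeg argument list that extracts audio as mp3.

    -y overwrites the output file, -vn drops the video stream, and
    -codec:a libmp3lame encodes the audio as mp3. These options are an
    illustrative assumption, not necessarily what this utility runs.
    """
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vn",
        "-codec:a", "libmp3lame",
        audio_path,
    ]
```

If `missing_tools()` returns an empty list, both binaries were found. The list from `build_mp3_command` can be passed to `subprocess.run(..., check=True)` to perform the conversion.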
If you chose to set an environment variable in your shell profile (i.e., .bash_profile), you can run the example at the root of this repo like so:
python caption_youtube_video.py --url "<your videos link in youtube>"
Alternatively, you can set the API key inline for a single run of the Python application:
DEEPGRAM_API_KEY="YOUR_DEEPGRAM_API_KEY" python caption_youtube_video.py --url "<your videos link in youtube>"
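The YouTube Data API identifies captions by video ID rather than by full URL, so the value passed to --url has to be reduced to an ID at some point. The sketch below shows one way to do that for the two most common link shapes. The `extract_video_id` helper is hypothetical and is not necessarily how this utility handles it.

```python
from urllib.parse import parse_qs, urlparse

# Hosts that use the /watch?v=<id> link shape (assumed list, not exhaustive).
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url):
    """Return the video ID from a youtube.com/watch or youtu.be URL.

    Raises ValueError when no ID can be found. (Hypothetical helper.)
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        video_id = parsed.path.lstrip("/")
    elif host in WATCH_HOSTS:
        # Watch links carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Could not find a video ID in: {url}")
    return video_id
```

Validating the link up front like this means a mistyped URL fails immediately instead of after the download and transcription steps.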
During the Subtitle process, the application is going to request access to your YouTube content.
You will need to verify this application's access.
You need to select the Gmail account associated with your YouTube content. If you have multiple Gmail accounts logged in, you will see them here.
Then you need to authorize the application.
Copy the Authorization Code to be used in the utility/application.
Then paste the Authorization Code into the console.
That might seem like a lot, but it's a time saver! That's it! No joke!
Interested in contributing? We ❤️ pull requests!
To make sure our community is safe for all, be sure to review and agree to our Code of Conduct. Then see the Contribution guidelines for more information.
We love to hear from you so if you have questions, comments or find a bug in the project, let us know! You can either:
- Open an issue on this repository
- Join us on Discord
- Ask a question, share the cool things you're working on, or see what else we have going on in our Community Forum
- Tweet at us! We're @DeepgramAI on Twitter
Check out the Developer Documentation at https://developers.deepgram.com/