Subtitle/Caption Tracks¶
pytubefix exposes the caption tracks in much the same way as querying the media streams. Let’s begin by switching to a video that contains them:
from pytubefix import YouTube
yt = YouTube('http://youtube.com/watch?v=2lAe1cqCOXo')
subtitles = yt.captions
print(subtitles)
Now you can save subtitles to a txt file:
from pytubefix import YouTube
yt = YouTube('http://youtube.com/watch?v=2lAe1cqCOXo')
caption = yt.captions['a.en']
caption.save_captions("captions.txt")
Now let’s checkout the english captions:
>>> caption = yt.captions['a.en']
Great, now let’s see how YouTube formats them:
>>> caption.xml_captions
'<?xml version="1.0" encoding="utf-8" ?><transcript><text start="10.2" dur="0.94">K-pop!</text>...'
Oh, this isn’t very easy to work with, let’s convert them to the srt format:
>>> print(caption.generate_srt_captions())
1
00:00:10,200 --> 00:00:11,140
K-pop!
2
00:00:13,400 --> 00:00:16,200
That is so awkward to watch.
...