This is article 13 of the YouTube API With PHP series.
This call downloads a caption track of a video. The caption track is returned in its original format unless the request specifies a value for the tfmt parameter, and in its original language unless the request specifies a value for the tlang parameter. Since this call requires user authentication, it can only download caption tracks of videos which belong to you.
The request URL is:
GET https://www.googleapis.com/youtube/v3/captions/(id)
Parameters
- key (string) required. Your API key.
- id (string) required. This is added as a suffix to the GET URL. It has to be a valid id for an existing caption track.
- onBehalfOfContentOwner (string) optional. This is relevant only for YouTube Channel Partners. For this parameter, the API request should have user authentication. We will not be exploring this option.
- tfmt (string) optional. The tfmt parameter specifies that the caption track should be returned in a specific format. If the parameter is not included in the request, the track is returned in its original format. Possible values are:
- sbv – SubViewer subtitle
- scc – Scenarist Closed Caption format
- srt – SubRip subtitle
- ttml – Timed Text Markup Language caption
- vtt – Web Video Text Tracks caption
- tlang (string) optional. The tlang parameter specifies that the API response should return a translation of the specified caption track. The parameter value is an ISO 639-1 two-letter language code that identifies the desired caption language. The translation is generated by using machine translation, such as Google Translate.
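The parameters above can be combined into a request URL along these lines. This is only a sketch: buildCaptionDownloadUrl is a hypothetical helper name, and CAPTION_ID and API_KEY are placeholders, not real values. It rejects tfmt values outside the list given above before any request is made.

```php
<?php
// Hypothetical helper (not part of the YouTube API): builds the download URL
// from the parameters described above.
function buildCaptionDownloadUrl(string $captionId, string $apiKey, ?string $tfmt = null, ?string $tlang = null): string
{
    // The tfmt values listed in this article.
    $allowedFormats = array('sbv', 'scc', 'srt', 'ttml', 'vtt');
    if ($tfmt !== null && !in_array($tfmt, $allowedFormats, true)) {
        throw new InvalidArgumentException("Unsupported tfmt value: $tfmt");
    }

    $query = array('key' => $apiKey);
    if ($tfmt !== null) {
        $query['tfmt'] = $tfmt;
    }
    if ($tlang !== null) {
        $query['tlang'] = $tlang;
    }

    // The caption id is a path suffix, so it is encoded separately from the query string.
    return 'https://www.googleapis.com/youtube/v3/captions/'
        . rawurlencode($captionId)
        . '?' . http_build_query($query, '', '&');
}

// Placeholder values for illustration only.
echo buildCaptionDownloadUrl('CAPTION_ID', 'API_KEY', 'vtt', 'fr'), PHP_EOL;
```

Leaving tfmt and tlang as null produces the same kind of URL the sample code in this article uses, with only the key parameter.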
Response
On a successful call, a binary file is returned, which should be saved to local disk.
Here is sample code which downloads a Caption track:
<?php
error_reporting(E_ALL ^ E_NOTICE ^ E_WARNING ^ E_DEPRECATED);
set_time_limit(60 * 3);
session_start();

$clientId = "**";
$clientSecret = "**-";
$g_youtubeDataAPIKey = "**";
$captionId = "STDMK4mG9ONQRc2kPO88VeQN1mlD15SHfV5I8hY1acQ=";

$_SESSION["code_id"] = $_SERVER["PHP_SELF"];

if ($_SESSION["access_token"] == null || $_SESSION["access_token"] == "") { // check for oauth response
    // no access token yet, so send the user through the login flow
    header("Location: ../../init-login.php");
    exit;
}
$accessToken = $_SESSION["access_token"];

// make api request
$url = "https://www.googleapis.com/youtube/v3/captions/" . $captionId . "?key=" . $g_youtubeDataAPIKey;
$curl = curl_init();
curl_setopt_array($curl, array(
    // the OAuth 2.0 access token goes in the header as a Bearer token
    CURLOPT_HTTPHEADER => array('Authorization: Bearer ' . $accessToken),
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_URL => $url,
    CURLOPT_USERAGENT => 'YouTube API Tester',
    CURLOPT_SSL_VERIFYPEER => 1,
    // 2 verifies that the certificate matches the host name
    CURLOPT_SSL_VERIFYHOST => 2,
    // CURLOPT_CAPATH expects a directory, so only CURLOPT_CAINFO points at the bundle file
    CURLOPT_CAINFO => "../../cert/cacert.pem",
    CURLOPT_FOLLOWLOCATION => TRUE
));
$resp = curl_exec($curl);
curl_close($curl);

var_dump($resp);
?>
Here is the output:
string(3265) "0:00:00.000,0:00:08.220 a few months back sometime in 2016
I've 0:00:06.120,0:00:10.800 made a video which showcased the 0:00:08.220,0:00:13.080 features
of the web speech API the West 0:00:10.800,0:00:15.020 peach API is a technology made by google
0:00:13.080,0:00:18.600 which lets you do speech recognition 0:00:15.020,0:00:21.359 within your
browser unfortunately even 0:00:18.600,0:00:23.490 as of now the only browser which fully
0:00:21.359,0:00:26.160 supports that specification is google 0:00:23.490,0:00:28.769 chrome
the other browsers haven't really 0:00:26.160,0:00:30.449 got support for it so i suggest you
have 0:00:28.769,0:00:32.489 a look at that video first in order to 0:00:30.449,0:00:34.590
get an idea of the capabilities of web 0:00:32.489,0:00:37.350 speech API the link is right
below this 0:00:34.590,0:00:39.030 video so this time round i decided to 0:00:37.350,0:00:42.120
take that experiment a little further 0:00:39.030,0:00:44.940 and what i have here
is a single web 0:00:42.120,0:00:48.000 page application which does real-time 0:00:44.940,0:00:50.280
translation so the processing pipeline 0:00:48.000,0:00:53.399 is very simple the first thing
it does 0:00:50.280,0:00:56.160 is that it accepts speech using a 0:00:53.399,0:00:59.489
microphone and then translates the 0:00:56.160,0:01:01.699 speech into written text that text
is 0:00:59.489,0:01:03.899 them fed into a translation api which 0:01:01.699,0:01:07.140
translates that text into another 0:01:03.899,0:01:09.479 language and then that translated
takes 0:01:07.140,0:01:11.670 to spread into a text-to-speech engine 0:01:09.479,0:01:14.159 which
then plays back that text as 0:01:11.670,0:01:15.390 reporters audio so you can speak in one
0:01:14.159,0:01:17.009 language and you can hear the 0:01:15.390,0:01:19.860 translation
of the same thing in a 0:01:17.009,0:01:22.799 different language now since what
i'm 0:01:19.860,0:01:25.439 using your motif c-suite engines and AP 0:01:22.799,0:01:27.000 is the
machine translation is not really 0:01:25.439,0:01:29.930 hundred percent perfect and sometimes
0:01:27.000,0:01:33.270 the results can be quite funny but this 0:01:29.930,0:01:35.549 still forms
the base for any kind of 0:01:33.270,0:01:38.820 translation application of
software 0:01:35.549,0:01:40.020 which one might make and this can use as 0:01:38.820,0:01:41.880 a
base to make something more 0:01:40.020,0:01:44.990 sophisticated so i'm just going
to show 0:01:41.880,0:01:44.990 you a few examples here 0:01:47.100,0:01:54.240 what is your
name and where you live you 0:01:51.930,0:01:57.409 will have a callous me why I shouldn't 0:01:54.240,0:02:02.909
even care where I usually say material 0:01:57.409,0:02:07.289 assiyah
me here for this table do you 0:02:02.909,0:02:13.920 believe them do you have any money
with 0:02:07.289,0:02:19.470 you muscle to kill every one
degree to 0:02:13.920,0:02:21.750 be happy imma get
between the phenomena 0:02:19.470,0:02:28.850 Tomas famous me your hundred your
0:02:21.750,0:02:28.850 amateur thank you for the kind words 0:02:29.780,0:02:39.080 necessary
master Anahata name is a photo 0:02:33.330,0:02:39.080 that dress PS different people around "
We pass the Caption track id as part of the API call. We also pass the OAuth token as part of the headers instead of the URL.
What the call returns is the contents of the caption file. In this example we are just dumping the contents; ideally you would save them to a file. Since we have not specified a format via the tfmt parameter, the contents are returned in the format in which the track was originally uploaded.
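Saving the response instead of dumping it could look roughly like this. It is a minimal sketch: saveCaptionTrack is a hypothetical helper, the .txt fallback extension is an assumption for the case where the original format is unknown, and the body used here is a stand-in for the $resp returned by curl_exec.

```php
<?php
// Hypothetical helper: writes the caption body returned by curl_exec() to disk
// and returns the path of the file that was written.
function saveCaptionTrack(string $body, string $directory, string $captionId, string $tfmt = ''): string
{
    // Use the requested format as the extension; fall back to .txt (an assumption)
    // when no tfmt was requested and the original format is unknown.
    $extension = $tfmt !== '' ? $tfmt : 'txt';

    // Caption ids can contain characters such as "=", so keep only filename-safe ones.
    $safeName = preg_replace('/[^A-Za-z0-9_-]/', '', $captionId);

    $path = rtrim($directory, '/') . '/' . $safeName . '.' . $extension;
    if (file_put_contents($path, $body) === false) {
        throw new RuntimeException("Could not write caption file: $path");
    }
    return $path;
}

// Stand-in body for illustration; in the sample script this would be $resp.
$body = "0:00:00.000,0:00:08.220\na few months back sometime in 2016\n";
$path = saveCaptionTrack($body, sys_get_temp_dir(), 'demoCaptionId=', 'sbv');
echo $path, PHP_EOL;
```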
We can set the tfmt parameter as part of the API URL as tfmt=xxx where xxx is a valid string as mentioned in the tfmt specs above.
Here is how the previous output looks if we set tfmt=vtt:
string(12175) "WEBVTT Kind: captions Language: en Style: ::cue(c.colorCCCCCC) { color: rgb(204,204,204); }
::cue(c.colorE5E5E5) { color: rgb(229,229,229); } ## 00:00:00.000 --> 00:00:08.220
align:start position:19% a<00:00:01.280> few<00:00:02.280> months<00:00:02.429> back<00:00:02.790>
sometime<00:00:03.780> in<00:00:04.160> 2016<00:00:05.160> I've 00:00:06.120 --> 00:00:10.800
align:start position:19% made<00:00:06.509> a<00:00:06.540> video<00:00:06.839> which<00:00:07.200>
showcased<00:00:08.040> the 00:00:08.220 --> 00:00:13.080 align:start position:19% features<00:00:08.580>
of<00:00:08.610> the<00:00:08.910> web<00:00:09.090> speech<00:00:09.389> API<00:00:09.590>
the<00:00:10.590> West 00:00:10.800 --> 00:00:15.020 align:start position:19% peach<00:00:11.070>
API<00:00:11.460> is<00:00:11.580> a<00:00:11.820> technology<00:00:12.210> made<00:00:12.690>
by<00:00:12.870> google 00:00:13.080 --> 00:00:18.600 align:start position:19% which<00:00:13.650>
lets<00:00:13.950> you<00:00:14.099> do<00:00:14.280> speech<00:00:14.849> recognition
00:00:15.020 --> 00:00:21.359 align:start position:19% within<00:00:16.020>
your<00:00:16.260> browser<00:00:17.420> unfortunately
Note that specifying a tfmt value does not guarantee that content is returned in that format. If the value is invalid, or the track cannot be converted to that format, no content is returned.
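Because an empty response is possible, it helps to check the result before using it. The sketch below uses a hypothetical classifyCaptionResponse helper. It assumes error bodies are JSON with an "error" member, either an object carrying a "message" or a plain string, which may not hold for every failure. In the sample script the status code would have to be read with curl_getinfo($curl, CURLINFO_HTTP_CODE) before curl_close is called.

```php
<?php
// Hypothetical helper (not part of the YouTube API): classifies the result of a
// caption download from the HTTP status code and the raw response body.
function classifyCaptionResponse(int $httpStatus, $body): array
{
    // curl_exec() returns false when the transfer itself failed.
    if ($body === false) {
        return array('ok' => false, 'reason' => 'cURL request failed');
    }

    if ($httpStatus !== 200) {
        // Assumption: the error body is JSON whose "error" member is either an
        // object with a "message" or a plain string.
        $decoded = json_decode($body, true);
        $error = (is_array($decoded) && isset($decoded['error'])) ? $decoded['error'] : null;
        if (is_array($error) && isset($error['message'])) {
            $reason = $error['message'];
        } elseif (is_string($error)) {
            $reason = $error;
        } else {
            $reason = 'HTTP ' . $httpStatus;
        }
        return array('ok' => false, 'reason' => $reason);
    }

    // A successful status with no content: the requested tfmt may not be available.
    if (trim($body) === '') {
        return array('ok' => false, 'reason' => 'empty body, the requested tfmt may not be available');
    }

    return array('ok' => true, 'reason' => '');
}

// Stand-in values for illustration only.
var_dump(classifyCaptionResponse(200, ''));
```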
plz help
/opt/lampp/htdocs/yt-sub/init-login.php:75:string ‘{
“error”: “invalid_grant”,
“error_description”: “Token has been expired or revoked.”
}
‘ (length=90)
/opt/lampp/htdocs/yt-sub/index.php:42:string ‘The permissions associated with the request are not sufficient to download the caption track. The request might not be properly authorized, or the video order might not have enabled third-party contributions for this caption.’ (length=225)
@jamal Are you downloading the caption track of a video which belongs to you? You cannot download caption tracks of videos belonging to other people. Also, please check that your credentials are being passed with the right parameters.
Is there another way to download subtitles for any YouTube video?
No, not really. The API does not allow downloading captions for videos which do not belong to the current user.
Thank you
I found this site, diycaptions.com, and I would like to know how it works, since it does not seem to need the YouTube API.
They must be using some other method. This blog post is only about the YouTube API way of downloading a caption file.