[youtube-dl] Download files from youtube.com or other video platforms

No, the duration bug is apparently old.

To get any further you'll have to reveal what show URLs you were actually passing, and ideally the youtube-dl -F -v ... output for at least one of those, as I mentioned.

The program IDs that failed in the posted qtube log were not valid, having an extra 0. So either they were wrong when passed to qtube, or they were incorrectly formed by qtube or the yt-dl ITV extractor.
 
Thanks. You are correct about the extra zero, which explains the initial failure. I had queued a few downloads and edited the URL number suffix from '9' (0009) to '15', but typed '00015' when it should have been '0015'.
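The suffix mix-up above is easy to repeat when editing URLs by hand. A minimal sketch, assuming (from the two examples in this thread only) that the ITV Hub programme ID ends in a zero-padded four-digit episode number; the helper name `itv_hub_url` is made up for illustration and is not part of qtube or youtube-dl:

```python
def itv_hub_url(show, id_prefix, episode):
    """Build an ITV Hub URL with a four-digit, zero-padded episode suffix.

    Assumption: the suffix is always four digits, as in 0009 and 0015.
    """
    if not 0 <= episode <= 9999:
        raise ValueError('episode number does not fit in four digits')
    return 'https://www.itv.com/hub/{0}/{1}{2:04d}'.format(show, id_prefix, episode)


print(itv_hub_url('endeavour', '2a1229a', 9))
print(itv_hub_url('endeavour', '2a1229a', 15))
```

Letting the format specifier do the padding avoids counting zeros when queueing several episodes.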
I still cannot get downloads to work from ITV Hub, though. I have tried three different programmes, but all downloads fail when trying to read the webpage:
Code:
youtube -F https://www.itv.com/hub/endeavour/2a1229a0009                   
[ITV] 2a1229a0009: Downloading webpage
ERROR: Unable to download webpage: <urlopen error The read operation timed out>(caused by URLError(SSLError('The read operation timed out',),))
Downloads fail both with 'fakehttp' and a modified 'ITV.py', and with an unmodified installation of youtube-dl.
 
This timeout is a denial of service amounting to a protocol violation by whatever fronts the ITV Hub site. Instead of looking at the web request and replying according to the HTTP specification, with either "Yes, here's the requested resource" or "No, because ..." (eg 404 - not found, 403 - not permitted, ...), it just ignores the request. That is not because it has too much to do, which might be acceptable, but because it doesn't like the look of a completely protocol-conformant request. Cloudflare and Akamai are the two chief suspects in this behaviour, which is now affecting a lot of sites.

By changing the request in some way the virtual bouncer may be bypassed at least temporarily.

I found that setting a UA of just Mozilla/5.0 worked today, with the current PR version:
Code:
$ python -m youtube_dl --user-agent 'Mozilla/5.0' -F -v --no-geo-bypass 'https://www.itv.com/hub/endeavour/2a1229a0009'
[debug] System config: [u'--prefer-ffmpeg']
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--user-agent', u'Mozilla/5.0', u'-F', u'-v', u'--no-geo-bypass', u'https://www.itv.com/hub/endeavour/2a1229a0009']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Git HEAD: 0945174fa
[debug] Python version 2.7.17 (CPython) - Linux-4.4.0-210-generic-i686-with-Ubuntu-16.04-xenial
[debug] exe versions: avconv 4.3, avprobe 4.3, ffmpeg 4.3, ffprobe 4.3
[debug] Proxy map: {}
[ITV] 2a1229a0009: Downloading webpage
[ITV] 2a1229a0009: Downloading JSON metadata
[ITV] 2a1229a0009: Downloading m3u8 information
[info] Available formats for 2a1229a0009:
format code  extension  resolution note
hls-68       mp4        audio only   68k , mp4a.40.2@ 64k
hls-102      mp4        audio only  102k , mp4a.40.2@ 96k
hls-285      mp4        512x288     285k , avc1.4D401F@ 203k, 25.0fps, mp4a.40.2@ 64k
hls-636      mp4        512x288     636k , avc1.4D401F@ 504k, 25.0fps, mp4a.40.2@ 96k
hls-848      mp4        512x288     848k , avc1.4D401F@ 703k, 25.0fps, mp4a.40.2@ 96k
hls-1272     mp4        896x504    1272k , avc1.4D401F@1103k, 25.0fps, mp4a.40.2@ 96k
hls-1908     mp4        896x504    1908k , avc1.4D401F@1703k, 25.0fps, mp4a.40.2@ 96k (best)
$
I also set --no-geo-bypass because I (and presumably most members of the forum) don't need to use a fake UK IP, and that might also trigger a virtual "no trainers" or "where's your jacket".

And this is the same thing (abbreviated) with the latest youtube package on a HD-Fox T2, updated with the latest PR version of itv.py patched to use fakehttp.py:
Code:
# youtube-dl --user-agent 'Mozilla/5.0' -F -v --no-geo-bypass "https://www.itv.com/hub/endeavour/2a1229a0009"
[debug] System config: [u'--restrict-filenames', u'--prefer-ffmpeg', u'-f', u'best[height<=?1080][fps<=?60]', u'-o', u'/media/drive1/Video/%(title)s.%(ext)s']
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--user-agent', u'Mozilla/5.0', u'-F', u'-v', u'--no-geo-bypass', u'https://www.itv.com/hub/endeavour/2a1229a0009']
[debug] Encodings: locale ASCII, fs ASCII, out ASCII, pref ASCII
[debug] youtube-dl version 2020.09.20
[debug] Python version 2.7.1 (CPython) - Linux-2.6.18-7.1-7405b0-smp-with-libc0
[debug] exe versions: ffmpeg 4.1, ffprobe 4.1
[debug] Proxy map: {}
[ITV] 2a1229a0009: Downloading webpage
[ITV] 2a1229a0009: Downloading JSON metadata
[ITV] 2a1229a0009: Downloading m3u8 information
[info] Available formats for 2a1229a0009:
format code  extension  resolution note
hls-68       mp4        audio only   68k , mp4a.40.2@ 64k
...
hls-1908     mp4        896x504    1908k , avc1.4D401F@1703k, 25.0fps, mp4a.40.2@ 96k (best)
#
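To check whether the front end answers at all for a given UA without involving youtube-dl, something like the following might do. It is only a sketch in Python 3 standard library (so not for the 2.7.1 on the box); the function names `build_request` and `probe` are made up here, and it sends nothing beyond the one header, so a site that also inspects other headers or the TLS handshake may still behave differently:

```python
import socket
import ssl
import urllib.error
import urllib.request


def build_request(url, user_agent='Mozilla/5.0'):
    """Return a Request carrying only the chosen User-Agent header."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def probe(url, user_agent='Mozilla/5.0', timeout=20):
    """Fetch url once; return the HTTP status, or a string describing the failure."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server did answer, just not with a success status.
        return err.code
    except (urllib.error.URLError, socket.timeout, ssl.SSLError) as err:
        # No proper HTTP reply at all, e.g. the read timeout seen above.
        return 'no answer: {0}'.format(err)
```

Calling `probe('https://www.itv.com/hub/endeavour/2a1229a0009')` and then again with a different `user_agent` would show whether the UA alone decides between a reply and a timeout.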
 
Thanks for investigating this and finding the cause. As you said, using the Mozilla user agent has got it working again:
Code:
--user-agent 'Mozilla/5.0'
 
@/df
Is it worth updating our package to 2021.12.17 and if so, which versions of youtube.py (30184?) and itv.py (30266?) should be used? Our then current ones (around Nov 24) or something else? I'm completely lost in the mire.
 
You could use the 2021.12.17 version with the drop-in youtube.py from PR 30184 and itv.py from PR 30266, although apart from the version number that would be the same code as 2021.06 with those updates.

Once the UA issue becomes clearer, a different default for ITV might be enforced, but currently it's a manual work-around (maybe the global default UA for yt-dl needs to be generic instead of, as now, picking a random Chrome-alike UA string). Also, there probably won't be another release before Feb at the earliest.
 
I gave up using yt-dl on my desktop and now use yt-dlp which seems to be much better and is updated regularly. I had automated a lot of yt-dl and the transition was painless.
Just a thought...
 
ATM yt-dlp is the best solution if your system supports it. I understand that yt-dl is trying to recruit some new and, at least initially, more committed maintainers to keep the project up to date for older Pythons.

For instance, yt-dlp can't run on the CF unless someone manages to build Python 3.6 or later (2.7.1 now); even then it might be the last straw for the system's CPU/memory limits.
 
I understand that yt-dl is trying to recruit some new and, at least initially, more committed maintainers to keep the project up to date for older Pythons.
Have you been asked? :)
yt-dlp can't run on the CF unless someone manages to build Python 3.6 or later (2.7.1 now)
We have 2.7.1.
even then it might be the last straw for the system's CPU/memory limits.
How/why does dlp take significantly more than dl?
 
Have you been asked? :)
:)
We have 2.7.1.
Which luckily is just good enough for yt-dl, though 2.7.18 would be better.
How/why does dlp take significantly more than dl?
Just my suspicion based on Py 3.6+ vs 2.7, as with these x86 Linux binaries:
Code:
$ ls -l /usr/bin/python?.?
-rwxr-xr-x 1 root root 3583160 Mar  5  2021 /usr/bin/python2.7
-rwxr-xr-x 2 root root 4772160 Sep 13 15:49 /usr/bin/python3.5
-rwxr-xr-x 1 root root 5622220 Sep  4 19:19 /usr/bin/python3.9
$
 
though 2.7.18 would be better.
I looked at it, and I think I got something built towards the end of last year, but without knowing the options our package was built with, it's not going to work properly. I'm not sure how to package it either. I wish af123 had published the details of how to build these standard packages. I've worked out jim and sqlite3 so far...
 
I think I got something built towards the end of last year
Code:
humax /tmp # ./python
Python 2.7.18 (default, Jan 23 2022, 13:56:30)
[GCC 4.2.0 20070124 (prerelease) - BRCM 11ts-20090508] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

Probably better split this thread if we get onto Python related stuff though.
 
 
To add to post #463: downloads from ITV Hub with version 2021.06.06e of youtube-dl appear to be working (complete with subtitles) with just the Mozilla/5.0 user agent; the use of 'fakehttp' with a patched 'ITV.py' does not seem to be necessary. It would be good to test this further to be sure (I have tried four different downloads to date).
 
To add to the post #463, ...
Thanks:
  • the e version appears to have the HEAD~ (penultimate) version of the ITV extractor from PR 30266, ie all the functional fixes, but not the duration fix;
  • if ITV doesn't care about SSL/TLS versions, fakehttp isn't necessary, and apparently it doesn't (yet).
Of the major English language sites, Vimeo was the only one that needed fakehttp when I last checked. Several sites of continental European broadcasters need it too.
 
In case this is of help to anyone: I was trying to qtube a Rick Stein's Cornwall programme, and the link on the BBC website was https://www.bbc.co.uk/food/programmes/...

That failed, but succeeded once I removed the "/food".
There was also a proper iPlayer link just below.

Apparently Food is so important to the BBC that it's the only area of the site to have links like .../xxx/programmes/... Nonetheless we can allow for that, and any other such cases:
Code:
--- old/youtube-dl/youtube_dl/extractor/bbc.py
+++ new/youtube-dl/youtube_dl/extractor/bbc.py
@@ -44,7 +44,7 @@
                     https?://
                         (?:www\.)?bbc\.co\.uk/
                         (?:
-                            programmes/(?!articles/)|
+                            (?:\w+/)?programmes/(?!articles/)|
                             iplayer(?:/[^/]+)?/(?:episode/|playlist/)|
                             music/(?:clips|audiovideo/popular)[/#]|
                             radio/player/|
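The effect of the changed line can be checked in isolation. This is a sketch only: the two patterns below are cut-down stand-ins for the "programmes" branch, not the extractor's full URL pattern, and the programme ID `x0000000` is a made-up placeholder:

```python
import re

# Simplified stand-ins for the "programmes" branch before and after the patch;
# the real pattern in bbc.py has more alternatives and captures an ID.
OLD = re.compile(r'https?://(?:www\.)?bbc\.co\.uk/programmes/(?!articles/)')
NEW = re.compile(r'https?://(?:www\.)?bbc\.co\.uk/(?:\w+/)?programmes/(?!articles/)')

# Hypothetical programme ID, used only to exercise the patterns.
urls = [
    'https://www.bbc.co.uk/programmes/x0000000',
    'https://www.bbc.co.uk/food/programmes/x0000000',
    'https://www.bbc.co.uk/programmes/articles/x0000000',
]
for url in urls:
    # Columns: URL, matched by old pattern, matched by new pattern.
    print(url, bool(OLD.match(url)), bool(NEW.match(url)))
```

The optional `(?:\w+/)?` group admits exactly one extra path segment before `programmes/`, so `/food/programmes/` now matches, while the `articles/` exclusion still applies in both cases.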
 
You could use the 2021.12.17 version with the drop-in youtube.py from PR 30184 and itv.py from PR 30266, although apart from the version number that would be the same code as 2021.06 with those updates.
I've built and published 2021.12.17a with those and the above food patch. Seems to work on the things I've tested on (BBC food; ITV with user-agent; Youtube).
 
What about the situation where replacing "sounds" with "programmes" in a link such as "bbc.co.uk/sounds/xyz" makes it work (but not always)?
 