[youtube-dl] Download files from youtube.com or other video platforms

bottletop · May 25, 2020

Thanks for the pointers /df, you're right about the read construct which was what I changed (after unsuccessfully trying to use awk and getting myself thoroughly confused!).
I did eventually manage to get the script working on the Humax default ash environment. I placed my files in newly created directory /mod/tmp/ztools/.
I've only got average bash scripting skills so it is a bit of a trial and error route for me.
Also great tip regarding the Python script.

This script simply renames ITV .en.srt files to .srt, or calls the Python script to convert BBC .ttml files to .srt.
Create the script as /mod/tmp/ztools/srt-subtitles.sh, change istartdir ("/mnt/hd2/My Video/60-Downloaded_Video/") to suit your setup, and make the script executable.

Code:

#!/mod/bin/busybox/ash
#Find ttml and en.srt subtitle files and check whether a matching srt exists.
#If the srt does NOT exist, convert ttml to srt using the python script ttml2srt.py
#or rename the en.srt file to .srt
# /mod/bin/bash    /mod/bin/busybox/ash
# Note- change istartdir="/mnt/hd2/My Video/60-Downloaded_Video/" to suit your needs

ibatch=srt-subtitles.sh
echo $ibatch "..starts" $(date)
istartdir="/mnt/hd2/My Video/60-Downloaded_Video/"

find "$istartdir" -type f  \( -name '*.en*.ttml' -o -name '*.en.srt' \)  -print |
while IFS= read -r   iitem; do
#    printf '%s\n' "$iitem"
   ipath="${iitem%/*}"
   ifilename="${iitem##*/}"
   ibasename="${ifilename%%.*}"
   isuffix="${iitem##*.}"
   isrtname="$ibasename.srt"
   inewfile="$ipath/$isrtname"

   if [ ! -f "$inewfile" ]
   then
    if [ "$isuffix" = "ttml" ]
    then
     echo "Creating srt-"  "$inewfile"
     python /mod/tmp/ztools/ttml2srt.py "$iitem"  > "$inewfile" ;
    else
     echo Renaming "$iitem"  to  "$inewfile" ;
     mv "$iitem"  "$inewfile" ;
    fi
   fi
done

echo $ibatch "..ends  " $(date)
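
For anyone adapting the script, here is a minimal sketch of how the target .srt name is derived, using the same parameter expansions as above. The function name srt_target and the sample paths are made up for illustration. Note that "${ifilename%%.*}" cuts at the first dot, so a title that itself contains a dot gets truncated.

```shell
#!/bin/sh
# Hypothetical helper (not part of the posted script): shows the name
# derivation only, using the same expansions as the loop body.
srt_target() {
    iitem=$1
    ipath="${iitem%/*}"           # directory part: strip shortest "/..." suffix
    ifilename="${iitem##*/}"      # file part: strip longest ".../" prefix
    ibasename="${ifilename%%.*}"  # cut at the FIRST dot in the file name
    printf '%s\n' "$ipath/$ibasename.srt"
}

# Sample paths are invented for the demonstration.
srt_target "/mnt/hd2/My Video/Show/Episode 1.en.srt"
srt_target "/mnt/hd2/My Video/Show/Episode 1.en-GB.ttml"
# A dot inside the title truncates the name to ".../Mr.srt":
srt_target "/mnt/hd2/My Video/Show/Mr. Bean.en.srt"
```

The first two calls both give "/mnt/hd2/My Video/Show/Episode 1.srt". If your recordings have dots in their titles, the expansion would need adjusting.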

To run the script every 30 minutes, install the cron-daemon package and add this line to the end of the file /mod/var/spool/cron/crontabs/root:
*/30 * * * * /mod/tmp/ztools/srt-subtitles.sh >> "/mod/tmp/srt-subtitles-"$(date +%Y%m)".log" 2>&1

Then reboot the Humax or restart the cron service:
service cron restart

Ezra Pound · Jun 10, 2020

Not sure if this has already been highlighted but I have just tried to download an audio only file from Youtube and got the following error, is it possible to download non video Youtube content?

Code:

Humax HDR-Fox T2 (Humax1) 1.03.12/3.13

Humax1# youtube -F https://m.youtube.com/watch?v=DQ-f_OIFnrw
[youtube] DQ-f_OIFnrw: Downloading webpage
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Humax1#

EDIT
Oops, I was using an old version of the youtube package (2020.03.08); the current one, 2020.05.29, works fine.

/df · Jun 10, 2020

Ezra Pound said:
Not sure if this has already been highlighted but I have just tried to download an audio only file from Youtube and got the following error, is it possible to download non video Youtube content?

Code:

Humax HDR-Fox T2 (Humax1) 1.03.12/3.13

Humax1# youtube -F https://m.youtube.com/watch?v=DQ-f_OIFnrw
[youtube] DQ-f_OIFnrw: Downloading webpage
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Humax1#

I can't repeat that with youtube-dl of 2020.05.29:

Code:

# youtube -F https://m.youtube.com/watch?v=DQ-f_OIFnrw
[youtube] DQ-f_OIFnrw: Downloading webpage
[youtube] DQ-f_OIFnrw: Downloading js player 16a691a1
[youtube] DQ-f_OIFnrw: Downloading js player 16a691a1
[info] Available formats for DQ-f_OIFnrw:
format code  extension  resolution note
249          webm       audio only tiny   59k , opus @ 50k (48000Hz), 932.93KiB
250          webm       audio only tiny   76k , opus @ 70k (48000Hz), 1.18MiB
140          m4a        audio only tiny  130k , m4a_dash container, mp4a.40.2@128k (44100Hz), 2.33MiB
251          webm       audio only tiny  155k , opus @160k (48000Hz), 2.42MiB
160          mp4        192x144    144p   37k , avc1.4d400b, 25fps, video only, 584.33KiB
278          webm       192x144    144p   55k , webm container, vp9, 25fps, video only, 953.06KiB
242          webm       320x240    240p   85k , vp9, 25fps, video only, 1.06MiB
133          mp4        320x240    240p   90k , avc1.4d400d, 25fps, video only, 1.35MiB
243          webm       480x360    360p  185k , vp9, 25fps, video only, 2.60MiB
134          mp4        480x360    360p  205k , avc1.4d4015, 25fps, video only, 3.00MiB
244          webm       640x480    480p  362k , vp9, 25fps, video only, 5.97MiB
135          mp4        640x480    480p  636k , avc1.4d401e, 25fps, video only, 10.14MiB
18           mp4        480x360    360p  358k , avc1.42001E, 25fps, mp4a.40.2@ 96k (44100Hz), 6.45MiB (best)

Perhaps there was a glitch while yt-dl was inspecting the site?

Ezra Pound · Jun 10, 2020

Sorry I have just edited my post #302, I was using an old version 2020.03.08

/df · Jul 29, 2020

MymsMan said:
...Qtube makes it easy to queue the download requests and now this tool will save the need for individual cutting and pasting of episode URLs - I will try to incorporate it into the webif

I found that an iPlayer episode URL wasn't accepted by the WebIf Qtube plugin, but inserting this in the current version of the iplayer-episodes script gave a command-line tool that added all the found episodes to the queue:

Code:

youtube() {
        [ "$1" = -a ] && shift 2
        while read -r line; do
                qtube "$@" "$line"
        done
}
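
A quick way to see what this override does, without touching the real queue: the sketch below repeats the function and replaces qtube with a stub that only prints. The stub and the example.invalid URLs are assumptions for the demonstration. The iplayer-episodes script ends its pipeline with "youtube -a -" (read the URL list from stdin), so a shell function named youtube takes over that call, drops the "-a -" pair and hands each line to qtube.

```shell
#!/bin/sh
# Stub standing in for the real qtube command (assumption for this demo):
# it prints what would be queued instead of queueing anything.
qtube() {
    printf 'would queue: %s\n' "$*"
}

# The override as posted: discard "-a -" and queue each URL read from stdin.
youtube() {
    [ "$1" = -a ] && shift 2
    while read -r line; do
        qtube "$@" "$line"
    done
}

# Made-up URLs in place of the scraped episode list.
printf '%s\n' 'https://example.invalid/ep1' 'https://example.invalid/ep2' |
    youtube -a -
```

This prints one "would queue:" line per URL. Any options left after the shift would be passed through to qtube ahead of the URL.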

Black Hole · Jul 29, 2020

/df said:
I found that an iPlayer episode URL wasn't accepted by the WebIf Qtube plugin

Do you mean a series URL?

/df · Jul 29, 2020

Black Hole said:
Do you mean a series URL?

Indeed.

Also, the ffmpeg downloader is very chatty. This mod (log at most one "[download]" line for each decimal percent complete) makes it about 5 times less so:

Code:

--- /mod/webif/plugin/qtube/queue.hook
+++ /mod/webif/plugin/qtube/queue.hook.new
@@ -27,7 +27,7 @@
 
     set cmd "youtube --newline $opts $url"
     if {[catch {exec {*}$cmd \
-            | awk {{print strftime("%d/%m/%Y %H:%M:%S -"), $0; fflush() }} \
+            | awk {BEGIN { pct="" }; { if ($2 !~ /[.[:digit:]]%/ || $2 != pct) {print strftime("%d/%m/%Y %H:%M:%S -"), $0; fflush(); pct=$2 }}} \
             >@$::qlogfd } msg catchopts]
             } {
         log "Caught error: $msg" 0
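
The filter logic can be tried on its own outside the plugin. This is a simplified sketch, not the patch itself: the strftime() timestamp is left out, [0-9.] stands in for [.[:digit:]] so that it also runs on awk builds without POSIX character classes, and the sample log lines are invented. The function name dedup_progress is hypothetical.

```shell
#!/bin/sh
# Print a line unless its second field is a percentage identical to the
# one last printed; non-percentage lines always pass through.
dedup_progress() {
    awk '{ if ($2 !~ /[0-9.]%/ || $2 != pct) { print; pct = $2 } }'
}

# Invented sample of downloader output.
printf '%s\n' \
    '[download]   1.0% of 10MiB at 1MiB/s' \
    '[download]   1.0% of 10MiB at 2MiB/s' \
    '[download]   1.1% of 10MiB at 2MiB/s' \
    '[ffmpeg] Merging formats' |
    dedup_progress
```

Of the four sample lines, the second is dropped because its percentage repeats the first; the others are printed.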

MymsMan · Jul 29, 2020

/df said:
I found that an iPlayer episode URL wasn't accepted by the WebIf Qtube plugin, but inserting this in the current version of the iplayer-episodes script gave a command-line tool that added all the found episodes to the queue:

Code:

youtube() {
        [ "$1" = -a ] && shift 2
        while read -r line; do
                qtube "$@" "$line"
        done
}

Please stop reminding me how long my to-do list is!

prpr · Jul 29, 2020

/df said:
I found that an iPlayer episode URL wasn't accepted by the WebIf Qtube plugin, but inserting this in the current version of the iplayer-episodes script gave a command-line tool that added all the found episodes to the queue

iplayer-episodes is part of youtube-dl, not of qtube. Adding the code as listed would create a dependency in the former on the latter. This is not good as the reverse already applies.

Black Hole · Jul 29, 2020

Black Hole said:
Do you mean a series URL?

/df said:
Indeed.

So we're talking about series URLs not episode URLs. This sounds like more of a job for get-iplayer.

/df · Jul 30, 2020

prpr said:
iplayer-episodes is part of youtube-dl, not of qtube. Adding the code as listed would create a dependency in the former on the latter. This is not good as the reverse already applies.

True, and I didn't intend the posted code as a submission for either package, just a simple hack to avoid finding and pasting a whole list of links into qtube.

Obviously the right thing would be to turn the logic of the script into Python as a pull request for the youtube-dl BBC extractor. That needs quite a lot of work (learning the yt-dl framework, mostly).

Having said that, it might be useful for iplayer-episodes to have a --queue option. It could just check for an executable qtube and use it if found, or print out the command that would have been run otherwise.

Then, if qtube was prepared to check each URL entered in the UI, it could punt to iplayer-episodes --queue for iPlayer series URLs. Or the logic could be transformed into Jim TCL (easier than the "right thing").

Black Hole said:
So we're talking about series URLs not episode URLs. This sounds like more of a job for get-iplayer.

The reason we have Python and youtube-dl on the HD/R is that the dependencies associated with get_iplayer and its Perl infrastructure were intractable.

/df · Jul 30, 2020

/df said:
...it might be useful for iplayer-episodes to have a --queue option. It could just check for an executable qtube and use it if found, or print out the command that would have been run otherwise....

Like this:

Code:

--- iplayer-episodes
+++ iplayer-episodes.new
@@ -1,7 +1,7 @@
 #!/bin/sh
 # scrape iPlayer programme URLs from a BBC web page
 
-# args: 1=iplayer_series_url
+# args: [--queue|-q] iplayer_series_url
 
 mung_url()
 { # prefix
@@ -12,6 +12,39 @@
     done
 }
 
+case $1 in 
+
+--queue|-q) 
+       if which qtube >/dev/null; then
+               qqq() {
+                       while read -r line; do
+                               qtube "$@" "$line"
+                        done
+               }
+       else
+               printf "No qtube program is installed; listing qtube commands\n" >&2
+               qqq() {
+                       while read -r line; do
+                               echo qtube "$@" $(printf "'%s'" "$line")
+                        done
+               }
+       fi
+       shift
+       ;;
+
+--help|-h) { 
+       printf "Usage:\n\n%s [--queue|-q] iplayer_series_url\n\n" "${0##*/}"
+       printf "Extract iPlayer programme URLs from series page and pass to youtube-dl.\n\n"
+       printf "With queue option, instead try to queue each URL for download.\n\n"
+       } 1>&2
+       exit
+       ;; 
+
+*)     qqq() { youtube -a -; }
+       ;;
+       
+esac
+
 # get BBC's base address
 bbc="$1"; bbc="${bbc%%/iplayer*}"
 
@@ -19,4 +52,4 @@
 # curl: -k insecure, needed due to Humax's old SSL libs, -s silent, -S show errors anyway
 # grep: -o print matching substring, -E match extended regular expression
 curl -k -s -S $1 | grep -oE "href=('|\")/iplayer/episode/[^'\"]+\\1" | mung_url $bbc | \
-  sort | uniq | youtube -a -
+  sort | uniq | qqq

The curl -k could now be changed, but neither the new SSL libraries nor a curl built with them are guaranteed by the dependencies of the youtube-dl package, and therefore it should stay. Should the package call for ffmpeg >= 4.1?

prpr · Jul 31, 2020

Updated as suggested.

/df · Oct 11, 2020

See this post and its successors from an apparently unrelated thread for details of a local patch that's been applied to the upstream BBC extractor.

MontysEvilTwin · Oct 13, 2020

The recent update of Youtube-dl (youtube-dl_2020.09.20-2) seems to have broken something. I tried to download an episode of TOTP and got the following errors:

Code:

Traceback (most recent call last):                                                  
  File "/mod/lib/python2.7/runpy.py", line 162, in _run_module_as_main              
    "__main__", fname, loader, pkg_name)                                            
  File "/mod/lib/python2.7/runpy.py", line 72, in _run_code                         
    exec code in run_globals                                                        
  File "/mod/bin/youtube-dl/__main__.py", line 19, in <module>                      
  File "/mod/bin/youtube-dl/youtube_dl/__init__.py", line 474, in main              
  File "/mod/bin/youtube-dl/youtube_dl/__init__.py", line 464, in _real_main        
  File "/mod/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 2019, in download        
  File "/mod/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 808, in extract_info     
  File "/mod/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 863, in process_ie_result                                                                                
  File "/mod/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 1497, in process_video_result                                                                            
AttributeError: 'list' object has no attribute 'items'

I reverted to an earlier version (youtube-dl_2020.09.20) and the download worked.

/df · Oct 13, 2020

I can check this if you send me the actual iPlayer URL or programme ID.

MontysEvilTwin · Oct 13, 2020

/df said:
I can check this if you send me the actual iPlayer URL or programme ID.

Here it is:

Code:

https://www.bbc.co.uk/iplayer/episode/m000n7g4/top-of-the-pops-04011990

Luke · Oct 13, 2020

MontysEvilTwin said:
Here it is:

Code:

https://www.bbc.co.uk/iplayer/episode/m000n7g4/top-of-the-pops-04011990

I've had a few like that on previous youtube versions where the URL contains "iplayer/episode". When that happens changing "iplayer/episode" to "programmes" works for me about half the time, and on this occasion also works with m000n7g4 and 2020.09.20-2.

Code:

https://www.bbc.co.uk/programmes/m000n7g4
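
The rewrite can be scripted if you need to try it on several URLs. A minimal sketch, assuming the programme ID is the path segment straight after "iplayer/episode/"; the function name to_programmes_url is made up, and as noted above the resulting URL will not always work.

```shell
#!/bin/sh
# Hypothetical helper: turn an iPlayer episode URL into the matching
# /programmes/<id> URL, dropping the trailing title slug.
to_programmes_url() {
    printf '%s\n' "$1" |
        sed 's#/iplayer/episode/\([^/]*\).*#/programmes/\1#'
}

to_programmes_url "https://www.bbc.co.uk/iplayer/episode/m000n7g4/top-of-the-pops-04011990"
# prints https://www.bbc.co.uk/programmes/m000n7g4
```

URLs that do not contain "/iplayer/episode/" are passed through unchanged.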

/df · Oct 13, 2020

It worked for me, but I think there's a couple of bugs related to subtitles:

1. subtitles may be returned as None rather than an empty list (but see #2); initial fix in https://github.com/ytdl-org/youtube-dl/pull/26821;
2. formats are returned as a list (doesn't have items()) but subtitles seem to be a dict (has items()).

Was an option like --list-subs involved?

/df · Oct 13, 2020

MymsMan said:
Please stop reminding me how long my to-do list is!

There's a long-standing PR in the queue at the yt-dl GitHub that implements series downloads for iPlayer. If we're forking the CF repo version, maybe we should include that.
