WIP: Dealing with large files for qtube & youtube-dl

bottletop


~new summary starts​

This is a work in progress - please do not add to this yet.
This does not apply to @/df - please can you add/copy your post
https://hummy.tv/forum/threads/yout...com-or-other-video-platforms.8462/post-174599
to the end of this thread, as I think it is relevant and useful.
I'll tidy this post as I go along.

What's this all about?​

This is an attempt to retrospectively tidy up a discussion on an old thread that veered off topic a little and may have polluted the original thread. Moving that discussion into its own thread, here, may make it easier to digest.

Old thread split​

So posts #209 to #249 of https://hummy.tv/forum/threads/qtube-webif-front-end-for-youtube-dl.8948/
were split into this thread.
Summary of changes -
old #209 will be new #1 of this thread.
old #212 (new #4) contents should refer to new #1 instead of old #209 - done
old #240 (new #32) contents should refer to new #28 instead of old #236 - done - thanks!
old #241 (new #33) contents will need tidying up old #236 should be new #28, old #233 should be new #25 - done
old #243 (new #35) delete? - updated - done - thanks!
old #245 (new #37) contents will need tidying up - done - thanks!
old #247 (new #39) contents should refer to new #28 instead of old #236 - done - thanks!
Basically we will move post #209 onward from the original thread to this one. We can subtract 208 from the old post numbers to get the new post number for this thread.
Thank you everyone for helping out. As this section, Old thread split, is no longer required, I'll delete it at the end of the week.​

What's this all about - part 2?​

This is a long post about how to help the HDR cope with large download files from iPlayer via qtube and youtube-dl.
This section is an accumulation of ideas and suggestions from the posts in this thread.
Why use the HDR?
- pros: it saves energy, and if it's going to be on for extended periods anyway, why not use it? It also stores the download file, so we can keep it there for viewing on the main TV.
- cons: the HDR is circa 2011, with limited CPU power, memory, etc. But the magic that hummy.tv and the CFW add to this unit makes it a very flexible device.

Preamble: Please refer to original threads for some background information.
https://hummy.tv/forum/threads/qtube-webif-front-end-for-youtube-dl.8948/
https://hummy.tv/forum/threads/what-is-an-malformed-aac-bitstream.11136/

There is a deficiency in the way youtube-dl handles large files from iPlayer.
This is noticeable with the long-duration files produced for some sporting events, e.g. 5 hours of tennis.
There are some issues that the current process has trouble with, namely:
  • large files may take time to download on non-fibre connections (this is for info only and not addressed here)
  • post-processing of the large file downloaded by youtube-dl, i.e. the fixup process.

youtube-dl fixup process​

This can be bypassed by supplying --fixup warn or --fixup never to youtube-dl.
This ensures youtube-dl performs the download and then skips the fixup processing.
The resultant MP4 will be playable on the HDR, but without the cue/review feature.
If you don't miss cue/review then stop here and enjoy the download (see the example below). The quality is not bad and you'll save time and effort.
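For example, if you run youtube-dl yourself from the command line, a minimal sketch might look like the following (the output path and programme URL are placeholders, not a real download):
Code:
# download only, skipping the fixup step; adjust the output name and URL for your own use
youtube-dl --fixup never -o '/mnt/hd2/My Video/<filename>.mp4' 'https://www.bbc.co.uk/iplayer/episode/<programme-id>'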

Read on if you require fixup processing. The humble HDR with CFW will help.​

Under normal circumstances the HDR will fail during the later stages of the fixup process.
The HDR will eventually grind to a halt and not respond to the RC, WebIF, telnet, ssh, ping, etc.
We suspect the reason it normally fails is a combination of swapfile size and file contention issues: the fixup produces a large file that the Humax DLNA server may have trouble with.
Use one of the solutions below to get fixup processing to complete.

Solution 1. Get the HDR to perform fixup processing - with Content Sharing off.
Use a swapfile of 600MB-1024MB. Disable Content Share.
Kick off qtube/youtube-dl with fixup enabled.
It will complain near the end, exhibiting audio and video stutters before settling on a blank screen with no audio for 1-2 hours! It will eventually finish. This solution is the most taxing for the HDR.
Solution 2. Get the HDR to perform fixup processing - with Content Sharing on.
Use a swapfile of 600MB-1024MB. Leave Content Share on. Save to an alternative (non-DLNA) destination like /mnt/hd2/virtual_disk/ (e.g. install virtual-disk2) - see the sketch below.
This is similar to solution 1, but saves to another location and leaves the Humax DLNA server on.
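A hypothetical command-line sketch of the same idea, with the output pointed at the non-DLNA location (the filename and URL are placeholders):
Code:
# save the download, and therefore its fixup output, outside the DLNA-indexed folders
youtube-dl -o '/mnt/hd2/virtual_disk/<filename>.mp4' 'https://www.bbc.co.uk/iplayer/episode/<programme-id>'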
Solution 3. Get the HDR to perform fixup processing - with the HDR in maintenance mode.
Use a swapfile of 128MB. Content Sharing is N/A as the HDR is in maintenance mode.
This is the least taxing option for the HDR and the fastest: it can process 3GB in 10 minutes.
It requires the user to work at the command line, and normal TV is not available while the commands are running.
First, run qtube/youtube-dl and bypass fixup processing, as above.
Set the HDR to maintenance mode and reboot
(WebIF/Diagnostics/Maintenance Mode/Enable, WebIF/Diagnostics/Reboot System/Restart).
Then run the commands for fixup processing manually.
Code:
#maintenance mode access
#telnet into HDR
telnet <HDR IP-address>

#drop into cli
cli
#start swapfile 
/mod/etc/init.d/S00swapper start
#check free memory - is swapfile available? 
free -m

#go to directory that has the downloaded file
cd '/media/My Video/Eastbourne-2024'
#check file for media errors 
ffprobe -show_streams 'file:<filename>.mp4'
#(Fixup process) If there are errors, perform the manual fixup below (saving to the same dir is OK because no DLNA is running)
ffmpeg -y -loglevel 'repeat+info' -i 'file:<filename>.mp4' -c copy -f mp4 '-bsf:a' aac_adtstoasc 'file:<filename>-newfile.mp4'

#leave maintenance mode and reboot
exit
x
y
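
Once you have checked that the new file plays correctly, you may want it to replace the original so that it shows up normally in the media list. This tidy-up step is a suggestion rather than part of the recipe above; the names are the placeholders used in the block above:
Code:
# optional tidy-up - only after confirming <filename>-newfile.mp4 plays correctly
rm '<filename>.mp4'
mv '<filename>-newfile.mp4' '<filename>.mp4'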

Solution 4. Get the HDR to perform fixup processing - with the HDR in normal mode.
Ideally use a swapfile of 150MB. Content Sharing on.
This solution uses bits of solutions 2 and 3. It is nowhere near as fast as solution 3, but has the benefit of allowing the user to watch TV on the HDR while the process runs. The runtime for a 10GB file fixup is approximately 1 hour. Near the end of that period you will lose audio and video for roughly 30 minutes. (I'll need to perform further tests to confirm this.)

The steps are the same as solution 3, except the HDR is in normal mode, so these are the differences:
There is no need to start the swapfile; it should be in use already.
The final ffmpeg command should store its output file in a location that the DLNA index will not access, e.g. as in solution 2 (see the sketch after this list).
You can reduce AV stutter by not viewing live TV, so view a recording or even leave the unit on the BBC Red Button channel 250.
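The manual fixup command itself is the same as in solution 3, only with the output placed somewhere the DLNA indexer will not see; <filename> is a placeholder and /mnt/hd2/virtual_disk/ is just one of the locations suggested further down:
Code:
ffmpeg -y -loglevel 'repeat+info' -i 'file:<filename>.mp4' -c copy -f mp4 '-bsf:a' aac_adtstoasc 'file:/mnt/hd2/virtual_disk/<filename>-newfile.mp4'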

So, which solution is best?
Solution 3 has the shortest run time, while solution 4 minimises TV downtime. The first two solutions are easier to perform, but they will interrupt your TV viewing and take a long time to run.

Adjust swapfile size​

I am lazy - I leave my swapfile set to 1024MB. But this is not best practice.
How to change your swapfile size.
https://hummy.tv/forum/threads/swapper-virtual-memory.8844/post-171196
eg
Code:
echo 600 >/mod/boot/swapsize
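
A minimal check after changing the value, once the swapper has re-run (e.g. after a reboot), using commands already shown in solution 3:
Code:
cat /mod/boot/swapsize   # the requested size in MB
free -m                  # the Swap 'total' figure should now reflect it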

Suggestions - alternate locations for fixup output file​

Suggested locations for placing the fixup output file out of the reach of the Humax DLNA server.
The first one is directly accessible from the Humax UI (Media-Video/Storage(blue)/USB); with the others you'll need to move the file into My Video afterwards (see the example after the list).
/mnt/hd2/virtual_disk/
/mnt/hd2/Tsr/
/mnt/hd2/mod/
/mnt/hd2/mod/tmp/
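If the fixup output ends up in one of the locations that the Humax UI cannot browse, it can be moved into the media library once processing has finished; a minimal example using a placeholder filename:
Code:
mv '/mnt/hd2/mod/tmp/<filename>-newfile.mp4' '/mnt/hd2/My Video/'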

(WIP not finished)

What follows was the original post that started this discussion...​

There may be a bug in the interaction between qtube and youtube-dl.
I've tried this on 2 occasions and it looks like youtube-dl caused the HDR to hang during the final stages of the download (the fixup stage).

Code:
16/06/2024 08:36:08 - [debug] ffmpeg command line: ffmpeg -y -loglevel 'repeat+info' -i 'file:/mnt/hd2/My Video/40-Imported_Video/Tennis_-_Nottingham_Open_2024_Day_6.mp4' -c copy -f mp4 '-bsf:a' aac_adtstoasc 'file:/mnt/hd2/My Video/40-Imported_Video/Tennis_-_Nottingham_Open_2024_Day_6.temp.mp4'
16/06/2024 08:36:08 - [ffmpeg] Fixing malformed AAC bitstream in "/mnt/hd2/My Video/40-Imported_Video/Tennis_-_Nottingham_Open_2024_Day_6.mp4"
16/06/2024 08:36:06 - [debug] ffmpeg command line: ffprobe -show_streams 'file:/mnt/hd2/My Video/40-Imported_Video/Tennis_-_Nottingham_Open_2024_Day_6.mp4'
16/06/2024 08:36:04 - [download] Download completed
16/06/2024 05:52:48 - [download] Destination: /mnt/hd2/My Video/40-Imported_Video/Tennis_-_Nottingham_Open_2024_Day_6.mp4
16/06/2024 05:52:48 - [hlsnative] Total fragments: 2158
16/06/2024 05:52:46 - [hlsnative] Downloading m3u8 manifest
During the final stage it caused the HDR to hang - no audio/video on HDMI, and no access to the HDR over the network.
So I cycled the power using the rear switch.
On resume, qtube tries to continue where it left off - so the HDR will hang again!

What I did to work around the issue
  • cycle power to the HDR
  • access HDR (eg telnet) and rename the queue file - mv /mnt/hd2/mod/etc/queue.db /mnt/hd2/mod/etc/queue.bak1
  • restart the HDR
  • delete the temp file
  • use the first downloaded file
Note1: I was trying to view https://www.bbc.co.uk/iplayer/episode/m0020g5t/tennis-nottingham-open-2024-day-6?seriesId=unsliced
The smaller files are over 5GB and the uninterrupted ones over 11GB, so they'll take a while on ADSL broadband.
There may be a better way to work around the issue, e.g. just perform the download without the fixup by supplying --fixup warn to youtube-dl (I've since tested this and it works for these iPlayer downloads); alternatively, increase the swap file size as in the link below.
Note2: Ignore me it's been reported before - https://hummy.tv/forum/threads/what-is-an-malformed-aac-bitstream.11136/
 
I've increased the swap file size. Made no difference to the end result (similar fail) - it just took longer to get there.
Increased it to what? You're trying to process multi-gigabyte files (which seems foolhardy).

Test the process on a (much) smaller file of the same type (if you can find one).
 
Increased it to what?
256MB, 512MB & 1GB swap. Take your pick. Results are similar.
You're trying to process multi-gigabyte files (which seems foolhardy).
Yes and no. There may be a misunderstanding somewhere.
They were downloads of the stuff they broadcast on BBC RB1 (601)
Try watching footie on that channel. It's dire.
Now try watching tennis on that channel. It's worse - like, where's the tennis ball!
(They're showing some Euro 2024 and Tennis (Queens & Birmingham) on it at the moment. They'll probably use it for Wimbledon shortly.)
Test the process on a (much) smaller file of the same type (if you can find one).
Can't find one. I'm only downloading at a similar quality to normal SD (576i). Anything less makes it unwatchable. So it's a catch-22 situation.

Anyway, it'll work with --fixup warn (to bypass fixup).
Otherwise there's a loop on restart that requires manual intervention.*
(*) The HDR retries the queue on restart before I can suspend items on the queue.
 
What happens if you use --hls-prefer-ffmpeg with the large video?

There is a limit to how much virtual memory can help, as this blog series explains (when it mentions Postgres, think settop). Hence the swapper setting should be no larger than necessary, but not smaller.
 
I'll have to pencil that in. It'll tie up the broadband and has the potential to crash the HDR after a good few hours. So what is the suggestion for the swapper size? I saw it use over 700MB when checking with free.
 
What happens if you use --hls-prefer-ffmpeg with the large video?
Similar. It looks like the HDR hangs during the fixup stage. If anything, it hangs earlier than before, i.e. roughly an hour into the fixup stage. It used to hang after exhausting the swap file - so maybe 2 hours into the fixup stage if a very large swap file is used.
There is a limit to how much virtual memory can help, as this blog series explains (when it mentions Postgres, think settop). Hence the swapper setting should be no larger than necessary, but not smaller.
Thanks for this, although I'll take that read with a pinch of salt. A lot of it goes over my head, but something doesn't make sense. E.g. if I wish to run 2 programs that use, say, 128MB and 256MB of swap memory, what should I set the swap to? I should be able to allocate a large number and let the OS manage it. I shouldn't have to consider how to allocate it dynamically depending on whether I'm running one, the other, or both at the same time.
 
I think a rough summary is that eventually the OS spends most of its time paging memory blocks and hardly any time running the user process.

Why is FFMPEG requesting so much RAM during the fix-up process? Seems to me:
  1. It's been badly written, and caches whole video files when it could stream-process;
  2. The fix-up requires access to a significant part of the whole video file at once and can't be stream-processed;
  3. The FFMPEG executable itself consumes a significant amount of RAM (relative to the actual RAM not virtual).
Perhaps if we knew what --fixup actually does? Is it trying to do something in one pass which would actually be more efficiently done in two passes?

Is there any control over the size of memory chunks which get paged to swap? Making them finer-grained might help.
 
... eventually the OS spends most of its time paging memory blocks and hardly any time running the user process.

Exactly. In the trade this is called "thrashing". The Humax HD/R platform is a toy compared with what developers of programs like ffmpeg are using or probably even envisage being used, despite packing a 1990-ish scientific minicomputer in the box.

Hence "the swapper setting should be no larger than necessary, but not smaller." There's no point having a bigger paging file (which is what it is nowadays, despite the 1980-era "swap" terminology) than is needed to make some program run that otherwise wouldn't. Usage patterns are critical: if two large programs are being run whose peak memory demands are not simultaneous, paging may help; otherwise, especially if a single large program needs significantly more memory than the actual RAM, not so much.

Regarding the fixup process, see https://hummy.tv/forum/threads/what-is-an-malformed-aac-bitstream.11136/post-171263 and
https://hummy.tv/forum/threads/what-is-an-malformed-aac-bitstream.11136/post-171265 (as these posts are more than 5 months old, I and some other forum members might well have forgotten them).
 
I think a rough summary is that eventually the OS spends most of its time paging memory blocks and hardly any time running the user process.
That may be true. I don't know for sure.
Why is FFMPEG requesting so much RAM during the fix-up process? Seems to me:
  1. It's been badly written, and caches whole video files when it could stream-process;
  2. The fix-up requires access to a significant part of the whole video file at once and can't be stream-processed;
  3. The FFMPEG executable itself consumes a significant amount of RAM (relative to the actual RAM not virtual).
Perhaps if we knew what --fixup actually does? Is it trying to do something in one pass which would actually be more efficiently done in two passes?

Is there any control over the size of memory chunks which get paged to swap? Making them finer-grained might help.
While I agree it's a fault with ffmpeg, it's not obvious why. I know it only accesses the downloaded MP4 a section at a time, otherwise it'll never work with files larger than the swap file. Eg the default swap and main ram is only 256MB combined, yet the --fixup works with MP4 files of, say, 3GB without issue. Also using the term stream in this instance may be misleading because it's processing the file, not serving it.
Exactly. In the trade this is called "thrashing". The Humax HD/R platform is a toy compared with what developers of programs like ffmpeg are using or probably even envisage being used, despite packing a 1990-ish scientific minicomputer in the box.

Hence "the swapper setting should be no larger than necessary, but not smaller." There's no point having a bigger paging file (which is what it is nowadays, despite the 1980-era "swap" terminology) than is needed to make some program run that otherwise wouldn't. Usage patterns are critical: if two large programs are being run whose peak memory demands are not simultaneous, paging may help; otherwise, especially if a single large program needs significantly more memory than the actual RAM, not so much...
I've made the swap file default to 1GB for over a year. This has worked fine without issue.
Eg I've successfully downloaded films on iPlayer (using qtube/youtube-dl) that were close to 3GB with the --fixup on (default).
 
Also using the term stream in this instance may be misleading because it's processing the file, not serving it.
Surely you understood what I meant: reading the input data linearly rather than by random access. Random access is obviously going to make greater demands on paging, and could potentially be avoided using a multi-pass process.

From what /df says we might assume the FFMPEG developers haven't put much effort into RAM efficiency, this being a niche use-case, and that imposes a limit on the size of file --fixup can process on the HDR-FOX (regardless of swap file).
 
I've made the swap file default to 1GB for over a year. This has worked fine without issue.
Eg I've successfully downloaded films on iPlayer (using qtube/youtube-dl) that were close to 3GB with the --fixup on (default).
Something for you to try: on the command line, run "free" every now and again before and while --fixup is running. I just checked one of my HDRs while it's not doing much, and there's only about 4% free RAM (it would be a surprise if Humax fitted significantly more RAM than it needs to be a PVR – what have we got, 320MiB [2x128MiB + 64MiB]? And some of that will be hardware-allocated to video buffering).

Code:
HDRFOX4# free                                                                                  
             total       used       free     shared    buffers     cached                      
Mem:        125016     120416       4600          0       7788      43956                      
-/+ buffers/cache:      68672      56344                                                        
Swap:       131064        508     130556

I don't know what the units are – KiB? Update: the default is KiB, or free -m gives the figures in MiB.

How much of the RAM used by the OS and settop is swappable? (rhet.)
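A crude way to watch those figures while --fixup runs, from a second telnet session, might be something like:
Code:
# print memory and swap usage every 60 seconds; stop with Ctrl-C
while true; do free -m; sleep 60; done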
 
Something for you to try: on the command line, run "free" every now and again before and while --fixup is running. I just checked one of my HDRs while it's not doing much, and there's only about 4% free RAM (it would be a surprise if Humax fitted significantly more RAM than it needs to be a PVR – what have we got, 320MiB [2x128MiB + 64MiB]? And some of that will be hardware-allocated to video buffering).

Code:
HDRFOX4# free                                                                                
             total       used       free     shared    buffers     cached                    
Mem:        125016     120416       4600          0       7788      43956                    
-/+ buffers/cache:      68672      56344                                                      
Swap:       131064        508     130556

I don't know what the units are – KiB?

How much of the RAM used by the OS and settop is swappable? (rhet.)
Why? What's the point? You said it's a foolhardy pursuit.
Increased it to what? You're trying to process multi-gigabyte files (which seems foolhardy).

Test the process on a (much) smaller file of the same type (if you can find one).
Anyway you've determined it a memory problem already.
Yes, it runs out of memory processing the file.

So far I've found the easiest option is to use --fixup warn or --fixup never for youtube-dl.
Alternatively, temporarily disable content sharing (Settings/System/Internet Setting/Content Share) after queuing the download. Your HDR may slow down to a miserable speed and the HDMI audio and video will disappear near the end of the fixup process, but it will chug along to completion. You'll need to reboot the HDR to resume normal functions.
 
Anyway you've determined it a memory problem already.
Sure, but that's what swapper is supposed to side-step. It's largely speculation that the system is grinding to a halt with your size of files, and I thought it would be interesting/informative to watch the free stats while it does – as confirmation if nothing else.

We might expect to see still plenty of swap space available but the system getting slugged by trying to use it. That's rather different than what happens when there isn't enough swap space (I can't remember whether the system crashes or the process fails – I think it's the latter).

I can't easily replicate what you're doing, simply because of my download speeds making it impractical.
 
These are the results of free -m for the first 4 episodes of series 1 of Kin on iPlayer. They're smaller files (approx 1GB each), so they give a rough idea of the processing. I have set the swapfile to 1GB. I'll try a larger file after the weekend - maybe an HD film, which should be nearer a 4GB download.
Code:
--restrict-filenames 
--prefer-ffmpeg 
--user-agent 'Mozilla/5.0'
-f "best[height<=?576][fps<=?60]"
-v
--no-progress
Code:
#Just after reboot
             total       used       free     shared    buffers     cached
Mem:           122        119          2          0          3         22
-/+ buffers/cache:         93         28
Swap:         1023          0       1023

#Near end of process youtube-dl
             total       used       free     shared    buffers     cached
Mem:           122        119          2          0          1         30
-/+ buffers/cache:         86         35
Swap:         1023         62        961
Code:
#Just after reboot
             total       used       free     shared    buffers     cached
Mem:           122        119          2          0          7         37
-/+ buffers/cache:         74         47
Swap:         1023          0       1023

#Near end of process youtube-dl
             total       used       free     shared    buffers     cached
Mem:           122         92         29          0          3         41
-/+ buffers/cache:         48         73
Swap:         1023          0       1023
Code:
#Just after reboot
             total       used       free     shared    buffers     cached
Mem:           122        118          3          0         15         53
-/+ buffers/cache:         48         73
Swap:         1023          0       1023

#Near end of process youtube-dl
             total       used       free     shared    buffers     cached
Mem:           122        119          2          0          1         17
-/+ buffers/cache:        100         21
Swap:         1023         15       1008
Code:
#Just after reboot
             total       used       free     shared    buffers     cached
Mem:           122        118          3          0         15         54
-/+ buffers/cache:         48         73
Swap:         1023          0       1023

#Near end of process youtube-dl
             total       used       free     shared    buffers     cached
Mem:           122        120          2          0          1         13
-/+ buffers/cache:        105         16
Swap:         1023         35        988

Edit1: Added this
Code:
#A while after reboot
             total       used       free     shared    buffers     cached
Mem:           122        116          5          0          2         29
-/+ buffers/cache:         84         37
Swap:         1023         30        993

#Near end of process youtube-dl
             total       used       free     shared    buffers     cached
Mem:           122        119          2          0          1          7
-/+ buffers/cache:        110         11
Swap:         1023        390        633

#20/30 minutes after youtube-dl
             total       used       free     shared    buffers     cached
Mem:           122        117          4          0         19         55
-/+ buffers/cache:         42         79
Swap:         1023         35        988
 
Curious how the swap usage varies so much between 1, 3, and 4... and also curious how little swap space is being used.
 