Alternative to cropping out adverts

Oatcake

Member
This idea occurred to me while reading the thread "Detect Ads" (posts #36 and #37) regarding how to watch a programme after advert detection...
  • actually crop out the adverts, possibly keeping the original in case of bad detection
  • OR use the remote to manually skip to the bookmarks

An alternative might be (if it could be developed)...

Don't crop the file and use a new "auto skip ads" package that uses "ir" to automatically send the skip-to-bookmark commands at the correct points in the programme being watched. Then, if the bookmarks were incorrect, you can easily rewind or skip around using the remote. But when the ad detection is correct there's no action required (hands free viewing).

How would this work? When you start to watch a programme, check to see if ad-detect has run (see * below). If so, then start to monitor the "read position" in the file. When the read position goes past the next start-of-ads bookmark, then issue a skip-to-next-bookmark via IR package and flag this bookmark as "used". Now the package will ignore this pair of bookmarks which will allow the user to manually skip around in that region, if necessary.
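A rough sketch of the skip logic described above, in Python purely for illustration (nothing like this exists yet). The names `AdBreak`, `check_position` and `send_skip` are made up for the example; `send_skip` stands in for whatever call into the ir package would send skip-to-next-bookmark, and the positions are assumed to be byte offsets from some as-yet-unproven read-position tracker.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AdBreak:
    start: int            # position of the start-of-ads bookmark
    end: int              # position of the end-of-ads bookmark
    used: bool = False    # set once an automatic skip has been issued


def check_position(position: int, breaks: List[AdBreak],
                   send_skip: Callable[[], None]) -> bool:
    """Send one skip if `position` is inside a break that has not been used yet.

    Returns True when a skip command was sent.
    """
    for brk in breaks:
        if not brk.used and brk.start <= position < brk.end:
            brk.used = True   # ignore this pair from now on, so manual skipping works
            send_skip()       # hypothetical hook into the ir package
            return True
    return False
```

Calling `check_position` on every poll of the read position means each break is skipped at most once, so rewinding into a break after a bad detection would not trigger another automatic skip.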

* - How do we know if ad-detect bookmarks are present? By checking for bookmarks that conform to the right pattern. This means pairs of adjacent bookmarks with appropriate gapping. OR maybe the file could be flagged in some way after the advert detection process.
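The pattern check could look something like the sketch below (Python, for illustration only). It assumes bookmarks are available as times in seconds, and the `MIN_BREAK` and `MAX_BREAK` limits are guesses at what "appropriate gapping" might mean, not values taken from the ad-detect package.

```python
from typing import List, Optional, Tuple

# Assumed limits on the length of one advert break, in seconds (illustrative only).
MIN_BREAK = 60
MAX_BREAK = 420


def pair_ad_bookmarks(bookmarks: List[int]) -> Optional[List[Tuple[int, int]]]:
    """Return (start, end) pairs if the bookmarks look like ad-detect output, else None."""
    marks = sorted(bookmarks)
    if not marks or len(marks) % 2 != 0:
        return None                      # need complete start/end pairs
    pairs = list(zip(marks[0::2], marks[1::2]))
    for start, end in pairs:
        if not MIN_BREAK <= end - start <= MAX_BREAK:
            return None                  # gap too short or too long for an ad break
    return pairs
```

A recording with manually placed bookmarks would usually fail this test, which is the point: the auto-skip would then stay out of the way.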


Would anyone use this feature if it was available? Can anyone foresee any problems, apart from finding someone willing to develop it? :)
 
Not trying to detract from your idea as the auto skip bit sounds pretty reasonable to me, but your:
Bullet point 1. The original is kept. You can choose to move it to the bin where it is kept for a period of time.
Bullet point 2. Yep. Just have the prog bookmark the end of the ads and do that. However I feel that option is a bit unnecessary as you generally only have to press the skip button twice to achieve pretty much the same thing.
 
Not sure how you could do that in real time.

There are 3 possibilities that I can think of, but there may be more (I have not tried any of these)...
  1. Using LD_PRELOAD it should be possible to catch all calls to fopen, read, and seek.
  2. There's a version of "strace" on the Humax, so maybe it can attach to the humaxtv process and keep track?
  3. Build a kernel with support for /proc/<pid>/fdinfo? I don't know if this would be possible. This would be more af123's domain I think!
TBH I thought there wouldn't be much point in doing too much research unless there was general interest in the idea.
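For option 3, on a kernel that does provide fdinfo, the read position is the `pos:` field of `/proc/<pid>/fdinfo/<fd>`. A minimal sketch of reading it, in Python for illustration only; whether the Humax kernel can offer this at all is exactly the open question, and the function names are made up for the example.

```python
from typing import Optional


def parse_fdinfo_pos(text: str) -> Optional[int]:
    """Extract the 'pos:' field (current file offset in bytes) from fdinfo text."""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "pos":
            return int(value.strip())
    return None


def read_position(pid: int, fd: int) -> Optional[int]:
    """Read the offset of one open file descriptor; None if fdinfo is unavailable."""
    try:
        with open(f"/proc/{pid}/fdinfo/{fd}") as f:
            return parse_fdinfo_pos(f.read())
    except OSError:
        return None
```

Polling `read_position` once a second or so would give a byte offset, which would still need converting to a play time before comparing it with bookmarks.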
 
It is an interesting concept but I am not sure how you would track the current play position with any degree of accuracy.
I may have paused the programme to take a phone call or go to the loo or fast forwarded to skip over the waffle.
You can read the front panel display, which gives you the time to the nearest minute, but really you would need it to within 5 seconds.
 
TBH I thought there wouldn't be much point in doing too much research unless there was general interest in the idea.
It's an interesting enough idea, but it will take a lot more research to convince me it's feasible. Let's think it through:

Using LD_PRELOAD it should be possible to catch all calls to fopen, read, and seek.
Who says recording file reads are handled through software and not by hardware block transfers? If they are through software, they could be by direct access from settop and not as OS calls. If they are as OS calls, there's going to have to be a lot of real-time monitoring and analysis to work out which one corresponds to an access within an advert break (some of that could be off-loaded: compute the disk sector which corresponds to the next advert start point and trigger when you see a read to that particular sector).

There's a version of "strace" on the humax, so maybe it can attach to the humaxtv process and keep track?
settop is a big anonymous blob. It's difficult to surmise accurate recording access status at all, let alone read pointer position.

Build a kernel with support for /proc/<pid>/fdinfo? I don't know if this would be possible. This would be more af123's domain I think!
Same objections as above, plus the risk that the recompiled kernel won't fit.

Overall, I doubt this is going to be worth the effort, even in the long term. What exactly does it do which we can't do already by running a crop? Nothing. Nonetheless, it's still an interesting idea.
 
Thanks very much for the feedback.

On the technical side, there are some very good points by @MymsMan and @Black Hole. There are certainly no guarantees that it's possible to implement.

What exactly does it do which we can't do already by running a crop? Nothing. Nonetheless, it's still an interesting idea.

Actually it offers the following advantages over a crop...
  • If the crop cuts out a vital part of a programme, then it's a real pain to:- press "stop"; then "media"; then navigate to the original file; press "play"; skip to the appropriate advert break position; watch the missing part; and then press "stop" again; navigate back to the cropped file (so that any subsequent adverts will be cut out); and finally resume play. Surely it would be simpler to just use "skip back" or "skip forward" once or twice to see the missing section if this was implemented?
  • Don't have to manage two files - making sure that the cropped file is viewed before the original is deleted.
  • Less disk writing/ wear and tear - since splicing creates an extra (almost whole) copy of the programme
  • Less disk space required (if you're running low)
Having made these points, I'm unsure whether it is worth the development time, because it would be hard work and, besides, the advert detection does a good job of finding the correct cut points in the vast majority of cases.
 
If I get some free time over Christmas I might nonetheless try some experiments to see if file position tracking is possible, for fun! 🤓
 
(some of that could be off-loaded: compute the disk sector which corresponds to the next advert start point and trigger when you see a read to that particular sector).
The nts file provides an index of time offset to file offset, so working out where the ad breaks are within the recording is reasonably straightforward, but I agree that tracking the relevant I/O requests is likely to be complex and possibly disruptive.
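To illustrate the lookup (not the nts format itself, which isn't described here): assuming the index has already been read into a sorted list of (time in ms, file offset) tuples, converting a bookmark time to a file offset is a binary search. The function name and the tuple layout are assumptions made for this Python sketch.

```python
from bisect import bisect_right
from typing import List, Tuple


def time_to_offset(index: List[Tuple[int, int]], t_ms: int) -> int:
    """Map a time (ms) to the file offset of the last index entry at or before it.

    `index` is a non-empty list of (time_ms, file_offset) tuples sorted by time.
    """
    if not index:
        raise ValueError("empty index")
    times = [t for t, _ in index]
    i = bisect_right(times, t_ms) - 1
    if i < 0:
        return index[0][1]     # before the first entry: clamp to the start
    return index[i][1]
```

With the ad-break bookmarks converted to offsets once up front, the real-time part would only have to compare numbers.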
If the crop cuts out a vital part of a programme, then it's a real pain to:- press "stop"; then "media"; then navigate to the original file; press "play"; skip to the appropriate advert break position; watch the missing part; and then press "stop" again; navigate back to the cropped file (so that any subsequent adverts will be cut out); and finally resume play.
On the rare occasions where I need to go back to the original I just continue watching the uncropped file and use the bookmark button to skip subsequent ads - I never go back to the cropped recording
 
If I get some free time over Christmas I might nonetheless try some experiments to see if file position tracking is possible, for fun! 🤓
Certainly worth experimenting. Many things originally thought to be impossible are now BAU. Examples include:
  • Real time scheduling
  • Off line decryption
  • Chase decryption
 
For prototyping you could use something like polling with lsof to get the "read position". Attaching to a running humaxtv process with strace -p is tricky because, as I understand it, you'd have to attach to all its 100+ threads, and the command line for /usr/bin/humaxtv is burnt into /etc/init.d/S90settop (RO FS), making launching humaxtv with strace tricky as well, so LD_PRELOAD would be best for production.

You'd have to find some heuristic to identify the FD of interest, since the Humax code could be using the same recording file for, eg, DLNA streaming.
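As a starting point for such a heuristic, one could list the process's open descriptors via the `/proc/<pid>/fd` symlinks and keep those pointing at recording files. A sketch in Python for illustration only; the `.ts` suffix and the function name are assumptions, and this only narrows the candidates. It would not by itself tell playback apart from DLNA streaming of the same file.

```python
import os
from typing import Dict


def find_recording_fds(fd_dir: str, suffix: str = ".ts") -> Dict[int, str]:
    """Map fd number -> target path for every fd in `fd_dir` open on a `suffix` file."""
    found = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue                     # fd closed while scanning, or not a symlink
        if name.isdigit() and target.endswith(suffix):
            found[int(name)] = target
    return found
```

It would be called as `find_recording_fds(f"/proc/{pid}/fd")`; if more than one descriptor comes back for the same recording, something extra (such as which offset is advancing at playback speed) would be needed to pick between them.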

If it were my Christmas, I'd just crop with more padding so that only a flash of ad is left, which could be skipped using Bookmark if desired. Personally, I stick to marking rather than cropping, unless I want to prepare for a special viewing.

...
Who says recording file reads are handled through software and not by hardware block transfers? If they are through software, they could be by direct access from settop and not as OS calls. ...
The fact that file systems installed in the OS without reference to the Humax blob can host playable media files seems to guarantee that the first stage of getting the content to the screen is an OS read.

Apparently the retrieved data is then stashed (presumably using the Broadcom AV middleware) into the AV message buffers (the 128MB that Linux can't see), decoded and fed into the video display subsystem: all that processing is off-loaded from the CPU. From the Broadcom spec: "The Data Transport Processor is ... capable of simultaneously processing 255 PIDs ... in up to six independent transport streams ... selected from six external serial transport stream inputs, and five internal playback channels." Once upon a time we used to have 19" rack modules organised in the same way.
 
Thanks for that @/df ! Very useful info, and this still seems like a possible project.

It has become a bit of a habit of mine to do some coding, usually related to the Humax, over Christmas! I guess Sudoku-type puzzles, etc, don't interest me as much as coding. So another part of the reason for this thread was me starting to think about this Christmas's puzzle!


This might be a good time to admit that, during a previous Christmas, I found out a way to obtain most of the text that the Humax displays to the TV screen. This includes the media list screen, TV guide, setup menus, etc. You can tell which word is highlighted too. But the display position isn't available, so the captured text can look quite "mixed up" on first glance.

I could not think of a good-enough use for this information, so I abandoned the project when the free-time of Christmas was over. There would have been a lot of extra work involved to make this properly usable.

For the tekkie people - this involved LD_PRELOAD and capturing the text that the Humax passed to the (obviously documented) freetype library.
 
BTW if you're a tekky person considering exploration of any of these ideas then be mega careful about using LD_PRELOAD on the humaxtv process. It is very easy to enter a cycle of immediate crash-reboot! (Lol I know this from experience.) Use a script to make your preload a one-shot affair. I also used a stub that could load new code dynamically into "humaxtv" so that I didn't have to reboot the machine for every code change under test.
 
This could be the solution to making recordings play from the WebIF - something oft requested.
Yes. It wouldn't play immediately, since the user would see the media list being (automatically) navigated on-screen in order to get to the desired recording. But the text grab could certainly be used for this purpose.
 
This is the first indication of a means to get menu feedback, and adds functionality not previously possible. Can you affect the on-screen text too?
 
Yes. It wouldn't play immediately, since the user would see the media list being (automatically) navigated on-screen in order to get to the desired recording. But the text grab could certainly be used for this purpose.
I don't think the intermediate menus being visible is a big problem; just knowing what is being shown on screen would be a major advance.
The big problem with using IR for automation is that you have always been working blind, unable to know what is currently shown on screen or whether the box has responded to your last set of IR key presses.
I think further development of this would be a much more productive use of your 'turkey time', and if you could publish what you have already achieved it might inspire others to do some more investigations. Probably start a new thread.
 
Please don't post any more in this thread about scraping text.

I started a new thread for this as @MymsMan suggested...

This is about a discovery that could lead to some new WebIF functionality, like being able to select a recording to play from the WebIF.

This discussion started in another thread, but please make any further posts on this topic here.
 