'Deep Ping' for Hummy?

mike_m · Jul 5, 2019

The crash I mentioned in an earlier post - when ping still worked but the rest of the Humax was unresponsive - did indeed have a frozen picture, whereas in a 'normal' crash I see no picture output at all. I don't know if it had sound, because the TV was on mute. But I couldn't login via Webif, so hopefully I can focus on that for diagnosis, as prpr suggested.

prpr · Jul 5, 2019

IME, the sound always continues, unless it's the RF tuner module that's got itself into a funny state, and obviously then there is nothing being demodulated/decoded, so no sound.

Incidentally, I've managed to compile mosquitto for the T2 (with a bit of trouble - it's not very well structured). Is anyone interested in having it packaged up?

mike_m · Jul 5, 2019

prpr said:
Incidentally, I've managed to compile mosquitto for the T2 (with a bit of trouble - it's not very well structured). Is anyone interested in having it packaged up?

Does that mean the T2 will run as an mqtt client? If so, it could publish its current state regularly, rather than the RPi having to ping it. If it stops publishing, the RPi script intervenes with a power cycle. Or am I misunderstanding?
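A minimal sketch of the RPi side of that idea, assuming the time of the last message received is kept as a Unix timestamp. MAX_AGE, heartbeat_stale and decide are illustrative names, not part of any existing script in this thread.

```shell
#!/bin/sh
# Sketch only: decide whether the last "I'm OK" message is too old.
# MAX_AGE and both function names are assumptions for illustration.

MAX_AGE=${MAX_AGE:-3600}   # seconds of silence tolerated before intervening

# heartbeat_stale LAST_SEEN NOW [MAX_AGE]
# Succeeds (exit 0) when the gap between the two timestamps exceeds the limit.
heartbeat_stale() {
    [ $(( $2 - $1 )) -gt "${3:-$MAX_AGE}" ]
}

# decide LAST_SEEN NOW -> prints the action the RPi would take
decide() {
    if heartbeat_stale "$1" "$2"; then
        echo "power-cycle"
    else
        echo "leave-alone"
    fi
}
```

For example, decide "$(cat /tmp/humax_last_seen)" "$(date +%s)" would print the action, where /tmp/humax_last_seen is a hypothetical file refreshed each time a message arrives.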

Black Hole · Jul 5, 2019

mike_m said:
If it stops publishing

What am I not getting through here? How will the settop process crashing stop mqtt (whatever that is) running, when they are separate threads that run independently of each other? What's needed is to monitor something that is regularly written to while settop is running, e.g. 0.ts (except that doesn't get written to when it's a radio or data service).

With regard to sound and vision: the audio stream is processed entirely in hardware so once the hardware has been initialised (by software) it can continue regardless. The video requires an interaction between hardware and software, so best case tends to be a frozen picture.

MymsMan · Jul 5, 2019

mike_m said:
The crash I mentioned in an earlier post - when ping still worked but the rest of the Humax was unresponsive - did indeed have a frozen picture, whereas in a 'normal' crash I see no picture output at all. I don't know if it had sound, because the TV was on mute. But I couldn't login via Webif, so hopefully I can focus on that for diagnosis, as prpr suggested.

You could use the IR package from the script you run to check whether it is responsive, but you might need to change channel to be able to detect the response, which could be annoying if you are watching the box at the time.

mike_m · Jul 6, 2019

Black Hole said:
How will the settop process crashing stop mqtt (whatever that is) running, when they are separate threads that run independently of each other? What's needed is to monitor something that is regularly written to while settop is running, e.g. 0.ts (except that doesn't get written to when it's a radio or data service).

By 'stops publishing', I don't mean that mqtt has necessarily stopped. A script running on the Humax would regularly check for something that shows everything's OK - presence of a file, as you suggest - or it could even be a number of different tests. If the tests pass, it publishes an mqtt message saying "I'm OK". If not, it keeps quiet. Obviously, if the Humax has crashed completely, it can't publish. The net effect of anything wrong will be that the OK message doesn't get published.
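Roughly what I have in mind on the Humax side, as a sketch rather than a tested script. BROKER, TOPIC and the CHECKS list are assumptions for illustration; the one sample check only confirms that a humaxtv process exists, which would not catch a process that is present but hung. mosquitto_pub is the standard mosquitto command-line publisher, with -h (host), -t (topic) and -m (message).

```shell
#!/bin/sh
# Sketch only: publish "OK" when every check passes, otherwise stay silent.
# BROKER, TOPIC and CHECKS are hypothetical values for illustration.

BROKER=${BROKER:-192.168.1.10}      # hypothetical address of the broker (e.g. the RPi)
TOPIC=${TOPIC:-humax/hdr1/status}   # hypothetical topic name
CHECKS=${CHECKS:-humaxtv_running}   # space-separated names of check functions

# Sample check: is there a process whose executable is /usr/bin/humaxtv?
humaxtv_running() {
    ls -l /proc/*/exe 2>/dev/null | grep -q "/usr/bin/humaxtv"
}

# run_checks: succeed only if every function named in $CHECKS succeeds
run_checks() {
    for check in $CHECKS; do
        "$check" || return 1
    done
}

# heartbeat: publish one message if all checks pass
heartbeat() {
    if run_checks; then
        mosquitto_pub -h "$BROKER" -t "$TOPIC" -m "OK"
    fi
    # On failure nothing is sent: the missing message is the alarm.
}
```

Calling heartbeat from a scheduled job would give the regular "I'm OK" message. Because a failed check and a total crash both result in silence, the RPi only has to watch for one condition.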

With my Hummies, the frozen condition doesn't seem to happen as often as the 'ordinary' crash, so for me personally I don't think it's worth a major overhaul to try and catch every instance of failure. I'll add a test of Port 80 and then just rely on prayers and sacrifices to the Gods of Small Electronic Devices!

Black Hole · Jul 6, 2019

mike_m said:
regularly check for something that shows everything's OK

The difficulty is working out what that could be, under all circumstances.

prpr · Jul 6, 2019

mike_m said:
Does that mean the T2 will run as an mqtt client?

Yes. Server/broker and/or client.

Black Hole · Jul 6, 2019

mike_m said:
I'll add a test of Port 80 and then just rely on prayers and sacrifices to the Gods of Small Electronic Devices!

Why not go belt and braces? Yes, pinging / port 80 might catch some failures so you can take immediate corrective action, but for the ones which slip through do a nightly reboot anyway.
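As a sketch of the ping plus port 80 test run from the RPi: HUMAX_IP is a hypothetical address, the function names are illustrative, and it assumes the RPi has a ping that accepts -c/-W and an nc that accepts -z/-w.

```shell
#!/bin/sh
# Sketch only: combine a ping with a port 80 probe and name the result.
# HUMAX_IP is a hypothetical address; ping -W and nc -z support are assumed.

HUMAX_IP=${HUMAX_IP:-192.168.1.20}

# One ping, 5 second timeout
ping_ok()   { ping -c 1 -W 5 "$1" >/dev/null 2>&1; }

# Try to open a TCP connection to the Webif port, 5 second timeout
port80_ok() { nc -z -w 5 "$1" 80 >/dev/null 2>&1; }

# classify PING_RC PORT_RC -> prints ok | frozen | down
classify() {
    if [ "$1" -ne 0 ]; then
        echo "down"      # no ping reply: the ordinary crash
    elif [ "$2" -ne 0 ]; then
        echo "frozen"    # ping answers but Webif does not
    else
        echo "ok"
    fi
}

# check_humax HOST -> probes the host and prints its state
check_humax() {
    ping_ok "$1"; p=$?
    port80_ok "$1"; w=$?
    classify "$p" "$w"
}
```

Running check_humax "$HUMAX_IP" from cron would give immediate detection of the "down" and "frozen" cases; anything it misses is left to the nightly reboot.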

Thinking about it, there are other advantages to having regular reboots - tunefix-update requires them, for example. I might go that way myself.

(My HDR3 has been crashed since last Saturday - I think - and I've only just noticed!)

mike_m · Jul 6, 2019

Black Hole said:
Why not go belt and braces? Yes, pinging / port 80 might catch some failures so you can take immediate corrective action, but for the ones which slip through do a nightly reboot anyway.

Yes, absolutely. But at this point I have a confession:

My wife bought our first Humax in the days before TVs had built-in Freeview or EPGs, so she watched everything via the Humax because of the user interface. However, if the machine was in standby, she found the wait (from stand-by to normal working) very annoying, especially when she had just got home and switched on to catch a news headline or a weather forecast, so she simply left it on all the time, and just turned the TV on and off. This habit has continued through several changes of machines and TVs over the years - the Humaxes are never on standby - they're always running. The reply to any objection of mine was always: “Well, it’s worked fine for 10 years, so what’s the problem?” to which I don’t have a convincing answer.

So my only reservation about a daily shutdown is my ingrained superstition about not powering off a hard drive before it spins down, unless I know that the heads are safely parked.

Trev · Jul 6, 2019

The 'convincing answer' could be "To save some electricity my dear". But I suppose the latter half of the sentence could possibly get you a slap.

Black Hole · Jul 6, 2019

mike_m said:
My wife bought our first Humax in the days before TVs had built-in Freeview or EPGs, so she watched everything via the Humax because of the user interface. However, if the machine was in standby, she found the wait (from stand-by to normal working) very annoying, especially when she had just got home and switched on to catch a news headline or a weather forecast, so she simply left it on all the time, and just turned the TV on and off. This habit has continued through several changes of machines and TVs over the years - the Humaxes are never on standby - they're always running.

So are mine! Not so much because the TVs don't have Freeview tuners, but so they are instantly available to schedule recordings or whatever, and so the recordings on them are always available for streaming. If I implement a nightly shutdown, it will be briefly at about 4am: scheduled standby; power interruption; scheduled wake-up.

/df · Jul 6, 2019

If you wished to set up a liveness test, the following could be useful checks. You have to start by testing remote access because a local monitor can't be guaranteed to survive; but beware of some unrelated LAN issue preventing the connection.

1. Can you connect to the telnet server? Note: if you use some other protocol (HTTP, MQTT, ...) the server won't be running from the flash disk and may behave strangely if the /mod filesystem is in trouble.
2. Can you find the RW mount point for the /mod filesystem: mount | grep -E "on /$(ls -l /mod | sed -ne 's@^l.* -> @@;s@/mod$@@;p') type ext[234] \(rw,"?
3. Can you find the PID of a running humaxtv process: PID="$(ls -l /proc/*/exe 2>&- | grep "/usr/bin/humaxtv" | cut -d'/' -f 3)"?
4. Now you can either analyse the output of lsof -p $PID (or similar) directly or just call status to check the health of the humaxtv process.
5. status will also show if the HD/R is recording or about to record in case you want to force a reboot.

If #1 fails you need to use the power switch (manual or networked), or the reset button on HD. If any of the later tests fails, or your analysis determines that it's a safe time to restart, you can try a software-invoked reboot, which would be better for @mike_m's disk heads, falling back to the power switch.

The tests need to be run with a timeout in case of a hang. This is a useful technique:

Code:

#!/bin/sh +m
# +m: disable job (aka background process) messages

timeout_child () { # (timeout); uses $! = last BG process
    local child=$!
    local timeout=$1
    local timestep=1
    [ -n "$child" ] || return
    trap -- '' TERM KILL
    (      # use /proc to avoid variations in ps command
            while [ -d "/proc/$child" ]; do
                if [ $timeout -le 0 ]; then
                    kill -KILL $child
                    break
                fi
                sleep $timestep
                timeout=$((timeout - timestep))
            done
    ) &
    wait $child
}

hdstatus="$(status & timeout_child  10)" || echo "status command failed or hung"
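The escalation rule described above could be sketched as a small decision function, assuming the result of each test has already been gathered (each one run under a timeout). The function name, the yes/no arguments and the printed action names are all hypothetical.

```shell
#!/bin/sh
# Sketch only: map the test results to the least drastic action.
# choose_action TELNET_OK MOD_RW HUMAXTV_PID RECORDING
#   TELNET_OK, MOD_RW, RECORDING are "yes" or "no"; HUMAXTV_PID may be empty.
choose_action() {
    if [ "$1" != "yes" ]; then
        echo "power-switch"      # test 1 failed: nothing can be done over the network
    elif [ "$2" != "yes" ] || [ -z "$3" ]; then
        echo "software-reboot"   # a later test failed; fall back to the power switch if this fails
    elif [ "$4" = "yes" ]; then
        echo "defer"             # healthy but recording: not a safe time for an optional restart
    else
        echo "safe"              # healthy and idle: an optional restart would be safe now
    fi
}
```

For instance, choose_action yes yes "$PID" no prints "safe" when PID is non-empty, and "software-reboot" when no humaxtv process was found.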

mike_m · Jul 6, 2019

Trev said:
The 'convincing answer' could be "To save some electricity my dear". But I suppose the latter half of the sentence could possibly get you a slap.

Or in my case, the retort would be "Well, if we're talking about saving electricity, how about turning off your PC, your 2 laptops, 3 Raspberry Pi's, 2 printers, a scanner, a dozen chargers for various things, 3 assorted TVs in various rooms that we never watch but have been on standby for at least 8 years, your power supplies for time-lapse cameras, your various powered USB and ethernet hubs, and the night-time light on the landing?" "Er... I never installed the night-time light?" "Exactly!" Guilty as charged.

mike_m · Jul 6, 2019

Black Hole said:
If I implement a nightly shutdown, it will be briefly at about 4am: scheduled standby; power interruption; scheduled wake-up.

Yes, I think that's the best solution for me. I'll use the existing mqtt method for hourly checks and a 4am power-cycle (as a separate cron task) as the backstop.
How would you implement a scheduled standby - could it be done by Settings->Preferences->Time->Power Off Timer?

Actually, it occurs to me that I don't need to run a separate task for the power cycle. Presumably if the scheduled stand-by happens just before the hourly ping test, the test will fail and the power cycle will be triggered anyway.

Black Hole · Jul 6, 2019

mike_m said:
could it be done by Settings->Preferences->Time->Power Off Timer?

Yes, but there must also be a power on time.

mike_m said:
Presumably if the scheduled stand-by happens just before the hourly ping test, the test will fail and the power cycle will be triggered anyway.

Yes, as long as your polling cycle is long enough that you are not hitting it with a series of power interruptions. Just one, during the sleep time (which could be as short as a minute).

mike_m · Jul 6, 2019

/df said:
If you wished to set up a liveness test, the following could be useful checks. You have to start by testing remote access because a local monitor can't be guaranteed to survive; but beware of some unrelated LAN issue preventing the connection.

1. Can you connect to the telnet server? Note: if you use some other protocol (HTTP, MQTT, ...) the server won't be running from the flash disk and may behave strangely if the /mod filesystem is in trouble.
2. Can you find the RW mount point for the /mod filesystem: mount | grep -E "on /$(ls -l /mod | sed -ne 's@^l.* -> @@;s@/mod$@@;p') type ext[234] \(rw,"?
3. Can you find the PID of a running humaxtv process: PID="$(ls -l /proc/*/exe 2>&- | grep "/usr/bin/humaxtv" | cut -d'/' -f 3)"?
4. Now you can either analyse the output of lsof -p $PID (or similar) directly or just call status to check the health of the humaxtv process.
5. status will also show if the HD/R is recording or about to record in case you want to force a reboot.

If #1 fails you need to use the power switch (manual or networked), or the reset button on HD. If any of the later tests fails, or your analysis determines that it's a safe time to restart, you can try a software-invoked reboot, which would be better for @mike_m's disk heads, falling back to the power switch.

I can telnet in via PuTTY and view the processes and open files, but that's me doing it manually - I'm not sure how to embed it into an automated script.

To be honest, I don't think I need to examine the processes in so much detail - my existing system only reboots if a Humax has definitely crashed, and a daily reboot at 4am is unlikely to conflict with a recording. At the most I'll lose one evening's recording, which I can live with.

mike_m · Jul 6, 2019

Black Hole said:
Yes, but there must also be a power on time.

Yes, as long as your polling cycle is long enough that you are not hitting it with a series of power interruptions. Just one, during the sleep time (which could be as short as a minute).

The polling cycle is every hour, on the hour. I'm thinking that I'll set the timers for 10 mins before and 10 mins after the hourly ping, to leave a really good margin in case the various clocks aren't synchronised. I'll try it tonight - the RPi script makes a log file so I can see if it worked.
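A quick way to sanity-check the timer settings, as a sketch: it assumes times are given as HH:MM and that the sleep window does not cross midnight. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: check that a ping time falls inside the sleep/wake window.
# Assumes HH:MM times on the same day (window does not cross midnight).

# to_minutes HH:MM -> minutes since midnight
to_minutes() {
    h=${1%:*}; m=${1#*:}
    h=${h#0}; m=${m#0}      # drop one leading zero so 08 and 09 are not read as octal
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# straddles SLEEP WAKE PING -> succeeds if PING is strictly between SLEEP and WAKE
straddles() {
    s=$(to_minutes "$1"); w=$(to_minutes "$2"); p=$(to_minutes "$3")
    [ "$s" -lt "$p" ] && [ "$p" -lt "$w" ]
}

# margin SLEEP WAKE PING -> prints the smaller gap in minutes between PING and either edge
margin() {
    s=$(to_minutes "$1"); w=$(to_minutes "$2"); p=$(to_minutes "$3")
    a=$(( p - s )); b=$(( w - p ))
    if [ "$a" -lt "$b" ]; then echo "$a"; else echo "$b"; fi
}
```

With a sleep at 03:50, a wake-up at 04:10 and the ping at 04:00, straddles succeeds and margin prints 10, so the clocks can drift by several minutes either way before the ping misses the window. With an hourly ping and a 20 minute window, only one ping can land inside it.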

Trev · Jul 6, 2019

mike_m said:
Or in my case, the retort would be "Well, if we're talking about saving electricity, how about turning off your PC, {snip}

Yeah, that's the sort of thing I meant. I suppose a sharp blow to the left ear is so last century.

mike_m · Jul 12, 2019

Just to report that the scheduled sleep (suggested by Black Hole), combined with the existing hourly ping test, has been working fine since last Saturday. To me, the great thing is that the ping script keeps running on the RPi all the time with no changes needed, and it restarts itself after a power cut. To change the frequency of the ping cycle, I just edit the crontab. Then I can set a scheduled sleep on each Humax independently, just making sure that the sleep and wake-up times straddle one of the pings.

I'm just using a bog-standard ping - I haven't bothered to find any more sophisticated checks for any 'half-crashed' or frozen modes - my thinking is that the usual sort of crash will be detected and fixed within an hour, and any other sort of problem will be handled by the nightly shut-down.

So I'm a happy bunny.

Thanks to everyone!
