Need guidance on diagnostics - large spike in HDD temp never seen before

rodp

Member
Hi All,

I got an alert a little while back (flashing red ring on the front) to say the internal temp (HDD?) was too high. I checked it out, expecting a small rise over the threshold, which can occur on a hot day, but I found this: between 10 and 11pm the HDD temp reading "says" it went from 47 deg C to about 92 deg C, at which point it turned off. I have never seen a spike like this before. The unit is working fine and the fan is working (see second pic). I checked the RS logs to see what it was doing and it was simply recording one program from 10:30 to 11pm. What other logs should I look at, and do you think this is in fact a false temp read?

Some additional questions below.

I've checked the fan and it's running, no strange noises - very quiet but I can feel the air flow.

Thanks

Rodp

========================================

redring.log
Code:
4404    [RR] Tue May 14 04:20:02 2024: Persistent log starting, v2.20
4403    +++++++++++++++++++++++++++++++++++++++++++++++++++++++
4402    [RR] Mon May 13 23:49:32 2024:   Setting LED level.
4401    [RR] Mon May 13 23:49:32 2024: Standby ring dim detected.
4400    [RR] Mon May 13 23:49:29 2024:   Changing to dim blue.
4399    [RR] Mon May 13 23:49:29 2024: Ring trying to go amber.
4398    [RR] Mon May 13 23:49:28 2024:   Setting LED level.
4397    [RR] Mon May 13 23:49:28 2024: Standby ring dim detected.
4396    [RR] Mon May 13 23:49:28 2024:   Setting LED level.
4395    [RR] Mon May 13 23:49:28 2024: Standby ring dim detected.
4394    [RR] Mon May 13 23:39:56 2024: Play icon off.
4393    [RR] Mon May 13 23:16:37 2024: Play icon on.
4392    [RR] Mon May 13 23:16:02 2024: Play icon off.
4391    [RR] Mon May 13 22:52:36 2024: Play icon on.
4390    [RR] Mon May 13 22:52:34 2024: Play icon off.
4389    [RR] Mon May 13 22:31:51 2024: Recording end 40.
4388    [RR] Mon May 13 22:31:50 2024: REC icon off.
4387    [RR] Mon May 13 22:29:27 2024: Play icon on.
4386    [RR] Mon May 13 22:29:21 2024: Play icon off.
4385    [RR] Mon May 13 22:06:26 2024: Opened time data file '/tmp/.offset'
4384    [RR] Mon May 13 22:06:14 2024: Play icon on.
4383    [RR] Mon May 13 22:01:09 2024: REC icon on.
4382    [RR] Mon May 13 22:01:09 2024: Ring going red.
4381    [RR] M+++++++++++++++++++++++++++++++++++++++++++++++++++++++
4380    +++++++++
4379    [RR] Mon May 13 22:01:09 2024:    Recording 1
4378    [RR] Mon May 13 22:01:09 2024:    Recording 1
4377    [RR] Mon May 13 22:01:09 2024: Recording start 40:'/mnt/hd2/My Video/The Sky at Night/The Sky at Night_20240513_2201.nts'
4376    +++++++++++++++++++++++++++++++++++++++++++++++++++++++
4375    on May 13 22:01:09 2024: Persistent log starting, v2.20
4374    [RR] Mon May 13 22:01:09 2024: Persistent log starting, v2.20
4373    [RR] Mon May 13 04:40:05 2024:   Setting LED level.
4372    [RR] Mon May 13 04:40:05 2024: Standby ring dim detected.
4371    [RR] Mon May 13 04:40:03 2024:   Changing to dim blue.
4370    [RR] Mon May 13 04:40:03 2024: Ring trying to go amber.
4369    [RR] Mon May 13 04:40:02 2024:   Setting LED level.

RS log:
Code:
14/05/2024 04:02:02    Updated disk contents.
14/05/2024 03:57:56    System booted (Scheduled event).
13/05/2024 22:31:19    Recorded: The Sky at Night/The Sky at Night (30 minutes - BBC FOUR)
13/05/2024 08:54:29    System booted (Remote control handset).
13/05/2024 04:02:34    Updated disk contents.

auto log
Code:
4483    15/05/2024 23:48:32 - decrypt:  /media/My Video/Race Across the World/Race Across the World_20240515_2101.ts - Queued for decryption.
4482    13/05/2024 22:36:18 -     OK - 747.44 MiB in 81.252 seconds - 9.2 MiB/s Saved: 1100677/4082027 packets, 201.54/747.44 MiB (26.96%) -
4481    13/05/2024 22:36:17 - shrink:Saved: 1100677/4082027 packets, 201.54/747.44 MiB (26.96%)
4480    13/05/2024 22:34:57 - shrink:          Shrinking...
4479    13/05/2024 22:34:57 - shrink:          Estimate 27% saving.
4478    13/05/2024 22:34:57 - shrink:  SHRINK: /mnt/hd2/My Video/The Sky at Night/The Sky at Night_20240513_2201
4477    13/05/2024 22:34:55 - De-queuing 113222 - shrink - /mnt/hd2/My Video/The Sky at Night/The Sky at Night_20240513_2201.ts
4476    13/05/2024 22:34:55 -     OK - 747.44 MiB in 109.484 seconds - 6.83 MiB/s -
4475    13/05/2024 22:34:54 - decrypt:  bin = (/media/My Video/[Deleted Items]/webif_autodecrypt/The Sky at Night)
4474    13/05/2024 22:34:54 - decrypt:  finding bin for /mnt/hd2/My Video/The Sky at Night/The Sky at Night_20240513_2201.ts
4473    13/05/2024 22:34:54 - decrypt:  Removing/binning old copy.
4472    13/05/2024 22:33:05 - decrypt:  DLNA: http://127.0.0.1:9000/web/media/5014.TS
4471    13/05/2024 22:33:05 - decrypt:  DECRYPT: /mnt/hd2/My Video/The Sky at Night/The Sky at Night_20240513_2201
4470    13/05/2024 22:33:02 - De-queuing 113221 - decrypt - /mnt/hd2/My Video/The Sky at Night/The Sky at Night_20240513_2201.ts
4469    13/05/2024 22:32:08 - shrink:autotrigger[21411]:   /media/My Video/The Sky at Night/The Sky at Night_20240513_2201.ts - Queued for shrink.
4468    13/05/2024 22:32:07 - decrypt:autotrigger[21411]:   /media/My Video/The Sky at Night/The Sky at Night_20240513_2201.ts - Queued for decryption.
4467    12/05/2024 15:03:19 -     OK - 649.61 MiB in 47.591 seconds - 13.65 MiB/s Saved: 1058538/3547472 packets, 193.82/649.61 MiB (29.84%) -

Should the fan have gone to 'Fan Full' or due to the fan package does that actually not correlate to the legend in the below chart?
Why is this 92 deg C reading not registered in the SMART diagnostics (see below)?
Is there any way to measure the RPM of the fan to check it didn't temporarily fail?
1716202644421.png

1716202971652.png


1716202751497.png

fan speed set to 60%.

1716203136999.png

HDD diagnostics

1716203300461.png
1716203327758.png
1716203352919.png
 

Should the fan have gone to 'Fan Full' or due to the fan package does that actually not correlate to the legend in the below chart?
The legend is the predicted behaviour assuming only the standard firmware is in control. It does not relate to actual fan behaviour; the only measured information is the HDD temperature. However, the fan package treats the standard behaviour as the minimum requirement and only ever adds duty cycle on top of it, never reducing it.

It can malfunction and require fix-flash-packages, but it wouldn't fix itself.

Why is this 92 deg C reading not registered in the SMART diagnostics (see below)?
I suspect this is a glitch of some kind and not an actual report from the HDD.
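
If you want to cross-check the graph against what the drive itself reports, the current SMART temperature can be read from the command line. This is only a sketch: it assumes smartctl is available on the box, that the disk is /dev/sda, and that the drive reports temperature as attribute 194. The helper name smart_temp is purely illustrative.

```shell
# Print the raw temperature (deg C) from `smartctl -A` output read on stdin.
# Assumption: the drive reports attribute 194 (Temperature_Celsius); a drive
# that only exposes attribute 190 would need the ID below changed.
smart_temp() {
    # Column 1 is the attribute ID, column 10 is the start of RAW_VALUE.
    awk '$1 == 194 { print $10; exit }'
}

# Usage on the box (device name is an assumption):
#   smartctl -A /dev/sda | smart_temp
```

If the value printed stays in the normal range while the graph shows a spike, that would be consistent with the spike being a bad sample rather than a real reading from the drive.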

Is there any way to measure the RPM of the fan to check it didn't temporarily fail?
No.

Put the fan setting to 100% and you'll know whether the fan is running and whether the fan package is malfunctioning.
 
Put the fan setting to 100% and you'll know whether the fan is running and whether the fan package is malfunctioning.
Yep - all OK. It spins up to max about 1 minute after changing the setting, and spins back down about 1 minute after changing it back. So I'll assume it's a glitch!

On a different point...
It can malfunction and require fix-flash-packages, but it wouldn't fix itself.
- Whenever you have a crash, if you acknowledged the crash message in WebIF and then rebooted a couple of times, would that do the same thing as fix-flash-packages or would you still need to run it manually?
- Can you tell what packages have been disabled?
 
would you still need to run it manually?
Yes. All acknowledging the message does is acknowledge the message so it disappears until next time. All rebooting does is reboot. I guess you've never done it to see what hoops fix-flash-packages jumps through.

Can you tell what packages have been disabled?
I am not aware that you can, but watching fix-flash-packages run provides details of what it's doing. I wasn't referring to things getting disabled due to a double crash, I was talking about using it to restore operation if things got corrupted.

The "disable flash packages" function is a failsafe, invoked if multiple crashes occur in quick succession. Its purpose is that if crashes are induced by specific CF packages (those which are tightly integrated with the Humax code), they can't create a deadly embrace (in this case, uninterruptible crash cycles). However, it is more a hindrance than a help once the packages have a track record of being "clean".

There is a mechanism to disable the disable (WebIF >> Diagnostics):
For anyone who wishes to disable this automatic plugin-disable safety net, you can... run the plugin_autodisable/off diagnostic.

If your box is crashing, then something else is wrong, but this will at least stop things like undelete going away. One warning: if something is wrong with one of the plugins then your box could enter a permanent reboot loop that requires a custom-custom firmware to resolve. There have been no recent reports of instability though.

To re-enable the automatic disable feature, run the plugin_autodisable/on diagnostic.

Or alternatively, to prevent the automatic disable from the command line:
Code:
humax# touch /var/lib/humaxtv/mod/no_plugin_autodisable

Note that, unless you are using the beta test WebIF, there will still be a warning posted that some packages may have been disabled, but with the above flag file installed there is no need to take any notice – they will not have been disabled en bloc.
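
For anyone who prefers to check or undo this from the command line, here is a minimal sketch built around the same flag file. The function names are made up for illustration. It also rests on an assumption: that removing the flag file is enough to restore the default behaviour, which is what the plugin_autodisable/on diagnostic is presumed to do.

```shell
# Path of the flag file, taken from the touch command above.
FLAG=${FLAG:-/var/lib/humaxtv/mod/no_plugin_autodisable}

# Report whether the automatic plugin-disable safety net is switched off.
autodisable_status() {
    if [ -e "$FLAG" ]; then
        echo "autodisable is OFF (flag file present)"
    else
        echo "autodisable is ON (no flag file)"
    fi
}

# Assumption: deleting the flag file puts the safety net back in place.
autodisable_restore() {
    rm -f "$FLAG"
}
```

Running autodisable_status before and after autodisable_restore shows whether the flag file was actually there.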

If the deadly embrace of a flash-installed CF package causing an uninterruptible crash cycle does occur with the autodisable flag set, then drastic action will be required, such as reinstalling firmware or (perhaps) running the system flush image or some other utility via the firmware-update-from-USB mechanism. We have never had a report of this happening in the wild; the whole idea of prevention was to aid alpha development work.

Having the autodisable flag set does not mean fix-flash-packages never needs to be run; if you notice something not working properly then ffp is the first cure to try.