SOLVED: The CF causes processor overload UNLESS fixdisk used to optimise hard drive again

nor the 2nd recording
ID# ATTRIBUTE_NAME           FLAGS   RAW_VALUE   VALUE       WORST       THRESH      %     FAIL
  1 Raw_Read_Error_Rate      POSR--  239264634   120         079         006               -
  3 Spin_Up_Time             PO----  0           097         097         000               -
  4 Start_Stop_Count         -O--CK  22854       078         078         020         73%   -
  5 Reallocated_Sector_Ct    PO--CK  0           100         100         036         100%  -
  7 Seek_Error_Rate          POSR--  1534387878  091         060         030               -
  9 Power_On_Hours           -O--CK  43396       051         051         000         51%   -
 10 Spin_Retry_Count         PO--C-  0           100         100         097         100%  -
 12 Power_Cycle_Count        -O--CK  11427       089         089         020         87%   -
184 End-to-End_Error         -O--CK  0           100         100         099               -
187 Reported_Uncorrect       -O--CK  14774       001         001         000               -
188 Command_Timeout          -O--CK  0           100         100         000               -
189 High_Fly_Writes          -O-RCK  0           100         100         000               -
190 Airflow_Temperature_Cel  -O---K  54          046 (54°C)  043 (57°C)  045 (55°C)        In_the_past
194 Temperature_Celsius      -O---K  54          054         057         000               -
195 Hardware_ECC_Recovered   -O-RC-  239264634   044         033         000               -
197 Current_Pending_Sector   -O--C-  0           100         100         000               -
198 Offline_Uncorrectable    ----C-  0           100         100         000               -
199 UDMA_CRC_Error_Count     -OSRCK  0           200         200         000               -
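For anyone puzzled by the bracketed temperatures in the Airflow_Temperature_Cel row: for Seagate attribute 190 the normalised figure is just 100 minus the temperature in Celsius, so you can decode VALUE/WORST/THRESH with a one-liner (the numbers below are taken from that row):

```shell
# Decode the normalised Airflow_Temperature_Cel figures from the table above.
# Seagate convention for attribute 190: temperature (C) = 100 - normalised value.
for value in 46 43 45; do          # VALUE, WORST, THRESH from the 190 row
    echo "$value -> $((100 - value))°C"
done
```

That reproduces the 54°C / 57°C / 55°C annotations shown against that attribute.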
 
I don't like that "Reported_Uncorrect" figure. I suppose the higher-than-desirable temperature figure is down to your running in Safe mode without the Fan package and a decent minimum speed set.
 
There's a discussion of the interpretation of "Reported_Uncorrectable_Errors" (SMART parameter 187) here. tl;dr: any value > 0 is a Bad Thing.
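If you want to keep an eye on that figure without wading through the whole table, the raw value for attribute 187 can be pulled out with awk (a sketch only: the sample table is pasted in as a here-document so the snippet is self-contained; on the box you would pipe in real smartctl output instead):

```shell
# Extract the raw value of SMART attribute 187 (Reported_Uncorrect).
# On the real box: smartctl -A /dev/sda | awk '$1 == 187 { print $NF }'
raw=$(awk '$1 == 187 { print $NF }' <<'EOF'
ID# ATTRIBUTE_NAME     FLAG    VALUE WORST THRESH TYPE    UPDATED WHEN_FAILED RAW_VALUE
187 Reported_Uncorrect 0x0032  001   001   000    Old_age Always  -           14774
EOF
)
if [ "$raw" -gt 0 ]; then
    echo "WARNING: Reported_Uncorrect raw value is $raw - any value > 0 is a Bad Thing"
fi
```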

It would be reassuring to see that "Hardware_ECC_Recovered" == "Raw_Read_Error_Rate", if I didn't suspect that the firmware is just reporting the same data in both fields.
 
Those guys have new drives that fail within 70 power cycles. How is that relevant to my 22k cycles?

There was an instance in the past, about 3-5 years ago, when a sector had failed, but af123 helped me sort it out. I don't know whether that has any bearing on this matter.
 
BackBlaze had (2014) many thousands of drives. Some fail on burn-in. As the blog points out, their usage doesn't involve regularly turning drives off and on, so their data for power cycling isn't a good match for your HDR.

The ones that survive, like your disk, turn out to be likely to fail once SMART 187 > 0. In fact, if 10 drives have values in the range 65-120, their stats say that between 1 and 3 will probably fail in the next year. Your disk's value is some 1500 times greater, way off the scale of the graphs presented by BackBlaze.

According to your stats, the firmware marked the bad sector as good without relocating it. It (and perhaps its neighbours) may be responsible for the large number of "Reported_Uncorrectable_Errors".
 
Speculative hypothesis:

The Reported Uncorrect figures could refer to a single bad sector that is being frequently referenced, but for some reason the disk hardware is unable to reallocate the sector.

Since pixellation occurs at ten-minute intervals while the CF is running (but with auto turned off), yet not during recording with the CF turned off, the faulty sector could be in one of the databases that the CF accesses every ten minutes.

If you were to rename the affected database, that portion of the disk would no longer be accessed and hopefully the problems would cease. (Missing databases are normally recreated on next use, though the reservation database (rsv.db) would need to be restored from a backup of the recording schedule.)

AFAIK the two processes that run every 10 minutes with auto turned off are the EPG update, which uses /mnt/hd1/dvbepg/epg.dat and the sqlite dump of it in /mnt/hd1/epg.db, and rs_processes, which uses /var/lib/humaxtv/rsv.db; I am not sure what else they might access.
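To make the renaming idea concrete, here is the operation demonstrated on a throwaway directory so it can be tried safely (on the box you would mv /mnt/hd1/epg.db itself; the ".suspect" suffix is just my own convention, not anything the CF uses):

```shell
# Demonstrate the rename-the-database idea on a scratch directory.
# On the Humax you would rename /mnt/hd1/epg.db itself; a missing epg.db is
# normally recreated on next use, but back up rsv.db before touching that one.
demo=$(mktemp -d)
touch "$demo/epg.db"                        # stand-in for the suspect database
mv "$demo/epg.db" "$demo/epg.db.suspect"    # its disk blocks are no longer read
ls "$demo"
```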

BTW my own disk has a Reported Uncorrect figure of 2052, so I will have to keep an eye on it.
 
Speculative hypothesis:

The Reported Uncorrect figures could refer to a single bad sector that is being frequently referenced, but for some reason the disk hardware is unable to reallocate the sector.
...
Or just kept deciding it wasn't necessary. The stats indicate that this is a one in 20 thing at worst, if the box has been CFed from the start, or worse in inverse proportion to the fraction of uptime using CF.

hdparm --fibmap filename shows which disk sectors are used by the file "filename". A fix-file script could be envisaged that would test the sectors listed for the file using a selective SMART test. Any sectors that were marginal but never remapped could be added to a bad-block file to be fed to e2fsck -l. This would be less intrusive than running fix-disk or a full OS level disk check, eg e2fsck -cc.
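As a rough sketch of that fix-file idea (the function name and the sample extent numbers below are mine, and this is untested against a real box): the extents that hdparm --fibmap prints can be turned into BEGIN-END ranges suitable for smartctl's selective self-test:

```shell
# Convert "hdparm --fibmap" output into BEGIN-END LBA ranges. On the real box,
# each printed range would then be used as: smartctl -t select,BEGIN-END /dev/sda
fibmap_to_ranges() {
    # extent lines have four numeric fields: byte_offset begin_LBA end_LBA sectors
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ { print $2 "-" $3 }'
}

# Sample fibmap output for a file in two extents (made-up numbers):
ranges=$(fibmap_to_ranges <<'EOF'
/mnt/hd1/epg.db:
 filesystem blocksize 4096, begins at LBA 2048; assuming 512 byte sectors.
 byte_offset  begin_LBA    end_LBA    sectors
           0      10240      10247          8
        4096      20480      20487          8
EOF
)
echo "$ranges"
```

Any sectors that the selective test flags but the drive never remaps could then be converted to filesystem block numbers and fed to e2fsck -l as described.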
 
Can I allow myself a sigh of relief?
While it may not be dying imminently, you still have the ongoing problem that the disk is not performing well enough to allow you to run the normal basic CF operations without getting pixellation of recordings. If you are prepared to live without the CF facilities, that is your choice, but if you want to use the CF then you need to improve the disk's performance or replace it.

You could try my earlier suggestions: run fix-disk to see if it finds anything, or find another disk to see if it performs better.
 
I've viewed quite a few recordings recently and no pic breaks at all.

Is there a possibility that fix-disk might make things worse?
 