Current_Pending_Sector/Offline_Uncorrectable errors - will fix-disk help?

OP

Ian Manning

Member
I'll check the signal strength and quality the next time it happens.
In the meantime I kicked off another fix-disk about 40 minutes ago and it hasn't moved on from "Running short disk self test - Waiting... 0" (see below). Does that sound normal? (it's a 2TB drive)
The front display says "Fixdisk Init"
humax fix-disk 2020-10-13.jpg
 

MartinLiddle

Super Moderator
Staff member
In the meantime I kicked off another fix-disk about 40 minutes ago and it hasn't moved on from "Running short disk self test - Waiting... 0" (see below). Does that sound normal? (it's a 2TB drive)
No, that isn't normal; wait for one of the fix-disk gurus to tell you what to do next.
 

MymsMan

Ad detector
Not a guru ...
Try opening a new telnet session and reconnecting to the abduco session.
But if the front panel is also frozen, it looks to have crashed very early, so probably just restart fix-disk and hope it gets further.
 
OP

Ian Manning

Member
I get "connection refused" if I try to connect in from another telnet session. Presumably I should just power off the Humax and start again?
 
OP

Ian Manning

Member
I've just retried the fix-disk and it's doing exactly the same thing as before (stuck at "Running short disk self test - Waiting... 0"). I'm guessing that this is not good news?
 

/df

Active Member
Try the -l option to fix-disk. It forces a different and possibly more correct test procedure.

Don't expect to use the disk for a few hours while it runs.
 
OP

Ian Manning

Member
I ran fix-disk overnight with the -l parameter. Approx. 10 hours later the screen was full of these messages:

ncheck: EXT2 directory corrupted while calling ext2_dir_iterate
ncheck: Invalid argument while calling ext2_dir_iterate
ncheck: EXT2 directory corrupted while calling ext2_dir_iterate
ncheck: Invalid argument while calling ext2_dir_iterate
ncheck: EXT2 directory corrupted while calling ext2_dir_iterate
ncheck: Invalid argument while calling ext2_dir_iterate
ncheck: Invalid argument while calling ext2_dir_iterate
ncheck: Invalid argument while calling ext2_dir_iterate
ncheck: Invalid argument while calling ext2_dir_iterate
ncheck: Invalid argument while calling ext2_dir_iterate


...and the job did not appear to complete - i.e. there was no message indicating that it had completed. I've now rebooted it.

Any ideas??
 

/df

Active Member
If it was like this, the ncheck messages arise when fix-disk is trying to navigate the filesystem to identify which file or directory is affected by a bad sector. This can work well if bad sectors haven't affected directories, but not otherwise.
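
For anyone curious how a bad sector gets traced back to a file, here is a rough sketch of the usual approach (as in the smartmontools bad block HOWTO). It is not necessarily what fix-disk does internally. The helper name lba_to_fs_block, the device /dev/sda2 and the example numbers are assumptions for illustration only; icheck and ncheck are standard debugfs commands.

```shell
# Convert a failing disk sector (LBA) into a filesystem block number.
# Arguments are illustrative: take the real values from the SMART
# self-test log, the partition table and the filesystem superblock.
lba_to_fs_block() {
    lba=$1                  # failing sector reported by the drive
    part_start=$2           # first sector of the partition
    block_size=$3           # filesystem block size in bytes
    sector_size=${4:-512}   # assumed 512-byte logical sectors
    # Dividing by sectors-per-block avoids a large intermediate product.
    echo $(( (lba - part_start) / (block_size / sector_size) ))
}

# Hypothetical follow-up on the box (device name is an assumption):
#   debugfs -R "icheck <block>" /dev/sda2   # block -> inode
#   debugfs -R "ncheck <inode>" /dev/sda2   # inode -> path name
```

For example, lba_to_fs_block 1000 200 4096 prints 100. The ncheck step is the one that has to walk the directories, which is why it complains when a directory itself is damaged.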

"I agree with him". You just won't have a reliable list of affected files. Use -P to avoid an extra SMART test.
 
OP

Ian Manning

Member
Thanks for the feedback. So the conclusion is: no need to replace the HDD just yet, but run another fix-disk with -l, -P and -y?
 

/df

Active Member
Just -P -y. That'll run just the filesystem check/fix, which should result in a valid filesystem (actually three, one per partition).

A corrupt filesystem has appeared to be the root cause of various system crashes, hangs and glitches, so possibly also your signal issue.

The OEM firmware doesn't let you correct such errors except by backing up and re-formatting.
 

MymsMan

Ad detector
Thanks for the feedback. So the conclusion is: no need to replace the HDD just yet, but run another fix-disk with -l, -P and -y?
It is a pity the ncheck messages give no indication of which directories they refer to, so that you could have some idea which files you may have lost.

The file system can't be all bad, otherwise you wouldn't be able to use the webif or play recordings. But until it has been repaired I wouldn't risk new recordings, and I would avoid anything that writes a lot (decryption, shrink, cropping etc.) so as not to exacerbate the problems.
You might also want to watch any unwatched recordings, or copy any recordings you want to keep to another device, in case re-formatting the disk becomes necessary.
 
OP

Ian Manning

Member
OK I'll do another fix-disk with -P -y. In the meantime I just logged into the web IF and got this:
humax pink message.jpg
Reading between the lines, would it be sensible just to replace the drive, rather than waiting for it to fail (and lose my recordings)?
 

MymsMan

Ad detector
OK I'll do another fix-disk with -P -y. In the meantime I just logged into the web IF and got this:
View attachment 4974
Reading between the lines, would it be sensible just to replace the drive, rather than waiting for it to fail (and lose my recordings)?
This is not a new error; it is just telling you what you already knew: that fix-disk has successfully reallocated the sectors that were in the pending state.

Once you have sorted out the file system issues caused by those bad sectors you should be OK provided the numbers don't start climbing steeply.
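
A minimal sketch of keeping an eye on those numbers, assuming smartctl is installed and the disk is /dev/sda (both assumptions, as is the helper name smart_counters). It just picks the raw values of the three usual attributes out of smartctl -A style output.

```shell
# Print the raw values of the SMART counters discussed in this thread.
# Reads `smartctl -A` style text on stdin; in that table the attribute
# name is column 2 and the raw value is column 10.
smart_counters() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { print $2 "=" $10 }'
}

# On the box (device name is an assumption):
#   smartctl -A /dev/sda | smart_counters
```

Run it every so often and note the output. A stable reallocated count with pending and uncorrectable at zero is the good case; counts that keep rising are the warning sign.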
 

/df

Active Member
As the filesystem check ran out of memory, it probably needs to be run again. With any luck the fixes so far will have left it (I assume the Video partition) in a good enough state to run the check to completion now. If it completed the check on the 1st and 3rd partitions last time, use -2 -P -y to check just the Video partition.
 
OP

Ian Manning

Member
This one finished more quickly (-2 -P -y), but still had the memory allocation error, and still spat out shedloads of "illegal indirect block" errors:
humax fixdisk finished 2.jpg
 

/df

Active Member
This one finished more quickly (-2 -P -y), but still had the memory allocation error, and still spat out shedloads of "illegal indirect block" errors:
It's moved on. The problem is described here.

You could just continue until it works. Or, if you don't mind using the maintenance mode shell, this might sort it in one go (based on the link above, untested).

At the maintenance mode command line (cli), run mount[Enter] to check whether partition 3 (probably /dev/sda3) is mounted. If not, run mount /dev/sda3 /mnt/hd3[Enter]. Then create a cache directory on that partition for e2fsck: mkdir -p /mnt/hd3/e2fscache[Enter].

Create a configuration file for e2fsck:
Code:
cat <<EOM >/var/lib/humaxtv_backup/mod/e2fsck.conf 
[scratch_files]
directory = /mnt/hd3/e2fscache
EOM
Check that the new file does in fact contain lines 2 and 3 above, as it should: cat /var/lib/humaxtv_backup/mod/e2fsck.conf[Enter].

Now start an environment in which this configuration will be used: E2FSCK_CONFIG=/var/lib/humaxtv_backup/mod/e2fsck.conf tmenu[Enter] (not E2FSCK_OPTS!). This should bring up the familiar maintenance mode menu (after PIN entry). When you run fix-disk from this menu, e2fsck should find out about the cache directory, fill it with data needed to fix a giant inode, and not run out of memory.
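
The manual steps above could also be collected into one small shell function. This is a sketch only and as untested on a real box as the steps themselves. The function name prepare_e2fsck_cache is made up; the default paths are the ones given above, and you can pass other paths to try it somewhere harmless first. It does not check or mount /dev/sda3, so do that by hand beforehand.

```shell
# Create the e2fsck scratch directory and write the config file that
# points at it. Both paths default to the ones used in this post.
prepare_e2fsck_cache() {
    cache_dir=${1:-/mnt/hd3/e2fscache}
    conf_file=${2:-/var/lib/humaxtv_backup/mod/e2fsck.conf}

    mkdir -p "$cache_dir" || return 1
    printf '[scratch_files]\ndirectory = %s\n' "$cache_dir" \
        > "$conf_file" || return 1

    # Same check as doing cat on the file by eye: succeed only if the
    # file really contains the directory line.
    grep -qxF "directory = $cache_dir" "$conf_file"
}

# Then, as above:
#   prepare_e2fsck_cache && \
#     E2FSCK_CONFIG=/var/lib/humaxtv_backup/mod/e2fsck.conf tmenu
```

The function returns non-zero if the directory or file cannot be written, so the && stops tmenu from starting with a missing configuration.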

If this works it would be a candidate for updating fix-disk in the next CF. fix-disk uses a swap file to gain memory headroom for scanning the large partition, but it's obviously not enough. The [scratch_files] option might not have been available when fix-disk was created, but we have /sbin/e2fsck which is linked to v1.42.13 in /usr/lib/ext2 in the CF now (as well as two statically linked versions in the repository, which wouldn't be accessible to fix-disk as their installations would be on the filesystem being fixed).
 