Has fixdisk frozen?

4ndy

Member
I opened webif on one of my boxes yesterday, only to get the two options: download full webif or telnet menu. The first option gave errors, so I went for the second and ran fixdisk.

After about 3 hours it looks like the FAT is scrambled, and it is trying to repair it. It has been stuck at this point for a couple of hours now. I can carriage return, so I know it is not locked. Is it likely to be able to repair itself, or do I need a new HDD?

Code:
Pass 1b: Memory used: 4148k/19440k (4113k/36k), time: 3721.11/1262.49/866.41
Pass 1b: I/O read: 30465MB, write: 0MB, rate: 8.19MB/s
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1c: Memory used: 4148k/19440k (4113k/36k), time:  2.86/ 0.07/ 0.17
Pass 1c: I/O read: 4MB, write: 0MB, rate: 1.40MB/s
Pass 1D: Reconciling multiply-claimed blocks
(There are 4575 inodes containing multiply-claimed blocks.)

File /My Video/Around the World in Eighty Days (iPlayer)/Around_the_World_in_80_Days_Ancient_Mariners.mp4 (inode #1433607, mod time Fri Oct 11 12:36:16 2019)
  has 36 multiply-claimed block(s), shared with 1 file(s):
        ... (inode #73901858, mod time Mon Jan  7 23:04:39 2019)
Clone multiply-claimed blocks? yes

It got to this point last night, but I assumed it was stuck and rebooted.

Any help or advice appreciated.
 
There isn't a 'FAT'.
Just leave it and let it run.

Post the SMART stats from a command-line run of "smartctl -a /dev/sda".
 
Just to add to the good advice from prpr; fix-disk runs can take from an hour to several days depending on the nature of the problem and long pauses are not uncommon.
 
Thanks for the help, guys.

Back home after a day out and fixdisk has not advanced in over 8 hours! Still accepting a carriage return, so presumably not locked up.

The smartctl output is here:

Code:
HDR4# smartctl -a /dev/sda
smartctl 6.4 2015-06-04 r4109 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Video 3.5 HDD
Device Model:     ST2000VM003-1ET164
Serial Number:    Z525RXJZ
LU WWN Device Id: 5 000c50 0b1b5c893
Firmware Version: SC12
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Nov 16 19:56:30 2019 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  107) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 258) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x10b9) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   105   099   006    Pre-fail  Always       -       10141968
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1150
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       195604995
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       5214
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1117
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   026   026   000    Old_age   Always       -       74
190 Airflow_Temperature_Cel 0x0022   064   047   045    Old_age   Always       -       36 (Min/Max 35/38)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1112
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       1150
194 Temperature_Celsius     0x0022   036   053   000    Old_age   Always       -       36 (0 19 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
 
There's nothing wrong with the disk, so it looks like you 'just' have filesystem corruption.
If fixdisk doesn't do anything within a sensible time then your only option is to back up what you can and reformat.
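
For anyone wondering which lines matter in that output, here is a rough sketch. It assumes the usual rule of thumb that non-zero raw values for attributes 5, 197 and 198 are the ones to worry about; that choice of attributes is an assumption on my part, and smart_suspects is just a made-up helper name. It reads the attribute table on stdin and prints any of the three with a non-zero raw value.

```shell
# smart_suspects: hypothetical helper, reads a smartctl attribute table on stdin.
# Prints NAME=RAW for attributes 5, 197 and 198 when the raw value (column 10)
# is non-zero; prints nothing otherwise.
smart_suspects() {
    awk '($1 == 5 || $1 == 197 || $1 == 198) && ($10 + 0) > 0 { print $2 "=" $10 }'
}

# Typical use on the box (device name as used earlier in the thread):
#   smartctl -A /dev/sda | smart_suspects
```

No output means all three are zero, which is what the listing above shows.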
 
Fixdisk is still running, "sorting out" about 4-5 recordings a day. Happy to let it run indefinitely.

Out of interest can anyone make sense of the display on the front of the box? It says

"nt_stor:1*70" where * is the whirling symbol used in fixdisk. The 70 was 73 this afternoon. I wondered if it was a clue as to how long it needs to run.

Sent from my SM-G950F using Tapatalk
 
Yes, this comes from fix-disk's "background progress reporter" (/mod/boot/2/fix-disk, line 746).

The format is <device>:<pass>*<percent>%, with * standing for the spinner. So your display "nt_stor:1*70" means it's 70% through pass 1 on the partition /dev/nt_stor... Normally the device field would identify the partition (e.g. /dev/sda2); I'm not sure how you've got an ext2fs partition with the name shown.

Updated to remove brain fade.
 
One presumes that is really "hmx_int_stor" which comes from the first line of "dumpe2fs -h" output for a filesystem:
Filesystem volume name: hmx_int_stor
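
To check this on your own box, something like the following should do. It is only a small sketch: volume_name is a made-up helper name and /dev/sda2 is just an example partition. It picks the label line out of the "dumpe2fs -h" header.

```shell
# volume_name: hypothetical helper, reads "dumpe2fs -h" output on stdin
# and prints whatever follows "Filesystem volume name:" (nothing if the
# line is absent).
volume_name() {
    sed -n 's/^Filesystem volume name:[[:space:]]*//p'
}

# Example use (the partition name is an assumption, adjust to suit):
#   dumpe2fs -h /dev/sda2 2>/dev/null | volume_name
```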
 
That must be it, except it's from e2fsck -fyvtt -C3 (the 4th word of the -C output).

I believe that the code was expecting to see something like /dev/sda1, which is why it strips off the first 5 characters. But a little test shows that e2fsck -C replaces the device name with the volume name if the latter isn't empty.

So for the next CF (or anyone who wants to patch /mod/boot/2/fix-disk):
Code:
--- fix-disk.313
+++ fix-disk.vol
@@ -743,7 +743,7 @@
                if [ -n "$pass" -a -n "$perc" ]; then
                        [ "$perc" = "100.0" ] && perc=100
                        c=${spinner:$i:1}
-                       display "${dev:5}:$pass$c$perc%"
+                       display "${dev#/dev/}:$pass$c$perc%"
                        i=$((i + 1))
                        [ $i -ge ${#spinner} ] && i=0
                fi
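
To see the difference between the two expansions without touching the box, here is a quick illustration. old_name and new_name are just names for this demo, not part of fix-disk. It needs a shell that supports the ${var:offset} form the script already uses, such as bash.

```shell
# Old behaviour: drop the first 5 characters, whatever they are.
old_name() { dev=$1; echo "${dev:5}"; }
# Patched behaviour: drop a leading "/dev/" only if it is actually there.
new_name() { dev=$1; echo "${dev#/dev/}"; }

old_name /dev/sda2      # prints sda2
old_name hmx_int_stor   # prints nt_stor (the truncated name seen on the display)
new_name /dev/sda2      # prints sda2
new_name hmx_int_stor   # prints hmx_int_stor
```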
 
"That must be it, except it's from e2fsck -fyvtt -C3"
I didn't mean "dumpe2fs -h" was used in fixdisk, but rather to indicate how to display the volume name.
"So for the next CF (or anyone who wants to patch /mod/boot/2/fix-disk)"
It's /bin/fix-disk
The version in /mod/boot/2/ is what you get if you install the fixdisk package, and it's somewhat backlevel compared to the version in the firmware.

Anyway, that's two things that need fixing in the CF now, so maybe @af123 will do it this Xmas when he's bored (actually during Coronavirus lockdown)!
 
Thanks for the suggestions.

I have left it running, more out of curiosity than anything.

The "70" has not changed since my last post.

Fixdisk has spent 24 hours working on the /tsr/0.ts file. Could a future version of fixdisk delete this file, as it is pointless rebuilding it? Maybe emptying the dustbin would speed things up as well.

 
Although it might seem reasonable to remove potentially unwanted files, such as the time-shift buffer(s) and the contents of the dustbin, before fixing, the idea has some problems.

The file system containing the time-shift buffer may not be mounted when fix-disk runs; in fact, its non-mounting could be why fix-disk is being run. Also, the time-shift buffer may in fact contain the programme that the user is trying to recover (e.g. via nicesplice); similarly with the dustbin, if enabled.

If you know the time shift buffer, or its backup, is scrap and its filesystem does mount, you can delete it manually from the command line before running fix-disk.
 
The point is that the buffer can easily be the largest file on the Humax and therefore beyond the poor thing's processing power.

Mine is still working on it after two days.

 
"The point is that the buffer can easily be the largest file on the Humax and therefore beyond the poor thing's processing power."
I thought the buffers were fixed in size? Whereas programme recordings have no limit on their size beyond available space on the recording partition (as a serious runaway recording will demonstrate). Has it finished yet?
 
Fixdisk is still running, day 10 now!

The tsr/0.ts file took the best part of 3 days to complete.

Hopefully fixdisk will finish soon, as I have one half of a double socket in the kitchen stopped working, and I need to put the power off to replace it.

One unexpected consequence of the box being offline is that the auto function of RS has stopped working, with no emails notifying me of matches I might want to record on another box. It has also stopped adding to the pending list.

RS was useful initially as it listed scheduled recordings I could record elsewhere.

It would be good if RS didn't give up on an offline box quite so quickly.



 
Fixdisk finally finished. To make absolutely certain I set it running again, expecting it to run straight through quite quickly.

I have just returned home to a list of thousands of inodes(?) ending with the message below. Can there really be over 300,000 multiply-claimed blocks in zero files?

Code:
Pass 1b: Memory used: 15732k/19440k (15698k/35k), time: 3316.09/931.74/893.58
Pass 1b: I/O read: 30463MB, write: 0MB, rate: 9.19MB/s
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1c: Memory used: 15732k/19440k (15698k/35k), time:  0.43/ 0.01/ 0.02
Pass 1c: I/O read: 1MB, write: 0MB, rate: 2.34MB/s
Pass 1D: Reconciling multiply-claimed blocks
(There are 2 inodes containing multiply-claimed blocks.)

File /My Video/Van der Valk/Still Waters.ts (inode #52576276, mod time Tue Oct  8 19:38:10 2019)
  has 303282 multiply-claimed block(s), shared with 0 file(s):
Clone multiply-claimed blocks? yes

I am out for the evening. Is it worth leaving it to run beyond that? Any suggestions welcome.

The new SMART data is here:

Code:
smartctl 6.4 2015-06-04 r4109 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Video 3.5 HDD
Device Model:     ST2000VM003-1ET164
Serial Number:    Z525RXJZ
LU WWN Device Id: 5 000c50 0b1b5c893
Firmware Version: SC12
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Nov 28 17:49:16 2019 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  107) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 258) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x10b9) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       153493128
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1150
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       196118375
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       5500
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1117
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   026   026   000    Old_age   Always       -       74
190 Airflow_Temperature_Cel 0x0022   063   047   045    Old_age   Always       -       37 (Min/Max 33/38)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1112
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       1150
194 Temperature_Celsius     0x0022   037   053   000    Old_age   Always       -       37 (0 19 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5490         -
# 2  Short offline       Completed without error       00%      5200         -
# 3  Short offline       Completed without error       00%      5192         -
# 4  Short offline       Completed without error       00%      5188         -
# 5  Short offline       Completed without error       00%      5182         -
# 6  Short offline       Completed without error       00%      4137         -
# 7  Short offline       Completed without error       00%      4087         -
# 8  Short offline       Completed without error       00%      3918         -
# 9  Short offline       Completed without error       00%      3706         -
#10  Short offline       Completed without error       00%      3703         -
#11  Short offline       Completed without error       00%      2666         -
#12  Short offline       Completed without error       00%      1637         -
#13  Short offline       Completed without error       00%       812         -
#14  Short offline       Completed without error       00%       636         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 
I don't speak emojis, but I sense you take issue with my logic.

If fixdisk has repaired the corruptions, then I would expect a second run to power through without any issues. In my logical world, if it has found more issues, it was not fixed.

 