HDD errors - should I be worried?

rowanmoor

Member
Hi,

Over the weekend WebIF had a banner about HDD problems. Initially it was:
Code:
Disk pending sector count is: 8
Disk offline sector count is: 8
Which was reflected in the SMART data. At that point Reallocated_Sector_Ct was 0.

The first thing I did was run fix-disk. While it was running I got a loop of:
Code:
Running short disk self test
Error at LBA 104280
Do you wish to attempt repair of the bad block? [Y/N]: y

/dev/sda:
re-writing sector 104280: succeeded

Running short disk self test
Error at LBA 104280
Do you wish to attempt repair of the bad block? [Y/N]:
I tried Y a few times, decided it wasn't doing anything, and answered N. This resulted in the following being repeated in the fix-disk output:
Code:
Dev: /dev/sda LBA: 104280
LBA: 104280 is on partition /dev/sda1, start: 8, bad sector offset: 104272
dumpe2fs 1.42.10 (18-May-2014)
Using superblock 0
Block size: 4096
LBA 104280 maps to file system block 13034 on /dev/sda1

Checking to see if this block is in use...
debugfs 1.42.10 (18-May-2014)
Block 13034 is not in use
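For anyone wondering where "file system block 13034" comes from, the mapping in the output above can be reproduced with a little arithmetic. This is only a sketch of the calculation, not fix-disk's actual code: the function name is made up for illustration, and it assumes the LBA and partition start are both counted in 512-byte logical sectors (the sector size the SMART data reports).

```python
SECTOR_SIZE = 512  # logical sector size in bytes (assumed unit for LBA and partition start)


def lba_to_fs_block(lba, partition_start, block_size, sector_size=SECTOR_SIZE):
    """Map an absolute disk LBA to a file system block number inside a partition.

    Hypothetical helper for illustration only.
    """
    offset = lba - partition_start  # sector offset within the partition
    if offset < 0:
        raise ValueError("LBA lies before the start of the partition")
    sectors_per_block = block_size // sector_size  # 4096 // 512 = 8
    return offset // sectors_per_block


# Values taken from the fix-disk output: LBA 104280, /dev/sda1 starts at 8,
# file system block size 4096.
print(lba_to_fs_block(104280, partition_start=8, block_size=4096))  # 13034
```

The offset of 104272 sectors divided by 8 sectors per block gives 13034, which matches the block that debugfs reported as not in use.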
After that it ran through with nothing that seemed unusual.
After a reboot the banner said:
Code:
Disk realloc sector count is: 8
Disk pending sector count is: 7 (was 8)
Disk offline sector count is: 7 (was 8)
and the Smart data had changed to:
Code:
5 - Reallocated_Sector_Ct: 8
197 - Current_Pending_Sector: 7
198 - Offline_Uncorrectable: 7
Another fix-disk run looped in the same way, and again there was nothing unusual in the output. However, this time the SMART data did not change.
The next day I looked again and this time the SMART data had changed to:
Code:
5 - Reallocated_Sector_Ct: 8
197 - Current_Pending_Sector: 0
198 - Offline_Uncorrectable: 0
I ran fix-disk again just in case and the output was:
Code:
Running /bin/fix-disk
Custom firmware version 3.00


Checking disk sda

Unmounted /dev/sda1
Unmounted /dev/sda2
Unmounted /dev/sda3

Running short disk self test

No pending sectors found - skipping sector repair
Using superblock 0 on sda1
Using superblock 0 on sda2
Using superblock 0 on sda3


Checking partition /dev/sda3...
e2fsck 1.42.10 (18-May-2014)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
hmx_int_stor: 13/655776 files (0.0% non-contiguous), 112568/2622611 blocks

Checking partition /dev/sda1...
e2fsck 1.42.10 (18-May-2014)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
hmx_int_stor: 16/65808 files (0.0% non-contiguous), 15265/263062 blocks

Creating swap file...
Setting up swapspace version 1, size = 1073737728 bytes
UUID=99111232-8084-443d-8d5a-f00d92aecfaf

Checking partition /dev/sda2...
e2fsck 1.42.10 (18-May-2014)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
hmx_int_stor: 7216/60334080 files (4.1% non-contiguous), 131864642/241304331 blocks
Removing extra swap space.
Are you having problems with a delete loop? [Y/N]: n
Skipped

Finished

Full SMART data is:
Code:
SMART Status  PASSED
Device Model  ST1000VM002-1CT162
Serial Number  S1G2GLHK
LU WWN Device Id  5 000c50 061f759ea
Firmware Version  SC23
User Capacity  1,000,204,886,016 bytes [1.00 TB]
Sector Sizes  512 bytes logical, 4096 bytes physical
ATA Version is  8
ATA Standard is  ATA-8-ACS revision 4
Local Time is  Tue Oct 28 13:30:36 2014 GMT
SMART support is  Available - device has SMART capability.
SMART support is  Enabled
Attributes
ID Name Flags Raw Value Value Worst Thresh Type Updated When Failed
1 Raw_Read_Error_Rate POSR-- 124172648 117 099 006 Pre-fail Always -
3 Spin_Up_Time PO---- 0 097 097 000 Pre-fail Always -
4 Start_Stop_Count -O--CK 1081 099 099 020 Old_age Always -
5 Reallocated_Sector_Ct PO--CK 8 100 100 036 Pre-fail Always -
7 Seek_Error_Rate POSR-- 25686366 074 060 030 Pre-fail Always -
9 Power_On_Hours -O--CK 922 099 099 000 Old_age Always -
10 Spin_Retry_Count PO--C- 0 100 100 097 Pre-fail Always -
12 Power_Cycle_Count -O--CK 1081 099 099 020 Old_age Always -
184 End-to-End_Error -O--CK 0 100 100 099 Old_age Always -
187 Reported_Uncorrect -O--CK 1 099 099 000 Old_age Always -
188 Command_Timeout -O--CK 0 100 100 000 Old_age Always -
189 High_Fly_Writes -O-RCK 5 095 095 000 Old_age Always -
190 Airflow_Temperature_Cel -O---K 41 059 044 045 Old_age Always In_the_past
191 G-Sense_Error_Rate -O--CK 0 100 100 000 Old_age Always -
192 Power-Off_Retract_Count -O--CK 1080 100 100 000 Old_age Always -
193 Load_Cycle_Count -O--CK 1081 100 100 000 Old_age Always -
194 Temperature_Celsius -O---K 41 041 056 000 Old_age Always -
197 Current_Pending_Sector -O--C- 0 100 100 000 Old_age Always -
198 Offline_Uncorrectable ----C- 0 100 100 000 Old_age Offline -
199 UDMA_CRC_Error_Count -OSRCK 0 200 200 000 Old_age Always -
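As a rough way of keeping an eye on a table like the one above, the rows can be parsed and checked mechanically. The sketch below assumes the column order shown in the WebIF table (ID, Name, Flags, Raw Value, Value, Worst, Thresh, ...). The helper names and the choice of attributes 5, 197 and 198 as a watch list are illustrative assumptions, not anything the custom firmware itself does.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    raw: int
    value: int
    worst: int
    thresh: int


def parse_attr(line):
    """Parse one row in the assumed order: ID Name Flags Raw Value Worst Thresh ..."""
    parts = line.split()
    return SmartAttr(int(parts[0]), parts[1], int(parts[3]),
                     int(parts[4]), int(parts[5]), int(parts[6]))


# Assumed watch list: reallocated, pending and offline-uncorrectable sector counts.
WATCHED = {5, 197, 198}


def warnings(attrs):
    """Return a message for each attribute that looks worth a second glance."""
    out = []
    for a in attrs:
        if a.thresh > 0 and a.value <= a.thresh:
            # Normalized value at or below threshold is the usual SMART failure condition.
            out.append(f"{a.name}: normalized value {a.value} at or below threshold {a.thresh}")
        elif a.attr_id in WATCHED and a.raw > 0:
            out.append(f"{a.name}: raw count {a.raw}")
    return out


# Three rows copied from the table above.
rows = [
    "5 Reallocated_Sector_Ct PO--CK 8 100 100 036 Pre-fail Always -",
    "197 Current_Pending_Sector -O--C- 0 100 100 000 Old_age Always -",
    "198 Offline_Uncorrectable ----C- 0 100 100 000 Old_age Offline -",
]
for message in warnings(parse_attr(r) for r in rows):
    print(message)
```

With these three rows only the reallocated count is reported, because its raw value is 8 while its normalized value of 100 is still well above the threshold of 36.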

I have had a couple of recordings (always HD channels) where there was occasional breakup, but until now I have put that down to HDMI-aerial flylead crosstalk. Just before this, one recording made overnight had some breakup, and I am certain nothing was turned on at the time, so it shouldn't have been HDMI interference. Recordings since have been fine. As this is a 6-9 month old graded box from Humax, I suspect the disk has never been as full as it is now, so it could just be using sectors that have never been used before.

So, should I be worried and looking at a warranty exchange yet, or should I just keep an eye on it?
 
So, should I be worried and looking at a warranty exchange yet, or should I just keep an eye on it?
I would say the SMART data looks fine at the moment although the appearance and then disappearance of Pending_Sectors is a bit odd and is perhaps related to recording breakup. I would keep an eye on it for now.
 
Thanks for the thoughts.

I would say the SMART data looks fine at the moment although the appearance and then disappearance of Pending_Sectors is a bit odd and is perhaps related to recording breakup. I would keep an eye on it for now.
That was my thinking - but I don't know much about the technicalities of disk failures so wanted to check with more knowledgeable people.

How would you submit your warranty claim when your evidence is only available through custom firmware?
That would have been my next question :lol:

I would start an immediate and ongoing backup of everything if people thought it showed signs of terminal decline. I might also consider just replacing the disk myself, so that I can do the replacement on my own schedule and avoid the hassle of having the box out of action while I wait for couriers etc. It does seem a fair bit of money when it is potentially under warranty though. I guess I would have to wait until it had regular recording/playback failures that I could prove were not signal related before starting a warranty claim.
 