Disk issue - advice please

Discussion in 'HDR-FOX T2 Freeview Recorder' started by SimpleSim, Nov 5, 2012.

  1. SimpleSim

    SimpleSim Member

    Hi
    Just back from a week's holiday, so the box has only been recording, with nobody watching at the same time.

    I have noticed that my recordings jump and stutter. Sometimes if I rewind it will play the same part OK; other times it will replay the same part with the same skips. Old recordings that I know were recorded OK also skip around. When I run the standard HDD test I get the following:

    HDD test fail
    You may recover the HDD through Format Storage. (Error code: 8)

    I have seen another thread on a disk problem, so I have tried to use the same tools on my box.

    >>> Beginning diagnostic diskattr
    Running: diskattr
    smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF INFORMATION SECTION ===
    Model Family: Seagate Pipeline HD 5900.2
    Device Model: ST31000424CS
    Serial Number: 5VX2H6ZE
    LU WWN Device Id: 5 000c50 044930c5c
    Firmware Version: SC13
    User Capacity: 1,000,204,886,016 bytes [1.00 TB]
    Sector Size: 512 bytes logical/physical
    Device is: In smartctl database [for details use: -P show]
    ATA Version is: 8
    ATA Standard is: ATA-8-ACS revision 4
    Local Time is: Mon Nov 5 17:21:37 2012 GMT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    1 Raw_Read_Error_Rate 0x000f 085 084 006 Pre-fail Always - 189654418
    3 Spin_Up_Time 0x0003 095 093 000 Pre-fail Always - 0
    4 Start_Stop_Count 0x0032 097 097 020 Old_age Always - 3480
    5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
    7 Seek_Error_Rate 0x000f 078 060 030 Pre-fail Always - 78841377
    9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 2261
    10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
    12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1740
    184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
    187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 2400
    188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
    189 High_Fly_Writes 0x003a 099 099 000 Old_age Always - 1
    190 Airflow_Temperature_Cel 0x0022 051 044 045 Old_age Always In_the_past 49 (1 44 49 49)
    194 Temperature_Celsius 0x0022 049 056 000 Old_age Always - 49 (0 14 0 0)
    195 Hardware_ECC_Recovered 0x001a 045 037 000 Old_age Always - 189654418
    197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 1
    198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 1
    199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0


    >>> Ending diagnostic diskattr
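    For readers skimming a table like the one above, the rows that usually matter for a suspected bad sector are the reallocated, uncorrectable and pending counts. The following is only a rough sketch: `flag_smart` is a hypothetical helper name, and it assumes the raw value is the tenth whitespace-separated field, as in the table above.

```shell
#!/bin/sh
# Hypothetical helper: print selected SMART attributes whose raw value is non-zero.
# Reads attribute rows (as printed by `smartctl -A`) on standard input.
flag_smart() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Reported_Uncorrect|Current_Pending_Sector|Offline_Uncorrectable)$/ && ($10 + 0) > 0 { print $2 "=" $10 }'
}

# Sample rows copied from the table above, so the sketch runs without a disk.
flag_smart <<'EOF'
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 2400
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 1
EOF
```

    On a live box the same filter could be fed with `smartctl -A /dev/sda | flag_smart`.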


    I will post some extra stuff below...
     
  2. SimpleSim

    SimpleSim Member

    humax# smartctl --test=short /dev/sda
    smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 1 minutes for test to complete.
    Test will complete after Mon Nov 5 15:56:16 2012
    Use smartctl -X to abort test.
    humax# smartctl -l selftest /dev/sda
    smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline     Completed: read failure  90%        2260             909913325
    # 2  Short offline     Completed: read failure  90%        2260             909913325
    # 3  Short offline     Completed: read failure  90%        2260             909913325
    # 4  Short offline     Completed: read failure  90%        2259             909913325
    # 5  Short offline     Completed: read failure  90%        2259             909913325
    # 6  Short offline     Completed: read failure  90%        2259             909913325
     
  3. SimpleSim

    SimpleSim Member

    humax# fdisk -lu /dev/sda

    Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
    255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
    Units = sectors of 1 * 512 = 512 bytes

    Device Boot Start End Blocks Id System
    /dev/sda1 2 2104514 1052256+ 83 Linux
    /dev/sda2 2104515 1932539174 965217330 83 Linux
    /dev/sda3 1932539175 1953520064 10490445 83 Linux
    humax#
    humax# /sbin/tune2fs -l /dev/sda2
    tune2fs 1.41.14 (22-Dec-2010)
    Filesystem volume name: <none>
    Last mounted on: <not available>
    Filesystem UUID: f774ede7-525b-41dd-8b16-f270652e1f9f
    Filesystem magic number: 0xEF53
    Filesystem revision #: 1 (dynamic)
    Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
    Filesystem flags: signed_directory_hash
    Default mount options: (none)
    Filesystem state: clean
    Errors behavior: Continue
    Filesystem OS type: Linux
    Inode count: 60334080
    Block count: 241304332
    Reserved block count: 12065216
    Free blocks: 84665574
    Free inodes: 60331591
    First block: 0
    Block size: 4096
    Fragment size: 4096
    Reserved GDT blocks: 966
    Blocks per group: 32768
    Fragments per group: 32768
    Inodes per group: 8192
    Inode blocks per group: 512
    Filesystem created: Sat Jan 1 00:00:17 2000
    Last mount time: Mon Nov 5 12:00:10 2012
    Last write time: Mon Nov 5 12:00:10 2012
    Mount count: 1733
    Maximum mount count: 37
    Last checked: Sat Jan 1 00:00:17 2000
    Check interval: 15552000 (6 months)
    Next check after: Thu Jun 29 01:00:17 2000
    Reserved blocks uid: 0 (user root)
    Reserved blocks gid: 0 (group root)
    First inode: 11
    Inode size: 256
    Journal inode: 8
    Default directory hash: tea
    Directory Hash Seed: fe7b3f3b-b765-4792-a80d-7962c3a54863
    Journal backup: inode blocks
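    For anyone mapping a disk LBA to a filesystem block, the two values needed from output like the above are the partition start sector and the filesystem block size. A minimal sketch of pulling them out follows; `part_start` and `fs_block_size` are hypothetical helper names, and the field positions are assumed to match the `fdisk -lu` and `tune2fs -l` layouts shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the start sector of a partition from `fdisk -lu` output on stdin.
# If the boot flag (*) is present it occupies field 2, so the start moves to field 3.
part_start() {
    awk -v dev="$1" '$1 == dev { if ($2 == "*") print $3; else print $2 }'
}

# Hypothetical helper: print the block size from `tune2fs -l` output on stdin.
fs_block_size() {
    sed -n 's/^ *Block size: *//p'
}

# Sample lines copied from the output above, so the sketch runs without a disk.
part_start /dev/sda2 <<'EOF'
/dev/sda1 2 2104514 1052256+ 83 Linux
/dev/sda2 2104515 1932539174 965217330 83 Linux
EOF

echo "Block size: 4096" | fs_block_size
```

    With the sample input this prints 2104515 and then 4096.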
     
  4. SimpleSim

    SimpleSim Member

    So I think I have to use this formula to get the block: fsblock = (int)((<problem LBA> - <partition start LBA>) * <sector size> / <fs block size>)

    problem LBA 909,913,325
    part start 2,104,515
    sector size 512
    fs block size 4096
    humax# debugfs
    debugfs 1.41.14 (22-Dec-2010)
    debugfs: open /dev/sda2
    debugfs: testb 113476101
    Block 113476101 not in use
    debugfs:

    I did wait a while for it to tell me that.
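    The arithmetic behind the block number used in the debugfs session can be sketched in shell as below. `fs_block` is a hypothetical helper name, and the sketch assumes the filesystem block size is an exact multiple of the sector size (true here: 4096 / 512 = 8). It divides by the sectors-per-block count rather than multiplying by the sector size first, which gives the same truncated result but keeps the intermediate values small on a shell limited to 32-bit arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: map a failing disk LBA to a filesystem block number.
# Arguments: problem LBA, partition start LBA, sector size, filesystem block size.
fs_block() {
    lba=$1
    part_start=$2
    sector_size=$3
    block_size=$4
    # Number of disk sectors in one filesystem block (8 in this thread).
    per_block=$(( block_size / sector_size ))
    # Integer division truncates, which plays the role of (int) in the formula.
    echo $(( (lba - part_start) / per_block ))
}

# Values from this thread.
fs_block 909913325 2104515 512 4096
```

    With the values from this thread it prints 113476101, the block passed to testb.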

    So what do I do now?
     
  5. SimpleSim

    SimpleSim Member

    OK, so I have taken the next step: dd if=/dev/zero of=/dev/sda2 bs=4096 count=1 seek=113476101

    When I ran diskattr again I now have Current_Pending_Sector = 0 and Offline_Uncorrectable = 0
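    As a sanity check on a write like this, it may help to confirm that the filesystem block really covers the failing LBA before zeroing it. The sketch below runs the mapping in reverse; `block_lba_range` is a hypothetical helper name, and it again assumes the block size is an exact multiple of the sector size.

```shell
#!/bin/sh
# Hypothetical helper: print the first and last disk LBA covered by a filesystem block.
# Arguments: filesystem block number, partition start LBA, sector size, filesystem block size.
block_lba_range() {
    block=$1
    part_start=$2
    sector_size=$3
    block_size=$4
    per_block=$(( block_size / sector_size ))
    first=$(( part_start + block * per_block ))
    last=$(( first + per_block - 1 ))
    echo "$first $last"
}

# Values from this thread.
block_lba_range 113476101 2104515 512 4096
```

    This prints 909913323 909913330, a range of eight sectors that includes the reported LBA 909913325.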

    I will see how the playback goes.

    Is it possible that it will have affected some recent recordings?

    Thanks
     
  6. af123

    af123 Administrator Staff Member

    Hopefully that will fix your playback problems, nicely done. Since the fault occurred 90% into the short test it is likely just the one problem sector which you've now fixed.
    The block wasn't in use so no files will have been affected.

    It's worth running fix-disk again to do a full filesystem check and a long disk self test, although that will take a long time.
     
  7. SimpleSim

    SimpleSim Member

    'nicely done' - it's all down to reading and blindly following advice from other postings - so nicely done back at you!

    Not sure what fix-disk is. I have re-run the normal Humax HDD test and got a pass. Also, from telnet I ran smartctl --test=short /dev/sda followed by smartctl -l selftest /dev/sda, and that showed as completed without errors. I have started the long test (smartctl --test=long /dev/sda) and will have to wait till tomorrow for that to complete. Do I get the results in the same way (smartctl -l selftest /dev/sda)?

    Thanks
     
  8. af123

    af123 Administrator Staff Member

  9. Black Hole

    Black Hole Felonius Gru

    "fix-disk" is a command to type at the humax# prompt, which would have run your fixes more effectively because it stops the normal Humax operations while it does it, and creates some swap space. Not sure where you got the idea to run "fdisk" raw, the fact you managed it is "nicely done".

    Next time, just get the Telnet humax# prompt and type "fix-disk" (you need CF 2.12+).
     
  10. af123

    af123 Administrator Staff Member

    fix-disk wouldn't have fixed his underlying sector fault, the steps taken were right for that. fix-disk would be useful now to confirm that the filesystem is completely intact.
     
  11. SimpleSim

    SimpleSim Member

    Thanks, I will run fix-disk when the long test has run and I can get the box to myself, now it doesn't stutter any more :)

    Sorry I didn't look for fix-disk in the wiki.

    All the things I did came from this other forum post http://hummy.tv/forum/threads/hard-drive-failure.2482/ with a bit of help from Google, as rob4x4 had a greater base understanding than me and so didn't need step-by-step instructions.
     
  12. Black Hole

    Black Hole Felonius Gru

    What we need now is an automated process to track down bad sectors!
     
  13. SimpleSim

    SimpleSim Member

    I guess it depends on the ease of development, frequency of occurrence and the goodwill of someone with the skills to do it!

    For someone who hasn't used telnet before I was able to follow the steps others described in the forums and fix my bad sector. I have yet to look at the results of the long test - the box had switched itself off sometime before its daily early morning wake up - see if I can get my hands on it tonight. From first appearance it is working properly again.

    Without the help of this forum and associated developments I would have had to use the standard menu format option and lost all my recordings so the help is (as always) much appreciated :)
     
  14. af123

    af123 Administrator Staff Member

    It will probably come as an extension to the new disk diagnostics page in the web interface... the number of HDRs that are approaching the two year old mark has triggered a large number of disk problems it seems!
     
  15. Ezra Pound

    Ezra Pound Well-Known Member

    I had guessed that a space had been reserved on the Web-If Main Screen for something new :)
     
  16. SimpleSim

    SimpleSim Member

    Phew, extended offline check completed without errors.

    I have run fix-disk (Windows 7, using the unset crlf option) and I think it all ran OK. One question it asked which I was not expecting was 'lost+found not found. Create? ' As everyone needs a lost and found I said Yes!! (Is that okay?)

    Is this a sign of an upcoming problem with the disk, which I should look at replacing in a controlled manner, or is it likely to be a one-off?

    Thanks again to the usual suspects for their invaluable help :)

    Full log from fix-disk pasted below


    humax# fix-disk
    Checking disk sda

    Unmounted /dev/sda1
    Unmounted /dev/sda2
    Unmounted /dev/sda3

    Checking partition /dev/sda3...
    e2fsck 1.41.14 (22-Dec-2010)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    /lost+found not found. Create? yes

    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    /dev/sda3: |========================================================| 100.0%

    /dev/sda3: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/sda3: 13/655776 files (0.0% non-contiguous), 90356/2622611 blocks

    Checking partition /dev/sda1...
    e2fsck 1.41.14 (22-Dec-2010)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    /lost+found not found. Create? yes

    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    /dev/sda1: |========================================================| 100.0%

    /dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/sda1: 15/65808 files (13.3% non-contiguous), 15653/263064 blocks

    Creating swap file...
    Setting up swapspace version 1, size = 1073737728 bytes

    Checking partition /dev/sda2...
    e2fsck 1.41.14 (22-Dec-2010)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    /lost+found not found. Create? yes

    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    /dev/sda2: |========================================================| 100.0%

    /dev/sda2: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/sda2: 2506/60334080 files (13.5% non-contiguous), 157386878/241304332 blocks
    Removing extra swap space.
    Are you having problems with a delete loop [Y/N]? n

    Finished - type 'reboot' to return to normal operation
    humax#
     
  17. Black Hole

    Black Hole Felonius Gru

    You might find some of your recordings have disappeared, when broken references were gathered up into lost+found.
     
  18. SimpleSim

    SimpleSim Member

    oh :(, with the extended offline check being successful I thought I had got away without any damage! Does the log indicate some other stuff was bad then?
     
  19. af123

    af123 Administrator Staff Member

    That log looks fine. The checker just creates the lost+found directories as part of its run if they aren't already present.
     
  20. Black Hole

    Black Hole Felonius Gru

    I said "might"! However, the log does say the file system was changed, implying to me (at least) that something got repaired. When I ran fix-disk recently, when I was having some difficulties with the custom software installation, some of the CF files disappeared up the lost+found jaxie.

    BUT af123 is the guru, don't listen to me over him.