Hard disk and S.M.A.R.T.

Wallace

Traveler 34122
As well as using my Humax I also use a Topfield 5810. As you may or may not be aware, the Toppy can run TAPs (Topfield Application Programs), which allow a high degree of customisation.

One such useful TAP is HDD Monitor. It reads the S.M.A.R.T. data from the installed hard disk, including disk temperature and various error counters etc.

Would it be possible to run a similar program in the Humax operation environment? I think it would be very useful.
 
Good idea, there's a Linux app named smartmontools which could possibly be compiled to run on the HDR.
 
Managed to locate a precompiled package for smartmontools that is compatible with the HDR.
You can get it from http://www.miforz.com/smartmontools-5.38-mips.tgz
To use it in its simplest form just unpack the smartctl binary from the /sbin folder in the archive and place it in /mod/sbin on the HDR.
Here's what you get when you enter /mod/sbin/smartctl -a /dev/sda1

Code:
smartctl version 5.38 [mipsel-unknown-linux-uclibc] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:    ST3500312CS
Serial Number:    XXXXXXX
Firmware Version: SC13
User Capacity:    500,107,862,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:  8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Jul 27 12:15:04 2011 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (  0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                ( 623) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  1) minutes.
Extended self-test routine
recommended polling time:        ( 110) minutes.
Conveyance self-test routine
recommended polling time:        (  2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x000f  118  099  006    Pre-fail  Always      -      191720469
  3 Spin_Up_Time            0x0003  097  097  000    Pre-fail  Always      -      0
  4 Start_Stop_Count        0x0032  098  098  020    Old_age  Always      -      2514
  5 Reallocated_Sector_Ct  0x0033  100  100  036    Pre-fail  Always      -      0
  7 Seek_Error_Rate        0x000f  072  060  030    Pre-fail  Always      -      17918289
  9 Power_On_Hours          0x0032  098  098  000    Old_age  Always      -      1854
10 Spin_Retry_Count        0x0013  100  100  097    Pre-fail  Always      -      0
12 Power_Cycle_Count      0x0032  099  099  020    Old_age  Always      -      1257
184 Unknown_Attribute      0x0032  100  100  099    Old_age  Always      -      0
187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      0
188 Unknown_Attribute      0x0032  100  100  000    Old_age  Always      -      0
189 High_Fly_Writes        0x003a  099  099  000    Old_age  Always      -      1
190 Airflow_Temperature_Cel 0x0022  056  044  045    Old_age  Always  In_the_past 44 (0 28 44 24)
194 Temperature_Celsius    0x0022  044  056  000    Old_age  Always      -      44 (0 15 0 0)
195 Hardware_ECC_Recovered  0x001a  045  039  000    Old_age  Always      -      191720469
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x003e  200  200  000    Old_age  Always      -      0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]
 
SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
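
For anyone who only wants the drive temperature rather than the whole report, the listing above can be filtered. This is a minimal sketch, assuming the attribute table keeps the layout shown above (ID in the first column, raw value in the tenth). The function name smart_temp is made up for illustration, and it has not been tried on the HDR's own shell.

```shell
# Print the raw value of attribute 194 (Temperature_Celsius)
# from 'smartctl -a' output supplied on standard input.
# Assumes the ten-column attribute table layout shown in the listing above.
smart_temp() {
    awk '$1 == 194 { print $10 }'
}
```

On the box it would be fed from the real command, for example: /mod/sbin/smartctl -a /dev/sda1 | smart_temp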
 
Still not had a chance to play yet but I was wondering if it would be possible to add this as a 'button' on the Web Interface?

I have a VERY limited knowledge of Linux and don't want to risk breaking what currently works perfectly.
 
Might as well go for the latest version (5.41) since it compiles quite easily. The new version shows attribute 188 as "Command_Timeout" not "Unknown_Attribute".
 
Any chance of a link? Tried the site of the previous version, but cannot read/speak Russian! Google didn't help me either.

edit: Sorry, overlooked the link to the smartmon tools in your post. Read it too fast!
 
Sorry, but I really don't know what to do. I am afraid that I need step-by-step instructions. I don't even know which 'flavour' of v5.41 to download!

Any help would be appreciated. Thanks.
 
Edit: This message originally contained an opk file to download the smartmontools package and instructions to install it. Since it is now available from the usual place you can now install it using
Code:
opkg install smartmontools
or use the web interface.


You can then run 'smartctl -a /dev/sda' to see the SMART data.
 
I have renamed the file from .zip to .auto. Copied it to my USB stick (FAT32) and inserted it into the Humax. Nothing seems to have happened. What should I expect?

If I copy it over via FTP, where do I put it on the Humax? I.e. which directory?

Not got a clue really, just 'poking and hoping' here, lol
 
If the install from USB worked correctly, the smartmontools_5.41_mipsel.opk.auto file on the USB stick will have been renamed to smartmontools_5.41_mipsel.opk.done, and a log file named smartmontools_5.41_mipsel.opk.log will have been created on the USB stick. If that is the case then enter 'smartctl -a /dev/sda' from telnet to list the SMART data.
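
The check described above can be written down as a small script. This is only a sketch: the function name usb_install_status is hypothetical, and the directory passed in is wherever the stick happens to be mounted, which is an assumption since the mount point is not given in this thread.

```shell
# Report the state of a USB auto-install, based on the renaming described above:
#   <pkg>.done present -> the installer ran
#   <pkg>.auto present -> the file has not been processed
# $1 = directory where the USB stick is mounted (assumption: supplied by the user)
# $2 = package file name without the .auto/.done suffix
usb_install_status() {
    dir=$1
    pkg=$2
    if [ -e "$dir/$pkg.done" ]; then
        echo "installed"
    elif [ -e "$dir/$pkg.auto" ]; then
        echo "pending"
    else
        echo "missing"
    fi
}
```

If it prints "installed", the matching .log file on the stick should show what the installer did.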
 
Nope. The only file on the USB is 'smartmontools_5.41_mipsel.opk.auto'. It is formatted in FAT32 and is the same USB stick I used to update the firmware. Do I need to power-cycle the box? I have tried with the box powered off, inserted the USB and then powered the box on. I have also tried with the box already on, just inserting the USB stick. Same result - nothing. The Humax does 'see' the USB stick as the USB symbol appears on the display.
 
No you don't need to power cycle the box. Just plugging in the USB stick with the box powered up should do it. The opk file is OK as I've just installed it using the same method. Here's the resulting log file:
Code:
955: BF: [smartmontools_5.41_mipsel.opk]
smartmontools_5.41_mipsel.opk: Package installation started.
Installing smartmontools (5.41 ) to root...
Configuring smartmontools.
smartmontools_5.41_mipsel.opk: Package installation finished.
Gotta be your USB stick that's the problem. Try reformatting it. Otherwise FTP the file onto the box: copy it to /mod, then from telnet enter cd /mod, then opkg install smartmontools_5.41_mipsel.opk, and finally rm smartmontools_5.41_mipsel.opk to delete the installer package.
 
xyz321: The install package creates the init.d script with a 'Z' prefix (no autostart). However, uninstalling the package gives an error as it's looking for the script with an 'S' prefix. BTW, have you had the smartd daemon running? I can issue the command to start it without errors, but the process does not appear in the process list.
Edit: Just ran it with debug switch and it looks like the config file needs a little tweaking to define the device. Here's the output from debug:
Code:
Opened configuration file /mod/etc/smartd.conf
Drive: DEVICESCAN, implied '-a' Directive on line 23 of file /mod/etc/smartd.conf
Configuration file /mod/etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
glob(3) found no matches for pattern /dev/sd[a-c][a-z]
glob(3) found no matches for pattern /dev/discs/disc*
In the system's table of devices NO devices found to scan
Unable to monitor any SMART enabled devices. Try debug (-d) option. Exiting...
Pattern matching should have found /dev/sda (possible bug?).
After defining /dev/sda explicitly in smartd.conf the daemon started OK.
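
The exact edit made to smartd.conf is not shown, so the following is only a sketch of one way to do it. It comments out any active DEVICESCAN line and adds an explicit entry for the disk with the '-a' directive (the same directive the debug output says DEVICESCAN implies). The function name explicit_smartd_conf is made up for illustration, and the result goes to standard output so the original file is left untouched.

```shell
# Write a copy of a smartd.conf to standard output with DEVICESCAN disabled
# and an explicit device entry appended.
# $1 = path to the existing config, $2 = device to monitor
explicit_smartd_conf() {
    conf=$1
    dev=$2
    # Comment out any active DEVICESCAN line; all other lines pass through unchanged.
    sed 's/^DEVICESCAN/#DEVICESCAN/' "$conf"
    # Monitor the named device with the '-a' directive.
    echo "$dev -a"
}
```

For example, explicit_smartd_conf /mod/etc/smartd.conf /dev/sda > /mod/tmp/smartd.conf.new would produce a candidate file to review before copying it over the original.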
 
Hmm, the auto-load thing does seem to be broken on the version of firmware I am using (1.08) which is admittedly not the latest.

Your best bet then is to FTP it to the box. Make sure to use binary mode if your client doesn't set it up for you. FTP the file to a temporary location, e.g. /mod/tmp, then telnet into the box and issue the following commands:

Code:
cd /mod/tmp
opkg install smartmontools_5.41_mipsel.opk
 
Yippee!

FTP worked that time with your clear instructions. Thank you.

I now can retrieve the S.M.A.R.T. info.

Don't know what was wrong with my USB stick. As I said, it's the one I have used to update the firmware without any troubles.

Not to worry. Thank you for your patience. Hopefully someone else will find this thread useful too.

It would be great if the S.M.A.R.T. data could be retrieved with a button press from the Web Interface!!
 
Sorry posts crossed.

Yes, I was in two minds whether or not to activate smartd. The uninstall script should attempt to stop the daemon if it is an 'S' script but shouldn't display an error if it is not present. This error can be ignored for now but I will fix it for the 'proper' version.

There may be a problem with other packages which have been disabled, then removed, then reinstalled. You could end up with an 'S' script and a 'Z' script for the same package since opkg only registers one of the script names.
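
A quick way to spot that situation is to look for scripts that exist under both prefixes. This is a sketch only: the function name find_duplicate_init is hypothetical, and the init.d directory has to be passed in because its location is not stated in this thread.

```shell
# List init scripts that exist with both a 'Z' (disabled) and an 'S' (enabled) prefix.
# $1 = init.d directory to inspect (assumption: supplied by the user)
find_duplicate_init() {
    dir=$1
    for z in "$dir"/Z*; do
        # Skip the unexpanded pattern when there are no Z scripts at all.
        [ -e "$z" ] || continue
        rest=${z##*/}   # file name without the directory part
        rest=${rest#Z}  # drop the leading 'Z'
        if [ -e "$dir/S$rest" ]; then
            echo "$rest"
        fi
    done
}
```

Each name printed has both an S and a Z copy, so one of the two is probably left over from a remove and reinstall.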
 
Now I can see my HDD's SMART information. I have noticed that the Power-Off Retract Count is 36 and increments by 1 every time I power-cycle the box. It appears that the SMART limit is 200. Does that mean that when the count reads 200, my HDD will stop working?

It is a replacement 1TB WD unit (it is a model specifically designed for use in PVRs etc.). Should I be worried?

You are probably laughing at me now thinking too much information is a bad thing - he should have left well alone!

On the other hand, if I have fitted an unsuitable/incompatible drive, it's better I know.

Not sure if I should have started a new topic for this post so please feel free to move it if required.
 
That attribute doesn't appear on the 'standard' disks. Can you post the whole line?

BTW, I think my disk may be a little too warm:
Code:
190 Airflow_Temperature_Cel 0x0022  044  042  045    Old_age  Always  FAILING_NOW 56 (1 105 56 34)
 
Here is a copy of the results:

Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 192 189 021 Pre-fail Always - 6366
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 39
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 38
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 37
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 35
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 58
194 Temperature_Celsius 0x0022 097 085 000 Old_age Always - 50
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 0

As you can see, my temperature is 50 and the Power-Off Retract Count is 35. The count has now increased to 37 after two more power cycles of the box!

I am running v5.41 of smartmontools.
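
On reading these tables: in the Power-Off_Retract_Count row the 200 sits in the VALUE column, THRESH is 000, and the count itself (35) is in RAW_VALUE. smartctl treats an attribute as failed when its normalised VALUE is at or below THRESH, which is what the FAILING_NOW line for attribute 190 earlier in the thread shows (VALUE 044 against THRESH 045). A minimal sketch that picks out such rows, assuming the ten-column layout shown in this thread; the function name failing_attrs is made up for illustration:

```shell
# Read 'smartctl -a' output on standard input and print the ID and name of
# every attribute whose normalised VALUE is less than or equal to its THRESH.
# Columns assumed: 1=ID 2=NAME 3=FLAG 4=VALUE 5=WORST 6=THRESH ... 10=RAW_VALUE
failing_attrs() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && ($4 + 0) <= ($6 + 0) { print $1, $2 }'
}
```

It would be fed from the real command, for example: smartctl -a /dev/sda | failing_attrs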
 