Sysmon slows HDR down so replay drops out.

It does, but it's vague: "some packages require a reboot". If I were reading that I'd think to myself "Why can't they tell me WHICH packages require a reboot?".
A note on the Wiki is only slightly more helpful than one on a message board.
As you say, I think it would be best put on the Web-If somehow. How the packages get to tell the Web-If I'm not really sure.
 
I could add something to the Wiki for each package indicating whether a reboot is required, but I don't think that is the right place to put the information (I don't kid myself that everyone consults the Wiki before installing a new package). It would be possible to add something to the 'description file' for each package, but that would require every package to be updated and re-issued, which is not really practical. So I think a general "some packages may require a reboot" will have to do for now.
 
I checked and sysmon wasn't installed, although I did see on twitter it was updated recently (runs to wiki :) )

Regards

Damian
 
I got the same problem where every recording I played back was freezing every 30 secs ish so I opened up Sysmon to have a look at the CPU usage and it didn't show the graph.

I then removed it and the recording I was playing back started to play back correctly. I've had to remove it for now.

I ran the Humax HDDTest and it reports an error 8.

After removing the sysmon package I ran another hard disk check, and it "completed without error".

I'm writing this on my phone, and the screenshots were taken on my phone, so I hope you can see them :)
 

Attachments

  • Screenshot_2012-11-17-01-00-02.png
  • Screenshot_2012-11-17-00-38-09.png
The disk has at least one bad sector which will cause problems in future if not fixed. For details as to how to go about fixing it see this thread, particularly from post 13 onwards.
 
My diagnostics report clean. I propose to do a more detailed examination when I have some free time.
 

I tried the following Telnet commands. Do you think I need to do anything else?

Do I need to connect the HDD to the PC and run Seagate SeaTools?

humax# smartctl -l selftest /dev/sda
smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

humax# smartctl --test=short /dev/sda

smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Sat Nov 17 17:41:16 2012

Use smartctl -X to abort test.
humax#
humax# smartctl -l selftest /dev/sda
smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        926              -

humax# fix-disk
Checking disk sda

Unmounted /dev/sda1
Unmounted /dev/sda2
Unmounted /dev/sda3

Checking partition /dev/sda1...
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? yes

Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda1: 15/65808 files (6.7% non-contiguous), 14955/263064 blocks

Checking partition /dev/sda3...
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? yes

Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sda3: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda3: 13/655776 files (7.7% non-contiguous), 84826/2622611 blocks

Creating swap file...
Setting up swapspace version 1, size = 1073737728 bytes

Checking partition /dev/sda2...
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
rebootD2: |========================== - 47.1%

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? yes

Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sda2: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda2: 6176/60334080 files (8.8% non-contiguous), 195973862/241304332 blocks

Finished - type 'reboot' to return to normal operation
humax#

>>> Beginning diagnostic diskattr

Running: diskattr
smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Seagate Pipeline HD 5900.2
Device Model: ST31000424CS
Serial Number: 9VX18ABQ
LU WWN Device Id: 5 000c50 02d79638b
Firmware Version: SC13
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sat Nov 17 20:38:03 2012 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f 118   071   006    Pre-fail Always  -           194376603
  3 Spin_Up_Time            0x0003 095   095   000    Pre-fail Always  -           0
  4 Start_Stop_Count        0x0032 097   097   020    Old_age  Always  -           3465
  5 Reallocated_Sector_Ct   0x0033 100   100   036    Pre-fail Always  -           16
  7 Seek_Error_Rate         0x000f 081   060   030    Pre-fail Always  -           4433749620
  9 Power_On_Hours          0x0032 095   095   000    Old_age  Always  -           4857
 10 Spin_Retry_Count        0x0013 100   100   097    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032 099   099   020    Old_age  Always  -           1733
184 End-to-End_Error        0x0032 100   100   099    Old_age  Always  -           0
187 Reported_Uncorrect      0x0032 001   001   000    Old_age  Always  -           65535
188 Command_Timeout         0x0032 100   100   000    Old_age  Always  -           8590065666
189 High_Fly_Writes         0x003a 100   100   000    Old_age  Always  -           0
190 Airflow_Temperature_Cel 0x0022 041   036   045    Old_age  Always  FAILING_NOW 59 (4 124 59 59)
194 Temperature_Celsius     0x0022 059   064   000    Old_age  Always  -           59 (0 17 0 0)
195 Hardware_ECC_Recovered  0x001a 046   023   000    Old_age  Always  -           194376603
197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           1
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           1
199 UDMA_CRC_Error_Count    0x003e 200   200   000    Old_age  Always  -           1
>>> Ending diagnostic diskattr
 
Yes, there is still a problem since the smart attributes 197 & 198 are non-zero. These are shown in red on your screenshots and are still non-zero in the above listing.
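
For anyone who wants to check this from a saved listing rather than by eye, here is a rough sketch in Python. It assumes the whitespace-separated row layout shown in the diskattr output above; the function name and the choice to watch only IDs 197 and 198 are illustrative, not part of smartctl.

```python
# Sketch: flag non-zero pending/uncorrectable sector counts in smartctl attribute rows.
# Assumption: each data row has the 10 whitespace-separated columns shown in the listing
# (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value).
WATCHED_IDS = {197, 198}  # Current_Pending_Sector, Offline_Uncorrectable


def nonzero_watched(rows):
    """Return {attribute_name: raw_value} for watched attributes whose raw value is not 0."""
    flagged = {}
    for row in rows:
        fields = row.split()
        # Skip headers and anything that is not a full attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) != 0:
            flagged[name] = int(raw)
    return flagged


sample = [
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 16",
    "197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 1",
    "198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 1",
    "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 1",
]
print(nonzero_watched(sample))
```

With the rows from this thread it reports both attributes with a raw value of 1, which matches the single bad sector discussed here.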

I am not sure the smartctl results you got when running it manually above refer to the correct disk. The self-test log shows a power-on time of 926 hours and no errors, whereas the diskattr listing shows 4857 hours of usage. Is there an external disk attached?
 
The internal disk can be allocated device name sda or sdb between reboots if an external drive is connected. Can you try the following:

Code:
smartctl -i -l selftest /dev/sda

The diskattr diagnostic and fix-disk will always show details for the internal disk.
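
The hours comparison used above to suspect the wrong disk (a self-test LifeTime of 926 hours against Power_On_Hours of 4857) can be written as a rough check. This is only a heuristic sketch: the function name and the 24-hour slack are my own assumptions, not anything smartctl provides.

```python
# Sketch: does a freshly run self-test plausibly belong to the disk whose
# Power_On_Hours we already know? A test that has just completed should carry
# a LifeTime(hours) value close to that disk's current Power_On_Hours.
def plausible_same_disk(selftest_hours, power_on_hours, slack_hours=24):
    """Return True if the two hour counts are within slack_hours of each other."""
    return abs(power_on_hours - selftest_hours) <= slack_hours


# 926 h from the first self-test log vs 4857 h from diskattr: different disks.
print(plausible_same_disk(926, 4857))
# 4869 h (the self-test log posted further down the thread) vs 4857 h: same disk.
print(plausible_same_disk(4869, 4857))
```

The serial number and model from `smartctl -i` remain the more reliable way to tell the disks apart; the hours check is just a quick sanity test.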
 
Thanks xyz321, I didn't know about sda and sdb. I have now removed the external HDD for the duration of the testing.

When I ran fix-disk, the box rebooted and then seemed to stop, but it didn't display MAINTENANCE MODE.

I wasn't sure whether I needed to be in maintenance mode for the smartctl calls, so I ran the following over Telnet first.

diag diagmode
reboot

Again it didn't display the 'MAINTENANCE MODE' message after the reboot, but it froze on the current channel, so I assume it is now in maintenance mode.

humax# smartctl --test=short /dev/sda
smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Sun Nov 18 11:11:21 2012

Use smartctl -X to abort test.

humax# smartctl -i -l selftest /dev/sda
smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Seagate Pipeline HD 5900.2
Device Model: ST31000424CS
Serial Number: 9VX18ABQ
LU WWN Device Id: 5 000c50 02d79638b
Firmware Version: SC13
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sun Nov 18 11:13:14 2012 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed: read failure  90%        4869             530423802
# 2  Short offline     Completed: read failure  90%        4869             530423802
# 3  Short offline     Completed: read failure  90%        4869             530423802
# 4  Short offline     Completed: read failure  90%        4839             530423802
# 5  Short offline     Completed: read failure  50%        4839             530423802
# 6  Short offline     Completed: read failure  90%        4839             530423802
# 7  Short offline     Completed without error  00%        3071             -

This seems to show the same information as the Diagnostics 'Hard Disc' self-test logs, so I'm not sure whether it is useful. Do I need to type anything else?
 
OK, it makes more sense now since we have confirmed that the problem sector is at LBA 530423802. You need to continue with the procedure in the other thread from post #20 which involves the following:
  • Use fdisk to find the start and end sectors of each partition
  • Use tune2fs to find the block size used
  • Calculate the filesystem block containing the bad sector using LBA 530423802.
  • Use debugfs to find out if the block is in use
  • If not in use, zero out the sector - this should fix the problem.
  • Re-run the smartctl tests to check it has been fixed
  • Run a long smartctl test to look for any more bad sectors - this will take a few hours to run.
Edit: It is probably best to be in maintenance mode before running debugfs or attempting to zero out the problem sector. Are you running CF version 2.13? If so, it should display 'MAINTENANCE' on the display and the picture will be frozen.
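
The first step in the list (working out which partition holds the bad sector) can be sketched as below. The start and end sectors are the ones reported by `fdisk -lu /dev/sda` further down this thread; the function name is hypothetical, and this only does the lookup, it does not touch the disk.

```python
# Sketch: find which partition contains a given absolute LBA.
# Each entry is (device, start_sector, end_sector), both ends inclusive,
# as printed by `fdisk -lu` with units of 512-byte sectors.
def find_partition(lba, partitions):
    """Return the (device, start, end) tuple whose range contains lba, or None."""
    for device, start, end in partitions:
        if start <= lba <= end:
            return device, start, end
    return None


partitions = [
    ("/dev/sda1", 2, 2104514),
    ("/dev/sda2", 2104515, 1932539174),
    ("/dev/sda3", 1932539175, 1953520064),
]

problem_lba = 530423802  # LBA_of_first_error from the self-test log
print(find_partition(problem_lba, partitions))
```

For LBA 530423802 this lands in /dev/sda2, so that partition's start sector (2104515) is the one to subtract in the block calculation.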
 
Thanks xyz321, I'm running CFW 2.12.

diag diagmode
reboot


humax# fdisk -lu /dev/sda

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes

Device Boot          Start         End      Blocks   Id  System
/dev/sda1                2     2104514     1052256+  83  Linux
/dev/sda2          2104515  1932539174   965217330   83  Linux
/dev/sda3       1932539175  1953520064    10490445   83  Linux
humax#

humax# sbin/tune2fs -l /dev/sda2
tune2fs 1.41.14 (22-Dec-2010)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: da76a1ce-9113-4c4d-837e-e20ffd9d114d
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 60334080
Block count: 241304332
Reserved block count: 12065216
Free blocks: 46808758
Free inodes: 60327922
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 966
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Filesystem created: Sat Jan 1 00:00:17 2000
Last mount time: Sun Nov 18 14:14:43 2012
Last write time: Sun Nov 18 14:14:43 2012
Mount count: 5
Maximum mount count: 37
Last checked: Sat Nov 17 19:13:16 2012
Check interval: 15552000 (6 months)
Next check after: Thu May 16 20:13:16 2013
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Journal inode: 8
Default directory hash: tea
Directory Hash Seed: e2cdd3e7-35cb-4061-8fa9-09678d723167
Journal backup: inode blocks
humax#

1) Please check that I am using the right numbers :

Problem LBA = 530423802
'fdisk' Start address for sda2 = 2104515

The filesystem block of the bad sector should be :

fsblock = (int)( (<problem LBA> - <partition start LBA>) * <sector size> / <fs block size> )

fsblock = ((530423802 - 2104515) * 512) / 4096

I first tried it on a calculator, assuming the numbers are decimal, and got 66039910, which converts to hex 0x3EFB066, but when I tried it using the dc command I got a different result :)

humax# dc
16 o 530423802 2104512 - 512 * 4096

/ p
3efb067

2) So is my block 66039911 decimal (i.e. 0x3efb067)?

I tried debugfs but couldn't get it to return the inode number.

humax# debugfs
debugfs 1.41.14 (22-Dec-2010)
debugfs: open /dev/sda2
testb 66039911
icheck 66039911
 
xyz321, I tried again with debugfs, and it seems to hang, so I pressed CTRL+Z to abort (not sure if this is right). Could it be that the bad sector is not within a file, that my maths is wrong, or that I am running open on the wrong disk?

humax# debugfs
debugfs 1.41.14 (22-Dec-2010)
debugfs: open /dev/sda2

CTRL+Z

debugfs: testb 66039911
Block 66039911 marked in use

debugfs: icheck 66039911

CTRL+Z

debugfs:
 
I believe you want block 66039910 as the result of the division gives a 0.875 fraction which you want to truncate, not round up (which is what dc seems to do).
 
The dc command would give the correct result (66039910) but the partition start LBA used in the calc. above was incorrect.

@linuxtowers: The debugfs command may take some time to search through the filesystem. Also, please make sure that the external disk is not connected for any of these commands, as it may cause unnecessary confusion.
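
To make the difference between the two results concrete, here is the calculation as a small Python sketch using integer arithmetic only. The function name is made up for illustration; the 512-byte sector and 4096-byte block sizes are the ones reported by fdisk and tune2fs in this thread.

```python
# Sketch: convert an absolute disk LBA into a filesystem block number.
# Integer division truncates, which is what the formula's (int) cast asks for.
def fs_block(problem_lba, partition_start_lba, sector_size=512, block_size=4096):
    """Return (block_number, sector_index_within_block) for the given LBA."""
    byte_offset = (problem_lba - partition_start_lba) * sector_size
    block, remainder = divmod(byte_offset, block_size)
    return block, remainder // sector_size


# Correct partition start for sda2 from fdisk: 2104515
print(fs_block(530423802, 2104515), hex(fs_block(530423802, 2104515)[0]))
# Partition start as typed into dc above: 2104512
print(fs_block(530423802, 2104512), hex(fs_block(530423802, 2104512)[0]))
```

With 2104515 the block is 66039910 (0x3efb066), and the bad sector is the last of the eight 512-byte sectors in that block. With 2104512 the same truncating division gives 66039911 (0x3efb067), which is the figure dc printed, so the off-by-one comes from the start LBA and not from rounding.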
 