Sysmon slows HDR down so replay drops out.

Sorry for the delay, been out with the guys playing snooker.
I have got the original monitor.db file, because I FTP'd it to the PC using FileZilla, and its size is 4,632,576 bytes (smaller).

Strangely, I had no problems FTPing the monitor.db file even though it had a bad sector, which you would have thought would cause read errors during the transfer.

I did look at the Busybox dd command, and I too was worried that the file could get truncated, but thought it wouldn't because I didn't include the conv=notrunc parameter (i.e. don't truncate output file).

I can play back recordings OK, so there is not too much wrong at the moment, but it would be good to get rid of the errors without having to format the HDD.
 
I have now upgraded to CFW 2.14 and, running in Telnet maintenance mode, have fixed the disk errors :)

humax# dd if=/dev/zero of=/dev/sda2 bs=4096 count=1 seek=66039910
1+0 records in
1+0 records out
humax# sync
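
For anyone checking the numbers in that dd command, the arithmetic can be sketched as below. This is only an illustration: it assumes 512-byte logical sectors (not stated in the post), the function name is made up for the example, and nothing here touches a disk.

```python
# Where the dd command above writes, as plain arithmetic.
BLOCK_SIZE = 4096    # bs=4096 from the dd command
SECTOR_SIZE = 512    # assumption: 512-byte logical sectors


def dd_target(seek_blocks, block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Return (byte offset, first sector, sector count), all relative to the partition start."""
    byte_offset = seek_blocks * block_size
    first_sector = byte_offset // sector_size
    sectors_written = block_size // sector_size   # count=1 writes exactly one block
    return byte_offset, first_sector, sectors_written


offset, sector, count = dd_target(66039910)
print(f"byte offset {offset}, sectors {sector}..{sector + count - 1} of /dev/sda2")
```

Zeroing a block overwrites whatever data was stored there, so it is worth double-checking the block number before running a command like this.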

I updated the SMART attribute values by forcing an offline test on the HDD.

humax# smartctl -t offline /dev/sda
smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART off-line routine immediately in off-line mode".
Drive command "Execute SMART off-line routine immediately in off-line mode" successful.
Testing has begun.
Please wait 633 seconds for test to complete.
Test will complete after Thu Nov 22 00:11:49 2012
Use smartctl -X to abort test.

humax# smartctl -A /dev/sda
smartctl 5.41 2011-06-09 r3365 [7405b0-smp-linux-2.6.18-7.1] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f 106   071   006    Pre-fail Always  -           11901708
  3 Spin_Up_Time            0x0003 095   095   000    Pre-fail Always  -           0
  4 Start_Stop_Count        0x0032 097   097   020    Old_age  Always  -           3517
  5 Reallocated_Sector_Ct   0x0033 100   100   036    Pre-fail Always  -           19
  7 Seek_Error_Rate         0x000f 081   060   030    Pre-fail Always  -           4436531858
  9 Power_On_Hours          0x0032 095   095   000    Old_age  Always  -           4951
 10 Spin_Retry_Count        0x0013 100   100   097    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032 099   099   020    Old_age  Always  -           1759
184 End-to-End_Error        0x0032 100   100   099    Old_age  Always  -           0
187 Reported_Uncorrect      0x0032 001   001   000    Old_age  Always  -           65535
188 Command_Timeout         0x0032 100   100   000    Old_age  Always  -           8590065666
189 High_Fly_Writes         0x003a 100   100   000    Old_age  Always  -           0
190 Airflow_Temperature_Cel 0x0022 048   035   045    Old_age  Always  In_the_past 52 (5 28 52 52)
194 Temperature_Celsius     0x0022 052   065   000    Old_age  Always  -           52 (0 17 0 0)
195 Hardware_ECC_Recovered  0x001a 047   023   000    Old_age  Always  -           11901708
197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x003e 200   200   000    Old_age  Always  -           1
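
A rough sketch of how output like this could be screened automatically. It relies on the usual smartctl convention that an attribute counts as failed when its normalised value is at or below the threshold. The sample rows are copied from the listing above; the function names are invented for the example.

```python
# Sketch: pick out the rows of "smartctl -A" output that deserve a second look.
SAMPLE = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 19
190 Airflow_Temperature_Cel 0x0022 048 035 045 Old_age Always In_the_past 52 (5 28 52 52)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
"""


def parse_attributes(text):
    """Split each attribute row into its ten columns; RAW_VALUE may contain spaces."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip headers and blank lines
        ident, name, _flag, value, worst, thresh, _type, _updated, when_failed, raw = parts
        rows.append({
            "id": int(ident), "name": name,
            "value": int(value), "worst": int(worst), "thresh": int(thresh),
            "when_failed": when_failed, "raw": raw,
        })
    return rows


def needs_attention(attr):
    """Flag rows that have failed at some point, or whose worst value reached the threshold."""
    return attr["when_failed"] != "-" or attr["worst"] <= attr["thresh"]


attrs = parse_attributes(SAMPLE)
for attr in attrs:
    if needs_attention(attr):
        print(attr["name"], "raw:", attr["raw"])
```

On these four rows only the airflow temperature entry is flagged, while Current_Pending_Sector and Offline_Uncorrectable both show a raw value of 0.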

I did some good background reading that helped me understand a bit more:
http://smartmontools.sourceforge.net/badblockhowto.html

Thanks guys for your help on this.
 

Attachment: fixed.jpg (335.2 KB)

I have had sysmon disabled (and other things) for a couple of weeks, and my HDR is behaving much better.

Sorry to resurrect an 'old' thread but I thought it would be appropriate.

I have recently noticed slight hesitation/stutter when watching HD recordings (not too sure about 'live' TV at the moment as we don't seem to watch much of that!). It started out of the blue.

According to the SMART monitor, my HDD is fine, no bad/reallocated sectors etc.

Out of pure curiosity, I followed BH's lead and disabled (well, removed) the sysmon package and rebooted. After a couple of days there has been no recurrence of the hesitation/stuttering. It may be purely psychological, but I feel the box is a bit sharper and more responsive too.

I will keep an eye on things and look a bit deeper if needed when I have more free time to play. But I just thought that I would mention the point.
 
I removed sysmon over a week ago but still get the pixellation and jumping on HD recordings. Hoping to get enough time to copy everything off and reformat but no sign of that happening yet!!
 
I don't, and never did, suffer from pixellation. Since the removal of sysmon, I have yet to experience any hesitation or stutter.

It is still early days yet. The next obvious step would be to reinstall the package. I have deleted the monitor.db file from the Hummy.


 
I removed sysmon over a week ago but still get the pixellation and jumping on HD recordings. Hoping to get enough time to copy everything off and reformat but no sign of that happening yet!!
Try unplugging the network cable or pull the wireless dongle out and see if it is better.
 
Hi - this is my first post, so first of all can I say an enormous thanks to everybody involved in the custom firmware. It's a superbly functional and reliable piece of software. The custom firmware was the main reason why I went for the Hummy when replacing our old Toppy!

Like Wallace, I too started getting a stutter on playback of all recordings, which introduced a glitch every 10 minutes or so. This first developed a few months ago and had plagued recordings consistently since then.

Watching live TV remained fine. So given that only recorded TV was affected, and that the disc checked out OK for errors, I reckoned that it must be something related to CPU load while recording.

So I set about watching everything as a recording, chase played by the minimum amount of about 20 seconds. While watching for glitches in the programmes I kept "top" running via ssh, to look for any correlation between the glitches and the running processes. Sure enough, every time there was a glitch on TV it correlated to high activity in a process "temp", which continued taking 100% of CPU for tens of seconds. See attached screenshot.

I quickly brought up the process list and found that "temp" is part of the monitoring tools package. So, I uninstalled it; and everything has been fine since then (several weeks now).

Until the problem started, I'd been using the monitoring tools without any problems. I wonder if the monitor.db database file has grown to the point where there's a high overhead in writing to it? I've kept a copy of the db file in case anybody thinks it'd be helpful for me to investigate further.

Hope that helps.
 

Attachment: high-temp.png (49.3 KB)

Just for the record: even after 'all this time', I have not had any hesitation or stutter since I removed sysmon.
 
I wonder if the monitor.db database file has grown to the point where there's a high overhead in writing to it? I've kept a copy of the db file in case anybody thinks it'd be helpful for me to investigate further.

I have run sysmon since its launch without seeing any problem that I could put down to its use. However, just recently I have seen some disruption to recordings, and it did seem to go away when sysmon was uninstalled. I too thought that something apart from simply running sysmon must have started to happen recently. So here's the question: how big is your monitor.db file? My file size got up to 8.8 Megs since the last time it was erased.
 
So here's the question, how big is your monitor.db file? my file size got up to 8.8 Megs since the last time it was erased
It's about 15MB.

I copied it off and browsed it using the SQLite Database Browser that you recommended from SourceForge. It seems to browse fine. It compacted down to 12MB; I don't know what that says about the state of the indexes etc.


One thing I did notice is that there were some strange entries in the first page of each table. It seems that the primary key is an integer representation of the date & time for each physical sample. Most of the datetime index values are 10 digits long (e.g. 1342303200), which sounds sensible, along the lines of seconds since 1970 or something.

But, the first page of the table contains a dozen or so values in the range from 3600 upwards, which doesn't fit the pattern. I suppose the next step is to look for some documentation on the monitor.db table layouts...
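
To see why the short values look out of place, here is a minimal sketch that renders both kinds of key as dates. It assumes the keys really are seconds since 1 January 1970 UTC; the function name is just for illustration.

```python
from datetime import datetime, timezone


def describe(dat):
    """Render an integer 'dat' key as a UTC timestamp, assuming it counts seconds since 1970."""
    return datetime.fromtimestamp(dat, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")


for dat in (1342303200, 3600):
    print(dat, "->", describe(dat))
```

Read this way, the 10-digit example falls in July 2012, while 3600 would be one hour into 1 January 1970, long before the box existed.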

Or, I could just be happy that removing the monitoring tools has fixed the problem!
 
If accessing large databases is the problem, then it would be good to see if things improve by starting a new database each day and archiving the last few days if a longer history is required.
 
Yup, I suppose that would be a good feature. But reading some of the SQLite specs, it's designed to work with files that are many terabytes in size. I guess that means nothing if the DB use is heavy: either lots of reads that aren't covered by an index, or lots of writes which update multiple indexes. But looking through monitor.db that's obviously not the case.

Would any current users like to check their monitor.db for those strange table entries that I mentioned above? They're in the first page of the table, so easy to spot.
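
One way to run that check without paging through a browser, sketched with Python's built-in sqlite3 module. The only thing taken from the thread is the integer `dat` key; the table layout, the `scope` and `val` columns and the sample rows are invented, and the demo runs against a throwaway in-memory database rather than a real monitor.db.

```python
import sqlite3

# Anything below this (2001-09-09 UTC) cannot be a genuine sample time from the box.
MIN_PLAUSIBLE = 1_000_000_000


def odd_timestamps(conn, table):
    """Return dat values that are too small to be real epoch timestamps."""
    # Table names cannot be bound as parameters, so only pass trusted names here.
    cur = conn.execute(
        f'SELECT dat FROM "{table}" WHERE dat < ? ORDER BY dat', (MIN_PLAUSIBLE,)
    )
    return [row[0] for row in cur]


# Demo data (hypothetical layout): two short keys and one normal one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp (dat INTEGER, scope TEXT, val REAL)")
conn.executemany(
    "INSERT INTO temp VALUES (?, ?, ?)",
    [(3600, "x", 40.0), (7200, "x", 41.0), (1342303200, "x", 45.0)],
)
print(odd_timestamps(conn, "temp"))
```

Pointing `sqlite3.connect` at a copy of monitor.db, rather than the live file, would keep the check away from the box while it is recording.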

(this is purely for the sake of helping people in future now - I don't want to upset my wife by corrupting any more recordings just yet!)
 
I have had a look at my monitor.db file but I don't get any of the short timestamps you have found; they are all 10 digits. As you say, this is the number of seconds since Thu, 01 Jan 1970 00:00:00 GMT, so 3600 would be 1 hour later and doesn't look right. Here is my data; it stops roughly 10 days ago when I uninstalled sysmon:
Code:
temp  : dat 1 : 1351098000 = Wed, 24 Oct 2012 17:00:00 GMT
vmstat : dat 1 : 1351098000 = Wed, 24 Oct 2012 17:00:00 GMT
smart  : dat 1 : 1352851200 = Wed, 14 Nov 2012 00:00:00 GMT
 
temp  : dat 99312 : 1363642982 = Mon, 18 Mar 2013 21:43:02 GMT
vmstat : dat 99276 : 1363642982 = Mon, 18 Mar 2013 21:43:02 GMT
smart  : dat 85040 : 1363563602 = Sun, 17 Mar 2013 23:40:02 GMT
 
It seems there is a bug in the database purge routine which means it never gets purged of old entries. Here is a patch.
Code:
--- db.jim.old
+++ db.jim
@@ -73,7 +73,7 @@
        $mondb query "
                delete from $montable
                where scope = '$scope'
-               and dat < [[clock seconds ] - [expr $days * 86400]]
+               and dat < [expr [clock seconds ] - [expr $days * 86400]]
        "
 }
 
Sysmon is a bit bizarre. I caught it doing its 95% CPU use stuff today:
Code:
19036 ?        S<     0:00 /bin/sh -c /mod/monitor/run
19037 ?        S<     0:00 /bin/sh /mod/monitor/run
19203 ?        R<     0:12 /mod/bin/jimsh /mod/monitor/bin/vmstat
After it finished I ran "sh -c /mod/monitor/run" manually several times and it produced this:
Code:
humax# sh -c /mod/monitor/run
Rolling up.
Rolling up.
Rolling up.
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
humax# sh -c /mod/monitor/run
The one which had all those "Rolling up"s took several tens of seconds and used huge amounts of CPU.
All the others finished in a fraction of a second.
I'm not sure what to conclude from this. Over to af123 to see if he can shed any light...
 
It seems there is a bug in the database purge routine which means it never gets purged of old entries. Here is a patch.
Code:
--- db.jim.old
+++ db.jim
@@ -73,7 +73,7 @@
        $mondb query "
                delete from $montable
                where scope = '$scope'
-              and dat < [[clock seconds ] - [expr $days * 86400]]
+              and dat < [expr [clock seconds ] - [expr $days * 86400]]
        "
}
Thanks. I suspect I just meant to let the subtraction occur in SQL so:

Code:
and dat < [clock seconds ] - [expr $days * 86400]

or perhaps

Code:
and dat < [expr [clock seconds ] - $days * 86400]

Either way, the existing expression is wrong, thanks xyz321.
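
The two corrected forms should select the same rows. The sketch below illustrates that in Python with sqlite3 rather than Jim Tcl, so it is an analogy and not the package's code. The `dat` and `scope` columns come from the patch; the scope value "raw", the table contents and the function names are made up.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86400


def purge_sql_side(conn, table, scope, days, now=None):
    """Let SQLite do the subtraction: dat < now - days * 86400."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        f'DELETE FROM "{table}" WHERE scope = ? AND dat < ? - ? * {SECONDS_PER_DAY}',
        (scope, now, days),
    )
    return cur.rowcount


def purge_precomputed(conn, table, scope, days, now=None):
    """Compute the cutoff first, then compare dat against a single number."""
    now = int(time.time()) if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    cur = conn.execute(
        f'DELETE FROM "{table}" WHERE scope = ? AND dat < ?', (scope, cutoff)
    )
    return cur.rowcount


def demo_db():
    """Build a tiny in-memory table with samples aged 0, 3, 10 and 40 days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temp (dat INTEGER, scope TEXT)")
    now = 1363642982  # a sample time quoted earlier in the thread
    rows = [(now - age * SECONDS_PER_DAY, "raw") for age in (0, 3, 10, 40)]
    conn.executemany("INSERT INTO temp VALUES (?, ?)", rows)
    return conn, now


db_a, now = demo_db()
db_b, _ = demo_db()
print(purge_sql_side(db_a, "temp", "raw", 7, now),
      purge_precomputed(db_b, "temp", "raw", 7, now))
```

With a 7-day window both variants remove the 10-day and 40-day rows and leave the two recent ones.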
 
I'm not sure what to conclude from this. Over to af123 to see if he can shed any light...

If the data expiry hasn't been working then there will always be huge amounts of data to roll up.
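
As a rough illustration of why unpurged data makes a roll-up expensive: the thread does not describe what sysmon's roll-up does internally, so the sketch below simply assumes it condenses raw samples into hourly averages. The function and the sample values are hypothetical. The point is only that every stored sample has to be visited, so the cost grows with the number of rows left in the table.

```python
from collections import defaultdict


def roll_up(samples, bucket_seconds=3600):
    """Average (timestamp, value) samples into fixed-width buckets keyed by bucket start time."""
    totals = defaultdict(lambda: [0.0, 0])   # bucket start -> [sum of values, sample count]
    for dat, value in samples:               # one pass over every stored sample
        bucket = dat - dat % bucket_seconds
        totals[bucket][0] += value
        totals[bucket][1] += 1
    return {bucket: total / count for bucket, (total, count) in sorted(totals.items())}


samples = [(1351098000, 40.0), (1351098600, 42.0), (1351101600, 50.0)]
print(roll_up(samples))
```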
I will push out an updated version later on which fixes this.
 
Version 1.0.10 is available now; it fixes the data purge issue and also reduces the amount of data involved in any rollup.
I suggest running a manual purge following installation via the sysmon/purge diagnostic, or if you're using the CLI then:

Code:
humax# /mod/monitor/run -p
 