Unable to access web interface after CF install

prpr · Oct 5, 2014

I would reinstall 1.03.12 CF2.22 (as that is what you said you started with; failing that, 1.02.32 CF2.21 as that did have access to the disk) and run fix-disk from there. Not being able to find the hard disk is a pretty fundamental problem...

escat · Oct 5, 2014

Thanks - will do. The command line (i.e. ls, cd etc) can still find the hard disk; so hopefully all is not completely lost.

prpr · Oct 5, 2014

fix-disk is a load of command line calls, so I'm not quite sure what you mean. If the output of "ls -l /sys/block/" shows "sda" as one of the items and "ls -l /sys/block/sda/device" shows something like this:
lrwxrwxrwx 1 root root 0 Oct 5 09:21 /sys/block/sda/device -> ../../devices/pci0000:01/0000:01:00.0/host0/target0:0:0/0:0:0:0
then fix-disk ought to run.
It might be useful to know if there are any differences between normal mode and maintenance mode. I don't see how, but you never know till you see the output...

Black Hole · Oct 5, 2014

escat said:
Thanks - will do. The command line (i.e. ls, cd etc) can still find the hard disk; so hopefully all is not completely lost.

Give us screen dumps of everything you do (which means copying the Telnet dialogue and pasting it into a forum post inside a Code section [code]...[/code]).

escat · Oct 5, 2014

Thanks to both prpr and Black Hole for your interest. My reason for posting in detail was to offer the chance to explore what was going on, not just to get me going again. I have other systems, and everything is backed up, so I'm not in any way desperate. But I am a relative novice - as must be obvious!

First, although I started with 1.03.12 + CF 2.22, after the CF 3.0 upgrade failed, so did my subsequent attempts to load CF 2.22 on top of a newly installed 1.03.12. Hence my downgrade to 1.02.32.

Second, in response to prpr's latest post, the longer ls commands both worked as you suggest. (Black Hole: I hope I've got the copy/paste format correct.)

Code:

humax# ls -l /sys/block
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop0
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop1
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop2
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop3
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop4
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop5
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop6
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop7
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock0
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock1
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock2
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock3
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock4
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock5
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock6
drwxr-xr-x  6 root  root  0 Jan  1  2000 sda
drwxr-xr-x  6 root  root  0 Jan  1  2000 sdb
drwxr-xr-x  6 root  root  0 Jan  1  2000 sdc

Code:

humax# ls -l /sys/block/sda/device
lrwxrwxrwx  1 root  root  0 Jan  1  2000 /sys/block/sda/device -> ../../devices/platform/brcm-ehci.0/usb1/1-1/1-1.1/1-1.1:1.0/host2/target2:0:0/2:0:0:0

This leads me to wonder whether it's not necessarily the hard disk itself which has problems - though I will run fix-disk as soon as I can. I guess it's still a possibility that I've got some detritus from earlier loads.

Black Hole · Oct 5, 2014

That's as may be, but disks can and do suffer file system failures including the whole partition going "read only", which then cascades down to CF misoperation etc etc etc. You can only start figuring out whether there is anything systematic going on once you are sure the file system is and was working properly - and that is the role of fix-disk. If fix-disk can't find the HDD partitions, there is definitely a problem!

The Linux output is just Greek to me, but by posting it up the guys with the system knowledge have a chance of spotting what's going on or suggesting the next input.

Bear in mind that loading firmware only updates the non-volatile memory and does not affect the HDD. The HDD does not need to be working to be able to use the Telnet tools.

escat · Oct 5, 2014

Black Hole said:
That's as may be, but disks can and do suffer file system failures including the whole partition going "read only", which then cascades down to CF misoperation etc etc etc. You can only start figuring out whether there is anything systematic going on once you are sure the file system is and was working properly - and that is the role of fix-disk. If fix-disk can't find the HDD partitions, there is definitely a problem!

I agree. I've just been re-reading your article on "Steps for Repairing a Disk of Unknown Faults". I wonder whether I should try to run fix-disk from the command line, rather than using the Maintenance Mode option 1 to invoke it indirectly. In principle, I assume, the two should be equivalent - but removing one level of indirection might make a difference.

escat · Oct 5, 2014

First, a confession, then some progress.

When I reported that the ls /sys/block command appeared to have found the hard drive OK, I wasn't aware that I had inadvertently left the external drives plugged in. It appears that the sda, sdb, sdc drives that ls found were the external ones, not the internal hard drive. Sorry. When I removed them and ran smartctl from the telnet command line, it confirmed that it couldn't find any drives. Very bad news.
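For reference, the link target in the earlier listing already hinted at this: its path runs through brcm-ehci.0/usb1, and EHCI is a USB host controller, so that sda was USB-attached. Below is a minimal sketch of automating the check. It assumes a shell where `readlink -f` is available, and the function names are hypothetical (they are not part of the custom firmware).

```shell
# Classify a sysfs device link target by the bus it hangs off.
# Anything that passes through a usbN component is treated as USB-attached.
classify_block_path() {
    case "$1" in
        */usb[0-9]*) echo "usb" ;;
        *)           echo "other" ;;
    esac
}

# Hypothetical helper: list every sd* disk and how it is attached.
# Assumes readlink -f is available to resolve the sysfs symlink.
list_disks() {
    for dev in /sys/block/sd*; do
        [ -e "$dev" ] || continue          # no sd* entries at all
        target=$(readlink -f "$dev/device")
        echo "$(basename "$dev"): $(classify_block_path "$target")"
    done
}
```

Running `list_disks` from the telnet command line would then print one line per disk. A disk reported as "other" is not on a USB bus and so is a candidate for the internal drive.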

Then the good news. As prpr recommended, I downgraded back to 1.02.32 - and recovered access to my recordings. I re-installed CF 2.22 successfully, and downloaded the web-if package. Then, following Black Hole's advice, I ran the disk checker - tempting luck by using the web interface, rather than going into maintenance mode. This reported no disk failures - though some parameters were marked as 'old-age' and others as 'pre-fail' - not unusual in itself.

So, here's where we are. 1.02.32 with or without CF 2.21 or CF 2.22 seems to operate normally. 1.03.12 without CF seems to operate normally. 1.03.12 with CF 2.22 or CF 3.0 loses access to the hard drive. In the absence of any other suggestions, I'm inclined to go through RMA and see if that cleans anything up. But if it does, that would suggest that RMA might be wise before any firmware upgrade, which is more than is currently recommended?

Two other 'environmental' considerations that may possibly - or more likely may not - have some bearing: 1) the first attempt at installing CF 3.0 on this machine was performed immediately after a self-initiated re-tune. 2) My firewall prevents this machine accessing the internet, other than the CF package download site (89.248.55.77) when I want to install CF packages. In particular, it blocks the calls to the Opera site when the machine starts, but I've never experienced any negative consequences before.

If anyone is interested in finding out more, please let me know. Thanks to prpr and Black Hole for the suggestions.

Black Hole · Oct 5, 2014

The WebIF Diagnostics does not check the file system, it only reports the SMART stats that the disk controller itself gathers.

It is curious that reinstating previous firmware restores normal access to the HDD. This does seem to imply there is little or nothing wrong with the HDD or file system on it. As a wild speculation, maybe the EEPROM is faulty in some location that is not critical to the 1.02.32 firmware but renders 1.03.12 unable to access the HDD.

prpr · Oct 5, 2014

I've kinda lost what you have and haven't done...
Fix-disk in normal mode just initiates the next boot in maintenance mode, where you run fix-disk again.
So, just unplug ALL external drives (you should not see sdb or sdc on the "ls /sys/block/" command) and run fix-disk, reboot, telnet in and run fix-disk again.
There is NO point doing anything else until you have done this. Forget chancing things. Just do this.
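The pre-check described above (sda present, no sdb or sdc) could be sketched as a small shell function. This is only an illustration: the function name is hypothetical, and seeing sda alone does not prove it is the internal drive, because a lone external drive can also appear as sda when the internal one is missing.

```shell
# Succeeds only when the given block-device names include sda
# and no other sd* disk (i.e. no sdb, sdc, ...).
only_sda_present() {
    found_sda=1                      # 1 = not seen yet (shell "false")
    for name in "$@"; do
        case "$name" in
            sda) found_sda=0 ;;      # the disk we expect
            sd*) return 1 ;;         # any other sd* means an extra drive
        esac
    done
    return "$found_sda"
}
```

A possible use from the telnet prompt is `only_sda_present $(ls /sys/block) && echo "OK to run fix-disk"`.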

escat · Oct 5, 2014

prpr, thanks again. Yes, I had already removed the external drives. With telnet, I have checked that the ls /sys/block command is only showing sda. I'm now back on 1.02.32 / CF 2.21 and everything *appears* to be working normally.

I hope I'm right in interpreting the first fix-disk (in normal mode) to which you refer as being the invocation of option 1 from within the telnet front end i.e. to put the box into maintenance mode (Step 3 in Black Hole's documentation on disk faults). If I'm mis-understanding something, please let me know.

After the reboot, and re-telnet, I ran fix-disk with the following result:

Code:

Menu version 1.10
Enter system PIN: ****

  /---------------------------------------------\
  |  M A I N T E N A N C E  M O D E  M E N U  |
  \---------------------------------------------/

  [ Humax HDR-Fox T2 (humax) 1.02.32/2.21 ]

  1 - Check and repair hard disk (fix-disk).
  2 - Run short hard-disk self test.
  3 - Run long hard-disk self test.
  4 - Check self-test progress.
  epg - Clear persistent EPG data.
  x - Leave maintenance mode (Humax will restart).
 diag - Run a diagnostic.
  cli - System command line (advanced users).

Please select option: 1
Any additional options (or press return for none):
Are you sure you wish to run the hard disk checker? [Y/N] y
Running /bin/fix-disk
Custom firmware version 2.21


Checking disk sda

Unmounted /dev/sda1
Unmounted /dev/sda2
Unmounted /dev/sda3

Running short disk self test

No pending sectors found - skipping sector repair
Using superblock 0 on sda1
Using superblock 0 on sda2
Using superblock 0 on sda3


Checking partition /dev/sda3...
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
hmx_int_stor: 13/655776 files (0.0% non-contiguous), 112544/2622611 blocks

Checking partition /dev/sda1...
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
hmx_int_stor: 14/65808 files (0.0% non-contiguous), 15308/263062 blocks


Creating swap file...
Setting up swapspace version 1, size = 1073737728 bytes
UUID=fef4d779-4430-42b2-bf10-bd9100c293cb

Checking partition /dev/sda2...
e2fsck 1.41.14 (22-Dec-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found.  Create? yes

Pass 4: Checking reference counts
Unattached zero-length inode 55525763.  Clear? yes

Pass 5: Checking group summary information
Free inodes count wrong for group #6778 (8193, counted=8192).
Fix? yes

Free inodes count wrong (60328834, counted=60328833).
Fix? yes

hmx_int_stor: ***** FILE SYSTEM WAS MODIFIED *****
hmx_int_stor: 5247/60334080 files (2.3% non-contiguous), 36024944/241304331 blocks
Removing extra swap space.

Finished

Please let me know if this is what you wanted, and whether it tells us anything significant.

Again, many thanks for your help and patience.
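For long fix-disk logs like the one above, a quick way to see which partitions e2fsck actually changed is to scan for the "FILE SYSTEM WAS MODIFIED" marker. The following is a rough sketch that assumes the log has been saved to a file and that the line formats match those shown in this post. The function name is hypothetical.

```shell
# Read a fix-disk log on stdin and print each partition whose
# e2fsck run reported that the file system was modified.
modified_partitions() {
    awk '
        # Remember the partition currently being checked,
        # e.g. "Checking partition /dev/sda2..." -> /dev/sda2
        /^Checking partition/ { part = $3; sub(/\.\.\.$/, "", part) }
        # e2fsck prints this marker when it changed something
        /FILE SYSTEM WAS MODIFIED/ { print part }
    '
}
```

For example, `modified_partitions < fix-disk.log`, where fix-disk.log is an assumed file name holding the pasted Telnet output.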

escat · Oct 5, 2014

Black Hole said:
The WebIF Diagnostics does not check the file system, it only reports the SMART stats that the disk controller itself gathers.

Many thanks for clarifying that. I had wondered why it wasn't included in the "Steps for Repairing a Disk" documentation. Hopefully, prpr will be able to use the fix-disk output to determine whether there is anything wrong with the drive.

Thanks again - and for all the countless other helpful posts!

prpr · Oct 5, 2014

Fix-disk fixed one minor error by the look of it, so nothing to worry about there.
I would probably try 1.02.32/CF 3.00 next and see what happens (give the output of "ls /sys/block" again if it doesn't work).
If it does work, then move on to 1.03.12 official and see whether your disk is still recognised (obviously no Webif at this point).

escat · Oct 5, 2014

Many thanks - that's a relief.

I'll try 1.02.32/CF 3.00 as you suggest. Note, I have already installed 1.03.12 official several times (i.e. without CF), and it is OK. The problem has only arisen when I've installed CF on top of it. But one step at a time!

Looked at objectively, this has to be a de-bugger's dream problem - a catastrophic failure (lack of access to the main drive) that can readily be by-passed (use the back-level firmware) and that is readily re-producible (just install 1.03.12 + CF).

Hopefully, we'll get there in the end.

Thanks again

escat · Oct 6, 2014

As anticipated, installing 1.03.12 official was successful.

As definitely not anticipated, installing CF 3.00 on top was also successful. Everything seemed to work satisfactorily: recording, playback, attached devices etc. So, presumably either the fix-disk run, or the countless installations of different firmware levels, had 'shaken something loose'.

Before declaring victory, I decided to go round the loop one more time - i.e. install 1.03.12 official, then CF 3.00. Again, everything seemed fine. Then I noticed that, contrary to all the instructions, I had forgotten to unplug the external drives. So one more install was called for, with the external drives removed.

This time it went back to its previous behaviour: no recordings visible, no access to the Web-if. Telnet again shows the hard drive missing:

Code:

humax# ls -l /sys/block
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop0
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop1
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop2
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop3
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop4
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop5
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop6
drwxr-xr-x  4 root  root  0 Jan  1  2000 loop7
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock0
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock1
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock2
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock3
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock4
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock5
drwxr-xr-x  5 root  root  0 Jan  1  2000 mtdblock6
humax#

But, oddest of all, power-off / power-on restored everything to full working order - and telnet showed the sda hard drive present. So we're back in business again. But for how long?

Any further thoughts welcome, but I appreciate that this is now definitely not a 'debugger's dream problem'.

Anyway, thanks to Black Hole and prpr for your help in getting me this far.

Black Hole · Oct 6, 2014

I don't think you will have any further problem. Cold restarts are more thorough than reboots; it is likely that a weird state has been cleared and won't recur.

I'm a bit confused about the external drives: did you manage to make firmware updates "take" while there was another drive in the rear USB?? You would definitely have to have removed the UPD after the firmware update, otherwise it would have gone around the loop again.

escat · Oct 6, 2014

Black Hole, many thanks for your continued interest. On your various points:

1) Yes, I did (accidentally) leave the three external drives plugged into the rear USB slot whilst I did the software update earlier this morning - and it did work. It was when I took them out that the subsequent re-installation failed! But if you want to be absolutely certain - or explore this behaviour in more detail - I'm happy to try it again. (For the time being, I'm not relying on the behaviour of this system for anything critical).

2) Yes, I always remove the UPD from the front USB slot before re-booting, so as to avoid going round the loop again.

3) I'm pretty certain I will have done a power-off / power on cycle many times before without cleaning out the problem, but I did wonder whether this had something to do with it. I also wondered whether there was any significance in the fact that the Humax instructions for a new official installation specify that the unit should be powered off, whilst the CF "Install Modified Firmware" instructions specify that the unit should be placed in standby. In hindsight, I cannot be certain that I always slavishly followed this distinction.

Thanks again

Black Hole · Oct 6, 2014

The reason we talk about standby is that we have found that all you need to trigger a firmware update is to reboot the unit with the firmware update file in the USB socket, and not a full cold start. The HDR-FOX needs to drop into full standby (drive clicks off) and restart in order to reboot; a quick cycle before the shutdown is complete does not produce a reboot. You can also use a WebIF diagnostics reboot. A cold start ensures a reboot, whatever the circumstances, but we perceive a full-blown power cycle to be overkill.

This does however presume there are no other USB devices connected, and there is a good chance the update would not start if there were (even though yours did).

The HD-FOX has a reset button, which is much more convenient, but you have to hold the HD-FOX's power button in as well to initiate an update.

It would appear that the weird state your unit got into was caused by having external drives present while the update was taking place, and it seems to me a small miracle that a cold start restored normal operation after that. If you want to be sure, do another update but this time with the other drives disconnected first!

escat · Oct 6, 2014

Black Hole said:
It would appear that the weird state your unit got into was caused by having external drives present while the update was taking place, and it seems to me a small miracle that a cold start restored normal operation after that. If you want to be sure, do another update but this time with the other drives disconnected first!

I agree, there's something really bizarre in all this. When I've got a few hours later in the day, I'll go round all the combinations again and see if I can reliably reproduce, and narrow down, the problem.

Thanks, as ever, for your helpful explanations.

escat · Oct 7, 2014

However many times I ran the various combinations of upgrades, I couldn't reproduce the problem. So time to call it a day. What I did notice, however, was that leaving the drives plugged in the rear USB slot during many of the attempts didn't cause any discernible problems. Anyway, thanks again to Black Hole and prpr for helping me exclude some of the worst possibilities.
