Media List Sort Order

escat

Member
This is a minority interest I'm sure, but has anyone else experienced ordering errors in their displayed list of recordings when sorted alphabetically? In my case, the problem seems to arise after cropping and renaming recordings, using the customised firmware. It may be caused by changes that seem to have been made to the relevant routines some time last summer.

There are three places in the hmt file where the recording name is stored: 0x0180, which holds the relevant file name; 0x029A, which holds the displayed recording name; and 0x0516, which holds the name to be displayed in the iPlate. The crop and rename process appears to sometimes (?) change all of these. The file name may be moved to 0x017F; the displayed name may have the string-leading character x'15' inserted; the iPlate name may have the string-leading characters x'106937' replaced by x'15'. In the greater scheme of things, these changes don't really matter, but they appear to throw off the alphabetic sort of the recording list. In the sort, all those recordings whose names have the inserted x'15' prefix precede all those (mainly older) recordings that have retained their genuine alphanumeric starting character. Since some of my folders have more than a hundred recordings in them, this can be a bit of a tiresome issue!
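
For anyone wanting to look at the same three fields without a hex editor, here is a minimal Python sketch. It assumes the offsets quoted above and that each name is NUL-terminated; the 255-byte cap and the helper names (`read_field`, `dump_name_fields`) are illustrative assumptions, not part of any documented hmt layout.

```python
# Sketch only: offsets are those quoted in the post; the length cap is an assumption.
DISPLAY_NAME_OFFSET = 0x029A
IPLATE_NAME_OFFSET = 0x0516
FILE_NAME_OFFSETS = (0x017F, 0x0180)  # the file name has been seen at either offset


def read_field(data: bytes, offset: int, max_len: int = 255) -> bytes:
    """Return the bytes from offset up to (not including) the first NUL."""
    chunk = data[offset:offset + max_len]
    end = chunk.find(b"\x00")
    return chunk if end < 0 else chunk[:end]


def dump_name_fields(data: bytes) -> dict:
    """Return the raw bytes of each name field so any leading tag stays visible."""
    # Try 0x017F first; if that byte is NUL the name is taken to start at 0x0180.
    file_name = read_field(data, FILE_NAME_OFFSETS[0]) or read_field(data, FILE_NAME_OFFSETS[1])
    return {
        "file name": file_name,
        "displayed name": read_field(data, DISPLAY_NAME_OFFSET),
        "iPlate name": read_field(data, IPLATE_NAME_OFFSET),
    }
```

Calling `dump_name_fields(open("copy.hmt", "rb").read())` on a copy fetched by FTP returns the raw bytes, so a leading `\x15` or `\x10i7` shows up directly in the printed value.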

I'd be grateful if anyone can confirm - or refute - my suppositions, or offer an explanation as to why the change was made.

Thanks
 
How do you see these effects? Do you have to use a binary dump, or do they show up in a command line hmt dump? Is there a definitive way to reproduce them?
 
I use FTP to copy the hmt file, then just look at it with a hex editor, with the help of Raydon's guide to the format of the hmt file. If I use the editor to remove the inserted x'15' character at 0x029A in the file, the sort then works as required.

I can't be 100% certain that it happens every time. But I've got several series where each recording is named (e.g.) "SSEE: Episode title" (where SS is the series number and EE is the Episode Number). In every case, the programmes recorded before last summer follow those recorded after last summer in the alphabetical listing - even though their initial numbers are lower.
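
The ordering described here is what a plain byte-wise comparison would produce, since 0x15 is lower than any digit or letter. The sketch below assumes that the native sort compares the stored bytes directly, which has not been confirmed; the titles are made-up examples.

```python
def strip_tag(title: bytes) -> bytes:
    """Drop a single leading 0x15 byte, if present (only this tag is handled here)."""
    return title[1:] if title[:1] == b"\x15" else title


titles = [
    b"0102: Older episode",        # untagged, recorded before the change
    b"\x15" + b"0201: Newer episode",  # tagged after a rename
    b"0101: Oldest episode",       # untagged
]

# Byte-wise sort: the tagged title comes first because 0x15 < 0x30 ("0").
raw_order = sorted(titles)

# Sorting on the name without its tag restores the expected order.
fixed_order = sorted(titles, key=strip_tag)
```

With these inputs, `raw_order` puts the series 2 episode ahead of both series 1 episodes, while `fixed_order` lists 0101, 0102, 0201.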

The variability of the file name position between 0x017F and 0x0180 has been around for at least seven years, and I've never understood what controls it.

(In passing, I should give credit to whoever changed the crop/rename routine(s) last summer for fixing a long-standing problem whereby about 5% of the cropped recordings would fail to play with VLC because, after the crop, they didn't seem to have the correct lead-in packets. Thanks)
 
I use FTP to copy the hmt file, then just look at it with a hex editor, with the help of Raydon's guide to the format of the hmt file. If I use the editor to remove the inserted x'15' character at 0x029A in the file, the sort then works as required.
Perhaps you could upload some examples (just the .hmt): zip them then add ".txt" to the filename (so it becomes "filename.zip.txt") to fool the forum software into accepting the zip as an attachment.

Also, take a look using the hmt utility: on the HDR-FOX command line (be that via Telnet or using webshell):
Code:
> hmt <recording filename without extension>
 
I've never seen this problem because (a) I rarely crop or rename; and (b) use reverse date for sorting. I can't find a single example of incorrect sorting on my machines.

I take it you are using seriesfiler (or something) for the renaming? I suspect that's the source of the problem; I doubt there are supposed to be offsets to the start of actual data.
 
Perhaps you could upload some examples (just the .hmt): zip them then add ".txt" to the filename (so it becomes "filename.zip.txt") to fool the forum software into accepting the zip as an attachment.
The forum software does not need fooling to accept files with zip extensions.
 
Thanks Black Hole for your interest and prompt response. I attach the zip file as requested (produced before I saw Brian's comment). It includes three different versions of the hmt file for the latest episode of ITV's The Ipcress File broadcast on Sunday. The three fields of interest, described above, are at 017F, 029A and 0516/0519. (For convenience, I've also included a version of Raydon's HMT format guide.)

The Original version is after auto-decrypt. The Cropped version is after cropping, using bookmarks and the web interface crop option. The only change is that the filename has moved from 0180 to 017F. The Rename version is after renaming, again using the web interface Rename option. (I don't know seriesfiler.) The name field at 029A now has the '15' character inserted before the name. The name field at 0516/0519 has the '106937' string replaced by '15' and the name itself shifted left by two characters.
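
A quick way to confirm which bytes differ between two of these versions is to list the changed offset ranges. This is a minimal Python sketch; `changed_ranges` is a hypothetical helper name, and it says nothing about why a byte changed.

```python
def changed_ranges(a: bytes, b: bytes) -> list:
    """Return (start, end) offset pairs, end exclusive, where a and b differ."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        # Report any trailing bytes present in only one of the inputs.
        ranges.append((common, max(len(a), len(b))))
    return ranges
```

Printing the result with `hex()` applied to each offset makes it easy to compare against the 017F, 029A and 0516/0519 positions mentioned above.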

I'm not sure what telnet will tell me about the files that I don't already know? My interest is in understanding why these routines make these changes, and whether they can be stopped from doing it! I guess that probably needs knowledge of the actual code.

As I say, a minority interest, but important to me.

Thanks again
 

Attachments

  • hmtfiles.zip.txt
    18.7 KB
Oh. Is that new? I'll have to update the Newbies Guide.
It has been possible for many years: a quick forum search showed zip file attachments from 2014, and there may be earlier examples.

Edit: It seems that the earliest attached zip file on the forum is from 2011.
 
I don't have time to delve into your files right now, but comments below for your consideration:

I don't know seriesfiler
There is an automated way to use a database to retitle episodes, including series and episode number. I don't recall the details, but check out seriesfiler and sweeper. The wiki summarises packages, and you can get to the relevant forum discussions via the search utility or Index of Package Primary Topics.

For convenience, I've also included a version of Raydon's HMT format guide.
It's on the wiki. I note some of the information is attributed to you, so I doubt I'm going to be able to contribute much.

I'm not sure what telnet will tell me about the files that I don't already know?
I'm curious how the hmt utility (which provides a decode of the .hmt file in human-readable form) handles these variations, in particular whether it mimics what's seen on-screen. It should, and if it doesn't it needs updating.

If the firmware history includes an alteration of the start index for the display title from 017F to 0180, how does the new firmware interoperate with old recordings? If the start index is actually variable, how is it indicated? If the index has changed, have the utilities (such as rename) been updated in accordance? Why does crop have any input to this?

Your observations of a quirk might help get to the bottom of some of this and enable refinements to the code.
 
I'll see what the hmt utility has to say and report back.

I've obviously wondered myself how both the native and customised firmware cope with these problems. With regard to the 017F/0180 problem, it's possible that, since the path name and file name are stored in consecutive fields, the firmware just concatenates the two and strips out any redundant characters. Hence no problem coping with both old or new and cropped or uncropped recordings. With regard to the other two fields and their problems, since the characters concerned are just the string lead-in characters, they probably just get stripped out, so again no problem. Except if the '15' character is included in the sort string, of course. I assume that the sorted on-screen display is produced by native firmware, so it has no reason to take the introduced '15' character at 029A as other than a genuine part of the name.

I learnt the hard way when I started programming sixty years ago - when turn round time could be a week or more - that quirks were always important :)) Hence my interest in finding the root of the problem.

Thanks again.
 
Hence my interest in finding the root of the problem.
The root of the problem is that there is no proper spec. for what the file format is supposed to be. It's all done by educated guesswork and reverse engineering over many variations of firmware and broadcasters' transmissions. Fixing one thing often breaks another, but is that a reason to put something back? Finding the right answer can be very time consuming.
 
The prefixes distinguish various character encodings as defined in the DVB specifications (DTG D-Book 7 Pt A 8.5.6 and EN 300 468 Annex A), with some Humax extension IIRC.

There may have been slightly different HMT versions in different OEM FW versions.

This Jim TCL code from my WIP port of hmt implements my understanding of the character encodings (updated):
Code:
proc ucs2toutf8 {wc} { ...}

proc iso8859toutf8 {s {part 1}} {...}

proc {string fromttx} {ttxstr} {
	# read the first byte as an integer tag (signed, so bytes >= 0x80 are negative)
	if {$ttxstr eq "" || 1 != [binary scan $ttxstr c c1] || $c1 >= 0x20} {
		# seems to be no string, or untagged ISO 8859-x
		return $ttxstr
	}
	if {$c1 == 0x15} {
		# UTF-8 in bytes 1..end
		return [string byterange $ttxstr 1 end]
		# ignoring:
		#0x12 KSX1001-2004
		#0x13 GB-2312-1980
		#0x14 Unicode Big5
	} elseif {$c1 >= 1 && $c1 <= 11 && $c1 != 8} {
		# various ISO 8859-x in bytes 1..end
		# keys are decimal so that they match the value of $c1
		set part [dict get {1 5 2 6 3 7 4 8 5 9 6 10 7 11 9 13 10 14 11 15} $c1]
		return [iso8859toutf8 [string byterange $ttxstr 1 end] $part]
	} elseif {$c1 == 0x11} {
		#0x11 Unicode BMP
		set s [string byterange $ttxstr 1 end]
		binary scan $s s* le
		binary scan $s S* be
		proc score {l} { # count bytes with top bit set
			...
		}
		# select endianness with more 7-bit chars
		# (join with an empty separator, otherwise the characters are space-separated)
		return [join [lmap c $([score le] < [score be]? $le: $be) {ucs2toutf8 $c}] ""]
	} elseif {$c1 == 0x10 && 2 == [binary scan [string byterange $ttxstr 1 2] cc c2 c3]} {
		# c2, c3 are the two bytes following the 0x10 tag
		if {$c2 == 0 && $c3 >= 1 && $c3 <= 0x0F && $c3 != 0x0C} {
			# various ISO 8859-x in bytes 3..end
			return [iso8859toutf8 [string byterange $ttxstr 3 end] $c3]
		} elseif {$c2 == 0x69 && $c3 == 0x37} {
			# ISO 6937 in bytes 3..end
			return [xconv [string byterange $ttxstr 3 end]]
		}
	} elseif {$c1 == 0x1f} {
		# ETSI 101162 Encoding_Type_ID
		# used in HD EPG; should have been decoded already
		# xconv ??
		return [xconv [string byterange $ttxstr 2 end]]
	}
	# who knows?
	return $ttxstr
}
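
For checking raw field bytes on a PC rather than on the box, the same tag rules can be sketched in Python. This only labels the encoding and returns the payload bytes; it does not decode anything, and the label strings and the function name `classify` are illustrative assumptions. Tags not handled in the Jim code above (0x12 to 0x14, 0x1F) simply come back as "unknown".

```python
# Single-byte tags for ISO 8859 parts, as in the Jim code above.
ISO8859_BY_TAG = {0x01: 5, 0x02: 6, 0x03: 7, 0x04: 8, 0x05: 9,
                  0x06: 10, 0x07: 11, 0x09: 13, 0x0A: 14, 0x0B: 15}


def classify(raw: bytes):
    """Return (encoding label, payload bytes) for a possibly tagged text field."""
    if not raw or raw[0] >= 0x20:
        return ("untagged", raw)            # no leading control byte
    tag = raw[0]
    if tag == 0x15:
        return ("utf-8", raw[1:])
    if tag in ISO8859_BY_TAG:
        return (f"iso8859-{ISO8859_BY_TAG[tag]}", raw[1:])
    if tag == 0x11:
        return ("ucs-2", raw[1:])
    if tag == 0x10 and len(raw) >= 3:
        if raw[1] == 0x00 and 1 <= raw[2] <= 0x0F and raw[2] != 0x0C:
            return (f"iso8859-{raw[2]}", raw[3:])
        if raw[1:3] == b"\x69\x37":
            return ("iso6937", raw[3:])     # the three-byte 10 69 37 prefix
    return ("unknown", raw)
```

Applied to the fields discussed in this thread, a name starting with x'15' is reported as "utf-8" and one starting with x'106937' as "iso6937".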
 
Thank you all for your contributions. I fully accept that mine is a minority interest, but am grateful for any help in resolving it.

The root of the problem is that there is no proper spec. for what the file format is supposed to be. It's all done by educated guesswork and reverse engineering over many variations of firmware and broadcasters' transmissions. Fixing one thing often breaks another, but is that a reason to put something back? Finding the right answer can be very time consuming.

... as I have indeed found out over the years! But I think that the root of my problem in this instance isn't the spec of the humax itself, but in how the implementation of the crop and rename functions in the customised firmware relates to that spec.

I'm curious how the hmt utility (which provides a decode of the .hmt file in human-readable form) handles these variations, in particular whether it mimics what's seen on-screen. It should, and if it doesn't it needs updating.

Thanks for a very useful tip. (I haven't used the hmt utility before. I've always worked from the hmt files directly.) I loaded the three hmt files included in the attachment above on to my humax and ran the utility against them. The output in all three cases seems to be valid. It would appear that the writer of hmt was aware of the variability in the use of these fields between the original humax files and their processed derivatives, and accommodated it.

The prefixes distinguish various character encodings as defined in the DVB specifications (DTG D-Book 7 Pt A 8.5.6 and EN 300 468 Annex A), with some Humax extension IIRC.

There may have been slightly different HMT versions in different OEM FW versions.

Yes, the problem certainly arises from the use of these prefixes, and that usage certainly does vary!

All the above helps me to frame my questions much more specifically.

1) Can anybody confirm my supposition that the web interface crop and rename functions do indeed alter/move these three name fields in the way that I have reported? If so, perhaps the documentation might register the fact as a warning to any future coders.

2) Is there any reason why these changes were made by the writers of the respective functions?

3) Is it possible to update the rename function to stop inserting the X'15' character at 029A, which compromises the underlying media list alphabetic sort function?

Thanks and apologies for any unintended apparent criticisms.
 
...
1) Can anybody confirm my supposition that the web interface crop and rename functions do indeed alter/move these three name fields in the way that I have reported? If so, perhaps the documentation might register the fact as a warning to any future coders.

Probably, because it believes, following the DTG spec, that this is a correct tag for a text string in UTF-8 encoding (0x15 in EN 300 468 Annex A; ISO 8859-15, i.e. Latin-9 with €, would be tagged 0x0B), whereas the strings with 106937 are in ISO 6937 encoding.

2) Is there any reason why these changes were made by the writers of the respective functions?

The renaming in the HMT file is done by the hmt utility (webif/lib/ts.class l.481).
Code:
exec /mod/bin/hmt "+setfilename=$newname" "$dir/$newname.hmt"
This is also done in webif/html/browse/crop/execute.jim l.73, but redundantly since it was already done by the rename operation on the line before:
Code:
ts renamegroup "$dir/$shname.ts" $newname
3) Is it possible to update the rename function to stop inserting the X'15' character at 029A, which compromises the underlying media list alphabetic sort function?

The utility would have to be modified, if it were determined to be doing the wrong thing, and rebuilt, or replaced by a Jim version.
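
Until then, the manual hex-editor fix described earlier in the thread could be scripted against a copy of the file. The Python sketch below is only that, a sketch: it assumes the displayed name at 0x029A is NUL-terminated, keeps the file length unchanged by shifting the name left and writing a NUL after it, and ignores the fact that a title containing non-ASCII UTF-8 characters may then be misread once the tag is gone. `strip_utf8_tag` is a hypothetical helper, not part of the hmt utility.

```python
def strip_utf8_tag(data: bytes, offset: int = 0x029A) -> bytes:
    """Return a copy of data with a leading 0x15 at offset removed.

    The rest of the NUL-terminated string moves left by one byte and a NUL
    is written where its last byte was, so the overall length is unchanged.
    """
    if data[offset:offset + 1] != b"\x15":
        return data  # no tag at this offset, nothing to do
    end = data.find(b"\x00", offset)
    if end < 0:
        end = len(data)  # unterminated: treat the rest of the data as the string
    return data[:offset] + data[offset + 1:end] + b"\x00" + data[end:]
```

It would be applied to the bytes of a copied .hmt file and the result written to a new file, leaving the original on the box untouched until the outcome has been checked.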
 
Senior moment?
It appears so. Disguising with a .zip is what I put in the Newbies Guide, and that is what has stuck in my mind, and although I feel sure it was the case at one time I can't confirm it and have no idea whether it was true at the time of writing. All I can say is: where were the proof readers normally so ready to jump down my gullet?
 
The utility would have to be modified, if it were determined to be doing the wrong thing, and rebuilt, or replaced by a Jim version.

Just to be clear, there's no suggestion on my part that it's doing the 'wrong' thing, merely that it appears to be using different conventions for string prefixes than the underlying humax firmware - for which there may well be good reasons of which I am unaware.

However, as I said, I realise that this is a minority issue so won't waste any more of anyone's time. I'll either just live with it or figure out my own solution.

Thanks again for all your helpful comments - and for the incredible support over the years.
 