Special Characters, sweeper rename: entity for ' instead of '

/df

Well-Known Member
Thanks again for your help.

1. Opening /mod/var/tvdb/73141.xml
This was too big to open in WebIF as were most of the .xml files. I managed to open one of them and it had the incorrectly encoded data in. See attached image 1.
Of course, it's correctly encoded in XML. The problem is if that encoding doesn't get decoded.
2. Running commands in the Jim interpreter.
As you can see below in 3, the result was incorrectly encoded. It took me a while as I had to rename the file back after I had allowed sweeper to rename it before; I'm not sure if this would change the result. I will try to find a fresh file to try it on.
Your test 3 is interesting (#2 failed because the recording whose name was in one of the previous screenshots had been renamed, as you say). Two possible causes:
  • your system has an old/broken /mod/webif/lib/xml.class; I think you checked already; it should have as line 205:
    Code:
    proc {xml decodeCharEntities} {xmlText} {
  • the incorrectly decoded episode name is cached somewhere in the theTVDB.com interface.
I hoped that running the tvdbreset diagnostic would have fixed the second issue, but I can see that there is a binary program /mod/webif/lib/bin/tvdb being run to read the XML and insert it into a SQLite3 series database, used as a cache to avoid having to parse the XML every time. Ideally the necessary decoding would be done in that program, but I believe @af123 has the source. My "Schitt's Creek" series XML has a good test-case where one episode uses the entity for " (correctly decoded) and the next uses the entity for ' (not properly decoded). We could make an auxiliary script that fixes the database just after it's been created (might be a bit slow), or replace the entire program with a script (might be very slow).
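For illustration, the kind of decoding that such a fix-up would have to apply to each episode name can be sketched in shell. This is only a sketch under assumptions: the function name decode_entities and the list of entities are mine, the apostrophe is assumed to arrive as a numeric character reference such as &#039;, and none of this is taken from the tvdb program itself.

```shell
#!/bin/sh
# Sketch only: decode a handful of XML character entities in text read
# from stdin. The entity list is an assumption, not what tvdb implements.
decode_entities() {
    # &amp; is handled last so that "&amp;quot;" becomes "&quot;"
    # rather than being decoded twice.
    sed -e 's/&quot;/"/g' \
        -e 's/&#0*34;/"/g' \
        -e "s/&apos;/'/g" \
        -e "s/&#0*39;/'/g" \
        -e "s/&#[xX]0*27;/'/g" \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&amp;/\&/g'
}

# Example with an undecoded apostrophe reference (assumed form):
printf '%s\n' 'Don&#039;t Worry, It&#039;s His Sister' | decode_entities
```

An auxiliary script would still need to read each name out of the series database and write the decoded value back, which is where the slowness would come from.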
 

/df

Well-Known Member
And I've run up a patch that appears to deal with the problem. The final episode in the test-case is the one using the entity for ", which is correctly decoded by both versions.
Code:
# /mod/webif/lib/bin/tvdb /mod/var/tvdb/287247.xml
Series: Schitt's Creek
Episode: Our Cup Runneth Over
Episode: The Drip
Episode: Don't Worry, It's His Sister
...
Episode: Schitt's Creek feat. Mariah Carey | Dear Class Of 2020
Episode: Schitt's Creek Cast Q&A w/ Jennifer Garner 
Episodes: 91
# ./tvdb /mod/var/tvdb/287247.xml
Series: Schitt's Creek
Episode: Our Cup Runneth Over
Episode: The Drip
Episode: Don't Worry, It's His Sister
...
Episode: Schitt's Creek feat. Mariah Carey | Dear Class Of 2020
Episode: Schitt's Creek Cast Q&A w/ Jennifer Garner 
Episodes: 91
#
 
free30

Member
  • your system has an old/broken /mod/webif/lib/xml.class; I think you checked already; it should have as line 205:
    Code:
    proc {xml decodeCharEntities} {xmlText} {

Yes I have this file.
How can I try your patches?
 

/df

Well-Known Member
The test version of the program is at https://git.hpkg.tv/df/tvdb/raw/branch/test/tvdb.

The best way to install it for testing is:
  • start a telnet or WebShell session to the Humax;
  • enter the following commands in the shell (go to the scratch directory, fetch the program, make it executable, save the current version, install the test version -- below I'm just listing the commands and not an entire shell transcript):
    Code:
    cd /tmp
    wget 'https://git.hpkg.tv/df/tvdb/raw/branch/test/tvdb'
    chmod a+x tvdb
    mv /mod/webif/lib/bin/tvdb /mod/webif/lib/bin/tvdb.org
    mv tvdb /mod/webif/lib/bin/
  • check the installation by verifying the permissions (-rwx--x--x) and size (23827) and, if OK, by running the program with no parameters, which should look something like this:
    Code:
    # ls -l /mod/webif/lib/bin/tvdb
    -rwx--x--x 1 root root 23827 Jan 15 19:56 /mod/webif/lib/bin/tvdb
    # /mod/webif/lib/bin/tvdb
    Syntax: /mod/webif/lib/bin/tvdb [-d] <xml file>
    #
  • run the tvdbreset diagnostic;
  • now try the operations that didn't work before.
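Should the test version misbehave, the copy saved as tvdb.org in the commands above can be put back. A minimal sketch; the function name restore_tvdb and its optional directory argument are only illustrative:

```shell
#!/bin/sh
# Sketch: put back the program saved as tvdb.org by the install steps.
# The function name and the optional directory argument are illustrative.
restore_tvdb() {
    dir=${1:-/mod/webif/lib/bin}
    if [ -f "$dir/tvdb.org" ]; then
        # Overwrites the test version with the saved original.
        mv "$dir/tvdb.org" "$dir/tvdb"
    else
        echo "no saved copy at $dir/tvdb.org" >&2
        return 1
    fi
}
```

Calling restore_tvdb with no argument acts on /mod/webif/lib/bin. It is probably sensible to run the tvdbreset diagnostic again afterwards, though I haven't checked whether that is strictly needed.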
 
free30

Member
That all seems to work for me.
Thank you for your help.
I know it's not important but it's been upsetting me for a long time now. I really like my HDR.
Thanks :)
 
free30

Member
The data from theTVDB.com (as quoted above) disagrees with the transmitted EPG data. One would have to watch the show to know which is correct. The matcher (as described above) believes the series/episode number ahead of the guessed episode name, but you can manually override the episode selection from the TVDB data with the "Change" button.
I have tried the change button but it seems to do nothing.
 
free30

Member
Sorry, I wasn't clear.

I click on change and this page (attached image) comes up.
Then I fill in the field and click search and nothing happens. There is no response at all.
This example, like many on my system, has the correct synopsis, series and episode number but an incorrect name.


1.jpg
 

/df

Well-Known Member
So it has tried to find the episode name by "series and episode number". That means the (non-zero) series and episode numbers were guessed from the synopsis (itself set from the EPG), and the episode name was then set by looking up the episode with matching series and episode numbers in the cached theTVDB.com series data. The guessing occurs in step 2 of the episode name algorithm summarised in post #39.

The Change button brings up the dialogue as you posted, with the "possible" alternative episodes below the fields shown, but in this case there are none. I'm not sure whether this is as intended.

According to theTVDB.com, "You Gotta Strike for Your Right" is episode s15e06 (Combined_season: 15, DVD_season: 0), but the EPG synopsis has s14e06, corresponding to the episode "Roger's Baby" (Combined_season: 14, DVD_season: 13). Clearly there's a glitch in the metadata from theTVDB.com (DVD_season == 0) and perhaps this is what is disturbing the force. The tvdb program extracts the Combined_season to populate the episode cache. These values are inconsistent in the metadata.
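The effect of looking up by series and episode number can be sketched as follows. The two rows are the episodes named above; the row layout and the function name lookup_episode are assumptions for illustration and don't reflect the actual schema of the cache database.

```shell
#!/bin/sh
# Hypothetical cache rows: Combined_season|episode number|episode name
cache=$(cat <<'EOF'
14|6|Roger's Baby
15|6|You Gotta Strike for Your Right
EOF
)

# Print the name of the episode with the given series ($1) and episode ($2).
lookup_episode() {
    printf '%s\n' "$cache" |
        awk -F'|' -v s="$1" -v e="$2" '$1 == s && $2 == e { print $3 }'
}

lookup_episode 14 6   # numbers from the EPG synopsis: Roger's Baby
lookup_episode 15 6   # numbers per theTVDB.com: You Gotta Strike for Your Right
```

Since the name is taken purely from the numbers, a series number that is off by one in the EPG yields a perfectly valid but wrong episode name.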
 
free30

Member
Thank you.
I think what's happening is the DVR synopsis holds the episode being broadcast and the one before or after.
I think it is selecting the wrong one.
That is just a guess.
 

/df

Well-Known Member
Unless you've "renamed" the recording, the synopsis should be what came from the EPG data broadcast with the show (via the HMT sidecar file).

According to the Wikipedia entry for the show, there are series 1-15 that have been both broadcast and sold on DVD, with no reason for different numbering. This Jim script (very slow) lists the episode IDs with Combined_season not matching DVD_season; the mentioned file /mod/tmp/en.xml would be /mod/var/tvdb/73141.xml if I had AD! as a recorded show synced with theTVDB.com.
Code:
# jimsh
Welcome to Jim version 0.79
. source /mod/webif/lib/xml.class
xml test
. set xx [open "/mod/tmp/en.xml" r]
::aio.handle4
. set xxx [xml init [$xx read]]
<reference.<xml____>.00000000000000000000>
. for {set tag ""; set season ""} {1} {$xxx next} {
>    lassign [$xxx next 1] typ val attr etyp 
>    if {$typ eq "EOF"} { break }
>    if {$etyp eq "START"} {
>        set tag $val
>        if {$tag ni {id Combined_season DVD_season}} { set tag "" }
>    } elseif {$typ eq "TXT" && $tag ne ""} {
>        if {$tag eq "id"} {
>            set season ""
>            set id $val
>        } elseif {$tag in {Combined_season DVD_season} && $val ne $season} {
>            if {$season eq ""} {
>                set season $val
>            } else {
>                puts "$id: $season, $tag = $val"
>            }
>        }
>    }
>}
306167: 2, DVD_season = 1
306168: 2, DVD_season = 1
306169: 2, DVD_season = 1
306170: 2, DVD_season = 1
306171: 2, DVD_season = 1
306172: 2, DVD_season = 1
307007: 3, DVD_season = 2
307089: 3, DVD_season = 2
307090: 3, DVD_season = 2
307091: 3, DVD_season = 2
309548: 3, DVD_season = 2
313216: 4, DVD_season = 3
313217: 4, DVD_season = 3
313673: 3, DVD_season = 2
314392: 3, DVD_season = 2
317063: 3, DVD_season = 2
317064: 3, DVD_season = 2
335792: 4, DVD_season = 3
335793: 4, DVD_season = 3
335794: 4, DVD_season = 3
335795: 4, DVD_season = 3
335796: 4, DVD_season = 3
335798: 4, DVD_season = 3
378653: 5, DVD_season = 4
378656: 5, DVD_season = 4
378659: 5, DVD_season = 4
378663: 5, DVD_season = 4
378665: 5, DVD_season = 4
378667: 5, DVD_season = 4
5015817: 12, DVD_season = 11
5015818: 12, DVD_season = 11
5015819: 12, DVD_season = 11
5015821: 12, DVD_season = 11
5015822: 12, DVD_season = 11
5015823: 12, DVD_season = 11
5123244: 12, DVD_season = 11
5123246: 12, DVD_season = 11
5123247: 12, DVD_season = 11
5123248: 12, DVD_season = 11
5166473: 12, DVD_season = 11
5166474: 12, DVD_season = 11
5225253: 12, DVD_season = 11
5225254: 12, DVD_season = 11
5225255: 12, DVD_season = 11
5441126: 13, DVD_season = 12
5493712: 13, DVD_season = 12
5493713: 13, DVD_season = 12
5493714: 13, DVD_season = 12
5493715: 13, DVD_season = 12
5493716: 13, DVD_season = 12
5493717: 13, DVD_season = 12
5493718: 13, DVD_season = 12
5493719: 13, DVD_season = 12
5493720: 13, DVD_season = 12
5521430: 13, DVD_season = 12
5521431: 13, DVD_season = 12
5521433: 13, DVD_season = 12
5546736: 13, DVD_season = 12
5546737: 13, DVD_season = 12
5546738: 13, DVD_season = 12
5546739: 13, DVD_season = 12
5546740: 13, DVD_season = 12
5560899: 13, DVD_season = 12
5560900: 13, DVD_season = 12
5560901: 13, DVD_season = 12
5560902: 13, DVD_season = 12
5670286: 14, DVD_season = 13
5773608: 14, DVD_season = 13
5773609: 14, DVD_season = 13
5775508: 14, DVD_season = 13
5802133: 14, DVD_season = 13
5802134: 14, DVD_season = 13
5802135: 14, DVD_season = 13
5925435: 14, DVD_season = 13
5970948: 14, DVD_season = 13
5970949: 14, DVD_season = 13
5970951: 14, DVD_season = 13
6006925: 14, DVD_season = 13
6006926: 14, DVD_season = 13
6084382: 14, DVD_season = 13
6106624: 14, DVD_season = 13
6106625: 14, DVD_season = 13
6146544: 14, DVD_season = 13
6146545: 14, DVD_season = 13
6203747: 14, DVD_season = 13
6203748: 14, DVD_season = 13
6246774: 14, DVD_season = 13
6246775: 14, DVD_season = 13
6463065: 15, DVD_season = 0
6467665: 15, DVD_season = 0
6467666: 15, DVD_season = 0
6467667: 15, DVD_season = 0
6543804: 15, DVD_season = 0
6543805: 15, DVD_season = 0
6543806: 15, DVD_season = 0
6543807: 15, DVD_season = 0
6565184: 15, DVD_season = 0
6565186: 15, DVD_season = 0
6565190: 15, DVD_season = 0
6579622: 15, DVD_season = 0
6593325: 15, DVD_season = 0
...
7186585: 17, DVD_season = 16
7186587: 17, DVD_season = 16
...
.
This shows that
  • some entries in series 1-4 are actually labelled as 1 series greater;
  • some entries in series 3 have DVD_season 1 lower;
  • all entries for series 12-14 have DVD_season 1 lower;
  • 13 entries for series 15 have DVD_season set to 0;
  • all entries for series 16 have DVD_season set but no DVD;
  • 2 entries for series 17 have DVD_season set but no DVD.
However, Wikipedia and theTVDB agree that "You Gotta Strike for Your Right" is episode s15e06, so the EPG appears to be wrong (but maybe the other two were both from the same source).

Regardless, there ought to be a way of overriding the series number incorrectly inferred from the EPG synopsis -- apart from sweeper, that is.
 
free30

Member
Thank you.
So it's the EPG that is wrong, broadcasting the wrong season number.
Using this season number means it gets named incorrectly despite the rest of the synopsis being right.

Can I change the sweeper rules to fix this?
Or, like you suggest, is there an easy way to add an option to use the synopsis and not the season number?
 

/df

Well-Known Member
Thank you.
So it's the EPG that is wrong, broadcasting the wrong season number.
You'd have to take that up with the producers and broadcasters. Here the Strike episode is s14e06. IMDB and American Dad Wikia have it as s13e06! Best of all, at "American Dad - HOME - The official site for FOX TV in UK", it's s14e05. A G search for American Dad "You Gotta Strike" shows all these numberings and also s13e05 (eg at TBS, the current first-run broadcaster of the show).

I feel the will to live evaporating. Who could care about Covid19 or Vikings in the Capitol when people can't even agree on the numbering of episodes in a TV animation series?
Using this season number means it gets named incorrectly despite the rest of the synopsis being right.
Yes.
Can I change the sweeper rules to fix this?
You could try creating a rule to remove the series/episode information from the synopsis, in the reasonable hope that matching the synopsis with theTVDB data would set the series and episode number and thus the episode name, when you later run the standard rules.

Obvs this wouldn't help where theTVDB is, in some sense (see above), "wrong" in the first place: you'd have to submit corrections to the site.
Or, like you suggest, is there an easy way to add an option to use the synopsis and not the season number?
What I suggest above is my best guess for that.
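To show just the text transformation such a rule would have to perform (not sweeper's own rule syntax, which isn't reproduced here), a rough shell sketch follows. It assumes the synopsis carries the numbering in a form like "(S14 Ep6)"; that format, the sample synopsis and the function name strip_series_info are all assumptions.

```shell
#!/bin/sh
# Sketch: remove series/episode markers of the assumed form "S14 Ep6",
# with or without surrounding brackets, from text read on stdin.
strip_series_info() {
    sed -e 's/[[:space:]]*(\{0,1\}S[0-9]\{1,\} *Ep[0-9]\{1,\})\{0,1\}//g'
}

# Made-up synopsis, for illustration only:
printf '%s\n' 'Example synopsis text. (S14 Ep6)' | strip_series_info
```

With the numbers gone, the later standard rules would have only the synopsis text to match against theTVDB data.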
 

MymsMan

Ad detector
Obvs this wouldn't help where theTVDB is, in some sense (see above), "wrong" in the first place: you'd have to submit corrections to the site.
That is the basic problem with theTVDB: it may claim "You've found the most accurate source for TV and film." but it goes on to say "Our information comes from fans like you". It all depends on individuals supplying correct information to the various sites, and on broadcasters using accurate information, neither of which can be relied on.

I have attempted, with very little success, to use it on some of my granddaughter's series to try to find the specific episode of Peppa Pig she is trying to describe, but mismatched synopses have made the problem worse. Fortunately she can now drive an iPad to find stuff on iPlayer Kids.
 