Special Characters, sweeper rename &#39; instead of '

free30

Member
Hello
I'm using sweeper and TVDB to rename and remove doubles.
This has worked for a long time without issue, thank you!
But now renaming files with 'special characters' like ' (apostrophe) does not work correctly.
So a program called Quagmire's Mom is named Quagmire's Mom in the WebIf, but in the TV browser it's called Quagmire&#39;s Mom.
This is also creating many doubles: in the past it was named correctly, and now it is keeping a double because the names don't match.

Thanks for any advice.
 

/df

Active Member
It's special characters you have a problem with?

Something is leaving the apostrophe in an HTML-encoded form (&#39;), which is displayed as ' in the WebIf but not on screen.

Either it's sent like that by TheTVDB.com, or some processing on the CF side is changing it. However, the Jim cgi module, which would be the prime suspect in the latter case, only changes <>&", and the only recent change to sweeper and webif that could have affected TVDB doesn't seem to be capable of changing the treatment of HTML entities.

If characters encoded as HTML entities are now being received, maybe the HTML processing in /mod/webif/plugin/sweeper/save.jim can be adapted to handle a slightly wider range of them, something like this. This should probably go somewhere in the tvdb::_extract method:
Code:
# Transform some chars that might be sent as HTML entities. 
# Initially &<>, but add some others (sent by TheTVDB.com?).
# Entity names are case-sensitive but HTML5 adds AMP, etc;
# other syntaxes (eg &#xhhhh;) aren't. So use -nocase at the risk
# of transforming an illegal &APOS;, eg.
set data [string map -nocase {
        &amp; &
        &lt; <
        &gt; >
        &apos; "'"
        &#x0027; "'"
        &#39; "'"
        &#37; %
        &#43; +
        &#32; " "
        &#34; "\""
        &quot; "\""
        &#x0022; "\""
        &#63; "?"
        &#38; "&"
        &#35; "#"
 } $data ]
Or this code for proc HtmlDecodeEntity could be adapted.
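
A fixed table like the one above only catches the spellings it lists (it would miss &#x27;, for instance). As a rough sketch of a more general approach, the proc below decodes any decimal or hexadecimal numeric reference plus the five predefined named entities. The name decodeEntities is made up for illustration; this is not the HtmlDecodeEntity code, nor anything from the webif package. It is written for standard Tcl 8.5+, and I'm assuming, not asserting, that regexp -start and lassign behave the same in the Jim interpreter on the box.

```tcl
# Hypothetical helper (not part of webif or sweeper): decode numeric
# character references and the five predefined named entities.
proc decodeEntities {text} {
    # Named entities. string map makes a single pass, so "&amp;lt;"
    # becomes "&lt;" and is not decoded a second time.
    set named {
        &lt; <
        &gt; >
        &quot; "\""
        &apos; "'"
        &amp; &
    }
    set out ""
    set pos 0
    # Walk through each numeric reference in turn.
    while {[regexp -indices -start $pos {&#([0-9]+|[xX][0-9a-fA-F]+);} $text match num]} {
        lassign $match mstart mend
        lassign $num nstart nend
        # Copy the plain text before this reference, decoding named entities.
        if {$mstart > $pos} {
            append out [string map $named [string range $text $pos [expr {$mstart - 1}]]]
        }
        set ref [string range $text $nstart $nend]
        if {[string match {[xX]*} $ref]} {
            # Hexadecimal form: drop the leading x and read the rest as hex.
            scan [string range $ref 1 end] %x code
        } else {
            scan $ref %d code
        }
        append out [format %c $code]
        set pos [expr {$mend + 1}]
    }
    # Whatever follows the last numeric reference.
    append out [string map $named [string range $text $pos end]]
    return $out
}

puts [decodeEntities "A Picture&#39;s Worth a Thousand"]
# prints: A Picture's Worth a Thousand
```

It does no range checking on the code point, so treat it as a starting point only.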

@af123?
 

free30

Member
Thanks that sounds great.
I guess I am not the only one with this problem.
Is anyone able to implement this? I don't know where to start.
 

/df

Active Member
Not sure I have the answer, but on reflection maybe the right solution is a class method decodeCharEntities for xml.class that can be called from tvdb.class. I can run that up, I expect.

BUMP @af123

Test patch here.
 

free30

Member
Thanks for the suggestions. I hope it gets picked up, as I can't test your patch without clear instructions.
 

/df

Active Member
If you are able to use the File Editor in WebIf>Diagnostics, then you could use these instructions, and this would be helpful to confirm that the patch fixes the problem:
  1. Use the File Editor to open the file /mod/webif/lib/xml.class and save a copy, say /mod/webif/lib/xml.class.org ("org" = original)
  2. In another tab or window of your web browser, load the text of the new version; select the entire text and copy it.
  3. In the File Editor tab/window, select the entire text of the original xml.class; paste the new text to replace it.
  4. Save the result as /mod/webif/lib/xml.class.
  5. In case it's important, restart the system in order to clear any caches (or at the command line, service lighttpd restart).
  6. Test your problem TVDB shows; it will be necessary to use Change>Clear Series Information and then re-associate the series folder with TheTVDB.
If you think this has all gone horribly wrong, you can restore the "org" version of the changed file, or just reinstall the webif package using the WebIf>Diagnostics>Force reinstall function.
 

free30

Member
OK, so I followed your fantastically clear instructions, thank you.
I waited for a show to come along to test it, but it looks like it is not working. I thought maybe WebIf had been updated and removed the change, but no.
So thanks, but even with these changes I still get shows renamed incorrectly, like "A Picture&#39;s Worth a Thousand".
 

/df

Active Member
Bother.

My attempt to reproduce is hampered by the absence of the 2 shows you mention from my thetvdb.com search results. A series URL from thetvdb.com would be helpful.

The TVDB interface caches data assiduously: in memory, while WebIf is running on the server, and persistently in the directory /mod/var/tvdb. The series reset in step #6 is definitely required, and in case, as it appears, that doesn't clear the in-memory cache, it would be prudent to repeat step 5 as well. The on-disk cached episode data is refreshed if it's older than 24 hours, and replaced if the TheTVDB.com download is accessible.
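
The 24-hour rule for the on-disk cache can be pictured with a short check like the one below. It's only an illustration in standard Tcl: cacheIsStale is a made-up name, the real test inside the TVDB class may be written quite differently, and the layout of files under /mod/var/tvdb is assumed rather than checked.

```tcl
# Hypothetical helper: report whether a cached file is older than maxAge
# seconds (24 hours by default). A missing file counts as stale.
proc cacheIsStale {path {maxAge 86400}} {
    if {![file exists $path]} {
        return 1
    }
    set age [expr {[clock seconds] - [file mtime $path]}]
    return [expr {$age > $maxAge}]
}

# List everything in the cache directory with its status.
foreach f [glob -nocomplain /mod/var/tvdb/*] {
    if {[cacheIsStale $f]} {
        puts "${f}: stale, due a refresh"
    } else {
        puts "${f}: fresh"
    }
}
```

Anything reported as fresh here would still be served from the cache, which is why the series reset or the tvdbreset diagnostic is needed to force new data.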

If the TVDB field in the Media Details pop-up says "Found episode using cached values" when first opened, it hasn't cleared the cache properly (although refreshing the episode data should be enough).

You might first try running the tvdbreset diagnostic, using Webif>Diagnostics>Run Diagnostic. This clears the on-disk cache, and you shouldn't lose anything if the original series data is still available from TheTVDB.com. In fact you may get better data if the site has been updated; there doesn't seem to be any other way to get the TVDB interface to update its cached data from the site.
 