de-dup and helper files


I've noted the info in that thread, but could anyone give me further information about the matching process used in de-dup/tidy please?

I have been trying to use dedup/tidy on the "Spooks" re-runs on the Drama channel and it isn't going well. Eventually I tracked down the episode information on the Drama website and realised that the info they are putting out does not lend itself to matching. It does not match the episode guides in TVDB (nor the ones in for that matter), so the results were all over the place. Furthermore, it is inconsistent, taking a different form across different seasons, e.g. for some seasons the episode names are given, for other seasons they are not. So I created a helper file from the info, and this seems to be matching a bit more accurately, but not 100%. I'd naively expected this to match 100%, as it will exactly match the synopsis being broadcast.

I've noticed that the matching is reported as being "by cached value", but I do not know what this means, i.e. what is being cached and compared against what, using what algorithm.

If someone could advise on the matching process then maybe I can tweak the helper file. Happy to share the helper file, but there doesn't seem much point if it doesn't work.


If there was no response to your previous post, it's because nobody that's read it has an answer for you. There is no reason re-posting will get a better response.