TheTVDB.com Integration

A few minor points:

The Series selection popup needs to be scrollable. I tried to add Grand Designs, but the list of matching series exceeded the screen size, with the one I needed (of course) off the top of the screen.
Fullscreen capture 28102015 234130.jpg

It seems odd/unnecessary to have the Save and Cancel buttons on a different line on the season selection panel
Humax HDR Fox T2 (Humax) - Mozilla Firefox 29102015 100638.jpg

Terminology: you use the word series both for different programs and for different series (seasons) of the same program. Whilst this is correct British English usage, it can be confusing to use the word for different purposes in close proximity, and it might be clearer to use the Americanism Season when talking about series numbers.

It would be quite nice to have, in addition to the link to the series database, links to the relevant theTVDB.com pages from the Series selector list, the Folder image and individual program details.

theTVDB.com also has individual episode images for many programs. Would it be possible to provide an option to use those, if available, as program thumbnail images?
 
Thanks for the feedback, I won't be able to get to this for a few days but I will sort out the selection box size/scroll issue and look at the other things you've raised; I can't see any reason not to implement the links* and additional thumbnails but I can't bring myself to call them seasons.

* The link to the internal database screens was to aid development and debugging - I think they can just be replaced with links to theTVDB web site.
 
For Grand Designs Australia the Channel 4 synopsis text is significantly different from that in the TVDB database,
so series 5 episode 2 'Mt Eliza Modern' was matched to 's5e5 - Port Melbourne Urban Green'.
A Melbourne couple try to build a lightweight, modern house on a perfect plot high up on Mount Eliza. But as well as a tight schedule they face costly redesigns and long delays.
I think the only match is the word 'modern'; it didn't match the word 'Eliza' in the title.
Code:
4996332     5     1     The Graceville Container House     This feisty and resourceful couple decide to trial a whole new way of building, starting with 10, then 20, then 31 steel shipping containers that carpenter Todd, plans to crane into their suburban block, then somehow weld into a flood proof family home - all for just $400,000.
4996333     5     2     Mt Eliza Modern     Georgina has been drawing and designing homes since she was a child, but has never built anything - until now. She and her husband are putting her plans into action building a modernist design.
4996334     5     3     Claremont Origami     Why would an architect at the pinnacle her career and creative powers ditch the drafting desk, don steel cap boots and take up a blowtorch? Because for Ariane Prevost, that's exactly the antidote to a lifetime of designing houses for other people - 102 in fact.
4996335     5     4     Foxground Pavilion     After 30 years as a civil engineer, Joe Cato has built more roads than the Romans. But a few years ago he and wife Maura made the active decision to slow everything down - to sell their successful construction business, and spend more time with their three children.
5025317     5     5     Port Melbourne Urban Green     Ian and Ann are challenging the norms in their suburb to build a modern and sustainable home that is clad in water tanks. It's a game of invention and ingenuity for a $1.8 million investment.
5025318     5     6     Toowoomba English Farmhouse     Sarah and Alistair bring a piece of the English countryside to Sarah's hometown. From the pitched roof to the European interiors, they are committed to authenticity - with a touch of fantasy.
5025319     5     7     Williamstown Bluestone Cottage     Jason and Jennifer's 15- year-old derelict bluestone cottage was left almost frozen in time. They plan to restore it and add a modern, industrial structure that will contrast the heritage frontage.
5025320     5     8     Brookfield Spotted Gum House     Millie and Andrew are bringing their dream of life on a farm to fruition, including a home that is unique. Projecting out of the landscape and clad in natural materials, it's a new take on rural living.
5045324     5     9     Pipers Creek Strawbale House     Like many people, Dean and Sherril Lamb yearn for a simpler existence, for them and their three children. But unlike most people, they're actually going to try and make it happen. They've sold their successful fruit shop and home in Warragul and bought 40acres in Pipers Creek in country Victoria … all in the pursuit of total self sufficiency.
5045325     5     10     Faraday Aussie Bush House     Before Matt McClelland's wife Anne died six years ago, they'd been looking for a rural property to build on - a place to call home for them and for their four adult children to come to visit. So when Matt stumbled across 40 acres in central Victoria with spectacular views to Mt Alexander's granite hill side, he knew he'd found the spot.
 
It's difficult if the episode name can't be extracted from the broadcast synopsis.
The current algorithm is to attempt a synopsis phrase match, starting with phrases containing 5 words and working down to just single words. Any phrases found are scored at twice the phrase length which gives weight to longer phrases. It doesn't currently look at the episode names in the database though - I'll change that - can you see if the version I'm about to publish is any better?
(I've also started to look at some of your comments earlier so at least the episode selection dialogue should fit on your screen now!)
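For anyone following along, here is a rough Python sketch of the scoring described above. It is not the webif code; the function names are made up for illustration, and it assumes that a shorter phrase contained in a longer matching phrase is scored again, which the description doesn't say either way.

```python
import re

def words(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def phrases(tokens, n):
    """Return the set of all n-word phrases (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def score(broadcast, candidate, max_len=5):
    """Score a database synopsis against the broadcast synopsis.

    Works down from max_len-word phrases to single words; every phrase
    found in both texts scores twice its length in words.
    """
    b, c = words(broadcast), words(candidate)
    total = 0
    for n in range(max_len, 0, -1):
        shared = phrases(b, n) & phrases(c, n)
        total += 2 * n * len(shared)
    return total

def best_match(broadcast, episodes):
    """episodes maps an episode label to its database synopsis."""
    return max(episodes, key=lambda ep: score(broadcast, episodes[ep]))
```

With this kind of scoring a single shared common word such as 'modern' is enough to pick a winner when nothing else overlaps, which would be consistent with the Grand Designs Australia mismatch reported above.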
 
Just found a weird failure on one of the Diners, Drive-ins and Dives episodes.

Don't know where it found the info to rename it to s12e11 - Breakfast, Lunch and Dinner

2015 11 03 - Triple D.png
 
If you click on the little + next to the red text it will show you the synopsis that is held in theTVDB database - there must be some overlap in words there.
 
Strange, just taken another screen grab with the synopsis open and noticed that the episode number has changed to s12e10

2015 11 03 - Triple D b.png
 
The "From Kraut to Couscous" looks like an episode name - if you click on the number at the right hand side of the TVDB logo, you'll get a new browser window showing you the series database. Can you see if that episode name is in there or if they have misspelled it somehow?
 
Series selection scrolling works well :)
Problem with episode of Father Brown (report in post #7) now resolved :)
Grand Designs (UK) matched successfully on synopsis text :)
Grand Designs Australia matched a different incorrect episode (s5e4 - Kyneton Flat Pack) :(

It was never going to be a 100% positive report ;)
When there is so little in common between the broadcast synopsis and the TVDB one, matching is never going to be perfect, so I think the 'De-duplicate / tidy this Folder' panels need improving to allow manual selection of an episode from a list for the cases where automation fails.
Similarly for Dexter, where most episode synopses start with the phrase 'Razor-sharp drama series centring on a serial killer who also works for the police' before the actual episode synopsis, and the words 'serial killer', 'works' and 'police' are causing frequent mismatches to s1e1. A manual selection option would be easier than trying to create a Sweeper rule to remove the phrase from the synopsis.

Some more, simpler to implement, suggestions for dedup:
  • Where episode numbers in the synopsis are of the form 1/10, leave the /10 out of title and file names
  • Use 2 digits for episode numbers (and possibly series numbers) so that when sorted by name episode 10 doesn't appear before episode 1
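
A small Python sketch of the two suggestions above, purely to illustrate the idea. The function name is hypothetical and this is not how dedup does it; the pattern is deliberately naive and would also rewrite other slash-separated numbers such as dates.

```python
import re

def tidy_episode_number(text):
    """Rewrite 'N/M' episode markers as a zero-padded 'NN'."""
    return re.sub(r"\b(\d+)/\d+\b", lambda m: m.group(1).zfill(2), text)

# A plain string sort puts e10 before e2 unless the numbers are padded.
unpadded = sorted(["e10", "e2", "e1"])
padded = sorted(["e10", "e02", "e01"])
```

Here `unpadded` comes out as e1, e10, e2, while `padded` comes out in the expected order.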
 
Or at least before episode 2. Many sort systems now use an intelligent sort which recognises numeric strings and sorts them appropriately - but my habit is to use leading zeros to avoid problems with old-skool sorters as you say.
 
Cous Cous is 2 words in the database.
That would do it : )

The current matching algorithm is the best one I've come up with so far, but I haven't spent a lot of time on it.
Any suggestions for improvements?
 
I think it's working very well so if the odd item sneaks through it's not a problem.

Having the option to select the correct episode from the TVDB would be nice though.

Any thoughts on why the episode number changed from my 1st to 2nd screen grab?
 
I presume all matching is case insensitive.

Does the current algorithm search for phrases from the program synopsis in the TVDB or vice versa? I am not sure whether it would make any difference.

Is the TVDB title field handled any differently from the synopsis? I would suggest a higher match weight if words from the TVDB title are found within the first few words of the program synopsis.

It might be nice to attempt to handle word variations and abbreviations, e.g. cous cous / couscous, mount / mt / mnt, road / rd etc., but the length of the variation list and the search time could quickly spiral out of control.
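
To make the last two suggestions concrete, a minimal Python sketch of what I have in mind. The variant table, the function names, the window size and the weight are all invented for illustration, and nothing here reflects how the webif actually works.

```python
import re

# Hypothetical variant table: single-word variant -> canonical form.
VARIANTS = {
    "mt": "mount",
    "mnt": "mount",
    "rd": "road",
}
# Multi-word variants are joined before the text is split into words.
JOINED = {"cous cous": "couscous"}

def normalise(text):
    """Lower-case, join known multi-word variants, map abbreviations."""
    text = text.lower()
    for phrase, joined in JOINED.items():
        text = text.replace(phrase, joined)
    tokens = re.findall(r"[a-z0-9]+", text)
    return [VARIANTS.get(t, t) for t in tokens]

def title_bonus(title, synopsis, window=8, weight=4):
    """Extra points for each title word found near the start of the synopsis."""
    head = set(normalise(synopsis)[:window])
    return weight * sum(1 for w in set(normalise(title)) if w in head)
```

Even a short table like this needs one lookup per word, so the cost grows with every variant added, which is the spiralling I was worried about.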

Any thoughts on why the episode number changed from my 1st to 2nd screen grab?

When did you take the screen grab? There was a change to the webif on Monday night which affected the algorithm.
 
Can a folder support more than one TVDB series id?

Channel 4 have started a new Grand Designs variation, Grand Designs: House of the Year, with its own TVDB entries and episode numbering 1-4, but it uses the same series CRID as the main series and so the existing Humax schedule entry and target folder.

I could modify the schedule entry or use Sweeper to move recordings but I would prefer not to have to create yet another folder just to keep the TVDB support happy.

How often are the TVDB databases on the Humax refreshed from the web to pick up new episodes? The TVDB currently only has an entry for the first episode of the new series.
 
Can a folder support more than one TVDB series id?
No, not at present.
How often are the TVDB databases on the Humax refreshed from the web to pick up new episodes? The TVDB currently only has an entry for the first episode of the new series.
Every 24 hours. I'm not sure how proactive they are at updating the databases though. I've had problems with the new Doctor Who series for example.
 
The next version will have a couple of extra settings:

2015-11-06 13_58_18-Humax HDR Fox T2 (humax).png

The configurable episode prefix is used by dedup when renaming recordings and files. I've set the default to be as shown above and %e is padded to two digits as requested.
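
As a rough illustration only, prefix expansion of this kind could look like the Python sketch below. Only %e and its two-digit padding come from the description above; the %s placeholder for the series number, the example template and the function name are assumptions, and the real default is the one in the screenshot.

```python
def expand_prefix(template, series, episode):
    """Expand %s (series number, assumed) and %e (two-digit episode number)."""
    return (template.replace("%s", str(series))
                    .replace("%e", str(episode).zfill(2)))

# Hypothetical template; the actual default may differ.
example = expand_prefix("s%se%e - ", 5, 2)
```

With that assumed template, series 5 episode 2 would be prefixed with 's5e02 - '.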

The additional diagnostic output will show more information when matching using synopsis text, e.g.:

2015-11-06 14_02_54-Humax HDR Fox T2 (humax).png

The first set of information shows the common phrases found in both the broadcast and database synopses, in descending order of number of words. The second set shows the episode IDs and points scored. In this case, the circled episode wins with 72 points.
I am considering tweaking the algorithm to give slightly more weight to phrases found in the recording name over the synopsis but I'm not sure about that yet. In the example above, the word Team would hit and for many episodes it's likely that the whole phrase Time Team would be present in both.

ian_j - if you still have that Couscous episode around, I'd be interested in the additional diagnostic information from that when I upload the new version of webif tonight.
 