Quick guide to Extract DVB-Subtitles from TS HD file and convert to SRT in minutes

Trev · Oct 31, 2018

Back in May '17, in post #11, I couldn't understand "Why". I still don't.
Why would you want to do this other than the "Because I can" approach?
Can someone tell me please?

fenlander · Oct 31, 2018

Tesseract is a general-purpose OCR engine that has long been the default choice for Linux, where it can be used from the console or incorporated into applications, such as scanning software. So far as I know it does not have a dedicated GUI in either Linux or Windows. So yes, it is a 'stock' engine.
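For anyone who has not driven Tesseract from the console, here is a minimal sketch of calling it from a script. It assumes the `tesseract` binary is installed and on the PATH, that the English language data is present, and that each subtitle has already been saved as an image; the file name `sub_0001.png` and both function names are hypothetical.

```python
import subprocess

def build_command(image_path, lang="eng"):
    """Build the console command: tesseract <image> stdout -l <lang>.

    Using "stdout" as the output base makes Tesseract print the
    recognised text instead of writing a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]

def ocr_image(image_path, lang="eng"):
    """Run Tesseract on one subtitle image and return the text it found.

    Assumes the tesseract binary is installed; raises CalledProcessError
    if Tesseract exits with an error.
    """
    result = subprocess.run(
        build_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(ocr_image("sub_0001.png"))
```

This is only the bare OCR step; whatever tool does the DVB extraction may well wrap Tesseract differently.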

The last files I tried this extraction process on were the final episodes of 'Black Earth Rising' and 'Strangers'. The first is largely set in Rwanda and, apart from a lot of very unfamiliar names, it also contains French, including some doubtful terms like 'genocidaire'. The latter is set in Hong Kong, so it is full of Chinese proper nouns. Some of these resemble English words or acronyms, so the name 'Xo' confused the spell checker, which wanted to render it as 'X0' or 'XO'. There were very few actual character recognition errors: those that did occur mostly involved double l ('ll') or 'm', which in some circumstances can be detected as 'hi', 'nn' or 'hn'. Arguably, it might be better just to turn off the spell checking: this would pretty much eliminate the uncertainty caused by foreign names being spell checked, at the expense of letting a small number of recognition errors through.
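One middle ground between full spell checking and none at all can be sketched as follows. This is not what any particular subtitle tool does; it is just an illustration that only undoes the specific confusions mentioned above ('hi' for 'll', 'nn' or 'hn' for 'm') and leaves a list of known names alone. The confusion table, the word list and the function names are all assumptions made for the example.

```python
# Hypothetical confusion table, based on the errors described above:
# fragment seen in the OCR output -> what the subtitle may really have said.
CONFUSIONS = {"hi": "ll", "nn": "m", "hn": "m"}

def candidates(word):
    """Yield variants of `word` with one confusable fragment swapped back."""
    for wrong, right in CONFUSIONS.items():
        start = word.find(wrong)
        while start != -1:
            yield word[:start] + right + word[start + len(wrong):]
            start = word.find(wrong, start + 1)

def fix_word(word, dictionary, names):
    """Return a corrected word, or the original if no safe fix exists.

    A word is left untouched if it is a known name or already a
    dictionary word; otherwise the first candidate that is a dictionary
    word is used. Unknown foreign names therefore pass through unchanged.
    """
    if word in names or word.lower() in dictionary:
        return word
    for cand in candidates(word):
        if cand.lower() in dictionary:
            return cand
    return word

if __name__ == "__main__":
    # Tiny illustrative word list; a real one would be far larger.
    dictionary = {"hello", "summer", "this"}
    names = {"Xo"}
    for w in ["hehio", "sunnmer", "Xo", "this", "Kigali"]:
        print(w, "->", fix_word(w, dictionary, names))
```

The trade-off is the same one described above: anything outside the confusion table slips through, but names such as 'Xo' are never "corrected".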

fenlander · Oct 31, 2018

@Trev: why do it?

1) My ageing ears have increasing difficulty with a) American TV and b) BBC audio. I like to keep subtitles on.

2) DVB subtitles are over-large, hideous and distracting. I can adjust srt subs to be inconspicuous in a small font at the bottom of the screen, where I can ignore them until I need them.

3) I use my Hummy purely as a recorder (it's in a bedroom). I clean up all recordings and store them on my NAS in mkv or mp4 format for use throughout the house, mostly viewed using a dedicated PC or video streamer. Once a recording is edited to remove unwanted material, including ads, DVB subs are lost anyway.

4) Sometimes I can't get a specific set of subs online. Particularly the case with non-drama programming.

5) I have time to play, it's a hobby and I'm probably a little bit OCD...

We all organise our viewing in our own way. Personally, I'm at a complete loss to understand why anyone would want to be bothered to download iPlayer or YouTube material using a Hummy or to decrypt programmes off-box.

Trev · Oct 31, 2018

Thanks for that. I just wondered why all the effort.

Black Hole · Oct 31, 2018

fenlander said:
Arguably, it might be better just to turn off the spell checking: this would pretty much eliminate the uncertainty caused by foreign names being spell checked, at the expense of letting a small number of recognition errors through.

That's exactly what I was arguing.

EEPhil · Nov 1, 2018

fenlander said:
We all organise our viewing in our own way. Personally, I'm at a complete loss to understand why anyone would want to be bothered to download iPlayer or YouTube material using a Hummy or to decrypt programmes off-box.

I'd certainly agree with the first point(s).
Even though I've written the Windows version of the off-box decryption (following af123's work on this), I can't see why people are using it in place of other methods. Especially when the people using it seem to have the custom firmware. It's not as though I can even use it with my 2000T (at present). Before you ask why I wrote it - I refer you to your point 5. "I have time to play, it's a hobby and I'm probably a little bit OCD..."

fenlander · Nov 1, 2018

OCD makes the world go around - to a very precise schedule

prpr · Nov 1, 2018

"I've got CDO - it's like OCD, but with all the letters in the right order, as they should be!"

fenlander · Nov 1, 2018

Collateralized Debt Obligation?
