Rockyrails
New Member
Hi guys,
I've been passively following this incredible forum for a long time and have finally signed up. Thanks for all the input and information - I have learned so much. I would like to give back a bit by showing how to quickly and painlessly extract the subtitles from an HD TS file and convert them to SRT or any other format you wish. DVB subs are very limited in terms of what can play them back. I've been searching forever for a way to do this but never found a concrete solution... lots of suggestions that didn't work, or incredibly elaborate methods that need all kinds of software and never seem to work either. Today I found the solution, which has been in front of me all this time.
For this walkthrough I have used an hour-long HD TS file, decrypted and copied to my PC, which was edited in VideoRedo in a way that keeps the DVB subtitles intact. Before you start:
(i) Download and install 'Subtitle Edit'.
(ii) Download 'Tesseract OCR' (tesseract-ocr-setup-3.02.02.exe) and install.
(iii) Copy & paste the 'tessdata' folder and 'tesseract.exe' file from C:\Program Files (x86)\Tesseract-OCR to C:\Program Files (x86)\Subtitle Edit\Tesseract. Agree to move and replace the files of the same name already in the folder.
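If you would rather script step (iii) than drag files around in Explorer, a rough Python sketch could look like the one below. The function name copy_tesseract_files is made up for this example, and it assumes Python 3.8 or newer (for dirs_exist_ok). It overwrites files of the same name, just like agreeing to the replace prompt.

```python
import shutil
from pathlib import Path


def copy_tesseract_files(tesseract_dir: Path, target_dir: Path) -> None:
    """Copy tesseract.exe and the tessdata folder into target_dir.

    Files of the same name that already exist in target_dir are replaced.
    """
    target_dir.mkdir(parents=True, exist_ok=True)

    # Single executable: copy2 overwrites an existing file of the same name.
    shutil.copy2(tesseract_dir / "tesseract.exe", target_dir / "tesseract.exe")

    # Whole folder: dirs_exist_ok=True merges into an existing tessdata folder
    # and overwrites same-named files inside it (Python 3.8+).
    shutil.copytree(
        tesseract_dir / "tessdata",
        target_dir / "tessdata",
        dirs_exist_ok=True,
    )
```

With the install locations from this post you would call it as copy_tesseract_files(Path(r"C:\Program Files (x86)\Tesseract-OCR"), Path(r"C:\Program Files (x86)\Subtitle Edit\Tesseract")). Writing under Program Files will most likely need an elevated (administrator) prompt.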
Now we're good to go!
1. Here is my TS HD file with DVB Subtitles 'on' in VLC
2. Now open Subtitle Edit
Drag your TS file onto the white area on the left side of the program, and wait between 30 seconds and a minute for file parsing to complete (my one-hour HD video took about 45 seconds to parse). Once completed, the program automatically opens the following window:
Make sure to use the settings as pictured, with 'OCR via Tesseract' chosen in the top left and the settings on the right side exactly as shown (you can play around with these when you feel more confident). Now click the 'Start OCR' button (do NOT click OK!) and let the program do its magic. My one-hour, dialogue-heavy TS file took 6 minutes to complete, and you can watch the OCR in action.
3. When the OCR has finished its work, you'll be left with something a bit like this:
You will be amazed at the incredible accuracy of the OCR via Tesseract setting. It really is about 99.8% accurate! Some of the problems it encountered and attempted to fix can be found on the right side window, but as you scroll down and compare those lines with the fixed lines in the bottom left window, you'll see that most have been fixed automatically by the program. You could spend a couple of minutes clearing the odd error if you want. The top window shows you the original DVB subtitles if you need something to compare against. Now, click 'OK' in the bottom right of the screen.
4. The program returns to its original window, and you can see your shiny new subtitles in the left window, just waiting to be edited and saved!
I normally click 'Tools - Fix Common Errors - Next - Apply Selected Fixes - OK', and then, also under Tools, 'Merge Short Lines' and 'Split Long Lines'. That cleans up the subs so they are ready to save. Then go to File and Save in whichever format you prefer. Save the file in the same folder as the TS file and give it exactly the same name as the TS file, apart from the extension. (Very important!!)
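To make the naming rule concrete, here is a tiny Python sketch that works out what the subtitle file should be called for a given TS file. The function name matching_subtitle_path is just something I made up for illustration, and it assumes you are saving as SRT unless you pass a different extension.

```python
from pathlib import Path


def matching_subtitle_path(ts_path: str, extension: str = ".srt") -> Path:
    """Return the subtitle path that sits next to ts_path with the same base name."""
    # with_suffix only swaps the final extension, so dots earlier in the
    # file name (e.g. "My.Show.ts") are left alone.
    return Path(ts_path).with_suffix(extension)
```

So a recording called My.Show.ts should get a subtitle file called My.Show.srt in the same folder.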
5. Once saved and closed, open Subtitle Edit again, and drag your new subtitle file in. It will automatically load the video onto the right hand part of the screen:
Make sure you click the 'Adjust' tab on bottom left. Now you can play the video and alter the position of the subtitles if you need to. I only really use the 'Set start and Off-set the rest' button under 'Adjust'. If the original DVB Subtitles were correctly in sync with the TS file, then you probably won't need to change any positions. Now save again!
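For anyone curious what shifting subtitle timings amounts to in an SRT file, here is a minimal Python sketch. It is not how Subtitle Edit does it internally (I don't know that), and it is simpler than the button: it shifts every cue in the file by the same number of milliseconds rather than only the cues from the selected line onwards. The names shift_timestamp and shift_srt are my own.

```python
import re

# SRT timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match: re.Match, offset_ms: int) -> str:
    """Return one timestamp moved by offset_ms, clamped so it never goes below zero."""
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    total = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis + offset_ms
    total = max(total, 0)
    hours, rest = divmod(total, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt(text: str, offset_ms: int) -> str:
    """Shift every cue in SRT text by offset_ms (positive = later, negative = earlier)."""
    shifted = []
    for line in text.splitlines():
        # Only timing lines contain the "-->" arrow; cue numbers and
        # subtitle text pass through unchanged.
        if "-->" in line:
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, offset_ms), line)
        shifted.append(line)
    return "\n".join(shifted)
```

For example, shift_srt(srt_text, 2500) moves all subtitles 2.5 seconds later, and a negative number moves them earlier.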
6. So in order to use your new subtitle file, you need to convert your TS file to MKV. I use MKV Merge for this.
As shown, drag in your TS file and your new SRT file, and hit the 'Start Muxing' button. In a few seconds, you have an MKV file with subtitles which will play on a huge number of devices.
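If you have a lot of files, the same muxing can be done from the command line instead of the GUI, since the MKVToolNix package also ships a command-line tool called mkvmerge. Below is a hedged Python sketch that builds and runs the command. It assumes mkvmerge is on your PATH, and the function names build_mkvmerge_command and mux are my own. I have not checked how it treats the original DVB subtitle track, so look at the resulting file before deleting anything.

```python
import subprocess
from pathlib import Path
from typing import List


def build_mkvmerge_command(ts_path: str, srt_path: str) -> List[str]:
    """Build the mkvmerge call that muxes the TS file and the SRT file into an MKV."""
    # Output goes next to the TS file, same name but with an .mkv extension.
    output = str(Path(ts_path).with_suffix(".mkv"))
    return ["mkvmerge", "-o", output, ts_path, srt_path]


def mux(ts_path: str, srt_path: str) -> None:
    """Run mkvmerge and fail only on a real error."""
    result = subprocess.run(build_mkvmerge_command(ts_path, srt_path))
    # As far as I know mkvmerge exits with 1 when it finished with warnings
    # and 2 when it hit an error, so only 2 and above is treated as a failure.
    if result.returncode >= 2:
        raise RuntimeError(f"mkvmerge failed with exit code {result.returncode}")
```

Calling mux("show.ts", "show.srt") should leave a show.mkv in the same folder.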
All done! Please don't be put off by my instructions. With an hour of practice, you should be able to complete the entire process from start to finish in about 10 - 12 minutes (which includes the 6 minutes of OCR on the original TS file)! I have tested this on 4 of the 5 original terrestrial Freeview channels. Hope it helps.