SRT-To-SSML

Converts SRT subtitle file to SSML file with speech durations.

Note: If looking for a more comprehensive tool for also generating synced and translated dubs, visit my other repo.

Use Cases

Using TTS to generate speech for a video using only subtitles
Automated translation and dubbing of videos while keeping the dub in sync. You can simply translate the text portions of the subtitles before feeding it into the script. This allows the translations of each line remain the same length of the original speech, so the generated speech should theoretically be a drop-in replacement of the original.

How it Works:

It takes the text lines from the subtitle file and puts each on a separate line within the speak tag
It takes the timestamps for the start/end for each subtitle line, and calculates that time difference in milliseconds. Then uses that for the duration attribute for the prosody tag. This tells the TTS how long it should take to say the line, so it will stay in sync with the original video.
- Note: Not every neural TTS service supports/uses the duration feature. Amazon Polly non-neural voices and Azure Speech do, but use their own tags, which this script will automatically use instead.
It also calculates the time difference between the end of one subtitle line and the beginning of the next, and uses that as the time attribute for the break tag at the end of each text line. This is also to keep it in sync with the original video.

Other Notable Features

Automatic tag configuration based on TTS service (currently supports Microsoft Azure and Amazon Polly non-neural voices)
- Note: Currently only Azure Speech seems to support specifying the duration of speech for neural voices. Therefore that is the only service that can properly take advantage of this script. Amazon Polly does too, but only for standard non-neural voices.

SSML Options Changeable With Variables

Language
TTS Voice Name
SSML Version
xmlns Attributes for tag
Whether to include the xmlns:xsi and xsi:schemaLocation attributes
Input and Output file names (Defaults: subtitles.srt for input and SSML.txt for output)
Duration Attribute Name

Example

Input (SRT Subtitle File)

1
00:00:00,140 --> 00:00:05,050
This is an example of a subtitle file with a bunch of random words I've added with various timestamps.

2
00:00:05,240 --> 00:00:13,290
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim

3
00:00:13,480 --> 00:00:14,250
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

4
00:00:14,340 --> 00:00:19,930
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla.

5
00:00:20,130 --> 00:00:23,419
Now some examples of some escaped characters such as & and ' and " and < and > just to name a few

Output

<?xml version="1.0" encoding="UTF-8"?>
<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" version="1.0" xml:lang="en-US"><voice name="en-US-DavisNeural">
	<prosody duration="4910ms">This is an example of a subtitle file with a bunch of random words I&apos;ve added with various timestamps.</prosody><break time="190ms"/>
	<prosody duration="8050ms">Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim</prosody><break time="190ms"/>
	<prosody duration="770ms">veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</prosody><break time="90ms"/>
	<prosody duration="5590ms">Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla.</prosody><break time="200ms"/>
	<prosody duration="3289ms">Now some examples of some escaped characters such as &amp; and &apos; and &quot; and &lt; and &gt; just to name a few</prosody>
</voice></speak>

SSML creates longer times and not in sync.

Thank you for your work on this project. I have an SRT audio file and an SSML Audio file from Amazon Polly.

For some reason which I cannot figure out, the SSML audio files created by Amazon Polly are much longer and out of sync with original SRT timestamps.

Here is a view of the audio files in Resolve showing different timestamps and SSML being longer. I used the example 23 second SRT listed in this project.

1
00:00:00,140 --> 00:00:05,050
This is an example of a subtitle file with a bunch of random words I've added with various timestamps.

2
00:00:05,240 --> 00:00:13,290
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim

3
00:00:13,480 --> 00:00:14,250
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

4
00:00:14,340 --> 00:00:19,930
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla.

5
00:00:20,130 --> 00:00:23,419
Now some examples of some escaped characters such as & and ' and " and < and > just to name a few

This is an example of a subtitle file with a bunch of random words I've added with various timestamps.
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla.
Now some examples of some escaped characters such as & and ' and " and < and > just to name a few

thiojoe / srt-to-ssml Goto Github PK

srt-to-ssml's Introduction

SRT-To-SSML

Note: If looking for a more comprehensive tool for also generating synced and translated dubs, visit my other repo.

Use Cases

How it Works:

Other Notable Features

SSML Options Changeable With Variables

Example

Input (SRT Subtitle File)

Output

srt-to-ssml's People

Contributors

Stargazers

Watchers

Forkers

srt-to-ssml's Issues

Recommend Projects

Recommend Topics

Recommend Org