Comments (6)
Hi @Rom888, that's a tough one!
The upstream API introduces a break after 100 characters, and there's no way to control this. That's why gTTS tries to pre-emptively split (tokenize) the text where pauses would naturally occur (e.g. at punctuation), which works pretty well most of the time. But if your input has more than 100 characters without any break that gTTS' tokenizer could split on, there will be a break no matter what.
Edit: So your best bet if you control the input is to introduce punctuation (commas, etc.).
Edit 2: Wondering if I understood your question correctly actually. Do you have an example where this occurs?
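To make the splitting behaviour described above concrete, here is a minimal sketch. It is not gTTS's actual tokenizer or minimizer: `split_text`, `MAX_CHARS`, and the punctuation set are hypothetical names and choices made only for illustration. It splits on punctuation first and, for any phrase still longer than 100 characters, falls back to cutting at the last space that fits.

```python
import re

MAX_CHARS = 100  # per-request limit mentioned in the comment above


def split_text(text, max_chars=MAX_CHARS):
    """Illustrative two-step split (an assumption, not gTTS's real code)."""
    # Step 1: split after punctuation, where speech would naturally pause.
    phrases = [p.strip() for p in re.split(r"(?<=[.,;:!?])\s+", text) if p.strip()]

    parts = []
    for phrase in phrases:
        # Step 2: a phrase that is still too long is cut at the last space
        # that fits within the limit. This is the cut that causes an
        # unnatural break in the generated speech.
        while len(phrase) > max_chars:
            cut = phrase.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no space available: hard cut
            parts.append(phrase[:cut].strip())
            phrase = phrase[cut:].strip()
        if phrase:
            parts.append(phrase)
    return parts
```

With punctuated input every part ends at a natural pause. With a long unpunctuated run, the second step has to cut mid-phrase, which is where the audible break comes from.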
from gtts.
Here is an example:
split-string-1 <split by tokenizer>
split-string-2 <split by tokenizer>
split-string-3 <split by minimizer (because larger than 100 characters)>
split-string-4 <split by tokenizer>
If I understand correctly, gTTS gets all the split strings, makes audio, and then joins all audio fragments into one and adds pauses between those audio fragments.
Is it possible to not add pauses between audio fragments 3 and 4 when joining?
from gtts.
@Rom888 Sorry for the delay.
What you said is almost correct. gTTS splits the string (where the speech would typically pause), generates the audio for each part, and puts the audio bits together. It doesn't add any breaks in the audio because it doesn't have to: the only break is the natural one between the end of one audio phrase and the start of the next.
So to answer your question, it's not something we can easily control other than by changing the text that is sent, i.e. with some punctuation, to make it sound at least more natural.
from gtts.
Okay, do you think we can add an option to gtts-cli, for example:
--cut-if-minimizer=500ms
and cut the end of an audio fragment if that fragment was produced by the minimizer?
(That is, the audio from split-string-3 in the example above.)
from gtts.
Sorry for the delay.
Hmm, that would be pretty hard. It's pretty much the same conclusion as what I wrote in #398 (comment). This library has no knowledge of the data it gets (audio, words, timing); it just saves it to a file.
from gtts.
If there are consistent pauses, you could do some post-processing with MoviePy or FFmpeg on the generated audio to trim them off.
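As a rough sketch of the FFmpeg route, the snippet below builds a command around FFmpeg's `silenceremove` audio filter. The helper names (`build_trim_command`, `trim_pauses`), the file names, and the default duration and threshold are assumptions for illustration and would need tuning against real gTTS output. Note that this targets every sufficiently long silence in the file, not only the one at a minimizer boundary.

```python
import subprocess


def build_trim_command(src, dst, min_silence_s=0.5, threshold_db=-40):
    """Build an ffmpeg command that trims long silences (values are guesses)."""
    audio_filter = (
        "silenceremove="
        "stop_periods=-1:"                    # negative: also handle silences mid-file
        f"stop_duration={min_silence_s}:"     # silence length that triggers trimming
        f"stop_threshold={threshold_db}dB"    # level below which audio counts as silence
    )
    # -y overwrites the output file if it already exists.
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]


def trim_pauses(src, dst, **kwargs):
    """Run the command; requires the ffmpeg binary to be on PATH."""
    subprocess.run(build_trim_command(src, dst, **kwargs), check=True)
```

For example, `trim_pauses("speech.mp3", "speech_trimmed.mp3", min_silence_s=0.3)` would write a trimmed copy of a file previously saved by gTTS, leaving the original untouched so the two can be compared by ear.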
from gtts.