Comments (6)

pndurette commented on August 25, 2024

Hi @Rom888, that's a tough one!
So the upstream API will introduce a break after 100 characters, and there's no way to control this. That's why gTTS tries to pre-emptively split (tokenize) the text where pauses would naturally occur (e.g. at punctuation), which works pretty well most of the time. But if your input has more than 100 characters without any break that gTTS' tokenizer could split on, there will be a break no matter what.

Edit: So your best bet if you control the input is to introduce punctuation (commas, etc.).
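
To make the two kinds of split concrete, here is a rough sketch in plain Python. It is not gTTS' actual tokenizer or minimizer: the function name `split_text`, the punctuation set, and the "cut at the last space" rule are assumptions made for illustration, and the 100-character limit is simply the figure mentioned above.

```python
import re

MAX_CHARS = 100  # limit mentioned in this thread; an assumption for this sketch


def split_text(text, max_chars=MAX_CHARS):
    """Return (chunk, kind) pairs, where kind describes the break after the chunk.

    "natural": the chunk ends at punctuation or at the end of the text.
    "forced":  the chunk was cut only because it exceeded max_chars.
    """
    # Step 1: split where a pause would naturally occur (after punctuation).
    phrases = [p.strip() for p in re.split(r"(?<=[.,;:!?])\s+", text) if p.strip()]

    chunks = []
    for phrase in phrases:
        # Step 2: a phrase that is still too long is cut at the last space
        # that keeps the chunk within the limit.
        while len(phrase) > max_chars:
            cut = phrase.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no space available: hard cut
            chunks.append((phrase[:cut].strip(), "forced"))
            phrase = phrase[cut:].strip()
        if phrase:
            chunks.append((phrase, "natural"))
    return chunks


if __name__ == "__main__":
    sample = "word " * 30  # about 150 characters with no punctuation at all
    for chunk, kind in split_text(sample.strip()):
        print(len(chunk), kind)
```

Only the "forced" chunks end in a place where a listener would not expect a pause, which is why adding commas to long runs of text helps: it turns forced breaks into natural ones.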

Edit 2: Wondering if I understood your question correctly actually. Do you have an example where this occurs?

from gtts.

Rom888 commented on August 25, 2024

Here is an example:

split-string-1 <split by tokenizer>
split-string-2 <split by tokenizer>
split-string-3 <split by minimizer (because larger than 100 characters)>
split-string-4 <split by tokenizer>

If I understand correctly, gTTS gets all the split strings, makes audio, and then joins all audio fragments into one and adds pauses between those audio fragments.

Is it possible to not add pauses between audio fragments 3 and 4 when joining?

pndurette commented on August 25, 2024

@Rom888 Sorry for the delay.

So what you said is almost correct. gTTS splits the strings (where the speech would typically pause), then generates the audio for each one, and puts the audio bits together. It doesn't add any breaks in the audio because it doesn't have to. What you hear is only the natural break between the end of one audio phrase and the start of the next.

So to answer your question, it's not something we can easily control other than by changing the text that is sent, i.e. with some punctuation, to make it sound at least more natural.

Rom888 commented on August 25, 2024

Okay, do you think we could add an option to gtts-cli, for example:
--cut-if-minimizer=500ms
that trims the end of an audio fragment when that fragment was produced by the minimizer
(such as the audio from split-string-3 in the example above)?

pndurette commented on August 25, 2024

Sorry for the delay.
Hmm, that would be pretty hard. Pretty much the same conclusion as what I wrote in #398 (comment). This library has no knowledge of the data it gets (audio, words, timing), it just saves it to a file.

keisanng commented on August 25, 2024

If there are consistent pauses, you could do some post-processing with MoviePy or FFmpeg on the generated audio to trim them off.
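
As one possible shape for the FFmpeg route, the sketch below builds a command line that uses FFmpeg's `silenceremove` audio filter. The helper name `build_trim_command`, the file names, and the 0.5 s / -40 dB values are assumptions to be tuned by ear, not settings recommended by gTTS. Note that the filter cannot tell a minimizer-induced pause from an intentional one, so it shortens every silence longer than the given duration.

```python
import shlex


def build_trim_command(src, dst, min_silence_s=0.5, threshold_db=-40):
    """Build an ffmpeg command that strips silences longer than min_silence_s.

    The threshold and duration are illustrative guesses, not tested defaults.
    """
    silence_filter = (
        "silenceremove="
        "stop_periods=-1:"  # negative value: also trim silence in the middle
        f"stop_duration={min_silence_s}:"
        f"stop_threshold={threshold_db}dB"
    )
    # -y overwrites the output file without prompting.
    return ["ffmpeg", "-y", "-i", src, "-af", silence_filter, dst]


if __name__ == "__main__":
    cmd = build_trim_command("speech.mp3", "speech_trimmed.mp3")
    # Printed rather than executed, so the sketch runs without ffmpeg installed.
    print(shlex.join(cmd))
```

To actually run it, the returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where `ffmpeg` is on the PATH.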
