
auto-synced-translated-dubs's Issues

Apparently doesn't work with the free tier + issue with encoding the TTS audio files

I tried to run main.py using the F0 (free) tier of Azure, but I get this error message for a split second: Failed to submit batch synthesis job: Only "Standard" subscriptions for the region of the called service are valid. What is the problem here? Removing unnecessary folders then got me this error: no such file or directory

[Feature Request] Add a configuration option for the output directory

Why

People like to customize the output directory for their workflow and to keep everything cleanly in one place.

How to implement

Just add an option in config.ini, parse the variable in the script, and pass it in.

If you are very busy, or have no idea how to implement it, I (or another contributor) could do that for you...

Is this feature bloat?

Depends. It could be implemented; see if people use it, then remove it if it's unpopular. Also, it doesn't really slow down the script...
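A minimal sketch of the suggested change, assuming the option is read in the script's startup code. The 'output_directory' option and 'SETTINGS' section name are hypothetical, not existing config.ini keys:

```python
import configparser
import os

# Hypothetical sketch: read an (assumed) 'output_directory' option from
# config.ini, falling back to the current hard-coded 'output' folder.
config = configparser.ConfigParser()
config.read('config.ini')  # returns an empty config if the file is missing

output_dir = config.get('SETTINGS', 'output_directory', fallback='output')
os.makedirs(output_dir, exist_ok=True)  # create it so later writes don't fail
```

ConfigParser.get returns the fallback when either the section or the option is absent, so old config files keep working unchanged.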

Having a large subtitles file causes a crash (chunk translation)

The use of batch translation causes the program to crash with a list-index-out-of-range error, as seen here:

...
  File "Auto-Synced-Translated-Dubs\main.py", line 425, in translate_dictionary
    inputSubsDict[key]['translated_text'] = translatedTexts[i]
IndexError: list index out of range

This is due to the program iterating through the chunks of translated text while indexing with a counter that runs over the whole list of texts to translate, causing it to crash if there are more than 100 texts in the .srt file.

The translated list only ever holds 100 chunks of text at a time, but the for loop's index goes beyond that.

main.py

                                             Here's the problem
                                                     |
                                                     v
424| for i, key in enumerate(inputSubsDict):
425|     inputSubsDict[key]['translated_text'] = translatedTexts[i]
426|     # Print progress, overwrite the same line
427|     print(f' Translated: {key} of {len(inputSubsDict)}', end='\r')
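One possible fix can be sketched as follows (illustrative code under the assumptions above, not the project's actual patch): translate in chunks of 100, but collect every chunk's results into one flat list before assigning back, so the index can never outrun a single chunk:

```python
# Sketch of a chunk-safe approach: the flat `translated` list ends up the
# same length as the input, so indexes 0..len(texts)-1 are all valid.
def translate_in_chunks(texts, translate_batch, chunk_size=100):
    translated = []
    for start in range(0, len(texts), chunk_size):
        chunk = texts[start:start + chunk_size]
        translated.extend(translate_batch(chunk))  # flat list grows chunk by chunk
    return translated

# usage with a dummy "translator" standing in for the real API call
texts = [f'line {i}' for i in range(250)]
result = translate_in_chunks(texts, lambda chunk: [t.upper() for t in chunk])
```

With 250 texts, the batch function is called three times (100 + 100 + 50), and the enumerate-style assignment afterwards never runs out of range.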

skip translation

I don't want to translate the subtitles; I only want the project to read out the subtitles with Azure. So, I set skip_translation = True in the config.ini file.
I expected that the project would then not ask me to provide the Google API key (cloud_secrets_file.json) or the DeepL API key, but it still requires me to provide the Google API or the DeepL API.
Would the author of this project please help me?

Consider adapting the translation to the duration

Often, what is done when translating something to be dubbed is adjusting the translation - not the speed - to fit the audio constraints. It's often possible to say things in multiple ways - meme for reference - while still keeping the original meaning or something close to it.

There are articles about controlling the output length of a machine translation, such as this one (I remember reading another but could not find it; I did find a better one in the Edit below), which can be researched for this.

One idea I had and tested uses the fact that many libraries - such as Hugging Face's Transformers - support returning multiple candidate translations. An example: I used the "Helsinki-NLP/opus-mt-tc-big-itc-itc" model, set num_return_sequences=5 to make it return 5 translations, and translated "Ok" (like the meme) to Spanish. It returned "De acuerdo.", "Está bien.", "Bien.", "Muy bien." and "De acuerdo", which are mostly correct translations (at least from what I remember of studying Spanish a long time ago; by the way, the last translation is just the first without a period).

One downside of this idea is that it restricts the choice to models supported by the library, and someone might prefer a proprietary translation model instead. In that case, one possibility is using a summarization model, at least to avoid having to speed up the dub voice to read an overly long translation. Note that I haven't tried this yet, and there is a chance those models might not work well summarizing short sentences.
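The selection step of this idea can be sketched without any ML library at all. The candidate list below is the one from the example above; the chars_per_second speaking rate is a made-up assumption used only to estimate spoken duration:

```python
# Sketch: given several candidate translations (e.g. from
# num_return_sequences=5), pick the one whose estimated spoken duration
# best matches the original clip's duration.
def pick_best_fit(candidates, target_seconds, chars_per_second=15.0):
    # chars_per_second is a rough speaking-rate assumption, not a measured value
    def duration(text):
        return len(text) / chars_per_second
    return min(candidates, key=lambda t: abs(duration(t) - target_seconds))

candidates = ["De acuerdo.", "Está bien.", "Bien.", "Muy bien.", "De acuerdo"]
best = pick_best_fit(candidates, target_seconds=0.3)  # shortest clip wins here
```

A real integration would replace the character-count heuristic with the actual TTS clip lengths, but the candidate-ranking structure stays the same.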

Edit: this 2021 paper from Amazon AI addresses a lot of things related to this project. Its references are quite good too.

Error after setting batch_tts_synthesize = True

Failed to submit batch synthesis job: {
  "code": "Forbidden",
  "message": "Only \"Standard\" subscriptions for the region of the called service are valid."
}
Traceback (most recent call last):
  File "E:\01project\12youtube\02program\Auto-Synced-Translated-Dubs\main.py", line 262, in <module>
    process_language(langData)
  File "E:\01project\12youtube\02program\Auto-Synced-Translated-Dubs\main.py", line 253, in process_language
    individualLanguageSubsDict = audio_builder.build_audio(individualLanguageSubsDict, langDict, totalAudioLength, twoPassVoiceSynth)
  File "E:\01project\12youtube\02program\Auto-Synced-Translated-Dubs\audio_builder.py", line 100, in build_audio
    rawClip = AudioSegment.from_file(value['TTS_FilePath'], format="mp3", frame_rate=nativeSampleRate)
KeyError: 'TTS_FilePath'

I use the latest code. The project can work well when batch_tts_synthesize is set False.
Please check it.
Thank you.

Can we simplify the install instructions on the front page?

On the external requirements:
"You'll need the binaries for a program called 'rubberband' ( http://breakfastquay.com/rubberband/ ). Doesn't need to be installed, just put both exe's and the dll file in the same directory as the scripts."

I need more context than what is given. First, there are no exe's or dll file in the download provided here, and I don't know which scripts I'm supposed to put them with, if I'm overlooking them.


Failed to submit batch synthesis job

[04/28/2023 11:05:10 PM Central Daylight Time] Failed to submit batch synthesis job: {
  "statusCode": 401,
  "value": {
    "code": "Unauthorized",
    "message": "Authentication is required to access the resource."
  }
}
Traceback (most recent call last):
  File "c:\Laurence\main.py", line 282, in <module>
    process_language(langData, processedCount, totalLanguages)
  File "c:\Laurence\main.py", line 268, in process_language
    individualLanguageSubsDict = audio_builder.build_audio(individualLanguageSubsDict, langDict, totalAudioLength, config['two_pass_voice_synth'])
  File "c:\Laurence\Scripts\audio_builder.py", line 76, in build_audio
    rawClip = AudioSegment.from_file(value['TTS_FilePath'], format="mp3", frame_rate=int(config['synth_sample_rate']))
KeyError: 'TTS_FilePath'

Where should I put the pre-translated SRT file, or what else am I doing wrong?

I put the pre-translated SRT file in the same directory as main.py, and also copied it into workingFolder. Then it gives me the following message:
------- 'Auto Synced Translated Dubs' script by ThioJoe - Release version 0.13.1 -------

----- Beginning Processing of Languages -----

----- Beginning Processing of Language (1/1): zh-CN -----
Skip translation enabled. Checking for pre-translated subtitles...
Pre-translated subtitles not found for language: zh-CN. Skipping.

Combine characters does not work on pre-translated subtitles

I have always used pre-translated srt files because I want full control over the dub. I had been using version 0.10.0 until yesterday, when I updated to version 0.14.1 and noticed that the number of audio clips processed was the same as the number of subtitles, resulting in unwanted pauses between 2 or more subtitles that compose a single sentence.

Can this setting be reintegrated?

Problem using the Batch processing (Azure)

When I try to batch process with Azure, it gives me this error.

Translating text using Google...

Waiting for Azure batch synthesis job to finish. Status: [NotStarted]
ERROR: Batch synthesis job failed!
Reason:OK
Traceback (most recent call last):
  File "F:\Auto-Synced-Translated-Dubs-0.14.1\main.py", line 281, in <module>
    process_language(langData, processedCount, totalLanguages)
  File "F:\Auto-Synced-Translated-Dubs-0.14.1\main.py", line 267, in process_language
    individualLanguageSubsDict = audio_builder.build_audio(individualLanguageSubsDict, langDict, totalAudioLength, config['two_pass_voice_synth'])
  File "F:\Auto-Synced-Translated-Dubs-0.14.1\Scripts\audio_builder.py", line 76, in build_audio
    rawClip = AudioSegment.from_file(value['TTS_FilePath'], format="mp3", frame_rate=int(config['synth_sample_rate']))
KeyError: 'TTS_FilePath'

F:\Auto-Synced-Translated-Dubs-0.14.1>

Just for information, I am on the standard subscription, not the free one. So, I tried to deactivate the Azure Batch process, and it shows me the error:

Traceback (most recent call last):
  File "F:\Auto-Synced-Translated-Dubs-0.14.1\main.py", line 281, in <module>
    process_language(langData, processedCount, totalLanguages)
  File "F:\Auto-Synced-Translated-Dubs-0.14.1\main.py", line 267, in process_language
    individualLanguageSubsDict = audio_builder.build_audio(individualLanguageSubsDict, langDict, totalAudioLength, config['two_pass_voice_synth'])
  File "F:\Auto-Synced-Translated-Dubs-0.14.1\Scripts\audio_builder.py", line 76, in build_audio
    rawClip = AudioSegment.from_file(value['TTS_FilePath'], format="mp3", frame_rate=int(config['synth_sample_rate']))
  File "C:\Users\Brown\AppData\Local\Programs\Python\Python310\lib\site-packages\pydub\audio_segment.py", line 773, in from_file
    raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

ffmpeg version 6.0-full_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      58.  2.100 / 58.  2.100
  libavcodec     60.  3.100 / 60.  3.100
  libavformat    60.  3.100 / 60.  3.100
  libavdevice    60.  1.100 / 60.  1.100
  libavfilter     9.  3.100 /  9.  3.100
  libswscale      7.  1.100 /  7.  1.100
  libswresample   4. 10.100 /  4. 10.100
  libpostproc    57.  1.100 / 57.  1.100
[mp3 @ 000001d67b755900] Failed to read frame size: Could not seek to 1026.
workingFolder\1.mp3: Invalid argument


F:\Auto-Synced-Translated-Dubs-0.14.1>python main.py

And it ends up that Azure creates 0 KB .mp3 audio files, and the error that was fixed in #17 reappears. For reference, I am using the latest version of ASTD, 0.14.1.


I tried changing add_line_buffer_milliseconds = 0 to add_line_buffer_milliseconds = 1, because I saw it was a possible cause of the error, but it still didn't work and the issue with empty audio files persists.

Do-not-translate phrases do not work for Hindi and other Indian languages

Hello. When I put some English text in dont_translate_phrases.txt, it is not translated into any language. But when I translate into Hindi and other Indian languages, the translation is not skipped, even though I am putting the Hindi text in the dont_translate_phrases.txt file. Is there something you can help with?

Access blocked: This app's request is invalid

After running main.py, it takes me to the Google login screen with the following error: "You cannot sign in because this app has sent an invalid request. You can try again later or contact the developer to fix the problem. Learn more about this error. Error 400: redirect_uri_mismatch"

HTTP 400 from Google (googleapiclient.errors.HttpError:)

Hello!

I've tried using it, but I'm getting error 400 from Google.

Any ideas? I've tried resetting my client secret, but that didn't work.

Also here is the SRT file:
subtitles.zip

Translating text using Google...
Traceback (most recent call last):
 File "D:\auto-dub\main.py", line 262, in <module>
   process_language(langData)
 File "D:\auto-dub\main.py", line 240, in process_language
   individualLanguageSubsDict = translate.translate_dictionary(individualLanguageSubsDict, langDict, skipTranslation=skipTranslation)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "D:\auto-dub\translate.py", line 169, in translate_dictionary
   ).execute()
     ^^^^^^^^^
 File "C:\Users\pdv\AppData\Local\Programs\Python\Python311\Lib\site-packages\googleapiclient\_helpers.py", line 130, in positional_wrapper
   return wrapped(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
 File "C:\Users\pdv\AppData\Local\Programs\Python\Python311\Lib\site-packages\googleapiclient\http.py", line 938, in execute
   raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://translation.googleapis.com/v3beta1/projects/your-project-name:translateText?alt=json returned "Invalid 'parent'.; Invalid resource name. Missing locations/ part.; Resource type: location". Details: "Invalid 'parent'.; Invalid resource name. Missing locations/ part.; Resource type: location">

Add voice cloning

It would be cool.
I think they have an API, and they write some things about different languages on their web page (you probably have to ask), but the voices sound extremely good: https://beta.elevenlabs.io/
The demo has 10,000 free characters of voice cloning.

Clone Voice?

I have to admit this program is absolutely gold.
I also discovered an AI website that, in addition to dubbing the audio, also clones the original voice in order to maintain tone and even pronunciation quirks (I don't know if I can leave the link here).
What about adding this feature too?

Does not work on machines without a web browser

Since I am running this in a VM without a GUI, it can't start a browser.
The program should show a URL that can be pasted into a browser on the laptop.

--- dialog looks like this ---

/opt/git/Auto-Synced-Translated-Dubs.local# python3 ./main.py

Please login using the browser window that opened just now.
Waiting for authorization. See message above.

The dubbing is not respecting the pause that exists in the subtitles.

I'm a Brazilian content creator and I'm using Google, I couldn't get Azure to work.

I generated 3 voiceovers (English, Spanish and Chinese) to test. The quality is pretty good, but when checking the editor's timeline, in some moments the subtitle pause time is respected, while in others it is completely ignored and very out of sync with what is being shown in the video. There are several moments in the video with no narration where the dubbing did not respect those gaps.

In Chinese, for example, it respected the pauses only 3 times in the entire video, which was totally out of sync.

[Feature Idea] overlay generated voices on original track

A lot of German media does translation in an interesting way:
the original audio is reduced in volume and the translated audio is overlaid on top of it.

I really like this method, and it could save a lot of API calls to the TTS services.

This method would even keep some of the background sounds of the original audio, at least to some extent.

Example

You have an original audio section and a translated audio clip in the video that look like this:

O: Original
T: Translation

O: |----------------------------|
T: |--------------|

Merging these tracks without using the two-pass method could look something like this:

O: |---v------------------^----|
T:       |--------------|

v: turn down the volume
^: turn the audio back up

Possible problems

One problem could be when the translated audio is longer than the original track.
This would still need a second pass to the TTS API, but it would reduce the amount of calls to the API, saving money.

Config

Some of the configuration options could be something like this:

  • whether the generated audio is centered on the original audio
  • what offset the start of this audio has from the start of the section (most of the time it has a short delay when I have seen this method)
  • behaviour for when the translation is longer than the original section
  • volume of the turned down audio

Existing examples

This was done for shows like Top Gear, Pawn Stars, and many documentaries; at least those come to mind.
One example is something like this: https://www.youtube.com/watch?v=121t4E3EM48 (the first thing I found while searching)

There should be compatibility with local and open-source programs

Why?

The reason is to stop relying on these cloud servers entirely. It would also reduce the cost of API credits (local tools do not require credits, and some open-source APIs are self-hostable with configurable servers, which would help).

Examples of these types of projects

Translation

LibreTranslate is an open-source, self-hostable API. Some servers (including the official one) require credits, while others do not.
TranslateLocally and Firefox Translations are local tools that are similar to each other and perform a little better.

Note: do not confuse "Firefox Translations" with "To Google Translate".

Text To Speech

Coqui.ai's open-source engine works similarly to the cloud TTS engines you use. It is a self-hostable API that you could host locally. There are more accurate TTS engines out there (I listed this one as an example).

Implementation:

Translation

For LibreTranslate, you can use LibreTranslate-py as the Python API.
For TranslateLocally, you could probably run the commands from the website via Python.
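A sketch of what the LibreTranslate integration could look like using plain requests instead of a wrapper library. The server URL is a placeholder for wherever you host it; the /translate endpoint and its q/source/target/format fields come from LibreTranslate's public REST API:

```python
import requests

def build_payload(text, source_lang, target_lang):
    # the request body LibreTranslate's /translate endpoint expects
    return {"q": text, "source": source_lang, "target": target_lang, "format": "text"}

def libre_translate(text, source_lang, target_lang, server="http://localhost:5000"):
    # POST to a self-hosted server; servers that require credits also take an api_key field
    resp = requests.post(f"{server}/translate",
                         json=build_payload(text, source_lang, target_lang))
    resp.raise_for_status()
    return resp.json()["translatedText"]
```

Swapping this in behind the project's existing translate step would only require mapping its language codes to LibreTranslate's.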

TTS

For Coqui.ai, you could use the examples provided.

Subprocess.CalledProcessError

Good morning!

Sorry for the question that surely is simple.

I would like to know what is wrong here. I think I have a problem with "totalAudioLength = get_duration(originalVideoFile)",
so I assume I inserted the wrong path to the file.

I tried in these ways:
original_video_file_path = E:\VideosTranslation\video.mp4
srt_file_path = E:\VideosTranslation\subtitles.srt

original_video_file_path = "E:\VideosTranslation\video.mp4"
srt_file_path = "E:\VideosTranslation\subtitles.srt"

or even leaving it in the same folder as main.py and putting simply:

original_video_file_path = video.mp4
srt_file_path = subtitles.srt

But I always get the same error. Is it a problem with some other configuration?

Just to mention that I don't know anything about programming or GitHub either. I may be thinking it is one kind of error when in reality it's something else entirely.

Cloud IAM permission 'cloudtranslate.generalModels.predict' denied. "

Hi,
I'm getting this error even after I successfully logged in to my Google account and 'token.pickle' was created.

    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://translation.googleapis.com/v3beta1/projects/sasa:translateText?alt=json returned "Cloud IAM permission 'cloudtranslate.generalModels.predict' denied.". Details: "Cloud IAM permission 'cloudtranslate.generalModels.predict' denied. "


Thank you.

KeyError on 'break_until_next'

I got the following error while executing the program; how can I fix it?

subsDict[str(int(line)-1)]['break_until_next'] = processedTime1 - int(subsDict[str(int(line) - 1)]['end_ms'])

KeyError: '1'

Restrict maximum and minimum speed factor

I just watched your YouTube video with the Hindi audio track, and there were parts of the video where the AI spoke incredibly fast or incredibly slow. I think there needs to be a compromise between millisecond accuracy and speech speed.

Maybe during the second pass the script could create an array of speed factors, and if an element is too high or too low, try to average it with its neighbours. For example: [..., 1.05, 1.5, 1.1, ...] would get averaged out to [..., 1.2, 1.2, 1.2, ...] or something like that.
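That averaging-plus-clamping idea could be sketched like this (all names and thresholds are hypothetical, not part of the project):

```python
# Sketch: clamp each stretch factor to a range, and replace outliers with
# the mean of themselves and their immediate neighbours first.
def smooth_speed_factors(factors, min_f=0.8, max_f=1.3):
    smoothed = []
    for i, f in enumerate(factors):
        if f < min_f or f > max_f:
            # outlier: average it with its neighbours before clamping
            window = factors[max(0, i - 1): i + 2]
            f = sum(window) / len(window)
        smoothed.append(min(max_f, max(min_f, f)))  # final clamp into range
    return smoothed

result = smooth_speed_factors([1.05, 1.5, 1.1])  # the 1.5 outlier gets averaged down
```

A real implementation would probably iterate until no element exceeds the bounds, since averaging one outlier can still leave it slightly out of range.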

Error in subtitles English translation

Hi,

I'm trying to generate audio from Spanish into English, Portuguese, Russian, and more, and I'm having issues with the English translation. Only with the English version.

The first lines of the original subtitles are not translated into the first lines of the English .srt, which causes two problems: the first TTS clip starts at an incorrect timestamp, and the last TTS clip finishes past the original timestamp. I'll show you (errors marked with asterisks):

**00:00:07,958** --> 00:00:11,773
and while we were trying to steal the cheese from Mickey.exe

2
00:00:11,993 --> 00:00:17,078
Today while browsing Roblox, this game by Hungry Nora was recommended to me

(...)

115
00:09:54,599 --> 00:09:56,349
remember that if you have arrived new, subscribe and activate

116
00:09:56,349 --> **00:00:07,738**
the bell. a big hug and see you in the next videos guys Goodbye! **In the last days we have had enough complications while we were trying to steal Peppa Pig's food**

"Goodbye" is the last word in the original version, and "In the last days..." is the first sentence.

Original:

1
00:00:00,221 --> 00:00:04,194
En los últimos días hemos tenido bastantes complicaciones

2
00:00:04,454 --> 00:00:07,738
mientras intentábamos robarle la comida a Peppa Pig

(...)

261
00:09:57,793 --> 00:09:59,396
y nos vemos en los siguientes videos chicos

262
00:10:00,210 --> 00:10:01,407
Adios!

Am I doing something wrong?

Thanks!

Problem with synchronization

Hello!

I present a problem with the synchronization of the original audio and the translated audio.

Some languages are much more concise than others. For example, English vs Spanish. Normally more words are needed to express something in Spanish than it takes to say the same thing in English.

I give a concrete example:

1
00:00:00,100 --> 00:00:06,195
¿Qué creéis que ocurriría si mezclamos los personajes de Poppy Playtime con los personajes de Rainbow Friends?

As you can see, this phrase needs 6 seconds to be spoken. However, the translation would be this:

1
00:00:00,100 --> 00:00:06,195
What do you think would happen if we mix Poppy Playtime with the Rainbow Friends characters?

And the resulting unsynced audio takes 4 seconds:

1.mp3.zip

When synchronizing, the English audio is forced to 6 seconds, which produces a very unsatisfactory result (very slow).

1.mp3.zip

Would there be a way to tell the program that, in the case of such a difference between the durations of the languages, it should apply a maximum difference of X seconds between the original and the translation? For example, max_diff = 1000 ms. This way the English audio would last 5 seconds instead of 6, and it would not sound so weird.

This is especially useful in videos where there is nobody on camera and synchronization is not strictly necessary (if there are empty gaps without audio, the video could be clipped).
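The proposed cap could be sketched like this (function and parameter names are hypothetical, not part of the project):

```python
# Sketch of a max_diff setting: cap how far a short TTS clip gets
# stretched toward the subtitle slot's duration.
def target_duration_ms(original_ms, tts_ms, max_diff_ms=1000):
    if original_ms - tts_ms > max_diff_ms:
        # the gap exceeds the cap: stretch the dub by at most max_diff
        return tts_ms + max_diff_ms
    return original_ms

# the example above: a 6 s subtitle slot but a 4 s natural English clip
capped = target_duration_ms(6000, 4000)  # stretched to 5000 ms instead of 6000
```

The leftover gap (here 1 s) would then stay silent, or the video could be clipped as suggested.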

Best!

HttpError while running main.py

Hi!
Could someone help me with this error please?

----- Beginning Processing of Languages -----
----- Beginning Processing of Language: es-MX -----
Translating text using Google...

raise HttpError(resp, content, uri=self.uri)

googleapiclient.errors.HttpError: <HttpError 400 when requesting https://translation.googleapis.com/v3beta1/projects/sasa:translateText?alt=json returned "Empty request.". Details: "[{'@type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'contents', 'description': 'No text contents provided.'}]}]">


MacOS support?

You'll need the binaries for a program called 'rubberband' ( https://breakfastquay.com/rubberband/ ) . Doesn't need to be installed, just put both exe's and the dll file in the same directory as the scripts.

Given this requirement, what should I do on macOS for this dependency?

Some Azure dll error

Good day. I get the following when I try to run the script (I tried to follow all the requirements and instructions as closely as possible). My Python version is 3.9, as this repo's readme.md suggests.

------- 'Auto Synced Translated Dubs' script by ThioJoe - Release version 0.7.0 -------
Traceback (most recent call last):
  File "C:\Users\Andrey\Downloads\Auto-Synced-Translated-Dubs-main\main.py", line 13, in <module>
    import TTS
  File "C:\Users\Andrey\Downloads\Auto-Synced-Translated-Dubs-main\TTS.py", line 6, in <module>
    import azure.cognitiveservices.speech as speechsdk
  File "C:\Users\Andrey\AppData\Local\Programs\Python\Python39\lib\site-packages\azure\cognitiveservices\speech\__init__.py", line 8, in <module>
    from .speech import *
  File "C:\Users\Andrey\AppData\Local\Programs\Python\Python39\lib\site-packages\azure\cognitiveservices\speech\speech.py", line 13, in <module>
    from .interop import (
  File "C:\Users\Andrey\AppData\Local\Programs\Python\Python39\lib\site-packages\azure\cognitiveservices\speech\interop.py", line 20, in <module>
    _sdk_lib = load_library.LoadLibrary(lib_path)
  File "C:\Users\Andrey\AppData\Local\Programs\Python\Python39\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "C:\Users\Andrey\AppData\Local\Programs\Python\Python39\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\Users\Andrey\AppData\Local\Programs\Python\Python39\lib\site-packages\azure\cognitiveservices\speech\Microsoft.CognitiveServices.Speech.core.dll' (or one of its dependencies). Try using the full path with constructor syntax.

What can I do about this?

KeyError: 'google_project_id'

In "cloud service settings" I changed the value of "google_project_id" from "your-project-name" to my account's project ID. I think I did it right, but I don't know why I still get this error.

Sections of translated audio slightly overlap.

I'm using the Google APIs for both translation and dubbing. I've tested using multiple languages, including Spanish, Portuguese, and Arabic.

In all of my tests, some of the words overlap. It's almost as if sections of audio were combined, but each section starts before the previous one finishes.

Is there some setting I'm missing?

Here is a 4-second example. The Spanish should be "Hay mucha sintaxis. Hay muchas cosas pequeñas que si te equivocas..." In the subtitle file between the first sentence of that text and the second sentence is a section break. That is the part where the overlap occurs.

overlap_spanish.mov

Skipping translation and doing synthesis only is still impossible

I set skip_translation = true because I only want it to synthesize with Azure. When I run main.py, it gives the following message:
------- 'Auto Synced Translated Dubs' script by ThioJoe - Release version 0.13.0 -------

     ----- [!] Error: client_secrets.json file not found -----

----- Did you create a Google Cloud Platform Project to access the API? -----

Press Enter to Exit...

Please login using the browser window that opened just now. Error: Client secrets must be for a web or installed app.

Please login using the browser window that opened just now.

Traceback (most recent call last):
  File "C:\Users\lzb\Auto-Synced-Translated-Dubs-0.14.1\Scripts\auth.py", line 147, in first_authentication
    GOOGLE_TTS_API, GOOGLE_TRANSLATE_API = get_authenticated_service() # Create authentication object
  File "C:\Users\lzb\Auto-Synced-Translated-Dubs-0.14.1\Scripts\auth.py", line 105, in get_authenticated_service
    flow = InstalledAppFlow.from_client_secrets_file(secrets_file, scopes=API_SCOPES)
  File "C:\Users\lzb\AppData\Local\Programs\Python\Python39\lib\site-packages\google_auth_oauthlib\flow.py", line 201, in from_client_secrets_file
    return cls.from_client_config(client_config, scopes=scopes, **kwargs)
  File "C:\Users\lzb\AppData\Local\Programs\Python\Python39\lib\site-packages\google_auth_oauthlib\flow.py", line 159, in from_client_config
    raise ValueError("Client secrets must be for a web or installed app.")
ValueError: Client secrets must be for a web or installed app.

[!!!] Error: Client secrets must be for a web or installed app.

Error: Something went wrong during authentication. Try deleting the token.pickle file.
Press Enter to Exit...

Empty audio clips

The biggest issue: Azure creates empty audio clips, though not all of them. I updated the code according to #17, disabled batch_tts_synthesize, and set add_line_buffer_milliseconds to 0.
As a side issue, all the clips are downloaded outside the workingFolder and are never deleted.

Here is a screenshot of the terminal:

And the app folder looks like this afterwards

Created a Google Cloud Platform project, but it is still returning an error

I made a Google Cloud Platform project in the Google Cloud console, but it is returning an error. I pasted the project ID, but I was unable to set up billing. I don't know how to correct this.

    ----- [!] Error: client_secrets.json file not found -----

----- Did you create a Google Cloud Platform Project to access the API? -----

Program crashes if no output/workingFolder dirs

The program crashes if the output or workingFolder directories don't exist during execution, a common problem on first-time runs.

...
  File "Auto-Synced-Translated-Dubs\main.py", line 310, in translate_dictionary
    with open(translatedSrtFileName, 'w', encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'output\\video - Spanish - es.srt'

After creating the output directory, it crashes with:

...
  File "Auto-Synced-Translated-Dubs\TTS.py", line 237, in synthesize_text_azure_batch
    for filename in os.listdir('workingFolder'):
FileNotFoundError: [WinError 3] El sistema no puede encontrar la ruta especificada: 'workingFolder'
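A likely fix is to create both folders before they are used. A minimal sketch (not the repo's actual patch; the folder names are the ones from the tracebacks above):

```python
import os

# Create both folders up front; exist_ok avoids an error when they
# already exist, so this is safe to run on every startup.
for folder in ('output', 'workingFolder'):
    os.makedirs(folder, exist_ok=True)
```

Running this once at the top of main.py would cover both crashes shown above.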

HTTP Error 403

I got following problem:

googleapiclient.errors.HttpError: <HttpError 403 when requesting https://translation.googleapis.com/v3beta1/projects/third-framing-374214:translateText?alt=json returned "Cloud Translation API has not been used in project 234603848874 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/translate.googleapis.com/overview?project=234603848874 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.". Details: "[{'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Google developers console API activation', 'url': 'https://console.developers.google.com/apis/api/translate.googleapis.com/overview?project=234603848874'}]}, {'@type': 'type.googleapis.com/google.rpc.ErrorInfo', 'reason': 'SERVICE_DISABLED', 'domain': 'googleapis.com', 'metadata': {'service': 'translate.googleapis.com', 'consumer': 'projects/234603848874'}}]">
