Comments (24)
Recording of the "angry" phrases for the emotional dataset is in progress. Please keep in mind that I'm no professional voice artist.
https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-emotional-dataset
@monatis what do you think?
from thorsten-voice.
Recording of the emotional dataset is finished 💬 🥳
It was harder and took longer than I expected to express emotions on phrases that do not match those emotions, but I tried my best.
Please keep in mind that I'm just a guy sharing his voice with you folks; I'm no professional voice artist.
@domcross is now doing his audio optimization magic on the recordings, and once he's done I'll publish them.
Until then, here are two samples of how the results sound:
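Mist, wieder nichts geschafft. (English: "Damn, got nothing done again.")
Es kann doch nicht so schwer sein, einen Ring ins Feuer zu werfen. (English: "It can't be that hard to throw a ring into the fire.")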
The samples are spoken in the following emotion order:
- Normal 🙂
- Disgusted 🤢
- Angry 😠
- Amused 😀
- Surprised 😲
- Sleepy 😔
https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-de-emotional-dataset
from thorsten-voice.
@thorstenMueller Wow, you're doing an amazing job. I can easily hear the emotion in the samples. I strongly believe it will have a substantial impact in the literature.
from thorsten-voice.
@thorstenMueller Yes they are up and running now 😃
Thanks for the reminder and of course for the great dataset you're giving away to the community 😊
By the way, I'm leading the TensorFlow Turkey community, and we hold regular online events to talk about TensorFlow and machine learning more broadly. If you'd like, I'd love to have you as a guest at one of these events to talk about your motivation and experience in building and sharing this dataset. I'm pretty sure it will be a great inspiration for a lot of people. If you agree, you can simply say hello at the email address on my profile so we can sort out the details.
from thorsten-voice.
Thanks @monatis, I highly appreciate your kind words 👍.
If my dataset were mentioned in the literature, it would be an honor for me, but I mainly hope my voice contribution is useful to someone.
from thorsten-voice.
Short update:
- Record "normal"
- Record "angry"
- Record "disgusted"
- Record "sleepy" (I'm currently recording "sleepy")
- Record "amused"
- Record "surprised"
I've published some clips on SoundCloud:
- Angry: https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-emotional-dataset
- Disgusted: https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-emotional-dataset-1
- Sleepy: https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-emotional-dataset-2
A comparison of two sentences in all currently recorded emotions can be heard here:
- "Das ist genau mein Humor." (English: "That's exactly my kind of humor.")
- "Er tut, wie ihm geheißen." (English: "He does as he is told.")
https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-emotional-dataset-3
from thorsten-voice.
@thorstenMueller This is awesome! Congrats on the hard work and the great job you're doing here 👏
Looking forward to the release of the processed full version.
I'm truly happy to witness such great dataset contributions to the community. Can't wait to work with this one 🚀
from thorsten-voice.
Audio optimisation for my emotional dataset contribution is finished (thanks @domcross). I've compressed the audio files and the CSV metadata, and the upload is ready to go. The release is planned as a gift for the Easter holidays 🐰.
See my Twitter account for updates (https://twitter.com/ThorstenVoice) and, of course, this topic here.
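For anyone planning to load the release later, here is a minimal sketch of reading a CSV metadata file that sits next to the audio files. The pipe-delimited `filename|text` layout, the clip names, and the `load_metadata` helper are assumptions for illustration only; the layout of the published files may well differ.

```python
import csv
import io

# Hypothetical sample in an assumed "filename|text" layout (not taken from the release).
SAMPLE_METADATA = (
    "clip_0001|Mist, wieder nichts geschafft.\n"
    "clip_0002|Es kann doch nicht so schwer sein, einen Ring ins Feuer zu werfen.\n"
)


def load_metadata(handle):
    """Return (filename, text) pairs from a pipe-delimited metadata file."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.reader(handle, delimiter="|", quoting=csv.QUOTE_NONE)
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


entries = load_metadata(io.StringIO(SAMPLE_METADATA))
for filename, text in entries:
    print(filename, "->", text)
```

With a real file, `io.StringIO(SAMPLE_METADATA)` would be replaced by `open(path, encoding="utf-8", newline="")`.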
from thorsten-voice.
Great news! Thank you @thorstenMueller and @domcross.
By the way, Google published a parallel version of Tacotron 2, which is non-autoregressive and thus faster, more stable, and more natural according to subjective listening tests. I've started working on this version.
from thorsten-voice.
I wonder how you chose these 304 phrases.
It is quite difficult to find phrases that match the emotion.
from thorsten-voice.
@snakers4 The phrases are not emotion-specific. All sentences are generic, and I just pronounced them in different emotions, as well as I could, since I am no professional voice talent.
from thorsten-voice.
Well, you did a great job, since when I was listening to the "Wütend" phrases, I instantly remembered that scene from "Der Untergang".
On a more serious note, looks like it was very difficult to record ~300 phrases per emotion, since it took ~2 months?
Were there any non-obvious issues? I am asking because we are planning to record something similar as well.
from thorsten-voice.
Also check this out - https://silyfox.github.io/iscslp-98-demo/ - sounds like anime
Similar work to yours in essence
from thorsten-voice.
> Well, you did a great job, since when I was listening to the "Wütend" phrases, I instantly remembered that scene from "Der Untergang".
> On a more serious note, looks like it was very difficult to record ~300 phrases per emotion, since it took ~2 months?
> Were there any non-obvious issues? I am asking because we are planning to record something similar as well.
Thanks :-)
It's not easy to get into the right emotion on sentences that do not match that emotion.
300 phrases in 5 emotions = 1,500 recordings. I didn't record every day, so it took some time until it was finished.
from thorsten-voice.
> 300 phrases in 5 emotions = 1,500 recordings.
When we recorded speakers for "normal", they could do about 15k phrases in 3-4 weeks, also not recording every day.
It looks like recording with emotion is more difficult.
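As a rough back-of-the-envelope check of that comparison, the figures quoted in this thread can be turned into recordings per week. The 8-week span is an assumption standing in for "~2 months", and the two projects differ in speakers and schedule, so this only indicates the order of magnitude.

```python
# Figures quoted in this thread; EMOTIONAL_WEEKS is an assumption for "~2 months".
EMOTIONAL_RECORDINGS = 300 * 5   # 300 phrases in 5 emotions
EMOTIONAL_WEEKS = 8              # assumed
NEUTRAL_RECORDINGS = 15_000      # "about 15k phrases"
NEUTRAL_WEEKS = (3, 4)           # "3-4 weeks"


def per_week(recordings, weeks):
    """Average number of recordings per calendar week."""
    return recordings / weeks


emotional_rate = per_week(EMOTIONAL_RECORDINGS, EMOTIONAL_WEEKS)
neutral_rates = [per_week(NEUTRAL_RECORDINGS, weeks) for weeks in NEUTRAL_WEEKS]

print(f"emotional: {emotional_rate:.1f} recordings/week")
print(f"neutral:   {min(neutral_rates):.1f} to {max(neutral_rates):.1f} recordings/week")
print(f"ratio:     {min(neutral_rates) / emotional_rate:.1f}x or more")
```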
from thorsten-voice.
Also, I found a @thorstenMueller on Telegram. Is that you?
from thorsten-voice.
> Also I have found @thorstenMueller on telegram, is it you?
No, I'm not using Telegram.
You can PM me in the @coqui-ai Gitter chat (https://gitter.im/coqui-ai/TTS).
from thorsten-voice.
Hello.
I hope you're having relaxing Easter holidays 🐰 🙂.
As promised, I've released my emotional "Thorsten" dataset 🥳.
I hope it's useful to someone, and (as always) please keep in mind that I'm no professional voice talent, just a guy sharing his voice.
Information and download links can be found here:
https://github.com/thorstenMueller/deep-learning-german-tts/#dataset-Thorsten-emotional
from thorsten-voice.
I'll close this issue since my emotional dataset has been released. Feel free to keep posting feedback on it here.
from thorsten-voice.
After one release is before the next: new ideas keep coming to mind. See here for "drunk" samples (just pronounced this way; I wasn't drunk while recording ;-) )
https://twitter.com/ThorstenVoice/status/1386060775488430085
from thorsten-voice.
@monatis I'm just curious: did you have time to listen to my emotional dataset and/or find an intended use for it?
from thorsten-voice.
Hi @thorstenMueller, I listened to some samples and they sounded quite promising. Unfortunately, I was dealing with my parents' health problems at that time, so I missed giving feedback. Sorry for that.
I'm currently working on an NLP project in Turkish; after that I want to return to the Emotional Thorsten dataset and run some experiments with the following:
(1) Introducing an emotion vector to the architecture of Tacotron2 and finetuning it for Emotional TTS.
(2) Training few-shot adaptation methods like AdaDurIAN.
Let's see what we can build.
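Idea (1) could be sketched roughly as follows: a one-hot emotion vector is appended to every encoder output frame before the frames reach the decoder. This is only an illustration in plain Python; the names `EMOTIONS`, `one_hot_emotion`, and `condition_on_emotion` are made up here, and an actual Tacotron 2 modification would more likely use a learned embedding inside the model.

```python
# Emotion labels of the dataset; the ordering is an arbitrary choice for this sketch.
EMOTIONS = ["neutral", "disgusted", "angry", "amused", "surprised", "sleepy"]


def one_hot_emotion(emotion):
    """Return a one-hot vector marking the position of `emotion` in EMOTIONS."""
    index = EMOTIONS.index(emotion)  # raises ValueError for unknown labels
    return [1.0 if i == index else 0.0 for i in range(len(EMOTIONS))]


def condition_on_emotion(encoder_outputs, emotion):
    """Append the emotion vector to every encoder frame (a list of floats)."""
    emotion_vector = one_hot_emotion(emotion)
    return [list(frame) + emotion_vector for frame in encoder_outputs]


# Toy "encoder output": 3 frames with 4 features each (hypothetical values).
frames = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]
conditioned = condition_on_emotion(frames, "angry")
print(len(conditioned), "frames with", len(conditioned[0]), "features each")
```

Concatenation keeps the text encoder untouched, so a pretrained model could in principle be fine-tuned with only the decoder input size changed.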
from thorsten-voice.
Hi @monatis.
Thanks for your fast response, and sorry to hear your parents had health problems. Hopefully things are going well now.
I just wanted to make sure you know my emotional dataset is ready for "whatever". Your ideas sound promising.
from thorsten-voice.
> After release means before new ideas are coming to mind. See here for "drunk" samples (just pronounced this way, i'm not drunk while recording ;-) )
> https://twitter.com/ThorstenVoice/status/1386060775488430085
Today I released version 2 of my emotional dataset 🎉.
In addition to emotions from version 1:
- Neutral
- Disgusted
- Angry
- Amused
- Surprised
- Sleepy
It now includes:
- Drunk
- Whispering
Check the details and download it from my GitHub page: https://github.com/thorstenMueller/deep-learning-german-tts/#dataset-Thorsten-emotional
@monatis: just in case it's of interest to you.
from thorsten-voice.