Because there exist some interesting papers based on emotional speech datasets i've de

Recording free emotional german dataset about thorsten-voice HOT 24 CLOSED

thorstenmueller commented on May 24, 2024 2

Recording free emotional german dataset

from thorsten-voice.

Comments (24)

thorstenMueller commented on May 24, 2024 6

Recording of "angry" phrases for the emotional dataset is in progress. Please always keep in mind that I'm no professional voice artist.
https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-emotional-dataset

@monatis what do you think?

thorstenMueller commented on May 24, 2024 5

Recording of emotional dataset is finished 💬 🥳
It was harder and took longer than I thought to express emotions on phrases that do not match those emotions, but I tried my best.

Please keep in mind that I'm just a guy sharing his voice with you folks; I'm no professional voice artist.

@domcross is now doing his audio optimization magic on the recordings, and once he's done with that I'll publish them.

Until then, here are two samples of how the results sound:

Mist, wieder nichts geschafft.
Es kann doch nicht so schwer sein, einen Ring ins Feuer zu werfen.

Samples are spoken in the following emotion order:

Normal 🙂
Disgusted 🤢
Angry 😠
Amused 😀
Surprised 😲
Sleepy 😔

https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-de-emotional-dataset

monatis commented on May 24, 2024 1

@thorstenMueller Wow, you're doing an amazing job. I can easily hear the emotion in the sample. I strongly believe that it will have a substantial impact on the literature.

monatis commented on May 24, 2024 1

@thorstenMueller Yes they are up and running now 😃
Thanks for the reminder and of course for the great dataset you're giving away to the community 😊
By the way, I'm leading the TensorFlow Turkey community, and we hold regular online events to talk about TensorFlow and broader machine learning topics. If you would like to, I'd love to have you as a guest at one of these events to talk about your motivation and experience in building and sharing this dataset. I'm pretty sure that it will be a great inspiration for a lot of people. If you agree, you can simply say hello at the email address on my profile to discuss the details.

thorstenMueller commented on May 24, 2024

Thanks @monatis, I highly appreciate your kind words 👍.
If my dataset were mentioned in the literature, it would be an honor for me, but I just hope my voice contribution is useful to someone.

thorstenMueller commented on May 24, 2024

Short update:

I've published some clips on SoundCloud:

A comparison of two sentences in all currently recorded emotions can be heard here:

"Das ist genau mein Humor."
"Er tut, wie ihm geheißen."
https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-emotional-dataset-3

monatis commented on May 24, 2024

@thorstenMueller This is awesome! Congrats on the hard work and the great job you're doing here 👏
Looking forward to the release of the processed full version.

I'm truly happy to witness such great dataset contributions to the community. Can't wait to work with this one 🚀

thorstenMueller commented on May 24, 2024

Audio optimisation for my emotional dataset contribution is finished (thanks @domcross). I've compressed the audio files and CSV metadata, and the upload is ready to go. The release is planned as a gift for the Easter holidays 🐰.
See my Twitter account for updates (https://twitter.com/ThorstenVoice) and, of course, this topic here.
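
For readers who want to check such a release after downloading it, here is a minimal sketch of counting clips per emotion from a CSV metadata file. The thread does not describe the file layout, so the pipe-delimited `clip_id|emotion|transcript` rows and the function name `count_clips_per_emotion` are purely hypothetical; adapt them to whatever the published files actually contain.

```python
import csv
import io
from collections import Counter


def count_clips_per_emotion(metadata_file):
    """Count rows per emotion in a pipe-delimited metadata file.

    Assumed (hypothetical) row layout: clip_id|emotion|transcript
    """
    reader = csv.reader(metadata_file, delimiter="|", quoting=csv.QUOTE_NONE)
    counts = Counter()
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        counts[row[1].strip().lower()] += 1
    return counts


if __name__ == "__main__":
    # In-memory stand-in for a real metadata file (sentences quoted from this thread).
    sample = io.StringIO(
        "0001|angry|Mist, wieder nichts geschafft.\n"
        "0002|amused|Mist, wieder nichts geschafft.\n"
        "0003|angry|Das ist genau mein Humor.\n"
    )
    print(count_clips_per_emotion(sample))
```

With a real file, the same function could be called on `open(path, encoding="utf-8", newline="")`; a per-emotion count that differs between emotions would be a quick hint that some recordings are missing.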

monatis commented on May 24, 2024

Great news! Thank you @thorstenMueller and @domcross.
By the way, Google published a parallel version of Tacotron 2, which is non-autoregressive and thus faster, more stable, and more natural according to subjective listening tests. I have started working on this version.

snakers4 commented on May 24, 2024

I wonder how you chose these 304 phrases.
It is quite difficult to find phrases that match the emotion.

thorstenMueller commented on May 24, 2024

@snakers4 The phrases are not emotion-specific. All sentences are generic and I just pronounced them in different emotions, as well as I could, since I am no professional voice talent.

snakers4 commented on May 24, 2024

Well, you did a great job, since when I was listening to the "Wütend" phrases, I instantly remembered that scene from "Der Untergang".

On a more serious note, looks like it was very difficult to record ~300 phrases per emotion, since it took ~2 months?
Were there any non-obvious issues? I am asking because we are planning to record something similar as well.

snakers4 commented on May 24, 2024

Also check this out - https://silyfox.github.io/iscslp-98-demo/ - sounds like anime
Similar work to yours in essence

thorstenMueller commented on May 24, 2024

> Well, you did a great job, since when I was listening to the "Wütend" phrases, I instantly remembered that scene from "Der Untergang".
>
> On a more serious note, looks like it was very difficult to record ~300 phrases per emotion, since it took ~2 months?
> Were there any non-obvious issues? I am asking because we are planning to record something similar as well.

Thanks :-)
It's not easy to get into the right emotion on sentences that do not match those emotions.

300 phrases in 5 emotions = 1,500 recordings. I didn't record every day, so it took some time until it was finished.

snakers4 commented on May 24, 2024

> 300 phrases in 5 emotions = 1,500 recordings.

When we recorded speakers for "normal", they could do about 15k phrases in 3-4 weeks, also not every day.
Looks like recording with emotion is more difficult

snakers4 commented on May 24, 2024

Also I have found @thorstenMueller on telegram, is it you?

thorstenMueller commented on May 24, 2024

> Also I have found @thorstenMueller on telegram, is it you?

No, I'm not using Telegram.
You can PM me in the @coqui-ai Gitter chat (https://gitter.im/coqui-ai/TTS).

thorstenMueller commented on May 24, 2024

Hello.
I hope you're having relaxing Easter holidays 🐰 🙂.

As promised, I've released my emotional "Thorsten" dataset 🥳.
I hope it's useful to someone and (as always) please keep in mind that I'm no professional voice talent, just a guy sharing his voice.

Info and download links can be found here:
https://github.com/thorstenMueller/deep-learning-german-tts/#dataset-Thorsten-emotional

thorstenMueller commented on May 24, 2024

I'll close this issue since my emotional dataset has been released. But feel free to post feedback on it here.

thorstenMueller commented on May 24, 2024

After one release, new ideas are already coming to mind. See here for "drunk" samples (just pronounced this way, I wasn't drunk while recording ;-) )
https://twitter.com/ThorstenVoice/status/1386060775488430085

thorstenMueller commented on May 24, 2024

@monatis I'm just curious: did you have time to listen to my emotional dataset and/or find an intended use for it?

monatis commented on May 24, 2024

Hi @thorstenMueller, I listened to some samples and they sounded quite promising. Unfortunately, I was dealing with my parents' health problems at that time, so I didn't get around to giving feedback. Sorry for that.
I'm currently working on an NLP project in Turkish; after that I want to return to the Emotional Thorsten dataset to run some experiments with the following:
(1) Introducing an emotion vector to the architecture of Tacotron2 and finetuning it for Emotional TTS.
(2) Training few-shot adaptation methods like AdaDurIAN.
Let's see what we can build.
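
Idea (1) could be sketched roughly as follows: look up one vector per emotion and append it to every encoder output frame before the decoder attends over them. This is only an illustration under stated assumptions, not the approach actually used in the thread; `make_emotion_table` and `condition_on_emotion` are hypothetical names, the random vectors stand in for a trained embedding, and plain Python lists stand in for real Tacotron 2 encoder tensors.

```python
import random

# Emotion labels taken from this thread (version 2 of the dataset).
EMOTIONS = ["neutral", "disgusted", "angry", "amused",
            "surprised", "sleepy", "drunk", "whispering"]


def make_emotion_table(dim, seed=0):
    """Return one randomly initialised vector per emotion.

    Hypothetical stand-in for an embedding that would be learned in training.
    """
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for name in EMOTIONS}


def condition_on_emotion(encoder_outputs, emotion, table):
    """Append the emotion vector to every encoder time step.

    encoder_outputs: list of frames, each a list of floats.
    Returns a new list whose frames have len(frame) + dim entries.
    """
    vector = table[emotion]
    return [frame + vector for frame in encoder_outputs]


if __name__ == "__main__":
    table = make_emotion_table(dim=4)
    # Three dummy frames of width 8 standing in for encoder output.
    frames = [[0.0] * 8 for _ in range(3)]
    conditioned = condition_on_emotion(frames, "angry", table)
    print(len(conditioned), len(conditioned[0]))
```

Concatenation is only one option; adding the emotion vector to the frames or feeding it to the decoder directly would be equally plausible starting points for fine-tuning experiments.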

thorstenMueller commented on May 24, 2024

Hi @monatis.
Thanks for your fast response, and sorry to hear your parents had health problems. Hopefully things are going well now.
I just wanted to be sure that you know my emotional dataset is ready for "whatever". Your ideas sound promising.

thorstenMueller commented on May 24, 2024

> After one release, new ideas are already coming to mind. See here for "drunk" samples (just pronounced this way, I wasn't drunk while recording ;-) )
> https://twitter.com/ThorstenVoice/status/1386060775488430085

Today I've released version 2 of my emotional dataset 🎉.
In addition to emotions from version 1:

Neutral
Disgusted
Angry
Amused
Surprised
Sleepy

It now includes:

Drunk
Whispering

Check the details and download links on my GitHub page: https://github.com/thorstenMueller/deep-learning-german-tts/#dataset-Thorsten-emotional

@monatis: Just in case it's interesting for you.
