
Comments (24)

thorstenMueller commented on May 24, 2024

Recording of the "angry" phrases for the emotional dataset is in progress. Please keep in mind that I'm no professional voice artist.
https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-emotional-dataset

@monatis what do you think?

from thorsten-voice.

thorstenMueller commented on May 24, 2024

Recording of the emotional dataset is finished 💬 🥳
It was harder and took longer than I thought to express emotions on phrases that do not match those emotions, but I tried my best.

Please keep in mind that I'm just a guy sharing his voice with you folks; I'm no professional voice artist.

@domcross is now doing his audio optimization magic on the recordings, and once he's done with that I'll publish them.

Until then, here are two samples of how the results sound:

Mist, wieder nichts geschafft. ("Damn, got nothing done again.")
Es kann doch nicht so schwer sein, einen Ring ins Feuer zu werfen. ("It can't be that hard to throw a ring into the fire.")

The samples are spoken in the following emotion order:

  • Normal 🙂
  • Disgusted 🤢
  • Angry 😠
  • Amused 😀
  • Surprised 😲
  • Sleepy 😔

https://soundcloud.com/thorsten-mueller-395984278/sets/thorsten-de-emotional-dataset

monatis commented on May 24, 2024

@thorstenMueller Wow, you're doing an amazing job. I can easily hear the emotion in the samples. I strongly believe that it will have a substantial impact on the literature.

monatis commented on May 24, 2024

@thorstenMueller Yes, they are up and running now 😃
Thanks for the reminder and of course for the great dataset you're giving away to the community 😊
By the way, I'm leading the TensorFlow Turkey community, and we hold regular online events to talk about TensorFlow and machine learning more broadly. If you'd like, I would love to have you as a guest at one of these events to talk about your motivation and experience in building and sharing this dataset. I'm pretty sure it will be a great inspiration for a lot of people. If you agree, you can simply say hello at the email address on my profile to talk about the details.

thorstenMueller commented on May 24, 2024

Thanks @monatis, I highly appreciate your kind words 👍.
If my dataset were mentioned in the literature, it would be an honor for me, but I just hope my voice contribution is useful for someone.

thorstenMueller commented on May 24, 2024

Short update:

  • Record "normal"
  • Record "angry"
  • Record "disgusted"
  • Record "sleepy" (I'm currently recording "sleepy")
  • Record "amused"
  • Record "surprised"

I've published some clips on SoundCloud:

A comparison of two sentences in all currently recorded emotions can be heard here:

monatis commented on May 24, 2024

@thorstenMueller This is awesome! Congrats on the hard work and the great job you're doing here 👏
Looking forward to the release of the processed full version.

I'm truly happy to witness such great dataset contributions to the community. Can't wait to work with this one 🚀

thorstenMueller commented on May 24, 2024

Audio optimization for my emotional dataset contribution is finished (thanks @domcross). I've compressed the audio files and CSV metadata, and the upload is ready to go. The release is planned as a gift for the Easter holidays 🐰.
See my Twitter account for updates (https://twitter.com/ThorstenVoice) and, of course, this topic here.

monatis commented on May 24, 2024

Great news! Thank you @thorstenMueller and @domcross.
By the way, Google published a parallel version of Tacotron 2, which is non-autoregressive and thus faster, more stable, and more natural according to subjective listening tests. I've started working on this version.

snakers4 commented on May 24, 2024

I wonder how you chose these 304 phrases?
It is quite difficult to find phrases that match the emotion.

thorstenMueller commented on May 24, 2024

@snakers4 The phrases are not emotion-specific. All sentences are generic, and I just pronounced them in different emotions as well as I could, since I am no professional voice talent.

snakers4 commented on May 24, 2024

Well, you did a great job, since when I was listening to the "Wütend" ("angry") phrases, I instantly remembered that scene from "Der Untergang".

On a more serious note, it looks like it was very difficult to record ~300 phrases per emotion, since it took ~2 months?
Were there any non-obvious issues? I am asking because we are planning to record something similar as well.

snakers4 commented on May 24, 2024

Also check this out: https://silyfox.github.io/iscslp-98-demo/ (sounds like anime).
In essence, it is similar to your work.

thorstenMueller commented on May 24, 2024

> Well, you did a great job, since when I was listening to the "Wütend" ("angry") phrases, I instantly remembered that scene from "Der Untergang".
>
> On a more serious note, it looks like it was very difficult to record ~300 phrases per emotion, since it took ~2 months?
> Were there any non-obvious issues? I am asking because we are planning to record something similar as well.

Thanks :-)
It's not easy to get into the right emotion on sentences that do not match that emotion.

300 phrases in 5 emotions = 1,500 recordings. I didn't record every day, so it took some time until it was finished.

snakers4 commented on May 24, 2024

> 300 phrases in 5 emotions = 1,500 recordings.

When we recorded speakers for "normal", they could do about 15k phrases in 3-4 weeks, also not recording every day.
It looks like recording with emotion is more difficult.

snakers4 commented on May 24, 2024

Also, I found @thorstenMueller on Telegram. Is that you?

thorstenMueller commented on May 24, 2024

> Also, I found @thorstenMueller on Telegram. Is that you?

No, I'm not using Telegram.
You can PM me in the @coqui-ai Gitter chat (https://gitter.im/coqui-ai/TTS).

thorstenMueller commented on May 24, 2024

Hello.
I hope you're having relaxing Easter holidays 🐰 🙂.

As promised, I've released my emotional "Thorsten" dataset 🥳.
I hope it's useful for someone and (as always) please keep in mind that I'm no professional voice talent, just a guy sharing his voice.

Information and download links can be found here:
https://github.com/thorstenMueller/deep-learning-german-tts/#dataset-Thorsten-emotional

thorstenMueller commented on May 24, 2024

I'll close this issue since my emotional dataset has been released, but feel free to post feedback on it here.

thorstenMueller commented on May 24, 2024

No sooner is a release out than new ideas come to mind. See here for "drunk" samples (just pronounced this way; I wasn't drunk while recording ;-) )
https://twitter.com/ThorstenVoice/status/1386060775488430085

thorstenMueller commented on May 24, 2024

@monatis I'm just curious: did you have time to listen to my emotional dataset and/or find an intended use for it?

monatis commented on May 24, 2024

Hi @thorstenMueller, I listened to some samples and they sounded quite promising. Unfortunately, I was dealing with my parents' health problems at that time, so I failed to give feedback. Sorry for that.
I'm currently working on an NLP project in Turkish; after that I want to return to the Emotional Thorsten dataset and experiment with the following:
(1) Introducing an emotion vector into the Tacotron 2 architecture and fine-tuning it for emotional TTS.
(2) Training few-shot adaptation methods like AdaDurIAN.
Let's see what we can build.
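
Idea (1) above can be illustrated with a minimal sketch. This is not monatis's implementation and not taken from any Tacotron 2 codebase: the function names, the embedding size, and the choice to append the emotion vector to every encoder step are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
import random

# Emotion labels as listed in this thread for version 1 of the dataset.
EMOTIONS = ["neutral", "disgusted", "angry", "amused", "surprised", "sleepy"]
EMOTION_TO_ID = {name: idx for idx, name in enumerate(EMOTIONS)}


def make_emotion_table(num_emotions, dim, seed=0):
    """Build a randomly initialised embedding table, one vector per emotion.

    Hypothetical helper: in a real model these values would be trainable
    parameters learned during fine-tuning.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(num_emotions)]


def condition_on_emotion(encoder_outputs, emotion, table):
    """Append the emotion vector to every encoder time step.

    encoder_outputs: list of T vectors, each of length D
    returns:         list of T vectors, each of length D + E
    """
    emotion_vec = table[EMOTION_TO_ID[emotion]]
    return [list(step) + emotion_vec for step in encoder_outputs]


if __name__ == "__main__":
    table = make_emotion_table(len(EMOTIONS), dim=4)
    # Stand-in for real encoder outputs: 3 time steps, 8 features each.
    fake_encoder_outputs = [[0.0] * 8 for _ in range(3)]
    conditioned = condition_on_emotion(fake_encoder_outputs, "angry", table)
    print(len(conditioned), len(conditioned[0]))  # prints "3 12"
```

Concatenation is only one possible way to inject the emotion; adding a projected vector to the encoder outputs would be another, and the thread does not say which approach was intended.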

thorstenMueller commented on May 24, 2024

Hi @monatis.
Thanks for your fast response, and sorry to hear your parents had health problems. Hopefully things are going well now.
I just wanted to be sure that you know my emotional dataset is ready for "whatever". Your ideas sound promising.

thorstenMueller commented on May 24, 2024

> No sooner is a release out than new ideas come to mind. See here for "drunk" samples (just pronounced this way; I wasn't drunk while recording ;-) )
> https://twitter.com/ThorstenVoice/status/1386060775488430085

Today I've released version 2 of my emotional dataset 🎉.
In addition to the emotions from version 1:

  • Neutral
  • Disgusted
  • Angry
  • Amused
  • Surprised
  • Sleepy

It now includes:

  • Drunk
  • Whispering

Check the details and download links on my GitHub page: https://github.com/thorstenMueller/deep-learning-german-tts/#dataset-Thorsten-emotional

@monatis: Just in case it's of interest to you.
