Comments (3)
Rhythmizers are actually a temporary solution for phoneme duration prediction in MIDI-less models. A rhythmizer contains the FastSpeech2 encoder module and the DurationPredictor module from MIDI-A mode.
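To illustrate the idea of keeping only the encoder and duration predictor, here is a minimal sketch that filters a checkpoint-like dictionary by key prefix. The prefixes `fs2.encoder.` and `fs2.dur_predictor.` and the function name are assumptions made for this example, not the actual key layout or export script of the repository, and a plain `dict` stands in for a real state dict.

```python
from typing import Any, Dict, Iterable

# Hypothetical key prefixes; the real checkpoint layout may differ.
RHYTHMIZER_PREFIXES = ("fs2.encoder.", "fs2.dur_predictor.")


def select_rhythmizer_params(
    state_dict: Dict[str, Any],
    prefixes: Iterable[str] = RHYTHMIZER_PREFIXES,
) -> Dict[str, Any]:
    """Keep only the entries whose key starts with one of the prefixes."""
    wanted = tuple(prefixes)
    return {key: value for key, value in state_dict.items() if key.startswith(wanted)}


# Toy stand-in for a MIDI-A checkpoint (values would normally be tensors).
full_checkpoint = {
    "fs2.encoder.w": 1,
    "fs2.dur_predictor.b": 2,
    "fs2.pitch_predictor.w": 3,
    "denoise_fn.w": 4,
}
rhythmizer_params = select_rhythmizer_params(full_checkpoint)
```

Everything outside the two prefixes (the pitch predictor and the acoustic decoder in this toy example) is dropped, which is roughly what "exporting the part for duration prediction" amounts to.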
MIDI-A models can predict phoneme durations well and generate nice spectrograms, but their datasets are hard to label (you need MIDI sequences and slurs), and they predict pitch poorly even though they do have a PitchPredictor. That is why we are deprecating this mode in this forked repository.
To get a rhythmizer, you first need to choose or design a phoneme dictionary. Then you should label your dataset in the opencpop segments format. Please note that the MIDI duration transcriptions of opencpop are in consonant-vowel format, whereas you need to label your dataset in vowel-consonant format; that is, the beginning of each note should be aligned with the beginning of the vowel instead of the consonant (see issue). Here is an example of the labels that we converted from the original opencpop transcriptions: transcriptions-strict-revised2.txt. The last step is to preprocess your dataset and train a MIDI-A model with this config. After that, you can export the part for duration prediction with this script.
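The vowel-consonant alignment described above can be sketched as follows. This is a simplified illustration and not the conversion we actually used: the function name, the tiny vowel set, and the rule that `SP`/`AP` markers also start a note are all assumptions made for the example. Each note begins at a vowel (or silence/breath) onset, and any following consonant is absorbed into that note.

```python
from typing import List, Sequence, Set

# Hypothetical set of tokens that open a new note: a few vowels plus the
# silence/breath markers "SP" and "AP". A real dictionary would be larger.
NOTE_STARTERS: Set[str] = {"a", "i", "u", "e", "o", "SP", "AP"}


def vowel_aligned_note_durations(
    phonemes: Sequence[str],
    durations: Sequence[float],
    note_starters: Set[str] = NOTE_STARTERS,
) -> List[float]:
    """Merge phoneme durations into notes that begin at vowel onsets."""
    if len(phonemes) != len(durations):
        raise ValueError("phonemes and durations must have the same length")
    notes: List[float] = []
    for phoneme, duration in zip(phonemes, durations):
        if phoneme in note_starters or not notes:
            # A vowel or rest (or the very first phoneme) opens a new note.
            notes.append(duration)
        else:
            # A consonant is absorbed into the note that precedes it.
            notes[-1] += duration
    return notes


phonemes = ["SP", "n", "i", "h", "a"]
durations = [0.5, 0.125, 0.25, 0.125, 0.5]
note_durations = vowel_aligned_note_durations(phonemes, durations)
```

In this toy input the consonant `n` is counted in the leading `SP` note and `h` in the note of `i`, so the note for `a` starts exactly at the vowel onset. In consonant-vowel format, `n` and `h` would instead open the notes of their own syllables.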
For CVVC languages like English and Polish, the answer is no, because we can currently only deal with two-phase (CV) phoneme systems like Chinese and Japanese. MIDI-A, MIDI-B, duration predictors, data labels and all other word-phoneme related components will be redesigned in the future, and at that point you can expect full support for all languages. No rhythmizers will be needed then. Everyone will be able to train their own variance adaptors (containing duration and pitch models and much more) via standard pipelines, as easily as preparing and training MIDI-less acoustic models today.
By the way, members of our team are already preparing a Japanese rhythmizer. When they finish the dictionary, the rhythmizer and the MFA model, we will formally support Japanese MIDI-less mode preparation in our pipeline. If you find it too difficult to prepare these on your own, it is fine to just wait for our progress.
from diffsinger.
Related Issues (20)
- About preparing training data HOT 2
- Question about training in other languages like English HOT 3
- How to add a custom language HOT 9
- How do I inference from a file? HOT 9
- Fusion algorithm for timbres and styles
- When slicing, an overly long silent segment remains between two vocal segments HOT 1
- MFA cannot be used HOT 4
- loss NsfHifiGAN HOT 1
- run without mel2ph raise error HOT 1
- ModuleNotFoundError: No module named 'skimage' HOT 7
- Question about Python environment variables HOT 9
- RuntimeError: DataLoader worker HOT 2
- Dataset preparation 3.2, cell 2 HOT 2
- "hifigan model file is not found" during training HOT 3
- AttributeError: 'DiffSingerE2EInfer' object has no attribute 'spk_map' HOT 1
- Hello, what is the format of the pipelines/asserts/*.lab files? HOT 2
- MFA cannot align phonemes HOT 2
- MFA reports "There were no files found for this corpus" HOT 4
- How exactly do I install this? HOT 1