
remi's Introduction

REMI

Authors: Yu-Siang Huang, Yi-Hsuan Yang

Paper (arXiv) | Blog | Audio demo (Google Drive) | Online interactive demo

REMI, which stands for REvamped MIDI-derived events, is a new event representation we propose for converting MIDI scores into text-like discrete tokens. Compared to the MIDI-like event representation adopted in existing Transformer-based music composition models, REMI provides sequence models with a metrical context for modeling the rhythmic patterns of music. Using REMI as the event representation, we train a Transformer-XL model to generate minute-long Pop piano music with expressive, coherent and clear structure of rhythm and harmony, without needing any post-processing to refine the result. The model also provides controllability over local tempo changes and chord progression.
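For intuition, a REMI sequence interleaves metrical tokens (Bar, Position) with tempo, chord and note tokens, each encoded as a "Name_Value" string. The snippet below is an illustrative sketch only; the exact vocabulary (velocity/duration bins, chord spelling) is defined by the implementation and may differ.

# An illustrative (not exhaustive) REMI-style token sequence for one bar.
# The event names follow the paper; the specific values here (chord spelling,
# velocity/duration bin indices) are hypothetical.
bar_tokens = [
    'Bar_None',
    'Position_1/16', 'Tempo Class_mid', 'Tempo Value_30',
    'Position_1/16', 'Chord_C:maj',
    'Position_1/16', 'Note Velocity_9', 'Note On_52', 'Note Duration_7',
    'Position_9/16', 'Note Velocity_12', 'Note On_60', 'Note Duration_3',
]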

Citation

@inproceedings{10.1145/3394171.3413671,
  author = {Huang, Yu-Siang and Yang, Yi-Hsuan},
  title = {Pop Music Transformer: Beat-Based Modeling and Generation of Expressive Pop Piano Compositions},
  year = {2020},
  isbn = {9781450379885},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3394171.3413671},
  doi = {10.1145/3394171.3413671},
  pages = {1180–1188},
  numpages = {9},
  location = {Seattle, WA, USA},
  series = {MM '20}
}

Getting Started

Install Dependencies

  • python 3.6 (recommend using Anaconda)
  • tensorflow-gpu 1.14.0 (pip install tensorflow-gpu==1.14.0)
  • miditoolkit (pip install miditoolkit)

Download Pre-trained Checkpoints

We provide two pre-trained checkpoints, REMI-tempo-checkpoint and REMI-tempo-chord-checkpoint, for generating samples.

Obtain the MIDI Data

We provide the MIDI files, including local tempo changes and estimated chords (5 MB).

  • data/train: 775 files used for training models
  • data/evaluation: 100 files (prompts) used for the continuation experiments

Generate Samples

See main.py as an example:

from model import PopMusicTransformer
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

def main():
    # declare model
    model = PopMusicTransformer(
        checkpoint='REMI-tempo-checkpoint',
        is_training=False)
        
    # generate from scratch
    model.generate(
        n_target_bar=16,
        temperature=1.2,
        topk=5,
        output_path='./result/from_scratch.midi',
        prompt=None)
        
    # generate continuation
    model.generate(
        n_target_bar=16,
        temperature=1.2,
        topk=5,
        output_path='./result/continuation.midi',
        prompt='./data/evaluation/000.midi')
        
    # close model
    model.close()

if __name__ == '__main__':
    main()

Convert MIDI to REMI

You can find out how to convert MIDI messages into REMI events in midi2remi.ipynb.
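As a quick alternative to the notebook, the sketch below reuses the model's own extract_events and event2word (the same calls used internally when handling continuation prompts) to inspect the REMI events of a single MIDI file; it assumes, as in model.py, that extract_events accepts a path to a MIDI file.

# A minimal sketch for inspecting REMI events, assuming extract_events takes a
# MIDI path as it does for continuation prompts in model.py.
from model import PopMusicTransformer

model = PopMusicTransformer(checkpoint='REMI-tempo-checkpoint', is_training=False)
events = model.extract_events('./data/evaluation/000.midi')
for e in events[:10]:
    print(e.name, e.value)
# map events to vocabulary indices, as the generation code does
words = [model.event2word['{}_{}'.format(e.name, e.value)] for e in events]
model.close()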

FAQ

1. How to synthesize the audio files (e.g., mp3)?

We strongly recommend using a DAW (e.g., Logic Pro) to open/play the generated MIDI files. Alternatively, you can use FluidSynth with a SoundFont; however, it may not handle the tempo changes correctly (see fluidsynth/issues/141).
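As a rough sketch (assuming fluidsynth is installed and on your PATH, and that you have a SoundFont file; the paths below are placeholders), rendering a generated MIDI file to audio from Python could look like this:

# Render a generated MIDI file with FluidSynth; 'soundfont.sf2' is a placeholder.
import subprocess

subprocess.run([
    'fluidsynth', '-ni',
    'soundfont.sf2',                  # your SoundFont
    './result/from_scratch.midi',     # a file generated by main.py
    '-F', './result/from_scratch.wav',
    '-r', '44100',
], check=True)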

2. What is the function of the inputs "temperature" and "topk"?

Temperature-controlled stochastic sampling with top-k filtering is used to generate tokens from the trained language model. You can find more details in the reference paper CTRL, Section 4.1 (Sampling).

It is worth noting that the sampling method used for generation is critical to the quality of the output and remains a research topic worthy of further exploration.
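Conceptually, the sampling step looks roughly like the sketch below (a simplified NumPy version, not the exact code in model.py): the logits are divided by the temperature, then one token is drawn from the renormalized top-k candidates.

# A simplified sketch of temperature-controlled top-k sampling.
import numpy as np

def temperature_topk_sample(logits, temperature=1.2, topk=5):
    logits = np.asarray(logits, dtype=np.float64) / temperature  # >1 flattens, <1 sharpens
    top_idx = np.argsort(logits)[-topk:]                         # keep the k best candidates
    probs = np.exp(logits[top_idx] - np.max(logits[top_idx]))
    probs /= probs.sum()                                         # renormalize over the top-k
    return int(np.random.choice(top_idx, p=probs))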

3. How to finetune with my personal MIDI data?

Please see the issue "Training on custom MIDI corpus?".

Acknowledgement


remi's Issues

Generated music quality with finetuning

Thank you for the amazing work on this project!

However, I have an issue with the generated music quality after finetuning on the MAESTRO dataset. For more information, I was using the "REMI-tempo-chord-checkpoint" as the base model and trained it for 5 epochs over the whole dataset. My hypothesis after reading the paper is that the time signatures of songs in the MAESTRO dataset are not supported by the REMI encoding method.

Do you have any insights about this problem?

.pb file output

Hello,

Is it possible to save a .pb file instead of a .ckpt when training is 'complete'? I am looking into porting this model to ONNX and PyTorch and would need a .pb file to accomplish this. Many thanks!

I can't train from scratch after removing line 99

I removed this line but I can't train; it says: "self._traceback = tf_stack.extract_stack_for_node(self._c_op)"

You can see #99 in model.py which is used to restore the pre-trained checkpoint. You can remove this line if you want to train from scratch.

Originally posted by @remyhuang in #17 (comment)

module 'miditoolkit.pianoroll.parser' has no attribute 'get_pianoroll'

Hi, this is a great project, but I ran into an error in chord_recognition.py:

module 'miditoolkit.pianoroll.parser' has no attribute 'get_pianoroll' (line 34)

Isn't the function here supposed to be miditoolkit.pianoroll.parser.notes2pianoroll instead of miditoolkit.pianoroll.parser.get_pianoroll?
Thank you!

self.group_size*2 should be self.group_size in prepare_data()

Thank you for providing the code for finetuning the model on custom midi datasets!

I notice that at line 251 in model.py, the "step" argument of np.arange is set to self.group_size*2. I think this leads to skipping half of the segments in pairs. Should it be just self.group_size instead of self.group_size*2? (See the quick check after the snippet below.)

for i in np.arange(0, len(pairs)-self.group_size, self.group_size*2):
    data = pairs[i:i+self.group_size]
    if len(data) == self.group_size:
        segments.append(data)
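For reference, a quick check of the start indices each step size produces (here for a hypothetical file with 30 (x, y) pairs) shows the *2 step skipping half of the candidate segments:

# Compare segment start indices for the two step sizes (group_size = 5, 30 pairs).
import numpy as np

group_size, num_pairs = 5, 30
print(np.arange(0, num_pairs - group_size, group_size * 2))  # [ 0 10 20]       -> 3 segments
print(np.arange(0, num_pairs - group_size, group_size))      # [ 0  5 10 15 20] -> 5 segments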

questions about your data preprocessing

Hi, your work is really amazing! I have some questions about the prepare_data() function at line 252.
# abandon the last
for i in np.arange(0, len(pairs)-self.group_size, self.group_size*2):
    data = pairs[i:i+self.group_size]
    if len(data) == self.group_size:
        segments.append(data)

  1. Why do you abandon the last pairs elements in one MIDI file? Does it improve the final result?
  2. In the for loop, the third parameter is self.group_size*2, which means 5 pairs are skipped on each loop iteration over one MIDI file. For example, if a "pairs" variable has shape [30, 2, 512], only pairs[0:5], pairs[10:15] and pairs[20:25] will be added to the final "segments" variable. Could you tell me why you did not feed the whole pairs to "segments"?

Thank you!

midi2remi does not run

Hi, I have a question: why does my midi2remi not run? When midi2remi runs, this problem appears: FileNotFoundError: [Errno 2] No such file or directory: '. \pop data'. I hope you can help solve it.

The code for evaluation(downbeat and so on)

Hi,

Thank you for the information about the dataset; I can generate pretty good results! Now I would like to evaluate the beat std, downbeat std and downbeat salience. After doing some research, I think you might use the madmom package to calculate the three scores (I am not sure), but I am not familiar with that package. Therefore, could you please share your evaluation code with me?


Some questions about "Controllable CHORD and TEMPO"

Thanks for your amazing work! When I read your paper and blog, I felt confused about the details of the method for controlling the chord and tempo of the generated results. I cannot find where to input or control the chords (which I want the output to have) in the code. It would be helpful if you could give me some suggestions.

Please help with REMI to MIDI conversion

Hey guys,

I would really appreciate it if you could help me make my Colab implementation of the REMI encoding work.

Here is the link.
https://github.com/asigalov61/DOREMI/blob/main/DOREMI.ipynb

I have no idea how to convert REMI / REMI model output back to MIDI. I would really appreciate any advice or help that you can offer.

Your proposal/encoding is very interesting and a lot of people want to try it, but because you did not provide a full implementation/Colab, very few of us can evaluate or use your work.

Please help.

Thank you so much.

Error in training from scratch

When I try to train the model from scratch, errors occur when initializing some vars. The error is : Failed precondition: Error while reading resource variable transformer/layer_0/rel_attn/layer_normalization/gamma from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/transformer/layer_0/rel_attn/layer_normalization/gamma/N10tensorflow3VarE does not exist.

something is wrong! Tempo Value_57

Hello, I changed the original training music to my own and this problem appeared. How can I solve it? Thank you very much for your answer!

MIDI files end up with their last note pushed up a position

Hello, thank you for publishing this repo. I noticed there's a bug that occurs at the end of a MIDI file when encoding to REMI.
The very last note, or multiple notes that occur after a certain position in the last bar, will be "pushed up" a position, making them off rhythmically. I'm not sure how to fix this. I assume there must be an issue in the item2event function, but I'm not very experienced with numpy.

questions about model.py

I ran into an error in model.py.
Here is my code:
from model import PopMusicTransformer
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
model = PopMusicTransformer(checkpoint='REMI-tempo-checkpoint', is_training=False)

And it raises:
ValueError: Variable transformer/r_w_bias already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

      File "C:\Users\ailab502\anaconda3\envs\REMI\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
        self._traceback = tf_stack.extract_stack()
      File "C:\Users\ailab502\anaconda3\envs\REMI\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
        op_def=op_def)
      File "C:\Users\ailab502\anaconda3\envs\REMI\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
        return func(*args, **kwargs)
      File "C:\Users\ailab502\anaconda3\envs\REMI\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
        op_def=op_def)
      File "C:\Users\ailab502\anaconda3\envs\REMI\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 2023, in variable_v2
        shared_name=shared_name, name=name)

How can I fix this error?

style question

Hi remy,

 Thanks for your amazing work.
 In your blog, I see different music styles. When you finish training, how do you control what style your model will generate?
 In my understanding, it depends on the style of the training dataset; if we would like to generate a different music style, we must use a different dataset to retrain the model. Am I right?

Best,

More detailed information about finetuning data format

Thank you very much for your work and also especially for providing the finetune code!

However, I intend to finetune the model with data not in MIDI but in a format where I have information about notes in the form of the following properties:

  • for each note: Position, Pitch, Velocity, Duration (in positions)
  • additionally: overall time signature and chord tokens + their positions.

So what I am attempting is to convert these events into a notation that the finetune method can work with. However, from the paper I am unfortunately unable to discern the meaning of each Event the method extracts from a MIDI file. Here's an example of the Events I get from a MIDI file when running finetune:

 Event(name=Bar, time=None, value=None, text=1),
 Event(name=Position, time=0, value=1/16, text=0),
 Event(name=Tempo Class, time=0, value=mid, text=None),
 Event(name=Tempo Value, time=0, value=30, text=None),

 Event(name=Position, time=960, value=9/16, text=960), 
 Event(name=Note Velocity, time=960, value=9, text=38/36),
 Event(name=Note On, time=960, value=52, text=52),
 Event(name=Note Duration, time=960, value=63, text=5255/3840), ...

I understand much of it but the meaning of a few bits and pieces is unclear to me:

  • lines 3-4: Tempo Class mid is clear, but what does Tempo Value: value = 30 mean, 30 BPM?
  • line 6: What does text=38/36 mean with regard to velocity?
  • line 7: What does value/text = 52 mean?
  • line 8: What does value=63 mean, 63 32nd-note multiples? What does text=5255/3840 mean?

Kind regards and thank you in advance!
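One observation that may help: the text fields above look like raw/quantized value pairs, and the numbers are consistent with simple uniform bins (velocity in steps of 4 up to 128, duration in 60-tick multiples of a 32nd note up to 3840 ticks), while the Note On value is most likely just the MIDI pitch number. The sketch below reproduces the two examples under that assumption; the repository's actual lookup tables may differ.

# Hypothetical quantization tables, chosen only because they reproduce the
# example values above; the repo's actual tables may differ.
import numpy as np

velocity_bins = np.linspace(0, 128, 32 + 1, dtype=int)  # 0, 4, 8, ..., 128
duration_bins = np.arange(60, 3841, 60)                 # 64 multiples of a 32nd note (60 ticks)

def nearest_bin(raw, bins):
    idx = int(np.argmin(np.abs(bins - raw)))
    return idx, int(bins[idx])

print(nearest_bin(38, velocity_bins))    # (9, 36)    -> Note Velocity value=9,  text=38/36
print(nearest_bin(5255, duration_bins))  # (63, 3840) -> Note Duration value=63, text=5255/3840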

Training on custom MIDI corpus?

Hello! This is a great project, and I was able to get it running immediately. Thank you for sharing it!

Is there a way to train the model from scratch -- and/or fine-tune one of your published models -- on my own corpus of MIDI files?

Errors with some seeds/stems MIDIs

Hey guys,

Reporting an error that happens when you run the continuation function:

Is there a specific requirement for seed/stems files for your models?


KeyError                                  Traceback (most recent call last)
in <module>()
      5     topk=5,
      6     output_path='./result/continuation.midi',
----> 7     prompt='/content/remi/prompt.mid')
      8
      9 # close model

1 frames
/content/remi/model.py in <listcomp>(.0)
    139         if prompt:
    140             events = self.extract_events(prompt)
--> 141             words = [[self.event2word['{}_{}'.format(e.name, e.value)] for e in events]]
    142             words[0].append(self.event2word['Bar_None'])
    143         else:

KeyError: 'Note Velocity_31'

Training from scratch for Classical Piano

Hey guys,

Another tiny issue here...

Is there any way you could share or help write training-from-scratch code for classical piano (think MAESTRO or similar datasets)?

I know you have suggested adjusting the parameters of the training data (which is useful), but there is no code. You only provide fine-tuning code.

So if you can do this/help with this, I will be very grateful to you and your help will be much appreciated.

Thank you again

Issues when training with other datasets

Hi,

I would like to train the Transformer-XL with other MIDI datasets such as MAESTRO. However, when I convert the MIDI to REMI and then convert the REMI back to MIDI, the rhythm has some inconsistencies. In addition, only the first track of a multi-track piano MIDI is obtained, which causes some polyphony problems.

Could you give me any advice on where I can find another suitable piano MIDI dataset without loss after REMI conversion? Thanks!

Issues with evaluation (std and salience for beat/downbeat)

Hi! I was trying to reproduce the evaluation part of the code according to the paper (see the attached picture) Pop Music Transformer: Beat-Based Modeling and Generation of Expressive Pop Piano Compositions:

[screenshot of the evaluation table from the paper]

However, there must be some mistake in my calculation, because there are perceptible differences between my results for beat std and downbeat salience and the ones in the paper. I also had difficulty reproducing the downbeat std part because of the (perhaps?) differing descriptions in the paper and the madmom documentation. The data presented in my results is based on the train set (Remi/data/train) and madmom, and I think I converted the .midi files into correct .wav files:

[screenshots of the reported beat/downbeat results]

I've viewed https://madmom.readthedocs.io/en/latest/modules/features/downbeats.html for related information but still failed to solve it by myself. Therefore, could you please share the evaluation part with me? Thank you so much!

Training Model from Scratch

Hello, thank you for making your source code available for this project. Is it possible to train the model from scratch using our own midi dataset without using one of the shared checkpoints? I've tried the finetune.py script you've included but the resulting model is still too biased toward the initial training set for what I'm trying to do. Thanks again for sharing your work in this space.

Questions about the training data

Hello,

I have tried to train Transformer-XL from scratch. However, the generated results are not as good as yours. The theme of a generated MIDI file is not consistent, meaning the latest slice of the sequence sounds different from the prompt. Therefore, could you tell me whether the dataset you provide is the whole dataset, or whether you used another, bigger dataset for pre-training and used the provided one only for finetuning?

My experiment setting is:
n_layers: 12
x_len: 512
m_len: 512
ff: 2048

And could you tell me the test-set cross-entropy loss you got in your experiment?

Thank you!

Checkpoints saved with a NAN value

Hi, thanks for your amazing work.

I'm trying to finetune your model on a smaller dataset (292 MIDI files). I'm using the REMI-tempo-chord-checkpoint as the base model. However, my checkpoints are saved with a NaN value from the first epoch onwards (e.g., model-000-nan.data).

I also get these two warning messages during the training process:

/miniconda3/envs/tfEnv/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
/miniconda3/envs/tfEnv/lib/python3.9/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars

I first checked whether something went wrong during the event extraction. It seems that the extraction worked well: len(all_events) = 292, which is the size of my dataset. Same with all_words.

However, the segments length is only 3, which means training_data = 3 and num_batches = 0.

So I guess something went wrong in that part, but I don't know how to fix it:

# to training data
self.group_size = 5
segments = []
for words in all_words:
    pairs = []
    for i in range(0, len(words)-self.x_len-1, self.x_len):
        x = words[i:i+self.x_len]
        y = words[i+1:i+self.x_len+1]
        pairs.append([x, y])
    pairs = np.array(pairs)

    # abandon the last
    for i in np.arange(0, len(pairs)-self.group_size, self.group_size*2):
        data = pairs[i:i+self.group_size]
        if len(data) == self.group_size:
            segments.append(data)
segments = np.array(segments)

Does anyone have an idea how to fix it? Thanks a lot
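As a rough sanity check (a sketch mirroring the pairing/grouping logic quoted above, not the repo's code), a single file needs on the order of group_size * x_len tokens before it contributes even one segment, so a corpus of short pieces can easily end up with only a handful of segments and num_batches = 0:

# How many segments one file of a given token count contributes,
# mirroring the pairing/grouping logic above (x_len = 512, group_size = 5).
import numpy as np

def num_segments(num_tokens, x_len=512, group_size=5):
    num_pairs = len(range(0, num_tokens - x_len - 1, x_len))
    starts = np.arange(0, num_pairs - group_size, group_size * 2)
    return int(sum(1 for s in starts if s + group_size <= num_pairs))

print(num_segments(2000))   # 0 -> short files contribute nothing
print(num_segments(6000))   # 1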

problem when using REMI-tempo-chord-checkpoint

Thanks for this great work! I can generate samples with REMI-tempo-checkpoint successfully. But when I use REMI-tempo-chord-checkpoint, the model shows the errors below:
[screenshots of the error messages]

Is there anything I did wrong? Thanks!

Data Augmentation Used?

After looking at the repository closely, I failed to find any data augmentation (such as transposition and time stretching). So, is there a way I can add this in or enable this feature somehow?

Thank you very much!

Hey guys!

Just wanted to thank you for this awesome repo/paper/approach. I think you are really onto something here. I will test soon and let you know what I think if you are interested.

The only thing I wanted to humbly recommend is to create a nice Google Colab for your idea so that it can easily be tried and evaluated. This is what I am going to do with your code so that everyone can easily check it out. I hope you do not mind.

Otherwise, great job! Thank you again :)))

AS
