ICASSP-A2S

Accompanying code for the paper:

  • Lele Liu, Veronica Morfi and Emmanouil Benetos, "Joint Multi-pitch Detection and Score Transcription for Polyphonic Piano Music", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Canada, June 2021.

Environment setup

This project uses PyTorch and Python 3. It is recommended that you first create a Python 3 virtual environment:

python3 -m venv ICASSP2021-A2S-ENV
source ICASSP2021-A2S-ENV/bin/activate
git clone https://github.com/cheriell/ICASSP-A2S

Run the following command to install the Python packages required by this project:

cd ICASSP-A2S
pip install -r requirements.txt

In this project, we use the MV2H metric (McLeod et al., 2018) to evaluate audio-to-score transcription. Please refer to the original GitHub repository for installation details.
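As a rough sketch (the MV2H README is authoritative and the build steps may change), installation amounts to cloning the repository and compiling it with make; the --MV2H_path flag used below then points at the resulting bin directory:

git clone https://github.com/apmcleod/MV2H.git
cd MV2H
make  # compiles the Java sources into bin/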

Before running, make the shell scripts executable:

chmod +x runme.sh
chmod +x audio2score/utilities/evaluate_midi_mv2h.sh 

Data

We use the MuseSyn dataset for our experiments. To download the dataset, please refer to: MuseSyn: A dataset for complete automatic piano music transcription research.

Running

Please refer to runme.sh for examples of the relevant commands for model training and evaluation. Before you run the script, remember to change the paths at the top of the shell script to where you store your datasets, features, models and the MV2H metric. Uncomment the commands you want to run.
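For example, the top of runme.sh might be configured along these lines (the variable names here are illustrative; use the ones actually defined in the script):

# Illustrative path configuration -- adapt names and values to runme.sh
dataset_folder=/path/to/MuseSyn            # where the MuseSyn dataset is stored
feature_folder=/path/to/MuseSyn/features   # where extracted features are cached
model_folder=/path/to/models               # where model checkpoints are saved
MV2H_path=/path/to/MV2H/bin                # compiled MV2H metric (see above)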

Multi-pitch detection with different time-frequency representations

To train a multi-pitch detection model, use the following command.

python train.py audio2pr --dataset_folder path/to/MuseSyn --feature_folder path/to/MuseSyn/features

For evaluation, run

python test.py audio2pr --dataset_folder path/to/MuseSyn --feature_folder path/to/MuseSyn/features --model_checkpoint model_checkpoint_file
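Here model_checkpoint_file is a checkpoint saved during training. For instance (this path is copied from the issue report further down; your checkpoint name will differ):

python test.py audio2pr --dataset_folder path/to/MuseSyn --feature_folder path/to/MuseSyn/features --model_checkpoint tensorboard_logs/audio2pr-VQT-bins_per_octave=60-n_octaves=8-gamma=20/version_0/checkpoints/epoch=56-valid_loss=43.3211.ckpt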

For different time-frequency representations, please modify the spectrogram settings in the file audio2score/settings.py. The spectrogram settings tested in the paper are listed in metadata/spectrogram_settings.csv.
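To check which configurations were tested before editing the settings (judging by the checkpoint name in the issue report below, a setting comprises parameters such as VQT with bins_per_octave=60, n_octaves=8 and gamma=20), you can inspect the CSV directly:

head metadata/spectrogram_settings.csv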

Audio-to-Score transcription with different score representations

To train a single-task audio-to-score transcription model, run

python train.py audio2score --score_type score_type --dataset_folder path/to/MuseSyn --feature_folder path/to/MuseSyn/features

score_type should be Reshaped or LilyPond.
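For example, to train with the reshaped score representation:

python train.py audio2score --score_type Reshaped --dataset_folder path/to/MuseSyn --feature_folder path/to/MuseSyn/features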

For evaluation, run

python test.py audio2score --score_type score_type --MV2H_path path/to/MV2H/bin --dataset_folder path/to/MuseSyn --feature_folder path/to/MuseSyn/features --model_checkpoint model_checkpoint_file

Joint Transcription

To train a joint transcription model, run

python train.py joint --score_type "Reshaped" --dataset_folder path/to/MuseSyn --feature_folder path/to/MuseSyn/features

For evaluation, run

python test.py joint --score_type "Reshaped" --MV2H_path path/to/MV2H/bin --dataset_folder path/to/MuseSyn --feature_folder path/to/MuseSyn/features --model_checkpoint model_checkpoint_file

Transcription output example

An example set of transcription outputs can be found in the folder output/scores_joint_transcription.

Issues

After training and testing the outputs folder is empty; how can I run inference on my own audio?

After reading your paper, I became interested in reproducing this work.

After training and testing, the outputs folder is empty. Training ended at epoch 56; I then ran test.py and got the evaluation results shown in the attached text below, but the outputs folder is still empty.

I also wonder how to run inference on my own MIDI file and obtain the transcribed music score as output.

(base) ilc@ilc:~/Desktop/workplace/ICASSP2021-A2S$ python test.py audio2pr --dataset_folder ./MuseSyn --feature_folder ./MuseSyn/features --model_checkpoint ./tensorboard_logs/audio2pr-VQT-bins_per_octave=60-n_octaves=8-gamma=20/version_0/checkpoints/epoch=56-valid_loss=43.3211.ckpt
Get train metadata, 4 pianos
Get valid metadata, 4 pianos
Get test metadata, 4 pianos
GPU available: True, used: True
TPU available: None, using: 0 TPU cores
Preparing spectrogram 672/672
Preparing pianoroll 672/672
Preparing spectrogram 80/80
Preparing pianoroll 80/80
Preparing spectrogram 84/84
Preparing pianoroll 84/84
The following callbacks returned in LightningModule.configure_callbacks will override existing callbacks passed to Trainer: ModelCheckpoint
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Get test dataloader
Testing: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 458/458 [07:21<00:00, 1.04it/s]


DATALOADER:0 TEST RESULTS
{'logs': {'test_accuracy': 0.7291517294943333,
'test_epoch': 0,
'test_f-score': 0.8169069737195969,
'test_f-score_n_on': 0.6852234803499391,
'test_f-score_n_onoff': 0.4228041814737893,
'test_loss': 89.91925048828125,
'test_precision': 0.9348128437995911,
'test_precision_n_on': 0.8518075491759702,
'test_precision_n_onoff': 0.5432599993745504,
'test_recall': 0.766946617513895,
'test_recall_n_on': 0.631439393939394,
'test_recall_n_onoff': 0.3829577285459639},
'loss': 89.91925048828125,
'test_accuracy': 0.8398997187614441,
'test_epoch': 0.0,
'test_f-score': 0.9031959772109985,
'test_f-score_n_on': 0.8432270884513855,
'test_f-score_n_onoff': 0.664776623249054,
'test_loss': 45.26031494140625,
'test_precision': 0.9279804229736328,
'test_precision_n_on': 0.920455813407898,
'test_precision_n_onoff': 0.7133415937423706,
'test_recall': 0.8941190838813782,
'test_recall_n_on': 0.8031598329544067,
'test_recall_n_onoff': 0.6364219784736633}


Thank you for taking the time to read my question.
