A Bi-Directional Transformer for Musical Chord Recognition

This repository contains the source code for the paper "A Bi-Directional Transformer for Musical Chord Recognition" (ISMIR 2019).

Requirements

  • pytorch >= 1.0.0
  • numpy >= 1.16.2
  • pandas >= 0.24.1
  • pyrubberband >= 0.3.0
  • librosa >= 0.6.3
  • pyyaml >= 3.13
  • mir_eval >= 0.5
  • pretty_midi >= 0.2.8
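
The minimum versions above can be checked against an existing environment with a short script. This is a minimal sketch that uses only the standard library. The distribution name `torch` for PyTorch and the helper names `version_tuple`, `meets_minimum`, and `check_requirements` are assumptions made for illustration, not part of this repository.

```python
import re
from importlib import metadata

# Minimum versions copied from the list above.
# Assumption: PyTorch is installed under the distribution name "torch".
MINIMUMS = {
    "torch": "1.0.0",
    "numpy": "1.16.2",
    "pandas": "0.24.1",
    "pyrubberband": "0.3.0",
    "librosa": "0.6.3",
    "pyyaml": "3.13",
    "mir_eval": "0.5",
    "pretty_midi": "0.2.8",
}


def version_tuple(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def meets_minimum(installed, minimum):
    """Compare two version strings component by component, padding with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUMS):
    """Map each package name to 'ok', 'missing', or a 'too old' message."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

The comparison looks only at the leading numeric part of each version, so local build suffixes such as `+cu118` are ignored.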

File descriptions

  • audio_dataset.py : loads the data, converting label files to chord labels and mp3 files to constant-Q transform (CQT) features.
  • btc_model.py : contains the PyTorch implementation of BTC.
  • train.py : training script.
  • crf_model.py : contains the PyTorch implementation of Conditional Random Fields (CRFs).
  • baseline_models.py : contains the code for the baseline models.
  • train_crf.py : training script for the CRFs.
  • run_config.yaml : contains the hyperparameters and the required paths.
  • test.py : recognizes chords from audio files.

Using BTC : Recognizing chords from files in audio directory

Using BTC from command line

$ python test.py --audio_dir audio_folder --save_dir save_folder --voca False
  • audio_dir : a folder of audio files for chord recognition (default: './test')
  • save_dir : a folder for saving recognition results (default: './test')
  • voca : False selects the major/minor label type, and True selects the large-vocabulary label type (default: False)
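
Because `--voca` is passed as the text `True` or `False`, a script has to convert that text to a boolean explicitly. In Python, `bool("False")` is `True`, since any non-empty string is truthy. The sketch below shows one way the three options and their documented defaults could be declared with `argparse`. It is an illustration only: how `test.py` parses its arguments is not shown here, and `str_to_bool` and `build_parser` are hypothetical names.

```python
import argparse


def str_to_bool(text):
    """Convert 'True'/'False' style text to a bool (bool('False') would be True)."""
    value = text.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {text!r}")


def build_parser():
    """Declare the three documented options with their documented defaults."""
    parser = argparse.ArgumentParser(description="Chord recognition options (illustrative)")
    parser.add_argument("--audio_dir", default="./test",
                        help="folder of audio files for chord recognition")
    parser.add_argument("--save_dir", default="./test",
                        help="folder for saving recognition results")
    parser.add_argument("--voca", type=str_to_bool, default=False,
                        help="False: major/minor labels, True: large vocabulary")
    return parser


# Parse the same arguments as the command shown above.
args = build_parser().parse_args(
    ["--audio_dir", "audio_folder", "--save_dir", "save_folder", "--voca", "False"]
)
print(args.audio_dir, args.save_dir, args.voca)
```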

The resulting files are lab files of the form shown below and midi files.
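
For post-processing, a lab file can be read with a few lines of Python. The sketch below assumes the three-column layout commonly used for chord annotations (start time in seconds, end time in seconds, chord label, separated by whitespace). The sample text and the function names `parse_lab` and `chord_durations` are made up for illustration, so check them against an actual output file.

```python
def parse_lab(text):
    """Parse lab-style text into a list of (start, end, label) tuples."""
    segments = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 3:
            raise ValueError(f"line {line_number}: expected 3 fields, got {len(parts)}")
        start, end, label = float(parts[0]), float(parts[1]), parts[2]
        segments.append((start, end, label))
    return segments


def chord_durations(segments):
    """Sum the time in seconds spent on each chord label."""
    totals = {}
    for start, end, label in segments:
        totals[label] = totals.get(label, 0.0) + (end - start)
    return totals


# Made-up example in the assumed layout; "N" is the usual no-chord label.
example = """0.000 1.500 N
1.500 3.000 C
3.000 4.500 A:min
4.500 6.000 C
"""

segments = parse_lab(example)
durations = chord_durations(segments)
print(durations)
```

To read a file produced by `test.py`, pass the file contents to `parse_lab`, for example `parse_lab(open(path).read())`, where `path` points to a lab file in the save folder.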

Attention Map

The figures show the attention probabilities of self-attention layers 1, 3, 5, and 8, respectively. These layers were chosen because they best illustrate the different characteristics of the layers. The input audio is the song "Just A Girl" (0m30s ~ 0m40s) by No Doubt from UsPop2002, which was part of the evaluation data.
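
To make "attention probability" concrete: in scaled dot-product self-attention, each frame's scores against all frames are passed through a softmax, so every row of the attention map is a probability distribution over frames. The sketch below computes such a map for toy data in plain Python. It is a generic illustration with made-up inputs, not the code in `btc_model.py`, and the function name `attention_probabilities` is hypothetical.

```python
import math


def attention_probabilities(queries, keys):
    """Return softmax(Q K^T / sqrt(d)) as a list of rows, one row per query."""
    dim = len(keys[0])
    rows = []
    for query in queries:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys
        ]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(score - peak) for score in scores]
        total = sum(exps)
        rows.append([value / total for value in exps])
    return rows


# Toy frame features (made up): frames 0 and 1 are similar, frame 2 differs.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attention = attention_probabilities(frames, frames)

for row in attention:
    print(" ".join(f"{p:.3f}" for p in row))
```

Each printed row sums to 1, and a frame puts more weight on frames with similar features, which is the kind of structure an attention map visualizes.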

Data

We used the Isophonics [1], Robbie Williams [2], and UsPop2002 [3] datasets, which consist of chord label files. Due to copyright restrictions, these datasets do not include audio files. The audio files used in this work were collected from online music service providers.

[1] http://isophonics.net/datasets

[2] B. Di Giorgi, M. Zanoni, A. Sarti, and S. Tubaro. Automatic chord recognition based on the probabilistic modeling of diatonic modal harmony. In Proc. of the 8th International Workshop on Multidimensional Systems, Erlangen, Germany, 2013.

[3] https://github.com/tmc323/Chord-Annotations

Reference

Comments

  • Comments on the code are always welcome.
