online_amt's Introduction

Real-time Automatic Piano Transcription System

This is the code for our Real-time Automatic Piano Transcription System, which was presented at the SK Telecom Tech Gallery in Pangyo, Korea. The documentation is still a work in progress.

The system is built on the AMT model from Polyphonic Piano Transcription Using Autoregressive Multi-State Note Model (ISMIR 2020). For a detailed explanation of the system implementation, please refer to ISMIR 2020 LBD.

Requirements

  • Flask==1.1.2
  • scipy==1.4.1
  • numpy==1.16.2
  • PyAudio==0.2.11
  • librosa==0.7.2
  • matplotlib==3.1.1
  • torch==1.6.0
  • rtmidi==2.3.4
  • python_rtmidi==1.1.2
  • numba==0.48

Pre-trained Model

The pre-trained model for AMT is uploaded with Git-LFS. If you are not familiar with Git-LFS, you can download the model from here.

The model was trained on MAESTRO v2.0.0, based on the code by Jongwook Kim.

Usage

Caution

If you run the code on a laptop and use its built-in microphone as input, the laptop's fan noise will severely degrade AMT performance. We recommend using an external microphone, or routing internal audio through a tool such as Soundflower.

With a web browser visualization

$ python run_on_web.py

Then, open http://127.0.0.1:5000/ in your browser. Once the page has loaded, the AMT model runs automatically until a keyboard interrupt.

With a matplotlib visualization

$ python run_on_plt.py

online_amt's People

Contributors

jdasam

online_amt's Issues

How did you train the model? Could you open source the training code?

According to my understanding, the autoregressive LSTM model needs to be trained 'stateful' in batches. More precisely, the LSTM hidden states in batch N need to be passed to batch N+1. How did you train the model? Did you train it in mini-batches by splitting piano rolls into pieces and using a stateful LSTM, or did you train on the audio samples one by one (stochastic gradient descent)?

Array error

When I run run_on_plt.py, I get this error.

PS D:\online_amt-master> python run_on_plt.py
* recording
C:\Users\Ulises\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\numpy\core\_asarray.py:102: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  return array(a, dtype, copy=False, order=order)
TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "run_on_plt.py", line 83, in <module>
    main(args.model_file)
  File "run_on_plt.py", line 75, in main
    draw_plot(q)
  File "run_on_plt.py", line 58, in draw_plot
    new_roll[:, -num_updated:] = np.asarray(updated_frames).T
ValueError: setting an array element with a sequence.

I would be more than grateful if you could help me =) Thanks in advance
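
The warning and the `ValueError` in the traceback above are consistent with `updated_frames` containing entries of different lengths, so `np.asarray` cannot build a rectangular array. The thread does not establish why the frames end up ragged. As a minimal defensive sketch, assuming each frame should be a flat vector with one value per piano key, malformed frames can be dropped before stacking. The helper name `stack_frames` and the default of 88 pitches are assumptions for illustration and are not taken from the repository.

```python
import numpy as np


def stack_frames(frames, n_pitches=88):
    """Stack well-formed frames into an (n_pitches, n_frames) array.

    Frames that cannot be converted to a flat float vector of length
    n_pitches are skipped. n_pitches=88 is an assumed default.
    """
    good = []
    for frame in frames:
        try:
            arr = np.asarray(frame, dtype=float)
        except (ValueError, TypeError):
            continue  # nested or non-numeric frame
        if arr.shape == (n_pitches,):
            good.append(arr)
    if not good:
        return np.empty((n_pitches, 0))
    return np.stack(good, axis=1)


frames = [[0.0] * 88, [1.0] * 88, [0.0] * 5]  # last frame is malformed
roll = stack_frames(frames)
print(roll.shape)  # (88, 2)
```

The result is already laid out as pitch by time, so it could stand in for `np.asarray(updated_frames).T`, provided `num_updated` is taken from `roll.shape[1]` rather than from the length of the raw list.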

No Default Input Device Available

Hello, I'm trying to run it through Ubuntu WSL with a browser visualization.

(base) ulaili@DESKTOP-UM451RI:/mnt/d/online_amt-master$ python run_on_web.py
http://127.0.0.1:5000/
 * Serving Flask app "run_on_web" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
http://127.0.0.1:5000/
shared memfd open() failed: Function not implemented
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5220:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM sysdefault
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5220:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM sysdefault
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
ALSA lib pcm_oss.c:377:(_snd_pcm_oss_open) Unknown field port
shared memfd open() failed: Function not implemented
ALSA lib pulse.c:242:(pulse_connect) PulseAudio: Unable to connect: Connection refused

shared memfd open() failed: Function not implemented
ALSA lib pulse.c:242:(pulse_connect) PulseAudio: Unable to connect: Connection refused

ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_hw.c:1829:(_snd_pcm_hw_open) Invalid value for card
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib pcm_usb_stream.c:486:(_snd_pcm_usb_stream_open) Invalid type for card
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5220:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5220:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:767:(parse_card) cannot find card '0'
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1246:(snd_func_refer) error evaluating name
ALSA lib conf.c:4732:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5220:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2642:(snd_pcm_open_noupdate) Unknown PCM dmix
Exception in thread <function get_buffer_and_transcribe at 0x7f36f3b461f0>:
Traceback (most recent call last):
  File "/home/ulaili/miniconda3/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/ulaili/miniconda3/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/d/online_amt-master/run_on_web.py", line 44, in get_buffer_and_transcribe
    CHANNELS = pyaudio.PyAudio().get_default_input_device_info()['maxInputChannels']
  File "/home/ulaili/.local/lib/python3.8/site-packages/pyaudio.py", line 949, in get_default_input_device_info
    device_index = pa.get_default_input_device()
OSError: No Default Input Device Available

Is it possible for it to grab my computer's audio with Voicemeeter? Your work seems very interesting to try, though I'm no expert here. I'd appreciate your guidance, thank you!
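
In the log above, the `OSError` comes from `get_default_input_device_info()`, which PyAudio raises when PortAudio reports no default input device. One way to diagnose this, sketched here under assumptions, is to enumerate the devices PortAudio can see and keep those with at least one input channel. The helper names `usable_inputs` and `query_devices` are hypothetical, and the sample dictionaries are made up to mirror the keys PyAudio returns. Whether a virtual device such as Voicemeeter shows up inside WSL is not answered by this sketch.

```python
def usable_inputs(device_infos):
    """Return (index, name) pairs for devices that can record audio."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def query_devices():
    """Collect device info dicts from PortAudio via PyAudio."""
    import pyaudio  # imported lazily so usable_inputs works without it

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i)
                for i in range(pa.get_device_count())]
    finally:
        pa.terminate()


# Made-up sample data with the same keys PyAudio reports.
sample = [
    {"index": 0, "name": "Speakers", "maxInputChannels": 0},
    {"index": 1, "name": "USB Microphone", "maxInputChannels": 2},
]
print(usable_inputs(sample))  # [(1, 'USB Microphone')]
```

On a machine with PyAudio installed, `usable_inputs(query_devices())` lists the real devices. An empty list means no input device is visible to PortAudio in that environment, which matches the error in this issue.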

Small model checkpoint

Hi,

There is a small model with five states in Table I of the paper. Could you please share the small model's last checkpoint as well?

Many thanks in advance!

Questions about the code

In the code, there is a 5-element vector representing the logits of each state for each pitch. First of all, what is the label of each state? Creating an enum to represent the states would improve the code's readability. BTW, I think the order of states is

  • 0: off
  • 1: offset
  • 2: on
  • 3: onset
  • 4: re-onset.

Is this correct?

And, why do you double the logits of the two last states?

language_out[0,0,:,3:5] *= 2
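
The enum suggested in this issue could look like the sketch below. The state order is the one guessed in the question and has not been confirmed by the authors. The `decode_state` helper is hypothetical. It assumes the state is chosen by an argmax after scaling, which the source does not state, and it only illustrates how multiplying the last two logits by 2 can shift the decision toward onset states.

```python
from enum import IntEnum


class NoteState(IntEnum):
    # Order guessed in this issue; NOT confirmed by the authors.
    OFF = 0
    OFFSET = 1
    ON = 2
    ONSET = 3
    REONSET = 4


def decode_state(logits, boost=2.0):
    """Scale the ONSET and REONSET logits by `boost`, then take the argmax."""
    scaled = list(logits)
    for state in (NoteState.ONSET, NoteState.REONSET):
        scaled[state] *= boost
    return NoteState(max(range(len(scaled)), key=scaled.__getitem__))


logits = [0.1, 0.2, 1.5, 1.0, 0.3]  # made-up values for one pitch
print(decode_state(logits, boost=1.0).name)  # ON
print(decode_state(logits).name)             # ONSET
```

With names like these, a slice such as `3:5` could be written as `NoteState.ONSET:NoteState.REONSET + 1`, which makes the intent of the doubling easier to read.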
