polymath's Introduction

Polymath

Polymath uses machine learning to convert any music library (e.g. from a hard drive or YouTube) into a music production sample library. The tool automatically separates songs into stems (beats, bass, etc.), quantizes them to the same tempo and beat-grid (e.g. 120 BPM), analyzes musical structure (e.g. verse, chorus, etc.), key (e.g. C4, E3, etc.) and other information (timbre, loudness, etc.), and converts audio to MIDI. The result is a searchable sample library that streamlines the workflow for music producers, DJs, and ML audio developers.

Use-cases

Polymath makes it effortless to combine elements from different songs to create unique new compositions: Simply grab a beat from a Funkadelic track, a bassline from a Tito Puente piece, and fitting horns from a Fela Kuti song, and seamlessly integrate them into your DAW in record time. Using Polymath's search capability to discover related tracks, it is a breeze to create a polished, hour-long mash-up DJ set. For ML developers, Polymath simplifies the process of creating a large music dataset, for training generative models, etc.

How does it work?

  • Music Source Separation is performed with the Demucs neural network
  • Music Structure Segmentation/Labeling is performed with the sf_segmenter neural network
  • Music Pitch Tracking and Key Detection are performed with the Crepe neural network
  • Music to MIDI transcription is performed with the Basic Pitch neural network
  • Music Quantization and Alignment are performed with pyrubberband
  • Music Info retrieval and processing is performed with librosa

Community

Join the Polymath Community on Discord

Requirements

You need to have the following software installed on your system:

  • ffmpeg

Installation

You need Python version >=3.7 and <=3.10. From your terminal, run:

git clone https://github.com/samim23/polymath
cd polymath
pip install -r requirements.txt
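
Before installing, it can help to confirm that the active interpreter falls inside the stated range. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical, and it assumes that "<=3.10" means any 3.10.x release is acceptable.

```python
import sys

# Supported range taken from the text above: >=3.7 and <=3.10.
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 10)


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) pair falls inside the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the supported range"
    print(f"Python {current} is {status}")
```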

If you run into an issue with basic-pitch while trying to run Polymath, run this command after your installation:

pip install git+https://github.com/spotify/basic-pitch.git

GPU support

Most of the libraries Polymath uses come with native GPU support through CUDA. Please follow the steps at https://www.tensorflow.org/install/pip to set up TensorFlow for use with CUDA. Once you have followed these steps, TensorFlow and PyTorch will both automatically pick up the GPU and use it. This only applies to native setups; for dockerized deployments (see next section), GPU support is forthcoming.

Docker setup

If you have Docker installed on your system, you can use the provided Dockerfile to quickly build a polymath docker image (if your user is not part of the docker group, remember to prepend sudo to the following command):

docker build -t polymath ./

In order to exchange input and output files between your host system and the Polymath Docker container, you need to create the following four directories:

  • ./input
  • ./library
  • ./processed
  • ./separated
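
The four directories can be created by hand or with a short script. The sketch below is one way to do it from Python; the function name `create_exchange_dirs` is hypothetical, and it assumes you run it from the root of the cloned repository.

```python
from pathlib import Path

# Directory names taken from the list above.
EXCHANGE_DIRS = ("input", "library", "processed", "separated")


def create_exchange_dirs(base="."):
    """Create the four exchange directories under `base` and return their paths."""
    created = []
    for name in EXCHANGE_DIRS:
        path = Path(base) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created
```

Calling `create_exchange_dirs()` with no argument creates the directories relative to the current working directory, which matches the `"$(pwd)"` paths used in the docker run command.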

Now put any files you want to process with Polymath into the input folder. Then run Polymath through Docker with the docker run command, passing any arguments that you would otherwise pass to the python command. For example, on Linux:

docker run \
    -v "$(pwd)"/processed:/polymath/processed \
    -v "$(pwd)"/separated:/polymath/separated \
    -v "$(pwd)"/library:/polymath/library \
    -v "$(pwd)"/input:/polymath/input \
    polymath python /polymath/polymath.py -a ./input/song1.wav

Run Polymath

1. Add songs to the Polymath Library

Add YouTube video to library (auto-download)
python polymath.py -a n6DAqMFe97E
Add audio file (wav or mp3)
python polymath.py -a /path/to/audiolib/song.wav
Add multiple files at once
python polymath.py -a n6DAqMFe97E,eaPzCHEQExs,RijB8wnJCN0
python polymath.py -a /path/to/audiolib/song1.wav,/path/to/audiolib/song2.wav
python polymath.py -a /path/to/audiolib/

Songs are automatically analyzed once, which takes some time. Once in the database, they can be accessed rapidly. The database is stored in the file "/library/database.p". To reset everything, simply delete it.
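
To inspect the database outside of Polymath, the sketch below may help. It assumes that database.p is a standard Python pickle, which the ".p" extension suggests but which this document does not confirm. Because a pickle refers to classes by name, loading it from another script normally fails when those classes are not importable, so the sketch substitutes a generic `Placeholder` object for anything it cannot resolve. `Placeholder`, `TolerantUnpickler`, and `load_database` are hypothetical names, not part of Polymath.

```python
import pickle


class Placeholder:
    """Stand-in for classes that cannot be imported while unpickling."""

    def __init__(self, *args, **kwargs):
        self.init_args = args

    def __setstate__(self, state):
        # Instances usually carry a dict of attributes; keep anything else as-is.
        if isinstance(state, dict):
            self.__dict__.update(state)
        else:
            self.raw_state = state


class TolerantUnpickler(pickle.Unpickler):
    """Unpickler that substitutes Placeholder for classes it cannot resolve."""

    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            return Placeholder


def load_database(path="library/database.p"):
    """Load the database file and return whatever object it contains."""
    with open(path, "rb") as handle:
        return TolerantUnpickler(handle).load()
```

Unpickling can execute arbitrary code, so only load database files that you created yourself.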

2. Quantize songs in the Polymath Library

Quantize a specific song in the library to tempo 120 BPM (-q = database audio file ID, -t = tempo in BPM)
python polymath.py -q n6DAqMFe97E -t 120
Quantize all songs in the library to tempo 120 BPM
python polymath.py -q all -t 120
Quantize a specific song in the library to the tempo of the song (-k)
python polymath.py -q n6DAqMFe97E -k

Songs are automatically quantized to the same tempo and beat-grid and saved to the folder “/processed”.
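
The arithmetic behind a tempo change can be sketched as follows. This is a simplified illustration of a uniform time stretch only; it does not reproduce Polymath's beat-grid alignment, and the function names are hypothetical. The rate convention (values above 1 speed the audio up) is the one commonly used by time-stretching libraries.

```python
def stretch_rate(source_bpm: float, target_bpm: float) -> float:
    """Playback-speed factor that moves source_bpm to target_bpm (>1 speeds up)."""
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("BPM values must be positive")
    return target_bpm / source_bpm


def stretched_duration(duration_s: float, source_bpm: float, target_bpm: float) -> float:
    """Duration in seconds after a uniform stretch to the target tempo."""
    return duration_s / stretch_rate(source_bpm, target_bpm)


if __name__ == "__main__":
    rate = stretch_rate(136.0, 120.0)
    print(f"rate: {rate:.3f}")
    print(f"a 240 s song becomes {stretched_duration(240.0, 136.0, 120.0):.1f} s")
```

Slowing a song down lengthens it and speeding it up shortens it, which is why quantized files from songs with different original tempos end up with different durations than their sources.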

3. Search for similar songs in the Polymath Library

Search for 10 similar songs based on a specific song in the library (-s = database audio file ID, -sa = results amount)
python polymath.py -s n6DAqMFe97E -sa 10
Search for similar songs based on a specific song in the library and quantize all of them to tempo 120 BPM
python polymath.py -s n6DAqMFe97E -sa 10 -q all -t 120
Include BPM as search criteria (-st)
python polymath.py -s n6DAqMFe97E -sa 10 -q all -t 120 -st -k

Similar songs are automatically found, optionally quantized, and saved to the folder "/processed". This makes it easy to create, for example, an hour-long mix of songs that match each other one after another.

4. Convert Audio to MIDI

Convert all processed audio files and stems to MIDI (-m)
python polymath.py -a n6DAqMFe97E -q all -t 120 -m

Generated MIDI files are currently always 120 BPM and need to be time-adjusted in your DAW. This will be resolved soon. The current audio-to-MIDI model gives mixed results with drums/percussion. This will be resolved with additional audio-to-MIDI model options in the future.

Audio Features

Extracted Stems

The Demucs Neural Net has settings that can be adjusted in the python file

- bass
- drum
- guitar
- other
- piano
- vocals

Extracted Features

The audio feature extractors have settings that can be adjusted in the python file

- tempo
- duration
- timbre
- timbre_frames
- pitch
- pitch_frames
- intensity
- intensity_frames
- volume
- avg_volume
- loudness
- beats
- segments_boundaries
- segments_labels
- frequency_frames
- frequency
- key

License

Polymath is released under the MIT license as found in the LICENSE file.

polymath's People

Contributors

akx, azilnik, dulanyw, lightshifted, mgiraldo, mpancia, natehouk, qwertykeith, samim23

polymath's Issues

Quantized files are mono

  1. Add file to library python3.9 polymath.py -a btQYekwDU0U
  2. Quantize all files to bpm python3.9 polymath.py -q all -t 80
  3. Look at files in ./processed - all are mono

I'm on m1 macOS Monterey

I tried changing lines 338/339

# load audio file
    y, sr = librosa.load(vid.audio, sr=None)

to...

# load audio file
    y, sr = librosa.load(vid.audio, sr=None, mono=False)

...but then I get the following error
TypeError: type numpy.ndarray doesn't define __round__ method

Thanks

[WinError 2] The system cannot find the file specified

Tried running a local .mp3 file and it resulted in this error after 8/8 split stems :

Traceback (most recent call last):
  File "polymath.py", line 689, in <module>
    main()
  File "polymath.py", line 621, in main
    audio_features = get_audio_features(file=file,file_id=vid.id)
  File "polymath.py", line 448, in get_audio_features
    stemsplit(file, 'htdemucs_6s')
  File "polymath.py", line 326, in stemsplit
    subprocess.run(["demucs", destination, "-n", demucsmodel]) # '--mp3'
  File "C:\Users\salwy\miniconda3\envs\polymath\lib\subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\salwy\miniconda3\envs\polymath\lib\subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\salwy\miniconda3\envs\polymath\lib\subprocess.py", line 1311, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
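
The traceback ends inside `subprocess.run(["demucs", ...])`, and `[WinError 2]` from `CreateProcess` generally means the executable itself could not be found. One likely cause, though not confirmed in this report, is that the `demucs` command is not on PATH in the active environment. A minimal sketch for checking this is below; the function name `missing_tools` is hypothetical, and ffmpeg is included because the README lists it as a requirement.

```python
import shutil


def missing_tools(tools=("ffmpeg", "demucs")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```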

Filename based issue when quantizing songs (Windows)

When trying to quantize a song which could before be added without a problem through:

python polymath.py -a G3dFpQzu54w

there is an error when trying to quantize it to 123 BPM (for testing purposes):

python polymath.py -q G3dFpQzu54w -t 123

this is the error output:

Quantize Audio: Target BPM 123 -- id: G3dFpQzu54w bpm: 136.0 frequency: 133.58 key: C3 timbre: -7.04 name: The Rolling Stones - Jumpin’ Jack Flash (Official Lyric Video) keepOriginalBpm: False
- Quantize Audio: source
Traceback (most recent call last):
  File "C:\polymath\polymath.py", line 674, in <module>
    main()
  File "C:\polymath\polymath.py", line 640, in main
    quantizeAudio(videos[idx], bpm=tempo, keepOriginalBpm = keepOriginalBpm, pitchShiftFirst = pitchShiftFirst)
  File "C:\polymath\polymath.py", line 371, in quantizeAudio
    sf.write(path, strechedaudio, sr)
  File "C:\ProgramData\Miniconda3\envs\polymath\lib\site-packages\soundfile.py", line 430, in write
    with SoundFile(file, 'w', samplerate, channels,
  File "C:\ProgramData\Miniconda3\envs\polymath\lib\site-packages\soundfile.py", line 740, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\ProgramData\Miniconda3\envs\polymath\lib\site-packages\soundfile.py", line 1264, in _open
    _error_check(_snd.sf_error(file_ptr),
  File "C:\ProgramData\Miniconda3\envs\polymath\lib\site-packages\soundfile.py", line 1455, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'C:\\polymath/processed/G3dFpQzu54w - The Rolling Stones - Jumpin’ Jack Flash (Official Lyric Video) - Key: C3 - Freq: 133.58 - Timbre: -7.04 - BPM Original: 135 - BPM: 123.wav': System error.

as far as I understand, the reason is the filename. When inserting in line 368 in polymath.py the following:

    path = os.getcwd() + "/processed/testfile.wav"

the (first) file gets saved without a problem. This is obviously not a real solution to the problem.

So I changed the script and got rid of the ":"s which Windows doesn't seem to like (apart from volume names):

    # save audio to disk   
    path = os.getcwd() + "/processed/" + vid.id +  " - " + vid.name + " - Key " + vid.audio_features['key']  + " - Freq " + str(round(vid.audio_features['frequency'],2)) + " - Timbre " + str(round(vid.audio_features['timbre'],2)) + " - BPM Original " + str(int(vid.audio_features['tempo'])) + " - BPM " + str(bpm) +".wav"
    sf.write(path, strechedaudio, sr)

and for the stems files:

        # save stems to disk
        path = os.getcwd() + "/processed/" + vid.id +  " - " + vid.name + " - Stem " + stem +  " - Key " + vid.audio_features['key']  + " - Freq " + str(round(vid.audio_features['frequency'],2)) + " - Timbre " + str(round(vid.audio_features['timbre'],2)) + " - BPM Original " + str(int(vid.audio_features['tempo'])) + " - BPM " + str(bpm) +".wav"
        sf.write(path, strechedaudio, sr)

Maybe it would be wise to escape the filenames in a proper way for the different OSs, but I don't know the conventions and now it's working for me.
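
A portable alternative to removing the colons by hand is to sanitize the whole filename before writing. The sketch below replaces the characters that Windows does not allow in filenames (`< > : " / \ | ? *` and control characters) and strips trailing dots and spaces. The helper name `safe_filename` is hypothetical, it is not part of polymath.py, and it does not handle reserved device names such as CON or NUL.

```python
import re

# Characters that Windows rejects in filenames, plus ASCII control characters.
_WINDOWS_RESERVED = r'[<>:"/\\|?*\x00-\x1f]'


def safe_filename(name: str, replacement: str = "_") -> str:
    """Return `name` with characters that are invalid on Windows replaced."""
    cleaned = re.sub(_WINDOWS_RESERVED, replacement, name)
    # Windows also rejects names that end in a dot or a space.
    cleaned = cleaned.rstrip(" .")
    return cleaned or "untitled"
```

The function is meant for the filename part only, for example `os.path.join(os.getcwd(), "processed", safe_filename(title + ".wav"))`, because it would also replace the separators of a full path.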

I've added the diff-file in case someone has the same problem.
diff.txt

Trouble with M1 install, did not manage to locate a library called 'sndfile'

Following install instructions on M1 MacBook Air via README and adapted from https://developer.apple.com/metal/tensorflow-plugin/

$ uname -a
Darwin [hostname] 21.6.0 Darwin Kernel Version 21.6.0: Mon Dec 19 20:46:01 PST 2022;
root:xnu-8020.240.18~2/RELEASE_ARM64_T8101 arm64

$ git clone https://github.com/samim23/polymath
$ cd polymath
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
$ bash Miniconda3-latest-MacOSX-arm64.sh -b -p `pwd`/miniconda
$ source miniconda/bin/activate
$ conda install -c apple tensorflow-deps
$ python -m pip install -U pip
$ python -m pip install tensorflow-macos
$ python -m pip install tensorflow-metal
$ pip install -r requirements.txt
$ python polymath.py -a m98qb8ecQf0
...
OSError: ctypes.util.find_library() did not manage to locate a library called 'sndfile'

$ pip install soundfile
Requirement already satisfied: soundfile in ./miniconda/lib/python3.10/site-packages (0.11.0)
Requirement already satisfied: cffi>=1.0 in ./miniconda/lib/python3.10/site-packages (from soundfile) (1.15.1)
Requirement already satisfied: pycparser in ./miniconda/lib/python3.10/site-packages (from cffi>=1.0->soundfile) (2.21)

$ brew install libsndfile
Warning: libsndfile 1.2.0 is already installed and up-to-date.

Please let me know if I'm missing something!

Error on Second stage (pitch tracking)

Windows 10

Tried running on a YouTube video.
This is the traceback:
  File "C:\Users\Dimi novo\polymath\polymath.py", line 690, in <module>
    main()
  File "C:\Users\Dimi novo\polymath\polymath.py", line 622, in main
    audio_features = get_audio_features(file=file,file_id=vid.id)
  File "C:\Users\Dimi novo\polymath\polymath.py", line 424, in get_audio_features
    frequency_frames = get_pitch_dnn(file)
  File "C:\Users\Dimi novo\polymath\polymath.py", line 317, in get_pitch_dnn
    time, frequency, confidence, activation = crepe.predict(audio, sr, model_capacity="tiny", viterbi=True, center=True, step_size=10, verbose=1) # tiny|small|medium|large|full
  File "C:\Users\Dimi novo\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\crepe\core.py", line 255, in predict
    activation = get_activation(audio, sr, model_capacity=model_capacity,
  File "C:\Users\Dimi novo\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\crepe\core.py", line 185, in get_activation
    model = build_and_load_model(model_capacity)
  File "C:\Users\Dimi novo\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\crepe\core.py", line 44, in build_and_load_model
    from tensorflow.keras.layers import Input, Reshape, Conv2D, BatchNormalization
ModuleNotFoundError: No module named 'tensorflow'

Verification of certificate failed

Crazy how far this tech has come lately. Nice work. I've got to this point "pip install -r requirements.txt" and I get the below error message. Seems as if there was an issue with verifying the SSL certificate when attempting to download. I've upgraded SSL certificates with pip install --upgrade certifi and still getting the error. Have any ideas to get past that? Thanks

djm@DJs-MacBook-Pro polymath % pip install -r requirements.txt
Collecting crepe==0.0.13
  Using cached crepe-0.0.13.tar.gz (15 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [48 lines of output]
      /private/var/folders/t4/7st1z6512vv_cqxm52tvglz40000gn/T/pip-install-ubxfn9y6/crepe_7a8be73f6b92479495dd90a3163e2079/setup.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
        import imp
      Traceback (most recent call last):
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1348, in do_open
          h.request(req.get_method(), req.selector, req.data, headers,
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1282, in request
          self._send_request(method, url, body, headers, encode_chunked)
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1328, in _send_request
          self.endheaders(body, encode_chunked=encode_chunked)
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1277, in endheaders
          self._send_output(message_body, encode_chunked=encode_chunked)
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1037, in _send_output
          self.send(msg)
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 975, in send
          self.connect()
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1454, in connect
          self.sock = self._context.wrap_socket(self.sock,
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 512, in wrap_socket
          return self.sslsocket_class._create(
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1070, in _create
          self.do_handshake()
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1341, in do_handshake
          self._sslobj.do_handshake()
      ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/private/var/folders/t4/7st1z6512vv_cqxm52tvglz40000gn/T/pip-install-ubxfn9y6/crepe_7a8be73f6b92479495dd90a3163e2079/setup.py", line 30, in <module>
          urlretrieve(base_url + compressed_file, compressed_path)
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 241, in urlretrieve
          with contextlib.closing(urlopen(url, data)) as fp:
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
          return opener.open(url, data, timeout)
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
          response = self._open(req, data)
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
          result = self._call_chain(self.handle_open, protocol, protocol +
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
          result = func(*args)
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
          return self.do_open(http.client.HTTPSConnection, req,
        File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
          raise URLError(err)
      urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>
      Downloading weight file model-tiny.h5.bz2 ...
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Conversion to midi

I'm very impressed by what you've done this far.

I have a question about MIDI conversion. I used it, but on Windows I had problems writing the file to the processed directory, so I had to change the name to something more standard (toto + _{timestamp}). However, when the conversion is finished, I have multiple .mid files, which I think come from the split instruments. Is there a way to get a single MIDI file that combines all the separated instruments?

Thanks

libtorch_cuda_cpp.so not found

Good morning, this is the first issue I open on GitHub, so please be patient :)

I am on Arch Linux (6.1.12-arch1-1 x86_64).
I installed polymath through pip install.

My pip version is pip 23.0.1 from /usr/lib/python3.10/site-packages/pip (python 3.10).

I encountered this problem.

OSError: libtorch_cuda_cpp.so: cannot open shared object file: No such file or directory

My notebook is an MSI GE65 Raider 9SE and PyTorch uses my NVidia 2060 GPU.

Here is the command I gave and the output I received:

[fz@FZPC polymath]$ python polymath.py -a library/1.wav
---------------------------------------------------------------------------- 
--------------------------------- POLYMATH --------------------------------- 
---------------------------------------------------------------------------- 
No Database file found: library/database.p
add video: library/1.wav to videos: 0
add wav or mp3 file
------ process audio library/1.wav
Finished procesing files: 1
------------------------------ Files in DB: 1 ------------------------------
is audio 1_a4e3843739c26f461790d53394bfe77c00b26ea297a1328968b00b611ce09902 1 /home/fz/Documents/github/polymath/library/1_a4e3843739c26f461790d53394bfe77c00b26ea297a1328968b00b611ce09902.wav
------------------------------ get_audio_features: 1_a4e3843739c26f461790d53394bfe77c00b26ea297a1328968b00b611ce09902 ------------------------------
1/8 segementation
 > 0
 > 1
 > 2
 > 3
 > 4
 > 5
 > 6
 > 7
 > 8
 > 9
 > 10
 > 11
 > 12
 > 13
 > 14
 > 15
 > 16
 > 17
 > 18
 > 19
 > 20
 > 21
 > 22
 > 23
2/8 pitch tracking
2023-02-26 20:55:17.954140: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:17.988489: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:17.988907: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:17.989552: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-26 20:55:17.991506: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:17.992071: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:17.992463: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:18.799081: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:18.799450: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:18.799736: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-26 20:55:18.799898: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4623 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5
2023-02-26 20:55:28.007107: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8600
688/688 [==============================] - 8s 7ms/step
3/8 load sample
4/8 sample separation
5/8 beat tracking
6/8 feature extraction
/home/fz/.local/lib/python3.10/site-packages/librosa/util/decorators.py:88: UserWarning: power_to_db was called on complex input so phase information will be discarded. To suppress this warning, call power_to_db(np.abs(D)**2) instead.
  return f(*args, **kwargs)
/home/fz/Documents/github/polymath/polymath.py:302: FutureWarning: Pass y=[0. 0. 0. ... 0. 0. 0.] as keyword args. From version 0.10 passing these as positional arguments will result in an error
  S = librosa.feature.melspectrogram(y, sr=sr, n_mels=128)
7/8 feature aggregation
8/8 split stems
Traceback (most recent call last):
  File "/home/fz/.local/bin/demucs", line 5, in <module>
    from demucs.separate import main
  File "/home/fz/.local/lib/python3.10/site-packages/demucs/separate.py", line 14, in <module>
    import torchaudio as ta
  File "/home/fz/.local/lib/python3.10/site-packages/torchaudio/__init__.py", line 1, in <module>
    from torchaudio import (  # noqa: F401
  File "/home/fz/.local/lib/python3.10/site-packages/torchaudio/_extension.py", line 135, in <module>
    _init_extension()
  File "/home/fz/.local/lib/python3.10/site-packages/torchaudio/_extension.py", line 105, in _init_extension
    _load_lib("libtorchaudio")
  File "/home/fz/.local/lib/python3.10/site-packages/torchaudio/_extension.py", line 52, in _load_lib
    torch.ops.load_library(path)
  File "/usr/lib/python3.10/site-packages/torch/_ops.py", line 573, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libtorch_cuda_cpp.so: cannot open shared object file: No such file or directory
1_a4e3843739c26f461790d53394bfe77c00b26ea297a1328968b00b611ce09902 tempo 172.27 duration 220.15 timbre -7.37 pitch 0.5 intensity -40.34 segments 25 frequency 395.55 key G4 name 1

thanks in advance for your help

install on M1 mac stopped working

macOS Ventura 13.2.1 (22D68)
Apple M1 Max

I pulled the latest (5b89899) commit, which comes with os/platform-specific tensorflow, and am getting this error on pip install:

ERROR: Could not find a version that satisfies the requirement tensorflow<2.10,>=2.4.1 (from basic-pitch) (from versions: none)
ERROR: No matching distribution found for tensorflow<2.10,>=2.4.1

I had previously managed to get it working (#22) so not sure what is going on now.

YoutubeDL filenaming inconsistent

I encountered an issue with ytdl and filenames when downloading videos whose ID starts with a hyphen.
For example: -qFw9qljQf4
If the ID begins with a hyphen (-), the command fails because the leading hyphen causes the ID to be read as another option:
polymath.py: error: argument -a/--add: expected one argument
A workaround is to replace the hyphen with %2d and run instead:
polymath.py -a %2dqFw9qljQf4
The file gets downloaded, but is then saved under a filename that is not renamed correctly, so the file that Polymath tries to open does not exist:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\~\\polymath\\library\\-qFw9qljQf4.wav'
Renaming the downloaded file so that the %2d prefix becomes a - and then running the exact same command again resolves the issue as a workaround.
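
The "expected one argument" message has the format produced by Python's argparse, which treats any token that starts with a hyphen as an option. With argparse, joining the option and its value with "=" passes the value through unchanged. The sketch below demonstrates this on a stand-in parser; `build_parser` is hypothetical and only mirrors the `-a/--add` option named in the error message, and the approach has not been tested against Polymath itself, so the later filename problem described above may still occur.

```python
import argparse


def build_parser():
    """Stand-in parser with a single -a/--add option (not Polymath's real parser)."""
    parser = argparse.ArgumentParser(prog="polymath.py")
    parser.add_argument("-a", "--add", help="YouTube ID or audio file to add")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    # Joining option and value with "=" stops argparse from reading the
    # leading hyphen of the ID as the start of another option.
    args = parser.parse_args(["--add=-qFw9qljQf4"])
    print(args.add)
```

On the command line, the equivalent call under these assumptions would be `python polymath.py --add=-qFw9qljQf4`.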

Format not recognised

macOS Ventura 13.2.1 (22D68)
Apple M1 Max CPU

after successfully processing a few files i decided to update my local clone to the latest commit (be86356) and now am getting this error:

Traceback (most recent call last):
  File "/path/to/polymath/polymath.py", line 689, in <module>
    main()
  File "/path/to/polymath/polymath.py", line 573, in main
    videos = audio_process(vids,videos)
  File "/path/to/polymath/polymath.py", line 156, in audio_process
    sf.write(path, y, 44100)
  File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/site-packages/soundfile.py", line 430, in write
    with SoundFile(file, 'w', samplerate, channels,
  File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/site-packages/soundfile.py", line 740, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/site-packages/soundfile.py", line 1264, in _open
    _error_check(_snd.sf_error(file_ptr),
  File "/Users/me/.pyenv/versions/3.9.7/lib/python3.9/site-packages/soundfile.py", line 1455, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening '/path/to/polymath/library/the_file_name.wav': Format not recognised.

started reverting commits one by one until 6228a5a where this error no longer appears

FFmpeg is required

Might want to mention in the README that FFmpeg is also required, but can't be installed via pip. Homebrew works, though.

database.p file

How do I interact with the database.p file ? is it a pickle file ?

incomplete installation on ubuntu --> basic-pitch 0.2.0 depends on tensorflow<2.10 and >=2.4.1

Ciao Samim,
I tried installing polymath on my Ubuntu 22.10 machine with Python 3.10.7, and i got the following error:

ERROR: Cannot install basic-pitch==0.2.0 and tensorflow==2.11 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested tensorflow==2.11
    basic-pitch 0.2.0 depends on tensorflow<2.10 and >=2.4.1

I tried to loosen the version for tensorflow but then i got tensorflow2.9 and other errors:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. open-clip-torch 2.7.0 requires protobuf==3.20.0, but you have protobuf 3.19.6 which is incompatible. clean-fid 0.1.29 requires requests==2.25.1, but you have requests 2.28.2 which is incompatible.
Any hint is appreciated. Thanks a lot for your great piece of software! (polymath is installing and running fine on my mac OS machine)

The stereo sound source will become mono when need to be converted.

Hi, this is noe. Thanks for your awesome work.

As the title says, if the input audio requires conversion (e.g., if it has a sampling rate other than 44.1 kHz, or if it is an mp3 instead of a wav), the input audio seems to be converted to mono. As a result, the output audio will sound very different.

This problem does not occur with videos downloaded from YouTube or with 44.1 kHz wav sources.

Have a nice day!

AttributeError: module 'numpy' has no attribute 'bool'

Hello,
os: Macos ventura 13.4
python version: 3.10.9

I've successfully cloned and installed but when i try to quantize a song from youtube i get this error:

me@MBP polymath % python3 polymath.py -a QYMwNMj41LI -q all -t 120 -m
---------------------------------------------------------------------------- 
--------------------------------- POLYMATH --------------------------------- 
---------------------------------------------------------------------------- 
add video: QYMwNMj41LI to videos: 1
------ process video QYMwNMj41LI
already in db QYMwNMj41LI
2023-06-15 22:37:35.323056: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
------------------------------ Files in DB: 1 ------------------------------
------------------------------ get_audio_features: QYMwNMj41LI ------------------------------
1/8 segementation
/usr/local/lib/python3.10/site-packages/librosa/segment.py:523: FutureWarning: In the future `np.bool` will be defined as the corresponding NumPy scalar.
  rec = rec.astype(np.bool)
Traceback (most recent call last):
  File "/Users/me/Documents/perso/polymath/polymath.py", line 742, in <module>
    main()
  File "/Users/me/Documents/perso/polymath/polymath.py", line 674, in main
    audio_features = get_audio_features(file=file,file_id=vid.id, extractMidi=extractmidi)
  File "/Users/me/Documents/perso/polymath/polymath.py", line 451, in get_audio_features
    segments_boundaries,segments_labels = get_segments(file)
  File "/Users/me/Documents/perso/polymath/polymath.py", line 316, in get_segments
    boundaries, labs = segmenter.proc_audio(audio_file)
  File "/usr/local/lib/python3.10/site-packages/sf_segmenter/segmenter.py", line 250, in proc_audio
    return self.process(pcp, is_label=is_label)
  File "/usr/local/lib/python3.10/site-packages/sf_segmenter/segmenter.py", line 283, in process
    R = librosa.segment.recurrence_matrix(
  File "/usr/local/lib/python3.10/site-packages/librosa/util/decorators.py", line 88, in inner_f
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/librosa/segment.py", line 523, in recurrence_matrix
    rec = rec.astype(np.bool)
  File "/usr/local/lib/python3.10/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'bool'.
`np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'bool_'?
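
One possible workaround, sketched here under assumptions, is to restore the removed alias before librosa is imported. NumPy removed `np.bool` in release 1.24, and the traceback shows the installed librosa still referencing it. The helper name `restore_alias` is hypothetical, and a stand-in module is used so the snippet runs without NumPy installed.

```python
import types


def restore_alias(module, name, target):
    """Set module.<name> = target only if the attribute is missing.

    Returns True when the alias was added, False when it already existed.
    """
    if not hasattr(module, name):
        setattr(module, name, target)
        return True
    return False


# Stand-in for an installed NumPy >= 1.24, which has no `bool` attribute.
fake_numpy = types.ModuleType("fake_numpy")
patched = restore_alias(fake_numpy, "bool", bool)
print(patched, fake_numpy.bool is bool)
```

With the real library this would be `restore_alias(numpy, "bool", bool)` placed before `import librosa`. Pinning NumPy below 1.24 avoids the monkeypatch altogether and is likely the cleaner option.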

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training

Last week the pre-print, pre-trained models, and training code for MERT were released:

The paper reports good performance on 14 different music understanding tasks using the representations learned by the model. Given this flexibility, I think it could be quite interesting to integrate this model into polymath.

What are some of the design considerations (functional & non-functional) for implementing this model effectively?

I think the main benefit would be just extracting deep features and using these e.g. for search.

However, maybe some of the downstream tagging tasks like instrument, mood, or genre could be used as well? In this case, despite not being state-of-the-art, it would save having separate models for each of these tags.
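
The search idea above can be illustrated with a minimal sketch, assuming each track has already been reduced to a fixed-length feature vector. The vectors and track names below are made up for illustration and are not MERT outputs; `cosine_similarity` and `most_similar` are hypothetical helpers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def most_similar(query, library, top_k=2):
    """Return the names of the top_k library entries closest to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in library.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical 3-dimensional embeddings; real ones would be much larger.
library = {
    "track_a": [1.0, 0.0, 0.0],
    "track_b": [0.9, 0.1, 0.0],
    "track_c": [0.0, 1.0, 0.0],
}
result = most_similar([1.0, 0.0, 0.0], library)
print(result)
```

A brute-force scan like this is fine for a small library; a larger one would probably want an approximate nearest-neighbour index instead.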

Change BPM does not process entire file

When using the quantize option with a slower bpm, the result files seem to drop audio towards the end of the file.

Steps to reproduce (from empty DB):

  1. python polymath.py -a I32Dwc242Cs (adds a 1-minute beat from YouTube for testing)
  2. python polymath.py -q all -t 50 (attempts to change bpm from ~105 to 50)

Expected result:
Output files in 'processed' should be ~2x the length of the original, because bpm is ~half the original.

Actual result:
Output files in 'processed' are exactly the same length as the source file. Beat quantization seems to die out near the end of the original file.
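
The expected length follows from the tempo ratio: stretching from the source tempo to the target tempo scales the duration by source_bpm / target_bpm. A small sketch of that arithmetic, using the approximate figures from this report (a 1-minute clip at about 105 bpm slowed to 50 bpm); `expected_duration` is a hypothetical helper, not a Polymath function.

```python
def expected_duration(original_seconds, source_bpm, target_bpm):
    """Duration after time-stretching audio from source_bpm to target_bpm."""
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("tempos must be positive")
    return original_seconds * source_bpm / target_bpm


stretched = expected_duration(60.0, 105.0, 50.0)
print(round(stretched, 1))  # 126.0
```

So a 60-second source should come out at roughly 126 seconds, a little over twice the original length.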

Cannot access repository

git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I haven't had this issue with GitHub before, and I'm guessing it might be something on your end? Cheers!
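
A `Permission denied (publickey)` message usually means the clone went over SSH without a key registered with GitHub, whereas the README's install step uses the HTTPS URL. As a sketch, an SSH-style remote can be rewritten to its HTTPS form like this; the SSH remote shown is an assumption, since the report does not include the command that was run, and `ssh_to_https` is a hypothetical helper.

```python
import re

# Matches scp-style remotes such as git@host:owner/repo.git
SSH_PATTERN = re.compile(r"^git@(?P<host>[^:]+):(?P<path>.+)$")


def ssh_to_https(remote):
    """Convert an scp-style SSH remote to HTTPS; return other inputs unchanged."""
    match = SSH_PATTERN.match(remote)
    if match is None:
        return remote
    return "https://{host}/{path}".format(**match.groupdict())


https_url = ssh_to_https("git@github.com:samim23/polymath.git")
print(https_url)
```

Cloning with the resulting HTTPS URL does not require an SSH key for a public repository.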

segmentation fault when extracting midi on n6DAqMFe97E windows wsl2

I performed a fresh install of WSL2 on my laptop,
then installed tensorflow, CUDA and polymath following the instructions at https://github.com/samim23/polymath and https://www.tensorflow.org/install/pip
GPU support is enabled.

When trying
python polymath.py -a n6DAqMFe97E -q all -t 120 -m
steps 1 to 7 seem to succeed.

Step 6 gives the following warning:

UserWarning: power_to_db was called on complex input so phase information will be discarded. To suppress this warning, call power_to_db(np.abs(D)**2) instead.

Step 8 gives the following warnings:

ffprobe: /home/alexis/anaconda3/envs/tf/lib/libncursesw.so.6: no version information available (required by /lib/x86_64-linux-gnu/libcaca.so.0)
ffprobe: /home/alexis/anaconda3/envs/tf/lib/libncursesw.so.6: no version information available (required by /lib/x86_64-linux-gnu/libcaca.so.0)
ffprobe: /home/alexis/anaconda3/envs/tf/lib/libtinfo.so.6: no version information available (required by /lib/x86_64-linux-gnu/libcaca.so.0)
ffmpeg: /home/alexis/anaconda3/envs/tf/lib/libncursesw.so.6: no version information available (required by /lib/x86_64-linux-gnu/libcaca.so.0)
ffmpeg: /home/alexis/anaconda3/envs/tf/lib/libncursesw.so.6: no version information available (required by /lib/x86_64-linux-gnu/libcaca.so.0)
ffmpeg: /home/alexis/anaconda3/envs/tf/lib/libtinfo.so.6: no version information available (required by /lib/x86_64-linux-gnu/libcaca.so.0)

Then it displays the following message:

- Extract Midi
Segmentation fault
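
A segmentation fault kills the interpreter without a Python traceback, which makes the failing step hard to pin down. One way to narrow it down, sketched here as an assumption rather than anything Polymath does, is to run the suspect step in a child process and inspect how it exited. `describe_exit` and `run_isolated` are hypothetical helpers.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exit code %d" % returncode


def run_isolated(code):
    """Run a Python snippet in a child process and describe how it ended."""
    proc = subprocess.run([sys.executable, "-c", code])
    return describe_exit(proc.returncode)


print(run_isolated("pass"))
print(describe_exit(-11))
```

Starting Python with `-X faulthandler` is another standard option: it prints the Python-level stack at the moment of the crash, which may show which native library is involved.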

Documentation or requirements.txt incomplete: tensorflow

Following the instructions in the README, I attempted to grab audio from YouTube. It downloaded okay but then:

2/8 pitch tracking
Traceback (most recent call last):
  File "/home/kjcole/gits/Audio/polymath/polymath.py", line 671, in <module>
    main()
  File "/home/kjcole/gits/Audio/polymath/polymath.py", line 618, in main
    audio_features = get_audio_features(file=file,file_id=vid.id)
  File "/home/kjcole/gits/Audio/polymath/polymath.py", line 395, in get_audio_features
    frequency_frames = get_pitch_dnn(file)
  File "/home/kjcole/gits/Audio/polymath/polymath.py", line 301, in get_pitch_dnn
    time, frequency, confidence, activation = crepe.predict(audio, sr, model_capacity="tiny", viterbi=True, center=True, step_size=10, verbose=1) # tiny|small|medium|large|full
  File "/home/kjcole/.local/lib/python3.10/site-packages/crepe/core.py", line 255, in predict
    activation = get_activation(audio, sr, model_capacity=model_capacity,
  File "/home/kjcole/.local/lib/python3.10/site-packages/crepe/core.py", line 185, in get_activation
    model = build_and_load_model(model_capacity)
  File "/home/kjcole/.local/lib/python3.10/site-packages/crepe/core.py", line 44, in build_and_load_model
    from tensorflow.keras.layers import Input, Reshape, Conv2D, BatchNormalization
ModuleNotFoundError: No module named 'tensorflow'
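
A quick way to see which packages are absent before starting a long run is to ask the import system whether each module can be located. This is a minimal sketch; the module list is illustrative (taken from names that appear in the tracebacks on this page) and is not Polymath's full requirement set.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Illustrative list only; requirements.txt remains the authoritative source.
required = ["tensorflow", "crepe", "librosa"]
print(missing_modules(required))
```

An empty list means every named module was found; any name printed would need to be installed with pip.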

System Requirements

Hello friends, thank you for this library, I found it amazing!

I was impressed with the results on the already separated songs and I'm considering buying a better computer if it makes sense.

I have an old computer running Linux (i5, 8 GB memory), and it's taking ages to process 4 songs.

Will it make a difference if I run it on a more powerful one with a GPU?

How much storage is needed?

Hej there!
I just wanted to try this tool, but failed to install it. The installation always filled my disk until no storage was left.
So I wonder: how much storage might I need if I run the pip installation of the requirements on a fairly fresh Ubuntu machine (actually it's WSL)?
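
The thread does not state how much space the installation needs, but the space currently available can be checked before starting. A minimal sketch with the standard library (`free_gib` is a hypothetical helper):

```python
import shutil


def free_gib(path="."):
    """Free space on the filesystem that holds `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


print("free: %.1f GiB" % free_gib())
```

Comparing the figure before and after `pip install -r requirements.txt` on a machine where it succeeds would give a concrete number for this question.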

syntaxError

When I run Polymath, I get a syntax error:
python polymath.py -a /Users/loris/Desktop/Music/polymath2023/

  File "polymath.py", line 60
    command = f"ffmpeg -hide_banner -loglevel panic -i {file} -ab 160k -ac 2 -ar 44100 -vn -y {vidobj.audio}"
    ^
SyntaxError: invalid syntax
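
The flagged line is an f-string, a syntax introduced in Python 3.6, so this error most likely means `python` resolves to an older interpreter (for example Python 2). The README asks for Python >=3.7 and <=3.10. A minimal check of that range follows; `is_supported` is a hypothetical helper, and it deliberately avoids f-strings so that it also parses on old interpreters.

```python
import sys

# Bounds taken from the README: Python >= 3.7 and <= 3.10.
MINIMUM = (3, 7)
MAXIMUM = (3, 10)


def is_supported(version_info=sys.version_info):
    """True if the interpreter's major.minor is within the README's range."""
    return MINIMUM <= tuple(version_info[:2]) <= MAXIMUM


if not is_supported():
    # str.format instead of an f-string so this parses on old interpreters too
    print("Unsupported Python {}.{}".format(*sys.version_info[:2]))
```

If the check fails, running the tool with an explicit interpreter such as `python3` may be enough.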

Unable to scan MP3s on windows

As for my environment,
I'm running Windows 11, Visual Studio Code, and Python 3.10.7 (no virtual environment); ffmpeg is installed.
This is the output I get when trying to add a folder of MP3s (same issue if I try to add a single mp3):

PS F:\OneDrive\Documents\Github\polymath> python polymath.py -a H:/mp3s/beatport/
2023-03-29 16:14:53.934583: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2023-03-29 16:14:53.934915: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
---------------------------------------------------------------------------- 
--------------------------------- POLYMATH ---------------------------------
----------------------------------------------------------------------------
No Database file found: library/database.p
add video: H:/mp3s/beatport/ to videos: 0
add directory with wav or mp3 files
H:/mp3s/beatport/11801077_Alone In The Dark_(Remastered Original Mix).mp3
H:/mp3s/beatport/12355_Spiral_(Original Mix).mp3
H:/mp3s/beatport/13098550_Just_Be_Good_To_Me_Original_Mix.mp3
H:/mp3s/beatport/13233407_After The Rain_(Extended Club Mix).mp3
H:/mp3s/beatport/13577515_Touch_Me_PAX___Rui_Da_Silva_Extended_Version.mp3
H:/mp3s/beatport/14269418_Children_Extended_Mix.mp3
H:/mp3s/beatport/14278711_Madagascar_Alex_M_O_R_P_H__Extended_Remix.mp3
H:/mp3s/beatport/14324348_Your_Love_feat__Jamie_Principle_Alan_Dixon__Love_Attack__Extended_Remix.mp3
H:/mp3s/beatport/14360239_Easier_Original_Mix.mp3
H:/mp3s/beatport/14409377_Everything_Everything_Cosmic_Gate_Extended_Remix.mp3
H:/mp3s/beatport/14436359_Alive_Extended_Mix.mp3
H:/mp3s/beatport/14900610_Blame_(Extended Mix).mp3
H:/mp3s/beatport/14987083_Tree of Life_(Extended Mix).mp3
H:/mp3s/beatport/15005223_Save You_(Cold Blue Extended Remix).mp3
H:/mp3s/beatport/15009994_Starlab_(Extended Mix).mp3
H:/mp3s/beatport/15030367_Keep Me Believing feat. Megan McDuffee_(Extended Mix).mp3
H:/mp3s/beatport/15062411_Shattered Sun_(Extended Mix).mp3
H:/mp3s/beatport/15129652_Another Song_(Extended Mix).mp3
H:/mp3s/beatport/287724_Always A Fool_(Club Mix).mp3
H:/mp3s/beatport/3099819_Beauty Hides In The Deep_(Joc Extended).mp3
H:/mp3s/beatport/3510866_Flaming June_(Paul Van Dyk Remix).mp3
H:/mp3s/beatport/4709353_El Nino_(Original Mix).mp3
H:/mp3s/beatport/4791254_Home feat. Discovery_(Original Mix).mp3
H:/mp3s/beatport/7327402_Proper_Education__Club_Mix__Club_Mix.mp3
H:/mp3s/beatport/8095473_Amber_(Extended Mix).mp3
H:/mp3s/beatport/88564_Amsterdam_(Original Mix).mp3
H:/mp3s/beatport/Alive_(Extended_Mix).mp3
H:/mp3s/beatport/All_Around_The_World_(La_La_La)_(Extended_Version).mp3
H:/mp3s/beatport/Always_Be_My_Friend_(Extended_Mix).mp3
H:/mp3s/beatport/Another_Song_(Extended_Mix).mp3
H:/mp3s/beatport/A_Sky_Full_of_Stars_(Hardwell_Remix).mp3
H:/mp3s/beatport/Bittersweet_Symphony_(feat._Emily_Roberts)_(Extended_Version).mp3
H:/mp3s/beatport/Blame_(Extended_Mix).mp3
H:/mp3s/beatport/Can't_Hold_My_Tongue_(Original_Mix).mp3
H:/mp3s/beatport/Chain_My_Heart_(Extended_Mix).mp3
H:/mp3s/beatport/Children_(Extended_Mix).mp3
H:/mp3s/beatport/Come_2_Life_(Original_Mix).mp3
H:/mp3s/beatport/Divine_Love_(Original_Mix).mp3
H:/mp3s/beatport/Don't_Worry_(feat._Aloe_Blacc)_(Otsem_Extended_Mix).mp3
H:/mp3s/beatport/Don't_You_Want_Me_(Extended_Mix).mp3
H:/mp3s/beatport/Easier_((Extended_Mix)).mp3
H:/mp3s/beatport/Echoes_(Extended_Mix).mp3
H:/mp3s/beatport/El_Nino_(Original_Mix).mp3
H:/mp3s/beatport/Everything_Everything_(Cosmic_Gate_Extended_Remix).mp3
H:/mp3s/beatport/Feeling_Kinda_Strange_(Extended_Mix).mp3
H:/mp3s/beatport/Fix_you_(BSNO_Remix).mp3
H:/mp3s/beatport/Flaming_June_(Paul_Van_Dyk_Remix).mp3
H:/mp3s/beatport/Ghost_(Hausman_Extended_Remix).mp3
H:/mp3s/beatport/Giant_(Purple_Disco_Machine_Extended_Remix).mp3
H:/mp3s/beatport/Hearts_In_Darkness_(Original_Mix).mp3
H:/mp3s/beatport/Heaven_(David_Guetta_&_MORTEN_Tribute_Remix).mp3
H:/mp3s/beatport/Hide_U_(Tinlicker_Extended_Remix).mp3
H:/mp3s/beatport/Hooked_(Extended).mp3
H:/mp3s/beatport/Impossible_(feat._John_Martin)_(Extended_Mix).mp3
H:/mp3s/beatport/It's_A_Lot_(Extended).mp3
H:/mp3s/beatport/Just_Be_Good_To_Me_(Original_Mix).mp3
H:/mp3s/beatport/Keep_Me_Believing_feat._Megan_McDuffee_(Extended_Mix).mp3
H:/mp3s/beatport/Lose_Your_Head_(CamelPhat_Extended_Remix).mp3
H:/mp3s/beatport/Love_You_Better_feat._Kimberly_Fransens_(Original_Mix).mp3
H:/mp3s/beatport/Madagascar_(Alex_M.O.R.P.H._Extended_Remix).mp3
H:/mp3s/beatport/Move_Your_Body_(Extended_Mix).mp3
H:/mp3s/beatport/New_Vibe_Who_Dis_(feat._Little_League)_(Extended_Mix).mp3
H:/mp3s/beatport/Not_In_Love_(Extended).mp3
H:/mp3s/beatport/Out_Of_Reach_(Club_Mix).mp3
H:/mp3s/beatport/Paradise_(Fedde_Le_Grand_Remix).mp3
H:/mp3s/beatport/Paradise_(with_Sam_Feldt)_(Extended_Mix).mp3
H:/mp3s/beatport/Piece_Of_Your_Heart_(Alok_Extended_Remix).mp3
H:/mp3s/beatport/Pour_The_Milk_(Extended_Mix).mp3
H:/mp3s/beatport/Rasputin_(Extended_Mix).mp3
H:/mp3s/beatport/Rhythm_Of_The_Night_(Original_Extended_Mix).mp3
H:/mp3s/beatport/Save_You_(Cold_Blue_Extended_Remix).mp3
H:/mp3s/beatport/Shattered_Sun_(Extended_Mix).mp3
H:/mp3s/beatport/Shine_(Extended_Mix).mp3
H:/mp3s/beatport/Somebody_To_Love_(LIZOT_Extended_Remix).mp3
H:/mp3s/beatport/Starlab_(Extended_Mix).mp3
H:/mp3s/beatport/Sweet_Dreams_(Extended_Mix).mp3
H:/mp3s/beatport/Tell_Me_Why_(Jolyon_Petch_Remix).mp3
H:/mp3s/beatport/Touch_Me_(PAX_&_Rui_Da_Silva_Extended_Version).mp3
H:/mp3s/beatport/Tree_of_Life_(Extended_Mix).mp3
H:/mp3s/beatport/Turn_Me_On_(Extended).mp3
H:/mp3s/beatport/We_Got_That_Cool_(feat._Afrojack_&_Icona_Pop)_(Extended_Mix).mp3
H:/mp3s/beatport/You're_Not_Alone_(Extended_Mix).mp3
H:/mp3s/beatport/Young_&_Beautiful_(Cedric_Gervais_Remix).mp3
H:/mp3s/beatport/Your_Love_feat._Jamie_Principle_(Alan_Dixon_'Love_Attack'_Remix).mp3
H:/mp3s/beatport/You_Got_The_Love_(Extended_Mix).mp3
Found 85 wav or mp3 files
------ process audio H:/mp3s/beatport/11801077_Alone In The Dark_(Remastered Original Mix).mp3
converting mp3 to wav: H:/mp3s/beatport/11801077_Alone In The Dark_(Remastered Original Mix).mp3
C:\Python310\lib\site-packages\librosa\util\decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead.
  return f(*args, **kwargs)
Traceback (most recent call last):
  File "C:\Python310\lib\site-packages\librosa\core\audio.py", line 164, in load
    y, sr_native = __soundfile_load(path, offset, duration, dtype)
  File "C:\Python310\lib\site-packages\librosa\core\audio.py", line 195, in __soundfile_load
    main()
  File "F:\OneDrive\Documents\Github\polymath\polymath.py", line 611, in main
    videos = audio_directory_process(vids,videos)
  File "F:\OneDrive\Documents\Github\polymath\polymath.py", line 127, in audio_directory_process
    videos = audio_process(filesToProcess, videos)
  File "F:\OneDrive\Documents\Github\polymath\polymath.py", line 154, in audio_process
    y, sr = librosa.load(path=vid, sr=None, mono=False)
  File "C:\Python310\lib\site-packages\librosa\util\decorators.py", line 88, in inner_f
    return f(*args, **kwargs)
  File "C:\Python310\lib\site-packages\librosa\core\audio.py", line 170, in load
    y, sr_native = __audioread_load(path, offset, duration, dtype)
  File "C:\Python310\lib\site-packages\librosa\core\audio.py", line 226, in __audioread_load
    reader = audioread.audio_open(path)
  File "C:\Python310\lib\site-packages\audioread\__init__.py", line 132, in audio_open
    raise NoBackendError()
audioread.exceptions.NoBackendError
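
audioread raises `NoBackendError` when none of its decoding backends is usable. Its ffmpeg backend calls the `ffmpeg` executable, so the executable has to be discoverable on PATH from the same shell that runs Python, not merely installed somewhere. A minimal sketch of that check (`locate_tools` is a hypothetical helper):

```python
import shutil


def locate_tools(names=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for tool, path in locate_tools().items():
    print(tool, "->", path or "not found on PATH")
```

If `ffmpeg` shows as not found, adding its install directory to PATH and reopening the terminal would be the first thing to try.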

numpy problem during installation

During installation I got an error:
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for numpy
Failed to build numpy
ERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects
I tried upgrading pip3 and installing an older version of numpy, but I got the same error.

Mp3 files give a "File contains data in an unknown format" error

I couldn't get it to work. Here's the full error stack. The file is an mp3 with the MPEG Audio Layer 1/2 codec (mpga), at 48 kHz.

Found 232 wav or mp3 files
------ process audio C:/Users/Music/Mantovani\Adios Muchachos.mp3
converting mp3 to wav: C:/Users/Music/Mantovani\Adios Muchachos.mp3
C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\util\decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead.
  return f(*args, **kwargs)
Traceback (most recent call last):
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\core\audio.py", line 164, in load
    y, sr_native = __soundfile_load(path, offset, duration, dtype)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\core\audio.py", line 195, in __soundfile_load
    context = sf.SoundFile(path)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\soundfile.py", line 740, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\soundfile.py", line 1264, in _open
    _error_check(_snd.sf_error(file_ptr),
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\soundfile.py", line 1455, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'C:/Users/Music/Mantovani\Adios Muchachos.mp3': File contains data in an unknown format.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Music\Samples\Polymath\polymath\polymath.py", line 690, in <module>
    main()
  File "C:\Users\Music\Samples\Polymath\polymath\polymath.py", line 571, in main
    videos = audio_directory_process(vids,videos)
  File "C:\Users\Music\Samples\Polymath\polymath\polymath.py", line 122, in audio_directory_process
    videos = audio_process(filesToProcess, videos)
  File "C:\Users\Music\Samples\Polymath\polymath\polymath.py", line 149, in audio_process
    y, sr = librosa.load(path=vid, sr=None, mono=False)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\util\decorators.py", line 88, in inner_f
    return f(*args, **kwargs)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\core\audio.py", line 170, in load
    y, sr_native = __audioread_load(path, offset, duration, dtype)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\librosa\core\audio.py", line 226, in __audioread_load
    reader = audioread.audio_open(path)
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\audioread\__init__.py", line 132, in audio_open
    raise NoBackendError()
audioread.exceptions.NoBackendError
