cpuguy96 / stepcovnet

Deep Learning to Create StepMania SM Files

License: Apache License 2.0

Python 100.00%

deep-learning neural-network cnn stepmania machine-learning lstm games timeseries python transformer

stepcovnet's Issues

Set dependency versions in requirements.txt

Probabilities do not sum to 1

(Note: I am running this on the previous version because I cannot use the latest version, as per #3.)
When I try to run stepmania_note_generator on a fully retrained model, I get the error below:

Generating notes for breakitoff
-----------------------------------------

0.018300232973530228
Traceback (most recent call last):
  File "C:\dev\StepCOVNet-ef61b36bf2a30bfacb3fdd49a3a613e38d5ca548\stepmania_note_generator.py", line 154, in <module>
    stepmania_note_generator(args.input,
  File "C:\dev\StepCOVNet-ef61b36bf2a30bfacb3fdd49a3a613e38d5ca548\stepmania_note_generator.py", line 126, in stepmania_note_generator
    generate_notes(output_path, tmp_dir, stepcovnet_model, verbose_int)
  File "C:\dev\StepCOVNet-ef61b36bf2a30bfacb3fdd49a3a613e38d5ca548\stepmania_note_generator.py", line 88, in generate_notes
    pred_arrows = inference_executor.execute(input_data=inference_input)
  File "C:\dev\StepCOVNet-ef61b36bf2a30bfacb3fdd49a3a613e38d5ca548\stepcovnet\executor\InferenceExecutor.py", line 43, in execute
    encoded_arrow = np.random.choice(NUM_ARROW_TYPES, 1, p=binary_arrow_prob)[0]
  File "mtrand.pyx", line 939, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities do not sum to 1

As you can see, I added a print(sum(binary_arrow_prob)) in InferenceExecutor.py, and the sum (about 0.018) is not even close to 1, so I can't just correct it with rounding. I did try that initially, just in case, but it results in about 9000 notes for a 1-minute song lol.

Any ideas? I spent about 2 days trying to get this working initially, then 2 days running the training, and I'm starting to lose hope of getting this working.

Many Thanks
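
For anyone hitting the same ValueError, here is a minimal diagnostic sketch. It is not the project's code: sample_arrow is a hypothetical helper, and it assumes the model output is a 1-D array with one non-negative score per arrow class. It renormalizes the scores before sampling and warns when the sum is far from 1. A sum as low as 0.018 suggests the values may not be a softmax distribution in the first place, so renormalizing only makes np.random.choice-style sampling run; it does not repair the model output.

```python
import warnings

import numpy as np


def sample_arrow(probs, rng=None, tol=1e-3):
    """Sample a class index from scores that may not sum to exactly 1.

    Hypothetical helper for illustration only.
    """
    probs = np.asarray(probs, dtype=np.float64)
    total = probs.sum()
    # Reject inputs that cannot be turned into a distribution at all.
    if not np.isfinite(total) or total <= 0 or np.any(probs < 0):
        raise ValueError(f"cannot normalize scores with sum {total!r}")
    # A large deviation hints at a modeling problem, not a rounding problem.
    if abs(total - 1.0) > tol:
        warnings.warn(f"scores sum to {total:.6f}, renormalizing before sampling")
    if rng is None:
        rng = np.random.default_rng()
    # Dividing by the total makes the weights sum to 1 within float tolerance.
    return int(rng.choice(len(probs), p=probs / total))
```

If the warning fires on every frame, the output layer or the loaded weights are the more likely place to look than the sampling step.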

Create metadata database for training data

This could be a Google Sheets spreadsheet that lists each song's name and its metadata.

Migrate to Keras 3.0.0

Keras 3.0.0 was recently released: https://github.com/keras-team/keras/releases/tag/v3.0.0

Add E2E unit tests for all modules

Upgrade to TensorFlow 2.15.0

Add TODO to issue action

metadata.json is not distributed with pretrained model

Unfortunately, you can't use the pretrained model out of the box, since StepCOVNetModel requires the metadata.json file when loading. You can hack some code to load the h5 file directly, but then you get this error, which I'm reading as "uh oh, something went very wrong."

ValueError: No model config found in the file at <tensorflow.python.platform.gfile.GFile object at 0x285687410>.

Currently using TensorFlow 2.13.1 on an M2 MacBook Air.
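
One way to narrow this down is to check what the h5 file actually contains. Keras raises "No model config found" when the HDF5 file has no model_config root attribute, which is typically the case for a weights-only file. The sketch below is an illustration, not part of StepCOVNet: describe_keras_h5 and inspect_h5 are hypothetical names, and inspect_h5 assumes the h5py package is installed.

```python
def describe_keras_h5(attr_names):
    """Classify a legacy Keras HDF5 file from the names of its root attributes."""
    names = set(attr_names)
    if "model_config" in names:
        # Architecture is stored, so load_model should be able to rebuild it.
        return "full model"
    if "layer_names" in names:
        # Only weights are stored; the architecture must be built in code first.
        return "weights only"
    return "unknown"


def inspect_h5(path):
    """Open an HDF5 file read-only and classify it by its root attributes."""
    import h5py  # imported lazily so the classifier above has no dependencies

    with h5py.File(path, "r") as f:
        return describe_keras_h5(f.attrs.keys())
```

If inspect_h5 reports "weights only", the file can only be used by rebuilding the same architecture and loading the weights into it, which is presumably what metadata.json is needed for.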

Update train.py typing to Python 3.10+

Training data missing audio files

Some of the raw training data in the Google Drive folder is missing its corresponding audio files (e.g. bluhende_nacht.ogg).

Link to training data on README is 404ing

https://drive.google.com/open?id=1eCRYSf2qnbsSOzC-KmxPWcSbMzi1fLHi

Update codebase typing to Python 3.10+

Can't load model

On the latest version, when I try to run stepmania_note_generator.py, I get the following:

PS C:\dev\StepCOVNet-master> python stepmania_note_generator.py -i test_in -o test_out --model dataset\time2_challenge_timing_model -v 1
Loading StepCOVNet retrained model
Failed to retrieve retrained StepCOVNet model. Loading non-retrained model
Traceback (most recent call last):
  File "C:\dev\StepCOVNet-master\stepmania_note_generator.py", line 155, in <module>
    stepmania_note_generator(args.input,
  File "C:\dev\StepCOVNet-master\stepmania_note_generator.py", line 124, in stepmania_note_generator
    stepcovnet_model = StepCOVNetModel.load(input_path=model_path, retrained=False)
  File "C:\dev\StepCOVNet-master\stepcovnet\model\StepCOVNetModel.py", line 40, in load
    model = load_model(model_path, compile=compile_model)
  File "C:\Users\georg\AppData\Roaming\Python\Python39\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\georg\AppData\Roaming\Python\Python39\site-packages\tensorflow\python\util\nest.py", line 573, in assert_same_structure
    raise type(e)("%s\n"
ValueError: The two structures don't have the same nested structure.

First structure: type=tuple str=(({'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name=None)}, None, None, None, None, None, None, None, None, None, None, None, None, False), {})

Second structure: type=tuple str=((TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids'), None, TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask'), None, None, None, None, None, None, None, None, None, None, False), {})

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name=None)}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids')" is not
Entire first structure:
(({'input_ids': .}, ., ., ., ., ., ., ., ., ., ., ., ., .), {})
Entire second structure:
((., ., ., ., ., ., ., ., ., ., ., ., ., .), {})

Using librosa.load to resample audio files

Since we're already using the librosa library for some of our audio processing, we could cut down on a lot of the wav_converter code by using librosa.load.

By default, librosa.load resamples to 22,050 Hz, downmixes to a single channel (mono), and returns floating-point samples scaled to the range -1 to 1.

Since we want to create 16,000 Hz mono wav files, we can adjust the conversion function to do this using:

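import librosa
import soundfile as sf

# filename and file_output_path are supplied by the caller.
# Load as mono (the default) and resample to 16 kHz.
input_audio_data, sample_frequency = librosa.load(filename, sr=16000, mono=True)
# Write the resampled samples; the format follows the output file extension.
sf.write(file_output_path, input_audio_data, sample_frequency)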

Thoughts?

Move all training hyperparameters into config file

In general, moving the hyperparameters into a config file that's read by train.py would improve ease of use and visibility. After the move, the config options should still show up in the training meta file for tracking purposes.
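
A rough sketch of what this could look like, assuming a JSON config file. The default values, key names, and both function names are hypothetical and not taken from train.py. Overrides from the file are merged over defaults, and the resolved values are written into the training metadata so each run stays traceable.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real hyperparameters live in train.py today.
DEFAULTS = {"batch_size": 32, "epochs": 10, "learning_rate": 1e-3}


def load_hyperparameters(config_path):
    """Read overrides from a JSON file and merge them over the defaults."""
    overrides = json.loads(Path(config_path).read_text())
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Fail early on typos instead of silently ignoring a setting.
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def write_training_metadata(meta_path, hyperparameters, extra=None):
    """Record the resolved hyperparameters alongside any other run metadata."""
    metadata = {"hyperparameters": hyperparameters, **(extra or {})}
    Path(meta_path).write_text(json.dumps(metadata, indent=2, sort_keys=True))
```

Writing the merged values rather than only the overrides means the meta file still records settings that were left at their defaults.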

Trained model download?

Is there a download available of a fully trained model?

Modernize audio preprocessing to use TF I/O

https://www.tensorflow.org/io/tutorials/audio