stepcovnet's People

Contributors

cpuguy96

stepcovnet's Issues

Probabilities do not sum to 1

(Note: I am running this on the previous version, as I cannot use the latest version; see #3.)
When I try to run stepmania_note_generator on a fully retrained model, I get the error below:

Generating notes for breakitoff
-----------------------------------------

0.018300232973530228
Traceback (most recent call last):
  File "C:\dev\StepCOVNet-ef61b36bf2a30bfacb3fdd49a3a613e38d5ca548\stepmania_note_generator.py", line 154, in <module>
    stepmania_note_generator(args.input,
  File "C:\dev\StepCOVNet-ef61b36bf2a30bfacb3fdd49a3a613e38d5ca548\stepmania_note_generator.py", line 126, in stepmania_note_generator
    generate_notes(output_path, tmp_dir, stepcovnet_model, verbose_int)
  File "C:\dev\StepCOVNet-ef61b36bf2a30bfacb3fdd49a3a613e38d5ca548\stepmania_note_generator.py", line 88, in generate_notes
    pred_arrows = inference_executor.execute(input_data=inference_input)
  File "C:\dev\StepCOVNet-ef61b36bf2a30bfacb3fdd49a3a613e38d5ca548\stepcovnet\executor\InferenceExecutor.py", line 43, in execute
    encoded_arrow = np.random.choice(NUM_ARROW_TYPES, 1, p=binary_arrow_prob)[0]
  File "mtrand.pyx", line 939, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities do not sum to 1

As you can see, I added a print(sum(binary_arrow_prob)) in InferenceExecutor.py, and the sum (about 0.018) is nowhere near 1, so I can't just correct it with rounding. I did try that initially just in case, but it results in something like 9,000 notes for a 1-minute song lol.
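For anyone trying to reproduce this outside the project, here is a minimal sketch of the failure and of a plain rescale. The probability values are made up to sum to roughly 0.018, and `sample_index` is a hypothetical helper, not something from StepCOVNet. Rescaling only makes np.random.choice accept the vector; it does not explain why the model's raw outputs are so small, so treat it as a diagnostic rather than a fix.

```python
import numpy as np


def sample_index(probs, rng):
    """Sample one index after rescaling probs so they sum to 1 (hypothetical helper)."""
    probs = np.asarray(probs, dtype=np.float64)
    total = probs.sum()
    if not np.isfinite(total) or total <= 0:
        raise ValueError(f"cannot normalize probabilities with sum {total}")
    return int(rng.choice(len(probs), p=probs / total))


# Made-up values for illustration; they sum to about 0.018, like the printed sum above.
raw = np.array([0.010, 0.005, 0.002, 0.0013])

# The unscaled vector triggers the same ValueError as in the traceback.
try:
    np.random.choice(len(raw), 1, p=raw)
    raised = False
except ValueError:
    raised = True

# After rescaling, sampling succeeds.
rng = np.random.default_rng(0)
idx = sample_index(raw, rng)
```

If the rescaled vector still leads to far too many notes, the problem is more likely in what the model outputs than in the sampling call.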

Any ideas? I spent about 2 days trying to get this working initially, then 2 days running the training, and I'm starting to lose hope of getting this working.

Many Thanks

metadata.json is not distributed with pretrained model

Unfortunately, you can't just use the pretrained model out of the box, since StepCOVNetModel requires the metadata.json file when loading. You can hack some code to get it to load the h5 file directly, but then you get this error, which I'm reading as "uh oh, something went very wrong."

ValueError: No model config found in the file at <tensorflow.python.platform.gfile.GFile object at 0x285687410>.

Currently using TensorFlow 2.13.1 on an M2 MacBook Air.
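A small preflight check along these lines would at least turn the failure into a clear message. This is only a sketch: `missing_model_files` is a hypothetical helper, and the only file name taken from this issue is metadata.json; the assumption that the weights sit in the same directory as an .h5 file is mine.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("metadata.json",)):
    """Return the names in `required` that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]


def has_h5_weights(model_dir):
    """Report whether at least one .h5 file sits directly inside model_dir (assumed layout)."""
    return any(Path(model_dir).glob("*.h5"))
```

Calling `missing_model_files(model_path)` before loading and printing the result would show right away that metadata.json was not shipped.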

Training data missing audio files

Some of the raw training data on the Google Drive is missing the corresponding audio files (e.g. bluhende_nacht.ogg).
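To list every affected song, something like the sketch below could be run over a local copy of the data. It assumes the raw data consists of StepMania .sm charts with the audio stored next to each chart, and it relies on the #MUSIC tag that .sm files use to name their audio file. `find_missing_audio` is a hypothetical helper.

```python
import re
from pathlib import Path

# .sm charts name their audio file in a tag of the form "#MUSIC:song.ogg;".
MUSIC_TAG = re.compile(r"#MUSIC:([^;]*);", re.IGNORECASE)


def find_missing_audio(root):
    """Return (chart_path, audio_name) pairs whose audio file is not next to the chart."""
    missing = []
    for chart in sorted(Path(root).rglob("*.sm")):
        text = chart.read_text(encoding="utf-8", errors="ignore")
        match = MUSIC_TAG.search(text)
        if match is None:
            continue  # chart does not declare an audio file
        audio_name = match.group(1).strip()
        if audio_name and not (chart.parent / audio_name).is_file():
            missing.append((chart, audio_name))
    return missing
```

Printing the second element of each pair gives the list of audio files that would need to be re-uploaded.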

Can't load model

On the latest version, when I try to run stepmania_note_generator.py, I get the following:

PS C:\dev\StepCOVNet-master> python stepmania_note_generator.py -i test_in -o test_out --model dataset\time2_challenge_timing_model -v 1
Loading StepCOVNet retrained model
Failed to retrieve retrained StepCOVNet model. Loading non-retrained model
Traceback (most recent call last):
  File "C:\dev\StepCOVNet-master\stepmania_note_generator.py", line 155, in <module>
    stepmania_note_generator(args.input,
  File "C:\dev\StepCOVNet-master\stepmania_note_generator.py", line 124, in stepmania_note_generator
    stepcovnet_model = StepCOVNetModel.load(input_path=model_path, retrained=False)
  File "C:\dev\StepCOVNet-master\stepcovnet\model\StepCOVNetModel.py", line 40, in load
    model = load_model(model_path, compile=compile_model)
  File "C:\Users\georg\AppData\Roaming\Python\Python39\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\georg\AppData\Roaming\Python\Python39\site-packages\tensorflow\python\util\nest.py", line 573, in assert_same_structure
    raise type(e)("%s\n"
ValueError: The two structures don't have the same nested structure.

First structure: type=tuple str=(({'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name=None)}, None, None, None, None, None, None, None, None, None, None, None, None, False), {})

Second structure: type=tuple str=((TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids'), None, TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask'), None, None, None, None, None, None, None, None, None, None, False), {})

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name=None)}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids')" is not
Entire first structure:
(({'input_ids': .}, ., ., ., ., ., ., ., ., ., ., ., ., .), {})
Entire second structure:
((., ., ., ., ., ., ., ., ., ., ., ., ., .), {})

Using librosa.load to resample audio files

Since we're already using the librosa library for some of our audio processing, we could cut down on a lot of the wav_converter code by using librosa.load.

By default, librosa.load resamples to 22,050 Hz, downmixes to a single channel (mono), and returns floating-point samples in the range -1 to 1.

Since we want to create 16,000 Hz mono WAV files, we can adjust the conversion function to do this using:

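import librosa
import soundfile as sf

# Decode the file, downmix to mono, and resample to 16 kHz in a single call.
input_audio_data, sample_frequency = librosa.load(filename, sr=16000, mono=True)
# Write the result; soundfile picks the WAV format from a .wav output extension.
sf.write(file_output_path, input_audio_data, sample_frequency)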

Thoughts?

Move all training hyperparameters into config file

In general, moving the hyperparameters into a config file that's read by train.py would make them easier to use and easier to see. After the move, the config options should still show up in the training meta file for tracking purposes.
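One possible shape for this, as a rough sketch only: a JSON config loaded into a dataclass, then copied into the training metadata. The field names, defaults, and the helper names `load_config` and `write_training_metadata` are all hypothetical; the real set would mirror whatever train.py currently hard-codes.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrainingConfig:
    # Hypothetical fields and defaults, for illustration only.
    batch_size: int = 32
    epochs: int = 10
    learning_rate: float = 1e-3


def load_config(path):
    """Read hyperparameters from a JSON file, rejecting unknown keys."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    unknown = set(raw) - set(TrainingConfig.__dataclass_fields__)
    if unknown:
        raise ValueError(f"unknown hyperparameters: {sorted(unknown)}")
    return TrainingConfig(**raw)


def write_training_metadata(path, config, extra=None):
    """Store the config alongside other run metadata for later tracking."""
    metadata = dict(extra or {})
    metadata["hyperparameters"] = asdict(config)
    Path(path).write_text(json.dumps(metadata, indent=2), encoding="utf-8")
```

Rejecting unknown keys catches typos in the config file early, and writing the resolved config (including defaults) into the meta file keeps each run reproducible.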
