Comments (16)
v2 should still work - it sounds like Stefan is confirming that it works modulo PyTorch versions.
v2.1 currently doesn't work because some flags were changed and removed by default.
As part of our upcoming v3 release, we are currently retraining models and will release updated checkpoints.
from s4.
Hi Stefan,
The model changed recently and we are planning to revisit it next week to make sure the Sashimi code is working.
If you look at the changelog and check out the commit where Sashimi was first released (v2), the code should work.
from s4.
Hi Albert,
Okay perfect, thank you very much for the insanely quick response! I'll try it out with the state from the v2 tag and report back if I encounter any other issues.
Other than that, maybe one more quick question for now: Does it make sense for me to get it working with v2 or would you suggest I rather wait a week or two until you're mostly done with your current iterations?
from s4.
It probably depends on what you want to use it for. What I expect is that the training code (generation.py) will change minimally, so if you're writing code to use the model as a black box you shouldn't need to change it much between versions. However, if you're training large-scale models and want to save concrete models, then the models will change between versions and make it harder to load. Realistically, it will probably be about 2 weeks before we can finalize the updated model.
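One way to see why a saved model becomes harder to load after the model code changes is to compare the parameter names stored in the checkpoint with the names the current model expects. The following is only a minimal sketch under stated assumptions: the helper diff_parameter_names and all parameter names are hypothetical and are not taken from the s4 codebase.

```python
from typing import Dict, Iterable, List


def diff_parameter_names(
    checkpoint_keys: Iterable[str], model_keys: Iterable[str]
) -> Dict[str, List[str]]:
    """Report which parameter names differ between a checkpoint and a model."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        # Names the model expects but the checkpoint does not provide.
        "missing_in_checkpoint": sorted(model - ckpt),
        # Names the checkpoint provides but the model no longer has.
        "unexpected_in_checkpoint": sorted(ckpt - model),
    }


# Hypothetical parameter names, for illustration only.
old_checkpoint = [
    "layers.0.kernel.log_dt",
    "layers.0.kernel.C",
    "layers.0.output_linear.weight",
]
new_model = [
    "layers.0.layer.kernel.log_dt",
    "layers.0.layer.kernel.C",
    "layers.0.output_linear.weight",
]

report = diff_parameter_names(old_checkpoint, new_model)
for category, names in report.items():
    print(category, names)
```

With a real PyTorch model, the same two lists can usually be obtained from the keys of the saved state dict and of model.state_dict().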
from s4.
Hi there, I quickly wanted to share that I ran into similar issues when simply trying to generate samples following the instructions, without any changes:
python -m sashimi.generation --model sashimi --dataset youtubemix --n_samples 32 --sample_len 16000
throws
hydra.errors.ConfigCompositionException: Could not override 'model.layer.hurwitz'.
To append to your config use +model.layer.hurwitz=false
I haven't yet tried the steps @stefan-baumann suggested to fix this error though.
from s4.
@albertfgu any updates on this? I trained a model on the most recent changes on GitHub and still get the same error when trying to generate.
from s4.
I would also be happy to implement the fix myself if you can give me some systematic hints on what has been changed that causes this error.
from s4.
The model should work with v2 of the codebase, which is the official Sashimi release. Can you describe your setup and paste the command you ran and the error it gave?
from s4.
Probably still the same issues I described in my initial post. There have been no changes to relevant parts of the code afaik. I can confirm that I got v2 to work though.
from s4.
@stefan-baumann Good to hear that you could make it work! Can you tell me the hash of the commit that works for you?
from s4.
I trained a model from scratch on commit 6cbc09a on my own dataset using:
python3.8 -m train wandb=null experiment=sashimi-youtubemix dataset=youtubemix trainer.gpus=4 model.n_layers=4 loader.num_workers=2
For generation later I use:
python3.8 -m sashimi.generation --model sashimi --dataset youtubemix --n_samples 2 --sample_len 16000 --checkpoint_path $MODEL
from s4.
Can you tell me the hash of the commit that works for you?
I took the v2 tag (74d2706) and backported some of the later commits (especially the kernel stuff) @davidmrau
Iirc only 83a9f13 was actually needed.
from s4.
I am also able to load the model and generate using 83a9f13. I still have trouble loading the model trained with the current codebase (6cbc09a), so I guess it's easiest to train again using 83a9f13.
from s4.
You should be able to load models trained with the current codebase by modifying the generation script with the appropriate flags, for example just removing the hurwitz flag.
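As a rough illustration of what removing a stale flag amounts to, the sketch below filters a list of Hydra-style key=value overrides against a nested config and drops those whose key no longer exists, which is the situation behind the "Could not override 'model.layer.hurwitz'" error. It is a minimal sketch, not the actual generation script: the helper names, the example config, and the bidirectional key are assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple


def key_exists(config: Dict[str, Any], dotted_key: str) -> bool:
    """Return True if a dotted key such as 'model.layer.hurwitz' is in a nested dict."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def split_overrides(
    config: Dict[str, Any], overrides: List[str]
) -> Tuple[List[str], List[str]]:
    """Split overrides into those that can be applied and those that are stale."""
    kept, dropped = [], []
    for item in overrides:
        # Overrides starting with '+' or '~' add or delete keys, so leave them alone.
        if item.startswith(("+", "~")):
            kept.append(item)
            continue
        key = item.split("=", 1)[0]
        (kept if key_exists(config, key) else dropped).append(item)
    return kept, dropped


# Hypothetical config in which the hurwitz option no longer exists.
config = {"model": {"n_layers": 4, "layer": {"bidirectional": False}}}
overrides = ["model.n_layers=4", "model.layer.hurwitz=true"]

kept, dropped = split_overrides(config, overrides)
print("kept:", kept)
print("dropped:", dropped)
```

Dropping an override silently can hide a real configuration difference, so it is worth printing the dropped list and checking it by hand before generating.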
from s4.
I'll give it a try.
from s4.
Sorry for how long it took to get this all out. The current version of the codebase should have
- working configs for training all the Sashimi models
- updated checkpoints with the latest models
- improved generation script that works with the released checkpoints, as well as any new experiments you run
I tested as many things as I could, but please file a new issue for any problems that arise.
from s4.