Comments (14)
Fortunately, MMdnn (https://github.com/microsoft/MMdnn) worked perfectly for my needs. You can also use that. If you want to see the commands I used to port the model, look here: https://github.com/sainathadapa/urban-sound-tagging/tree/after_challenge/nbs/openl3
from openl3.
Hi! We haven't ported the model to PyTorch; feel free to do so!
Hi @adrienchaton, we have been working on a PyTorch port. It was a little more fiddly than we expected but seems to be relatively stable now. We are prepping it for a more general release: https://github.com/turian/torchopenl3
Please feel free to email me, [email protected]. I'm a huge fan of your work, and Philippe is on the committee for a NeurIPS audio representation competition I am proposing. I'd love to talk more.
I'm going to try porting the model to PyTorch. Before I start, can you tell me whether you already have a PyTorch version of the model? I'd like to avoid duplicating work that is already done.
Where can the PyTorch models be found?
Thanks in advance!
@sainathadapa thank you for sharing your code to convert openl3 to PyTorch.
I see you took the mel128 embedding. Are the PyTorch weights available anywhere, please?
The main issue was that our STFTs differ from Kapre 0.1.3's in the high frequencies. This means that on the chirp audio, our MAE was maybe 2e-3 versus tfopenl3 when using mels (I'd have to double check). On 100 random sounds from FSD50K, it was far lower.
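For anyone who wants to run a similar check, here is a minimal sketch of the comparison described above: generate a linear chirp, then compute the mean absolute error (MAE) between two embedding matrices. The helper names (`linear_chirp`, `mean_absolute_error`), the sample rate, and the toy embedding values are assumptions for illustration only; none of them come from torchopenl3 or openl3, and the step that actually computes the embeddings with each library is left out.

```python
import math


def linear_chirp(sample_rate, duration_s, f_start, f_end):
    """Return a linear sine sweep from f_start to f_end (Hz) as a list of floats."""
    n_samples = int(sample_rate * duration_s)
    sweep_rate = (f_end - f_start) / duration_s  # Hz per second
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        # Phase is the integral of the instantaneous frequency.
        phase = 2.0 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
    return samples


def mean_absolute_error(emb_a, emb_b):
    """MAE between two embedding matrices given as lists of equal-length rows."""
    if len(emb_a) != len(emb_b):
        raise ValueError("different number of frames")
    total, count = 0.0, 0
    for row_a, row_b in zip(emb_a, emb_b):
        if len(row_a) != len(row_b):
            raise ValueError("different embedding size")
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    if count == 0:
        raise ValueError("empty embeddings")
    return total / count


# One second of audio sweeping up to the Nyquist frequency, so the high
# frequencies are exercised. The 48 kHz rate is an arbitrary choice here.
audio = linear_chirp(sample_rate=48000, duration_s=1.0, f_start=20.0, f_end=24000.0)

# Toy stand-ins for the two embeddings (hypothetical values, 2 frames x 3 dims).
# In a real check these would come from the TF model and the PyTorch port.
emb_reference = [[0.10, 0.20, 0.30], [0.40, 0.50, 0.60]]
emb_port = [[0.10, 0.21, 0.30], [0.40, 0.50, 0.62]]
print(mean_absolute_error(emb_reference, emb_port))
```

Running the same comparison on both a chirp and a set of natural sounds would show whether any mismatch is concentrated in the high-frequency content.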
Great, thank you for sharing your port and your awesome research too!
Right now I would use it for an art project, so it does not need to be reproduced perfectly.
I am following up by email.
@adrienchaton great, looking forward to talking. If you hit any issues, questions, snags, etc., file an issue on GitHub.
this is awesome, thanks for putting this together!
Heads up - we're working on an update to openl3 that will include:
- Support for TF 2.x (using an updated version of kapre)
- Support for using Librosa instead of Kapre as the audio front-end
@turian if you think it makes sense, it would be awesome to merge torchopenl3 into openl3 eventually, such that a single library provides support for both TF and PyTorch backends.
@justinsalamon thank you. I wanted to reach out and make sure this is all copacetic before making any public move. Happy to integrate. TBH, getting the MAE low against the old Kapre version was quite gnarly, and we had to reimplement a lot of the mel stuff ourselves. (We still get high error on high-frequency content like the chirp.)
BTW my email inbox is open. [email protected]
I have talked with Zeyu, who is on the committee for the accepted NeurIPS 2021 competition I'm organizing on learning general-purpose audio representations. If possible, I'd love to confirm that your model and weights could potentially be included as pretraining for the dev-kit. Let me know if you'd like to sync over email or chat on Zoom for 30 minutes.
@justinsalamon my one request would be that if NumPy-based librosa is used, we make sure to find a compatible GPU spectrogram / mel implementation. Matching the Kapre spectrograms was quite hellacious. I'd want to sanity check torchlibrosa etc.
I think this is of interest to people who are synthesizing audio on the GPU.
What we've done is the following:
- Update Kapre (let's call it v2) and try to match the old version (v1) as closely as possible
- Implement a librosa front-end and try to match it as closely as possible to v1 as well
- Compare the embeddings
- Compare performance on downstream classification of the UrbanSound8K dataset
As you might expect, the embeddings don't match perfectly when we replace the audio front-end. However, performance on the downstream classification task was the same (or within the margin of error), which we hope is good enough.
So, the updated version of OpenL3 will let you choose between the Kapre and Librosa front-ends, but they are not interchangeable. Models trained on embeddings from a specific front-end should continue to use the same front-end for inference. The same would apply if we incorporated a PyTorch version: it would be close but probably not interchangeable with the TF versions.
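One way to enforce the "same front-end for training and inference" rule in downstream code is to store the front-end name next to the trained model and check it before inference. The sketch below is only an illustration of that idea: `EmbeddingSpec`, `check_compatible`, and the field names are hypothetical and are not part of openl3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSpec:
    """Hypothetical record of how a set of embeddings was produced."""
    backend: str   # e.g. "tensorflow" or "pytorch"
    frontend: str  # e.g. "kapre" or "librosa"


def check_compatible(trained_with, inferring_with):
    """Raise if inference embeddings come from a different pipeline than training."""
    if trained_with != inferring_with:
        raise ValueError(
            f"Downstream model was trained on {trained_with} embeddings "
            f"but is being fed {inferring_with} embeddings."
        )


# Spec saved alongside the downstream classifier at training time.
train_spec = EmbeddingSpec(backend="tensorflow", frontend="kapre")

# Same pipeline: passes silently.
check_compatible(train_spec, EmbeddingSpec(backend="tensorflow", frontend="kapre"))

# Different front-end: rejected, even though the embeddings would be close.
try:
    check_compatible(train_spec, EmbeddingSpec(backend="tensorflow", frontend="librosa"))
except ValueError as err:
    print(err)
```

Treating the backend as part of the spec covers the PyTorch case mentioned above in the same way.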
Yes, I owe you an email. Coming soon.
P.S. @turian happy to find a time for a quick chat if that would be helpful.