Any chance you have PyTorch model files saved, in addition to the Keras models present

PyTorch models about openl3 HOT 14 CLOSED

sainathadapa commented on May 26, 2024

PyTorch models

from openl3.

Comments (14)

sainathadapa commented on May 26, 2024 5

Fortunately, MMdnn (https://github.com/microsoft/MMdnn) worked perfectly for my needs. You can also use that. If you want to see the commands I used to port the model, look here: https://github.com/sainathadapa/urban-sound-tagging/tree/after_challenge/nbs/openl3

auroracramer commented on May 26, 2024 2

Hi! We haven't ported the model to PyTorch; feel free to do so!

turian commented on May 26, 2024 1

Hi @adrienchaton, we have been working on a PyTorch port. It was a little more fiddly than we expected but seems to be relatively stable now. We are prepping it for more general release: https://github.com/turian/torchopenl3

Please feel free to email me, [email protected]. I'm a huge fan of your work, and Philippe is on the committee for a NeurIPS audio representation competition I am proposing. I'd love to talk more.

sainathadapa commented on May 26, 2024

I'm going to try porting the model to PyTorch. Before I work on that, can you tell me if you already have a PyTorch version of the model right now? This is so that I do not waste effort in doing something that is already done.

janaal1 commented on May 26, 2024

Where can the PyTorch models be found?
Thanks in advance!

adrienchaton commented on May 26, 2024

@sainathadapa thank you for sharing your code to convert openl3 to PyTorch.

I see you took the mel128 embedding. Are the PyTorch weights available anywhere, please?

turian commented on May 26, 2024

The main issue was that our STFTs differ from Kapre 0.1.3's in the high frequencies. This means that on the chirp audio, our MAE versus tfopenl3 was maybe 2e-3 when using mels (I'd have to double check). On 100 random sounds from FSD50K, it was far lower.
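For anyone who wants to run this kind of check themselves, here is a minimal sketch of the comparison described above: generate a chirp and compute the mean absolute error (MAE) between two embedding arrays. The names `linear_chirp` and `mean_absolute_error`, the sample rate, and the array shapes are all assumptions for illustration. The two arrays are random placeholders standing in for the outputs of the TF and PyTorch models, neither of which is called here.

```python
import numpy as np


def linear_chirp(sr=48000, duration=1.0, f0=20.0, f1=20000.0):
    """Linear sine sweep from f0 to f1 Hz (hypothetical test signal)."""
    t = np.arange(int(sr * duration)) / sr
    # Phase of a linear chirp: 2*pi*(f0*t + (f1 - f0)*t^2 / (2*duration))
    phase = 2.0 * np.pi * (f0 * t + 0.5 * (f1 - f0) * t ** 2 / duration)
    return np.sin(phase).astype(np.float32)


def mean_absolute_error(a, b):
    """MAE between two arrays of identical shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    return float(np.mean(np.abs(a - b)))


audio = linear_chirp()

# Placeholders: in a real check these would be the embeddings that each
# model returns for `audio`. Here the "port" is the reference plus an offset.
rng = np.random.default_rng(0)
emb_reference = rng.standard_normal((10, 512))
emb_port = emb_reference + 1e-3

print(mean_absolute_error(emb_reference, emb_port))
```

Running the same function on both a chirp and a set of natural recordings is one way to see whether the error is specific to high-frequency content.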

adrienchaton commented on May 26, 2024

Great, thank you for sharing your port and your awesome research too!
Right now I would use it for an art project, so it does not need to be reproduced perfectly.
I am following up by email.

turian commented on May 26, 2024

@adrienchaton great, looking forward to talking. If you hit any issues, questions, or snags, please file an issue on GitHub.

justinsalamon commented on May 26, 2024

this is awesome, thanks for putting this together!

Heads up - we're working on an update to openl3 that will include:

- Support for TF 2.x (using an updated version of Kapre)
- Support for using Librosa instead of Kapre as the audio front-end

@turian if you think it makes sense it would be awesome to merge torchopenl3 into openl3 eventually, such that a single library provides support for both TF and PyTorch backends.

turian commented on May 26, 2024

@justinsalamon thank you, I wanted to reach out and make sure this is all copacetic before making any public move. Happy to integrate. TBH, getting the MAE low against the old version of Kapre was quite gnarly, and we had to reimplement a lot of the mel code ourselves. (We still get high error on high-frequency audio like the chirp.)

BTW my email inbox is open. [email protected]

I have talked with Zeyu, who is on the committee for the accepted NeurIPS 2021 competition I'm organizing on learning general-purpose audio representations. If possible, I'd love to confirm that your model and weights could potentially be included as pretraining for the dev-kit. Let me know if you'd like to sync over email or chat on Zoom for 30 minutes.

turian commented on May 26, 2024

@justinsalamon my one request would be that if NumPy-based librosa is used, we make sure to find a compatible GPU spectrogram / mel implementation. Matching the Kapre spectrograms was quite hellacious. I'd want to sanity check torchlibrosa, etc.

I think this is of interest to people who are synthesizing audio on the GPU.
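A sanity check of that kind could look roughly like the sketch below, which measures the error per frequency bin and reports how much of it sits in the upper half of the bins. The function names are hypothetical, and the two spectrograms are synthetic NumPy stand-ins for a reference mel spectrogram and a candidate GPU one. No librosa, Kapre, or torchlibrosa call is made here.

```python
import numpy as np


def per_bin_mae(spec_a, spec_b):
    """MAE per frequency bin for two (n_bins, n_frames) spectrograms."""
    spec_a = np.asarray(spec_a, dtype=np.float64)
    spec_b = np.asarray(spec_b, dtype=np.float64)
    if spec_a.shape != spec_b.shape:
        raise ValueError(f"shape mismatch: {spec_a.shape} vs {spec_b.shape}")
    return np.mean(np.abs(spec_a - spec_b), axis=1)


def high_band_error_share(spec_a, spec_b):
    """Fraction of the summed per-bin error that falls in the upper half of the bins."""
    err = per_bin_mae(spec_a, spec_b)
    total = err.sum()
    if total == 0.0:
        return 0.0
    return float(err[len(err) // 2:].sum() / total)


# Synthetic stand-ins: 128 bins by 50 frames, with a mismatch injected
# into the top 32 bins only.
rng = np.random.default_rng(0)
reference = rng.random((128, 50))
candidate = reference.copy()
candidate[96:] += 0.01

share = high_band_error_share(reference, candidate)
print(share)
```

A share close to 1 would indicate that the two implementations disagree mostly at high frequencies, which is the pattern reported for the chirp earlier in this thread.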

justinsalamon commented on May 26, 2024

What we've done is the following:

- Update Kapre (let's call it v2) and try to match the old version (v1) as closely as possible
- Implement a librosa front-end and try to match it as closely as possible to v1 as well
- Compare the embeddings
- Compare performance on downstream classification of the UrbanSound8K dataset

As you might expect, the embeddings don't match perfectly when we replace the audio front-end. However, performance on the downstream classification task was the same (or within the margin of error), which we hope is good enough.

So, the updated version of OpenL3 will let you choose between the Kapre and Librosa front-ends, but they are not interchangeable. Models trained on embeddings from a specific front-end should continue to use the same front-end for inference. The same would apply if we incorporated a PyTorch version: it would be close but probably not interchangeable with the TF versions.
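One way to enforce this in downstream code is to store the embedding pipeline settings alongside the trained classifier and compare them at inference time. The sketch below is only an illustration: `EmbeddingConfig` and `check_compatible` are hypothetical names, not part of OpenL3, and the field values are example strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical record of how a set of embeddings was produced."""
    backend: str     # e.g. "tf" or "pytorch"
    frontend: str    # e.g. "kapre" or "librosa"
    input_repr: str  # e.g. "mel128"


def check_compatible(train_cfg, infer_cfg):
    """Raise if inference embeddings come from a different pipeline than training."""
    if train_cfg != infer_cfg:
        raise ValueError(
            f"embedding pipeline mismatch: trained with {train_cfg}, "
            f"inference uses {infer_cfg}"
        )


train_cfg = EmbeddingConfig(backend="tf", frontend="kapre", input_repr="mel128")

# Same pipeline: passes silently.
check_compatible(train_cfg, EmbeddingConfig("tf", "kapre", "mel128"))

# Different front-end: rejected, even though the embeddings may be numerically close.
try:
    check_compatible(train_cfg, EmbeddingConfig("tf", "librosa", "mel128"))
except ValueError as exc:
    print(exc)
```

The check is deliberately strict equality, since the point above is that close embeddings are still not guaranteed to be interchangeable for a trained model.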

Yes, I owe you an email. Coming soon.

justinsalamon commented on May 26, 2024

P.S. @turian happy to find a time for a quick chat if that would be helpful.
