SVD features processing (issue in ismir2018-revisiting-svd, open)

kyungyunlee commented on June 12, 2024

SVD features processing

from ismir2018-revisiting-svd.

Comments (7)

kyungyunlee commented on June 12, 2024

Hi :)
So the idea behind this singing voice detection system is to determine, for each input segment (1.6 seconds of audio), whether it contains a singing voice. This means that to analyze a longer audio file, we need to divide it into 1.6-second segments. Step 3 is where I do this: I just loop through the time dimension of the mel spectrogram and cut it into 1.6-second segments. At the end, there are 13475 segments of size (80, 115).
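
The segmentation described above can be sketched as a sliding window over the time axis. This is only an illustration, not the repository's code: the function name `cut_segments`, the `hop` parameter, and its default of 1 frame are assumptions. Only the segment shape (80, 115) comes from the comment.

```python
def cut_segments(melspec, seg_len=115, hop=1):
    """Cut a mel spectrogram into fixed-length segments along the time axis.

    melspec is a list of mel-band rows, each row a list of frame values.
    seg_len=115 frames matches the (80, 115) segments mentioned above;
    hop=1 is an assumption, not a value taken from the repository.
    """
    n_frames = len(melspec[0])
    segments = []
    # Slide a window of seg_len frames over the time dimension.
    for start in range(0, n_frames - seg_len + 1, hop):
        segments.append([row[start:start + seg_len] for row in melspec])
    return segments


# Toy input: 80 mel bands x 200 frames of dummy values.
toy = [[float(t) for t in range(200)] for _ in range(80)]
segs = cut_segments(toy)
print(len(segs), len(segs[0]), len(segs[0][0]))
```

With the toy input this yields 86 segments, each with 80 bands and 115 frames. A larger `hop` reduces the overlap between neighbouring segments.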

After running the prediction, you will be able to identify the frames at which the prediction is 0 or 1. From there, you need to convert frames to seconds using a function like frames_to_time.
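
As a rough sketch of that last step: the conversion is the frame index times the hop length divided by the sample rate, which is the same arithmetic a helper such as librosa's frames_to_time performs. The sample rate (22050 Hz), the hop length (315 samples), and the helper names below are assumptions for illustration, so substitute the values from your own feature extraction.

```python
def frame_to_seconds(frame, sr=22050, hop_length=315):
    # sr and hop_length are assumed values; use the ones from your config.
    return frame * hop_length / sr


def voiced_intervals(predictions, sr=22050, hop_length=315):
    """Group consecutive frames predicted as 1 into (start_s, end_s) pairs."""
    intervals = []
    start = None
    for i, p in enumerate(predictions):
        if p == 1 and start is None:
            start = i  # a vocal region begins at frame i
        elif p != 1 and start is not None:
            # the vocal region ended just before frame i
            intervals.append((frame_to_seconds(start, sr, hop_length),
                              frame_to_seconds(i, sr, hop_length)))
            start = None
    if start is not None:
        # the last vocal region runs to the end of the predictions
        intervals.append((frame_to_seconds(start, sr, hop_length),
                          frame_to_seconds(len(predictions), sr, hop_length)))
    return intervals


preds = [0, 0, 1, 1, 1, 0, 1]
print(voiced_intervals(preds))
```

The example assumes one prediction per frame position. If the segments were cut with a larger hop, multiply the prediction index by that hop before converting.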

I hope this helps. Let me know if there are more questions!

kyungyunlee commented on June 12, 2024

@simonefrancia Sure, but I think 100 ms is way too short to determine whether the input contains a singing voice. The main characteristic to detect is vibrato, and 100 ms doesn't seem long enough to capture it. Typically the input is around 1 second, which makes sense from a human's perspective as well. Feel free to try :)

simonefrancia commented on June 12, 2024

Hi @kyungyunlee, thanks for your response.
So one label is assigned for every 1.6 seconds?
What I don't understand is why, for an audio file with a duration of 194 seconds (the duration of the example above), the preprocessing gives me 13475 segments of 1.6 seconds each.
Thanks

kyungyunlee commented on June 12, 2024

@simonefrancia Hi, yes, it's a single binary label per 1.6-second segment, but the segments overlap during training, so there will be more than 194/1.6 segments, for instance.
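
The effect of overlap on the segment count can be worked through as follows. The frame rate of 70 frames per second and the hop of one frame between segment starts are assumptions chosen for illustration (the thread does not state them), so the result only approximates the 13475 segments reported in the question.

```python
def count_segments(n_frames, seg_len=115, hop=1):
    """Number of full segments a sliding window of seg_len frames yields."""
    if n_frames < seg_len:
        return 0
    return (n_frames - seg_len) // hop + 1


fps = 70                # assumed frame rate, not stated in the thread
n_frames = 194 * fps    # 194 s of audio -> 13580 frames

no_overlap = count_segments(n_frames, hop=115)   # segments laid end to end
max_overlap = count_segments(n_frames, hop=1)    # window moved one frame at a time
print(no_overlap, max_overlap)
```

Under these assumptions the non-overlapping count is 118, close to 194/1.6, while moving the window one frame at a time gives 13466, which is on the same order as the figure in the question.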

simonefrancia commented on June 12, 2024

OK. But is it possible to be more precise, for example if I want a prediction every 100 ms?
Is it possible to do this just by changing some of the training config?
Thank you

simonefrancia commented on June 12, 2024

@kyungyunlee in your opinion, what is the shortest segment duration we can use for training?
Thanks

kyungyunlee commented on June 12, 2024

I am not sure, since I haven’t tried using shorter input. I think at this point you have to define it in terms of your task goal.
