music-genre-classification's Introduction

Music genre classification using Deep Convolutional Network

This repository provides a basic approach for predicting the music genre from WAV files. This is done using a deep convolutional network trained on the well-known GTZAN dataset.

A Flask application and a minimal Dash web application run a simple prediction test on jazz, reggae and metal tracks. The prediction is made in real time while the music is playing.

References:

How to predict the music genre ?

Dataset

The dataset consists of 1000 audio tracks, each 30 seconds long. It contains 10 genres, each represented by 100 tracks. All tracks are 22050 Hz mono 16-bit audio files in .wav format.

The genres are:

  • blues
  • classical
  • country
  • disco
  • hiphop
  • jazz
  • metal
  • pop
  • reggae
  • rock

This dataset is the most widely used for benchmarking, but it is known to have many issues (sound quality, repetitions, mislabelling, etc.), see here. Despite this, it is a good starting point for testing deep learning techniques. See here for an extensive list of Music Information Retrieval datasets.

Overall approach

First, keep in mind that sound can be represented as an image, thanks to signal processing techniques such as the well-known Short-Time Fourier Transform. A natural way to learn from music is therefore to train a CNN on the spectrogram images derived from the tracks.
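To make the "sound as image" idea concrete, here is a minimal sketch of a magnitude spectrogram computed with a naive Short-Time Fourier Transform, using only the Python standard library. The function names, the frame size of 64 and the hop of 32 are assumptions chosen for illustration and are not taken from this repository; real pipelines would normally rely on an optimized routine such as `scipy.signal.stft` or `librosa.stft`.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, frame_size=64, hop=32):
    """Return a list of frames, each holding frame_size // 2 + 1 magnitudes.

    Naive O(N^2) DFT per frame: fine for a toy example, far too slow
    for real audio.
    """
    window = hann(frame_size)
    n_bins = frame_size // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_size)]
        spectrum = []
        for k in range(n_bins):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Toy input (hypothetical): a pure sine whose frequency falls exactly on bin 8.
sample_rate = 22050
frame_size = 64
freq = 8 * sample_rate / frame_size
signal = [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(512)]

spec = stft_magnitude(signal, frame_size=frame_size, hop=32)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), "frames x", len(spec[0]), "frequency bins, peak at bin", peak_bin)
```

The resulting 2D grid (time frames by frequency bins) is what gets treated as an image by the CNN.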

Our approach is based on a two-block convolutional model:

  • two 2D convolutional layers (resp. 32 and 128 channels) followed by a max-pooling layer
  • a 20% dropout layer
  • a global average pooling layer. This avoids the explosion of the number of parameters in comparison with a simple flatten layer.
  • a fully connected layer with 512 units
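The remark about global average pooling can be checked with simple arithmetic. The sketch below counts the weights of the 512-unit fully connected layer when it is fed by a flatten layer versus a global average pooling layer. The 64 x 64 feature-map size is a hypothetical value chosen for illustration; the actual shape in this repository is not stated. Only the 128 channels and 512 units come from the description above.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def head_params(height, width, channels, units, use_global_average_pooling):
    """Parameter count of the dense layer that follows the convolutional part."""
    if use_global_average_pooling:
        # Global average pooling keeps one value per channel.
        n_inputs = channels
    else:
        # Flatten keeps every spatial position of every channel.
        n_inputs = height * width * channels
    return dense_params(n_inputs, units)


# Assumed feature-map shape after the convolutional layers (hypothetical).
height, width, channels, units = 64, 64, 128, 512

flatten_params = head_params(height, width, channels, units, False)
gap_params = head_params(height, width, channels, units, True)
print("flatten:", flatten_params)
print("global average pooling:", gap_params)
```

Under these assumptions the pooled variant needs only `128 * 512 + 512` parameters, independent of the spectrogram size, whereas the flattened variant grows with the height and width of the feature map.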

A deeper network was tested but did not show significantly better results.

Note that the model is trained on 3-second excerpts of music in order to control the size of the images.
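As a rough illustration of this step, the sketch below cuts a 30-second track sampled at 22050 Hz into 3-second excerpts. It assumes non-overlapping excerpts and drops any incomplete tail. Whether the repository uses overlap is not stated, and the function name is hypothetical.

```python
def split_into_segments(samples, sample_rate=22050, segment_seconds=3):
    """Cut a sequence of samples into consecutive, non-overlapping segments."""
    seg_len = sample_rate * segment_seconds
    n_segments = len(samples) // seg_len  # an incomplete tail is discarded
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


# Stand-in for a decoded GTZAN track: 30 seconds of silence.
track = [0.0] * (30 * 22050)
segments = split_into_segments(track)
print(len(segments), "segments of", len(segments[0]), "samples each")
```

With these settings each track yields 10 excerpts of 66150 samples, so every spectrogram image has the same width.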

Model performance

The model is trained on a randomly chosen 80% sample of the data. The remaining 20% is used for the validation and test sets.
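A minimal sketch of such a random split is shown below. The even 10% / 10% division of the remaining data between validation and test is an assumption, since the text only gives the 20% total, and the function name and seed are hypothetical.

```python
import random


def split_indices(n_items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle item indices and cut them into train, validation and test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * train_frac)
    n_val = round(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# 1000 items, matching the number of tracks in the dataset.
train, val, test = split_indices(1000)
print(len(train), len(val), len(test))
```

It is not stated whether the split is made on whole tracks or on 3-second excerpts. Splitting on whole tracks keeps excerpts of the same recording from appearing in both the training and test sets.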

The training procedure was repeated five times, each time with random train, validation and test sets, to estimate the model accuracy. The final model performance is:

  • Mean accuracy: 94.61% (+/- 0.40%)
  • Mean loss: 0.2192 (+/- 0.0292)
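Figures of this form (mean +/- spread over repeated runs) can be obtained as sketched below. The per-run accuracies in the example are placeholders invented for illustration, not the repository's actual results, and the use of the sample standard deviation is an assumption, since the text does not say how the spread was computed.

```python
import statistics


def summarize(values):
    """Return the mean and the sample standard deviation of repeated runs."""
    return statistics.mean(values), statistics.stdev(values)


# Placeholder accuracies for five runs (NOT the repository's real numbers).
run_accuracies = [0.90, 0.92, 0.91, 0.93, 0.94]

mean_acc, std_acc = summarize(run_accuracies)
print(f"Mean accuracy: {mean_acc:.2%} (+/- {std_acc:.2%})")
```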

Model error during the training

The following figure shows the accuracy of the model during training. Slight overfitting seems to appear after 20 epochs. We should add some regularization layers or augment the dataset for better results.
