Spotify_Capstone_Project

Table of Contents

  1. Description
  2. Installing
  3. Project Tasks
  4. File Structure
  5. Medium Blog Post
  6. Results
  7. Authors
  8. License

Description

This project is part of the Data Science Nanodegree Program by Udacity. The dataset is taken from the Kaggle website. The aim of the project is to build an ML model that can predict the popularity of any song, and to build a recommendation system that can recommend songs similar to any given song.

Installing

No libraries beyond the Anaconda distribution of Python should be needed to run the code. The code should run with no issues on Python 3.x.

Project Tasks

The project is divided into the following tasks:

I. Exploratory Data Analysis

The first step is to explore the data you are working with for the project. Dive in to see what you can find. There are some basic, required questions about the data that are answered throughout the rest of the notebook. Use this space to explore before you dive into the details.
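
A first pass over the data might look like the sketch below. It assumes pandas (which ships with Anaconda) and uses a tiny made-up table in place of data/data.csv. The column names and the `summarize` helper are illustrative assumptions, not taken from the notebook.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, distinct count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(),  # NaN is not counted as a distinct value
    })


# Hypothetical stand-in for pd.read_csv("data/data.csv").
songs = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "tempo": [120.0, 0.0, 98.5, None],
    "popularity": [10, 55, 55, 80],
})

summary = summarize(songs)
print(summary)

# Basic statistics for the numeric columns (count, mean, min, max, ...).
print(songs.describe())
```

A summary like this makes it easy to spot columns with missing values or suspicious entries, such as a tempo of zero, before any modelling starts.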

II. ML Modelling

Before building a model for prediction, let's first find the features that are highly correlated with the Popularity feature and use them as the feature variables on which the model is trained. The next step is to perform feature transformations. The steps followed are:

  • Encode the object-type artists data with a numerical indicator that identifies the artist
  • Eliminate zero values from the tempo column by replacing them
  • Standardize the instrumental criteria with numeric values
  • Use OneHotEncoder from scikit-learn to create dummy variables
  • Apply min-max scaling to the relevant features
  • Apply target scaling to the Popularity column
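
The correlation-based feature selection and three of the transformations listed above (one-hot encoding, min-max scaling, and target scaling) can be sketched as follows. This is a minimal illustration on made-up data, not the notebook's code: the column names, the candidate feature list, and the 0.3 correlation threshold are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Made-up data; column names are assumptions for illustration only.
df = pd.DataFrame({
    "year":       [1950, 1970, 1990, 2010, 2020, 2000],
    "loudness":   [-20.0, -15.0, -10.0, -6.0, -5.0, -8.0],
    "liveness":   [0.30, 0.10, 0.25, 0.12, 0.28, 0.11],
    "key":        [0, 5, 7, 0, 5, 7],
    "popularity": [5, 20, 40, 65, 80, 50],
})

# 1. Keep numeric features whose absolute correlation with the target
#    exceeds a threshold (0.3 is an arbitrary choice for this sketch).
numeric = ["year", "loudness", "liveness"]
corr = df[numeric].corrwith(df["popularity"])
selected = corr[corr.abs() > 0.3].index.tolist()

# 2. Min-max scale the selected features into the range [0, 1].
scaled = MinMaxScaler().fit_transform(df[selected])

# 3. One-hot encode a categorical column into dummy columns.
dummies = OneHotEncoder().fit_transform(df[["key"]]).toarray()

# 4. Scale the target column into [0, 1] as well.
target = MinMaxScaler().fit_transform(df[["popularity"]]).ravel()

print("selected features:", selected)
print(np.round(scaled, 2))
```

Scaling the target means that error metrics computed later are expressed in scaled units rather than in raw popularity points.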

Below are a few models which I have attempted:

  1. Decision Tree Regressor
  2. Decision Tree with Grid Search CV
  3. Random Forest Regressor (RF)

The best accuracy is achieved by the Decision Tree with Grid Search CV model.
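
For readers unfamiliar with grid search, the sketch below shows how a Decision Tree Regressor can be tuned with scikit-learn's GridSearchCV. It runs on synthetic data, and the parameter grid, the scoring choice, and the train/test split are assumptions. The grid actually used in the notebook is not stated here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the prepared song features.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 3))
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] ** 2 + rng.normal(0.0, 0.02, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical search grid; every combination is cross-validated.
param_grid = {"max_depth": [2, 4, 6, 8], "min_samples_leaf": [1, 5, 10]}

search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",  # higher is better, so MAE is negated
    cv=5,
)
search.fit(X_train, y_train)

# By default the best model is refit on the whole training set.
test_r2 = search.best_estimator_.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("test R^2:", test_r2)
```

Grid search only changes how the hyperparameters are chosen. The model itself is still a single decision tree.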

III. Neighbourhood Based Collaborative Filtering Recommendation

The goal is to build a recommendation system that recommends similar songs for any given song.

In this project I have used neighbourhood-based collaborative filtering with a similarity metric. The Manhattan distance is calculated using all numerical features available in the dataset, and the songs with the smallest distance to the given song are taken as its neighbours.
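
The nearest-neighbour lookup described above can be sketched with NumPy alone. The `recommend` function and the small feature matrix are hypothetical. They only illustrate ranking songs by Manhattan (L1) distance and are not the notebook's implementation.

```python
import numpy as np


def recommend(features, song_index: int, k: int = 3) -> list:
    """Return the indices of the k songs closest to song_index (L1 distance)."""
    features = np.asarray(features, dtype=float)
    # Manhattan distance: sum of absolute differences over all features.
    distances = np.abs(features - features[song_index]).sum(axis=1)
    distances[song_index] = np.inf  # never recommend the query song itself
    return np.argsort(distances)[:k].tolist()


# Made-up feature rows (one row per song, already scaled to [0, 1]).
features = [
    [0.10, 0.80, 0.30],
    [0.12, 0.78, 0.33],
    [0.90, 0.10, 0.70],
    [0.15, 0.70, 0.25],
    [0.85, 0.20, 0.65],
]

print(recommend(features, song_index=0, k=2))
print(recommend(features, song_index=2, k=1))
```

Because the Manhattan distance sums raw differences, features measured on larger scales dominate the result unless all features are brought to a comparable range first.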

File Structure

  • data folder contains the following:

    • data.csv: contains the songs data csv file
    • data_by_artist.csv: contains the artist data csv file
    • data_by_genres: contains the genres data csv file
    • data_by_year.csv: contains the year-wise data csv file
    • data_w_genres.csv: contains the data with genres csv file
  • Spotify_Capstone_Project.ipynb: Jupyter notebook with the Python code

Medium Blog Post

The main findings of the project can be found in the post available here

Results

The results from a test run of each model are as follows:

  • Decision Tree Regressor: Mean Absolute Error of 0.0792 and R2 score of 74.896%
  • Decision Tree Regressor with Grid Search CV: Mean Absolute Error of 0.073 and R2 score of 76.6%
  • Random Forest Regressor: Mean Absolute Error of 0.0758 and R2 score of 74.683%

The best accuracy is achieved using the Decision Tree Regressor with Grid Search CV model.
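
As a reminder of what the two reported metrics measure, here is a small hand-written sketch of Mean Absolute Error and the R2 score. The `mae` and `r2` helpers and the sample values are made up for illustration. scikit-learn provides equivalent functions as `sklearn.metrics.mean_absolute_error` and `sklearn.metrics.r2_score`.

```python
import numpy as np


def mae(y_true, y_pred) -> float:
    """Mean Absolute Error: average size of the prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r2(y_true, y_pred) -> float:
    """R2 score: 1 minus (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up targets and predictions on a [0, 1] scale.
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.25, 0.35, 0.65, 0.70]

print("MAE:", mae(y_true, y_pred))
print("R2 as a percentage:", 100 * r2(y_true, y_pred))
```

A lower MAE and a higher R2 score both indicate a better fit. If the target was scaled as described under ML Modelling, the MAE is expressed in those scaled units.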

Authors

Sowmya | LinkedIn

License

This project is licensed under the MIT License - see the LICENSE.md file for details
