Recommender Systems on H&M Fashion Dataset

Introduction

In this project, I performed exploratory data analysis on the H&M Fashion Dataset to gain insight into the data, and then implemented three recommender systems with Turi Create.

The code for this project can be found in ./code/Recommender_Systems_H&M.ipynb

Dataset

The complete dataset is available HERE.

The dataset contains three .csv files and about 105K .jpg files.

The three .csv files are as follows:

  • articles.csv - detailed metadata for each article_id available for purchase
  • customers.csv - metadata for each customer_id in dataset
  • transactions_train.csv - the training data, consisting of the purchases of each customer on each date, as well as additional information. Duplicate rows correspond to multiple purchases of the same item.

./images is a folder of images corresponding to each article_id. Images are placed in subfolders named after the first three digits of the article_id. Important note: not all article_id values have a corresponding image.
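The folder layout described above can be sketched as a small path helper. This is an illustrative sketch, not code from the notebook: the function names `image_path` and `existing_image` are hypothetical, and it assumes that article_id values are 10-digit strings with leading zeros and that each image is named `<article_id>.jpg`.

```python
from pathlib import Path


def image_path(article_id, root="images"):
    """Return the expected image path for an article_id.

    Assumption: IDs are 10-digit strings with leading zeros, so IDs that
    were read as integers are zero-padded back to 10 digits first.
    """
    aid = str(article_id).zfill(10)
    # The subfolder is named after the first three digits of the ID.
    return Path(root) / aid[:3] / f"{aid}.jpg"


def existing_image(article_id, root="images"):
    """Return the image path, or None when the article has no image."""
    path = image_path(article_id, root)
    return path if path.is_file() else None
```

Because some articles have no image, callers should handle the `None` case instead of assuming that every path exists.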

Exploratory Data Analytics

In this part, I applied data preprocessing methods to articles.csv, customers.csv, and transactions_train.csv to impute missing values, and made visualizations to get a more direct look at the data.

[Figures: Product Type, Wordcloud]

Recommender Systems

In this part, I used Turi Create to implement three recommender systems, each of which recommends the top-12 items for every user. Turi Create is an open-source library, available on GitHub, that simplifies the development of custom machine learning models. Since Turi Create is built around the SFrame data structure, we use SFrame to read in and prepare the datasets.

Prepare Dataset

To prepare the dataset, we first create a normalized matrix with customers as rows and articles as columns, and then split the data into a training set and a testing set.

  • Training Dataset: 70%
  • Testing Dataset: 30%
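The preparation step can be sketched in plain Python. This is a minimal sketch rather than the notebook's code: the README does not say which normalization was used, so the example assumes each customer's purchase counts are divided by that customer's largest count. The names `build_normalized_matrix` and `split_rows` and the fixed seed are hypothetical.

```python
import random
from collections import Counter, defaultdict


def build_normalized_matrix(transactions):
    """Turn (customer_id, article_id) rows into per-customer scaled counts.

    Assumption: "normalized" means each customer's purchase counts are
    divided by that customer's largest count, giving values in (0, 1].
    """
    counts = Counter(transactions)  # duplicate rows are repeated purchases
    matrix = defaultdict(dict)
    for (customer, article), n in counts.items():
        matrix[customer][article] = n
    for row in matrix.values():
        top = max(row.values())
        for article in row:
            row[article] /= top
    return dict(matrix)


def split_rows(matrix, train_fraction=0.7, seed=42):
    """Shuffle (customer, article, value) entries and split them 70/30."""
    entries = [(c, a, v) for c, row in matrix.items() for a, v in row.items()]
    random.Random(seed).shuffle(entries)
    cut = int(round(len(entries) * train_fraction))
    return entries[:cut], entries[cut:]
```

The matrix is stored sparsely as a dictionary of dictionaries, since most customers buy only a tiny fraction of the available articles.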

Models

The three recommender systems are the Popularity Recommender System, the Cosine Recommender System, and the Pearson Recommender System. The Popularity Recommender System recommends the 12 most popular items overall, while the Cosine and Pearson Recommender Systems use collaborative filtering to recommend the 12 items most similar to the user's previous purchases, measured by cosine similarity and Pearson correlation respectively.
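The ideas behind the three models can be illustrated with the standard library alone. The sketch below is not Turi Create's implementation and not the notebook's code; the function names are hypothetical, and the similarity functions assume two equal-length vectors of per-customer values for a pair of items.

```python
import math
from collections import Counter


def top_k_popular(transactions, k=12):
    """Return the k most purchased article_ids, most popular first."""
    counts = Counter(article for _, article in transactions)
    return [article for article, _ in counts.most_common(k)]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pearson_similarity(u, v):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity([a - mean_u for a in u], [b - mean_v for b in v])
```

The only difference between the two similarity measures is the mean-centering step, which makes Pearson correlation insensitive to each item's average level.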

Results

After training, the RMSE, mean precision, and mean recall of each model are calculated to evaluate its performance. The complete evaluation output can be found in the .txt and '.txt'.

The RMSE of each model is:

Model                            RMSE
Popularity Recommender System    1.699
Cosine Recommender System        1.017
Pearson Recommender System       1.675
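For reference, the three evaluation metrics can be written out as follows. This is a minimal sketch of the standard definitions, not the evaluation code that produced the numbers above; the function names and the default cutoff of 12 are assumptions chosen to match the top-12 setting.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def precision_recall_at_k(recommended, relevant, k=12):
    """Precision and recall of the first k recommendations for one user."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Mean precision and mean recall are then the averages of these per-user values over all users in the testing set. A lower RMSE is better, while higher precision and recall are better.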

Project Structure

├─code
│ ├─Recommender_System_H&M.ipynb   # Jupyter Notebook for this project
├─output
│ ├─eval_counts.txt   # evaluation results of the three models
├─public   # Some of the example images
│ ├─example1.png
│ ├─example2.png
│ ├─H&M.jpg
