beer_recommendations

Goal

To build a recommendation system that suggests beers based on users' previous ratings on beeradvocate.com.

Overview

The craft beer scene has exploded in recent years, with microbreweries and major beer suppliers constantly releasing new kinds of beer. This makes it hard to pick out a beer that you will like, whether you are a casual beer drinker or consider yourself a connoisseur. Several apps and websites are now dedicated solely to ranking and classifying beers. Whether the aim is to help an individual remember what they liked and disliked or to help others discover new beers, it is worthwhile to explore these reviews and try to predict which new beers an individual would like.

This project uses data scraped with Beautiful Soup from beeradvocate.com, where users create profiles and can rate any beer they try on a scale of 1 to 5. The site's rating system lets users take multiple factors into consideration when rating, in a way that everyone can relate to.
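
As a rough illustration of the scraping step, the sketch below parses beer names and scores out of an HTML fragment with Beautiful Soup. The HTML layout, the class names (review, beer, score), and the function name parse_reviews are all hypothetical and were invented for this example; the real beeradvocate.com pages are structured differently and the selectors would need to be adapted.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded profile page.
html = """
<table>
  <tr class="review"><td class="beer">Pale Ale</td><td class="score">4.5</td></tr>
  <tr class="review"><td class="beer">Stout</td><td class="score">3</td></tr>
</table>
"""

def parse_reviews(page_html):
    """Return a list of (beer name, rating) pairs found in the page."""
    soup = BeautifulSoup(page_html, "html.parser")
    results = []
    for row in soup.select("tr.review"):
        name = row.select_one(".beer").get_text(strip=True)
        score = float(row.select_one(".score").get_text(strip=True))
        results.append((name, score))
    return results

parsed = parse_reviews(html)
```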

Dataset

The first thing we looked at was which users to collect data from. The website breaks reviews down into five sections: recent reviews, top reviews for the last year, top reviews of all time, most popular reviews, and beer hall of fame reviews. We took a small sample of users from each section to capture a variety of users in terms of the types of beer they review, how often they review, and how varied their choices are.

Using Beautiful Soup we scraped users' profiles and created a sparse matrix covering 2,682 users and 32,917 different beers, for a total of 86,133 reviews.
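
A minimal sketch of how a user-by-beer sparse matrix can be assembled from scraped reviews is shown below. The review triples, user names, and beer names are made up for illustration, and the choice of CSR format is an assumption, not necessarily what the project used.

```python
from scipy.sparse import csr_matrix

# Hypothetical scraped reviews: (user, beer, rating on the 1 to 5 scale).
reviews = [
    ("alice", "Pale Ale", 4.5),
    ("alice", "Stout", 3.0),
    ("bob", "Stout", 4.0),
    ("carol", "IPA", 5.0),
    ("carol", "Pale Ale", 2.5),
]

# Map each user to a row index and each beer to a column index.
users = sorted({u for u, _, _ in reviews})
beers = sorted({b for _, b, _ in reviews})
user_index = {u: i for i, u in enumerate(users)}
beer_index = {b: j for j, b in enumerate(beers)}

rows = [user_index[u] for u, _, _ in reviews]
cols = [beer_index[b] for _, b, _ in reviews]
vals = [r for _, _, r in reviews]

# Only the reviewed (user, beer) pairs are stored; everything else is empty.
ratings = csr_matrix((vals, (rows, cols)), shape=(len(users), len(beers)))
```

With 2,682 users and 32,917 beers, only 86,133 of roughly 88 million cells are populated, which is why a sparse format is a natural fit.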

Next we created a beer matrix that included the name of the beer, ABV, brewery, and type of the beer.

Data Processing

Once we had all of our data, we normalized the ratings. We took the populated values of the sparse matrix and subtracted each column's average from them. We then ran a Singular Value Decomposition (SVD) using the scipy.sparse.linalg library. The decomposition expresses the ratings as a linear transformation whose weights are the singular values held in the sigma matrix.
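
The normalization and decomposition described above can be sketched as follows. The small ratings matrix and the value k = 2 are assumptions made for the example; the project's actual number of factors is not stated. The sketch uses scipy.sparse.linalg.svds, which computes a truncated SVD.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Hypothetical 4-user x 4-beer ratings; 0 means "not rated".
dense = np.array([
    [4.5, 0.0, 3.0, 0.0],
    [0.0, 4.0, 0.0, 2.0],
    [5.0, 2.5, 0.0, 0.0],
    [0.0, 3.5, 4.0, 1.0],
])
ratings = csr_matrix(dense).tocsc()  # CSC stores values column by column

# Mean of the populated entries in each column (beer).
counts = np.diff(ratings.indptr)
sums = np.asarray(ratings.sum(axis=0)).ravel()
col_means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

# Subtract each column's mean from its populated entries only.
centered = ratings.copy().astype(float)
centered.data -= np.repeat(col_means, counts)

# Truncated SVD; k must be smaller than min(centered.shape).
k = 2
U, sigma, Vt = svds(centered, k=k)
```

Here U has one row per user, Vt has one column per beer, and sigma holds the k singular values as a 1-D array.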

Next we made our predictions from the decomposition, using sigma as a weighted diagonal matrix. These predictions first give a rating for every user against every beer that the user has already reviewed. We then applied the model to the entire data set to predict ratings for beers that users have not yet rated. From this we built a function that suggests top recommendations for each user from the weighted predictions and the user's original ratings.
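
One way the prediction and recommendation step might look is sketched below. The data, the beer names, k = 2, and the function name recommend are illustrative assumptions and do not come from the project's code. Predicted ratings are rebuilt as U · diag(sigma) · Vt, with the column means added back.

```python
import numpy as np
from scipy.sparse.linalg import svds

beers = ["Pale Ale", "IPA", "Stout", "Porter"]  # hypothetical names
ratings = np.array([
    [4.5, 0.0, 3.0, 0.0],
    [0.0, 4.0, 0.0, 2.0],
    [5.0, 2.5, 0.0, 0.0],
    [0.0, 3.5, 4.0, 1.0],
])  # 0 means "not rated"

# Center the populated entries on their column (beer) means.
rated = ratings > 0
counts = rated.sum(axis=0)
col_means = ratings.sum(axis=0) / np.maximum(counts, 1)
centered = np.where(rated, ratings - col_means, 0.0)

# Factor, then rebuild a full matrix of predicted ratings.
U, sigma, Vt = svds(centered, k=2)
predicted = U @ np.diag(sigma) @ Vt + col_means

def recommend(user_row, n=2):
    """Return up to n (beer, predicted rating) pairs the user has not rated."""
    unrated = np.flatnonzero(~rated[user_row])
    order = unrated[np.argsort(predicted[user_row, unrated])[::-1]]
    return [(beers[j], float(predicted[user_row, j])) for j in order[:n]]

recs = recommend(0)
```

Filtering on the original ratings keeps beers the user has already reviewed out of the suggestions.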

If you want suggestions based on a particular beer that you like, we created a function that finds the user who rated that beer highest and returns that user's top recommendations.
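
A sketch of this beer-based lookup is given below, under stated assumptions: the predicted matrix is a hand-written stand-in for model output, the function names are hypothetical, and ties for the highest rating go to the first user found.

```python
import numpy as np

beers = ["Pale Ale", "IPA", "Stout", "Porter"]  # hypothetical names
ratings = np.array([
    [4.5, 0.0, 3.0, 0.0],
    [0.0, 4.0, 0.0, 2.0],
    [5.0, 2.5, 0.0, 0.0],
    [0.0, 3.5, 4.0, 1.0],
])  # 0 means "not rated"

# Hypothetical predicted ratings, standing in for the model's output.
predicted = np.array([
    [4.4, 3.1, 3.2, 1.4],
    [4.6, 3.9, 3.6, 1.9],
    [4.9, 2.7, 3.8, 1.2],
    [4.7, 3.4, 3.9, 1.1],
])

def recommend_for_user(user_row, n=2):
    """Names of the n unrated beers with the highest predicted rating."""
    unrated = np.flatnonzero(ratings[user_row] == 0)
    order = unrated[np.argsort(-predicted[user_row, unrated])]
    return [beers[j] for j in order[:n]]

def recommend_from_beer(beer_name, n=2):
    """Recommendations of the user who gave beer_name its highest rating."""
    j = beers.index(beer_name)
    top_fan = int(np.argmax(ratings[:, j]))  # first user wins a tie
    return recommend_for_user(top_fan, n)

suggestions = recommend_from_beer("Pale Ale")
```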

Validation and Results

After collecting all of this data, we cross-validated using the surprise.model_selection library, comparing the models SVD, KNNBaseline, KNNBasic, KNNWithMeans, KNNWithZScore, BaselineOnly, and CoClustering. We found that the SVD that we ran had the lowest RMSE of the models compared.
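
For reference, RMSE is the square root of the mean squared difference between held-out ratings and predicted ratings. The sketch below computes it with NumPy and picks the model with the lowest score. The held-out ratings, the two sets of predictions, and the names model_a and model_b are invented for illustration and are not the project's results.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Hypothetical held-out ratings and two models' predictions for them.
held_out = [4.0, 3.5, 5.0, 2.0]
model_predictions = {
    "model_a": [3.5, 3.5, 4.5, 2.5],
    "model_b": [3.0, 4.5, 4.0, 3.0],
}

scores = {name: rmse(held_out, p) for name, p in model_predictions.items()}
best = min(scores, key=scores.get)  # lower RMSE is better
```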

Additionally, we found which type of beer received the most reviews, the brewery with the most reviews, which ABV was most frequently reviewed, and the spread of the most reviewed beer types across the most rated breweries.
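
Counts like these can be produced with a few lines of standard-library Python. The review records below are invented for illustration; each one stands for a single review and carries the reviewed beer's type, brewery, and ABV.

```python
from collections import Counter

# Hypothetical records, one per review: (beer type, brewery, ABV).
reviews = [
    ("IPA", "Brewery A", 6.5),
    ("IPA", "Brewery B", 7.0),
    ("Stout", "Brewery A", 8.0),
    ("IPA", "Brewery A", 6.5),
]

# Tally reviews by type, brewery, and ABV.
type_counts = Counter(beer_type for beer_type, _, _ in reviews)
brewery_counts = Counter(brewery for _, brewery, _ in reviews)
abv_counts = Counter(abv for _, _, abv in reviews)

# Spread of beer types across breweries: count each (brewery, type) pair.
spread = Counter((brewery, beer_type) for beer_type, brewery, _ in reviews)

top_type, top_type_reviews = type_counts.most_common(1)[0]
```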

Contributors

scbronder
