Giter Club home page Giter Club logo

steam-recsys's Introduction

Static Badge


Logo Logo

Games Recommender System using LightFM on Steam Dataset

Picture having a buddy who just gets you and suggests the ideal game for your adventures
Explore the docs » · Check Project Blog on Medium » · View Demo »

Table of Contents
  1. About The Project
  2. Steam dataset
  3. Model architecture
  4. Results (Metrics)
  5. Contact
  6. License

About the project

In this project, we developed a Games Recommender System using the LightFM library on the Steam dataset. The recommendation system employs a hybrid approach, combining collaborative filtering and content-based filtering techniques to provide personalized game suggestions for both multiplayer gaming sessions and solo adventures on the Steam platform. By considering user interactions, game metadata, and individual preferences, our model aims to enhance the gaming experience by offering nuanced and tailored recommendations.

Steam dataset

For this project/post, we’ll be using the Steam dataset, which contains 7,793,069 reviews, 2,567,538 users, and 32,135 games. In addition to the review text, the data also includes the users’ play hours in each review.

We’ll focus on the interactions between users and items as well as on metadata about games, such as publisher, release date, genres and tags.

Model architecture

LightFM package

LightFM is a Python implementation of a hybrid recommendation algorithms for both implicit and explicit feedbacks.

It is a hybrid content-collaborative model which represents users and items as linear combinations of their content features’ latent factors. The model learns embeddings or latent representations of the users and items in such a way that it encodes user preferences over items. These representations produce scores for every item for a given user; items scored highly are more likely to be interesting to the user.

The user and item embeddings are estimated for every feature, and these features are then added together to be the final representations for users and items.

Modelling approach

Let $U$ be the set of users and $I$ be the set of items, and each user can be described by a set of user features $f_{u} \subset F^{U}$ whilst each items can be described by item features $f_{i} \subset F^{I}$. Both $F^{U}$ and $F^{I}$ are all the features which fully describe all users and items.

The LightFM model operates based binary feedbacks, the ratings will be normalised into two groups. The user-item interaction pairs $(u,i) \in U\times I$ are the union of positive (favourable reviews) $S^+$ and negative interactions (negative reviews) $S^-$ for explicit ratings. For implicit feedbacks, these can be the observed and not observed interactions respectively.

For each user and item feature, their embeddings are $e_{f}^{U}$ and $e_{f}^{I}$ respectively. Furthermore, each feature is also has a scalar bias term ($b_U^f$ for user and $b_I^f$ for item features). The embedding (latent representation) of user $u$ and item $i$ are the sum of its respective features’ latent vectors:

$$ q_{u} = \sum_{j \in f_{u}} e_{j}^{U} $$

$$ p_{i} = \sum_{j \in f_{i}} e_{j}^{I} $$

Similarly the biases for user $u$ and item $i$ are the sum of its respective bias vectors. These variables capture the variation in behaviour across users and items:

$$ b_{u} = \sum_{j \in f_{u}} b_{j}^{U} $$

$$ b_{i} = \sum_{j \in f_{i}} b_{j}^{I} $$

In LightFM, the representation for each user/item is a linear weighted sum of its feature vectors.

Results (Metrics)

We trained two variants of model: Hybrid Filtering and Colaborative Filtering. Here the metrics calculated on train (blue) and test (orange) data: From metrics you can see that Collaborative Filtering metrics is slightly but better than Hybrid Filtering. It could be confusing at first, but this problem is lay under the model implementation. At first we thought the model is broken, but then we understand that it’s quite straightforward: the model essentially calculates the average of the embeddings for all the features it receives. Due to this averaging process, the model lacks the ability to distinguish uninformative features and disregard them.

As a result, including numerous uninformative features can adversely impact your model by diminishing the significance of valuable features. To address this issue, you might need to consider employing more advanced models that LightFM does not provide implementations for.

Additionally, it’s worth noting that metadata features are more likely to enhance performance in situations with very sparse datasets or sparse subsets of your data, such as long-tail or cold-start scenarios.

Contact

Abdulayev Damir - telegram - [email protected]

Dautov Almaz - telegram - [email protected]

Project Link: https://github.com/thehir0/steam-recsys

License

Distributed under the MIT License. See LICENSE for more information.

steam-recsys's People

Contributors

speedfiref avatar thehir0 avatar

Watchers

 avatar

Forkers

speedfiref

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.