multi-armed-bandits's Introduction

Multi-Armed Bandits

The multi-armed bandit (MAB) is a well-known problem in probability theory and reinforcement learning, in which a slot machine (resource pool) has n arms (bandits/resources), each with its own unknown probability of success. It is a classic sequential resource allocation problem: the objective is to pull the arms (resources) in a sequence that maximises the cumulative reward in the face of uncertainty. Traditional A/B testing is an offline resource allocation method that dedicates a period of time purely to exploration, with traffic split equally between the two versions, so a lot of time and revenue is wasted on the losing variant. MABs, on the other hand, balance exploration and exploitation as more knowledge is gained. In the context of internet advertisements, the goal is to learn the click-through rates of several competing advertisements in order to converge to the most appropriate set of advertisements for the user in question.
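To make the exploration/exploitation trade-off concrete, here is a minimal sketch of the standard UCB1 strategy on simulated Bernoulli arms. It is not the algorithm implemented in this repository; the function name `ucb1`, the success probabilities, and the number of rounds are made up for illustration.

```python
import math
import random


def ucb1(success_probs, rounds, seed=0):
    """Run UCB1 on simulated Bernoulli arms; return pull counts and total reward."""
    rng = random.Random(seed)
    n_arms = len(success_probs)
    pulls = [0] * n_arms    # number of times each arm was pulled
    wins = [0.0] * n_arms   # summed reward collected from each arm
    total_reward = 0.0

    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            # Empirical mean plus an exploration bonus that shrinks with more pulls.
            arm = max(
                range(n_arms),
                key=lambda a: wins[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        pulls[arm] += 1
        wins[arm] += reward
        total_reward += reward
    return pulls, total_reward


# Illustrative success probabilities; the third arm is the best one.
pulls, total_reward = ucb1([0.2, 0.5, 0.8], rounds=5000)
print(pulls, total_reward)
```

Unlike an A/B test with a fixed equal split, the pull counts concentrate on the arm with the highest success probability as the estimates become more reliable.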

The proposed system implements CSlogUCB-F, an algorithm we developed that uses the side information available in each round (user context and advertisement context) to improve the relevance of the chosen ads to the user making the search query, while ensuring fairness to all advertisers. First, the data is preprocessed to form normalised context vectors. These are fed to a module that implements the proposed algorithm and selects the best advertisement for the current query based on the input context. The parameters of the algorithm, including estimated rewards and fairness debts, are then updated based on user feedback.
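The select-then-update loop described above can be illustrated with a toy sketch. This is not CSlogUCB-F: it has no confidence-bound term, and the class `ToyFairLogisticSelector`, its parameters (`min_share`, `debt_weight`, `lr`), and the debt update rule are all assumptions chosen only to show how a logistic click estimate and a fairness debt might interact. See Report.pdf and the CSMAB_F script for the actual method.

```python
import math


def normalise(vec):
    """Scale a context vector to unit L2 norm; a zero vector is returned as-is."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ToyFairLogisticSelector:
    """Hypothetical selector, NOT CSlogUCB-F: logistic click estimate plus fairness debt."""

    def __init__(self, n_ads, dim, min_share, debt_weight=1.0, lr=0.1):
        self.theta = [[0.0] * dim for _ in range(n_ads)]  # per-ad coefficients
        self.debt = [0.0] * n_ads                         # per-ad fairness debt
        self.min_share = min_share      # assumed minimum fraction of rounds per ad
        self.debt_weight = debt_weight  # assumed trade-off between debt and relevance
        self.lr = lr                    # gradient step size

    def estimate(self, ad, context):
        """Estimated click probability of an ad for a (normalised) context."""
        x = normalise(context)
        return sigmoid(sum(w * v for w, v in zip(self.theta[ad], x)))

    def select(self, contexts):
        """Pick the ad with the highest estimated click probability plus weighted debt."""
        scores = [
            self.estimate(ad, ctx) + self.debt_weight * self.debt[ad]
            for ad, ctx in enumerate(contexts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, ad, context, clicked):
        """One gradient step on the logistic loss for the shown ad, then update debts."""
        x = normalise(context)
        error = clicked - self.estimate(ad, context)
        self.theta[ad] = [w + self.lr * error * v for w, v in zip(self.theta[ad], x)]
        for a in range(len(self.debt)):
            served = 1.0 if a == ad else 0.0
            # Debt grows while an ad is not shown and is paid down when it is shown.
            self.debt[a] = max(self.debt[a] + self.min_share - served, 0.0)


selector = ToyFairLogisticSelector(n_ads=2, dim=2, min_share=0.2)
contexts = [[3.0, 4.0], [1.0, 0.0]]  # one made-up context vector per ad
shown = selector.select(contexts)
selector.update(shown, contexts[shown], clicked=1)
print(shown, selector.debt)
```

In this toy version, an ad that has not been shown for several rounds accumulates debt until its score overtakes a more relevant competitor, which is one simple way to trade relevance against a minimum exposure guarantee.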

Introducing context into the CSMAB-F algorithm, which yields the proposed algorithm, empirically reduced the regret accumulated over the time steps of a run. We also proposed an improved fairness notion that models real-world scenarios more accurately; in the experiments conducted, it was found to reduce the regret as well. Logistic regression was used to learn the coefficient vectors for the context, which also contributed to the reduced regret.
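For readers unfamiliar with the metric, the sketch below shows how cumulative regret is commonly computed in a simulation: the gap between the best expected reward and the expected reward of the chosen arm, summed over time. It assumes fixed expected rewards per arm for simplicity (in a contextual setting the best arm can change every round), and the function name and numbers are illustrative, not taken from the repository or the report.

```python
def cumulative_regret(expected_rewards, chosen_arms):
    """Return the running sum of (best expected reward - chosen arm's expected reward)."""
    best = max(expected_rewards)
    curve, total = [], 0.0
    for arm in chosen_arms:
        total += best - expected_rewards[arm]
        curve.append(total)
    return curve


# Made-up example: three arms, and a policy that settles on the best arm (index 2).
curve = cumulative_regret([0.2, 0.5, 0.8], [0, 1, 2, 2, 2])
print(curve)
```

A regret curve that flattens out, as in this example once the best arm is chosen, indicates that the policy has stopped paying for exploration.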

For detailed results, please view the Report.pdf file.

Steps to Execute

  1. The dataset used is a subset of the Avito Context Ad Clicks data (https://www.kaggle.com/c/avito-context-ad-clicks/data). The dataset, which is originally in Russian, was translated to English for convenience. The advertisement titles are vectorized using the word_to_vector Python script, which creates two npy files.
  2. Then run the CSMAB_F Python script, which predicts the best advertisements to display based on the user context.

multi-armed-bandits's People

Contributors

shreyasar2202

Forkers

namithap10

multi-armed-bandits's Issues

Error while running CSMAB_F

Hi shreyasar, thanks for sharing your work. It is really interesting, and I'm working on a similar kind of problem. I was trying to run your code to get some extra insights into your work, but I can't run the file. I always get the error:
AttributeError: 'DataFrame' object has no attribute 'HistCTR'

Do you know how it could be fixed?

Kind regards,
Vito
