Giter Club home page Giter Club logo

customer-behavior-prediction's Introduction

Customer-Behavior-Prediction

These scripts are used to predict wether a website customer will make a purchase in an online store based on their browsing history. They could be used as part of a realtime webpage control system in order to improve customer purchase behavior. The system would compute the buying probability given the current customer history combined with each of several possible webpage responses. The best response say, 'make button b larger and display ad C') would be actually executed.

The learning component could be periodically run offline on historical data

This directory contains the data files, scripts, outputs and plots.

Strategies

There are three learning strategies implemented - one (1) based simply on total time the user spends on the webpage; the other two based on a Naive Bayes classifier approach, using either the individual pages (2a) visited or the sequences of 3 pages (2b) visited.

Strategy 2b has the greatest area under the precision - recall curve, so this one is the best of the three. Its results marking the test.csv data are in predict2b.dat.

Probability of Buy

The files predict* are generated using the indicated strategies, marking the final element of each line with a number between 0 and 1. This number represents the predicted probability of buy : 1 is a certain buy, 0 is a certain no buy. To convert to a binary classification, simply select a threshold between 0 and 1. Then mark the lines with p_buy greater than the threshold as a buy and those less as a no-buy. Note that different choices of threshold will correpond to different amounts of false positives, false negatives, true negatives, and true positives. A higher threshold will produce fewer false positives but more false negatives. The overall p_buy in the training data is ~0.23. A threshold of ~0.4 is suggested.


The scorer.py script is used to evaluate the strategies, by training against either half or all the training data and then evaulating the perfomrance of each strategy using either the other half of the training data or all the training data as the source of truth.

The *.png files are the plots generated to analyze the training data and understand the performance of the strategies.

See comments in the Strat1.py and Strat2.py scripts for more information about the method, assumptions, and intermediate results.

customer-behavior-prediction's People

Contributors

mickeyshaughnessy avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.