Starbucks Capstone Challenge

This repository contains a solution to the Starbucks Capstone Challenge offered by Udacity's Data Science Nanodegree program.

Included is a data set from Starbucks with simulated transaction logs of customers interacting with offers that Starbucks distributed to them.

Project Motivation

The goal is to derive a predictive model from the data that indicates whether a person is likely to spend money in response to a discount offer or a "buy one, get one free" (BOGO) offer.

For more information, read the corresponding blog post on Medium: https://medium.com/@sopka/prediction-model-for-starbucks-bogo-and-discount-offers-e63383fd6469

File Structure

├── data_aggregation.ipynb Jupyter notebook for data exploration, processing, aggregation, and statistical evaluation

├── train_keras.py Python script to train a Keras neural network model

├── train_model.ipynb Jupyter notebook to construct and evaluate the feature matrix and fit models with linear regression and radius neighbors classification

├── data This directory contains a simulated data set that mimics customer behavior on the Starbucks rewards mobile app.

│   ├── profile.json Represents fictitious or anonymized Rewards Program users (17000 users x 5 fields)

  • gender: (categorical) M, F, O, or null
  • age: (numeric) missing value encoded as 118
  • id: (string/hash)
  • became_member_on: (date) format YYYYMMDD
  • income: (numeric)
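The placeholder age and the compact date format both need handling before any statistics are computed, since treating 118 as a real age would skew them. A minimal sketch with pandas follows; the function name `clean_profile` and the sample rows are made up for illustration and are not taken from the notebooks.

```python
import pandas as pd


def clean_profile(profile: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with the age placeholder 118 set to NaN and dates parsed."""
    cleaned = profile.copy()
    # 118 is the documented placeholder for a missing age; mask it as NaN.
    cleaned["age"] = cleaned["age"].where(cleaned["age"] != 118)
    # became_member_on is stored as YYYYMMDD; convert via str before parsing.
    cleaned["became_member_on"] = pd.to_datetime(
        cleaned["became_member_on"].astype(str), format="%Y%m%d"
    )
    return cleaned


# Hypothetical sample rows, not real records from profile.json.
sample = pd.DataFrame(
    {
        "gender": ["F", None],
        "age": [55, 118],
        "id": ["a1", "b2"],
        "became_member_on": [20170715, 20180226],
        "income": [112000.0, None],
    }
)
cleaned = clean_profile(sample)
print(cleaned[["age", "became_member_on"]])
```

Whether the notebook drops such users or imputes their values is not stated here; the sketch only marks them as missing.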

│   ├── portfolio.json Defines offers sent during 30-day test period (10 offers x 6 fields)

  • reward: (numeric) money awarded for the amount spent
  • channels: (list) web, email, mobile, social
  • difficulty: (numeric) money required to be spent to receive reward
  • duration: (numeric) time for offer to be open, in days
  • offer_type: (string) bogo, discount, informational
  • id: (string/hash)

│   ├── transcript.json Contains event logs (306648 events x 4 fields)

  • person: (string/hash)
  • event: (string) offer received, offer viewed, transaction, offer completed
  • value: (dictionary) different values depending on event type
    • offer id: (string/hash) not associated with any "transaction"
    • amount: (numeric) money spent in "transaction"
    • reward: (numeric) money gained from "offer completed"
  • time: (numeric) hours after start of test
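Because the content of value depends on the event type, it is usually easier to work with the log once the nested keys are lifted into columns of their own. The following sketch uses only the standard library and assumes one JSON object per line; the sample lines and the helper name `flatten_event` are hypothetical, and the real file layout may differ.

```python
import json

# Hypothetical sample lines that follow the field list above.
SAMPLE_LINES = [
    '{"person": "p1", "event": "offer received", "value": {"offer id": "o1"}, "time": 0}',
    '{"person": "p1", "event": "transaction", "value": {"amount": 9.5}, "time": 6}',
    '{"person": "p1", "event": "offer completed", "value": {"offer id": "o1", "reward": 5}, "time": 6}',
]


def flatten_event(record: dict) -> dict:
    """Lift the keys of the nested `value` dictionary to the top level."""
    value = record.get("value", {})
    return {
        "person": record["person"],
        "event": record["event"],
        "time": record["time"],
        # Keys that do not apply to an event type are left as None.
        "offer_id": value.get("offer id"),
        "amount": value.get("amount"),
        "reward": value.get("reward"),
    }


events = [flatten_event(json.loads(line)) for line in SAMPLE_LINES]
for event in events:
    print(event)
```

A list of flat dictionaries like this can be passed directly to `pandas.DataFrame` for further aggregation.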

Installation

All scripts and Python source code have been tested on macOS and Linux only.

Initialize the python environment

To avoid polluting your local environment with new Python libraries, create a virtual Python environment in the current directory. The only requirement is that Python 3.8 is installed on your system. Then run:

python3 -m venv venv

And activate the virtual environment in your current shell:

source ./venv/bin/activate

Install necessary Libraries

The project uses the libraries listed below. Install them with pip inside the activated virtual environment:

pip install --upgrade pip
pip install pandas
pip install jupyterlab
pip install matplotlib
pip install plotly
pip install scikit-learn
pip install seaborn
pip install tensorflow
pip install keras

Project Analysis

Start with the Jupyter notebook data_aggregation.ipynb. A complete run aggregates the data into a CSV file called enriched_dtevent.csv in the root of this repository.

This file is then used in the Jupyter notebook train_model.ipynb to train models with linear regression and radius neighbors classification.

The accuracy of these models is around 65%.
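As a rough illustration of the radius neighbors step, the sketch below fits scikit-learn's `RadiusNeighborsClassifier` on synthetic data and reports test accuracy. The synthetic features, the radius of 1.5, the scaling step, and the outlier handling are all assumptions made for this example; they are not the settings used in train_model.ipynb, and the resulting accuracy says nothing about the Starbucks data.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import RadiusNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix; not the Starbucks data.
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = make_pipeline(
    # Scale features so that a single radius is meaningful across dimensions.
    StandardScaler(),
    # Points with no training neighbor inside the radius get the majority class.
    RadiusNeighborsClassifier(radius=1.5, outlier_label="most_frequent"),
)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"accuracy: {accuracy:.2f}")
```

Without `outlier_label`, prediction raises an error for any test point that has no training sample within the radius, so some fallback is needed when the radius is small.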

There is also a Python script called train_keras.py. It uses the same aggregated data from the CSV file enriched_dtevent.csv to train a simple neural network with Keras.

The accuracy of this model is also around 65%.

Acknowledgements

Thank you Starbucks and Udacity for providing the data sets!
