NBA-point-spread-bet

I am creating a classification model that predicts whether a team will beat the spread in regular season NBA games, so I can win more lunch bets against my friend Eric.

Table of Contents

  1. Motivation
  2. Getting Started
  3. Approach
  4. Thoughts on Feature Selection
  5. Results
  6. Bet Against Eric
  7. Future Work
  8. Closing

Motivation

Basketball is my favorite sport to watch, specifically the NBA. Go Rockets!! It's even more exciting when you personally have something on the line. I have a friend named Eric who loves the NBA just as much as I do, maybe even more. Eric and I often wager a meal (usually lunch) on the point spread of a game.

The probability of either outcome against the spread is roughly 50%, so you have practically even odds whichever side you choose. If that is true, my long-term expected value of lunches won against Eric is about 0. I really, really, really wanted to win more lunch bets than him, so I thought I could use machine learning to help achieve that goal.

Getting Started

I started by gathering regular season spread data (closing line) from the 2014-2015 season through the 2017-2018 season. I also collected box score data for those seasons. With 5 seasons' worth of data, I started doing exploratory data analysis.

Approach

The goal was to train classification models that achieve greater than 50% accuracy when betting on the spread. I chose to work with Logistic Regression, Random Forest Classifier and Gradient Boosting Classifier.
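A classifier needs a target, and here the natural one is whether a team covered the spread. The sketch below shows one way to derive that label from a final score and a closing line. The function name, the sign convention (negative spread for a favorite) and the handling of pushes are assumptions for illustration, not necessarily how the project labels its data.

```python
from typing import Optional


def covered_spread(team_points: int, opponent_points: int, spread: float) -> Optional[bool]:
    """Return True if the team covered, False if it did not, None on a push.

    `spread` is the team's closing line, e.g. -5.5 for a 5.5-point favorite
    or +3.0 for a 3-point underdog (this sign convention is an assumption).
    """
    margin = team_points + spread - opponent_points
    if margin == 0:
        return None  # push: neither side wins the bet
    return margin > 0


# Hypothetical games: (team points, opponent points, team's closing spread)
games = [(110, 100, -5.5), (98, 101, 3.0), (95, 104, 4.5)]
labels = [covered_spread(*game) for game in games]
print(labels)  # [True, None, False]
```

Pushes would need to be dropped or assigned to one class before training a binary model.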

Thoughts on Feature Selection

The idea I had was to use rolling averages of every stat for each team in a particular matchup. I started with a window length of 5 (the previous 5 games), and those rolling averages became the main features I used to transform my data for training. I also included each team's record against the spread, represented as the proportion of wins against the spread to total games played. This column was a little tricky to create: if you just took the cumulative sum of a team's results against the spread and added it as a feature column, each row would include the outcome of the game being predicted, which introduces data leakage. So I took the cumulative sum, shifted the results down by one game, and inserted a row at the top with 0.0 for the first game of the season. That also meant excluding the last row of the cumulative sum.
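The shift described above can be sketched in plain Python as follows. Both helpers only look at games played before the one being predicted. The function names and sample numbers are hypothetical, and leaving the rolling average empty until a full window of earlier games exists is an assumption rather than a detail from the project.

```python
from typing import List, Optional


def prior_ats_record(covered: List[int]) -> List[float]:
    """Win proportion against the spread using only earlier games.

    `covered` holds 1 (covered) or 0 (did not cover) for one team's games
    in one season, in chronological order.
    """
    record = []
    wins = 0
    for games_played, result in enumerate(covered):
        # Store the value before counting the current result, so game i
        # only sees games 0..i-1. The first game of the season gets 0.0.
        record.append(wins / games_played if games_played else 0.0)
        wins += result
    return record


def prior_rolling_mean(values: List[float], window: int) -> List[Optional[float]]:
    """Mean of the previous `window` games, excluding the current game."""
    means: List[Optional[float]] = []
    for i in range(len(values)):
        if i < window:
            means.append(None)  # not enough earlier games for a full window
        else:
            means.append(sum(values[i - window:i]) / window)
    return means


ats = prior_ats_record([1, 0, 1, 1])
rolling = prior_rolling_mean([100, 110, 90, 120], window=2)
print(rolling)  # [None, None, 105.0, 100.0]
```

The last game's result never appears in `ats`, which matches dropping the final row of the cumulative sum.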

Results

Logistic Regression was the most consistent over every iteration of the train/test setup. In only 3 of the 24 train/test scenarios did it fail to reach at least 50.0% accuracy. I will try to include graphs of the information as well as a cleaned-up Jupyter notebook of my work soon.

Random Forest and Gradient Boosting both had several instances where the model performed well below 50.0% (in the 47-48% range).

Using a rolling average over the previous 6 games offered the best overall accuracy, so I decided to train my model with that window length to predict the upcoming season.

Bet Against Eric?

I believe my model will perform at greater than 50% accuracy for the upcoming season. I look forward to putting my model to the test! Eric is in for a big surprise this season!!

Vegas?

This is where it gets a little more complicated. Because Vegas charges a fee of 10% of your bet for taking it on, break-even is no longer 50%. You need about 52.38% accuracy just to avoid losing money. The average accuracy of my model over the 4 years tested was 52.58%, which is slightly above break-even but not by a significant margin.
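The 52.38% figure follows from simple arithmetic if the 10% fee means risking 1.10 units to win 1.00 unit. That reading of the fee is an assumption, and the function names are illustrative. A minimal sketch:

```python
def break_even_accuracy(fee: float) -> float:
    """Win rate at which expected profit is zero.

    Assumes you risk (1 + fee) units to win 1 unit, so a 10% fee means
    risking 1.10 to win 1.00.
    """
    risk = 1.0 + fee
    return risk / (risk + 1.0)


def expected_profit(accuracy: float, fee: float) -> float:
    """Expected profit per bet, in units of the amount you stand to win."""
    return accuracy * 1.0 - (1.0 - accuracy) * (1.0 + fee)


lunch_bet = break_even_accuracy(0.0)   # no fee between friends
vegas_bet = break_even_accuracy(0.10)  # 10% fee
print(f"{lunch_bet:.2%} {vegas_bet:.2%}")  # 50.00% 52.38%
```

Calling `expected_profit(0.5258, 0.10)` gives a small positive number, which is why 52.58% sits only barely above break-even.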

Future Work

I want to include additional features to better represent the current state of each team. I also want to break down team stats into individual player stats. An earlier idea was to feed each team's previous averages into PyMC3 to simulate games.

Closing

I really enjoyed coming up with a bunch of ideas and testing whether or not they worked. It was frustrating at times but rewarding each time I made a little progress.

Contributors

jonlin84
