Horse Racing Prediction

The goal of this project is to predict which horses in a race will finish in the top 3.
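
One way to frame this goal is as a binary label per horse: 1 if the horse finishes in the top 3, 0 otherwise. The snippet below is a minimal sketch of that framing, not the project's actual code; the finishing positions are made-up values, and the dataset's real column for finishing position may be named and encoded differently.

```python
def top3_label(finish_position: int) -> int:
    """Return 1 if the horse finished first, second, or third, else 0."""
    return 1 if 1 <= finish_position <= 3 else 0


# Hypothetical finishing positions of five horses in one race.
positions = [5, 1, 3, 8, 2]
labels = [top3_label(p) for p in positions]
print(labels)
```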

Data

The data is given and can be downloaded here

Domain Expertise

A basic understanding of horse racing is needed to extract useful information from the given dataset.

Basic information about Japanese horse racing can be found here

Library Installation

All dependencies are listed in requirements.txt.

Steps to install all dependencies

Download requirements.txt

In the terminal, run: pip install -r requirements.txt

File Description

  • notebook_data_processing: preprocesses the raw data and selects meaningful data for building the model
  • classification: predicts top-3 horses from the preprocessed data
  • Input file: the data comes in the compressed file given in the Data section. After unzipping it, the data in the historical_data folder is used.

Steps to predict top-3 horses

Step 1: Preparation

  • Download the data from the link given above.
  • Extract it and place the files in the data/ folder in the main directory.

Step 2: Data Processing

  • In ./src/data_process, run data_process.py to process the raw data.
  • The outputs are training.csv and testing.csv in the extract_feature folder.

Step 3: Feature Selection

  • The newly created training and testing files still contain many features, so several methods are applied to choose suitable features for modeling.
  • Run feature_selection.py; it outputs useful_features.json in the same directory.
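
A downstream script could then use useful_features.json to keep only the selected columns. The sketch below assumes the file holds a plain JSON list of feature names; the actual layout produced by feature_selection.py may differ. The feature names, the helper functions, and the temporary file are all hypothetical and used for illustration only.

```python
import json
import tempfile
from pathlib import Path


def load_useful_features(path):
    """Read a JSON list of feature names (assumed file layout)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def select_columns(rows, features):
    """Keep only the selected features in each row (rows are dicts)."""
    return [{name: row[name] for name in features} for row in rows]


# Write a stand-in useful_features.json so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "useful_features.json"
    path.write_text(json.dumps(["horse_age", "jockey_win_rate"]), encoding="utf-8")
    features = load_useful_features(path)

# Hypothetical row; real feature names come from training.csv.
rows = [{"horse_age": 4, "jockey_win_rate": 0.18, "race_id": "r1"}]
selected = select_columns(rows, features)
print(selected)
```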

Step 4: Model Tuning and Training

  • Logistic Regression and Random Forest models are built and tuned in basic_prediction_model.py.
  • CatBoost and LightGBM models are built and tuned in advanced_pred_model.py. Soft and hard voting ensemble models are also created from CatBoost and LightGBM.
  • Scores of all models are recorded in the src/models/score directory.
  • The optimal parameters of each model are stored in separate JSON files in the src/models/best_hyperparameters directory.
  • After all parameters have been tuned, the training and testing datasets are combined and used to build a soft voting ensemble in final_model.py, which outputs *_final.pkl models.
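
The difference between soft and hard voting can be illustrated without any modeling library. The sketch below is not the code in advanced_pred_model.py or final_model.py. It assumes each model returns a top-3 probability per horse; the probabilities, the 0.5 threshold, and the rule that a tie counts as a negative vote are all assumptions made for the example.

```python
def soft_vote(prob_lists):
    """Average the top-3 probabilities predicted by each model."""
    n_models = len(prob_lists)
    return [sum(probs) / n_models for probs in zip(*prob_lists)]


def hard_vote(prob_lists, threshold=0.5):
    """Majority vote over each model's thresholded prediction.

    A horse is predicted top-3 only if more than half of the models
    vote for it, so a tie counts as a negative prediction.
    """
    n_models = len(prob_lists)
    votes = [[int(p >= threshold) for p in probs] for probs in prob_lists]
    return [int(sum(v) * 2 > n_models) for v in zip(*votes)]


# Hypothetical probabilities for four horses from two models.
model_a = [0.875, 0.25, 0.125, 0.75]
model_b = [0.625, 0.75, 0.125, 0.25]

soft = soft_vote([model_a, model_b])
hard = hard_vote([model_a, model_b])
print(soft)
print(hard)
```

With only two models, hard voting ties whenever the models disagree, which is one reason a soft voting ensemble can be the more natural choice for the final model.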

Step 5: Deploy model on Flask website

  • Work in progress.

GitHub URL

https://github.com/GarlicSoup/horse_racing_prediction

License

This program was created by Hieu Le.
