Regression Case Study

In today's exercise you'll get a chance to try some of what you've learned about supervised learning on a real-world problem.

The goal of the contest is to predict the sale price of a particular piece of heavy equipment at auction based on its usage, equipment type, and configuration. The data is sourced from auction result postings and includes information on usage and equipment configurations.

Evaluation

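The evaluation of your model will be based on Root Mean Squared Log Error (RMSLE), which is computed as follows:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(p_i + 1) - \log(a_i + 1) \right)^2}$$

where n is the number of predictions, p_i are the predicted values, and a_i are the target values.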

See the code in score_model.py for details.
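
For checking scores locally before submitting, the metric can be sketched in a few lines. This is a minimal sketch that assumes the common log(x + 1) form of Root Mean Squared Log Error; the function name `rmsle` is illustrative, and score_model.py remains the authority on how submissions are actually scored.

```python
import math


def rmsle(predicted, actual):
    """Root Mean Squared Log Error, assuming the common log(x + 1) form."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one prediction")
    # math.log1p(x) computes log(1 + x), so a price of zero is still valid.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))
```

Because the error is measured on a log scale, being off by a factor of two costs about the same on a cheap machine as on an expensive one.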

Setup

Run pip install git+https://github.com/zipfian/performotron.git

Data

The data for this case study are in ./data. Although there are both training and testing data sets, the testing data set should only be used to evaluate your final model performance. In other words, use cross-validation on the training data set to identify and compare candidate models, then score the models you select on the test data.
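
The cross-validation workflow described above might look like the following sketch, which scores a predict-the-median baseline using only the training prices. The helper names `k_fold_indices` and `cross_validate_median` are hypothetical, the folds are contiguous and unshuffled, and the score assumes the usual log(x + 1) form of Root Mean Squared Log Error.

```python
import math
import statistics


def k_fold_indices(n_rows, k):
    """Yield (train_idx, valid_idx) pairs that split range(n_rows) into k folds."""
    if k < 2 or k > n_rows:
        raise ValueError("k must be between 2 and n_rows")
    # Spread the remainder so fold sizes differ by at most one row.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, valid
        start += size


def cross_validate_median(prices, k=5):
    """Mean validation RMSLE of a baseline that predicts the training median."""
    scores = []
    for train_idx, valid_idx in k_fold_indices(len(prices), k):
        guess = statistics.median(prices[i] for i in train_idx)
        squared = [
            (math.log1p(guess) - math.log1p(prices[i])) ** 2 for i in valid_idx
        ]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return sum(scores) / len(scores)
```

Any candidate model can take the place of the median guess inside the loop. The important property is that each fold's predictions come from a model that never saw that fold's prices.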

In order to score your model, you will need to output your predictions in the format specified in data/median_benchmark.csv. Then you can submit your solution for evaluation using the command:

python score_model.py data/your_predictions.csv

Note that this will announce your score on Slack to everybody else, but feel free to submit early to make sure you have a working model.
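
One way to produce a file in the benchmark's format without retyping its layout is to copy the header and identifying columns from data/median_benchmark.csv and replace only the predicted values. The sketch below assumes the benchmark has a single header row and stores the prediction in its last column; check the file to confirm this before relying on it. The function name `write_predictions` is illustrative.

```python
import csv


def write_predictions(benchmark_path, output_path, predictions):
    """Write predictions using the header and leading columns of a benchmark file."""
    with open(benchmark_path, newline="") as src:
        rows = list(csv.reader(src))
    header, body = rows[0], rows[1:]
    if len(body) != len(predictions):
        raise ValueError(
            "expected %d predictions, got %d" % (len(body), len(predictions))
        )
    with open(output_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        for row, prediction in zip(body, predictions):
            # Assumption: the last column holds the predicted value.
            writer.writerow(row[:-1] + [prediction])
```

The predictions must be in the same row order as the benchmark file, since rows are matched by position rather than by identifier.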

Important Tips

  1. This data is quite messy. Use your judgement about where your cleaning efforts will yield the most results and focus there first.
  2. Remember that any transformations you apply to the training data will also have to be applied to the testing data, so plan accordingly.
     • It's possible some columns in the test data will take on values not seen in the training data. Plan accordingly.
  3. Use your intuition to think about where the strongest signal about a price is likely to come from. If you weren't fitting a model but were asked to use this data to predict a price, what would you do? Can you combine the model with your intuition?
  4. Start simply. Fit a basic model and make sure you're able to get the submission working, then iterate to improve. Try to submit a model, even if you know it has some weaknesses, within the first hour.
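
The tips about reusing transformations and about unseen test values can be handled together by learning each transformation from the training data only and giving it an explicit fallback. The following is a minimal sketch of that idea for a categorical column; the `CategoryEncoder` class and the example values are hypothetical, not part of the case study code.

```python
class CategoryEncoder:
    """Map category values to integer codes learned from training data only."""

    UNSEEN = -1  # code reserved for values that never appeared during fit

    def fit(self, values):
        # Assign codes in order of first appearance in the training column.
        self.codes_ = {}
        for value in values:
            if value not in self.codes_:
                self.codes_[value] = len(self.codes_)
        return self

    def transform(self, values):
        # Unknown values fall back to UNSEEN instead of raising a KeyError.
        return [self.codes_.get(value, self.UNSEEN) for value in values]


encoder = CategoryEncoder().fit(["Backhoe", "Excavator", "Backhoe"])
train_codes = encoder.transform(["Backhoe", "Excavator", "Backhoe"])  # [0, 1, 0]
test_codes = encoder.transform(["Excavator", "Grader"])  # [1, -1]
```

Fitting once and calling `transform` on both data sets guarantees that the test data goes through exactly the same mapping as the training data.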

Contributors

cwschupp, frank-w-b, lemonlaug, lemurey, mosesmarsh, darrenreger