toronto-housing-price-prediction's Introduction

Toronto Housing Price Prediction

The preliminary work this project required made it one of the most challenging yet rewarding experiences I’ve had this summer. From iterating on the approach and fetching data, through preprocessing and plotting geomaps, to model selection and parameter tuning, each step was a proof-of-concept-sized hurdle that took time to overcome but ultimately led to fruitful professional development. Surprisingly, the three-month data science course I took earlier this year, although helpful, paled in comparison to the growth I experienced spending my Friday nights working on this project in happy solitude.

The choice of subject matter comes from my experience last year as a Credit Risk Analyst at a bank. As expected, the role required processing many TFSA, RRSP, investment loan, and housing applications. It piqued my curiosity about the Ontario housing market, particularly price prediction without manual human involvement. Many questions, such as how different areas compare in terms of value, price, and demographic distribution, remained unanswered even after my departure from finance this past year. Because banking institutions have difficulty fully transitioning from their analog, manual past, working with new, efficient technology there is an unusual luxury. With my current skillset, I figured I had the tools to attempt a solution.

Special note: The original project was based on real data from the bank's database. To avoid any confidentiality issues, I decided to recreate the project using techniques similar to those I used there, but with publicly available data. I took advantage of the Canadian government’s enforcement requiring real estate companies to disclose their housing sales to the public.

Related questions:

  • Sources and means to obtain the housing dataset
  • Engineering new predictors from different sources
  • Key insights from current housing market
  • ML model selection
  • Best approach to hyperparameter optimization

Project’s key takeaways:

  • Efficient web scraping
  • Self-built dataset and feature engineering
  • Geographical data visualization
  • Standardizing and power transforming the data
  • Handling missing data
  • Creating custom objectives and evaluation metrics for the ML model
  • Automatic hyperparameter tuning
  • Model averaging and model stacking
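
To make the "standardizing and power transforming" takeaway concrete, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not the notebook's actual code: the helper names `box_cox` and `standardize`, the fixed `lam=0` (which reduces Box-Cox to a plain log transform), and the sample prices are all hypothetical.

```python
import math
import statistics


def box_cox(values, lam):
    """Box-Cox power transform with a fixed lambda (strictly positive input)."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda -> 0 limit of the transform is the natural log.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / std for v in values]


# Made-up sale prices, for illustration only (right-skewed, like most price data).
prices = [450_000, 520_000, 610_000, 780_000, 1_250_000, 3_400_000]

# Compress the long right tail first, then put the column on a common scale.
scaled = standardize(box_cox(prices, lam=0))
```

In practice lambda is usually estimated from the data rather than fixed, and the transform is fitted on the training split only so that no information leaks from the test set.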

Datasets:

  • Handcrafted housing dataset (time horizon - 1 year)
  • Geographical data of GTA neighbourhoods
  • Neighbourhood profiles (2 years old)
  • Geographical data of Toronto subway lines/stations
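
One way geographical datasets like these can become a predictor is the distance from each listing to its nearest subway station. The sketch below is an assumption about how such a feature might be built, not a description of what the notebook does: the function names are hypothetical, the coordinates are placeholder points in downtown Toronto, and the haversine formula treats the Earth as a sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station_km(lat, lon, stations):
    """Distance from one listing to the closest station in `stations`."""
    return min(haversine_km(lat, lon, s_lat, s_lon) for s_lat, s_lon in stations)


# Placeholder coordinates for illustration; not taken from the project's data.
stations = [(43.6453, -79.3806), (43.6709, -79.3857)]
listing = (43.6532, -79.3832)

distance = nearest_station_km(*listing, stations)
```

A linear scan over all stations is fine for a few dozen of them; for much larger point sets a spatial index would be the usual choice.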

Limitation: Although the latest income data for Toronto residents was unavailable, the relational information from the 2017 dataset sufficed.

Continue HERE (using nbviewer)

toronto-housing-price-prediction's People

Contributors

slavaspirin

toronto-housing-price-prediction's Issues

Data source

Hi,

You mentioned on your main page that "In order to avoid any confidentiality issues I have decided to recreate the project and use the similar techniques I used there but with the publicly available data. I took advantage of the Canadian government’s enforcement towards real estate companies to disclose their housing sales to the public."

You also noted a "Handcrafted housing dataset (time horizon - 1 year)". May I ask how you went about creating this dataset? e.g. was it via web scraping, contacting various real estate companies, or otherwise?

Thank you!

Clarification on Prediction

May I ask what kind of "real estate prices" it is predicting? In your Jupyter notebook there is a Final Prediction without any unit or context. What region, time range, and type of housing is that value for?

I am sure if I carefully read through all your code I will be able to understand, but I just wish to get some easy pointers from you.

Thanks.
