challenge-api-deployment's Introduction


Belgian Real Estate Price Prediction

API deployment

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. API
  5. Logbook
  6. Authors

About The Project

This is the fifth project assigned during BeCode's AI bootcamp in Brussels. In previous projects we scraped Belgian real estate websites, collected and cleaned the data, and built a model to predict the prices of other properties. Here we build and deploy an API for one particular model.

This project is more about the deployment than it is about the model. For further discussion, please refer to this repository.

Built With

Getting Started

To work with this API, you have two options: either use the deployed API directly at this URL, or build it yourself from the sources and deploy it in a Docker container on Heroku, as explained in the next subsection.

Prerequisites

You'll need the packages/software described above.

Installation

HEROKU

  • Install the Heroku CLI:

    • The Heroku Command Line Interface (CLI) makes it easy to create and manage your Heroku apps directly from the terminal. It’s an essential part of using Heroku.
    sudo snap install --classic heroku
  • Deployment on Heroku:

    • Heroku favours its CLI, so working from the command line is crucial at this step (make sure the CLI is up to date). Start by logging in:
    heroku login
    • After logging in to your Heroku account, log in to the Heroku Container Registry so that the container can be pushed:
    heroku container:login
    • Next, create the Heroku app that the container will be pushed to:
    heroku create <yourapplicationname>

    NOTE: If there is no name stated after 'create', a random name will be assigned.

    • Once the app exists, push the container to its web process:
    heroku container:push web --app <yourapplicationname>
    • After 'container:push', release the container on web to make it visible:
    heroku container:release web --app <yourapplicationname>
    • If the container was released properly, you can open the application with:
    heroku open --app <yourapplicationname>
    • Logging is also critical, especially if the application is experiencing errors:
    heroku logs --tail --app <yourapplicationname>

IMPORTANT NOTE: When running on localhost or in Docker, specifying the PORT is not mandatory. To deploy on Heroku, however, the port must be specified in 'app.py', otherwise the application crashes.
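A minimal sketch of how the port could be resolved in 'app.py', assuming the server reads it from the PORT environment variable that Heroku sets for each web process. The helper name resolve_port and the local fallback value 5000 are assumptions made for illustration; they are not taken from the project.

```python
import os

DEFAULT_PORT = 5000  # assumed local fallback, not taken from the project


def resolve_port(environ=os.environ, default=DEFAULT_PORT):
    """Return the port to bind to: $PORT if it is set, else the local default."""
    return int(environ.get("PORT", default))


if __name__ == "__main__":
    # In a Flask app this value would typically be passed on, for example:
    # app.run(host="0.0.0.0", port=resolve_port())
    print(f"Would listen on port {resolve_port()}")
```

With this approach the same file runs unchanged locally (no PORT set) and on Heroku (PORT assigned at startup).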

API

Our REST API is deployed on Heroku, using a Docker container. It is available at this address.

The sections below describe the API's routes, their endpoints and the HTTP methods each one accepts.

/

This route is used with a GET method and returns the string "alive" when the server is up and running.

/predict

There are two endpoints for this route. The most important one is reached with a POST method; the route also answers a GET method. Both are discussed below.

GET

This endpoint does not need any input. It returns a string describing the input data, and its format, that the POST method expects.

POST

This endpoint is the main one of this API. With it, you can request a price prediction for arbitrary real estate property features. It expects a specifically formatted input and returns a specifically formatted output, both described below.

Input

The input is given in a JSON notation of this particular format:

{
    "data": {
        "property-type": "APARTMENT" | "HOUSE" | "OTHERS",
        "area": int,
        "rooms-number": int,
        "zip-code": int,
        "garden": Optional[bool],
        "garden-area": Optional[int],
        "terrace": Optional[bool],
        "terrace-area": Optional[int],
        "facades-number": Optional[int],
        "building-state": Optional["NEW" | "GOOD" | "TO RENOVATE" | "JUST RENOVATED" | "TO REBUILD"],
        "equipped-kitchen": Optional[bool],
        "furnished": Optional[bool],
        "open-fire": Optional[bool],
        "swimmingpool": Optional[bool],
        "land-area": Optional[int],
        "full-address": Optional[str]
    }
}

As you can see, the input is wrapped in an object stored under the "data" key. Inside this object, not all fields are mandatory. The optional ones are tagged as Optional and can be omitted from a request. The names are largely self-explanatory.
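As an illustration of the mandatory part of this format, the check below is a simplified stand-in written with the standard library only. It is not the project's JSON Schema validation: the name check_input and the wording of the messages are hypothetical, and optional fields are ignored.

```python
# Simplified stand-in for schema validation; covers mandatory fields only.
PROPERTY_TYPES = {"APARTMENT", "HOUSE", "OTHERS"}
REQUIRED_FIELDS = {
    "property-type": str,
    "area": int,
    "rooms-number": int,
    "zip-code": int,
}


def check_input(payload):
    """Return None if the mandatory fields look valid, else an error string."""
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, dict):
        return "'data' is a required property"
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            return f"'{name}' is a required property"
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return f"'{name}' must be of type {expected.__name__}"
    if data["property-type"] not in PROPERTY_TYPES:
        return "'property-type' must be one of " + ", ".join(sorted(PROPERTY_TYPES))
    return None
```

For example, a request with only "property-type", "area", "rooms-number" and "zip-code" passes this check, while one that sends a floating-point "area" is rejected.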

Output

The general output of this endpoint can be described with this JSON notation:

{
    "prediction": Optional[{
        "price": [float],
        "r2_score": [float]
    }],
    "error": Optional[str]
}

Both attributes prediction and error are optional and are in fact mutually exclusive: you either receive a prediction or an error message.

  • prediction itself contains two fields:

    • price: the price predicted by our model
    • r2_score: an estimate of the model's R² score (coefficient of determination), computed on a held-out test set. It indicates the accuracy of the underlying model in general and can be ignored if not needed.

    It is sent back along with an HTTP status code 200 OK.

  • error warns the client that it didn't post the input data as expected. The cause could be a missing mandatory attribute (such as zip-code) or a wrong type (such as a floating-point number for area instead of an integer). All these errors are detected using JSON Schema validation according to the schema specified in assets/input_schema.json.

    error contains a one-line string representation of the validation error produced by the JSON Schema package. It is written in plain, human-readable English.

    It is sent back along with an HTTP status code 400 Bad Request.
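A client for this endpoint could be sketched as follows, using only the standard library. The base URL is an assumption (a local server on port 5000) and should be replaced with the deployed address; the helper names build_request and read_result are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumed; replace with the deployed address


def build_request(features, base_url=BASE_URL):
    """Wrap the features in {"data": ...} and prepare a POST to /predict."""
    body = json.dumps({"data": features}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def read_result(status, body):
    """Return the prediction object on 200, raise ValueError otherwise."""
    if status == 200 and "prediction" in body:
        return body["prediction"]
    raise ValueError(body.get("error", f"unexpected response (HTTP {status})"))
```

The request would be sent with urllib.request.urlopen(build_request(...)). Note that urlopen raises urllib.error.HTTPError for a 400 response, so the error body has to be read from that exception before it is passed to read_result.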

Logbook

Project preparation

In model directory:

Processing original data

features_selection.py :

This class prepares the data from the original dataset and creates a new dataset in which the relevant features are selected and correctly formatted to fit our regression model.

Modeling

modeling.py :

This class creates a model from the clean dataset, the scaler and the degree chosen for PolynomialFeatures. Once the model is created, it can predict the price of a property and report the score of the prediction.
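To make the role of the degree concrete, here is a small standard-library illustration of the expansion that scikit-learn's PolynomialFeatures performs with its default settings: every product of the input features up to the given degree, including a constant bias term. It is only an illustration of the idea, not the project's modeling code, and the function name polynomial_features is hypothetical.

```python
from functools import reduce
from itertools import combinations_with_replacement
from operator import mul


def polynomial_features(row, degree):
    """Expand one sample into all monomials of its features up to `degree`."""
    expanded = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(row, d):
            # The empty combination (d == 0) yields the bias term 1.
            expanded.append(reduce(mul, combo, 1))
    return expanded


# For features [a, b] and degree 2 the result is [1, a, b, a*a, a*b, b*b].
print(polynomial_features([2, 3], 2))  # prints [1, 2, 3, 4, 6, 9]
```

A higher degree lets a linear regression fit curved relationships, at the cost of a rapidly growing number of features.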

In preprocessing directory:

cleaning_data.py :

This module contains utility functions to validate, clean/preprocess and then select the features of an input data point (i.e. a property whose price has to be predicted) to feed the model.

app.py:

This is the main module of our server. It defines a basic REST API of several routes as explained above in the section about the API.

Dockerfile to wrap the API

The way to get our Python code running in a container is to pack it as a Docker image and then run a container based on it.

To generate a Docker image we need to create a Dockerfile which contains instructions needed to build the image. The Dockerfile is then processed by the Docker builder which generates the Docker image.

  • The Dockerfile creates an image with:
    • Ubuntu
    • Python
    • Flask
    • Gunicorn
    • Sklearn
    • Pandas
    • Numpy
    • JSON Schema
    • Other dependencies needed

For each instruction or command from the Dockerfile, the Docker builder generates an image layer and stacks it upon the previous ones. Therefore, the Docker image resulting from the process is simply a read-only stack of different layers.

Deploy Docker image in Heroku

  • Preparation for Heroku:
    • After completing the API part, a requirements.txt file is first written, listing the libraries required to run the API.
    • To wrap the API as a Docker container, a Dockerfile is created that sets the required Python version, copies the app.py file and installs the requirements from requirements.txt.
    • So that Heroku knows which server and which Flask application to use, a Procfile is created that uses app for Flask and gunicorn as the web server.
    • Lastly, runtime.txt tells Heroku which language and which exact version to use, in our case Python 3.7.6.

Authors

Project Link: https://github.com/jotwo/challenge-api-deployment

Icons made by Freepik from www.flaticon.com
