Final Project: ML Pipeline for Short term Rental Prices in NYC

This repository is my solution to one of the projects in the Machine Learning DevOps Nanodegree.

For Reviewers

You can find my results and code at the following links

Description

The focus of this project is to build an end-to-end machine learning pipeline for short-term rental prices in NYC. We investigate how to integrate several tools that enable us to run experiments in a clear and structured way. The tools we focus on are:

  • Hydra: for configuration and hyperparameter tuning
  • W&B: as an artifact store, and for data versioning and training monitoring
  • MLflow: to orchestrate the whole ML lifecycle

Dataset

The open source dataset is about rental prices in New York City, provided by Airbnb.

Install Dependencies

All the steps of the pipeline can be run with MLflow, so MLflow is the only thing you need to install yourself. MLflow then installs everything else each component of the pipeline needs, creating an isolated virtual environment per component. To install MLflow:

> pip install mlflow

To make sure MLflow was installed successfully, run the following command:

> pip show mlflow
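
The same check can also be done from Python, which is handy inside a notebook or a setup script. This is only a minimal sketch using the standard library; the helper name `installed_version` is made up for this example and is not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the MLflow version if it is installed in the current environment.
print(installed_version("mlflow") or "mlflow is not installed")
```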

Running the entire pipeline or just a selection of steps

Now you should be able to run the entire pipeline from the root directory using mlflow.

> mlflow run .

If you want to run only the download and basic_cleaning steps, you can similarly do:

> mlflow run . -P steps=download,basic_cleaning
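
To give an idea of how a comma-separated steps parameter can be turned into the list of steps to execute, here is a minimal sketch. It is not the project's actual entry point: the function `select_steps`, the `"all"` default, and every step name other than download and basic_cleaning are assumptions made for illustration.

```python
# Hypothetical step list: only "download" and "basic_cleaning" are named in
# this README; the other names are placeholders.
ALL_STEPS = ["download", "basic_cleaning", "data_check", "train_random_forest"]


def select_steps(steps_param, all_steps=ALL_STEPS):
    """Return the steps to run, always in pipeline order."""
    if steps_param == "all":  # assumed default when no steps are given
        return list(all_steps)
    requested = [name.strip() for name in steps_param.split(",") if name.strip()]
    unknown = [name for name in requested if name not in all_steps]
    if unknown:
        raise ValueError(f"Unknown step(s): {', '.join(unknown)}")
    # Keep pipeline order regardless of the order given on the command line.
    return [name for name in all_steps if name in requested]


print(select_steps("download,basic_cleaning"))  # ['download', 'basic_cleaning']
```

Rejecting unknown names early means a typo in the steps parameter fails immediately instead of silently running nothing.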

You can override any other parameter in the configuration file using the Hydra syntax, by providing it as a hydra_options parameter. For example, say that we want to set the parameter modeling -> random_forest -> n_estimators to 10 and etl -> min_price to 50:

> mlflow run . \
  -P steps=download,basic_cleaning \
  -P hydra_options="modeling.random_forest.n_estimators=10 etl.min_price=50"
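
The dotted keys in hydra_options address nested entries of the configuration. The sketch below illustrates that mapping with plain dictionaries. It is not Hydra's own parser, only a simplified stand-in: the function `apply_overrides` and the default values are invented for this example.

```python
import ast
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with space-separated 'a.b.c=value' overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides.split():
        dotted_key, _, raw_value = item.partition("=")
        try:
            value = ast.literal_eval(raw_value)  # "10" -> 10, "0.5" -> 0.5
        except (ValueError, SyntaxError):
            value = raw_value  # anything else stays a plain string
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node[key]  # raises KeyError if a parent key is missing
        if leaf not in node:
            raise KeyError(f"{dotted_key} is not in the configuration")
        node[leaf] = value
    return result


# Placeholder defaults; the real values live in the project's configuration file.
defaults = {
    "modeling": {"random_forest": {"n_estimators": 100}},
    "etl": {"min_price": 10},
}
updated = apply_overrides(
    defaults, "modeling.random_forest.n_estimators=10 etl.min_price=50"
)
print(updated["modeling"]["random_forest"]["n_estimators"], updated["etl"]["min_price"])  # 10 50
```

The function works on a copy, so the defaults stay untouched and a run can be compared against them afterwards.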

Monitor training and artifacts

All the steps in the pipeline produce results (performance metrics and artifacts) that are saved to W&B (wandb).

Make sure you have a wandb account, and you are logged in:

> wandb login

License

License
