
How to use MLflow to manage the Machine Learning lifecycle

In this repo, I experiment with MLflow to:

  • track machine learning experiments based on:

    • metrics
    • hyper-parameters
    • source scripts executing the run
    • code version
    • notes & comments
  • compare different runs with each other

  • set up a tracking server locally and on AWS

  • deploy your model using MLflow Models

Quickstart locally

To execute the code:

  • Install pipenv to create a virtual environment for mlflow (it's cleaner this way)

    pip install pipenv
  • Clone the project

    git clone git@github.com:ahmedbesbes/mlflow.git
  • Install the dependencies

    cd mlflow/
    pipenv install .
  • Start a tracking server locally

    mlflow ui
  • Launch the training (or any other code that logs to MLflow); a minimal sketch follows below

    python train.py
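
For reference, here is a minimal sketch of what a script that logs to MLflow could look like. It is not the repo's actual train.py: the scikit-learn model, dataset and parameter/metric names are illustrative assumptions; only the mlflow calls (start_run, log_param, log_metric, log_model) are the real API.

    # minimal_train.py -- illustrative sketch, not the repo's train.py
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    n_estimators = 100  # hyper-parameter we want to track

    with mlflow.start_run():
        # log the hyper-parameter
        mlflow.log_param("n_estimators", n_estimators)

        model = RandomForestClassifier(n_estimators=n_estimators)
        model.fit(X_train, y_train)

        # log the evaluation metric
        accuracy = accuracy_score(y_test, model.predict(X_test))
        mlflow.log_metric("accuracy", accuracy)

        # log the fitted model itself as an artifact (MLflow Models)
        mlflow.sklearn.log_model(model, "model")

With the tracking server started by mlflow ui, this run then shows up at http://localhost:5000 with its parameters, metrics and artifacts.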

Launch a tracking server on AWS

If you're a team of developers or data scientists, you can spin up a shared tracking server that everyone on the team logs their runs to.

1. Prepare an EC2 machine and an S3 bucket

  • create an IAM user on AWS. Get its credentials, namely Access key ID and Secret access key

  • with this same user, create an S3 bucket to store future artifacts and give it a name. Mine is mlflow-artifact-store-demo, but you cannot pick the same one since bucket names are globally unique

  • Launch an EC2 instance: it doesn't have to be big, a free-tier-eligible t2.micro does the job perfectly

  • Configure the security group of this instance to accept inbound http traffic on port 5000

  • ssh into your EC2:

    • install pip

      sudo apt update
      sudo apt install python3-pip
    • install pipenv

      sudo pip3 install pipenv
      sudo pip3 install virtualenv
      
      export PATH=$PATH:/home/[your_user]/.local/bin/
  • now with pipenv, install the dependencies to run the mlflow server

    pipenv install mlflow
    pipenv install awscli
    pipenv install boto3
  • on the EC2 machine, configure the AWS CLI with the user's credentials so that the tracking server can access S3 and display the artifacts in the UI

    run aws configure and follow the instructions to enter the credentials

  • start an mlflow server on the EC2 instance, setting the host to 0.0.0.0 and --default-artifact-root to the S3 bucket

    mlflow server -h 0.0.0.0  \
                  --default-artifact-root s3://mlflow-artifact-store-demo

2. Set AWS credentials and change the tracking URI

  • set the AWS credentials as environment variables so that the code uploads artifacts to the s3 bucket

    export AWS_ACCESS_KEY_ID=<your-aws-access-key-id>
    export AWS_SECRET_ACCESS_KEY=<your-aws-secret-access-key>
  • change the tracking URI to the public DNS of your EC2 machine + port 5000 (a code sketch follows below)

    In my case the tracking URI was: http://ec2-35-180-45-108.eu-west-3.compute.amazonaws.com:5000/
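
In code, pointing a run at the remote server is just a matter of setting the tracking URI, either through the MLFLOW_TRACKING_URI environment variable or with mlflow.set_tracking_uri. A minimal sketch, using the example DNS above (replace it with your own):

    import mlflow

    # point the MLflow client at the remote tracking server on EC2
    mlflow.set_tracking_uri("http://ec2-35-180-45-108.eu-west-3.compute.amazonaws.com:5000/")

    with mlflow.start_run():
        # this run (and its artifacts, via the AWS credentials above) is recorded remotely
        mlflow.log_metric("accuracy", 0.93)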

Now everything should be good: after running the script locally, you can inspect the metrics on the UI running on the remote server.

By clicking on a specific run, you can see its artifacts uploaded to S3.

These artifacts effectively live in the S3 bucket, not on the tracking server itself.
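
As a quick sanity check, you can list the bucket's contents with boto3 and the credentials exported above (install boto3 locally if you run this from your machine). A sketch using the bucket name from this walkthrough:

    import boto3

    # list the objects MLflow uploaded to the artifact store
    s3 = boto3.client("s3")
    response = s3.list_objects_v2(Bucket="mlflow-artifact-store-demo")
    for obj in response.get("Contents", []):
        print(obj["Key"])  # one key per uploaded artifact file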

Slides

  • French version
  • English version (coming soon)
