

From notebooks to operational ML pipelines

This repository provides the code base for the HSG CAS Big data & AI for managers session on how to operationalize machine learning models (October 2020).

## Pre-requisites

In order to be able to run all the code provided, the following pre-requisites must be met:

  • An environment for running Jupyter notebooks, e.g. Visual Studio Code, must be available. Also see "working with Jupyter Notebooks in VS Code".

  • An Azure ML Studio workspace must be available. Follow these instructions to create it. In case you need a trial Azure subscription first, start here.

  • Once your Azure ML workspace is provisioned, download the workspace config.json file and put it in the same folder that contains the notebooks. More information on setting up a local development environment to work with Azure ML can be found here.

  • To run ML pipelines remotely, you will also need to have a service principal account as shown in the "Service Principal Authentication" section in this notebook. The service principal is used in remote runs for authentication and authorization.

  • To interact with Azure ML from your local Python environment, you must first install the azureml-sdk for Python. It can be installed like any other library or package via pip. Simply type `!pip install azureml-sdk` in one cell of your Jupyter notebook and execute the cell. More information on the installation can be found here.
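
The workspace prerequisites above can be checked with a short sketch. It assumes that config.json sits next to the notebooks and contains the three keys a workspace config downloaded from the Azure portal normally carries (`subscription_id`, `resource_group`, `workspace_name`). The helper names `missing_config_keys` and `connect_to_workspace` are illustrative and not part of this repository.

```python
import json
from pathlib import Path

# Keys a config.json downloaded from the Azure portal is expected to contain.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config_path="config.json"):
    """Return the required keys that are absent from the workspace config file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return [key for key in REQUIRED_KEYS if key not in config]


def connect_to_workspace(config_path="config.json"):
    """Load the Azure ML workspace described by config.json (needs azureml-sdk)."""
    # Imported lazily so the config check above also works without the SDK.
    from azureml.core import Workspace

    missing = missing_config_keys(config_path)
    if missing:
        raise ValueError(f"config.json is missing: {', '.join(missing)}")
    return Workspace.from_config(path=config_path)
```

In a notebook cell, `ws = connect_to_workspace()` would then give the workspace object that the later SDK calls work against.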

## Contents

To access the contents of this repo, simply download it from the GitHub website manually or execute git clone https://github.com/marcscho/notebook-to-ml-pipeline from the command line in a folder of your choosing.

| Notebook | Content |
| --- | --- |
| 0_register_dataset.ipynb | Registers the CSV file as a dataset in Azure ML. |
| 0_register_secret.ipynb | Registers the secret for the service principal in the Azure Key Vault to ensure it is not stored in clear text in a script. This secret is then retrieved by the pipeline runs and used to authenticate as well as to retrieve assets such as models and datasets from the Azure ML workspace. |
| 1_first_model.ipynb | Trains a first basic model locally using Python's sklearn library. |
| 2_experiment_tracking.ipynb | Re-runs the training process of the previous notebook, this time tracking model metrics inside Azure ML experiments for traceability purposes. To view them, open Azure ML Studio and click "Experiments" in the navigation bar on the left. |
| 3_predictions.ipynb | Sets up an ML pipeline for batch scoring. Loads the previously registered dataset and registered model and generates a CSV file with the model's predictions, which is then uploaded to the Azure ML storage. |
| 4_deploy_realtime.ipynb | Publishes the previously trained model as a REST endpoint in a Docker container instance. This enables getting real-time predictions from the model. |
| 5_test_ml_endpoint.http | Can be used to send new data via HTTP request to the model endpoint deployed in the previous notebook. Requires the "HTTP client" extension to be installed in VS Code. |
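
As an alternative to the .http file, the request to the deployed endpoint can be sketched in Python with the standard library only. Everything specific here is an assumption: the scoring URI must be copied from your own deployment, the `{"data": [...]}` payload shape depends on the scoring script used in notebook #4, and the bearer-key header only applies if key authentication is enabled on the endpoint. The function names are illustrative.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, rows, api_key=None):
    """Build a POST request that carries `rows` as a JSON payload."""
    # Assumed payload shape; adjust to what the endpoint's scoring script expects.
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when the endpoint was deployed with key authentication.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


def score(scoring_uri, rows, api_key=None, timeout=30):
    """Send the rows to the endpoint and return the decoded JSON response."""
    request = build_scoring_request(scoring_uri, rows, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending makes it possible to inspect the payload and headers before anything is posted to a billable endpoint.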

## Clean up

To avoid unwanted costs, make sure that the endpoint deployed with notebook #4 is deleted. To do so, in Azure ML Studio, click "Endpoints" in the navigation bar on the left. Here you will find the endpoint named german-credit-hsg. Highlight it and click the "Delete" button in the menu bar just above it.

When no compute resources are running, the Azure ML Studio workspace does not incur any costs.
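
The endpoint can also be removed from Python instead of the Studio UI. The following is a minimal sketch, assuming azureml-sdk is installed and `workspace` is an `azureml.core.Workspace` object you have already loaded; the helper name `delete_endpoint` is illustrative and not part of this repository.

```python
ENDPOINT_NAME = "german-credit-hsg"  # endpoint name given in this README


def delete_endpoint(workspace, name=ENDPOINT_NAME):
    """Look up the deployed web service by name and delete it."""
    # Imported lazily so this module can be loaded without the SDK installed.
    from azureml.core.webservice import Webservice

    service = Webservice(workspace=workspace, name=name)
    service.delete()
    return name
```

If no service with that name exists in the workspace, the lookup raises an exception, which is a quick way to confirm that nothing is left running.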

## Disclaimer

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
