ML Monitoring for an Attrition Risk Assessment System

A simple framework for monitoring machine learning models after deployment. In this project, it is applied to an attrition risk assessment system.

Project Structure

This repository is organized as follows:

.
├── sourcedata/                       # Directory storing data for model development
├── practicedata/                     # Directory storing data for practice
├── testdata/                         # Directory storing data for testing
├── ingesteddata/                     # Directory storing clean data
├── practicemodels/                   # Directory storing trial models
├── models/                           # Directory storing models built with sourcedata
├── productiondeployment/             # Directory storing deployed model
├── images/                           # Directory storing images
├── ingestion.py   
├── training.py                     
├── scoring.py         
├── deployment.py
├── reporting.py    
├── diagnostics.py   
├── app.py    
├── apicalls.py  
├── fullprocess.py
├── wsgi.py                             
├── config.json
├── slice_output.txt
├── cronjob.txt
└── README.md

Running the Project

To install the requirements, run the following command:

pip install -r requirements.txt

To run the app locally, run the following command:

python app.py

To run the API calls, keep the app running, open a new command-line or terminal tab, then run the following command:

python apicalls.py

To run other scripts individually, run the following command:

python {SCRIPT_NAME}.py

The ML Monitoring Process

The entire monitoring process, which is orchestrated by the fullprocess.py script, follows the diagram below:

[Diagram: process_flow]

This process can be set up to run automatically at specific intervals using cron. The shell command used to do this can be found in cronjob.txt. This command runs the fullprocess.py script every 10 minutes.

For more information on cron expressions, check out their documentation.
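The order of steps described in the subsections below can be sketched as a single function. This is a minimal illustration, not the contents of fullprocess.py: the function name and the callables passed in are hypothetical stand-ins for the project's scripts, and it assumes that a cycle ends early when there is no new data or no model drift.

```python
from typing import Callable, List


def run_monitoring_cycle(
    has_new_data: Callable[[], bool],
    ingest: Callable[[], None],
    drift_detected: Callable[[], bool],
    retrain: Callable[[], None],
    redeploy: Callable[[], None],
    report: Callable[[], None],
    diagnose: Callable[[], None],
) -> List[str]:
    """Run one monitoring cycle and return the names of the steps that ran."""
    steps_run: List[str] = []

    # Stop early when there is nothing new to ingest.
    if not has_new_data():
        return steps_run

    ingest()
    steps_run.append("ingestion")

    # Assumption: without drift, the deployed model is kept and the cycle ends.
    if not drift_detected():
        return steps_run

    # Drift was detected: retrain, redeploy, then report and diagnose.
    for name, step in [
        ("training", retrain),
        ("deployment", redeploy),
        ("reporting", report),
        ("diagnostics", diagnose),
    ]:
        step()
        steps_run.append(name)
    return steps_run
```

Passing the steps in as callables keeps the control flow easy to test without touching real data or models.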

Data Ingestion

First, the script will check for new data under the input_folder_path, defined in config.json. If new data is present, the ingestion process, defined by the ingestion.py script, will run and produce a new dataset, stored under output_folder_path, again defined in config.json.

In this example, the paths are just subdirectories. However, the same concept can be extended to other data storage options.
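One way to check for new data is to compare the files in input_folder_path against a record of files that were already ingested. The sketch below assumes the input files are CSVs and that the record is a text file with one file name per line; the function name and the record file name ingestedfiles.txt are hypothetical and may not match what ingestion.py does.

```python
import json
from pathlib import Path
from typing import List


def find_new_files(config_path: str, record_name: str = "ingestedfiles.txt") -> List[str]:
    """Return CSV file names in the input folder that are not yet recorded as ingested."""
    config = json.loads(Path(config_path).read_text())
    input_dir = Path(config["input_folder_path"])
    # Assumption: the record of ingested files lives in the output folder.
    record = Path(config["output_folder_path"]) / record_name

    already_ingested = set()
    if record.exists():
        already_ingested = {
            line.strip() for line in record.read_text().splitlines() if line.strip()
        }

    current = sorted(path.name for path in input_dir.glob("*.csv"))
    return [name for name in current if name not in already_ingested]
```

An empty list means there is nothing new, so the rest of the process can be skipped for that run.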

Model Retraining & Redeployment

Using the new data produced in the step above, the script will proceed to test the existing model, stored in prod_deployment_path, against this new data. The model scores are recorded, and if model drift is detected, the script will initiate the retraining and redeployment steps (defined in the training.py and deployment.py scripts), before finally initiating the reporting and diagnostics steps (defined in the reporting.py and diagnostics.py scripts).
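The text above does not say how model drift is decided. A simple rule, shown here only as an assumption, is to flag drift when the deployed model scores worse on the new data than its recorded score. The function name is hypothetical, and the rule assumes a metric where higher is better (for example, an F1 score).

```python
def model_drift_detected(new_score: float, recorded_score: float) -> bool:
    """Return True when the deployed model scores worse on new data.

    Assumption: higher scores are better (for example, an F1 score).
    """
    return new_score < recorded_score
```

With this rule, a recorded score of 0.71 and a new score of 0.62 would trigger retraining, while an equal or higher new score would not.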

In addition to model drift, data drift is also assessed and reported using EvidentlyAI.

[Image: data_drift_dashboard]

For more information on EvidentlyAI, check out their documentation.

Flask API

The Flask API is defined in the app.py script, which can be run from the command line or terminal. As models are retrained and redeployed, the app serves the updated model. To test the endpoints defined in app.py, run apicalls.py in a separate command-line or terminal tab.
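A script that exercises several endpoints can be sketched with the standard library alone. This is a hedged illustration rather than the contents of apicalls.py: the function names are hypothetical, the endpoint paths and base URL in the usage note are placeholders, and it assumes every endpoint answers a plain GET request.

```python
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def fetch_text(url: str) -> str:
    """Send a GET request and return the response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def call_endpoints(
    base_url: str,
    endpoints: Iterable[str],
    fetch: Callable[[str], str] = fetch_text,
) -> Dict[str, str]:
    """Call each endpoint under base_url and collect the responses by endpoint."""
    base = base_url.rstrip("/")
    return {
        endpoint: fetch(f"{base}/{endpoint.lstrip('/')}")
        for endpoint in endpoints
    }
```

For example, `call_endpoints("http://127.0.0.1:8000", ["/scoring", "/diagnostics"])` would query two placeholder endpoints on a locally running app; the real paths and port are whatever app.py defines. The `fetch` parameter exists so the function can be tested without a running server.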

For more information on Flask, check out their documentation.

Contributors

gianatmaja
