UnionML

The easiest way to build and deploy machine learning microservices





UnionML is an open source MLOps framework that aims to reduce the boilerplate and friction that come with building models and deploying them to production.

You can create UnionML Apps by defining a few core methods that are automatically bundled into ML microservices, starting with model training and offline and online prediction.

Built on top of Flyte, UnionML provides a high-level interface for productionizing your ML models so that you can focus on curating a better dataset and improving your models.

To learn more, check out the 📖 Documentation.

Installing

Install using conda:

conda install -c conda-forge unionml

Install using pip:

pip install unionml

A Simple Example

Create a Dataset and Model, which together form a UnionML App:

from unionml import Dataset, Model

from sklearn.linear_model import LogisticRegression

dataset = Dataset(name="digits_dataset", test_size=0.2, shuffle=True, targets=["target"])
model = Model(name="digits_classifier", init=LogisticRegression, dataset=dataset)

Define Dataset and Model methods for training a hand-written digits classifier:

from typing import List

import pandas as pd
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score

@dataset.reader
def reader() -> pd.DataFrame:
    return load_digits(as_frame=True).frame

@model.trainer
def trainer(
    estimator: LogisticRegression,
    features: pd.DataFrame,
    target: pd.DataFrame,
) -> LogisticRegression:
    return estimator.fit(features, target.squeeze())

@model.predictor
def predictor(
    estimator: LogisticRegression,
    features: pd.DataFrame
) -> List[float]:
    return [float(x) for x in estimator.predict(features)]

@model.evaluator
def evaluator(
    estimator: LogisticRegression,
    features: pd.DataFrame,
    target: pd.DataFrame
) -> float:
    return float(accuracy_score(target.squeeze(), predictor(estimator, features)))

And that's all ⭐️!

By defining these four methods, you've created a minimal UnionML App that you can:

Contributing

All contributions are welcome! Check out the contribution guide to learn more about how to contribute.

unionml's People

Contributors

abubakrce19, bilal-aamer, cosmicbboy, eapolinario, jonwiggins, kumare3, mmorzywolek, mrkrishnaagarwal, samhita-alla, smartmind12, smritisatyanv, sugatoray, thomasjpfan, williamardianto, zevisert

unionml's Issues

Scheduling API: support for offline batch prediction with FlyteRemote to execute on Flyte backend

Currently, unionml supports batch predictions through the model.remote_predict method, which executes a prediction workflow on the configured Flyte backend.

However, the only way to schedule this today is to pull out the Flyte workflow in model.prediction_workflow and use the Flyte API to schedule it.

The purpose of this task is to come up with some interface for this in unionml.

model.deploy() # deploys workflows to Flyte backend
model.serve() # deploys a model server
model.schedule(...) # schedules a launchplan on Flyte backend
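
One possible shape for such an interface, as a purely illustrative sketch. The `Schedule` dataclass, the `ScheduleRegistry` class, and the parameters `name`, `expression`, and `inputs` are all hypothetical; none of them is part of unionml or Flyte. The sketch only shows a schedule being declared as plain data that a later deploy step could translate into a Flyte launch plan.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class Schedule:
    """Hypothetical description of a recurring batch-prediction job."""

    name: str
    expression: str  # five-field cron expression, e.g. "0 0 * * *"
    inputs: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Shape check only: a standard cron expression has five fields.
        if len(self.expression.split()) != 5:
            raise ValueError(
                f"expected a 5-field cron expression, got {self.expression!r}"
            )


class ScheduleRegistry:
    """Hypothetical bookkeeping that a model.schedule(...) call might need."""

    def __init__(self) -> None:
        self._schedules: Dict[str, Schedule] = {}

    def schedule(self, name: str, expression: str, **inputs: Any) -> Schedule:
        # Reject duplicate names so each schedule maps to one launch plan.
        if name in self._schedules:
            raise ValueError(f"schedule {name!r} already exists")
        sched = Schedule(name=name, expression=expression, inputs=inputs)
        self._schedules[name] = sched
        return sched

    def names(self) -> List[str]:
        return sorted(self._schedules)
```

Keeping the schedule as data, separate from the call that registers it on the backend, would let the same declaration be validated locally before anything is deployed.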

UnionML blogpost

  • Why:
    • "Problems: bridging the gap between training and serving"
    • "Tying it all together."
  • Vision:
    • "Take the boilerplate out of building and serving models"
    • "Minimize code changes between training and online/offline prediction"
  • Roadmap:
    • Open to integrations
    • Feature store, e.g. Feast, Tecton
    • Serving (e.g. BentoML, Seldon, KServe)

Notes:

  • UnionML is targeting multiple contexts as microservices
  • Write about type-safety and how to monitor ML services: whylogs, pandera, GE, pydantic

Support loading model files from local file system in FastAPI app

The purpose of this PR is to support deployment of flytekit-learn apps via a FastAPI server and AWS Lambda functions without having to use a Flyte cluster as a backend.

User Story (Local):

  • Create a flytekit-learn app locally
  • Use model.train to train a model, save it to some local file path
  • Point FastAPI/AWS Lambda to the file path

User Story (Remote):

  • Create a flytekit-learn app locally
  • Use model.train to train a model, save it to some local file path
  • Sync the local model with some remote blob store
  • FastAPI
    • deploy script should read in the model file from blob store
    • FastAPI app points to path in local filesystem of the app server
  • AWS Lambda
    • function should read in model file from blob store
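
The local user story above can be sketched roughly as follows. This is a minimal illustration, not the flytekit-learn implementation: `save_model`, `load_model`, and `predict` are hypothetical helpers, and a JSON file of toy linear-model parameters stands in for a serialized estimator. The remote story would add one step before `load_model`: copying the file from the blob store to a local path.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def save_model(params: Dict[str, List[float]], path: Path) -> None:
    """Write toy model parameters to a local file (stand-in for a trained estimator)."""
    path.write_text(json.dumps(params))


def load_model(path: Path) -> Dict[str, List[float]]:
    """Load model parameters once, e.g. at server startup, from a local path."""
    if not path.exists():
        raise FileNotFoundError(f"no model file at {path}")
    return json.loads(path.read_text())


def predict(params: Dict[str, List[float]], features: List[List[float]]) -> List[float]:
    """Toy linear scorer: dot(weights, row) + bias for each feature row."""
    weights, bias = params["weights"], params["bias"][0]
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in features]


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.json"
    # "Train" and save locally, then point the serving code at the same path.
    save_model({"weights": [0.5, -1.0], "bias": [2.0]}, model_path)
    served_model = load_model(model_path)
    scores = predict(served_model, [[2.0, 1.0], [0.0, 0.0]])
```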

[Bug] The unionml flytekit demo is outputting the same prediction

QA UnionML docs

Go through the UnionML docs and make sure:

  • code and instructions work
  • copy reads well

implement nn.Module flytekit type transformer

Don't require torch as a package dependency; just try importing it in unionml/__init__.py and register the type transformers if torch is installed.
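
A minimal sketch of this optional-dependency pattern, assuming a hypothetical `register_if_installed` helper and a plain list (`REGISTERED`) standing in for flytekit's type-transformer registration. It does not use the real flytekit API; it only shows the try-import-then-register control flow.

```python
import importlib
from types import ModuleType
from typing import Callable, List

# Hypothetical stand-in for a type-transformer registry.
REGISTERED: List[str] = []


def register_if_installed(
    module_name: str, register: Callable[[ModuleType], None]
) -> bool:
    """Import `module_name` if possible and run `register` with it.

    Returns False, without raising, when the optional dependency is missing.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    register(module)
    return True


# With torch installed, this records the registration; otherwise it is a no-op.
register_if_installed("torch", lambda mod: REGISTERED.append(mod.__name__))
```

Because the import failure is swallowed in one place, the package can still be imported on machines without torch, and the transformers simply stay unregistered there.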

Try specifying a flytekit.plugins entrypoint in the setuptools config to make sure that these unionml plugins work with the flytekit plugin system.

fail deployment if branch is dirty, but support --ignore-diff to deploy anyway

Calling model.remote_deploy() or using unionml deploy on the CLI will deploy all the UnionML Flyte microservices using the current git sha, which will cause issues when a user makes changes to the current branch without committing them.

The purpose of this issue is to raise an Exception if the current git branch is dirty, telling the user to make a commit or use the --ignore-diff option if they want to deploy anyway.
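
The check described above might look something like this sketch. `DirtyBranchError`, `has_uncommitted_changes`, `git_status_porcelain`, and `check_deployable` are hypothetical names, not existing unionml functions. The only external behavior relied on is that `git status --porcelain` prints one line per changed or untracked path and prints nothing for a clean working tree.

```python
import subprocess


class DirtyBranchError(RuntimeError):
    """Hypothetical exception raised when deploying with uncommitted changes."""


def has_uncommitted_changes(porcelain_output: str) -> bool:
    """Empty `git status --porcelain` output means the working tree is clean."""
    return bool(porcelain_output.strip())


def git_status_porcelain(repo_dir: str = ".") -> str:
    """Return the raw output of `git status --porcelain` for `repo_dir`."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def check_deployable(porcelain_output: str, ignore_diff: bool = False) -> None:
    """Raise unless the tree is clean or the user opted in with --ignore-diff."""
    if has_uncommitted_changes(porcelain_output) and not ignore_diff:
        raise DirtyBranchError(
            "Uncommitted changes found. Commit them, or pass --ignore-diff "
            "to deploy anyway."
        )
```

A deploy command could then call `check_deployable(git_status_porcelain(), ignore_diff=...)` before resolving the git sha. Note that this treats untracked files as dirty too, which may or may not be the desired behavior.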

Support model checkpoints from previous training runs

As an ML engineer, I want to be able to continue training a model from a previous model.remote_train run (with equivalent semantics for a local model.train call) so that I can pick up training from where I left off.
