MLOS

MLOS is a project to enable autotuning for systems.

Overview

MLOS currently focuses on an offline tuning approach, though we intend to add online tuning in the future.

To accomplish this, the general flow involves:

  • Running a workload (i.e., benchmark) against a system (e.g., a database, web server, or key-value store).
  • Retrieving the results of that benchmark, and perhaps some other metrics from the system.
  • Feeding that data to an optimizer (e.g., using Bayesian Optimization or other techniques).
  • Obtaining a new suggested config to try from the optimizer.
  • Applying that configuration to the target system.
  • Repeating until either the exploration budget is consumed or the configurations' performance appears to have converged.
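
The flow above can be sketched as a small loop. This is only an illustration under stated assumptions: RandomSearchOptimizer, run_benchmark, and the cache_mb parameter are hypothetical stand-ins rather than MLOS APIs, and plain random search takes the place of Bayesian Optimization to keep the sketch dependency-free. The loop stops only when the budget is consumed; a convergence check is left out.

```python
import random


class RandomSearchOptimizer:
    """Toy stand-in for a real optimizer: samples configs uniformly at random."""

    def __init__(self, space, seed=0):
        self.space = space            # {param_name: (low, high)}
        self.rng = random.Random(seed)
        self.observations = []        # (config, score) pairs registered so far

    def suggest(self):
        """Propose a new config to try."""
        return {name: self.rng.uniform(low, high)
                for name, (low, high) in self.space.items()}

    def register(self, config, score):
        """Feed a benchmark result back to the optimizer."""
        self.observations.append((config, score))

    def best(self):
        """Return the (config, score) pair with the lowest score."""
        return min(self.observations, key=lambda obs: obs[1])


def run_benchmark(config):
    """Hypothetical benchmark: pretend the cost is lowest at cache_mb == 64."""
    return (config["cache_mb"] - 64.0) ** 2


def tune(optimizer, budget=50):
    """Run the suggest / apply / measure / register loop for `budget` trials."""
    for _ in range(budget):
        config = optimizer.suggest()        # obtain a new suggested config
        score = run_benchmark(config)       # apply it and run the workload
        optimizer.register(config, score)   # feed the result to the optimizer
    return optimizer.best()


best_config, best_score = tune(RandomSearchOptimizer({"cache_mb": (1.0, 256.0)}))
print(best_config, best_score)
```

In a real setup, run_benchmark would apply the config to the target system and collect metrics from it rather than evaluate a formula.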

[Figure: optimization loop]

Source: LlamaTune: VLDB 2022

For a brief overview of some of the features and capabilities of MLOS, please see the following video:

[Video: demo video]

Organization

To do this, this repo provides three Python modules, which can be used independently or in combination:

  • mlos-bench provides a framework to help automate running benchmarks as described above.

  • mlos-viz provides some simple APIs to help automate visualizing the results of benchmark experiments and their trials.

    It provides a simple plot(experiment_data) API, where experiment_data is obtained from the mlos_bench.storage module.

  • mlos-core provides an abstraction around existing optimization frameworks (e.g., FLAML, SMAC, etc.)

    It is intended to provide a simple abstraction that is easy to consume (e.g. via pip) and has few dependencies, in order to

    • describe a space of context, parameters, their ranges, constraints, etc. and result objectives
    • provide an "optimizer" service abstraction (e.g. register() and suggest()) so we can easily swap out different implementations of search methods (e.g. random, BO, LLM, etc.)
    • provide some helpers for automating optimization experiment runner loops and data collection
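
As a rough illustration of that kind of abstraction (not the actual mlos_core API), the sketch below describes a small parameter space and puts two interchangeable search methods behind the same suggest() and register() calls. The names IntParam, Optimizer, RandomOptimizer, and GridOptimizer, and the threads and batch parameters, are all hypothetical.

```python
import itertools
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class IntParam:
    """Hypothetical description of one integer tunable and its inclusive range."""
    name: str
    low: int
    high: int


class Optimizer(ABC):
    """Hypothetical minimal interface; any search method can sit behind it."""

    def __init__(self, space):
        self.space = list(space)
        self.history = []  # (config, score) pairs registered so far

    @abstractmethod
    def suggest(self):
        """Return the next config (a dict of parameter name to value) to try."""

    def register(self, config, score):
        """Record the observed score for a config."""
        self.history.append((config, score))


class RandomOptimizer(Optimizer):
    """Samples each parameter uniformly from its range."""

    def __init__(self, space, seed=0):
        super().__init__(space)
        self._rng = random.Random(seed)

    def suggest(self):
        return {p.name: self._rng.randint(p.low, p.high) for p in self.space}


class GridOptimizer(Optimizer):
    """Walks the full grid of parameter values in order, wrapping around at the end."""

    def __init__(self, space):
        super().__init__(space)
        ranges = [range(p.low, p.high + 1) for p in self.space]
        self._grid = itertools.cycle(itertools.product(*ranges))

    def suggest(self):
        values = next(self._grid)
        return {p.name: value for p, value in zip(self.space, values)}


space = [IntParam("threads", 1, 4), IntParam("batch", 1, 2)]
for optimizer in (RandomOptimizer(space), GridOptimizer(space)):
    config = optimizer.suggest()
    # A made-up score; a real loop would measure the target system here.
    optimizer.register(config, score=float(sum(config.values())))
```

Because the calling loop only depends on suggest() and register(), replacing one search method with another does not require changing the experiment runner.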

For these design requirements we intend to reuse as much from existing OSS libraries as possible and layer policies and optimizations specifically geared towards autotuning systems over top.

By providing wrappers we aim to also allow more easily experimenting with replacing underlying optimizer components as new techniques become available or seem to be a better match for certain systems.

Contributing

See CONTRIBUTING.md for details on development environment and contributing.

Getting Started

The development environment for MLOS uses conda and devcontainers to ease dependency management, but not all these libraries are required for deployment.

For instructions on setting up the development environment please try one of the following options:

  • see CONTRIBUTING.md for details on setting up a local development environment
  • launch this repository (or your fork) in a codespace, or
  • have a look at one of the autotuning example repositories like sqlite-autotuning to kick the tires in a codespace in your browser immediately :)

conda activation

  1. Create the mlos Conda environment.

    conda env create -f conda-envs/mlos.yml

    See the conda-envs/ directory for additional conda environment files, including those used for Windows (e.g. mlos-windows.yml).

    or

    # This will also ensure the environment is up to date using "conda env update -f conda-envs/mlos.yml"
    make conda-env

    Note: the latter expects a *nix environment.

  2. Initialize the shell environment.

    conda activate mlos

Usage Examples

mlos-core

For an example of using the mlos_core optimizer APIs run the BayesianOptimization.ipynb notebook.

mlos-bench

For an example of using the mlos_bench tool to run an experiment, see the mlos_bench Quickstart README.

Here's a quick summary:

./scripts/generate-azure-credentials-config > global_config_azure.jsonc

# run a simple experiment
mlos_bench --config ./mlos_bench/mlos_bench/config/cli/azure-redis-1shot.jsonc

mlos-viz

For a simple example of using the mlos_viz module to visualize the results of an experiment, see the sqlite-autotuning repository, especially the mlos_demo_sqlite_teachers.ipynb notebook.

Installation

The MLOS modules are published to PyPI when new releases are tagged.

To install the latest release, simply run:

# this will install just the optimizer component with SMAC support:
pip install -U "mlos-core[smac]"

# this will install just the optimizer component with flaml support:
pip install -U "mlos-core[flaml]"

# this will install just the optimizer component with smac and flaml support:
pip install -U "mlos-core[smac,flaml]"

# this will install both the flaml optimizer and the experiment runner with azure support:
pip install -U "mlos-bench[flaml,azure]"

# this will install both the smac optimizer and the experiment runner with ssh support:
pip install -U "mlos-bench[smac,ssh]"

# this will install the postgres storage backend for mlos-bench
# and mlos-viz for visualizing results:
pip install -U "mlos-bench[postgres]" mlos-viz

Details on using a local version from git are available in CONTRIBUTING.md.

See Also

Examples

These can be used as starting points for new autotuning projects.

mlos's Issues

Add .editorconfigs

We can consider adding some .editorconfig style entries to help editors follow the style guidelines, in addition to build-time lint checking.

Add tests for IPython Notebooks

At the very least we should make sure that these notebooks continue to execute without throwing exceptions. At best we validate that their output is what we expected.

We should also assert that specific notebooks are checked in with outputs.
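
One possible way to check that a notebook was committed with outputs, using only the standard library, is to read the .ipynb file as JSON and look for code cells with an empty outputs list. This is a sketch: cells_missing_outputs is a hypothetical helper name, and the check is deliberately strict, since some code cells (e.g. plain imports) legitimately produce no output.

```python
import json
from pathlib import Path


def cells_missing_outputs(notebook_path):
    """Return the indexes of non-empty code cells that have no saved outputs."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    missing = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        # The cell source may be stored as one string or as a list of lines.
        source = "".join(cell.get("source", []))
        if source.strip() and not cell.get("outputs"):
            missing.append(index)
    return missing
```

A test could call this on each notebook that is expected to carry outputs and fail when the returned list is not empty, or compare it against a list of cells known to be silent.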

Reliance on docker is problematic on windows machines

Installing Docker requires Windows 10 version 2004, which is blocked on some machines, and it can be tricky to install. I didn't manage to on my workstation. This might need a workaround for use in a teaching environment.

Reevaluate use of `getcwd` in python code

There are several places where getcwd is used to compose a path assuming that the script is executed from the source/Mlos.Python directory or else used to create a temporary file.

For the first, we should change it to be relative to the file referencing it so that the script can be executed from a different directory.

For the second, we should use the system-provided temporary file facilities (e.g., Python's tempfile module) to avoid security issues.
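
Both changes might look roughly like the following sketch. The helper names data_path and write_scratch are hypothetical, and the fallback to the current directory exists only so the snippet also runs in an interactive session where `__file__` is undefined.

```python
import tempfile
from pathlib import Path

# Resolve paths relative to this file rather than the current working directory.
try:
    MODULE_DIR = Path(__file__).resolve().parent
except NameError:
    # __file__ is not defined in interactive sessions; fall back for the demo only.
    MODULE_DIR = Path.cwd()


def data_path(relative_name):
    """Build a path next to this module, independent of where the script is run from."""
    return MODULE_DIR / relative_name


def write_scratch(text):
    """Write text to a securely created temporary file and return its path.

    The caller is responsible for deleting the file when done.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(text)
        return Path(handle.name)
```

tempfile creates the file with a hard-to-guess name and owner-only permissions, which avoids the predictable-name problems of composing a temporary path from getcwd by hand.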

Add C support

This will need some C++ wrapper functions and some build tweaks to consuming projects.

Current thought is to integrate with sqlite as an example.

Using vscode from WSL for dotnet editing requires additional setup work

Right now, attempting to edit dotnet code inside vscode launched from a WSL instance throws an error about a missing .NET SDK.

In theory, we should be able to do the following:

# setup the environment to find the locally installed dotnet
. ./scripts/init.linux.sh
# start vscode and inherit those environment variables (especially PATH)
code .

Unfortunately, it seems that in WSL, vscode is launched really as a remote Windows process with a proxy server to the WSL environment, so those variables are not passed through.

microsoft/vscode-remote-release#1700

Either we should apply some of the suggested fixes in that issue to automatically set them up for the user, or just document how to install a dotnet sdk in the environment.

A third option I'd like to find time to do is to publish our docker images and provide a .devcontainer/ json for automatically letting vscode set itself up in a reasonable way to edit in the container with all the right bits already prepared.

test_lasso_hierarchical_categorical_predictions seems flaky

@edcthayer adding an issue to track this here

Seen this a couple of times recently. Sometimes with slightly different KeyErrors (e.g. medium_quadratic_params instead).
Rerunning it seems to make it go away.

2020-09-22T15:46:14.7210731Z [10 rows x 5 columns]
2020-09-22T15:46:14.7211123Z     raise_missing = True
2020-09-22T15:46:14.7211913Z     self = <pandas.core.indexing._LocIndexer object at 0x0000021609A396D8>
2020-09-22T15:46:14.7212932Z   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\site-packages\pandas\core\indexing.py", line 1646, in _validate_read_indexer
2020-09-22T15:46:14.7213796Z     raise KeyError(f"{not_found} not in index")
2020-09-22T15:46:14.7214363Z     ax = Index(['vertex_height', 'medium_quadratic_params.x_1',
2020-09-22T15:46:14.7214990Z        'medium_quadratic_params.x_2', 'low_quadratic_params.x_1',
2020-09-22T15:46:14.7215540Z        'low_quadratic_params.x_2'],
2020-09-22T15:46:14.7216039Z       dtype='object')
2020-09-22T15:46:14.7216430Z     axis = 1
2020-09-22T15:46:14.7216840Z     indexer = array([ 0,  3,  4,  1,  2, -1, -1], dtype=int64)
2020-09-22T15:46:14.7217460Z     key = Index(['vertex_height', 'low_quadratic_params.x_1', 'low_quadratic_params.x_2',
2020-09-22T15:46:14.7218181Z        'medium_quadratic_params.x_1', 'medium_quadratic_params.x_2',
2020-09-22T15:46:14.7218832Z        'high_quadratic_params.x_1', 'high_quadratic_params.x_2'],
2020-09-22T15:46:14.7219465Z       dtype='object')
2020-09-22T15:46:14.7219832Z     missing = 2
2020-09-22T15:46:14.7220333Z     not_found = ['high_quadratic_params.x_1', 'high_quadratic_params.x_2']
2020-09-22T15:46:14.7220882Z     raise_missing = True
2020-09-22T15:46:14.7221500Z     self = <pandas.core.indexing._LocIndexer object at 0x0000021609A396D8>
2020-09-22T15:46:14.7222280Z KeyError: "['high_quadratic_params.x_1', 'high_quadratic_params.x_2'] not in index"

From:
https://pipelines.actions.githubusercontent.com/fOuLpdRLhJegHdOkuie9qUN2ZnNM5WKTsTmQkIMYbUKwUvkp3o/_apis/pipelines/1/runs/246/signedlogcontent/14?urlExpires=2020-09-22T16%3A10%3A29.3461235Z&urlSigningMethod=HMACV1&urlSignature=EIMbhzZxH5LmY8MLhX91%2FWJ9h02Ut6Typi2flZjOW5Q%3D

Execute Python long haul tests on a schedule

@byte-sculptor observed that we are currently missing the Python long haul tests.

We don't necessarily want these in the CI pipelines (they take too long), but we do want to execute them periodically.

To do that, we'll need a separate .github/workflows/scheduled.yml sort of file for the scheduled tasks.

Another task that we'll want to put in there is periodic docker image rebuilds (e.g. to catch security patches). See Also: #36

See Also: https://docs.github.com/en/free-pro-team@latest/actions/reference/events-that-trigger-workflows#scheduled-events

Enable gcc support

Currently, to build the C++ code generation code, we rely on clang's support for MSVC attributes (e.g. to ignore duplicate definitions at link time).

There are multiple ways around this that would help enable gcc support, including reorganizing the code generation output or using some macros to add #ifdef wrappers around the attribute usages.

Consider this as icing/wishlist for now.

Add support for alternative backend storage

MLOS should be a bit more generic in its support of backend storage for the models, optimizers, experiments, etc. (e.g. not just SqlServer).

Some other potential targets include: mlflow with files, sqlite, mysql, postgres, etc.
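
One way to make the backend pluggable is a small storage interface with one implementation per target. The sketch below is hypothetical (ExperimentStorage, SqliteStorage, and the trials table are illustrative names, not existing MLOS code) and uses the standard-library sqlite3 module as the example backend.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class ExperimentStorage(ABC):
    """Hypothetical backend-neutral interface for persisting trial results."""

    @abstractmethod
    def save_trial(self, experiment_id, config, score):
        """Persist one (config, score) result for an experiment."""

    @abstractmethod
    def load_trials(self, experiment_id):
        """Return all (config, score) results for an experiment, oldest first."""


class SqliteStorage(ExperimentStorage):
    """Example backend that keeps trials in a SQLite database."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS trials "
            "(experiment_id TEXT, config TEXT, score REAL)"
        )

    def save_trial(self, experiment_id, config, score):
        # Using the connection as a context manager commits on success.
        with self._conn:
            self._conn.execute(
                "INSERT INTO trials VALUES (?, ?, ?)",
                (experiment_id, json.dumps(config), score),
            )

    def load_trials(self, experiment_id):
        rows = self._conn.execute(
            "SELECT config, score FROM trials WHERE experiment_id = ? ORDER BY rowid",
            (experiment_id,),
        )
        return [(json.loads(config), score) for config, score in rows]


storage = SqliteStorage()
storage.save_trial("demo", {"cache_mb": 64}, 1.25)
print(storage.load_trials("demo"))
```

Other targets such as mysql or postgres would then be additional subclasses, and callers would depend only on the interface.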

Add logging infrastructure

We've noticed that troubleshooting, particularly where there is cross process/language waiting or lookups involved, could be aided by some simple print statements to track the status of various operations (e.g. assembly lookup, synchronization points, etc.)

However, these statements are undesirable in a production environment so have been eschewed thus far.

We should implement a logging mechanism to make that configurable on a case by case basis.
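
On the Python side, the standard logging module already covers much of this. A minimal sketch, assuming a hypothetical MLOS_LOG_LEVEL environment variable and hypothetical logger and function names:

```python
import logging
import os


def configure_logging(default_level="WARNING"):
    """Set the root log level from MLOS_LOG_LEVEL (a hypothetical variable name)."""
    level_name = os.environ.get("MLOS_LOG_LEVEL", default_level).upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        level = logging.WARNING  # unknown names fall back to a quiet default
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    return level


_LOG = logging.getLogger("mlos.example")  # hypothetical logger name


def wait_for_sync_point(name):
    """Illustrative operation that reports its progress at debug level."""
    _LOG.debug("Waiting for synchronization point %s", name)
    # ... the actual waiting would happen here ...
    _LOG.debug("Passed synchronization point %s", name)
```

With the default level the debug messages are dropped, so production output stays quiet; setting the variable to DEBUG turns them on for a troubleshooting session without code changes.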

Improve mlos library to support Python 3.8

Several of the modern tool installers default to Python 3.8. We should see what can be done to make the mlos python library work for that as well rather than pinning on 3.7 which complicates the install/setup process.

Add linux support

Make sure the project builds and works easily in a Linux environment.

Create mailmap

We should create a mailmap for microsoft alias -> github handles so that the commit logs map both identities to the same person.

Add Java support

Mostly a placeholder for future work: we should add java support for code generation to talk to the external agent for tracking experiments and telemetry so that we can tune another common class of systems: distributed java applications.

One nice possibility here could be to integrate existing java language attributes so that the code generation process is a bit more native feeling (rather than the extra C# annotated structs that we currently have to support cross compiler C++).

Add code coverage checks

Breaking out from #6:

It would be good to add code coverage checks and badges for that to the repo landing pages.

Add OSS Examples

We need some OSS examples to use both for initial experience and documentation purposes as well as CI/CD test integrations.

Publish C# API documentation from comments

We already do this for Python using sphinx.
It's already possible to output xml from the msbuild .csproj files. Should be able to output either HTML directly or use another tool to help with that.

Add ROADMAP.md

We should have a place to document the high level features and items we'd like to support.

There's a spot to link to this on the top level README.md right now, but it's currently a dead link.

Publish docker build images

We should publish the base build image portion used in the current base Dockerfile to avoid needing to execute all of the apt-get commands each time.

This would also be useful for CI pipelines wanting to make use of those images to build/test.

Enable -Wall for C++ builds on Linux

Current Linux builds don't enable -Wall (i.e., most compiler warnings; failing the build on them would additionally need -Werror).

This is generally good practice and would help us enable integration with more projects with fewer build issues.

Getting there may take some reorgs of the code generation output (e.g. to avoid duplicate definitions that are currently just tagged to ignore in the linker).

Python Unit Test Timeouts

Hmm, Python unit tests are still timing out. Something other than the tests randomly taking a long time due to high-degree polynomials being chosen (which I think we already reduced) might be going on.

Originally posted by @bpkroth in #66 (comment)
