sl_core

Core application code for SuperLearner, templates, and documentation.

TL;DR

# Run the SuperLearner locally with sample data
git clone https://github.com/parallelworks/sl_core
cd sl_core
./local_superlearner_test.sh

Usage

To try out the SuperLearner, start with train_predict_evaluate.sh, a wrapper that launches the training stage (train.py) followed by the post-training stages: making predictions (predict.py), estimating errors via a regression and principal component analysis (PCA, pca.py), and running feature permutation importance (FPI, fpi.py). The launch script local_superlearner_test.sh is a template showing how to specify the options of train_predict_evaluate.sh.
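
The chain of stages described above can be sketched as a small Bash function. This is only an illustration, not the contents of train_predict_evaluate.sh: the stage script names come from the text, while the function name run_stages, the DRY_RUN variable, and the omission of all stage options are assumptions made for this sketch. See local_superlearner_test.sh for the options the real wrapper expects.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run the four SuperLearner stages in order and stop
# at the first failure. The real wrapper passes options to each stage;
# they are omitted here because this README does not list them.

run_stages() {
    local stage
    for stage in train.py predict.py pca.py fpi.py; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only print the command that would be executed.
            echo "python ${stage}"
        else
            # Abort the chain if any stage exits with a nonzero status.
            python "${stage}" || { echo "Stage ${stage} failed" >&2; return 1; }
        fi
    done
}

# Preview the stage order without running anything.
DRY_RUN=1 run_stages
```

Running training first and aborting on failure keeps the later stages from operating on a missing or incomplete model.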

More broadly, train_predict_evaluate.sh is meant to be launched as part of a larger workflow that syncs machine learning code, data, and archives via GitHub and runs multiple instances of the SuperLearner in parallel. This workflow is defined by workflow.sh (the code) and workflow.xml, a "form" displayed by the PW platform that lets users specify workflow parameters and launch the workflow via calls to the PW API. Launching a workflow through the API is particularly useful when integrating with GitHub Actions, since the actions run as Docker containers on GitHub and can be set up to launch a PW workflow through the API. Please see the ML-archive repository associated with this workflow for more information.
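
As a rough illustration of running several instances in parallel, the sketch below uses plain Bash background jobs on a single machine. How workflow.sh actually dispatches instances is not shown in this README, so the function name run_parallel and the convention of passing the instance index as the last argument are assumptions for this example only.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: launch N copies of a command in the background,
# wait for all of them, and report failure if any copy failed.

run_parallel() {
    local n="$1"; shift
    local pids="" pid failed=0 i
    for i in $(seq 1 "$n"); do
        # Each instance receives its index as an extra trailing argument.
        "$@" "$i" &
        pids="$pids $!"
    done
    for pid in $pids; do
        # wait returns the exit status of the given background job.
        wait "$pid" || failed=1
    done
    return "$failed"
}

# Example: three placeholder instances (output order may vary).
run_parallel 3 echo "SuperLearner instance"
```

In practice the placeholder echo would be replaced by a call to train_predict_evaluate.sh with per-instance options.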

Additional information about each stage of the machine learning (train, predict, pca, fpi) is available in ML.md.

Install

There are three ways to install a workflow on Parallel Works.

  1. Use a github.json if available. A GitHub-integrated workflow can be automatically cloned to the PW platform if the user has access to the repository. For example, the JSON code block below would work as the contents of github.json for this workflow because it points to this workflow's public repository. No other files are needed on the PW platform since the rest of the necessary files are cloned each time the user selects the workflow.
{
    "repo": "https://github.com/parallelworks/sl_core.git",
    "branch": "main",
    "dir": ".",
    "xml": "workflow.xml",
    "thumbnail": "superlearner.png",
    "readme": "README.md"
}
  2. Copy the workflow from the PW Marketplace if available. Workflows (GitHub-integrated or "classic", see below) can be shared with other PW users on the PW Marketplace. To get to the Marketplace, click your PW account name in the upper right corner and then select Marketplace (globe icon) from the drop-down menu.

  3. Install the workflow by copying files. The "classic" way to install a PW workflow is to first create a new workflow; in the case of sl_core, use a Bash type workflow. Then, in the PW IDE terminal:

# Navigate to the workflow directory created by the platform.
cd /pw/workflows/sl_core

# Remove the default files in this directory.
rm -f *

# Manually copy the workflow files into the workflow directory.
git clone https://github.com/parallelworks/sl_core .
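
Because git clone only accepts an existing target directory if it is empty, and rm -f * leaves hidden files in place, a small guard before cloning can help. The sketch below is one way to do this; the helper names dir_is_empty and clone_sl_core_into are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are for illustration only.

# Succeed only if the directory exists and holds no entries, hidden or not.
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Clone sl_core into the given directory, refusing if it is not empty.
clone_sl_core_into() {
    local target="$1"
    if dir_is_empty "$target"; then
        git clone https://github.com/parallelworks/sl_core "$target"
    else
        echo "Refusing to clone: ${target} is not empty" >&2
        # List what is still there so it can be removed by hand.
        ls -A "$target" >&2
        return 1
    fi
}

# Usage (in the PW IDE terminal):
#   clone_sl_core_into /pw/workflows/sl_core
```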

Contents

  • create_conda_env.sh: Automated build of Conda environment to run the SuperLearner.

  • requirements.txt: All the versions of the software used in the current Conda environment. This can be used to build an environment more quickly; see notes in create_conda_env.sh.

  • sample_inputs: Some sample inputs for using the SuperLearner; these files are used by run.sh in sl_fit_validate.

  • sample_outputs: Some sample output files from the SuperLearner.

  • sl_fit_validate: Contains the SL run.sh and main.sh. The runscript is a very simple example. Consider moving these files into the top level of the repo for compatibility with workflow.json-style workflow deployments.

  • sl_test: An old version of the SuperLearner that is kept here only for development reference/convenience. It will likely be removed soon.

Contributors

pwghsa01, stefangary, erexer
