Test Tube

Log, organize and parallelize hyperparameter search for Deep Learning experiments

Docs

View the docs here


Test Tube is a Python library to track and parallelize hyperparameter search for deep learning and ML experiments. It is framework-agnostic and built on top of Python's argparse API for ease of use.

pip install test_tube

Main test-tube uses

Compatible with any Python ML library, such as TensorFlow, Keras, PyTorch, Caffe, Caffe2, Chainer, MXNet, Theano and Scikit-learn.


Examples

The Experiment object is a subclass of PyTorch's SummaryWriter.

Log and visualize with Tensorboard

from test_tube import Experiment
import numpy as np
import torch

exp = Experiment('/some/path')
exp.tag({'learning_rate': 0.02, 'layers': 4})

# exp is a subclass of SummaryWriter, so SummaryWriter methods are available on it
# (placeholder data: 100 random feature vectors, labels and 28x28 images)
features = torch.rand(100, 784)
labels = list(range(100))
images = torch.rand(100, 28, 28)
exp.add_embedding(features, metadata=labels, label_img=images.unsqueeze(1))

# simulate training
for n_iter in range(2000):
    exp.log({'testtt': n_iter * np.sin(n_iter)})

# save and close
exp.save()
exp.close()

pip install tensorflow

tensorboard --logdir /some/path

Run grid search on SLURM GPU cluster

from test_tube.hpc import SlurmCluster

# hyperparams is a test-tube hyperparameter object
# (parser is assumed to be a HyperOptArgumentParser defined elsewhere)
hyperparams = parser.parse_args()

# init cluster
cluster = SlurmCluster(
    hyperparam_optimizer=hyperparams,
    log_path='/path/to/log/results/to',
    python_cmd='python3'
)

# let the cluster know where to email on a change in job status (e.g. complete, fail)
# ('you@example.com' is a placeholder; use your own address)
cluster.notify_job_status(email='you@example.com', on_done=True, on_fail=True)

# set the job options. In this instance, we'll run 20 different models,
# each with its own set of hyperparameters, giving each one 1 GPU (i.e. taking up 20 GPUs)
cluster.per_experiment_nb_gpus = 1
cluster.per_experiment_nb_nodes = 1

# run the models on the cluster
cluster.optimize_parallel_cluster_gpu(train, nb_trials=20, job_name='first_tt_batch', job_display_name='my_batch')   

# we just ran 20 different hyperparameters on 20 GPUs in the HPC cluster!!    

Optimize hyperparameters across GPUs

from test_tube import HyperOptArgumentParser

# subclass of argparse
parser = HyperOptArgumentParser(strategy='random_search')
parser.add_argument('--learning_rate', default=0.002, type=float, help='the learning rate')

# let's enable optimizing over the number of layers in the network
parser.opt_list('--nb_layers', default=2, type=int, tunable=True, options=[2, 4, 8])

# and tune the number of units in each layer
parser.opt_range('--neurons', default=50, type=int, tunable=True, low=100, high=800, nb_samples=10)

# compile (because it's argparse underneath)
hparams = parser.parse_args()

# optimize across 4 gpus
# use 2 gpus together and the other two separately
hparams.optimize_parallel_gpu(MyModel.fit, gpu_ids=['1', '2,3', '0'], nb_trials=192, nb_workers=4)

Or... across CPUs

hparams.optimize_parallel_cpu(MyModel.fit, nb_trials=192, nb_workers=12)
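
To make the idea of random search over tunable options concrete, here is a minimal standard-library sketch. It is an illustration only and not test-tube's implementation: the `sample_trials` helper and the `search_space` dictionary are hypothetical names, and the candidate values simply mirror the `--nb_layers` and `--neurons` ranges used above.

```python
import random


def sample_trials(search_space, nb_trials, seed=0):
    """Draw nb_trials random combinations from a dict of candidate values."""
    rng = random.Random(seed)
    trials = []
    for _ in range(nb_trials):
        # pick one candidate value per hyperparameter, independently
        trial = {name: rng.choice(values) for name, values in search_space.items()}
        trials.append(trial)
    return trials


# hypothetical search space mirroring the options above
search_space = {
    'nb_layers': [2, 4, 8],
    'neurons': list(range(100, 800, 70)),  # 10 evenly spaced candidates in [100, 800)
}

trials = sample_trials(search_space, nb_trials=5)
for trial in trials:
    print(trial)
```

Each resulting dictionary plays the role of one trial's hyperparameters. A parallel runner would hand one such trial to each worker.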

You can also optimize on a log scale to allow better search over magnitudes of hyperparameter values, with a chosen base (disabled by default). Keep in mind that the range you search over must be strictly positive.

from test_tube import HyperOptArgumentParser

# subclass of argparse
parser = HyperOptArgumentParser(strategy='random_search')

# randomly searches over the (log-transformed) range [100, 800)
parser.opt_range('--neurons', default=50, type=int, tunable=True, low=100, high=800, nb_samples=10, log_base=10)

# compile (because it's argparse underneath)
hparams = parser.parse_args()

# run 20 trials of random search over the hyperparams
for hparam_trial in hparams.trials(20):
    train_network(hparam_trial)
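
The log-scale search described above can be pictured as sampling uniformly in log space and then mapping back. The following is a rough sketch under that assumption, not a copy of test-tube's internals. The function `sample_log_uniform` is a hypothetical helper, and its `low`, `high`, `nb_samples` and `log_base` parameters are named after the `opt_range` arguments only for readability.

```python
import math
import random


def sample_log_uniform(low, high, nb_samples, log_base=10, seed=0):
    """Sample values whose logarithms are uniformly distributed between low and high."""
    if low <= 0 or high <= low:
        # the logarithm is only defined for strictly positive values
        raise ValueError('range must be strictly positive with low < high')
    rng = random.Random(seed)
    log_low = math.log(low, log_base)
    log_high = math.log(high, log_base)
    # draw uniformly in log space, then map back to the original scale
    return [log_base ** rng.uniform(log_low, log_high) for _ in range(nb_samples)]


samples = sample_log_uniform(100, 800, nb_samples=10)
print([round(s) for s in samples])
```

Sampling this way spreads trials evenly across orders of magnitude, which is why the range has to be strictly positive.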

Convert your argparse params into searchable params by changing 1 line

import argparse
from test_tube import HyperOptArgumentParser

# these lines are equivalent
parser = argparse.ArgumentParser(description='Process some integers.')
parser = HyperOptArgumentParser(description='Process some integers.', strategy='grid_search')

# do normal argparse stuff
...
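
For intuition about what a grid search strategy enumerates, here is a small sketch using only the standard library. It is an assumption-laden illustration rather than test-tube's actual code: `grid_trials` and `search_space` are hypothetical names, and the candidate values are made up for the example.

```python
import itertools


def grid_trials(search_space):
    """Yield every combination of candidate values as a dict."""
    names = list(search_space)
    for combo in itertools.product(*(search_space[name] for name in names)):
        yield dict(zip(names, combo))


# hypothetical candidate values for two hyperparameters
search_space = {
    'nb_layers': [2, 4, 8],
    'learning_rate': [0.002, 0.02],
}

trials = list(grid_trials(search_space))
print(len(trials))  # 3 x 2 = 6 combinations
```

Unlike random search, the number of trials here grows multiplicatively with each added hyperparameter.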

Log images inline with metrics

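import numpy as np
from PIL import Image

# name must have either jpg, png or jpeg in it
# load the image as a NumPy array (Pillow is used here; any image loader works)
img = np.asarray(Image.open('a.jpg'))
exp.log({'test_jpg': img, 'val_err': 0.2})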

# saves image to ../exp/version/media/test_0.jpg
# csv has file path to that image in that cell

Demos

How to contribute

Feel free to fix bugs and make improvements!

  1. Check out the current bugs here or feature requests.
  2. To work on a bug or feature, head over to our project page and assign yourself the bug.
  3. We'll add contributor names periodically as people improve the library!

Bibtex

To cite the framework use:

@misc{Falcon2017,
  author = {Falcon, W.A.},
  title = {Test Tube},
  year = {2017},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/williamfalcon/test-tube}}
}    

License

In addition to the terms outlined in the license, this software is U.S. Patent Pending.

Contributors

williamfalcon, borda, alok, backpropper, zafarali, akhti, seantrue, dlfelps, felix-petersen, jtamir, karanchahal, schneider-mathias, oscmansan, tullie, expectopatronum, kvhooreb
