
rxrx1-utils

Starter code for the CellSignal NeurIPS 2019 competition hosted on Kaggle.

To learn more about the dataset please visit RxRx.ai.

Notebooks

Here are some notebooks to illustrate how this code can be used.

Setup

This starter code works with Python 2.7 and above. To install the dependencies needed for training and visualization, run:

pip install -r requirements.txt

If you plan on using the preprocessing functionality, you also need to install additional dependencies:

pip install -r preprocessing_requirements.txt

Preprocessing

Reading individual image files can become an IO bottleneck during training. This will be a common problem for anyone using this dataset, so we are also releasing example scripts that pack the images into TFRecords and zarr files. In addition, we are making some pre-created TFRecords available in Google Cloud Storage; read more about the provided TFRecords below.
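To get a feel for why this matters, here is a quick back-of-the-envelope calculation. The plate and site counts below are illustrative assumptions, not exact dataset statistics; the point is that per-channel PNGs multiply the number of small reads:

```python
def image_files_per_plate(num_wells, num_sites=2, num_channels=6):
    """Count the individual PNG reads needed for one plate when each
    well is imaged at `num_sites` sites and each site image is stored
    as `num_channels` separate channel files."""
    return num_wells * num_sites * num_channels

# An illustrative 384-well plate, 2 sites per well, 6 channels per site:
print(image_files_per_plate(384))  # 4608 small file reads per plate
```

Packing examples into TFRecords or zarr files turns thousands of tiny reads into a handful of large sequential ones.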

images2tfrecords

Script that packs raw images from the rxrx1 dataset into TFRecords. This script runs locally or using Google DataFlow.

Run python -m rxrx.preprocess.images2tfrecords --help for usage instructions.

images2zarr

Script that packs raw images from the rxrx1 dataset into zarrs. This script only runs locally but could easily be extended to run using Google DataFlow.

This script packs each site image into a single zarr. So, instead of having to load 6 separate channel pngs for a single image, all of those channels are saved together in a single zarr file. You could extend the script to pack more images into a single zarr file, similar to what is done for TFRecords. This is left as an exercise to the IO bound reader. :) Read more about the Zarr format and library here.
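A minimal sketch of the packing idea, using NumPy to stack the six per-channel arrays into one (H, W, C) array; the helper name, image size, and dtype here are assumptions for illustration, and in practice the stacked array would then be persisted with the zarr library:

```python
import numpy as np

def stack_site_channels(channel_images):
    """Stack per-channel 2D arrays into a single (H, W, C) array so that
    one read fetches all channels for a site."""
    assert len(channel_images) == 6, "rxrx1 site images have 6 channels"
    return np.stack(channel_images, axis=-1)

# Six fake 512x512 channel arrays standing in for the channel PNGs:
channels = [np.zeros((512, 512), dtype=np.uint8) for _ in range(6)]
site = stack_site_channels(channels)
print(site.shape)  # (512, 512, 6)
# zarr.save('HEPG2-10_p1_s1.zarr', site)  # then write it out as one zarr
```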

Run python -m rxrx.preprocess.images2zarr --help for usage instructions.

Training on TPUs

This repo has barebones starter code on how to train a model on the RxRx1 dataset using Google Cloud TPUs.

The easiest way to see this in action is to look at this notebook.

You can also spin up a VM to launch jobs from. To understand TPUs, the best place to start is the TPU quickstart guide. The ctpu command is helpful and you can find its documentation here. Note that you can easily download and install ctpu to your local machine.

Example TPU workflow

First spin up a VM:

ctpu up -vm-only -forward-agent -forward-ports -name my-tpu-vm

This command will create the VM and ssh you into it. Note how the -vm-only flag is used. This allows you to spin up the VM separately from the TPU, which helps prevent spending money on idle TPUs.

Next, setup the repo and install the dependencies:

git clone git@github.com:recursionpharma/rxrx1-utils.git
cd rxrx1-utils
pip install -r requirements.txt # optional if just training!

Note that if you are just training, you can skip the pip install since the VM will already have all the needed dependencies.

Next you need to spin up a TPU for training:

export TPU_NAME=my-tpu-v3-8
ctpu up -name "$TPU_NAME" -preemptible -tpu-only -tpu-size v3-8

Once that is complete you can start a training job:

python -m rxrx.main --model-dir "gs://path-to-bucket/trial-id/"

You'll also want to launch TensorBoard to check the results:

tensorboard --logdir=gs://path-to-bucket/

Since we used the -forward-ports flag in the ctpu command when starting the VM, you will be able to view TensorBoard on your localhost.

Once you are done with the TPU be sure to delete it!

ctpu delete -name "$TPU_NAME" -tpu-only

You can then iterate on the code and spin up a TPU again when ready to try again.

When you are done with your VM you can either stop it or delete it with the ctpu command, for example:

ctpu delete -name my-tpu-vm

Provided TFRecords

As noted above we are providing TFRecords. They live in the following buckets:

gs://rxrx1-us-central1/tfrecords
gs://rxrx1-europe-west4/tfrecords

The data lives in two regional buckets because when you train with TPUs you want to read from a bucket in the same region as your TPU. Remember to use the bucket that matches your TPU's region!
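One way to keep this straight is a small helper that maps the TPU's zone to the matching bucket. This helper is hypothetical (not part of this repo), and the mapping covers only the two buckets listed above:

```python
TFRECORD_BUCKETS = {
    'us-central1': 'gs://rxrx1-us-central1/tfrecords',
    'europe-west4': 'gs://rxrx1-europe-west4/tfrecords',
}

def bucket_for_zone(tpu_zone):
    """Pick the TFRecord bucket co-located with a TPU zone like 'us-central1-b'."""
    region = tpu_zone.rsplit('-', 1)[0]  # strip the trailing zone letter
    try:
        return TFRECORD_BUCKETS[region]
    except KeyError:
        raise ValueError('No rxrx1 TFRecord bucket in region %s' % region)

print(bucket_for_zone('us-central1-b'))  # gs://rxrx1-us-central1/tfrecords
```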

The directory structure of the TFRecords is as follows:

└── tfrecords
    ├── by_exp_plate_site-42
    │   ├── HEPG2-10_p1_s1.tfrecord
    │   ├── HEPG2-10_p1_s2.tfrecord
    │   ├── …
    │   ├── U2OS-03_p3_s2.tfrecord
    │   ├── U2OS-03_p4_s1.tfrecord
    │   └── U2OS-03_p4_s2.tfrecord
    └── random-42
        ├── train
        │   ├── 001.tfrecord
        │   ├── 002.tfrecord
        │   └── …

The random-42 directory denotes that the data has been split randomly across the tfrecords, each record holding ~1000 examples. The 42 is the random seed used to generate this partition. The example code in this repository uses this version of the data.

In by_exp_plate_site-42, each TFRecord contains all of the images for a particular experiment, plate, and site grouping. Within each TFRecord the well addresses are in random order. The advantage of this grouping is that you can be selective about which experiments you train on. Due to the grouping, each TFRecord here holds only about ~277 examples.
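Since the grouping is encoded in the filenames (e.g. HEPG2-10_p1_s1.tfrecord), selecting experiments comes down to filtering filenames. A sketch, using a hypothetical helper that is not part of this repo:

```python
import re

def parse_tfrecord_name(filename):
    """Split a name like 'HEPG2-10_p1_s1.tfrecord' into its
    experiment, plate, and site components."""
    m = re.match(r'([A-Z0-9]+-\d+)_p(\d+)_s(\d+)\.tfrecord$', filename)
    if m is None:
        raise ValueError('Unrecognized filename: %s' % filename)
    experiment, plate, site = m.groups()
    return {'experiment': experiment, 'plate': int(plate), 'site': int(site)}

print(parse_tfrecord_name('HEPG2-10_p1_s2.tfrecord'))
# {'experiment': 'HEPG2-10', 'plate': 1, 'site': 2}
```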

For good training batch diversity, it is recommended that you use the TF Dataset API to interleave examples from these various files. The input_fn provided in this repository already does this.
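The interleaving idea itself can be illustrated in plain Python. This is only the concept that the TF Dataset API's interleave implements, not the API itself, which streams lazily rather than building a list:

```python
from itertools import chain, zip_longest

def round_robin(*record_streams):
    """Take one example from each stream in turn, so consecutive
    examples come from different TFRecord files."""
    _SKIP = object()  # sentinel padding for exhausted streams
    interleaved = chain.from_iterable(
        zip_longest(*record_streams, fillvalue=_SKIP))
    return [x for x in interleaved if x is not _SKIP]

# Two toy 'files' of examples:
print(round_robin(['a1', 'a2'], ['b1', 'b2', 'b3']))
# ['a1', 'b1', 'a2', 'b2', 'b3']
```

Interleaved reads mean a batch is not dominated by a single experiment, plate, or site file, which matters most for the by_exp_plate_site grouping.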
