Bio Brew

Simple package manager for bioinformatic tools.
Fork of drio’s biobrew with some slight architecture changes.

Allows you to write recipes that describe how to install your favorite bioinfo tool. Once a recipe is written, you can install it simply with:

bb install tool

Where tool is your tool’s name.

Have a feature request? Want a new recipe? Create an Issue to let us know what you want.

Creating a Recipe

See bio.brew’s example recipe to get a feel for how a recipe should look.

In practice, a new recipe will usually start as a copy of an existing recipe tweaked to work.

Install

There are a few ways to install and configure bio.brew. My (current) preferred method involves checking out the source from github using git and then using the bb_load_env setup as detailed below. The requirement for this method is of course that you need git installed on your system.

Install with git

git clone git://github.com/vlandham/bio.brew.git

Installation Configuration

Minimum Configuration

At a minimum, you should define the BB_PATH and BB_INSTALL environmental variables and add $BB_INSTALL/bin and $BB_PATH/bin to your PATH.

BB_PATH defines the location of the bio.brew package, while BB_INSTALL tells bio.brew where to install tools.

This will add the bb executable to your path as well as binaries created using bio.brew. Just adding these variables will not add the recipe specific ENV variables to your path (see below).

Here is an example setup for basic bio.brew Brew use. This example assumes you have the bio.brew tool downloaded to the directory: /home/username/bio.brew and want bio.brew to install tools at /home/username/tools

export BB_INSTALL=/home/username/tools
export BB_PATH=/home/username/tools/bio.brew
export PATH="$BB_INSTALL/bin:$BB_PATH/bin:$PATH"

With this configuration, bio.brew will download tar/zip packages to /home/username/tools/tarballs extract and build these packages in /home/username/tools/stage and then symbolically link the correct binaries from these packages into /home/username/tools/bin.

The bb_load_env script described below automates most of this path configuration. It also provides the extra benefit of also adding ENV variables that are setup by particular recipes. This is described in a bit more detail in the Using bio.brew section below.

bb_load_env Helper Script

The small script bb_load_env has been provided to give one option for configuring bio.brew. You can add bb_load_env to your shell’s configuration file. A default value for BB_INSTALL is provided in bb_load_env, but I would specifiy explictly where you want to download these files. For bash, this would look like:

# .bashrc
# Assumes BB_INSTALL set as above
[[ -s "$BB_INSTALL/bio.brew/bb_load_env" ]] && source "$BB_INSTALL/bio.brew/bb_load_env"

That will set the proper path(s). Take a quick peek to load_env, it is a very small script.

Using bio.brew

Installing a new recipe

Here we use bio.brew to install samtools. The -j8 is passed to make when compiling

bb list
# produces a list of possible recipes to install with 
# the versions of the applications that each recipe is 
# made for (the seed of the recipe).
bb install samtools
# output produced by download and install of samtools
# if there is an errror, look at log files in $BB_PATH/log/samtools
bb activate samtools
# this will create sym links of the newly created binaries 
# from the samtools staging directory to $BB_INSTALL/bin
# Having this as a separate step allows you to download an update
# and test it without it replacing your current version of a tool.

As you can see, it is now installed in bio.brew’s local directory

$ ls -acl $BB_INSTALL/bin

Using Tools

Once installed the binaries will be in your path so you can just call them.

$ which samtools
# should output path $BB_INSTALL/bin/samtools

$ samtools view [options]

Keep in mind some recipes (packages) do not generate binary files, for those cases, bb creates special scripts (log/*.sh) and the necessary ENV vars are created. For example, picard is a bunch of jars that cannot be executed directly. By loading the recipe’s enviroment you will have a ENV variable called PICARD that points to the jars. It is then very easy to use the jars as necessary.

For example, once picard is installed and activated, assuming you have sourced the bb_load_env script in your .bashrc as described above, you could use picard jar files without knowing where exactly picard is installed:

$ ls $PICARD
# outputs all jars in picard directory.

$ java -Xmx2g -jar $PICARD/BamToBfq.jar [options]

Its just that easy!

Recipe Troubles

So you write a recipe, you forget some step, and now things are messed up. What do you do?

Recipe creation is an iterative process. Usually you won’t get it completely right the first time. In the future, we hope to make bio.brew more accommodating to this process. For now, there are a few things you can do.

Run bb clean

If bio.brew dies during one of its steps, it could leave behind lock files that might prevent running the installation process again.

To get rid of these, run bb clean like so:

bb clean picard

This will remove the meta files bio.brew creates, allowing you to re-run the recipe (after you’ve fixed the problem).

Check the log files

To see what when wrong, check the installation log files. These should be in the $BB_PATH/logs directory. Built-in helpers usually generate log files for their processes.

Add logging to your recipe

bio.brew comes with a handy log helper, which you can use to echo out relevant information during the installation of your recipe

log "performing some command now"
# ...
log "done with command"

Manually remove files

If something goes really wrong, you might have to dig down and remove files manually to start over again.

Remember: recipes are just bash scripts. There are no inherent restrictions on what you put in them. Please recipe responsibly.

How I Use bio.brew

As an example setup, here is how I’ve installed and setup bio.brew on OSX 10.5.8. Setup should be very similar for other versions of OS X as well as Linux distributions.

First, I created a tools directory in my home directory and cloned the bio.brew repo there:

mkdir ~/tools
cd ~/tools
git clone [email protected]:vlandham/bio.brew.git

Then I added the following to my .bashrc file:

export BB_INSTALL=~/tools
export BB_PATH=~/tools/bio.brew
PATH="$BB_INSTALL/bin:$BB_PATH/bin:$PATH"
export PATH

[[ -s "$BB_INSTALL/bio.brew/bb_load_env" ]] && source "$BB_INSTALL/bio.brew/bb_load_env"

This indicates that all my bio.brew recipes will be installed in the bin directory under ~/tools

As I manage java, subversion, and other packages externally of bio.brew, I first fake these inside of bio.brew to satisfy dependencies

bb fake java
bb fake svn

Now I’m ready to use bio.brew to install and manage my bioinformatic packages!

bb install bowtie
bb activate bowtie

Testing

untested

At that point you can use ./bin/bb to list, install or remove recipes. You can also run the ``tests/do_all.sh`` script to install all the recipes. That may take some time (20 minutes in a fairly powerful machine with 8 cores and using bb -j8 install).

Justification

Imagine you get access to a new linux box or HPC cluster. You want to start doing your analysis, coding, data-mining but you do not have your toolbox available. Yes, you can ask the sysadmins to install that for you. That may work. You may also install the tools yourself.

BB (biobrew) helps you either way. With a simple oneliner you can have most of your tools ready to use. BB is a tiny package manager based entirely in bash. It makes the extension of recipes (package definitions) very easy.

BB currently works in Linux enviroments (it should work just fine in other flavours of Unix). It only needs a typical Linux enviroment with gcc, curl (and potentially git). Besides classic and inseparable unix tools, bio.brew’s focus is on bioinformatic tools and packages.

bio.brew has a similar feel to it as the homebrew package manager for Mac OSX. The difference being that it is much more lean in terms of functionality and packages provided, it is written entirely in shell instead of a mixture of shell and ruby, it works on more than just Mac OSX, and its focus is only on core bioinformatic tools.

Give it a try and make it better by improving the framework and adding more recipes.

Differences from Original bio.brew

The recipe seed name is a more fundamental part of the system architecture now. Recipes are defined by the combination of the recipe name as well as the seed name. In effect, the recipe name describes a tool and the seed name indicates a specific version of that tool.
The above architecture allows and encourages keeping multiple versions (seeds) of a tool around but having only one version active at a particular time.
The STAGE_DIR is where the compiled resources for a recipe/seed live. This is separate and external to the installation directory.
The installation directory is configurable by defining BB_INSTALL (see configuration below).
- To be an active seed is to have the binaries of the particular seed of the recipe linked in the bin directory of bio.brew
- This is achieved partly through the use of a current symbolic link for each recipe that points to the directory containing the active seed.
Installation of a recipe seed and activation of that seed are two separate steps in this version of bio.brew.
- This allows some sort of testing to be performed before the current version (seed) of the tool is switched to the new version.
Required operations before/after installation and removal are called automatically, instead of having to call these functions inside individual recipes.
Some additional helper functions for installation and checking external dependencies.
Additional bio.brew commands:
- clean – to remove bio.brew files related to a recipe/seed. Useful for initially developing a recipe to clean out failed installation attempts.
- activate – as mentioned above, activation (moving the current pointer for a recipe to the newly installed seed) is a separate step from install.
- test – stubbed functionality that could be used in the future to add testing for a recipe/seed before it is activated.
- fake – If there is a dependency in a recipe, but you don’t want to use bio.brew to install the dependency, use fake to have bio.brew think that it is installed. (Most useful for java dependencies).

Cavets

These changes were made because of the nature of the environment we wanted bio.brew to work in – and because we believe it is very useful to be able to keep older versions of these tools around as necessary. I feel these changes are an improvement, but there are some things not completely integrated from the original bio.brew:

Not all the recipes have been modified to use the new methodology. See bamtools.sh or cufflinks.sh for examples of recipes that do use the new setup and conventions.
The test script(s) has not been tested and will probably not work.

TODO

Allow specific options for actions (-j should not be generic, only for install).
more recipes!
better support for moving between seeds of a recipe
better support for removing recipes accurately and cleanly

vlandham / bio.brew Goto Github PK

bio.brew's Introduction