Giter Club home page Giter Club logo

apollo-git-backup's Introduction

Apollo Git Backup

This repository contains the scaffolding for creating your own Apollo Git Backup. There are a couple of noteworthy features:

  • Backs up fasta and gff3 file in git
  • XUnit reports
  • Extracts organism directory lineage
  • Can convert historical data to JBrowse tracks for visualisations of change over time.

Tools

Backup

The backup is straightforward, requireing python and jq. You will need to install dependencies with pip first:

pip install -r requirements.txt

Then you can run the backup script with the following:

export [email protected]
export APOLLO_PASSWORD=password
bash run.sh

This will produce a JUnit compatible report.xml. The developers of this project run it as a cron job in Jenkins and consume the XUnit report to ensure that it is functioning correctly.

Lineage

As part of the backup script run, lineage files are output. In our Apollo, due to how the Galaxy/Apollo connector works, we store every single JBrowse data directory that has ever been associated with an organism. It turns out that this is incredibly useful. If something goes wrong with a re-generated trackList, you can easily "downgrade" to a previous "build" of the data directory. A lineage file simply consists of every folder ID that was used in chronological order:

"/opt/apollo/data/galaxy/106799"
"/opt/apollo/data/galaxy/117811"
"/opt/apollo/data/galaxy/117885"
"/opt/apollo/data/galaxy/125281"
"/opt/apollo/data/galaxy/131198"
"/opt/apollo/data/galaxy/150697"

We had an emergency scenario where we discovered that we were dropping tracks during the re-build of data directories for 20 of our organisms. This was the incident that initially prompted adding the lineage files data. With the lineage files in hand, after patching the bug in our build process, we could simply copy the files forward from the first data dir build to the latest one.

Historical Data Tracks

As an experiment, we've build a script history.sh which works through the history and updates a JBrowse data directory with each version. This can then be loaded into a JBrowse instance or an Apollo instance through a simple include in the trackList.json.

Here you can see when terminators were added and various rename events that took place over time:

Another example showing the progression of feature renames over time:

Setting Up Your Own Backup

If you want to use this repo as a basis for backing up an apollo instance, great! First: fork it. You will need to have your own copy of the repo as the main build script commits data to it automatically

Then, we recommend running it in Jenkins, under an account with commit privileges to your repo. Over time, this repo will fill up with all of your changes and you'll be able to go backwards and forwards in time for all facets of annotation, not just structural model changes.

License

GPLv3

(Remember that you're running it on your own server and probably not distributing it to anyone, thus there's no need to share if you don't want to.)

Support

This material is based upon work supported by the National Science Foundation under Grant Number (Award 1565146)

apollo-git-backup's People

Contributors

hexylena avatar

Stargazers

 avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.