Giter Club home page Giter Club logo

clairvoyante_demo's Introduction

Clairvoyante demo on DNAnexus

The repo contains code for setting up running Clairvoyante on DNAnexus platform. While it is designed to help deploy workflows and applets on DNAnexus platform, the configuration could be also useful for people who like to try Clairvoyante on their own computer.

If you have an DNAnexus account and familar with dx-toolkit, you can login with dx login and select the public project "clairvoyante_dnanexus_demo" by

dx select clairvoyante_dnanexus_demo

or

dx select project-FG1YJ6Q08GQ1bPV084096XZX

You can see the content of the project by

$ dx ls
BGI_SEQ_NA12878_DATA/
BGI_SEQ_NA12878_training_results/
Illumina_HG001_training_results/
ONT_HG001_training_results/
PacBio_HG001_training_results/
applets/
cv_data/
data_from_the_preprint/
_clairvoyante_jupyter_demo
cv_variant_call
gpu_asset
notebook_gpu_asset

Clone project

You can create a new project and clone the data inside clairvoyante_dnanexus_demo to test it out. For example, to clone the project to a new project called cv_test, you can do

dx new project cv_test
dx cp clairvoyante_dnanexus_demo:/ cv_test:/

Useful job scripts for going through building training dataset to calling variants

In the job_scripts directory, there are some pre-define job submission scripts that you can try out. For example, to train a model with Illumina's NA12878 data:

  1. To generate training data
get_illumina_train_data.sh
  1. Wait for the job to finsh. it takes about 3.5 hours. After the data is prepared, we can run train the model:
illumina_training.sh

The result will be in Illumina_HG001_training_results. Check the model_training.log file to find the best validation loss batch number from line like

Best validation loss at batch: 57
  1. To call variants Edit the MODLE_PREFIX environment variable in the get_illumina_variant_calls.sh to use the model from the specific bath. For example
MODLE_PREFIX=cv_model-000057

Then you can run get_illumina_variant_calls.sh to generate variant calls for HG002.

You can take a look at those script and modify for other data.

get_illumina_variant_calls.sh

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.