adna-workflow's Introduction

This is the software used to run the Reich Lab ancient DNA workflows.

Workflows are written in the Workflow Description Language (WDL) and are run using the Broad Institute Cromwell workflow tool.
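
As a rough illustration of launching a WDL workflow with Cromwell in `run` mode, the sketch below only assembles the command line. The jar path, inputs file name, and config file name are hypothetical placeholders, not values taken from this repository; a SLURM setup would normally supply its own Cromwell backend configuration.

```python
from typing import List, Optional


def cromwell_run_command(
    wdl_path: str,
    inputs_json: str,
    cromwell_jar: str = "cromwell.jar",  # hypothetical jar location
    config_file: Optional[str] = None,   # hypothetical backend config (e.g. for SLURM)
) -> List[str]:
    """Build the argument list for a Cromwell 'run' invocation."""
    cmd = ["java"]
    if config_file is not None:
        # Cromwell reads its backend configuration from this JVM property.
        cmd.append(f"-Dconfig.file={config_file}")
    cmd += ["-jar", cromwell_jar, "run", wdl_path, "--inputs", inputs_json]
    return cmd


if __name__ == "__main__":
    # Print the command rather than executing it; pass the list to
    # subprocess.run(...) to actually launch the workflow.
    print(" ".join(cromwell_run_command("demultiplex.wdl", "demultiplex_inputs.json")))
```

Returning the command as a list keeps the sketch free of side effects and avoids shell quoting issues if it is later handed to `subprocess.run`.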

Workflows are set up to run on the Harvard Medical School O2 SLURM cluster. Running on other platforms may require modifications.

The workflows interact with a Django website and database to track the results of samples across multiple sequencing runs.

The workflows use numerous external programs:

Workflows are set up to run on the scratch (temporary) filesystem, then copy permanent results to the group filesystem.
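
The scratch-then-copy pattern can be sketched as follows. This is a minimal illustration, not the workflows' actual copy step: the function name `publish_results`, the directory arguments, and the default `*.bam` pattern are all assumptions made for the example.

```python
import shutil
from pathlib import Path
from typing import Iterable, List, Union

PathLike = Union[str, Path]


def publish_results(
    scratch_dir: PathLike,
    permanent_dir: PathLike,
    patterns: Iterable[str] = ("*.bam",),  # assumed: which outputs count as permanent
) -> List[Path]:
    """Copy files matching the patterns from scratch to permanent storage."""
    scratch = Path(scratch_dir)
    permanent = Path(permanent_dir)
    permanent.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in patterns:
        for src in sorted(scratch.glob(pattern)):
            if src.is_file():
                dst = permanent / src.name
                shutil.copy2(src, dst)  # copy2 also preserves timestamps
                copied.append(dst)
    return copied
```

Files that do not match a pattern (intermediate outputs, logs) stay on scratch, where the cluster's cleanup policy can remove them.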

  • demultiplex.wdl - This takes an Illumina sequencer output directory as input and outputs a series of bams named by the index and barcode. This is run once per sequencing run.

    These bams follow the naming pattern [i5 index]_[i7 index]_[p5 barcode]_[p7 barcode], and are stored on the permanent filesystem.

    Paired-end reads are merged into single-end reads, requiring some minimum overlap and allowing for some mismatch depending on base quality scores. Adapters are trimmed during merging.

    There are two sets of bams, one aligned to the whole human genome reference hg19, and one aligned to the mitochondrial Reconstructed Sapiens Reference Sequence (RSRS). Bams are filtered to include only reads aligning to the reference.

  • analysis.wdl - This calculates a number of metrics for each bam, on both the Reich Lab set of ~1240k nuclear targets and the MT data. It builds bams for each sample based on prior sequencing runs.

  • release_and_pulldown.wdl - This builds release versions of bams with one read group per flowcell lane, then runs pulldown to generate pseudo-haploid genotype data.
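
The demultiplexed bam naming pattern described above, [i5 index]_[i7 index]_[p5 barcode]_[p7 barcode], can be split back into its four fields. The sketch below assumes a `.bam` extension and that no field contains an underscore; neither assumption is confirmed by this README, and the `BamKey` type and example values are hypothetical.

```python
from typing import NamedTuple


class BamKey(NamedTuple):
    """The four fields that identify a demultiplexed bam."""
    i5_index: str
    i7_index: str
    p5_barcode: str
    p7_barcode: str


def parse_bam_name(filename: str) -> BamKey:
    """Split '[i5]_[i7]_[p5]_[p7].bam' into its index and barcode fields."""
    stem = filename[: -len(".bam")] if filename.endswith(".bam") else filename
    parts = stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated fields, got {len(parts)}: {filename!r}")
    return BamKey(*parts)


if __name__ == "__main__":
    # Made-up index and barcode labels, for illustration only.
    key = parse_bam_name("i5A_i7B_p5C_p7D.bam")
    print(key.i5_index, key.p7_barcode)
```

A named tuple keeps the four fields ordered and hashable, so the key can be used directly as a dictionary key when grouping bams by library.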

Contributors

matthewmah, adammicco
