Rala

Layout module for raw de novo DNA assembly of long uncorrected reads.

Description

Rala is intended as a standalone layout module for assembling raw reads generated by third-generation sequencing. It trims sequence adapters, purges chimeric sequences and removes repeat-induced overlaps by examining sequence pile-o-grams. Afterwards, an assembly graph is built and simplified in the standard way, i.e. by applying transitive reduction, tip removal and bubble popping. Leftover tangles are resolved by laying out the graph in a 2D plane with the force-directed placement algorithm and removing long edges (see figures below).
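To make the pile-o-gram idea concrete, here is a minimal Python sketch. It assumes a pile-o-gram is the per-base coverage of one read by the overlaps that land on it, and it flags a low-coverage pit flanked by well-covered regions as a possible chimeric junction. The function names, the half-open interval convention and the `low`/`high` thresholds are illustrative assumptions, not Rala's actual implementation.

```python
from typing import List, Tuple


def pile_o_gram(read_length: int, overlaps: List[Tuple[int, int]]) -> List[int]:
    """Per-base coverage of a read by overlaps given as half-open [begin, end) spans."""
    diff = [0] * (read_length + 1)
    for begin, end in overlaps:
        # Clamp each overlap to the read so malformed input cannot index out of range.
        begin = max(0, begin)
        end = min(read_length, end)
        if begin < end:
            diff[begin] += 1
            diff[end] -= 1
    coverage = []
    depth = 0
    for delta in diff[:read_length]:
        depth += delta
        coverage.append(depth)
    return coverage


def coverage_pits(coverage: List[int], low: int = 1, high: int = 3) -> List[Tuple[int, int]]:
    """Return [begin, end) runs with coverage <= low that have coverage >= high on both sides.

    Hypothetical heuristic: such a pit suggests that no overlap spans the region,
    which is what a chimeric junction would look like.
    """
    pits = []
    n = len(coverage)
    i = 0
    while i < n:
        if coverage[i] <= low:
            j = i
            while j < n and coverage[j] <= low:
                j += 1
            left_ok = i > 0 and max(coverage[:i]) >= high
            right_ok = j < n and max(coverage[j:]) >= high
            if left_ok and right_ok:
                pits.append((i, j))
            i = j
        else:
            i += 1
    return pits
```

Calling `coverage_pits(pile_o_gram(length, spans))` on a read whose overlaps all stop short of one interior region returns that region, while a read covered end to end yields an empty list.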

Rala takes as input two files: sequences in FASTA/FASTQ format and overlaps between them in PAF/MHAP format. Both input files can be compressed with gzip. Output is a set of contigs in FASTA format.
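For readers who want to inspect the overlap input, the following Python sketch reads the twelve mandatory PAF columns from a plain or gzip-compressed file. The `PafRecord` type and the function names are hypothetical, and choosing the reader by the `.gz` suffix is an assumption made for brevity; it says nothing about how Rala itself detects compression.

```python
import gzip
from typing import Iterator, NamedTuple


class PafRecord(NamedTuple):
    """The twelve mandatory PAF columns; optional tags are ignored."""
    query_name: str
    query_length: int
    query_begin: int
    query_end: int
    strand: str
    target_name: str
    target_length: int
    target_begin: int
    target_end: int
    matches: int
    block_length: int
    mapping_quality: int


def parse_paf_line(line: str) -> PafRecord:
    """Parse one tab-separated PAF line into a PafRecord."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("PAF line has fewer than 12 columns")
    return PafRecord(
        fields[0], int(fields[1]), int(fields[2]), int(fields[3]), fields[4],
        fields[5], int(fields[6]), int(fields[7]), int(fields[8]),
        int(fields[9]), int(fields[10]), int(fields[11]),
    )


def read_paf(path: str) -> Iterator[PafRecord]:
    """Yield records from a PAF file, opening it with gzip if the name ends in .gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_paf_line(line)
```

Each record exposes the query and target coordinates by name, so a loop such as `for rec in read_paf("overlaps.paf")` is enough to collect the overlap spans per read.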

For best results, Rala should be run twice with minimap (or minimap2) to improve repeat annotation, which is affected by k-mer filtering. A sample run is shown below (and can be found in misc/raven.sh alongside the minimap2 version):

minimap -t <threads> -L100 -Sw5 -m0 <sequences> <sequences> > overlaps.paf

rala -t <threads> -p <sequences> overlaps.paf > uncontained.fasta

minimap -t <threads> -L100 -w5 -m0 -f0.00001 uncontained.fasta <sequences> > uncontained.paf

rala -t <threads> -s uncontained.paf <sequences> overlaps.paf > <layout>
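The four commands above can be scripted. The Python sketch below only assembles the same command lines and redirects each one's standard output to the file named in the sample run; the function names and the `Step` type are hypothetical, the flags are copied verbatim from the commands above, and `minimap` and `rala` are assumed to be on the PATH.

```python
import subprocess
from typing import List, Tuple

# One step is (argv, path of the file that receives the command's stdout).
Step = Tuple[List[str], str]


def two_pass_plan(sequences: str, layout: str, threads: int = 1) -> List[Step]:
    """Build the two-pass minimap/rala run as a list of steps, without executing it."""
    t = str(threads)
    return [
        (["minimap", "-t", t, "-L100", "-Sw5", "-m0", sequences, sequences],
         "overlaps.paf"),
        (["rala", "-t", t, "-p", sequences, "overlaps.paf"],
         "uncontained.fasta"),
        (["minimap", "-t", t, "-L100", "-w5", "-m0", "-f0.00001",
          "uncontained.fasta", sequences],
         "uncontained.paf"),
        (["rala", "-t", t, "-s", "uncontained.paf", sequences, "overlaps.paf"],
         layout),
    ]


def run_plan(plan: List[Step]) -> None:
    """Run each step in order, stopping at the first command that fails."""
    for argv, out_path in plan:
        with open(out_path, "w") as out:
            subprocess.run(argv, stdout=out, check=True)
```

For example, `run_plan(two_pass_plan("reads.fastq", "layout.fasta", threads=8))` would execute the whole sequence. Keeping plan construction separate from execution makes it easy to print and check the commands before running them.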

Pile-o-gram of a chimeric read

Pile-o-gram of two reads which overlap on a repetitive region

Force directed layout of an assembly graph with highlighted false overlaps

Dependencies

  1. gcc 4.8+ or clang 3.4+
  2. cmake 3.2+

Installation

To install Rala, run the following commands:

git clone --recursive https://github.com/rvaser/rala.git rala
cd rala
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make

After successful installation, an executable named rala will appear in build/bin.

Optionally, you can run sudo make install to install the rala executable on your machine.

Note: if you omitted --recursive from git clone, run git submodule update --init --recursive before proceeding with compilation.

Usage

Usage of rala is as follows:

rala [options ...] <sequences> <overlaps>

    <sequences>
        input file in FASTA/FASTQ format (can be compressed with gzip)
        containing sequences
    <overlaps>
        input file in MHAP/PAF format (can be compressed with gzip)
        containing pairwise overlaps

    options:
        -p, --preconstruct
            print uncontained sequences for second iteration
        -s, --sensitive-overlaps <file>
            input file in MHAP/PAF format (can be compressed with gzip)
            containing more sensitive overlaps
        -u, --include-unassembled
            output unassembled sequences (singletons and short contigs)
        -d, --debug <string>
            enable debug output with given prefix
        -t, --threads <int>
            default: 1
            number of threads
        --version
            prints the version number
        -h, --help
            prints the usage

Contact information

For additional information, help and bug reports please send an email to: [email protected].

Acknowledgement

This work has been supported in part by the Croatian Science Foundation under the project UIP-11-2013-7353.
