age's Introduction

README file for AGE software distribution

ABOUT
AGE implements optimal algorithms for aligning genomic sequences with
structural variations (SVs). Precise alignment allows breakpoints to be
determined correctly. The algorithms are described in the publication
Bioinformatics. 2011 Mar 1;27(5):595-603. Epub 2011 Jan 13. Additional
information on the importance of breakpoints can be found in
* Nat Biotechnol. 2010 Jan;28(1):47-55. Epub 2009 Dec 27.
* Nature. 2011 Feb 3;470(7332):59-65.

The software runs on Linux-based systems.


1. Compilation
==============

$ make

or, without parallel (OpenMP) support,

$ make OMP=no

whichever works on your system.


2. Running
==========

$ ./age_align file1.fa file2.fa

The input files must be in FASTA format and can contain multiple sequences.
The output contains an alignment for each pair of sequences in which the first
sequence comes from the first file and the second sequence from the second file.
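
The pairing described above can be illustrated with a short Python sketch. It
reads the sentence as "every sequence of the first file against every sequence
of the second file". The sequence names and the helper read_fasta_names are
hypothetical, and the sketch only lists the pairs; it does not reproduce how
AGE itself reads its input.

```python
from itertools import product


def read_fasta_names(lines):
    """Return the sequence names (first word after '>') in file order."""
    names = []
    for line in lines:
        if line.startswith(">"):
            words = line[1:].split()
            names.append(words[0] if words else "")
    return names


# Hypothetical contents of the two input files, one list entry per line.
file1 = [">chr12", "ACGTACGT", ">chr13", "GGCCGGCC"]
file2 = [">contig_a", "ACGAACGT", ">contig_b", "GGCAGGCC"]

# One alignment is expected per (first file, second file) pair.
pairs = list(product(read_fasta_names(file1), read_fasta_names(file2)))
for first, second in pairs:
    print(first, "vs", second)
```

With two sequences in each file, this lists four pairs.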

When aligning a long read or an assembled contig to a chromosome, the options
-coor1 and -coor2 are useful. They restrict the alignment to the specified
region(s) of the input sequences. For example, given a predicted deletion in
the region chr12:11396601-11436500 and an assembled contig for the allele
containing this deletion, one may use the following command

./age_align chr12.fa contig.fa -coor1=11395601-11437500

i.e. align the contig to the predicted region extended by 1 kb upstream and
downstream.
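
The arithmetic behind the extended region can be sketched in Python. The
function name padded_coor is hypothetical, and clamping the start at 1 assumes
1-based coordinates; the sketch does not check the end against the chromosome
length.

```python
def padded_coor(start, end, pad=1000):
    """Extend a region by `pad` bases on both sides and format it as start-end.

    Assumption: coordinates are 1-based, so the start is clamped at 1.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    padded_start = max(1, start - pad)
    padded_end = end + pad
    return f"{padded_start}-{padded_end}"


# Predicted deletion region from the example above, extended by 1 kb.
option = "-coor1=" + padded_coor(11396601, 11436500)
print(option)  # -coor1=11395601-11437500
```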

Insertions and deletions are penalized with the affine gap model. The gap
penalty function G(i) for a gap of length i is defined as G(i) = go + (ge x i),
where go is the gap open penalty and ge is the gap extend penalty. Note that
for a gap of length 1, G(1) = go + ge.
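
A minimal Python sketch of this penalty function follows. The values go = -10
and ge = -1 are illustrative assumptions only, chosen to be negative as the
-go and -ge options require; they are not claimed to be the program's defaults.

```python
def gap_penalty(length, go, ge):
    """Affine gap penalty G(i) = go + ge * i for a gap of `length` bases."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return go + ge * length


# Illustrative penalties (assumed values, both negative).
GO, GE = -10, -1

single = gap_penalty(1, GO, GE)     # go + ge
long_gap = gap_penalty(5, GO, GE)   # one gap of five bases
five_gaps = 5 * single              # five separate one-base gaps

print(single, long_gap, five_gaps)  # -11 -15 -55
```

Under an affine model, one long gap costs less than several short gaps with
the same total length, because the open penalty is paid only once.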

Help:
$ ./age_align


Options:

-indel			assume deletion or insertion (default)
-tdup			assume tandem duplication
-invl			assume inversion with contig (second sequence)
			spanning over left breakpoint
-invr			assume inversion with contig (second sequence)
			spanning over right breakpoint
-inv			assume inversion; tries alignment over the left and
			right breakpoints; report the best alignment
-coor1=start-end	align subsequence of first sequence defined by given
			coordinates
-coor2=start-end	align subsequence of second sequence defined by given
			coordinates
-revcom1		align reverse complement of first sequence
-revcom2		align reverse complement of second sequence
-both			align first sequence to second one and its reverse
			complement; report the best alignment
-match			score for nucleotide match
-mismatch		score for nucleotide mismatch
-go=value		gap open penalty, negative value
-ge=value		gap extend penalty, negative value
-allpos			always display boundary positions
-berg			align the sequences using Hirschberg's linear memory algorithm
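
For readers unfamiliar with the term used by -revcom1, -revcom2 and -both, the
reverse complement of a DNA sequence can be sketched in Python as follows. This
is a generic illustration, not the program's own code; the function name and
the handling of N and lowercase bases are assumptions.

```python
# Translation table for complementing bases; N and lowercase are kept as-is
# in kind (N -> N, a -> t, ...). Other characters pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("AAGCT"))  # AGCTT
```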

Examples:

./age_align -coor1=20-2350    file1.fa file2.fa
./age_align -coor1=20-        file1.fa file2.fa
./age_align -coor2=-240       file1.fa file2.fa
./age_align -revcom1 -revcom2 file1.fa file2.fa
./age_align -both             file1.fa file2.fa
./age_align -inv  -both       file1.fa file2.fa
./age_align -tdup -both       file1.fa file2.fa
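
The first three examples use full and open-ended coordinate ranges. The Python
sketch below shows one plausible reading of that syntax. It assumes that a
missing start means position 1 and a missing end means the last base of the
sequence; that interpretation, the 1-based coordinates, and the function name
parse_coor are assumptions rather than documented behavior.

```python
def parse_coor(value, seq_len):
    """Parse 'start-end', 'start-' or '-end' into a (start, end) tuple.

    Assumptions: coordinates are 1-based and inclusive, an omitted start
    defaults to 1, and an omitted end defaults to the sequence length.
    """
    start_text, sep, end_text = value.partition("-")
    if not sep:
        raise ValueError("expected a range of the form start-end")
    start = int(start_text) if start_text else 1
    end = int(end_text) if end_text else seq_len
    if not 1 <= start <= end <= seq_len:
        raise ValueError("range lies outside the sequence")
    return start, end


# Ranges from the examples above, for a hypothetical 5000-base sequence.
print(parse_coor("20-2350", 5000))  # (20, 2350)
print(parse_coor("20-", 5000))      # (20, 5000)
print(parse_coor("-240", 5000))     # (1, 240)
```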





Please send your comments and suggestions to [email protected].

age's Issues

Error

Hi
I get the following error when submitting directly on the command line on our HPC server.
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted

This error also leads me to believe that the program will hang if it doesn't have enough memory, or that it cannot run on such large files.

Is this due to the file size being too large? The program will run with large files of the same sequence but crashes if I try to run it on smaller 10 MB chunks of the reference sequence. Looking around has led me to believe that this is a memory issue. Is this correct?

Ideally, I want to look for breakpoints across the entire genome. Is this possible using AGE?

Thank you!
