alualu's Introduction

PAIR2: efficient and powerful software to detect polymorphic Alu elements in high-throughput sequencing data.
NB: not yet published. Please contact me for usage.

configFile: indicates the paths of the input/output files and the parameters used by the program.
@p: denotes the parameter 'p' given in configFile.

###################################

run alu_deletion (make opt3/alu_delete)
 (1) run "alu_delete configFile preprocess"
     sorts the Alu position file and creates the output folders

 (2) run "alu_delete configFile write_tmps_pn X" 
     X is the index of an individual in @file_pn. E.g. for 2000 individuals, 2000 jobs can run in parallel, with X ranging from 0 to 1999.
     At 30X sequencing depth this takes about 5 hrs per individual (very rarely it may exceed 12 hrs).

 (3) run "alu_delete configFile write_vcf_pns"
     make big vcf files     
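
The three alu_delete stages above can be sketched as a small job planner. This is only an illustration: the binary path `opt3/alu_delete` is assumed from the make target, the function name `alu_delete_commands` is hypothetical, and the sketch only builds the command lines (it does not run them or submit them to a scheduler).

```python
def alu_delete_commands(config_file, n_individuals, binary="opt3/alu_delete"):
    """Build the command lines for the three alu_delete stages.

    Returns a list of stages; each stage is a list of argv lists.
    Commands within one stage may run in parallel, but a stage should
    finish before the next one starts.
    """
    # Stage 1: a single preprocessing job.
    preprocess = [[binary, config_file, "preprocess"]]
    # Stage 2: one job per individual, X = 0 .. n_individuals - 1.
    per_individual = [
        [binary, config_file, "write_tmps_pn", str(x)]
        for x in range(n_individuals)
    ]
    # Stage 3: a single job that writes the vcf files.
    write_vcf = [[binary, config_file, "write_vcf_pns"]]
    return [preprocess, per_individual, write_vcf]


if __name__ == "__main__":
    stages = alu_delete_commands("configFile", 2000)
    for stage in stages:
        # Print the first command of each stage as a quick check.
        print(len(stage), "job(s), e.g.:", " ".join(stage[0]))
```

Each argv list can be handed to a job scheduler or to `subprocess.run`; keeping the stages separate makes it explicit that step (2) is the only part that fans out over individuals.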

###################################

run alu_insertion (make opt3/alu_insert)
  Step (2) takes a few hours to run; each of the other steps takes < 60 min. Less than 4G of memory is enough for 99.9% of samples.

 (1) run "alu_insert configFile preprocess"

 (2) run "alu_insert configFile write_tmps_pn X", where X is the index of an individual.

 (3) make a file, @file_pn_used
     It contains a subset of the individuals in @file_pn; individuals that failed step (2) or that have a large number of discordant reads are excluded.
     E.g. with 2000 individuals, up to 10% can be removed because handling their discordant reads takes too much time.

 (4) run "alu_insert configFile combine_pos_pns"
     which generates files named @file_alu_insert1/insert_pos.chr*

 (5) run "alu_insert configFile clipReads_pn X" for each pn
     writes the positions of clip reads to @file_alu_insert1/clip/chr*/pn
     X is the index of an individual; if the corresponding name does not exist in @file_pn_used, the program will quit

 (6) run "alu_insert configFile clipReads_pns chr*" for each chromosome (chr1 - chrX)
     writes insert_alu1/clip/chr*_pos/*
     and generates the files insert_alu1/clip/chr*.clip_pn and insert_alu1/clip/chr*.clip_region

 (7) run "alu_insert configFile write_tmp0_pn X" for each pn
     which writes the Alu and clip reads for each individual to insert_alu1/cons/chr*/*

 (8) run "alu_insert configFile clipPos_pns chr*" for each chromosome (chr1 - chrX)
     writes the exact insertion breakpoints to insert_alu1/cons/chr*_clip_pass

 (9) run "alu_insert configFile write_tmp2_pn X" for each pn
     genotype calling      

 (10) run "alu_insert configFile fixed_vcf_pns" 
     final vcf file
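
The ordering of steps (4) to (10) can be sketched as follows. Several things here are assumptions rather than facts from the program: the binary path `opt3/alu_insert` (taken from the make target), 0-based indices into @file_pn (as in the alu_delete example), chromosomes named chr1 to chr22 plus chrX, and the helper names `used_indices` and `alu_insert_commands`, which are hypothetical. The sketch only builds command lines; it does not run anything.

```python
# Assumed chromosome names: chr1 .. chr22 and chrX.
CHROMOSOMES = [f"chr{i}" for i in range(1, 23)] + ["chrX"]

# Steps (4)-(10) in order, with the kind of argument each one takes.
STEPS = [
    ("combine_pos_pns", "once"),
    ("clipReads_pn", "individual"),
    ("clipReads_pns", "chromosome"),
    ("write_tmp0_pn", "individual"),
    ("clipPos_pns", "chromosome"),
    ("write_tmp2_pn", "individual"),
    ("fixed_vcf_pns", "once"),
]


def used_indices(all_names, used_names):
    """Indices (positions in @file_pn) of the individuals kept in @file_pn_used."""
    used = set(used_names)
    return [i for i, name in enumerate(all_names) if name in used]


def alu_insert_commands(config_file, all_names, used_names,
                        binary="opt3/alu_insert"):
    """Build one list of argv lists per step, for steps (4)-(10)."""
    indices = used_indices(all_names, used_names)
    stages = []
    for command, scope in STEPS:
        if scope == "once":
            extra_args = [[]]
        elif scope == "individual":
            # Skip excluded individuals up front, since the program quits
            # for names that are missing from @file_pn_used.
            extra_args = [[str(i)] for i in indices]
        else:
            extra_args = [[chrom] for chrom in CHROMOSOMES]
        stages.append([[binary, config_file, command] + a for a in extra_args])
    return stages


if __name__ == "__main__":
    all_names = ["pn_a", "pn_b", "pn_c", "pn_d"]  # hypothetical @file_pn content
    used_names = ["pn_a", "pn_c", "pn_d"]         # hypothetical @file_pn_used content
    for stage in alu_insert_commands("configFile", all_names, used_names):
        print(len(stage), "job(s), e.g.:", " ".join(stage[0]))
```

Computing the indices from @file_pn rather than from @file_pn_used keeps X consistent with step (2), where X refers to the position of the individual in @file_pn.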
