GLAPD (whole-genome-based LAMP primer design for a set of target genomes) is a customizable system for designing LAMP primer sets.
######
Introduction:
  LAMP (loop-mediated isothermal amplification) is a simple and effective method for amplifying DNA sequences.
  One LAMP primer set contains four primers built from six primer regions in the genome. The six regions are F3, F2, F1c, B1c, B2 and B3. The sequences from F1c and F2 are joined into the primer FIP, and the sequences from B1c and B2 are joined into the primer BIP. To accelerate the amplification, two additional loop primers (LF, LB) can be added.

  GLAPD designs LAMP primer sets based on whole genomes. It can design primers that are common to a set of target genomes and specific with respect to a set of background genomes, that is, primers that are not expected to amplify the background genomes.
  First, GLAPD identifies all candidate single primer regions. Second, those single primers are aligned to the target genomes and the background genomes. Third, GLAPD combines candidate single primers into LAMP primer sets. Finally, the commonality and specificity of each LAMP primer set are checked using the information from the alignment.
######
SYSTEM REQUIREMENTS:
GLAPD currently runs on the Linux operating system and requires perl and gcc. To run the GPU version, the compute capability of the GPU must be >= 2.0.
######
OTHER SOFTWARE THAT MAY BE NEEDED:
Bowtie, which can be downloaded from http://bowtie-bio.sourceforge.net/index.shtml
CUDA driver, which can be downloaded from http://www.nvidia.com
######
Files and Directories:
This system is packaged in one file, "GLAPD.tar.gz", which can be downloaded from http://cgm.sjtu.edu.cn/GLAPD/ and https://github.com/jiqingxiaoxi/GLAPD2.git. Important files inside this tar.gz file are listed here. 

|-Par/
     tm_nn_parameter.txt	Parameters for calculating Tm.
     stab_parameter.txt		Parameters for calculating the stability of single primers.
     *.db, *.ds			Sixteen files for calculating the secondary structure of primers.
|-par.pl			Get the information of positions and mismatches for each single primer.
|-single.c			The CPU version of GLAPD for identifying single primer regions.
|-LAMP.c			The CPU version of GLAPD for designing LAMP primer sets.
|-Makefile			Makefile for CPU version.
|-GPU/
     single.cu			The GPU version of GLAPD for identifying single primer regions.
     LAMP.cu			The GPU version of GLAPD for designing LAMP primer sets.
     Makefile			Makefile for GPU version.
|-bowtie/
     bowtie			The bowtie program
     bowtie-build		The bowtie-build program
|-example/
     example.fa			Sequences as example, from "NC_002951.2 Staphylococcus aureus subsp. aureus COL chromosome".
     target-list.txt		The list of names from target genomes. It contains 3 strains of S. aureus.
     background-list.txt	The list of names from background genomes. It contains 3 strains of other bacteria.
     index.*			The index files for Bowtie software.
######
INSTALLATION:
  tar -zxvf GLAPD.tar.gz
If you want to use the CPU version:
  cd GLAPD/
  make
If you want to use the GPU version:
  cd GLAPD/GPU/
  make
######
QUICK START:
1. To design LAMP primers for a sequence without considering commonality or specificity:
  cd GLAPD/
  ./Single -in example/example.fa -out Test
    (Two files "Inner/Test" and "Outer/Test" are created.)
  ./LAMP -in Test -ref example/example.fa -out success.txt
    (Ten successfully designed LAMP primer sets are stored in the "success.txt" file.)

2. To design common LAMP primers without considering specificity:
  cd GLAPD/
  ./Single -in example/example.fa -out Test
    (Two files "Inner/Test" and "Outer/Test" are created.)
  perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --common example/target-list.txt
    (Three files "Inner/Test-common_list.txt", "Inner/Test-common.txt" and "Outer/Test-common.txt" are created.)
  ./LAMP -in Test -ref example/example.fa -out success.txt -common
    (Ten successfully designed LAMP primer sets are stored in the "success.txt" file.)

3. To design specific LAMP primers without considering commonality:
  cd GLAPD/
  ./Single -in example/example.fa -out Test
    (Two files "Inner/Test" and "Outer/Test" are created.)
  perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --specific example/background-list.txt
    (Two files "Inner/Test-specific.txt" and "Outer/Test-specific.txt" are created.)
  ./LAMP -in Test -ref example/example.fa -out success.txt -specific
    (Ten successfully designed LAMP primer sets are stored in the "success.txt" file.)

4. To design LAMP primers that are both common and specific:
  cd GLAPD/
  ./Single -in example/example.fa -out Test
    (Two files "Inner/Test" and "Outer/Test" are created.)
  perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --common example/target-list.txt --left
    (Five files "Inner/Test-common_list.txt", "Inner/Test-common.txt", "Inner/Test-specific.txt", "Outer/Test-common.txt" and "Outer/Test-specific.txt" are created.)
  ./LAMP -in Test -ref example/example.fa -out success.txt -common -specific
    (Ten successfully designed LAMP primer sets are stored in the "success.txt" file.)
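
The four workflows above differ only in the flags passed to par.pl and LAMP. As a minimal sketch, workflow 4 could be scripted in Python as shown here. The helper names build_commands and run_pipeline are hypothetical and not part of GLAPD. Only flags listed in this README are used, and the commands are assumed to be run from inside the GLAPD/ directory after "make".

```python
import subprocess


def build_commands(ref, prefix, bowtie, index, target_list, out):
    """Return the three commands of quick-start workflow 4 as argument lists.

    All flags are taken from this README; nothing is executed here.
    """
    return [
        # Step 1: identify candidate single primer regions.
        ["./Single", "-in", ref, "-out", prefix],
        # Step 2: align single primers; --left treats every genome in the
        # index that is not in the target list as a background genome.
        ["perl", "par.pl", "--in", prefix, "--ref", ref,
         "--bowtie", bowtie, "--index", index,
         "--common", target_list, "--left"],
        # Step 3: combine single primers into common and specific sets.
        ["./LAMP", "-in", prefix, "-ref", ref, "-out", out,
         "-common", "-specific"],
    ]


def run_pipeline(commands, glapd_dir="GLAPD"):
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=glapd_dir, check=True)
```

For example, build_commands("example/example.fa", "Test", "Bowtie_path/bowtie", "example/index", "example/target-list.txt", "success.txt") reproduces the three commands of workflow 4, which can then be passed to run_pipeline.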
######
RUN THE SYSTEM:
1. Identify candidate single primer regions:
Command:
  Single -in <ref_genome> -out <single_primers> [options]*

Arguments:
  -in <ref_genome>
    reference genome, FASTA format
  -out <single_primers>
    output the candidate single primers
  -dir <directory>
    the directory for output file
    default: current directory
  -loop
    identify candidate single primer regions for loop primers
  -check <int>
    whether to check the secondary structure of single primers
    0: don't check secondary structure; other values: check
    default: 1
  -par <par_directory>
    parameter files under the directory are used to check primers' secondary structure
    default: GLAPD/Par/
  -h/-help
    print usage

2. Align sequences from single primer regions (optional):
Command:
  perl par.pl --in <single_primers_file> --ref <ref_genome> --common[--specific] <genomes_list> --bowtie <bowtie> --index <database> [options]*

Arguments:
  --in <single_primers_file>
    the file name of the candidate single primer regions; these files are generated by the Single program
  --ref <ref_genome>
    reference genome, FASTA format
  --dir <directory>
    directory containing the files of candidate single primer regions
    default: current directory
  --loop
    include loop primers
  --common <genomes_list>
    the genomes in this file (target genomes) are expected to be amplified by the LAMP primer sets
  --specific <genomes_list>
    the genomes in this file (background genomes) are expected not to be amplified by the LAMP primer sets
  --left
    background_group = all_genome_in_database - target_group
    used with --common
    invalid if --specific is also given
  --bowtie <bowtie>
    the bowtie program
  --index <database>
    bowtie index file name(s), comma-separated
  --mis_s <int>
    the maximum number of mismatches allowed when aligning single primers to background genomes
    the value must be between 0 and 3; the larger the value, the more specific the primers
    default: 2
  --mis_c <int>
    the maximum number of mismatches allowed when aligning single primers to target genomes
    the value must be between 0 and 3; the smaller the value, the more common the primers
    default: 0
  --threads <int>
    number of threads to launch when aligning
    default: 1
  --help|--h
    print help information

3. Design LAMP primer sets:
Command:
  LAMP -in <single_primers_file> -ref <ref_genome> -out <LAMP_primer_sets> [options]*

Arguments:
  -in <single_primers_file>
    the file name of the candidate single primer regions; these files are generated by the Single program
  -ref <ref_genome>
    reference genome, FASTA format
  -dir <directory>
    the directory for output file
    default: current directory
  -out <LAMP_primer_sets>
    output successfully designed LAMP primer sets
  -num <int>
    the expected output number of LAMP primer sets
    default: 10
  -loop
    design LAMP primer sets with loop primers
  -common
    design common LAMP primer sets that can amplify more than one target genome
  -specific
    design specific LAMP primer sets that cannot amplify any background genome
  -check <int>
    whether to check the tendency of primers within one LAMP primer set to bind to each other
    0: don't check; other values: check
    default: 1
  -par <par_directory>
    parameter files under the directory are used to check primers' binding tendency
    default: GLAPD/Par/
  -fast
    fast mode for designing LAMP primer sets; in this mode GLAPD may miss some valid results
  -h/-help
    print usage
######
INPUT FILE FORMAT:
1) The reference genome must be in FASTA format.
2) The common list file and the specific list file used in step 2 must contain one genome name per line. They can be generated by taking the sequence names directly from the target or background genomes.
3) The bowtie index can be generated with the "bowtie-build" command. More details are available at http://bowtie-bio.sourceforge.net/tutorial.shtml.
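
As a minimal sketch of point 2, a list file could be produced from a FASTA file with a few lines of Python. The function name fasta_names is hypothetical. The sketch assumes that the genome name is the first whitespace-delimited token of each header line (for example "NC_002951.2"); check this against the names used in your bowtie index before relying on it.

```python
def fasta_names(lines):
    """Collect one genome name per FASTA header line.

    Assumption: the name is the first whitespace-delimited token
    after the leading ">" of each header.
    """
    names = []
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                names.append(header.split()[0])
    return names


def write_list_file(fasta_path, list_path):
    """Write the names found in fasta_path to list_path, one per line."""
    with open(fasta_path) as fasta:
        names = fasta_names(fasta)
    with open(list_path, "w") as out:
        for name in names:
            out.write(name + "\n")
```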
OUTPUT FILE FORMAT:
1) In the files generated by the "Single" program, each line describes one candidate single primer. For example:
	"pos:3   length:25       +:2     -:0     61.89"
   Each line has five fields separated by tabs. From left to right, the fields are:
   1. The position of the single primer in reference genome (0-based)
   2. The length of the single primer
   3,4. Sum of all applicable flags. The "+" field applies to the primer taken from the plus strand of the reference genome and the "-" field to the primer taken from the minus strand. A value of "0" means the single primer is not taken from that strand. Flags are:
	1: this single primer can be used to amplify a target if the GC-content of target region is >=60%
	2: this single primer can be used to amplify a target if the GC-content of target region is <=45%
	4: this single primer can be used to amplify a target if the GC-content of target region is between 45% and 60%
   5. Tm
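
The field layout above can be read with a short parser. The following is a sketch only: the names SinglePrimer, parse_single_line and decode_flags are hypothetical, and the parser splits on any whitespace, so it accepts both tabs and the spaces shown in the example line.

```python
from typing import NamedTuple

# Flag values as documented above; each label is the GC-content range
# of the target region in which the primer may be used.
GC_FLAGS = {
    1: "GC >= 60%",
    2: "GC <= 45%",
    4: "GC between 45% and 60%",
}


class SinglePrimer(NamedTuple):
    pos: int       # 0-based position in the reference genome
    length: int    # primer length
    plus: int      # flag sum for the plus strand
    minus: int     # flag sum for the minus strand
    tm: float      # melting temperature


def parse_single_line(line):
    """Parse one line such as 'pos:3 length:25 +:2 -:0 61.89'."""
    pos, length, plus, minus, tm = line.split()
    return SinglePrimer(
        pos=int(pos.split(":")[1]),
        length=int(length.split(":")[1]),
        plus=int(plus.split(":")[1]),
        minus=int(minus.split(":")[1]),
        tm=float(tm),
    )


def decode_flags(value):
    """Expand a flag sum into the list of GC-content ranges it covers."""
    return [label for bit, label in GC_FLAGS.items() if value & bit]
```

For the example line above, the plus-strand value 2 decodes to the single range "GC <= 45%", and the minus-strand value 0 decodes to an empty list.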
2) In the files generated by "par.pl" ("XX-common.txt" and "XX-specific.txt"), each line stores one alignment. For example:
	"3       25      2       501407  1       0"
   Each line has six fields separated by tabs. From left to right, the fields are:
   1. The position of the single primer in reference genome (0-based)
   2. The length of the single primer
   3. The index of the genome within the target group or background group (0-based)
   4. The position of the alignment in this genome (field 3)
   5. "1" means this single primer can be used to amplify the genome (field 3) on the plus strand. "0" means it cannot.
   6. "1" means this single primer can be used to amplify the genome (field 3) on the minus strand. "0" means it cannot.
3) In the file generated by "par.pl" ("XX-common_list.txt"), each line contains one target genome. For example:
	"NC_002951.2     0"
   Each line has two fields separated by a tab. From left to right, the fields are:
   1. The name of target genome
   2. The index of the target genome within the target group (0-based)
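
The two par.pl formats can be combined to translate the genome index in an alignment line back into a genome name. The sketch below illustrates this under stated assumptions: Alignment, parse_alignment_line and parse_common_list are hypothetical names, and fields are split on any whitespace.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    pos: int        # 0-based primer position in the reference genome
    length: int     # primer length
    genome: int     # 0-based genome index within its group
    hit_pos: int    # alignment position in that genome
    plus_ok: bool   # usable on the plus strand
    minus_ok: bool  # usable on the minus strand


def parse_alignment_line(line):
    """Parse one line of XX-common.txt or XX-specific.txt."""
    pos, length, genome, hit_pos, plus, minus = (int(x) for x in line.split())
    return Alignment(pos, length, genome, hit_pos, plus == 1, minus == 1)


def parse_common_list(lines):
    """Map genome index -> genome name from XX-common_list.txt lines."""
    names = {}
    for line in lines:
        if line.strip():
            name, index = line.split()
            names[int(index)] = name
    return names
```

With both parsed, names.get(alignment.genome) returns the target genome name for an alignment from "XX-common.txt", or None if the index is not listed.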
4) The LAMP primer set is stored in the file generated by "LAMP" program, for example:
	"The 1 LAMP primers:
	  F3: pos:36,length:18 bp, primer(5'-3'):CGGTTCCCTGTACTCGAA
	  F2: pos:74,length:19 bp, primer(5'-3'):AATTCCTTTGTTGAGGCCG
	  F1c: pos:115,length:24 bp, primer(5'-3'):CGAAATCTTCAAACACTACGTGCT
	  B1c: pos:175,length:22 bp, primer(5'-3'):CCTGACGGAAGCAGCATTAAGT
	  B2: pos:237,length:19 bp, primer(5'-3'):CGAACGTAACCAAAGTCGT
	  B3: pos:276,length:23 bp, primer(5'-3'):TAAAAAATAAAAAACCGTGCACC
	  This set of LAMP primers could be used in 3 genomes, there are: NC_002951.2, NC_017340.1, NC_002745.2"
   One LAMP primer set contains at least six single primers. The position in the reference genome, the length and the sequence of each single primer are listed in this file. When common primers are designed, the target genomes that can be amplified by the primer set are also listed.
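
A primer set in this format can be read back and the inner primers assembled as described in the introduction (FIP from F1c and F2, BIP from B1c and B2). The following is a sketch, not GLAPD code: parse_primer_set and inner_primers are hypothetical names, and the sketch assumes the sequences in the file are already in primer orientation (5'-3'), so that FIP is F1c followed by F2 and BIP is B1c followed by B2.

```python
import re

# Matches lines such as:
#   F3: pos:36,length:18 bp, primer(5'-3'):CGGTTCCCTGTACTCGAA
PRIMER_RE = re.compile(
    r"^\s*(\w+):\s*pos:(\d+),length:(\d+)\s*bp,\s*primer\(5'-3'\):([A-Za-z]+)\s*$"
)


def parse_primer_set(lines):
    """Return {primer name: {'pos', 'length', 'seq'}} for one primer set."""
    primers = {}
    for line in lines:
        match = PRIMER_RE.match(line)
        if match:
            name, pos, length, seq = match.groups()
            primers[name] = {"pos": int(pos), "length": int(length), "seq": seq}
    return primers


def inner_primers(primers):
    """Join F1c+F2 into FIP and B1c+B2 into BIP (orientation is assumed)."""
    fip = primers["F1c"]["seq"] + primers["F2"]["seq"]
    bip = primers["B1c"]["seq"] + primers["B2"]["seq"]
    return fip, bip
```

Header and summary lines such as "The 1 LAMP primers:" do not match the pattern and are skipped.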
######
TIPS:
1) Select reference genome:
  The reference genome can be chosen at random from the group of target genomes, or it can be the genome that you most want the LAMP primer set to amplify.
2) Specific file:
  When running step 2, if you have the "common file", you can use the "--left" option in place of "--specific". In this way, all genomes in the database except for those in the "common file" are defined as the background genomes.
3) Fast mode:
  When running step 3, the "-fast" option can accelerate the design. However, this mode may miss some valid LAMP primer sets.
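
The rule behind "--left" in tip 2 is a plain set difference. As an illustration only (this is not how par.pl is implemented, and the function name and genome names are hypothetical), it can be written as:

```python
def background_from_left(all_genomes, targets):
    """Return the genomes in the database that are not target genomes.

    Mirrors: background_group = all genomes in database - target_group.
    The order of all_genomes is preserved.
    """
    target_set = set(targets)
    return [genome for genome in all_genomes if genome not in target_set]
```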
######
If you have any questions, please contact us:
Ben Jia: [email protected]
Chaochun Wei: [email protected]
