The glapd's intro from qiyueming

This is a customizable LAMP primer sets designing system: GLAPD (whole genome based LAMP primer design for a set of target genomes).
######
Introduction:
LAMP(Loop-mediated isothermal amplification) is a simple and effective new method to amplify DNA sequence.
One LAMP primer set contains four single LAMP primers from six primer regions in genome. The six regions are F3, F2, F1c, B1c, B2 and B3. Sequences from F1c and F2 are synthesized into primer FIP and sequences from B1c and B2 are synthesized into BIP. In order to accelerate the amplification, two additional loop primers(LF, LB) can be added.

GLAPD can design LAMP primer sets based on whole genome. It can design common primers for a set of target genomes and the primers are specific for background genomes.
Firstly, GLAPD identifies all candiate single primer regions. Then those single primers are aligned to target genomes and background genomes. Thirdly, GLAPD combines candidate single primers into LAMP primer sets. At last the commonality and specificity check are calculated for LAMP primer sets with the information from alignment.
######
SYSTEM REQUIREMENTS:
GLAPD now runs under Linux operation system, it needs perl and gcc. If run GPU version, the computer capability of GPU >=2.0.
######
OTHER SOFTWARES MAY BE NEEDED
Bowtie, which could be downloaded from http://bowtie-bio.sourceforge.net/index.shtml
CUDA driver, which could be downloaded from http://www.nvidia.com
######
Files and Directories:
This system is packaged in one file, "GLAPD.tar.gz", which can be downloaded from http://cgm.sjtu.edu.cn/GLAPD/ and https://github.com/jiqingxiaoxi/GLAPD2.git. Important files inside this tar.gz file are listed here.

|-Par/
tm_nn_parameter.txt Parameters for calculating Tm.
stab_parameter.txt Parameters for calculating the stability of single primers.
*.db, *.ds Sixteen files for calculating the secondary structure of primers.
|-par.pl Get the information of positions and mismatches for each single primer.
|-single.c The CPU version of GLAPD for identify single primer regions.
|-LAMP.c The CPU version of GLAPD for design LAMP primer sets.
|-Makefile Makefile for CPU version.
|-GPU/
single.cu The GPU version of GLAPD for identify single primer regions.
LAMP.cu The GPU version of GLAPD for design LAMP primer sets.
Makefile Makefile for GPU version.
|-bowtie/
bowtie The bowtie program
bowtie-build The bowtie-build program
|-example/
example.fa Sequences as example, from "NC_002951.2 Staphylococcus aureus subsp. aureus COL chromosome".
target-list.txt The list of names from target genomes. It contains 3 strains of S. aureus.
background-list.txt The list of names from background genomes. It contains 3 strains of other bacteria.
index.* The index files for Bowtie software.
######
INSTALLATION:
tar -zxvf GLAPD.tar.gz
If you want to use CPU version:
cd GLAPD/
make
If you want to use GPU version:
cd GLAPD/GPU/
make
######
QUICK START:
1. If you want design LAMP primers for a sequence without taking care of commonality and specificity:
cd GLAPD/
./Single -in example/example.fa -out Test
(Two files "Inner/Test" and "Outer/Test" are created.)
./LAMP -in Test -ref example/example.fa -out success.txt
(Ten LAMP primer sets are designed successfully stored in "success.txt" file.)

2. If you want design common LAMP primers without taking care of specificity:
cd GLAPD/
./Single -in example/example.fa -out Test
(Two files "Inner/Test" and "Outer/Test" are created.)
perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --common example/target-list.txt
(Three files "Inner/Test-common_list.txt", "Inner/Test-common.txt" and "Outer/Test-common.txt" are created.)
./LAMP -in Test -ref example/example.fa -out success.txt -common
(Ten LAMP primer sets are designed successfully stored in "success.txt" file.)

3. If you want design specific LAMP primers without taking care of commonality:
cd GLAPD/
./Single -in example/example.fa -out Test
(Two files "Inner/Test" and "Outer/Test" are created.)
perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --specific example/background-list.txt
(Two files "Inner/Test-specific.txt" and "Outer/Test-specific.txt" are created.)
./LAMP -in Test -ref example/example.fa -out success.txt -specific
(Ten LAMP primer sets are designed successfully stored in "success.txt" file.)

4. If you want design common and specific LAMP primers:
cd GLAPD/
./Single -in example/example.fa -out Test
(Two files "Inner/Test" and "Outer/Test" are created.)
perl par.pl --in Test --ref example/example.fa --bowtie Bowtie_path/bowtie --index example/index --common example/target-list.txt --left
(Five files "Inner/Test-common_list.txt", "Inner/Test-common.txt", "Inner/Test-specific.txt", "Outer/Test-common.txt" and "Outer/Test-specific.txt" are created.)
./LAMP -in Test -ref example/example.fa -out success.txt -common -specific
(Ten LAMP primer sets are designed successfully stored in "success.txt" file.)
######
RUN THE SYSTEM:
1.Identify candidate single primer regions:
Command:
Single -in <ref_genome> -out <single_primers> [options]*

Arguments:
-in <ref_genome>
reference genome, fasta formate
-out <single_primers>
output the candidate single primers
-dir <directory>
the directory for output file
default: current directory
-loop
identifiy candidate single primer regions for loop primers
-check <int>
check single primers' secondary structure or not
0: don't check secondary structure; other values: check
default: 1
-par <par_directory>
parameter files under the directory are used to check primers' secondary structure
default: GLAPD/Par/
-h[-help]
print usage

2.Align sequences from single primer regions(optional):
Command:
perl par.pl --in <sinlge_primers_file> --ref <ref_genome> --common[--specific] <genomes_list> --bowtie <bowtie> --index <database> [options]*

Arguments:
--in <single_primers_file>
the file name of candidate single primer regions, files are generated from Single program
--ref <ref_genome>
reference genome, fasta formate
--dir <directory>
dirctory for files of candidate single primer regions
default: current directory
--loop
include loop primers
--common <genomes_list>
the genomes in the file(target genomes) are expected to be amplified by LAMP primer sets
--specific <genomes_list>
the genomes in the file(background genomes) are not expected to be amplified by LAMP primer sets
--left
background_group = all_genome_in_database - target_group
used with --common
invalid if exist --specific
--bowtie <bowtie>
the bowtie program
--index <database>
bowtie index file name, comma-separated
--mis_s <int>
the max number of mismatches allowed when align single primers to background genomes
this value between 0 and 3. the bigger of the value, the more specific
default: 2
--mis_c <int>
the max number of mismatches allowed when align single primers to target genomes
this value between 0 and 3. the smaller of the value, the more common
default: 0
--threads <int>
number of threads to launch when align
default: 1
--help|--h
print help information

3.Design LAMP primer sets:
Command:
LAMP -in <sinlge_primers_file> -ref <ref_genome> -out <LAMP_primer_sets> [options]*

Arguments:
-in <single_primers_file>
the file name of candidate single primer regions, files are generated from Single program
-ref <ref_genome>
reference genome, fasta formate
-dir <directory>
the directory for output file
default: current directory
-out <LAMP_primer_sets>
output successfully designed LAMP primer sets
-num <int>
the expected output number of LAMP primer sets
default: 10
-loop
design LAMP primer sets with loop primers
-common
design common LAMP primer sets those can amplify more than one target genomes
-specific
design specific LAMP primer sets those can't amplify any background genomes
-check <int>
check primers' tendency of binding to another in one LAMP primer set or not
0: don't check; other values: check
default: 1
-par <par_directory>
parameter files under the directory are used to check primers' binding tendency
default: GLAPD/Par/
-fast
fast mode to design LAMP primer sets, in this mode GLAPD may lost some right results
-h/-help
print usage
######
INPUT FILE FORMAT:
1) The reference genome must be fasta format.
2) The common list file and specific list file used in step 2 must be one genome name per line. They can be generated by taking the sequence names from target or background genomes directly.
3) The bowtie index can be generated by "bowtie-build" command, more details in http://bowtie-bio.sourceforge.net/tutorial.shtml.
OUTPUT FILE FORMAT:
1) In the files generated by "Single" program, each line means a candidate single primer. For example:
"pos:3 length:25 +:2 -:0 61.89"
Each line has five fields seperated by tabs. From left to right, the fields are:
1. The position of the single primer in reference genome (0-based)
2. The length of the single primer
3,4. Sum of all applicable flags. "+" means the primer from the plus strand of reference genome and "-" means the primer from minus strand. If the number is "0", the single primer isn't from the plus or minus strand of reference genome. Flags are:
1: this single primer can be used to amplify a target if the GC-content of target region is >=60%
2: this single primer can be used to amplify a target if the GC-content of target region is <=45%
4: this single primer can be used to amplify a target if the GC-content of target region is between 45% and 60%
5. Tm
2) In the files generated by "par.pl" ("XX-common.txt" and "XX-specific.txt), each line stores an alignment. For example:
"3 25 2 501407 1 0"
Each line has six fields seperated by tabs. From left to right, the fields are:
1. The position of the single primer in reference genome (0-based)
2. The length of the single primer
3. The genome turn in target group or background group (0-based)
4. The position of alignment in this genome(field 3)
5. "1" means this single primer can be used to amplify the genome (field 3) in plus strand. "0" means this single primer can't be used to amplify the genome (field 3) in plus strand.
6. "1" means this single primer can be used to amplify the genome (field 3) in minus strand. "0" means this single primer can't be used to amplify the genome (field 3) in minus strand.
3) In the file generated by "par.pl" ("XX-common_list.txt"), each line contains one target genome. For example:
"NC_002951.2 0"
Each line has two fields seperated by tab. From left to right, the fields are:
1. The name of target genome
2. The turn of target genome in target group (0-based)
4) The LAMP primer set is stored in the file generated by "LAMP" program, for example:
"The 1 LAMP primers:
F3: pos:36,length:18 bp, primer(5'-3'):CGGTTCCCTGTACTCGAA
F2: pos:74,length:19 bp, primer(5'-3'):AATTCCTTTGTTGAGGCCG
F1c: pos:115,length:24 bp, primer(5'-3'):CGAAATCTTCAAACACTACGTGCT
B1c: pos:175,length:22 bp, primer(5'-3'):CCTGACGGAAGCAGCATTAAGT
B2: pos:237,length:19 bp, primer(5'-3'):CGAACGTAACCAAAGTCGT
B3: pos:276,length:23 bp, primer(5'-3'):TAAAAAATAAAAAACCGTGCACC
This set of LAMP primers could be used in 3 genomes, there are: NC_002951.2, NC_017340.1, NC_002745.2"
One LAMP primer set contains at least six single primers. The positions of single primers in reference genome and their length, sequence are listed in this file. When user designs common primers, which target genomes can be amplified by this primer set are also listed in this file.
######
TIPS:
1) Select reference genome:
The reference genome can be select one randomly from the group of target genomes, or the most expected genome amplified by the LAMP primer set.
2) Specific file:
When run the step 2, if you have the "common file", you can use "--left" option to replace the "--specific". In this way, all genomes in database expect for those in "common file" are defined as the background genomes.
3) Fast mode:
When run the step 3, use the "-fast" option can accelerate the designing. But this mode may lost some right LAMP primer sets.
######
If you have any questions, please contact with us:
Ben Jia: [email protected]
Chaochun Wei: [email protected]

qiyueming / glapd Goto Github PK

glapd's Introduction

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent