Giter Club home page Giter Club logo

in_silico_pcr's Introduction

IN_SILICO_PCR


CONTENTS

  1. Introduction
  2. Requirements
  3. Installation
  4. Usage
  5. Output
  6. License
  7. Contact

1. INTRODUCTION

This script searches nucleotide sequences using a set of primer sequences and attempts to anticipate the results of a PCR reaction using these same sequences. Results are based on sequence identity. Real-world factors such as melting temperature, annealing time, salt concentration, or secondary structure are not accounted for in this script.

This program was adapted and extended from a php script (link) authored by Joseba Bikandi which was covered by the GNU GPL v2 license.

2. REQUIREMENTS

Perl

3. INSTALLATION

Download and save.

4. USAGE

Basic command: perl in_silico_PCR.pl -s query.fasta -a (forward primer sequence) -b (reverse primer sequence) > results.txt 2> amplicons.fasta

For list of options, call the script without any inputs: perl in_silico_PCR.pl

4.1 Required Inputs:

-s
Path to sequence file in fasta format. Can be a multi-fasta file with multiple sequence records.

And Either
-a
Sequence of forward primer, 5' → 3'
-b
Sequence of reverse primer, 5' → 3'
Or
-p
Path to file containing one or more primer sequence pairs.
File format should be:
forward primer sequence (5' → 3') <tab> reverse primer sequence (5' → 3') <tab> primer set ID (no spaces)
One primer set per line.
Example:

ATGACATAAGA	TACTACTGAA	primer_set_1
GATATACTGAG	CTAGACTACT	dnaA

Primer sequences containing IUPAC ambiguity codes are accepted.

4.2 Optional Inputs:

  • -l
    Maximum "amplicon" length, in bp.
    (default: 3000)

  • -m
    Allow up to one base mismatch per primer sequence.
    (default: no mismatches tolerated)

  • -i
    Allow up to one base insertion or deletion per primer sequence.
    (default: no indels tolerated)

  • -e
    Exclude primer sequences from amplicon output sequence, position, and length.
    (default: primer sequences included in result)

  • -r
    Ensure that output amplicons are in hte orientation given by the order of the primers, i.e. forward primer at 5' end, reverse primer at 3' end.
    (default: amplicons will be output in the orientation they are found in the query sequence)

  • -c
    If no amplicons are found in the sequence, will instead output and and all primer hits followed by sequence either to the end of the contig or to the maximum amplicon length (given by -l), whichever is shorter.
    (default: only sequences where both primers hit in the correct orientation and within the maximum amplicon length are output)

5. OUTPUT

A summary of each in silico "PCR" reaction result will be output to STDOUT. The output can be directed to a file instead of the screen by using > results_file.txt on the command line. There are four columns in the results:

  1. "AmpID": Name of the amplicon. It will include the name of the primer set (third column in the file given to -p) followed by "amp" for full amplicon (both primers in correct orientation within maximum distance), followed by a number to represent the number of amplicons found in the the input sequence. If the -c option is given above, then if one or both primer sequences is found in the input sequence, but not in the correct orientation or within the maximum amplicon size, then the prefix will be followed by "p1" to represent the first primer sequence or "p2" to represent the second primer sequence.
  2. "SequenceId": The name of the input sequence in which the primer sequences were found. Will be "No amplification" if no amplicon was identified.
  3. "Position in sequence": The 5'-most position (1-based) at which the amplicon or primer sequence starts in the input sequence. This value will vary depending on whether -e is given or not (see Options above)
  4. "Length": The total length of the amplified sequence. This value will vary depending on whether -e is given or not (see Options above)

"Amplicon" sequences in fasta format will be output to STDERR. The output can be directed to a file instead of the screen by using 2> amplicons.fasta on the command line.

6. LICENSE:

in_silico_pcr
Copyright (C) 2017 Egon A. Ozer

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. See LICENSE.txt

7. CONTACT:

Contact Egon Ozer with questions or comments.

Written with MacDown.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.