metaseq's Introduction

METAgenomic Beads Barcoding Quantification (METABBQ) pipeline.

This is a data processing pipeline for obtaining long bacterial/fungal amplicons from complex environmental samples. It mainly supports two experiment types: single-tube Long Fragment Reads (stLFR) and Rolling Circle Replication (RCR).

Installation

Prerequisites

  • python >= 3.6
  • perl >= 5
  • fastp - A modified version that implements a module to split the stLFR barcodes.
  • Mash - A modified version adapted to stLFR data.
  • Snakemake - A Pythonic workflow system.
  • blast - The classic alignment tool for finding regions of similarity between biological sequences.
  • Assembly methods
    • SPAdes - SPAdes Genome Assembler
    • MEGAHIT - An ultra-fast and memory-efficient NGS assembler

I recommend installing the above tools in a virtual environment via conda:

  1. Create the environment and install part of them:
conda create -n metaseq -c bioconda -c conda-forge snakemake pigz megahit blast
source activate metaseq
  2. Following their respective documentation, install fastp, SPAdes, community, etc. in the metaseq environment.

Make sure the above commands (executables) can be found in your PATH.

Install
Clone the launcher, which initializes the working directory and calls the sub-functions.

cd /path/to/your/dir
git clone https://github.com/ZeweiSong/metaSeq.git
export PATH="/path/to/your/dir/metaSeq":$PATH

I haven't yet written a testing module to check the above prerequisites. For now, you may need to test them yourself.
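
Until such a module exists, a quick manual check can be sketched in shell. This is only an illustration: check_tools is a hypothetical helper that is not part of metabbq, and the executable names in the final call are assumptions that may differ in your installation (especially for the modified fastp and Mash builds).

```shell
# Sketch: report which prerequisite executables are missing from PATH.
# check_tools is a hypothetical helper, not part of metabbq.
check_tools() {
  missing=""
  for tool in "$@"; do
    # command -v succeeds only if the shell can resolve the name
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
  echo "all tools found"
}

# Executable names below are assumptions; adjust them to your installation.
check_tools python3 perl fastp mash snakemake blastn spades.py megahit pigz \
  || echo "Install the missing tools before running metabbq."
```

The check only confirms that each name resolves in PATH. It does not verify versions or that the modified builds are the ones being found.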

Usage

Prepare configs

cd instance
metabbq cfg  

This command creates a default.cfg in your current directory. You should modify it to tell the launcher where the required files are and which parameters to use.

Initiating a project

Prepare an input.list file describing the sample name and the input sequence file path.

metabbq -i input.list -c default.cfg -V

By default, metabbq creates a directory named {sample} with a sub-directory named input under it.

Run Quality-Control module

metabbq smk -j -np {sample}/clean/BB.stat
# -j runs the jobs in parallel on a suitable number of cores/threads
# -n means dry-run: it previews "what needs to be run". Remove it to actually run the pipeline.
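
To make the difference between the preview and the real run concrete, here is a small sketch that only prints the two command lines for a given sample. qc_command and the sample name sampleA are hypothetical, and the sketch assumes that dropping -n from -np leaves -p, as the comment above suggests.

```shell
# Hypothetical helper: print the QC command for one sample.
# Pass "run" as the second argument to drop -n (the dry-run flag).
qc_command() {
  sample=$1
  mode=${2:-dry}
  if [ "$mode" = "run" ]; then
    flags="-j -p"
  else
    flags="-j -np"
  fi
  # {sample} in the target path is replaced by the sample name
  printf 'metabbq smk %s %s/clean/BB.stat\n' "$flags" "$sample"
}

qc_command sampleA        # preview only
qc_command sampleA run    # command for the real run
```

The helper prints the commands rather than executing them, so it is safe to try before metabbq is installed.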

Run pre-binning assembly module

You need to select an assembly tool in the config file and use the corresponding output file name from the following:

metabbq smk -j -np {sample}/summary.BI.megahit.contig.fasta
metabbq smk -j -np {sample}/summary.BI.idba.contig.fasta
metabbq smk -j -np {sample}/summary.BI.spades.contig.fasta
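
The three targets above differ only in the assembler name, so the path can be built from the sample and the tool. The following is a minimal sketch under that assumption; assembly_target and sampleA are hypothetical names, and the helper only echoes the command instead of running it.

```shell
# Hypothetical helper: build the assembly target path for a sample
# and one of the assembler names that appear in the targets above.
assembly_target() {
  sample=$1
  tool=$2
  case "$tool" in
    megahit|idba|spades)
      printf '%s/summary.BI.%s.contig.fasta\n' "$sample" "$tool"
      ;;
    *)
      echo "unknown assembler: $tool" >&2
      return 1
      ;;
  esac
}

# Print the dry-run command for a MEGAHIT assembly of sampleA.
target=$(assembly_target sampleA megahit) && echo "metabbq smk -j -np $target"
```

The assembler passed here should match the one selected in the config file.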

Troubleshooting

Feedback is welcome; please submit it on the issue page.
