nextflow-annotate

This is an effort to gather tools that are helpful for genome annotation, and to serve as a forkable, version-controlled, reusable, and citable record of our pipeline. The steps use nextflow as the workflow engine, so each step is abstracted from its execution environment (SGE, MPI, or simple local multithreading).

This is not a push-button solution, but it can serve as a starting point for annotating your new genome.

Prerequisites

The minimum prerequisites are docker and nextflow, and a fasta file (henceforth scaffolds.fasta) of your genome assembly.
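Before running any of the recipes, it can help to confirm that both tools are installed. The following is a minimal sketch, not part of this repository: the function name `missing_tools` is hypothetical, and it assumes the executables are called `docker` and `nextflow` and are expected on your `PATH`.

```python
import shutil


def missing_tools(tools=("docker", "nextflow")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("docker and nextflow were both found on PATH.")
```

This only checks that the executables can be located; it does not check versions or whether the docker daemon is running.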

Some steps require software or data with licences that restrict distribution, but I've kept them to a minimum and will make it clear when those pieces are necessary.

Steps

Each of these steps corresponds to one of the nextflow recipes provided by this repository.

Transposon Identification

Taking cues from jamg, we translate all of the open reading frames and then use hhblits to match them against a database of known transposons. A GFF file is produced that describes the positions of the transposons that we find.
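To make the open-reading-frame step concrete, here is a minimal sketch in Python. It is an illustration only, not the code the recipe runs: it assumes an ORF is any stretch between stop codons in one of the six reading frames, uses the standard genetic code, and the names `find_orfs`, `gff_line`, the `orf_sketch` source label and the `min_aa` cutoff are all hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def translate(seq):
    """Translate DNA to protein; unknown codons become 'X', stops '*'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def find_orfs(seq, min_aa=30):
    """Yield (start, end, strand, protein) for each stop-to-stop ORF.

    Coordinates are 1-based, inclusive, and given on the forward strand.
    """
    n = len(seq)
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            aa_pos = 0  # index of the current segment's first residue
            for segment in translate(s[frame:]).split("*"):
                if len(segment) >= min_aa:
                    start0 = frame + 3 * aa_pos        # 0-based, on s
                    end0 = start0 + 3 * len(segment)   # exclusive, on s
                    if strand == "+":
                        yield start0 + 1, end0, strand, segment
                    else:
                        yield n - end0 + 1, n - start0, strand, segment
                aa_pos += len(segment) + 1  # skip past the stop codon


def gff_line(seqid, start, end, strand, feature_id):
    """Format one nine-column, tab-separated GFF3 feature line."""
    return "\t".join([seqid, "orf_sketch", "ORF", str(start), str(end),
                      ".", strand, ".", "ID=" + feature_id])
```

In the pipeline, the translated ORFs are what gets searched against the transposon database, and the hits are what end up in the GFF file; `gff_line` only shows the general shape of such a record.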

This uses two docker images, which will be pulled automatically from the docker registry as needed.

Finding Repeats

Repeats are an important part of the final genome annotation. I recommend a two-step process:

  1. Find de novo repeats with RepeatScout.
  2. Use the RepeatScout output in conjunction with the latest RepBase library as input to RepeatMasker.

I've taken care of the RepeatScout and RepeatMasker installation by bundling them as docker images. The only hiccup is that RepBase requires registration.
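One simple way to use the two libraries together is to concatenate them into a single FASTA file before masking. The sketch below assumes both libraries are plain FASTA files; the function name `combine_libraries` and the file names in the usage note are hypothetical, and the RepBase file is one you obtain yourself after registering. This is not necessarily how the recipe in this repository combines them.

```python
from pathlib import Path


def combine_libraries(library_paths, combined_path):
    """Concatenate FASTA libraries into one file.

    Returns the number of FASTA records (header lines) written.
    """
    records = 0
    with open(combined_path, "w") as out:
        for path in library_paths:
            text = Path(path).read_text()
            # Make sure the next file's first header starts on its own line.
            if text and not text.endswith("\n"):
                text += "\n"
            records += sum(1 for line in text.splitlines()
                           if line.startswith(">"))
            out.write(text)
    return records
```

For example, `combine_libraries(["repeatscout_repeats.fasta", "repbase_library.fasta"], "combined_repeats.fasta")` would write one library that could then be handed to RepeatMasker as a custom library, for instance through its `-lib` option.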
