LRCAGE (long-read CAGE)

This repository contains scripts to call peaks, to retain a list of confident transcripts, and to build a proteome database for immunopeptidome analysis.

callpeak

A script to call peaks from LRCAGE, LRhex, and nanoCAGE data.

usage: LRCAGE callpeak [-h] --inputlist INPUTLIST --peak PEAK
                       (--tpm TPM | --readcount READCOUNT) [--gcap GCAP]
                       [--gcap_mincount GCAP_MINCOUNT]
                       [--half_peak_width HALF_PEAK_WIDTH] [--thread THREAD]

optional arguments:
  -h, --help            show this help message and exit
  --inputlist INPUTLIST
                        list of input bam files
  --peak PEAK           output peak file name
  --tpm TPM             minimum TPM per peak
  --readcount READCOUNT
                        minimum read count per peak
  --gcap GCAP           minimum G-cap ratio
  --gcap_mincount GCAP_MINCOUNT
                        minimum number of soft-clipped G reads
  --half_peak_width HALF_PEAK_WIDTH
                        half peak size
  --thread THREAD       number of threads
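
As a minimal sketch of a callpeak invocation, the snippet below uses only the flags listed in the help text above. The file names, the threshold value, and the thread count are hypothetical placeholders, and the one-BAM-path-per-line layout of the input list is an assumption, not something the help text confirms. It also assumes that `LRCAGE` is reachable on your PATH under the name shown in the usage line.

```shell
# Hypothetical file names; replace with your own.
inputlist="bam_list.txt"
peaks="lrcage_peaks.txt"

# Assumption: the input list holds one BAM path per line.
printf '%s\n' sample1.bam sample2.bam > "$inputlist"

# The usage line requires exactly one of --tpm or --readcount;
# this sketch picks --readcount with an illustrative value.
set -- LRCAGE callpeak --inputlist "$inputlist" --peak "$peaks" --readcount 5 --thread 4

# Print the command, and run it only if LRCAGE is available on PATH.
echo "$*"
if command -v LRCAGE >/dev/null 2>&1; then
    "$@"
fi
```

To filter by TPM instead, swap `--readcount 5` for `--tpm` with a threshold of your choice.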

filtertx

A script to retain a list of confident transcripts, using transcripts identified from LRCAGE data and a list of peaks.

usage: LRCAGE filtertx [-h] --gtf GTF --talon TALON [--libinfo LIBINFO]
                       [--mincount MINCOUNT] [--peak PEAK]
                       [--peakratio PEAKRATIO] --oprefix OPREFIX

optional arguments:
  -h, --help            show this help message and exit
  --gtf GTF             input gtf file
  --talon TALON         input TALON.tsv file
  --libinfo LIBINFO     library size information
  --mincount MINCOUNT   minimum count to define confident transcripts
  --peak PEAK           peaks used to retain transcripts with complete 5' ends
  --peakratio PEAKRATIO
                        minimum fraction of reads for peak-transcript pair per
                        transcript
  --oprefix OPREFIX     prefix for output files
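
A filtertx call might look like the sketch below, which combines the three required flags (`--gtf`, `--talon`, `--oprefix`) with the optional peak-based filter. All file names and numeric thresholds are hypothetical placeholders chosen for illustration, and `LRCAGE` is assumed to be on your PATH.

```shell
# Hypothetical inputs: a transcript GTF, a TALON table, and a peak file from callpeak.
gtf="lrcage_transcripts.gtf"
talon="talon_abundance.tsv"
peaks="lrcage_peaks.txt"

# --mincount and --peakratio values are illustrative, not recommended defaults.
set -- LRCAGE filtertx --gtf "$gtf" --talon "$talon" \
    --peak "$peaks" --mincount 5 --peakratio 0.5 --oprefix confident_tx

# Print the command, and run it only if LRCAGE is available on PATH.
echo "$*"
if command -v LRCAGE >/dev/null 2>&1; then
    "$@"
fi
```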

buildprot

A script to create a proteome database using newly characterized transcripts as input.

usage: LRCAGE buildprot [-h] --gtf GTF --ref REF [--txinfo TXINFO]
                        [--thread THREAD] --oproteome OPROTEOME --refproteome
                        REFPROTEOME --refgtf REFGTF

optional arguments:
  -h, --help            show this help message and exit
  --gtf GTF             input gtf file
  --ref REF             reference genome fasta
  --txinfo TXINFO       transcript information
  --thread THREAD       number of threads
  --oproteome OPROTEOME
                        output proteome
  --refproteome REFPROTEOME
                        reference proteome
  --refgtf REFGTF       reference gtf
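
The following sketch shows one way to assemble a buildprot command from the five required flags in the usage line. Every file name here is a hypothetical placeholder, the thread count is arbitrary, and `LRCAGE` is assumed to be on your PATH.

```shell
# Hypothetical inputs; replace with your own files.
gtf="confident_tx.gtf"              # transcripts to translate
ref="genome.fa"                     # reference genome FASTA
refproteome="reference_proteome.fa" # reference proteome
refgtf="reference_annotation.gtf"   # reference annotation

set -- LRCAGE buildprot --gtf "$gtf" --ref "$ref" \
    --refproteome "$refproteome" --refgtf "$refgtf" \
    --oproteome custom_proteome.fa --thread 4

# Print the command, and run it only if LRCAGE is available on PATH.
echo "$*"
if command -v LRCAGE >/dev/null 2>&1; then
    "$@"
fi
```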

Run scripts using docker images

  1. Download the scripts from GitHub:
cd <your installation directory>
git clone https://github.com/juheon/LRCAGE.git
  2. Download the Docker image from Docker Hub and check that jhmaeng/lrcage is listed:
docker pull jhmaeng/lrcage
docker images
  3. Modify and run the "run_with_docker.sh" script in <your installation directory>.
