ngs-bits - Short-read sequencing tools

Obtaining ngs-bits

Binaries of ngs-bits are available via Bioconda. Alternatively, ngs-bits can be built from sources:

ChangeLog

Changes already implemented in GIT master for next release:

  • none so far.

Changes in release 2021_06:

  • General: Improved GRCh38 support in several tools.
  • General: Using BGZIP for compressed VCFs now to allow indexing them with tabix.
  • VcfAnnotateFromBed: Made separator configurable; Added check for separator in source BED file; Fixed broken output VCF if input has no FORMAT column.
  • VcfAnnotateFromVcf: Fixed crash in VCF header parser.
  • NGSDExportSamples: Added ancestry column.
  • SampleAncestry: Improved runtime and memory use.
  • SampleGender: Improved runtime for algorithm 'hetx'.
  • SomaticQC: Added support for mutect2.
  • NGSD:
    • Added disease status 'Unclear' to table 'sample'.
    • Added table 'processed_sample_ancestry'.
    • Added percent occupied to 'runqc_lane' (for Illumina NovaSeq).

For older releases see the releases page.

Support

Please report any issues or questions to the ngs-bits issue tracker.

Documentation

Have a look at the ECCB'2018 poster.

The documentation of individual tools is linked in the tools list below.
For some tools the documentation pages contain only the command-line help, for other tools they contain more information.

License

ngs-bits is provided under the MIT license and is based on other open source software:

Tools list

ngs-bits contains a lot of tools that are used for NGS-based diagnostics in our institute.

Some of the tools need the NGSD, a database that contains, for example, gene, transcript and exon data.
Installation instructions for the NGSD can be found here.

Main tools

  • SeqPurge - A highly-sensitive adapter trimmer for paired-end short-read data.
  • SampleSimilarity - Calculates pairwise sample similarity metrics from VCF/BAM files.
  • SampleGender - Determines sample gender based on a BAM file.
  • SampleAncestry - Estimates the ancestry of a sample based on variants.
  • CnvHunter - CNV detection from targeted resequencing data using non-matched control samples.
  • RohHunter - ROH detection based on a variant list annotated with AF values.
  • UpdHunter - UPD detection from trio variant data.

QC tools

The default output format of the quality control tools is qcML, an XML-based format for -omics quality control. It consists of an XML schema, which defines the overall structure of the format, and an ontology, which defines the QC metrics that can be used.

BAM tools

  • BamClipOverlap - (Soft-)Clips paired-end reads that overlap.
  • BamDownsample - Downsamples a BAM file to the given percentage of reads.
  • BamFilter - Filters a BAM file by multiple criteria.
  • BamHighCoverage - Determines high-coverage regions in a BAM file.
  • BamToFastq - Converts a BAM file to FASTQ files (paired-end only).

BED tools

  • BedAdd - Merges regions from several BED files.
  • BedAnnotateFromBed - Annotates BED file regions with information from a second BED file.
  • BedAnnotateGC - Annotates the regions in a BED file with GC content.
  • BedAnnotateGenes - Annotates BED file regions with gene names (needs NGSD).
  • BedChunk - Splits regions in a BED file to chunks of a desired size.
  • BedCoverage - Annotates the regions in a BED file with the average coverage in one or several BAM files.
  • BedExtend - Extends the regions in a BED file by n bases.
  • BedGeneOverlap - Calculates how much of each overlapping gene is covered (needs NGSD).
  • BedHighCoverage - Detects high-coverage regions from a BAM file.
  • BedInfo - Prints summary information about a BED file.
  • BedIntersect - Intersects two BED files.
  • BedLowCoverage - Calculates regions of low coverage based on an input BED and BAM file.
  • BedMerge - Merges overlapping regions in a BED file.
  • BedReadCount - Annotates the regions in a BED file with the read count from a BAM file.
  • BedShrink - Shrinks the regions in a BED file by n bases.
  • BedSort - Sorts the regions in a BED file.
  • BedSubtract - Subtracts one BED file from another BED file.
  • BedToFasta - Converts BED file to a FASTA file (based on the reference genome).
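
To illustrate the kind of interval operation that tools such as BedMerge perform, here is a minimal Python sketch of merging overlapping regions. It is not the ngs-bits implementation: the function name `merge_regions`, the tuple layout, and the choice to leave directly adjacent (book-ended) regions unmerged are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical region type for this sketch: (chromosome, start, end),
# using BED-style 0-based, half-open coordinates.
Region = Tuple[str, int, int]


def merge_regions(regions: List[Region]) -> List[Region]:
    """Merge overlapping regions per chromosome (illustrative only)."""
    merged: List[Region] = []
    # Sorting by chromosome name and start makes overlaps adjacent in the list.
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            # Overlap with the previous region: extend its end if needed.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    example = [("chr1", 15, 30), ("chr1", 10, 20), ("chr2", 5, 8)]
    print(merge_regions(example))
```

Whether book-ended regions should also be merged is a design choice; changing `start < merged[-1][2]` to `start <= merged[-1][2]` would merge them in this sketch.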

FASTQ tools

  • FastqAddBarcode - Adds sequences from separate FASTQ as barcodes to read IDs.
  • FastqConvert - Converts the quality scores from Illumina 1.5 offset to Sanger/Illumina 1.8 offset.
  • FastqConcat - Concatenates several FASTQ files into one output FASTQ file.
  • FastqDownsample - Downsamples paired-end FASTQ files.
  • FastqExtract - Extracts reads from a FASTQ file according to an ID list.
  • FastqExtractBarcode - Moves molecular barcodes of reads to a separate file.
  • FastqExtractUMI - Moves unique molecular identifiers from the read sequence to the read ID.
  • FastqFormat - Determines the quality score offset of a FASTQ file.
  • FastqList - Lists read IDs and base counts.
  • FastqMidParser - Counts the number of occurrences of each MID/index/barcode in a FASTQ file.
  • FastqToFasta - Converts FASTQ to FASTA format.
  • FastqTrim - Trims start/end bases from the reads in a FASTQ file.
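
The quality score conversion mentioned for FastqConvert can be sketched in a few lines of Python. This is only an illustration of the encoding change, assuming the usual ASCII offsets of 64 for Illumina 1.5 and 33 for Sanger/Illumina 1.8; the function name `convert_quality` is hypothetical and is not part of ngs-bits.

```python
OLD_OFFSET = 64  # ASCII offset assumed for Illumina 1.5 quality strings
NEW_OFFSET = 33  # ASCII offset assumed for Sanger/Illumina 1.8 quality strings


def convert_quality(qual: str) -> str:
    """Re-encode a Phred+64 quality string as Phred+33 (illustrative only)."""
    converted = []
    for ch in qual:
        score = ord(ch) - OLD_OFFSET
        if score < 0:
            # A negative score means the input was not Phred+64 encoded.
            raise ValueError(f"character {ch!r} is not valid for offset {OLD_OFFSET}")
        converted.append(chr(score + NEW_OFFSET))
    return "".join(converted)


if __name__ == "__main__":
    print(convert_quality("hhhB"))
```

The Phred scores themselves are unchanged; only the character used to represent each score shifts by 31 positions in the ASCII table.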

VCF tools (small variants)

  • VcfAnnotateFromBed - Annotates the INFO column of a VCF with data from a BED file.
  • VcfAnnotateFromVcf - Annotates the INFO column of a VCF with data from another VCF file (or from multiple VCF files if a config file is provided).
  • VcfBreakMulti - Breaks multi-allelic variants into several lines, making sure that allele-specific INFO/SAMPLE fields are still valid.
  • VcfCheck - Checks a VCF file for errors.
  • VcfExtractSamples - Extracts one or several samples from a VCF file.
  • VcfFilter - Filters a VCF based on the given criteria.
  • VcfLeftNormalize - Normalizes all variants and shifts indels to the left in a VCF file.
  • VcfSort - Sorts variant lists according to chromosomal position.
  • VcfStreamSort - Sorts entries of a VCF file according to genomic position using a stream.
  • VcfToBedpe - Converts a VCF file containing structural variants to BEDPE format.
  • VcfToTsv - Converts a VCF file to a tab-separated text file.

BEDPE tools (structural variants)

Gene handling tools

Phenotype handling tools

Misc tools

  • PERsim - Paired-end read simulator for Illumina reads.
  • FastaInfo - Basic info on a FASTA file.

Contributors

marc-sturm, axelgschwind, leonschuetz, c-schroeder, flo-lenz, fohlen, jakobmatthes, tstohn, ubuntolog, imgagbot, meissnert
