libGTF: a library for parsing and searching GTF and BED files

Note that this library is not yet appropriate for real-world use (the documentation hasn't even been written yet!).

In essence, this library constructs an interval tree representation of GTF or BED files and allows overlap searches, similar to GenomicRanges or bedtools. The primary difference is the convenient C interface, which makes it easy to incorporate into other programs.
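To give a feel for what an interval tree overlap search involves, here is a minimal sketch in C. It is not libGTF's implementation or API: the `Interval` type and the `build_tree()`, `count_overlaps()` and `overlaps_any()` functions are hypothetical names made up for illustration, and the sketch assumes BED-style half-open coordinates on a single chromosome. The tree is stored implicitly in an array sorted by start position, with each node remembering the largest end coordinate in its subtree so that whole subtrees can be skipped.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type for illustration; NOT one of libGTF's structures. */
typedef struct {
    int start;   /* 0-based, inclusive (BED convention) */
    int end;     /* exclusive */
    int max_end; /* largest end in the subtree rooted at this node */
} Interval;

static int cmp_start(const void *a, const void *b) {
    const Interval *x = a, *y = b;
    return (x->start > y->start) - (x->start < y->start);
}

/* The middle element of iv[lo, hi) is the root of that range's subtree.
 * Returns the largest end found in the range. */
static int fill_max(Interval *iv, size_t lo, size_t hi) {
    if (lo >= hi) return INT_MIN;
    size_t mid = lo + (hi - lo) / 2;
    int m = iv[mid].end;
    int l = fill_max(iv, lo, mid);
    int r = fill_max(iv, mid + 1, hi);
    if (l > m) m = l;
    if (r > m) m = r;
    iv[mid].max_end = m;
    return m;
}

/* Sort by start and annotate every node with its subtree maximum. */
void build_tree(Interval *iv, size_t n) {
    qsort(iv, n, sizeof *iv, cmp_start);
    fill_max(iv, 0, n);
}

static size_t count_rec(const Interval *iv, size_t lo, size_t hi,
                        int qs, int qe) {
    if (lo >= hi) return 0;
    size_t mid = lo + (hi - lo) / 2;
    /* Nothing in this subtree ends after the query starts. */
    if (iv[mid].max_end <= qs) return 0;
    size_t n = count_rec(iv, lo, mid, qs, qe);
    /* The right subtree only holds starts >= iv[mid].start, so it can
     * be skipped once this node already starts past the query. */
    if (iv[mid].start < qe) {
        if (iv[mid].end > qs) n++;
        n += count_rec(iv, mid + 1, hi, qs, qe);
    }
    return n;
}

/* Number of stored intervals overlapping the query [qs, qe). */
size_t count_overlaps(const Interval *iv, size_t n, int qs, int qe) {
    return count_rec(iv, 0, n, qs, qe);
}

/* 1 if at least one stored interval overlaps [qs, qe), else 0. */
int overlaps_any(const Interval *iv, size_t n, int qs, int qe) {
    return count_overlaps(iv, n, qs, qe) > 0;
}
```

Call `build_tree()` once on an array of intervals, then query it as often as needed. A real implementation would also key the trees by chromosome and could stop at the first hit in `overlaps_any()` rather than counting them all.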

Note that murmur3.c and murmur3.h are C implementations of MurmurHash. The C implementation is from Peter Scott and MurmurHash itself is by Austin Appleby. Both of these are in the public domain.

Installation

git clone https://github.com/dpryan79/libGTF.git
cd libGTF
git submodule init
git submodule update
make

Examples

There are currently a few example programs in the tests/ directory.

  • testBED demonstrates how to parse a BED file and display its tree representation in dot format.
  • testGTF is the equivalent program for GTF files.
  • testFindOverlaps demonstrates how to find overlaps of alignments in SAM/BAM/CRAM format with a GTF file. This is largely similar to featureCounts and htseq-count, though this program doesn't handle paired-end alignments intelligently (it is, after all, just a demonstration). The code demonstrates processing sets of overlaps and merging them for further processing. This program also demonstrates how to use different match and strand types. Note that in a real program, it'd be simpler to use the findOverlapsBAM() function.
  • testCountOverlaps demonstrates how to count the number of overlapping elements per alignment. Note that there's no method to filter this, for example, to only count the number of exonic overlaps. For such purposes, it's better to use the findOverlaps() function and filter the results.
  • testOverlapsAny is similar to testCountOverlaps, but simply returns a binary 0 or 1, indicating whether there's at least a single overlap.
  • The aforementioned programs also demonstrate how one can use filter functions, both when parsing a GTF file and when traversing one.
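
As a rough illustration of the filter-function idea, the sketch below counts overlaps while applying a strand check and a caller-supplied predicate. Everything in it is an assumption made for illustration: the `Feature` type, the `FeatureFilter` callback and `count_filtered_overlaps()` are hypothetical and are not libGTF's actual types or signatures (see the programs in tests/ for those). For brevity it uses a linear scan over an array instead of an interval tree, and GTF-style 1-based inclusive coordinates.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for illustration; NOT libGTF's entry type. */
typedef struct {
    const char *feature; /* GTF column 3, e.g. "gene" or "exon" */
    int start;           /* 1-based, inclusive (GTF convention) */
    int end;             /* inclusive */
    char strand;         /* '+' or '-' */
} Feature;

/* A filter returns non-zero to keep a feature and 0 to skip it. */
typedef int (*FeatureFilter)(const Feature *f);

/* Example filter: keep only exons. */
int is_exon(const Feature *f) {
    return strcmp(f->feature, "exon") == 0;
}

/* Count the features overlapping [qs, qe]. If same_strand is non-zero,
 * only features on qstrand are counted. keep may be NULL (no filter). */
size_t count_filtered_overlaps(const Feature *fs, size_t n,
                               int qs, int qe, char qstrand,
                               int same_strand, FeatureFilter keep) {
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (fs[i].start > qe || fs[i].end < qs) continue;        /* no overlap */
        if (same_strand && fs[i].strand != qstrand) continue;    /* wrong strand */
        if (keep != NULL && !keep(&fs[i])) continue;             /* filtered out */
        hits++;
    }
    return hits;
}
```

Passing the predicate as a function pointer keeps the overlap logic independent of what counts as an interesting feature, so the same search can serve exon-only counting, gene-level counting, or no filtering at all.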

To Do

  • compare with featureCounts/bedtools and ensure the results are the same
