jugds.jl's Introduction

jugds: Julia Interface to CoreArray Genomic Data Structure (GDS) Files

License: GNU General Public License, GPLv3 (2015-2020)

pre-release version: v0.1.0

Features

This package provides a high-level Julia interface to CoreArray Genomic Data Structure (GDS) files. GDS files are portable across platforms and use a hierarchical structure to store multiple scalable, array-oriented data sets together with their metadata. The format is suited to large-scale datasets, especially data much larger than the available random-access memory. The jugds package offers efficient operations designed specifically for integers of fewer than 8 bits, since a diploid genotype, such as a single-nucleotide polymorphism (SNP), usually occupies less than a byte. Data compression and decompression are available with relatively efficient random access.
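To illustrate why sub-byte integer types save space, the following is a minimal sketch in plain Julia that packs four 2-bit values into each byte and unpacks them again. The helper names `pack2bit` and `unpack2bit` are hypothetical and are not part of the jugds API, and the bit order chosen here is an assumption for illustration, not necessarily the layout used inside GDS files.

```julia
# Hypothetical helpers for illustration only; not part of the jugds API.

# Pack 2-bit values (each in 0:3) into bytes, four values per byte,
# with the first value stored in the lowest-order bits of each byte.
function pack2bit(vals::AbstractVector{<:Integer})
    out = zeros(UInt8, cld(length(vals), 4))
    for (i, v) in enumerate(vals)
        0 <= v <= 3 || throw(ArgumentError("value $v does not fit in 2 bits"))
        byte = div(i - 1, 4) + 1      # 1-based index of the target byte
        shift = 2 * mod(i - 1, 4)     # bit offset within that byte
        out[byte] |= UInt8(v) << shift
    end
    return out
end

# Recover the first n 2-bit values from a packed byte vector.
function unpack2bit(bytes::AbstractVector{UInt8}, n::Integer)
    vals = Vector{UInt8}(undef, n)
    for i in 1:n
        byte = div(i - 1, 4) + 1
        shift = 2 * mod(i - 1, 4)
        vals[i] = (bytes[byte] >> shift) & 0x03
    end
    return vals
end

genotypes = [0, 1, 2, 3, 2, 1]        # six values, each in 0:3
packed = pack2bit(genotypes)          # occupies cld(6, 4) == 2 bytes
restored = unpack2bit(packed, length(genotypes))
```

Stored this way, six values take two bytes instead of six, which is the kind of saving a 2-bit type aims for before any compression is applied.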

Installation

  • Development version from GitHub, requiring Julia >= v1.0:
using Pkg

Pkg.status()
Pkg.add(PackageSpec(url="https://github.com/CoreArray/jugds.jl.git"))

Package Maintainer

Dr. Xiuwen Zheng

Tutorials

Citation

Original papers (implemented in R/Bioconductor):

Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS (2012). A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data. Bioinformatics. DOI: 10.1093/bioinformatics/bts606.

Zheng X, Gogarten S, Lawrence M, Stilp A, Conomos M, Weir BS, Laurie C, Levine D (2017). SeqArray -- A storage-efficient high-performance data format for WGS variant calls. Bioinformatics. DOI: 10.1093/bioinformatics/btx145.

Copyright Notice

Examples

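using jugds

# Path to the demo GDS file shipped with the package
fn = abspath(dirname(pathof(jugds)), "..", "demo", "data", "ceu_exon.gds")
f = open_gds(fn)   # open the GDS file
f                  # display the file's node hierarchy (output shown below)
close_gds(f)       # close the file when finished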
File: jugds/demo/data/ceu_exon.gds (32.5K)
+    [  ] *
|--+ description   [  ] *
|--+ sample.id   { Str8 90 LZMA_ra(35.8%), 258B } *
|--+ variant.id   { Int32 1348 LZMA_ra(16.8%), 906B } *
|--+ position   { Int32 1348 LZMA_ra(64.6%), 3.4K } *
|--+ chromosome   { Str8 1348 LZMA_ra(4.63%), 158B } *
|--+ allele   { Str8 1348 LZMA_ra(16.7%), 902B } *
|--+ genotype   [  ] *
|  |--+ data   { Bit2 1348x90x2 LZMA_ra(26.3%), 15.6K } *
|  |--+ extra.index   { Int32 0x3 LZMA_ra, 19B } *
|  \--+ extra   { Int16 0 LZMA_ra, 19B }
|--+ phase   [  ]
|  |--+ data   { Bit1 1348x90 LZMA_ra(0.91%), 138B } *
|  |--+ extra.index   { Int32 0x3 LZMA_ra, 19B } *
|  \--+ extra   { Bit1 0 LZMA_ra, 19B }
|--+ annotation   [  ]
|  |--+ id   { Str8 1348 LZMA_ra(38.4%), 5.5K } *
|  |--+ qual   { Float32 1348 LZMA_ra(2.26%), 122B } *
|  \--+ filter   { Int32,factor 1348 LZMA_ra(2.26%), 122B } *
\--+ sample.annotation   [  ]
   \--+ family   { Str8 90 LZMA_ra(57.1%), 222B }

Also See

JSeqArray.jl: data manipulation of whole-genome sequencing variants in Julia

jugds.jl's Issues

Creating gds file example?

Hello, how would I use this package to create some gds files?

From reading the source code, put_attr_gdsn seems like the logical candidate, but it doesn't work:

using jugds
sigma = rand(5, 5)
gds = create_gds("test_gds.gds") # create empty gds
put_attr_gdsn(gds, "sigma", sigma) # write to gds

ERROR: MethodError: no method matching put_attr_gdsn(::type_gdsfile, ::String, ::Matrix{Float64})
Closest candidates are:
  put_attr_gdsn(::type_gdsnode, ::String, ::Any) at /scratch/users/bbchu/.julia/packages/jugds/mO314/src/jugds.jl:283

The code above was run on Julia 1.6.
