BigArrays.jl

Store and access large Julia arrays using different backends.

Features

  • serverless, clients do IO directly
  • arbitrary subset cutout (writes should be aligned to the chunk size)
  • extensible with multiple backends
  • arbitrary shape; the dataset boundary can be curve-like
  • arbitrary dataset size in theory (tested with a dataset of about 9 TB)
  • chunk compression with gzip/blosclz/jpeg
  • highly scalable due to the serverless design
  • arbitrary data type

Supported backends

  • AWS S3
  • Google Cloud Storage
  • Local HDF5 files

Installation

Pkg.clone("https://github.com/jingpengwu/AWS.jl.git")
Pkg.clone("https://github.com/jingpengwu/GoogleCloud.jl.git")
Pkg.clone("https://github.com/seung-lab/BigArrays.jl.git")
Pkg.clone("https://github.com/seung-lab/S3Dicts.jl.git")
Pkg.clone("https://github.com/seung-lab/GSDicts.jl.git")

Usage

BigArrays does not limit the dataset size. Reading an index range that lies outside the existing file range returns an array filled with zeros.

Use the HDF5 files backend

using BigArrays.H5sBigArrays
ba = H5sBigArray("/directory/of/hdf5/files/");
# use it as normal array

ba[101:200, 201:300, 1:3] = rand(UInt8, 100,100,3)
@show ba[101:200, 201:300, 1:3]
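Because writes should be aligned to the chunk size, it can help to check a cutout before saving it. The following is a minimal sketch, not part of BigArrays.jl: the helper `is_chunk_aligned` and the chunk size `(100, 100, 3)` are hypothetical, and it assumes the chunk grid starts at index 1 with no dataset offset.

```julia
# Hypothetical helper (not a BigArrays.jl function).
# Returns true when every range starts and ends on a chunk boundary,
# assuming the chunk grid starts at index 1 in each dimension.
function is_chunk_aligned(ranges::NTuple{N,UnitRange{Int}},
                          chunksize::NTuple{N,Int}) where N
    return all(mod(first(r) - 1, c) == 0 && mod(last(r), c) == 0
               for (r, c) in zip(ranges, chunksize))
end

# The cutout written above, checked against an assumed chunk size.
@show is_chunk_aligned((101:200, 201:300, 1:3), (100, 100, 3))
@show is_chunk_aligned((101:250, 201:300, 1:3), (100, 100, 3))
```

With a different chunk size or a nonzero dataset offset, the boundary arithmetic would need to be adjusted accordingly.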

Use the AWS S3 backend

Set up the info file

The info file is a JSON file that defines the full configuration of the dataset. Its format is defined by Neuroglancer.
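For orientation, an info file in the Neuroglancer precomputed format looks roughly like the sketch below. All values (the scale key, sizes, resolution and chunk size) are illustrative placeholders rather than settings from a real dataset, and BigArrays.jl may expect additional or different fields.

```json
{
  "type": "image",
  "data_type": "uint8",
  "num_channels": 1,
  "scales": [
    {
      "key": "4_4_40",
      "size": [2048, 2048, 256],
      "resolution": [4, 4, 40],
      "voxel_offset": [0, 0, 0],
      "chunk_sizes": [[64, 64, 64]],
      "encoding": "raw"
    }
  ]
}
```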

test example

Use the Google Cloud Storage backend

The info configuration file is the same as for the S3 backend.

test example

Development

BigArrays is a high-level architecture that transforms a key-value store (the backend) into a Julia array (the frontend). It provides an AbstractArray interface and implements the getindex and setindex! functions.
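The idea of turning a key-value store into an array can be illustrated with a one-dimensional toy example. This is a sketch under simplifying assumptions, not the BigArrays.jl implementation: the function `read_cutout`, the integer chunk keys, and the plain `Dict` standing in for a backend are all hypothetical. Missing chunks are read as zeros, matching the behaviour described in the usage section.

```julia
# Hypothetical 1-D illustration (not BigArrays.jl code).
# `store` maps a 0-based chunk index to the bytes of that chunk.
function read_cutout(store::Dict{Int,Vector{UInt8}}, chunksize::Int,
                     r::UnitRange{Int})
    out = zeros(UInt8, length(r))              # missing data stays zero
    for i in r
        chunk_id = fld(i - 1, chunksize)       # which chunk holds index i
        offset   = mod(i - 1, chunksize) + 1   # 1-based position in the chunk
        if haskey(store, chunk_id)
            out[i - first(r) + 1] = store[chunk_id][offset]
        end
    end
    return out
end

# Only chunk 0 exists, so indices 5 and 6 fall in a missing chunk.
store = Dict(0 => UInt8[1, 2, 3, 4])
@show read_cutout(store, 4, 3:6)
```

A real frontend would fetch each chunk once and copy whole blocks instead of looping element by element; the loop here only keeps the index arithmetic visible.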

Add a new backend

The backends are different key-value stores. To add a new backend, you can simply do the following:

  • wrap the key-value store as a Julia Associative type. S3Dicts is a good example.
  • implement the getindex and setindex! functions. S3Dicts example
  • make sure that the key-value store has a configDict field containing the block size and data type.

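The steps above can be sketched with an in-memory backend. This is a hypothetical example, not code from BigArrays.jl or S3Dicts: the type `MemoryDict`, the helper `new_memory_backend`, the `configDict` keys `:chunkSize` and `:dataType`, and the key string used below are all assumptions made for illustration. It is written for Julia 0.7 and later, where the Associative type is called AbstractDict.

```julia
# Hypothetical in-memory backend (illustration only).
struct MemoryDict <: AbstractDict{String,Vector{UInt8}}
    store::Dict{String,Vector{UInt8}}   # chunk key => raw chunk bytes
    configDict::Dict{Symbol,Any}        # block size and data type
end

# Assumed config keys; a real backend may name them differently.
function new_memory_backend(chunkSize::NTuple{3,Int}, dataType::DataType)
    config = Dict{Symbol,Any}(:chunkSize => chunkSize, :dataType => dataType)
    return MemoryDict(Dict{String,Vector{UInt8}}(), config)
end

# The two functions a backend has to provide.
Base.getindex(d::MemoryDict, key::AbstractString) = d.store[key]
function Base.setindex!(d::MemoryDict, value::Vector{UInt8}, key::AbstractString)
    d.store[key] = value
    return d
end

# Extra methods so the type behaves like a regular dictionary.
Base.haskey(d::MemoryDict, key::AbstractString) = haskey(d.store, key)
Base.length(d::MemoryDict) = length(d.store)
Base.iterate(d::MemoryDict, state...) = iterate(d.store, state...)

backend = new_memory_backend((64, 64, 64), UInt8)
backend["0-64_0-64_0-64"] = zeros(UInt8, 64^3)   # made-up key format
@show length(backend["0-64_0-64_0-64"])
```

A backend for a remote store would replace the inner `Dict` with calls to that store's client while keeping the same getindex and setindex! surface.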
Contributors

jingpengw, nicholasturner1