Giter Club home page Giter Club logo

partitioned-bit-packing's Introduction

Partitioned Bit-Packed Vector

An implementation of a partitioned bit-packed vector is shown in this work. The partitioned bit-packed vector is an extension of the normal bit-packed vector. It aims to increase the flexibility of the data-structure while maintaining a high compression benefit and a fast access perfomance. Modern in-memory databases could benefit from this more flexible data structure.

In this repository you'll find implementations of a normal bit-packed vector and partitioned bit-packed vectors. Several different access algorithms have been implemented and evaluated.

OS Support:

  • Mac OSX (Build System: GCC 4.8)
  • Linux (Build System: GCC 4.8, current build untested)
  • Windows (Build System: Visual Studio, current build untested)

Dependencies:

  • (Optional) PAPI (http://icl.cs.utk.edu/papi/) is recommended for benchmarking. The benchmarks can run without PAPI, but it is recommended to do benchmarks on a Linux-OS using PAPI to measure the performance.

Abstract

Taken from my Bachelor's Thesis "Partitioned Bit-Packed Vectors for In-Memory Column Stores"

In recent database development, in-memory databases have grown more and more in popularity. The hardware development of the past years has made it possible for databases to hold all of their data in main memory. Still, most applications on in-memory databases are latency-bound rather than compute-bound. Combining strong compression techniques and efficient data structures is essential to fully utilize the hardware capabilities. A common data structure for efficient storing is the bit-packed vector. The bit-packed vector uses a fixed encoding length, which cannot be changed after initialization. Therefore it requires full re-initialization, when the encoding-length changes.

In this Thesis a new data structure is proposed. The partitioned bit-packed vector is an alternative to the regular bit-packed vector. In the partitioned bit-packed vector the encoding length of the stored elements can be increased dynamically. This eliminates the need for expensive re-initializations. In this Thesis it is shown how this new data structure works and its performance is evaluated. It is discussed what benefits the partitioned bit-packed vector can bring to an in-memory databases, when replacing the regular bit-packed vector. The results of this Thesis suggest that this new data structure has the capabilities to improve the performance of existing in-memory column-stores for specific workloads.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.