#Hippo

Hippo is a fast yet scalable sparse database indexing approach. Unlike existing tree index structures, Hippo does not store a pointer to each tuple in the indexed table, which reduces the storage space occupied by the index. Instead, Hippo stores only the disk page ranges that represent the indexed table and maintains a histogram-based summary for each page range. Each summary is a brief histogram that represents the data distribution of one or more pages. The main contributions of Hippo are as follows:

  • Low Indexing Overhead

  • Competitive Query Performance

  • Fast Index Maintenance
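
The page-range idea described above can be illustrated with a small sketch. The Python code below is not the Hippo implementation (that lives in the PostgreSQL C source in this repository); it is a simplified model under stated assumptions. Each page is modeled as a list of integers, `bounds` is a hypothetical list of histogram bucket boundaries, and `density` is a hypothetical threshold, read here as the percentage of buckets a page range may touch before it is closed. The names `RangeSummary`, `build_index`, and `search` are made up for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class RangeSummary:
    """One index entry: a run of pages plus the histogram buckets they touch."""
    first_page: int
    last_page: int
    buckets: frozenset


def bucket_of(value, bounds):
    """Return the id of the histogram bucket that contains value."""
    return bisect_right(bounds, value)


def build_index(pages, bounds, density=20):
    """Group consecutive pages into ranges (assumed grouping rule).

    A range is closed once it touches at least `density` percent of buckets.
    """
    n_buckets = len(bounds) + 1
    index, start, hit = [], 0, set()
    for page_no, values in enumerate(pages):
        hit |= {bucket_of(v, bounds) for v in values}
        if len(hit) * 100 >= density * n_buckets:
            index.append(RangeSummary(start, page_no, frozenset(hit)))
            start, hit = page_no + 1, set()
    if start < len(pages):  # trailing pages that never reached the threshold
        index.append(RangeSummary(start, len(pages) - 1, frozenset(hit)))
    return index


def search(pages, index, bounds, low, high):
    """Return values v with low < v < high, reading only candidate ranges."""
    wanted = set(range(bucket_of(low, bounds), bucket_of(high, bounds) + 1))
    result = []
    for entry in index:
        if entry.buckets & wanted:  # summary overlaps the query: inspect pages
            for page_no in range(entry.first_page, entry.last_page + 1):
                result.extend(v for v in pages[page_no] if low < v < high)
    return result


# Toy table: four pages of integers and three bucket boundaries (four buckets).
pages = [[100, 200], [1500, 150], [2500, 2600], [3500, 1200]]
bounds = [1000, 2000, 3000]
index = build_index(pages, bounds, density=50)
matches = search(pages, index, bounds, 1000, 2000)
```

In this toy setup the index holds one entry per page range rather than one per tuple. A query for values above 3000 skips the first page range entirely, because that range's summary touches none of the queried buckets.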

#Play around with Hippo index

For ease of testing, we have implemented the Hippo index in the PostgreSQL kernel (9.5 Alpha 2) as one of the backend access methods. This version has been tested on Ubuntu Linux 14.04 LTS.

Download the source code

$ git clone https://github.com/Sarwat/hippo-postgresql.git

Build and Installation

Once you've cloned the repository from GitHub, the folder should contain the source code for PostgreSQL. The build and installation steps are exactly the same as for official PostgreSQL.

$ cd SourceFolder
$ ./configure
$ make
$ su
$ make install
$ adduser postgres
$ mkdir /usr/local/pgsql/data
$ chown postgres /usr/local/pgsql/data
$ su - postgres
$ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
$ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >logfile 2>&1 &
$ /usr/local/pgsql/bin/createdb test
$ /usr/local/pgsql/bin/psql test

If you use Ubuntu 14.04, you may need to install some required packages before the build succeeds. Try the following command:

$ sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev

PostgreSQL Regression Test

After the installation, make sure the source code on your machine passes all the PostgreSQL Regression Tests (157 in total).

$ cd SourceFolder

$ make check

Usage in SQL

Some SQL commands for the Hippo index are listed here. For more details, see the following Hippo index test SQL scripts:

./src/test/regress/sql/hippo.sql (Default)

./src/test/regress/sql/hippo_random.sql

Build Hippo

ALTER TABLE hippo_tbl ALTER COLUMN randomNumber SET STATISTICS 500;

ANALYZE hippo_tbl;

CREATE INDEX hippo_idx ON hippo_tbl USING hippo(randomNumber) WITH (density = 20);

Query Hippo

SELECT * FROM hippo_tbl WHERE randomNumber > 1000 AND randomNumber < 2000;

Insert new records into Hippo

INSERT INTO hippo_tbl ... ... ...;

Delete old records from Hippo

DELETE FROM hippo_tbl WHERE randomNumber > 1000 AND randomNumber < 2000;

VACUUM;

Drop Hippo

DROP INDEX hippo_idx;

Currently supported data type

Integer

Currently supported operator

<, <=, =, >=, >

Notes

Currently, due to conflicts between the Hippo index and the PostgreSQL kernel, Hippo only works on the temporary postmaster server that is started in PostgreSQL Regression Test Mode. We are still working to release it in PostgreSQL Production Mode.

To use Hippo in PostgreSQL Regression Test Mode, you need to:

  • Read and change Hippo index test SQL script:

    ./src/test/regress/sql/hippo.sql (Default)

    ./src/test/regress/sql/hippo_random.sql

  • View Hippo index test SQL script output:

    ./src/test/regress/results/hippo.out (Default)

    ./src/test/regress/results/hippo_random.out

  • Modify Regression Test schedule if necessary:

    ./src/test/regress/parallel_schedule

For example, you can change "ignore: hippo_random" to "test: hippo_random". This runs a randomized Hippo index test, which may fail because its results are not predictable. Such a failure is expected.
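
The schedule change can also be scripted. The following is a minimal sketch, assuming the schedule file contains a line that reads exactly "ignore: hippo_random" apart from surrounding whitespace; the helper names `enable_test` and `enable_test_in_file` are hypothetical and not part of this repository.

```python
def enable_test(schedule_text, test_name):
    """Rewrite 'ignore: <test_name>' lines as 'test: <test_name>'."""
    lines = []
    for line in schedule_text.splitlines():
        if line.strip() == f"ignore: {test_name}":
            line = f"test: {test_name}"
        lines.append(line)
    return "\n".join(lines) + "\n"


def enable_test_in_file(path, test_name):
    """Apply enable_test to a schedule file in place."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(enable_test(text, test_name))


# Hypothetical schedule excerpt; the real file contains many more lines.
sample = "test: hippo\nignore: hippo_random\n"
updated = enable_test(sample, "hippo_random")
```

To apply it to the real schedule, one could call `enable_test_in_file("./src/test/regress/parallel_schedule", "hippo_random")` from the source folder and then run `make check` again.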

#Hippo Video Demonstration

Want to give it a try? Do not hesitate!

Watch this video (no headset needed) and learn how to get started: Hippo Video Demonstration (on remote computer) or Hippo Video Demonstration (on YouTube).

Publication

Jia Yu, Mohamed Sarwat. "Two Birds, One Stone: A Fast, yet Lightweight, Indexing Scheme for Modern Database Systems". (Research paper)

(To appear) In Proceedings of the 43rd International Conference on Very Large Data Bases (VLDB 2017), Munich, Germany, August 2017

Contact

Contributors

DataSys Lab

The Hippo index is one of the projects of the DataSys Lab at Arizona State University. The mission of the DataSys Lab is to design and develop experimental data management systems (e.g., database systems).


#PostgreSQL Database Management System

This directory contains the source code distribution of the PostgreSQL database management system.

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. This distribution also contains C language bindings.

PostgreSQL has many language interfaces, many of which are listed here:

http://www.postgresql.org/download

See the file INSTALL for instructions on how to build and install PostgreSQL. That file also lists supported operating systems and hardware platforms and contains information regarding any other software packages that are required to build or run the PostgreSQL system. Copyright and license information can be found in the file COPYRIGHT. A comprehensive documentation set is included in this distribution; it can be read as described in the installation instructions.

The latest version of this software may be obtained at http://www.postgresql.org/download/. For more information look at our web site located at http://www.postgresql.org/.
