cassandra-indexing's Introduction

Rationale

If you use Cassandra long enough, you will eventually need to support queries that secondary indexes cannot accommodate. Wide rows offer an alternative: roughly speaking, a single row serves as the index, with composite keys as its column names. Because columns within a row are stored in a sorted data structure, querying for a slice of columns is fast (compared with a range slice over rows).
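To make the idea concrete, here is a minimal in-memory sketch in Python. It is not Cassandra client code: a wide row is modeled as a sorted list of composite column names (tuples), and a column slice becomes a binary search over that list. The names `index_row` and `column_slice`, and the sample data, are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left, bisect_right

# One wide row acting as an index. Each column name is a composite
# (modeled as a tuple) and the columns are kept in sorted order, which
# is what makes slicing cheap. The data is made up for illustration.
index_row = sorted([
    ("NY", "Albany", "user3"),
    ("PA", "Philadelphia", "user1"),
    ("PA", "Pittsburgh", "user2"),
    ("TX", "Austin", "user4"),
])

def column_slice(columns, start, end):
    """Return the columns whose names fall in [start, end], inclusive."""
    lo = bisect_left(columns, start)
    hi = bisect_right(columns, end)
    return columns[lo:hi]

# All entries whose first component is "PA". The upper bound uses a
# high sentinel character so every second component sorts below it.
pa_columns = column_slice(index_row, ("PA",), ("PA", "\uffff"))
```

Because the list is sorted, the slice costs two binary searches plus the size of the result, regardless of how many other columns the row holds.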

Much like our cassandra-triggers implementation, we used AOP to implement a generic mechanism for wide-row indexing. Read on.

Design

Below is the design we used to implement a generic wide-row indexing mechanism.

Storage

We use two column families to implement the solution: Configuration and Indexes. Both of these are under a keyspace, Indexing.

Configuration CF

The Configuration CF records which column families need indexing and which columns each index uses.
You can configure multiple indexes for the same column family. Each row defines one index: the row key is the name of the index, and the columns in that row specify the target keyspace and column family, along with the columns to be used in the index (in order).
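As an illustration of that layout, the sketch below models the Configuration CF as a Python dictionary keyed by index name. The column names `keyspace`, `column_family`, and `columns`, the comma-separated encoding, and the helper `indexes_for` are all assumptions made for this example; the source does not state the actual names used.

```python
# Hypothetical contents of the Configuration CF: row key -> columns.
# Two indexes target the same column family, which the design allows.
configuration_cf = {
    "users_by_state_city": {
        "keyspace": "Demo",
        "column_family": "Users",
        "columns": "state,city",  # indexed columns, in order
    },
    "users_by_name": {
        "keyspace": "Demo",
        "column_family": "Users",
        "columns": "name",
    },
    "orders_by_date": {
        "keyspace": "Demo",
        "column_family": "Orders",
        "columns": "date",
    },
}

def indexes_for(config, keyspace, column_family):
    """Map index name -> ordered list of indexed columns for one target CF."""
    return {
        name: row["columns"].split(",")
        for name, row in config.items()
        if row["keyspace"] == keyspace and row["column_family"] == column_family
    }

user_indexes = indexes_for(configuration_cf, "Demo", "Users")
```

A write to `Demo.Users` would then need to maintain both `users_by_state_city` and `users_by_name`.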

Indexes CF

There is one row per index. That row contains one column for each row in the target column family being indexed. The name of that column is a composite type that includes the values of the indexed columns from the original row and the row key.
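The following sketch shows one way such a composite column name could be built when a row is written. It assumes the row key is appended as the last component, after the indexed values, which keeps names unique when two rows share the same indexed values. The dictionary `indexes_cf` and the functions `index_column_name` and `write_index_entry` are hypothetical stand-ins, not the project's actual code.

```python
# Hypothetical in-memory stand-in for the Indexes CF:
# index name (row key) -> {composite column name: empty value}.
indexes_cf = {}

def index_column_name(indexed_columns, row_key, row):
    """Build the composite name: indexed values in order, then the row key."""
    return tuple(row[column] for column in indexed_columns) + (row_key,)

def write_index_entry(index_name, indexed_columns, row_key, row):
    """Add one column to the index row for the given source row."""
    name = index_column_name(indexed_columns, row_key, row)
    # The column value is left empty; all information is in the name.
    indexes_cf.setdefault(index_name, {})[name] = ""
    return name

write_index_entry(
    "users_by_state_city", ["state", "city"], "user1",
    {"state": "PA", "city": "Philadelphia", "name": "Ann"},
)
write_index_entry(
    "users_by_state_city", ["state", "city"], "user2",
    {"state": "PA", "city": "Pittsburgh", "name": "Bob"},
)
```

This sketch only covers inserts. Removing the stale index column when an indexed value changes is not shown here.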

Usage

To fetch records, perform a column slice on the index's row in the Indexes column family, then use the results to perform specific key fetches against the source column family. Since columns are always sorted when stored, and specific key fetches are fast, the overall retrieval should be fast.
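The two-step read can be sketched in memory as follows. This is an illustration under stated assumptions, not Cassandra client code: `users_cf` stands in for the source column family, `index_row` for its row in the Indexes column family, and the row key is assumed to be the last component of each composite column name. The function `fetch_by_index` is hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical source column family: row key -> columns.
users_cf = {
    "user1": {"state": "PA", "city": "Philadelphia", "name": "Ann"},
    "user2": {"state": "PA", "city": "Pittsburgh", "name": "Bob"},
    "user3": {"state": "NY", "city": "Albany", "name": "Cy"},
}

# The matching index row, kept sorted: (state, city, row key).
index_row = sorted(
    (row["state"], row["city"], key) for key, row in users_cf.items()
)

def fetch_by_index(index_columns, source_cf, start, end):
    """Slice the index row, then fetch each matching source row by key."""
    # Step 1: column slice on the sorted index row.
    lo = bisect_left(index_columns, start)
    hi = bisect_right(index_columns, end)
    # Step 2: the last component of each name is the source row key.
    return [source_cf[name[-1]] for name in index_columns[lo:hi]]

# Everyone in "PA", returned in index order (by city, then row key).
pa_users = fetch_by_index(index_row, users_cf, ("PA",), ("PA", "\uffff"))
```

Results come back in the sort order of the index, so an index on (state, city) also yields rows ordered by city within a state.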
