ctools's Introduction

Apache Cassandra SSTable Offline Tool

A set of tools to work with Cassandra SSTable files offline.

Requirements

Python 2.7

Getting started

gettoken.py - Converts a given key to its token using RandomPartitioner.

token-hexkey.py - Converts a given hex-encoded key to its token using RandomPartitioner.

sstable.py - Provides common classes for parsing SSTable component files.

sstable-metadata.py - Reads the SSTable stats file and displays the SSTable metadata. It supports format version "ha" onwards. Cassandra 3.11 is also supported.

sstable-index.py - Reads the SSTable index file and displays the SSTable row index entries. It is tested with version "jb".

sstable2json.py - Reads the rows and columns in a given SSTable and converts them to JSON, similar to the sstable2json tool. Unlike sstable2json, it does not need access to the Cassandra column families in the system keyspace to decode SSTable data. It is tested with versions "ic", "jb", "ka" and "lb", and it supports parsing CQL data.
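The key-to-token conversion that gettoken.py and token-hexkey.py perform can be sketched as follows. This is a minimal illustration of how RandomPartitioner derives a token (the MD5 digest of the key read as a signed 128-bit big-endian integer, then made non-negative), not the scripts' actual code. The function names `token_from_key` and `token_from_hex` are hypothetical.

```python
import binascii
import hashlib


def token_from_key(key_bytes):
    """Return the RandomPartitioner token for a raw key (bytes)."""
    digest = hashlib.md5(key_bytes).digest()
    # Interpret the 16-byte digest as a signed big-endian integer.
    value = int(binascii.hexlify(digest), 16)
    if value >= 2 ** 127:
        value -= 2 ** 128
    # The token is the absolute value of that signed integer.
    return abs(value)


def token_from_hex(hex_key):
    """Return the token for a key given as a hex string (hypothetical helper)."""
    return token_from_key(binascii.unhexlify(hex_key))


if __name__ == "__main__":
    # "00000005" is the row key shown in the Examples section.
    print(token_from_hex("00000005"))
```

The explicit sign handling keeps the sketch runnable on both Python 2.7 and Python 3.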

Examples

$ ./sst.py -m data/lb/iris-9cb598404fd011eabbb8b16d9d604ffd/lb-1-big-Data.db

rowSizes: ([1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14...
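The file name in the command above carries the SSTable format version ("lb" here) that the scripts are tested against. As a rough sketch, assuming the `<version>-<generation>-<format>-<component>` naming used by this example file, the parts can be split out like this. The function name `parse_sstable_name` is hypothetical. Older layouts that put the keyspace and table name at the start of the file name are not handled.

```python
import os


def parse_sstable_name(path):
    """Split a file name such as 'lb-1-big-Data.db' into its parts."""
    name = os.path.basename(path)
    parts = name.split("-", 3)
    if len(parts) != 4:
        raise ValueError("unexpected SSTable file name: %r" % name)
    version, generation, fmt, component = parts
    return {
        "version": version,             # e.g. "lb"
        "generation": int(generation),  # e.g. 1
        "format": fmt,                  # e.g. "big"
        "component": component,         # e.g. "Data.db"
    }


if __name__ == "__main__":
    path = "data/lb/iris-9cb598404fd011eabbb8b16d9d604ffd/lb-1-big-Data.db"
    print(parse_sstable_name(path))
```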

$ ./sst.py -c -d data/lb/iris-9cb598404fd011eabbb8b16d9d604ffd/lb-1-big-Data.db

[ {"key": "00000005", "cells": [["636c617373","497269732d76697267696e696361",1581757206044154], ["706574616c6c656e677468","40c00000",1581757206044154], ["706574616c7769647468","40200000",1581757206044154], ["736570616c6c656e677468","40c9999a",1581757206044154], ["736570616c7769647468","40533333",1581757206044154]]}, ... ]
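The key, column names and values in this output are hex-encoded bytes. The sketch below decodes a few of them with the standard library. The type of each column (int key, text and float values) is an assumption about the iris table's schema and is not stated in the output itself. The helper names are hypothetical.

```python
import binascii
import struct


def hex_to_text(hex_str):
    """Decode hex-encoded UTF-8 text, e.g. a column name."""
    return binascii.unhexlify(hex_str).decode("utf-8")


def hex_to_float(hex_str):
    """Decode a hex-encoded 4-byte big-endian float (assumed column type)."""
    return struct.unpack(">f", binascii.unhexlify(hex_str))[0]


def hex_to_int(hex_str):
    """Decode a hex-encoded 4-byte big-endian signed int (assumed key type)."""
    return struct.unpack(">i", binascii.unhexlify(hex_str))[0]


if __name__ == "__main__":
    print(hex_to_int("00000005"))               # row key
    print(hex_to_text("636c617373"))            # column name
    print(hex_to_text("497269732d76697267696e696361"))
    print(hex_to_text("706574616c6c656e677468"), hex_to_float("40c00000"))
```

The third element of each cell (1581757206044154 above) is the write timestamp, which Cassandra clients conventionally express in microseconds since the Unix epoch.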

References

A more detailed description of the storage internals can be found at http://distributeddatastore.blogspot.com

Contributors

bharatendra

ctools's Issues

ImportError: No module named sstidx

Hello,
I tried to execute python2 sst.py -c -d ../md-1-big-Data.db
but I got the following error:

Traceback (most recent call last):
  File "sst.py", line 20, in <module>
    from sstidx import IndexInfo
ImportError: No module named sstidx

I think the error is at line 20 of ctools/sst.py (commit 4f2ee5a):

from sstidx import IndexInfo

since the module to import should be sstable, not sstidx.

Script doesn't work properly with the Cassandra 2.1.12 and ka version

I was trying to extract the data from an SSTable generated by Cassandra 2.1.12, but the script doesn't produce proper output. I can see a few proper strings, but the rest of the data is still in alphanumeric format. Could you please help? I am running Cassandra on Ubuntu 14.04 with SizeTieredCompaction.

colorChange rendorengine","4170706c655765624b6974",1465205127516000], ["����U%�U advantage�vetikocUcY9PJPqR8cCpPG3ON`

SStables `ka` version support

@bharatendra, first of all I'd like to thank you for releasing this. It is an extremely useful library, as DataStax's sstable2json requires a functioning Cassandra installation to do its job.

Are there any plans or tips to make this work with the ka format version?

// jb (2.0.1): switch from crc32 to adler32 for compression checksums
//             checksum the compressed data
// ka (2.1.0): new Statistics.db file format
//             index summaries can be downsampled and the sampling level is persisted
//             switch uncompressed checksums to adler32
//             tracks presense of legacy (local and remote) counter shards
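To illustrate the checksum change these comments mention, here is a small sketch that computes both checksums with Python's zlib module. It only shows the two algorithms on an arbitrary byte string. Which bytes of a chunk Cassandra feeds to the checksum, and where the result is stored, are not covered here, and the sample data is hypothetical.

```python
import zlib


def crc32_checksum(data):
    """CRC32 as an unsigned 32-bit value (same result on Python 2 and 3)."""
    return zlib.crc32(data) & 0xffffffff


def adler32_checksum(data):
    """Adler-32 as an unsigned 32-bit value (same result on Python 2 and 3)."""
    return zlib.adler32(data) & 0xffffffff


if __name__ == "__main__":
    chunk = b"example chunk bytes"  # hypothetical data, not a real SSTable chunk
    print("crc32:   %08x" % crc32_checksum(chunk))
    print("adler32: %08x" % adler32_checksum(chunk))
```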

The changes from jb don't seem very scary (at least from the doc) but a lack of better documentation makes it a bit hard to understand what's going on.

Thank you very much in advance.
