
Comments (6)

abhi-kr commented on July 28, 2024

For an existing cluster, first make sure all the required hbase-secondary-index configurations are in place on your cluster machines (HMaster and RegionServers); if you had to change any configuration, restart your cluster. You can then use the class "org.apache.hadoop.hbase.index.mapreduce.TableIndexer" to create an index on existing user tables:

./hbase org.apache.hadoop.hbase.index.mapreduce.TableIndexer -Dtablename.to.index=<table_name> -Dtable.columns.index='IDX1=>cf1:[q1->datatype&length];cf2:[q1->datatype&length],[q2->datatype&length],[q3->datatype&length]#IDX2=>cf1:q5,q5'

Here,
tablename.to.index: Name of the table to index.
table.columns.index: Table columns on which the index is to be created.

The format used here is:
IDX1 - Name of the Index given by user
cf1 - Column family name of user table
q1 - qualifier name
datatype - data type of the values in column "cf1:q1", one of [Int, String, Double, Float]
length - Maximum length of the values of "cf1:q1"
# - separates the details of two indexes
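
The format above can also be assembled programmatically. Below is a minimal Python sketch, not part of hindex: the function name build_index_spec and its input layout are hypothetical, and it only produces the bracketed [qualifier->datatype&length] form described in this comment. The result is meant to be passed as the value of -Dtable.columns.index.

```python
# Hypothetical helper (not part of hindex): builds the value for
# -Dtable.columns.index in the format described in the comment above.
ALLOWED_TYPES = {"Int", "String", "Double", "Float"}


def build_index_spec(indexes):
    """Build an index spec string.

    indexes maps index name -> {column family -> [(qualifier, datatype, max_length), ...]}.
    """
    index_parts = []
    for index_name, families in indexes.items():
        family_parts = []
        for family, columns in families.items():
            column_parts = []
            for qualifier, datatype, max_length in columns:
                if datatype not in ALLOWED_TYPES:
                    raise ValueError(f"unsupported datatype: {datatype}")
                column_parts.append(f"[{qualifier}->{datatype}&{max_length}]")
            # Columns of one family are comma-separated.
            family_parts.append(f"{family}:" + ",".join(column_parts))
        # Families of one index are separated by ';'.
        index_parts.append(f"{index_name}=>" + ";".join(family_parts))
    # Indexes are separated by '#'.
    return "#".join(index_parts)


spec = build_index_spec({
    "IDX1": {"cf1": [("q1", "String", 10)],
             "cf2": [("q1", "Int", 4), ("q2", "String", 20)]},
    "IDX2": {"cf1": [("q5", "Double", 8)]},
})
print(spec)
```

The table, family and qualifier names and the lengths in the example call are placeholders; substitute the ones from your own user table.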

from hindex.

hy2014 commented on July 28, 2024

Maybe your rowkey is too long, I think.


kunkumar commented on July 28, 2024

I have created an HBase table and an index table with the hindex framework, but when we upload more data into the same table, only the index table keeps growing and no actual data appears in the HBase table. In this case my input data is 80 GB; the index table has grown to 200+ GB, and no new data appears in the main table.

Can rowkey size be the reason for such a huge table size?


hy2014 commented on July 28, 2024

The index table rowkey contains the index column/value and the user table rowkey. As you said, your user table data size has not changed, so it is the index table that accounts for the growth in data size.
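
To see why a long rowkey inflates the index, here is a rough back-of-the-envelope sketch in Python. It assumes only what is stated above (each index rowkey carries the indexed values plus the user table rowkey) and ignores everything else hindex or HBase may add to a key, so treat the result as a lower bound. The function names and the numbers in the example call are hypothetical.

```python
# Rough estimate only: counts the indexed values and the user rowkey,
# and ignores any other prefix, padding or storage overhead.
def estimate_index_key_bytes(user_rowkey_len, indexed_value_lens):
    """Approximate bytes per index rowkey for one user-table row."""
    return sum(indexed_value_lens) + user_rowkey_len


def estimate_index_keys_total(rows, user_rowkey_len, indexed_value_lens):
    """Approximate total rowkey bytes for one index over `rows` rows."""
    return rows * estimate_index_key_bytes(user_rowkey_len, indexed_value_lens)


# Hypothetical numbers: 100-byte user rowkey, two indexed columns
# with maximum lengths 10 and 20, one million rows.
per_row = estimate_index_key_bytes(100, [10, 20])
total = estimate_index_keys_total(1_000_000, 100, [10, 20])
print(per_row, total)
```

In this made-up case the user rowkey makes up most of each index key, which is why shortening the rowkey is the first thing to check.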


SilentMing commented on July 28, 2024

Is there any detailed description of how to implement hindex in an existing cluster?


SilentMing commented on July 28, 2024

Thanks for your kind answer, abhi-kr. I did it successfully.

