vcnc / haeinsa
Haeinsa is a linearly scalable multi-row, multi-table transaction library for HBase.
License: Apache License 2.0
The isolation level of Haeinsa should be described more clearly. In the documentation, the isolation level of Haeinsa is described as serializability, but this can be confused with ANSI SQL SERIALIZABLE as discussed in BERE95. In that paper, ANSI SQL SERIALIZABLE is defined as a consistency level that does not allow phantom reads. Phantom reads are best described in terms of predicate locks, which prevent other transactions from modifying any data satisfying a search condition used by the ongoing transaction.
Although Haeinsa does not allow phantoms in most operations, including intra-row scans, the phantom read phenomenon can still occur in inter-row scans. So the isolation level of Haeinsa meets the conditions of conflict serializability, but not 'ANSI SQL SERIALIZABILITY'.
BERE95: Hal Berenson et al. A Critique of ANSI SQL Isolation Levels. In Proceedings of SIGMOD, 1995.
HaeinsaGet, HaeinsaPut, HaeinsaDelete, HaeinsaIntraScan and HaeinsaScan should have setAttribute for compatibility with HBase operations. See OperationWithAttributes.
We see a concurrency problem when running put with multiple threads and one transaction.
The case is:
We share the HaeinsaTransaction object among a few threads that call put simultaneously. After all of them succeed, we commit.
After we noticed the problem we dived into the code a little bit and saw the following:
protected HaeinsaTableTransaction createOrGetTableState(byte[] tableName) {
    HaeinsaTableTransaction tableTxState = txStates.getTableStates().get(tableName);
    if (tableTxState == null) {
        tableTxState = new HaeinsaTableTransaction(this);
        txStates.getTableStates().put(tableName, tableTxState);
    }
    return tableTxState;
}
This code does not seem to be thread safe.
Please confirm this assumption.
We can fix this issue and commit, but maybe there are other places? And maybe this is bad practice for other reasons as well.
Please advise.
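One possible direction for a fix is sketched below, assuming the table-state map can be swapped for a ConcurrentMap. The names here are simplified stand-ins (not Haeinsa's real classes), and String keys replace byte[] for brevity, since byte[] has identity hashCode and would need a wrapper or comparator. The point is that computeIfAbsent makes the create-or-get step atomic:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical simplified stand-in for the per-table transaction state.
class TableTxState {}

class TxStates {
    // ConcurrentMap instead of a plain map; String keys instead of byte[]
    // for brevity (byte[] has identity hashCode and would need a wrapper).
    private final ConcurrentMap<String, TableTxState> tableStates = new ConcurrentHashMap<>();

    // Atomic create-or-get: concurrent callers for the same table always
    // observe the same TableTxState instance.
    TableTxState createOrGetTableState(String tableName) {
        return tableStates.computeIfAbsent(tableName, name -> new TableTxState());
    }
}
```

Even with this change, the rest of HaeinsaTransaction would need auditing for thread safety before sharing one transaction across threads could be considered supported.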
Hello,
We are running 2 different clients (on different machines) whose System.currentTimeMillis values differ.
Can this cause problems in the correctness of the transactions?
Also, without using your API, it caused us the following problem:
When client A gets an update on item X, it calls Put(row, ts). The ts is taken from the client machine's local timestamp (synced by NTP). Let's assume it was ts = 3.
Now client B (on a different machine) gets an update on the same item 1 ms later. The ts here is 2 (on client A's machine the ts is now 4).
Both updates succeed. If I call get, I get the first update instead of the second one.
The reason is that the second update has a lower ts (due to clock skew between the machines).
The solution was not to set the ts on the client machines and to let the region server set it instead.
I am concerned we will have the same problem with your library.
What do you think?
Ehud
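The anomaly described above can be reproduced with a toy model of a versioned cell (this is an illustration only, not the HBase client API): a read returns the version with the highest timestamp, so a logically later write that carries a lower timestamp is invisible.

```java
import java.util.TreeMap;

// Toy model of a timestamp-versioned cell: versions are keyed by timestamp,
// and a read returns the version with the highest timestamp.
class VersionedCell {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    void put(long ts, String value) {
        versions.put(ts, value);
    }

    // Returns the value of the newest version, or null if the cell is empty.
    String get() {
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }
}
```

With the scenario from the question: client A writes "first" at ts = 3, client B later writes "second" at ts = 2, and a read still returns "first" because 3 > 2.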
Could you please add a setTimeout() method to the transaction? Then the initiator of the transaction can decide the timeout according to the business case.
Thread 1 and thread 2 have the same commit timestamp.
Thread 1 does prewrite of the primary row; thread 2 aborts the primary row immediately after that.
Then threads 1 and 2 both wait for expiration and start all over again.
For now, I am making HaeinsaTransactions::hasSameCommitTimestamp return false all the time.
I am not sure whether this bug has only a deadlock implication or a data-integrity implication as well, but I think it is unsafe to assume that 2 transactions are the same if they have the same commit timestamp.
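The identity point can be sketched with hypothetical types (TxId and its nonce are illustrations, not Haeinsa classes): a commit timestamp alone cannot distinguish two transactions that happen to draw the same wall-clock value, but pairing the timestamp with a per-transaction nonce restores uniqueness.

```java
import java.util.UUID;

// Hypothetical sketch: a transaction identified only by its commit timestamp
// collides with any other transaction that drew the same timestamp. Adding a
// per-transaction nonce makes the identity check safe.
final class TxId {
    final long commitTimestamp;
    final UUID nonce = UUID.randomUUID(); // unique per transaction instance

    TxId(long commitTimestamp) {
        this.commitTimestamp = commitTimestamp;
    }

    boolean isSameTransaction(TxId other) {
        return commitTimestamp == other.commitTimestamp && nonce.equals(other.nonce);
    }
}
```

Storing such a nonce in the lock column would change the lock format, so this is only a sketch of the idea, not a drop-in patch.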
https://docs.travis-ci.com/user/migrating-from-legacy/
This requires downloading the Thrift compiler from somewhere other than a PPA.
Google announced Cloud Bigtable, which supports a subset of the HBase APIs:
http://googlecloudplatform.blogspot.kr/2015/05/introducing-Google-Cloud-Bigtable.html
Cloud Bigtable does not support some HBase APIs, such as Coprocessors, Admin and so on:
https://cloud.google.com/bigtable/docs/hbase-differences
Haeinsa is likely to be able to support Cloud Bigtable, since Haeinsa does not use Coprocessors (unlike other libraries implementing transactions on HBase).
But there is a problem:
Deleting a specific version of a column based on its timestamp is not supported.
The following methods in the class org.apache.hadoop.hbase.client.Delete are not supported:
new Delete(byte[] row, long timestamp)
addColumn(byte[] family, byte[] qualifier)
addFamily(byte[] family, long timestamp)
addFamilyVersion(byte[] family, long timestamp)
Since Haeinsa depends on deleting a specific version of a column based on its timestamp, this problem must be solved for Haeinsa to support Cloud Bigtable.
This problem might be solvable by using some kind of tombstone mechanism rather than deleting a column by specific version.
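A minimal sketch of what such a tombstone mechanism could look like, using a hypothetical in-memory cell model (not the Bigtable or HBase client API): instead of physically removing the version written at a given timestamp, that version is overwritten with a sentinel value that readers skip.

```java
import java.util.TreeMap;

// Hypothetical tombstone alternative to version-targeted deletes: mark the
// version as dead instead of removing it, and filter tombstones on read.
class TombstoneCell {
    private static final String TOMBSTONE = "\u0000__tombstone__"; // sentinel marker

    private final TreeMap<Long, String> versions = new TreeMap<>();

    void put(long ts, String value) {
        versions.put(ts, value);
    }

    // Replaces a timestamp-targeted delete: mark the version instead of removing it.
    void markDeleted(long ts) {
        versions.put(ts, TOMBSTONE);
    }

    // A read skips tombstoned versions and returns the newest live value.
    String get() {
        return versions.descendingMap().values().stream()
                .filter(v -> !TOMBSTONE.equals(v))
                .findFirst().orElse(null);
    }
}
```

The open question for Haeinsa would be garbage-collecting tombstones, since Bigtable would otherwise retain them as ordinary cell versions.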
Is there a strong reason for not being able to delete a complete row?
Currently the method HaeinsaTable.delete() checks that the size of the family map is not zero and throws an exception if that is the case. The comment above the precondition check states that it is not possible to delete entire rows because of the lock column.
But shouldn't it be possible to remove the complete row in the HaeinsaTable.makeStable() method? Instead of setting the lock state for this row to STABLE, the row is removed from HBase.
In case the row is a secondary row in the transaction, the primary row needs the information that this row needs to be deleted. During transaction recovery (in the state COMMITTED) it can check whether the row has been deleted or not. It can delete the row if the prewriteTimestamp matches (in order not to delete a concurrently added new version).
In case the row is the primary row in the transaction, it can be removed completely.
If there are two competing transactions, one trying to delete and the other trying to update the row, then the updating one would fail (if the other one was faster) because the checkAndPut() operation fails when the row no longer exists. For the other way round, HBase's checkAndDelete() operation can be used.
Hi,
This is not really an issue with the current codebase as it uses HBase 0.94. I am trying to upgrade it to use HBase 1.2 for our purposes and hoping you can give some guidance on the only failing unit test. I can share my fork if you want, but I didn't make many changes other than getting rid of the table pool: now that HBase Table instances are lightweight, I create them and HaeinsaTable instances on demand.
All unit tests except HaeinsaComplexTest.testSerializability are passing. That one seems to suffer from a high failure rate but still seems to be making progress towards the 100 successful iterations it requires, though at a very slow rate. Most transactions are failing with the following:
FAILED ITERATION 8 THREAD Serializability-job-thread-4 ERROR: this row is unstable and not expired yet.
ERROR
kr.co.vcnc.haeinsa.exception.NotExpiredYetException: this row is unstable and not expired yet.
at kr.co.vcnc.haeinsa.HaeinsaTable.checkAndIsShouldRecover(HaeinsaTable.java:438)
at kr.co.vcnc.haeinsa.HaeinsaTable.access$000(HaeinsaTable.java:75)
at kr.co.vcnc.haeinsa.HaeinsaTable$ClientScanner.next(HaeinsaTable.java:1030)
at kr.co.vcnc.haeinsa.HaeinsaTable.get(HaeinsaTable.java:160)
at kr.co.vcnc.haeinsa.HaeinsaComplexTest$2.run(HaeinsaComplexTest.java:244)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I see the same in the older codebase, but the failure rate is lower and it fairly quickly gets to 100 successes and the test passes. In the new code it runs for hours with most threads still at iteration numbers in the 50s, at which point the threads start throwing out-of-memory exceptions.
If I can get some clues on where to look, I'd be happy to contribute any required changes back to the project.
Thanks.
Haeinsa is designed to be robust against I/O failures of HBase. To simulate I/O failures, an HTableInterface implementation whose operations fail randomly is needed. This implementation of HTableInterface throws IOException on methods at random.
In the unit test, we check the consistency of the data in HBase. The implementation of this unit test might be similar to the implementation of HaeinsaComplexTest.
This idea is inspired by a blog post from Cloudera. (See the 'Randomized fault testing' section.)
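One way to sketch such a fault-injecting implementation is with a JDK dynamic proxy, so that every method fails at a configurable rate without hand-writing each override. FaultyTable below is a hypothetical stand-in interface (the real target would be HBase's HTableInterface):

```java
import java.io.IOException;
import java.lang.reflect.Proxy;
import java.util.Random;

// Randomized fault injection via a dynamic proxy: each call on the wrapped
// interface throws IOException with probability failureRate, otherwise it is
// forwarded to the real delegate.
class FaultInjector {
    // Hypothetical stand-in for HTableInterface, reduced to one method.
    interface FaultyTable {
        byte[] get(byte[] row) throws IOException;
    }

    static FaultyTable wrap(FaultyTable delegate, double failureRate, long seed) {
        Random random = new Random(seed); // seeded for reproducible test runs
        return (FaultyTable) Proxy.newProxyInstance(
                FaultyTable.class.getClassLoader(),
                new Class<?>[] {FaultyTable.class},
                (proxy, method, args) -> {
                    if (random.nextDouble() < failureRate) {
                        throw new IOException("injected fault on " + method.getName());
                    }
                    return method.invoke(delegate, args);
                });
    }
}
```

A consistency test would then run Haeinsa transactions against the wrapped table at various failure rates and verify invariants afterwards.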
When running a read without a HaeinsaTransaction, we can see that we get the values in the HaeinsaResult, but I can use neither the getValue API nor containsColumn (since it uses getValue).
Is this a bug?
In order to work around this, I use the list and match the value by (family, qualifier) with my own code.
Haeinsa applies the mutations of a transaction to rows in the commit phase. If the transaction contains several mutations for a certain row, Haeinsa applies the mutations step by step with checkAndPut and checkAndDelete operations. We can optimize this step.
Let's assume that a certain transaction should apply the following mutations to a certain row:
Here is a comparison of how Haeinsa works in the current version versus the optimized version:

AS-IS Haeinsa | TO-BE Haeinsa
---|---
checkAndPut(row, [{col1, value1}, {col2, value2}, {lock, newLock}]) checkAndDelete(row, [col1]) checkAndPut(row, [{col3, value3}, {col4, value4}]) checkAndDelete(row, [col3]) checkAndPut(row, [{col5, value5}], {lock, newLock}) | checkAndPut(row, [{col2, value2}, {lock, newLock}]) checkAndDelete(row, [col1, col3]) checkAndPut(row, [{col1, value4}, {col2, value5}, {lock, newLock}])

The series of operations above have the same result, but TO-BE uses fewer operations, which means it is faster. I didn't mention the deleteFamily operation in the example, but this should be considered too.
Note that this optimization is effective for complex transactions. With this mutation-merging algorithm, we can achieve faster execution of complex transactions.
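The merging idea can be sketched with a toy model (hypothetical types, not Haeinsa's HaeinsaMutation classes): represent each mutation step as a column-to-value map, with null marking a delete. Because the last write per column wins, any sequence of steps collapses to one final state, which needs at most one put batch and one delete batch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy mutation-merging sketch: each step maps column -> value, where a null
// value marks a delete of that column. Later steps override earlier ones.
class MutationMerger {
    static Map<String, String> merge(List<Map<String, String>> steps) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> step : steps) {
            merged.putAll(step); // last write per column wins
        }
        return merged;
    }

    // Extracts the columns to delete (one checkAndDelete batch) from the merged state.
    static List<String> deletes(Map<String, String> merged) {
        List<String> cols = new ArrayList<>();
        for (Map.Entry<String, String> e : merged.entrySet()) {
            if (e.getValue() == null) {
                cols.add(e.getKey());
            }
        }
        return cols;
    }
}
```

The real implementation would additionally need to respect lock-column updates and deleteFamily semantics, which this sketch ignores.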
Haeinsa uses a 'PrewriteTimestamp' and a 'CommitTimestamp'.
I am wondering whether it would be possible to issue only the CommitTimestamp and use it as the data version.
(The PrewriteTimestamp is used as the data version; could the CommitTimestamp play that role as well?)
Why does Haeinsa deliberately distinguish between the PrewriteTimestamp and the CommitTimestamp?
Travis CI now supports oraclejdk8. Adding oraclejdk8 to the Travis CI configuration would be good.
http://blog.travis-ci.com/2013-11-26-test-your-java-libraries-on-java-8/
I am currently using this library with 3000-5000 operations per transaction.
99% of the operations are reads.
As a result, I need a batch API in order to speed up execution.
I am planning to create a batch API for gets, and to update commit to check the gets in a batched way as well.
Any ideas why this would not work?
When modifying the same row during intra-row scanning, in some cases the following exception can occur.
java.util.ConcurrentModificationException
at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
at com.google.common.collect.Iterators$6.next(Iterators.java:641)
at kr.co.vcnc.haeinsa.HaeinsaMutation$MutationScanner.peek(HaeinsaMutation.java:152)
at kr.co.vcnc.haeinsa.HaeinsaTable$ClientScanner.nextScanner(HaeinsaTable.java:1148)
at kr.co.vcnc.haeinsa.HaeinsaTable$ClientScanner.next(HaeinsaTable.java:1111)
at kr.co.vcnc.haeinsa.HaeinsaTable$ClientScanner$1.hasNext(HaeinsaTable.java:974)
....
This can also cause inconsistency, because it doesn't provide a snapshot of HBase when starting a scan. Inter-row scanning could have the same issue.
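The failure mode can be reproduced in isolation with a plain TreeMap (no Haeinsa classes involved): TreeMap's iterator is fail-fast, so mutating the map while a scan-like iteration is in progress makes the next iteration step throw ConcurrentModificationException.

```java
import java.util.Iterator;
import java.util.TreeMap;

// Minimal reproduction of the failure mode: mutating a TreeMap while an
// iterator over it is live triggers ConcurrentModificationException.
class CmeDemo {
    static boolean triggers() {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("col1", "v1");
        row.put("col2", "v2");
        Iterator<String> scan = row.keySet().iterator();
        scan.next();            // scanning in progress...
        row.put("col3", "v3");  // ...while the same row is modified
        try {
            scan.next();        // fail-fast iterator detects the change
            return false;
        } catch (java.util.ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Taking a copy-on-write snapshot of the mutation map before iterating is one conventional way to avoid this class of bug.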
I understand that Haeinsa demonstrated its performance by comparing transactions per second against HBase.
However, HBase fundamentally does not support multi-row transactions, while Haeinsa does,
so I am curious how the 'transactions' per second of the two systems could be compared.
(That is, since HBase does not support transactions in the first place, how was its transactions-per-second figure computed?)
I have read your usage guide on GitHub (obviously, all of it),
but I am really curious how, more specifically, this system runs in HBase.
For example, plain HBase executes its Put operation like this:
" hbase(main):001:0> put 'table', 'row', 'column family', 'value' "
But how can I use Haeinsa's transaction operations in a 'terminal' environment like the above?
Would it look something like this?
" hbase(main):001:0> HaeinsaTx 'table1', 'row1', 'column family1', 'value1', 'table2', 'row2', 'column family2', 'value2' "
I am really curious about it.
THANKS.
PLUS:
if I want to use a PUT operation with Haeinsa,
can I use it like this?
" hbase(main):001:0> HaeinsaPut 'table', 'row', 'column family', 'value' "
HaeinsaGet, HaeinsaIntraScan and HaeinsaScan should have a setFilter method for compatibility with HBase operations. See Filter.
Several constants are currently declared in HaeinsaConstants, so users can't modify those properties. To make properties such as the qualifier of the lock column and the lock timeout configurable, we need a class such as HaeinsaConfiguration.
See an example of how it could be used:
// Configure properties of Haeinsa.
// If the user doesn't set a value, Haeinsa will use the default value.
HaeinsaConfiguration conf = new HaeinsaConfiguration();
conf.setRowLockColumnQualifier("lock");
conf.setRowLockColumnFamily("!lock!");
conf.setRowLockTimeout(TimeUnit.SECONDS.toMillis(5));
conf.setRecoverMaxRetryCount(3);
// Pass HaeinsaConfiguration to HaeinsaTransactionManager.
HaeinsaTransactionManager tm = new HaeinsaTransactionManager(tablePool, conf);
HaeinsaTableIface table = tablePool.getTable("test");
We are running some load / performance tests on Haeinsa.
To check recovery time, our scenario runs X transactions with 2 threads. Each transaction touches the same 3 rows.
If some fail, we retry again with only one thread.
We expected that the first retry, using only one thread, would succeed in committing them all, one by one, but we see that this is not always the case; moreover, when X >= 100 we sometimes get into a long loop of "this row is unstable and not expired yet".
So 3 questions:
The performance of Haeinsa transactions can be improved with asynchronous RPC in HBase. If AsyncHBaseClient (whose implementation was mentioned in HBASE-2182) is supported in HBase, we can use this client and improve the performance of Haeinsa.
The commit operation of a Haeinsa transaction consists of the following sequence of operations.
Note that each of the 2nd, 4th and 5th operations runs several HBase operations against several secondary rows. Here, the series of HBase operations in the 2nd step can run concurrently, and the series of HBase operations in each of the 4th and 5th steps can run concurrently too. To keep Haeinsa transactions robust, the following conditions should be met:
The tricky part of this optimization is the 4th step: within each secondary row, the order of the mutations, i.e. the order of checkAndPut and checkAndDelete, must be strictly kept.
With this optimization, the latency of a transaction can be reduced, which leads to a lower conflict rate between transactions. (But, at the same time, it can cause somewhat more operations, which means more throughput is needed on HBase to use Haeinsa.)
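The required concurrency structure can be sketched with CompletableFuture (a hypothetical helper, not Haeinsa or AsyncHBaseClient code): the operation chains for different secondary rows run concurrently, while the operations within one row are chained so their relative order is strictly kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical sketch of the ordering constraint: each inner list holds the
// ordered operations for one secondary row. Rows proceed concurrently, but
// within a row each operation starts only after the previous one completes.
class PerRowPipelines {
    static CompletableFuture<Void> run(List<List<Runnable>> perRowOps, ExecutorService pool) {
        List<CompletableFuture<Void>> rows = new ArrayList<>();
        for (List<Runnable> ops : perRowOps) {
            CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
            for (Runnable op : ops) {
                chain = chain.thenRunAsync(op, pool); // sequential within one row
            }
            rows.add(chain);
        }
        // Completes when every row's chain has finished; rows overlap freely.
        return CompletableFuture.allOf(rows.toArray(new CompletableFuture[0]));
    }
}
```

In the real client, each Runnable would be an asynchronous checkAndPut or checkAndDelete RPC, and failure handling (abort, recovery) would hook into the chain.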
In HBase 0.98.0, there are major changes to the HBase client interface. Supporting HBase 0.98.0 will require some work. See Upgrading from 0.96.x to 0.98.x.
Hi,
We are using your great solution for transactional operations over HBase.
Some of our APIs use checkAndPut and increment counters.
What will happen if I use the Haeinsa API and also checkAndPut on the same rows?
Is there a best practice for that, or should I implement it myself?
BTW, I compiled your code with CDH5 (removed the tests); you can see it in my fork, and it is working fine.
Thanks
Ehud
Great job on the transaction support! It would be interesting to have a version of Haeinsa that works with Phoenix.