vcnc / haeinsa
Haeinsa is a linearly scalable multi-row, multi-table transaction library for HBase.
License: Apache License 2.0
The isolation level of Haeinsa should be described more clearly. In the documentation, the isolation level of Haeinsa is described as serializability, but this can be confused with ANSI SQL SERIALIZABLE as discussed in BERE95. In that paper, ANSI SQL SERIALIZABLE is defined as a consistency level that does not allow phantom reads. Phantom reads are best described in terms of predicate locks, which prevent other transactions from modifying any data satisfying a search condition used by the ongoing transaction.
Although Haeinsa does not allow phantoms in most operations, including intra-row scans, the phantom read phenomenon can still occur in inter-row scans. So the isolation level of Haeinsa meets the conditions of conflict serializability, but not 'ANSI SQL SERIALIZABILITY'.
BERE95: Hal Berenson et al. A Critique of ANSI SQL Isolation Levels. In Proceedings of SIGMOD, 1995.
HaeinsaGet, HaeinsaPut, HaeinsaDelete, HaeinsaIntraScan and HaeinsaScan should have setAttribute for compatibility with HBase operations. See OperationWithAttributes.
We see a concurrency problem when running put with multiple threads and one transaction.
The case is:
We share the HaeinsaTransaction object among a few threads that call put simultaneously. After all of them succeed, we commit.
After we noticed the problem we dived into the code a little bit and saw the following:
protected HaeinsaTableTransaction createOrGetTableState(byte[] tableName) {
    HaeinsaTableTransaction tableTxState = txStates.getTableStates().get(tableName);
    if (tableTxState == null) {
        tableTxState = new HaeinsaTableTransaction(this);
        txStates.getTableStates().put(tableName, tableTxState);
    }
    return tableTxState;
}
This code does not seem to be thread safe.
Please confirm this assumption.
We can fix this issue and commit, but maybe there are other places? And maybe this is bad practice for other reasons as well.
Please advise.
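One possible direction for a fix is sketched below, assuming the table-state map can be swapped for a ConcurrentMap. The names here are simplified stand-ins (not Haeinsa's real classes), and String keys replace byte[] for brevity, since byte[] has identity hashCode and would need a wrapper or comparator. The point is that computeIfAbsent makes the create-or-get step atomic:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical simplified stand-in for the per-table transaction state.
class TableTxState {}

class TxStates {
    // ConcurrentMap instead of a plain map; String keys instead of byte[]
    // for brevity (byte[] has identity hashCode and would need a wrapper).
    private final ConcurrentMap<String, TableTxState> tableStates = new ConcurrentHashMap<>();

    // Atomic create-or-get: concurrent callers for the same table always
    // observe the same TableTxState instance.
    TableTxState createOrGetTableState(String tableName) {
        return tableStates.computeIfAbsent(tableName, name -> new TableTxState());
    }
}
```

Even with this change, the rest of HaeinsaTransaction would need auditing for thread safety before sharing one transaction across threads could be considered supported.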
Hello,
We are running 2 different clients (on different machines) whose System.currentTimeMillis values differ.
Can this cause problems in the correctness of the transactions?
Also, without using your API, it caused us the following problem:
When client A gets an update on item X, it calls Put(row, ts). The ts is taken from the client machine's local timestamp (synced by NTP). Let's assume it was ts = 3.
Now client B (on a different machine) gets an update on the same item 1 ms later. The ts here is 2 (on client A's machine the ts is now 4).
Both updates succeed. If I call get, I get the first update instead of the second one.
The reason is that the second update has a lower ts (due to clock skew between the machines).
The solution was not to set the ts on the client machines and to let the region server set it instead.
I am concerned we will have the same problem with your library.
What do you think?
Ehud
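The anomaly described above can be reproduced with a toy model of a versioned cell (this is an illustration only, not the HBase client API): a read returns the version with the highest timestamp, so a logically later write that carries a lower timestamp is invisible.

```java
import java.util.TreeMap;

// Toy model of a timestamp-versioned cell: versions are keyed by timestamp,
// and a read returns the version with the highest timestamp.
class VersionedCell {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    void put(long ts, String value) {
        versions.put(ts, value);
    }

    // Returns the value of the newest version, or null if the cell is empty.
    String get() {
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }
}
```

With the scenario from the question: client A writes "first" at ts = 3, client B later writes "second" at ts = 2, and a read still returns "first" because 3 > 2.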
Could you please add a setTimeout() method to the transaction? Then the initiator of the transaction can decide the timeout according to the business case.
Thread 1 and thread 2 have the same commit timestamp.
Thread 1 does prewrite of the primary row; thread 2 aborts the primary row immediately after that.
Then threads 1 and 2 both wait for expiration and start all over again.
For now, I am making HaeinsaTransactions::hasSameCommitTimestamp return false all the time.
I am not sure whether this bug has only a deadlock implication or a data-integrity implication as well, but I think it is unsafe to assume that 2 transactions are the same if they have the same commit timestamp.
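The identity point can be sketched with hypothetical types (TxId and its nonce are illustrations, not Haeinsa classes): a commit timestamp alone cannot distinguish two transactions that happen to draw the same wall-clock value, but pairing the timestamp with a per-transaction nonce restores uniqueness.

```java
import java.util.UUID;

// Hypothetical sketch: a transaction identified only by its commit timestamp
// collides with any other transaction that drew the same timestamp. Adding a
// per-transaction nonce makes the identity check safe.
final class TxId {
    final long commitTimestamp;
    final UUID nonce = UUID.randomUUID(); // unique per transaction instance

    TxId(long commitTimestamp) {
        this.commitTimestamp = commitTimestamp;
    }

    boolean isSameTransaction(TxId other) {
        return commitTimestamp == other.commitTimestamp && nonce.equals(other.nonce);
    }
}
```

Storing such a nonce in the lock column would change the lock format, so this is only a sketch of the idea, not a drop-in patch.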
https://docs.travis-ci.com/user/migrating-from-legacy/
This requires downloading the Thrift compiler from somewhere other than a PPA.
Google announced Cloud Bigtable, which supports a subset of the HBase APIs:
http://googlecloudplatform.blogspot.kr/2015/05/introducing-Google-Cloud-Bigtable.html
Cloud Bigtable does not support some HBase APIs, such as Coprocessors, Admin and so on:
https://cloud.google.com/bigtable/docs/hbase-differences
Haeinsa is likely to be able to support Cloud Bigtable, since Haeinsa does not use Coprocessors (unlike other libraries implementing transactions on HBase).
But there is a problem:
Deleting a specific version of a column based on its timestamp is not supported.
The following methods in the class org.apache.hadoop.hbase.client.Delete are not supported:
new Delete(byte[] row, long timestamp)
addColumn(byte[] family, byte[] qualifier)
addFamily(byte[] family, long timestamp)
addFamilyVersion(byte[] family, long timestamp)
Since Haeinsa depends on deleting a specific version of a column based on its timestamp, this problem must be solved for Haeinsa to support Cloud Bigtable.
This problem might be solvable by using some kind of tombstone mechanism rather than deleting a column by specific version.
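A minimal sketch of what such a tombstone mechanism could look like, using a hypothetical in-memory cell model (not the Bigtable or HBase client API): instead of physically removing the version written at a given timestamp, that version is overwritten with a sentinel value that readers skip.

```java
import java.util.TreeMap;

// Hypothetical tombstone alternative to version-targeted deletes: mark the
// version as dead instead of removing it, and filter tombstones on read.
class TombstoneCell {
    private static final String TOMBSTONE = "\u0000__tombstone__"; // sentinel marker

    private final TreeMap<Long, String> versions = new TreeMap<>();

    void put(long ts, String value) {
        versions.put(ts, value);
    }

    // Replaces a timestamp-targeted delete: mark the version instead of removing it.
    void markDeleted(long ts) {
        versions.put(ts, TOMBSTONE);
    }

    // A read skips tombstoned versions and returns the newest live value.
    String get() {
        return versions.descendingMap().values().stream()
                .filter(v -> !TOMBSTONE.equals(v))
                .findFirst().orElse(null);
    }
}
```

The open question for Haeinsa would be garbage-collecting tombstones, since Bigtable would otherwise retain them as ordinary cell versions.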
Is there a strong reason for not being able to delete a complete row?
Currently the method HaeinsaTable.delete() checks that the size of the family map is not zero and throws an exception if that is the case. The comment above the precondition check states that it is not possible to delete entire rows because of the lock column.
But shouldn't it be possible to remove the complete row in the HaeinsaTable.makeStable() method? Instead of setting the lock state for this row to STABLE, the row is removed from HBase.
In case the row is a secondary row in the transaction, the primary row needs the information that this row needs to be deleted. During transaction recovery (in the state COMMITTED) it can check whether the row has been deleted or not. It can delete the row if the prewriteTimestamp matches (in order not to delete a concurrently added new version).
In case the row is the primary row in the transaction, it can be removed completely.
If there are two competing transactions, one trying to delete and the other trying to update the row, then the updating one would fail (if the other one was faster) because the checkAndPut() operation fails when the row no longer exists. For the other way round, HBase's checkAndDelete() operation can be used.
Hi,
This is not really an issue with the current codebase as it uses HBase 0.94. I am trying to upgrade it to use HBase 1.2 for our purposes and hoping you can give some guidance on the only failing unit test. I can share my fork if you want, but I didn't make many changes other than getting rid of the table pool: now that HBase Table instances are lightweight, I create them and HaeinsaTable instances on demand.
All unit tests except HaeinsaComplexTest.testSerializability are passing. That one seems to suffer from a high failure rate but still seems to be making progress towards the 100 successful iterations it requires, though at a very slow rate. Most transactions are failing with the following:
FAILED ITERATION 8 THREAD Serializability-job-thread-4 ERROR: this row is unstable and not expired yet.
ERROR
kr.co.vcnc.haeinsa.exception.NotExpiredYetException: this row is unstable and not expired yet.
at kr.co.vcnc.haeinsa.HaeinsaTable.checkAndIsShouldRecover(HaeinsaTable.java:438)
at kr.co.vcnc.haeinsa.HaeinsaTable.access$000(HaeinsaTable.java:75)
at kr.co.vcnc.haeinsa.HaeinsaTable$ClientScanner.next(HaeinsaTable.java:1030)
at kr.co.vcnc.haeinsa.HaeinsaTable.get(HaeinsaTable.java:160)
at kr.co.vcnc.haeinsa.HaeinsaComplexTest$2.run(HaeinsaComplexTest.java:244)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I see the same in the older codebase, but the failure rate is lower and it fairly quickly gets to 100 successes and the test passes. In the new code it runs for hours with most threads still at iteration numbers in the 50s, at which point the threads start throwing out-of-memory exceptions.
If I can get some clues on where to look, I'd be happy to contribute any required changes back to the project.
Thanks.
Haeinsa is designed to be robust against I/O failures of HBase. To simulate I/O failures, an HTableInterface implementation whose operations fail randomly is needed. This implementation of HTableInterface throws IOException on methods at random.
In the unit test, we check the consistency of the data in HBase. The implementation of this unit test might be similar to the implementation of HaeinsaComplexTest.
This idea is inspired by a blog post from Cloudera. (See the 'Randomized fault testing' section.)
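One way to sketch such a fault-injecting implementation is with a JDK dynamic proxy, so that every method fails at a configurable rate without hand-writing each override. FaultyTable below is a hypothetical stand-in interface (the real target would be HBase's HTableInterface):

```java
import java.io.IOException;
import java.lang.reflect.Proxy;
import java.util.Random;

// Randomized fault injection via a dynamic proxy: each call on the wrapped
// interface throws IOException with probability failureRate, otherwise it is
// forwarded to the real delegate.
class FaultInjector {
    // Hypothetical stand-in for HTableInterface, reduced to one method.
    interface FaultyTable {
        byte[] get(byte[] row) throws IOException;
    }

    static FaultyTable wrap(FaultyTable delegate, double failureRate, long seed) {
        Random random = new Random(seed); // seeded for reproducible test runs
        return (FaultyTable) Proxy.newProxyInstance(
                FaultyTable.class.getClassLoader(),
                new Class<?>[] {FaultyTable.class},
                (proxy, method, args) -> {
                    if (random.nextDouble() < failureRate) {
                        throw new IOException("injected fault on " + method.getName());
                    }
                    return method.invoke(delegate, args);
                });
    }
}
```

A consistency test would then run Haeinsa transactions against the wrapped table at various failure rates and verify invariants afterwards.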
When running a read without a HaeinsaTransaction, we can see that we get the values in the HaeinsaResult, but I can use neither the getValue API nor containsColumn (since it uses getValue).
Is this a bug?
In order to work around this, I use the list and match the value by (family, qualifier) with my own code.
Haeinsa applies the mutations of a transaction to rows in the commit phase. If the transaction contains several mutations for a certain row, Haeinsa applies the mutations step by step with checkAndPut and checkAndDelete operations. We can optimize this step.
Let's assume that a certain transaction should apply the following mutations to a certain row:
Here is a comparison of how Haeinsa works in the current version versus the optimized version:

AS-IS Haeinsa | TO-BE Haeinsa
---|---
checkAndPut(row, [{col1, value1}, {col2, value2}, {lock, newLock}]) checkAndDelete(row, [col1]) checkAndPut(row, [{col3, value3}, {col4, value4}]) checkAndDelete(row, [col3]) checkAndPut(row, [{col5, value5}], {lock, newLock}) | checkAndPut(row, [{col2, value2}, {lock, newLock}]) checkAndDelete(row, [col1, col3]) checkAndPut(row, [{col1, value4}, {col2, value5}, {lock, newLock}])

The series of operations above have the same result, but TO-BE uses fewer operations, which means it is faster. I didn't mention the deleteFamily operation in the example, but this should be considered too.
Note that this optimization is effective for complex transactions. With this mutation-merging algorithm, we can achieve faster execution of complex transactions.
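The merging idea can be sketched with a toy model (hypothetical types, not Haeinsa's HaeinsaMutation classes): represent each mutation step as a column-to-value map, with null marking a delete. Because the last write per column wins, any sequence of steps collapses to one final state, which needs at most one put batch and one delete batch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy mutation-merging sketch: each step maps column -> value, where a null
// value marks a delete of that column. Later steps override earlier ones.
class MutationMerger {
    static Map<String, String> merge(List<Map<String, String>> steps) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> step : steps) {
            merged.putAll(step); // last write per column wins
        }
        return merged;
    }

    // Extracts the columns to delete (one checkAndDelete batch) from the merged state.
    static List<String> deletes(Map<String, String> merged) {
        List<String> cols = new ArrayList<>();
        for (Map.Entry<String, String> e : merged.entrySet()) {
            if (e.getValue() == null) {
                cols.add(e.getKey());
            }
        }
        return cols;
    }
}
```

The real implementation would additionally need to respect lock-column updates and deleteFamily semantics, which this sketch ignores.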
Haeinsa uses a 'PrewriteTimestamp' and a 'CommitTimestamp'.
I am wondering whether it would be possible to issue only the CommitTimestamp and use it as the data version.
(The PrewriteTimestamp is used as the data version; could the CommitTimestamp play that role as well?)
Why does Haeinsa deliberately distinguish between the PrewriteTimestamp and the CommitTimestamp?
Travis CI now supports oraclejdk8. Adding oraclejdk8 to the Travis CI configuration would be good.
http://blog.travis-ci.com/2013-11-26-test-your-java-libraries-on-java-8/
I am currently using this library with 3000-5000 operations per transaction.
99% of the operations are reads.
As a result, I need a batch API in order to speed up execution.
I am planning to create a batch API for gets, and to update commit to check the gets in a batched way as well.
Any ideas why this would not work?
When modifying the same row during intra-row scanning, in some cases the following exception can occur.
java.util.ConcurrentModificationException
at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
at com.google.common.collect.Iterators$6.next(Iterators.java:641)
at kr.co.vcnc.haeinsa.HaeinsaMutation$MutationScanner.peek(HaeinsaMutation.java:152)
at kr.co.vcnc.haeinsa.HaeinsaTable$ClientScanner.nextScanner(HaeinsaTable.java:1148)
at kr.co.vcnc.haeinsa.HaeinsaTable$ClientScanner.next(HaeinsaTable.java:1111)
at kr.co.vcnc.haeinsa.HaeinsaTable$ClientScanner$1.hasNext(HaeinsaTable.java:974)
....
This can also cause inconsistency, because it doesn't provide a snapshot of HBase when starting a scan. Inter-row scanning could have the same issue.
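The failure mode can be reproduced in isolation with a plain TreeMap (no Haeinsa classes involved): TreeMap's iterator is fail-fast, so mutating the map while a scan-like iteration is in progress makes the next iteration step throw ConcurrentModificationException.

```java
import java.util.Iterator;
import java.util.TreeMap;

// Minimal reproduction of the failure mode: mutating a TreeMap while an
// iterator over it is live triggers ConcurrentModificationException.
class CmeDemo {
    static boolean triggers() {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("col1", "v1");
        row.put("col2", "v2");
        Iterator<String> scan = row.keySet().iterator();
        scan.next();            // scanning in progress...
        row.put("col3", "v3");  // ...while the same row is modified
        try {
            scan.next();        // fail-fast iterator detects the change
            return false;
        } catch (java.util.ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Taking a copy-on-write snapshot of the mutation map before iterating is one conventional way to avoid this class of bug.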
I understand that Haeinsa demonstrated its performance by comparing transactions per second against HBase.
However, HBase fundamentally does not support multi-row transactions, while Haeinsa does,
so I am curious how the 'transactions' per second of the two systems could be compared.
(That is, since HBase does not support transactions in the first place, how was its transactions-per-second figure computed?)
I have read your usage guide on GitHub (obviously, all of it),
but I am really curious how, more specifically, this system runs in HBase.
For example, plain HBase executes its Put operation like this:
" hbase(main):001:0> put 'table', 'row', 'column family', 'value' "
But how can I use Haeinsa's transaction operations in a 'terminal' environment like the above?
Would it look something like this?
" hbase(main):001:0> HaeinsaTx 'table1', 'row1', 'column family1', 'value1', 'table2', 'row2', 'column family2', 'value2' "
I am really curious about it.
THANKS.
PLUS:
if I want to use a PUT operation with Haeinsa,
can I use it like this?
" hbase(main):001:0> HaeinsaPut 'table', 'row', 'column family', 'value' "
HaeinsaGet, HaeinsaIntraScan and HaeinsaScan should have a setFilter method for compatibility with HBase operations. See Filter.
Several constants are currently declared in HaeinsaConstants, so users can't modify those properties. To make properties such as the qualifier of the lock column and the lock timeout configurable, we need a class such as HaeinsaConfiguration.
See an example of how it could be used:
// Configure properties of Haeinsa.
// If the user doesn't set a value, Haeinsa will use the default value.
HaeinsaConfiguration conf = new HaeinsaConfiguration();
conf.setRowLockColumnQualifier("lock");
conf.setRowLockColumnFamily("!lock!");
conf.setRowLockTimeout(TimeUnit.SECONDS.toMillis(5));
conf.setRecoverMaxRetryCount(3);
// Pass HaeinsaConfiguration to HaeinsaTransactionManager.
HaeinsaTransactionManager tm = new HaeinsaTransactionManager(tablePool, conf);
HaeinsaTableIface table = tablePool.getTable("test");
We are running some load / performance tests on Haeinsa.
To check recovery time, our scenario runs X transactions with 2 threads. Each transaction touches the same 3 rows.
If some fail, we retry again with only one thread.
We expected that the first retry, using only one thread, would succeed in committing them all, one by one, but we see that this is not always the case; moreover, when X >= 100 we sometimes get into a long loop of "this row is unstable and not expired yet".
So 3 questions:
The performance of Haeinsa transactions can be improved with asynchronous RPC in HBase. If AsyncHBaseClient (whose implementation was mentioned in HBASE-2182) is supported in HBase, we can use this client and improve the performance of Haeinsa.
The commit operation of a Haeinsa transaction consists of the following sequence of operations.
Note that each of the 2nd, 4th and 5th operations runs several HBase operations against several secondary rows. Here, the series of HBase operations in the 2nd step can run concurrently, and the series of HBase operations in each of the 4th and 5th steps can run concurrently too. To keep Haeinsa transactions robust, the following conditions should be met:
The tricky part of this optimization is the 4th step: within each secondary row, the order of the mutations, i.e. the order of checkAndPut and checkAndDelete, must be strictly kept.
With this optimization, the latency of a transaction can be reduced, which leads to a lower conflict rate between transactions. (But, at the same time, it can cause somewhat more operations, which means more throughput is needed on HBase to use Haeinsa.)
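The required concurrency structure can be sketched with CompletableFuture (a hypothetical helper, not Haeinsa or AsyncHBaseClient code): the operation chains for different secondary rows run concurrently, while the operations within one row are chained so their relative order is strictly kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical sketch of the ordering constraint: each inner list holds the
// ordered operations for one secondary row. Rows proceed concurrently, but
// within a row each operation starts only after the previous one completes.
class PerRowPipelines {
    static CompletableFuture<Void> run(List<List<Runnable>> perRowOps, ExecutorService pool) {
        List<CompletableFuture<Void>> rows = new ArrayList<>();
        for (List<Runnable> ops : perRowOps) {
            CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
            for (Runnable op : ops) {
                chain = chain.thenRunAsync(op, pool); // sequential within one row
            }
            rows.add(chain);
        }
        // Completes when every row's chain has finished; rows overlap freely.
        return CompletableFuture.allOf(rows.toArray(new CompletableFuture[0]));
    }
}
```

In the real client, each Runnable would be an asynchronous checkAndPut or checkAndDelete RPC, and failure handling (abort, recovery) would hook into the chain.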
In HBase 0.98.0, there are major changes to the HBase client interface. Supporting HBase 0.98.0 will require some work. See Upgrading from 0.96.x to 0.98.x.
Hi,
We are using your great solution for transactional operations over HBase.
Some of our APIs use checkAndPut and increment counters.
What will happen if I use the Haeinsa API and also checkAndPut on the same rows?
Is there a best practice for that, or should I implement it myself?
BTW, I compiled your code with CDH5 (removed the tests); you can see it in my fork, and it is working fine.
Thanks
Ehud
Great job on the transaction support! It would be interesting to have a version of Haeinsa that works with Phoenix.