Yes, we should support this. Please send your test case and the explicit error message.
(from spark-sql-on-hbase)
astro> describe kd_trade;
+-------------+---------+-------+
| col_name|data_type|comment|
+-------------+---------+-------+
| rowkey| string| |
| sysTradeId| string| |
| name| string| |
| mobile| string| |
|buyerAlipayNo| string| |
| province| string| |
| city| string| |
| district| string| |
| address| string| |
| platFromId| string| |
| outNick| string| |
+-------------+---------+-------+
astro> describe kd_order;
+----------+---------+-------+
| col_name|data_type|comment|
+----------+---------+-------+
| rowkey| string| |
|sysTradeId| string| |
| title| string| |
+----------+---------+-------+
-
The `!=` condition fails:
astro> select outNick,platFromId,buyerAlipayNo from kd_trade where platFromId!='1';
15/09/23 09:03:37 INFO hbase.HBaseSQLCliDriver: Processing select outNick,platFromId,buyerAlipayNo from kd_trade where platFromId!='1'
15/09/23 09:03:37 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=slave4:2181,slave2:2181,slave1:2181,master:2181,slave3:2181 sessionTimeout=60000 watcher=catalogtracker-on-hconnection-0x7849896, quorum=slave4:2181,slave2:2181,slave1:2181,master:2181,slave3:2181, baseZNode=/hbase
15/09/23 09:03:37 INFO zookeeper.RecoverableZooKeeper: Process identifier=catalogtracker-on-hconnection-0x7849896 connecting to ZooKeeper ensemble=slave4:2181,slave2:2181,slave1:2181,master:2181,slave3:2181
15/09/23 09:03:37 INFO zookeeper.ClientCnxn: Opening socket connection to server slave2/192.168.1.222:2181. Will not attempt to authenticate using SASL (unknown error)
15/09/23 09:03:37 INFO zookeeper.ClientCnxn: Socket connection established to slave2/192.168.1.222:2181, initiating session
15/09/23 09:03:37 INFO zookeeper.ClientCnxn: Session establishment complete on server slave2/192.168.1.222:2181, sessionid = 0x34fdf88af361f6a, negotiated timeout = 60000
15/09/23 09:03:37 INFO zookeeper.ZooKeeper: Session: 0x34fdf88af361f6a closed
15/09/23 09:03:37 INFO zookeeper.ClientCnxn: EventThread shut down
15/09/23 09:03:38 INFO hbase.HBaseRelation: Number of HBase regions for table kd_trade: 2
15/09/23 09:03:38 INFO spark.SparkContext: Starting job: main at NativeMethodAccessorImpl.java:-2
15/09/23 09:03:38 INFO scheduler.DAGScheduler: Got job 12 (main at NativeMethodAccessorImpl.java:-2) with 1 output partitions (allowLocal=false)
15/09/23 09:03:38 INFO scheduler.DAGScheduler: Final stage: ResultStage 22(main at NativeMethodAccessorImpl.java:-2)
15/09/23 09:03:38 INFO scheduler.DAGScheduler: Parents of final stage: List()
15/09/23 09:03:38 INFO scheduler.DAGScheduler: Missing parents: List()
15/09/23 09:03:38 INFO scheduler.DAGScheduler: Submitting ResultStage 22 (MapPartitionsRDD[44] at main at NativeMethodAccessorImpl.java:-2), which has no missing parents
15/09/23 09:03:38 INFO storage.MemoryStore: ensureFreeSpace(16048) called with curMem=270692, maxMem=278302556
15/09/23 09:03:38 INFO storage.MemoryStore: Block broadcast_17 stored as values in memory (estimated size 15.7 KB, free 265.1 MB)
15/09/23 09:03:38 INFO storage.MemoryStore: ensureFreeSpace(14363) called with curMem=286740, maxMem=278302556
15/09/23 09:03:38 INFO storage.MemoryStore: Block broadcast_17_piece0 stored as bytes in memory (estimated size 14.0 KB, free 265.1 MB)
15/09/23 09:03:38 INFO storage.BlockManagerInfo: Added broadcast_17_piece0 in memory on 192.168.1.220:47802 (size: 14.0 KB, free: 265.3 MB)
15/09/23 09:03:38 INFO spark.SparkContext: Created broadcast 17 from broadcast at DAGScheduler.scala:874
15/09/23 09:03:38 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 22 (MapPartitionsRDD[44] at main at NativeMethodAccessorImpl.java:-2)
15/09/23 09:03:38 INFO scheduler.TaskSchedulerImpl: Adding task set 22.0 with 1 tasks
15/09/23 09:03:38 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 22.0 (TID 813, 192.168.1.221, ANY, 2932 bytes)
15/09/23 09:03:38 INFO storage.BlockManagerInfo: Added broadcast_17_piece0 in memory on 192.168.1.221:47997 (size: 14.0 KB, free: 1060.2 MB)
15/09/23 09:03:38 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 22.0 (TID 813, 192.168.1.221): org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:403)
at org.apache.spark.sql.hbase.HBaseSQLReaderRDD$$anon$1.hasNext(HBaseSQLReaderRDD.scala:174)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 601 number_of_rows: 5000 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2144)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
... 25 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 601 number_of_rows: 5000 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2144)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
... 29 more
15/09/23 09:03:38 INFO scheduler.TaskSetManager: Starting task 0.1 in stage 22.0 (TID 814, 192.168.1.221, ANY, 2932 bytes)
15/09/23 09:03:39 INFO scheduler.TaskSetManager: Lost task 0.1 in stage 22.0 (TID 814) on executor 192.168.1.221: org.apache.hadoop.hbase.DoNotRetryIOException (Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?) [duplicate 1]
15/09/23 09:03:39 INFO scheduler.TaskSetManager: Starting task 0.2 in stage 22.0 (TID 815, 192.168.1.221, ANY, 2932 bytes)
15/09/23 09:03:40 INFO scheduler.TaskSetManager: Lost task 0.2 in stage 22.0 (TID 815) on executor 192.168.1.221: org.apache.hadoop.hbase.DoNotRetryIOException (Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?) [duplicate 2]
15/09/23 09:03:40 INFO scheduler.TaskSetManager: Starting task 0.3 in stage 22.0 (TID 816, 192.168.1.222, ANY, 2932 bytes)
15/09/23 09:03:40 INFO storage.BlockManagerInfo: Added broadcast_17_piece0 in memory on 192.168.1.222:37081 (size: 14.0 KB, free: 1060.2 MB)
15/09/23 09:03:41 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 22.0 (TID 816) on executor 192.168.1.222: org.apache.hadoop.hbase.DoNotRetryIOException (Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?) [duplicate 3]
15/09/23 09:03:41 ERROR scheduler.TaskSetManager: Task 0 in stage 22.0 failed 4 times; aborting job
15/09/23 09:03:41 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 22.0, whose tasks have all completed, from pool
15/09/23 09:03:41 INFO scheduler.TaskSchedulerImpl: Cancelling stage 22
15/09/23 09:03:41 INFO scheduler.DAGScheduler: ResultStage 22 (main at NativeMethodAccessorImpl.java:-2) failed in 3.199 s
15/09/23 09:03:41 INFO scheduler.DAGScheduler: Job 12 failed: main at NativeMethodAccessorImpl.java:-2, took 3.212873 s
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 22.0 failed 4 times, most recent failure: Lost task 0.3 in stage 22.0 (TID 816, 192.168.1.222): org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:403)
at org.apache.spark.sql.hbase.HBaseSQLReaderRDD$$anon$1.hasNext(HBaseSQLReaderRDD.scala:174)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 607 number_of_rows: 5000 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2144)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
... 25 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 607 number_of_rows: 5000 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2144)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
... 29 more
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
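Both reported queries bottom out in the same `OutOfOrderScannerNextException` ("Expected nextCallSeq: 1 But the nextCallSeq got from client: 0"), which HBase raises when the client retries a scanner `next()` call the region server has already processed — typically because fetching a large batch (`number_of_rows: 5000` in the traces) takes longer than the client's RPC timeout. As a tuning sketch only (untested against this cluster; these are standard HBase client properties, not astro-specific settings), lowering the scan caching or raising the scanner timeout on the client side often avoids this:

```xml
<!-- hbase-site.xml on the client side: a workaround sketch, not a confirmed fix -->
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value> <!-- fewer rows per scanner RPC than the 5000 seen in the trace -->
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>120000</value> <!-- ms; more headroom for each next() call -->
</property>
```

Whether astro exposes its own scan-caching knob is unclear from this thread; if it does, that setting would take precedence over the raw HBase client configuration.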
-
`is not null` fails:
astro> select outNick,platFromId,buyerAlipayNo from kd_trade where buyerAlipayNo is not null;
15/09/23 09:05:25 INFO hbase.HBaseSQLCliDriver: Processing select outNick,platFromId,buyerAlipayNo from kd_trade where buyerAlipayNo is not null
15/09/23 09:05:25 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=slave4:2181,slave2:2181,slave1:2181,master:2181,slave3:2181 sessionTimeout=60000 watcher=catalogtracker-on-hconnection-0x7849896, quorum=slave4:2181,slave2:2181,slave1:2181,master:2181,slave3:2181, baseZNode=/hbase
15/09/23 09:05:25 INFO zookeeper.RecoverableZooKeeper: Process identifier=catalogtracker-on-hconnection-0x7849896 connecting to ZooKeeper ensemble=slave4:2181,slave2:2181,slave1:2181,master:2181,slave3:2181
15/09/23 09:05:25 INFO zookeeper.ClientCnxn: Opening socket connection to server slave2/192.168.1.222:2181. Will not attempt to authenticate using SASL (unknown error)
15/09/23 09:05:25 INFO zookeeper.ClientCnxn: Socket connection established to slave2/192.168.1.222:2181, initiating session
15/09/23 09:05:25 INFO zookeeper.ClientCnxn: Session establishment complete on server slave2/192.168.1.222:2181, sessionid = 0x34fdf88af361f6e, negotiated timeout = 60000
15/09/23 09:05:25 INFO zookeeper.ZooKeeper: Session: 0x34fdf88af361f6e closed
15/09/23 09:05:25 INFO zookeeper.ClientCnxn: EventThread shut down
15/09/23 09:05:25 INFO hbase.HBaseRelation: Number of HBase regions for table kd_trade: 2
15/09/23 09:05:25 INFO spark.SparkContext: Starting job: main at NativeMethodAccessorImpl.java:-2
15/09/23 09:05:25 INFO scheduler.DAGScheduler: Got job 15 (main at NativeMethodAccessorImpl.java:-2) with 1 output partitions (allowLocal=false)
15/09/23 09:05:25 INFO scheduler.DAGScheduler: Final stage: ResultStage 25(main at NativeMethodAccessorImpl.java:-2)
15/09/23 09:05:25 INFO scheduler.DAGScheduler: Parents of final stage: List()
15/09/23 09:05:25 INFO scheduler.DAGScheduler: Missing parents: List()
15/09/23 09:05:25 INFO scheduler.DAGScheduler: Submitting ResultStage 25 (MapPartitionsRDD[50] at main at NativeMethodAccessorImpl.java:-2), which has no missing parents
15/09/23 09:05:25 INFO storage.MemoryStore: ensureFreeSpace(16048) called with curMem=167840, maxMem=278302556
15/09/23 09:05:25 INFO storage.MemoryStore: Block broadcast_20 stored as values in memory (estimated size 15.7 KB, free 265.2 MB)
15/09/23 09:05:25 INFO storage.MemoryStore: ensureFreeSpace(14362) called with curMem=183888, maxMem=278302556
15/09/23 09:05:25 INFO storage.MemoryStore: Block broadcast_20_piece0 stored as bytes in memory (estimated size 14.0 KB, free 265.2 MB)
15/09/23 09:05:25 INFO storage.BlockManagerInfo: Added broadcast_20_piece0 in memory on 192.168.1.220:47802 (size: 14.0 KB, free: 265.3 MB)
15/09/23 09:05:25 INFO spark.SparkContext: Created broadcast 20 from broadcast at DAGScheduler.scala:874
15/09/23 09:05:25 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 25 (MapPartitionsRDD[50] at main at NativeMethodAccessorImpl.java:-2)
15/09/23 09:05:25 INFO scheduler.TaskSchedulerImpl: Adding task set 25.0 with 1 tasks
15/09/23 09:05:25 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 25.0 (TID 825, 192.168.1.222, ANY, 2672 bytes)
15/09/23 09:05:25 INFO storage.BlockManagerInfo: Added broadcast_20_piece0 in memory on 192.168.1.222:37081 (size: 14.0 KB, free: 1060.2 MB)
15/09/23 09:05:26 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 25.0 (TID 825, 192.168.1.222): org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:403)
at org.apache.spark.sql.hbase.HBaseSQLReaderRDD$$anon$1.hasNext(HBaseSQLReaderRDD.scala:174)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 625 number_of_rows: 5000 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2144)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
... 25 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 625 number_of_rows: 5000 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2144)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
... 29 more
15/09/23 09:05:26 INFO scheduler.TaskSetManager: Starting task 0.1 in stage 25.0 (TID 826, 192.168.1.221, ANY, 2672 bytes)
15/09/23 09:05:26 INFO storage.BlockManagerInfo: Added broadcast_20_piece0 in memory on 192.168.1.221:47997 (size: 14.0 KB, free: 1060.2 MB)
15/09/23 09:05:26 INFO scheduler.TaskSetManager: Lost task 0.1 in stage 25.0 (TID 826) on executor 192.168.1.221: org.apache.hadoop.hbase.DoNotRetryIOException (Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?) [duplicate 1]
15/09/23 09:05:26 INFO scheduler.TaskSetManager: Starting task 0.2 in stage 25.0 (TID 827, 192.168.1.222, ANY, 2672 bytes)
15/09/23 09:05:27 INFO scheduler.TaskSetManager: Lost task 0.2 in stage 25.0 (TID 827) on executor 192.168.1.222: org.apache.hadoop.hbase.DoNotRetryIOException (Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?) [duplicate 2]
15/09/23 09:05:27 INFO scheduler.TaskSetManager: Starting task 0.3 in stage 25.0 (TID 828, 192.168.1.223, ANY, 2672 bytes)
15/09/23 09:05:27 INFO storage.BlockManagerInfo: Added broadcast_20_piece0 in memory on 192.168.1.223:51252 (size: 14.0 KB, free: 1060.2 MB)
15/09/23 09:05:28 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 25.0 (TID 828) on executor 192.168.1.223: org.apache.hadoop.hbase.DoNotRetryIOException (Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?) [duplicate 3]
15/09/23 09:05:28 ERROR scheduler.TaskSetManager: Task 0 in stage 25.0 failed 4 times; aborting job
15/09/23 09:05:28 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 25.0, whose tasks have all completed, from pool
15/09/23 09:05:28 INFO scheduler.TaskSchedulerImpl: Cancelling stage 25
15/09/23 09:05:28 INFO scheduler.DAGScheduler: ResultStage 25 (main at NativeMethodAccessorImpl.java:-2) failed in 2.947 s
15/09/23 09:05:28 INFO scheduler.DAGScheduler: Job 15 failed: main at NativeMethodAccessorImpl.java:-2, took 2.958849 s
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 25.0 failed 4 times, most recent failure: Lost task 0.3 in stage 25.0 (TID 828, 192.168.1.223): org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:403)
at org.apache.spark.sql.hbase.HBaseSQLReaderRDD$$anon$1.hasNext(HBaseSQLReaderRDD.scala:174)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$3.apply(SparkPlan.scala:143)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 631 number_of_rows: 5000 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2144)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:355)
... 25 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 631 number_of_rows: 5000 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2144)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1457)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29990)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
... 29 more
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
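The `OutOfOrderScannerNextException` above comes from a server-side sequence check on the scan RPC. A minimal sketch of that check (simplified model, not HBase's actual code; the class and message format here are illustrative): each scan call carries a `next_call_seq`, the RegionServer increments its expected value after every successful batch, and a client retry that re-sends the old sequence number (typically after a timeout) no longer matches and is rejected.

```scala
// Simplified model of the server-side nextCallSeq check that produces
// OutOfOrderScannerNextException. This is a sketch, not HBase's real code.
class ScannerSeqCheck {
  private var expectedSeq = 0L

  // Each scan RPC carries the client's call sequence number; the server
  // only accepts it if it matches, then advances its expectation.
  def next(clientSeq: Long): String = {
    if (clientSeq != expectedSeq)
      throw new IllegalStateException(
        s"Expected nextCallSeq: $expectedSeq But the nextCallSeq got from client: $clientSeq")
    expectedSeq += 1
    "batch" // stand-in for a batch of rows
  }
}

val scanner = new ScannerSeqCheck
scanner.next(0) // first batch succeeds; the server now expects seq 1

// A retried RPC re-sends seq 0, so the numbers diverge -- mirroring the log above.
val retried =
  try { scanner.next(0); "ok" }
  catch { case e: IllegalStateException => e.getMessage }
```

This is why such failures often point at slow scans (the client times out mid-scan and retries) rather than at the query itself.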
- The table join returns no rows, even though matching data does exist.
astro> SELECT kd.outNick,kd.platFromId,ko.title FROM kd_trade kd,kd_order ko WHERE kd.sysTradeId=ko.sysTradeId;
+-------+----------+-----+
|outNick|platFromId|title|
+-------+----------+-----+
+-------+----------+-----+
astro> select sysTradeId,title from kd_order where sysTradeId="826126064117768";
+---------------+--------------------+
| sysTradeId| title|
+---------------+--------------------+
|826126064117768|美国直发包邮 普丽普莱终极深海鱼油...|
+---------------+--------------------+
astro> select sysTradeId,name from kd_trade where sysTradeId="826126064117768";
+---------------+----+
| sysTradeId|name|
+---------------+----+
|826126064117768| 杨燕|
+---------------+----+
PS: if you do add support for this, please let me know.
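One hypothetical cause for an equality join that matches nothing while each per-table lookup succeeds is that the two tables persist the join key with different byte encodings. The sketch below is an assumption for illustration only (the trailing pad byte is invented, not taken from the reporter's data): if `kd_trade` and `kd_order` store `sysTradeId` differently at the byte level, the raw values never compare equal even though both rows display the same string.

```scala
// Hypothetical illustration: the same logical key stored with different
// byte encodings on the two sides of a join. The trailing space is an
// invented example of a bad load, not observed data.
val keyInTrade = "826126064117768".getBytes("UTF-8")
val keyInOrder = "826126064117768 ".getBytes("UTF-8") // trailing pad byte

// Byte-level equality is what an HBase-backed equality join effectively
// needs; here it fails even though the displayed strings look identical.
val bytesEqual = java.util.Arrays.equals(keyInTrade, keyInOrder)
// bytesEqual == false -> the join produces an empty result
```

Checking the raw cell bytes of the join column in both HBase tables (e.g. via the HBase shell) is one way to rule this out.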
Could you try turning off the custom filter (and/or the coprocessor) to see if the problems go away?
I simulated this with the custom filter and coprocessor enabled (the default), using "!=" and "not null" predicates on a non-key string column, and both work correctly. I did not try the join case. I suspect your environment is misconfigured and recommend double-checking it. For one, the column mapping between the HBase table and the SQL table, the primary-key setup, etc., must be correct for the system to work properly.
Just as @yzhou2001 said, I think your environment is questionable, @xuzhiliang. I simulated the tests and they both work well. Maybe it is caused by the version of HBase or Spark.