
hadoopcryptoledger's People

Contributors

danny-at-dalhousie, elek, jornfranke, phelps-sg


hadoopcryptoledger's Issues

Negative gasPrice and gasLimit

Hello and thank you for creating this very useful open source project!

I started using it yesterday and ran into a couple of issues; I am hoping you can point me in the right direction. I'm including here an example app for reproducing the problems.

My setup:

  • Ubuntu 16.04 (64-bit)
  • Scala 2.11.7
  • Spark 2.2.1 (Hadoop 2.7) running in local mode
  • Geth 1.7.3-stable

My Geth node is synced to the mainnet, and following the advice here, I created multiple files holding the exported blockchain (up to block number 5M).

Here's a snippet of the directory, with the file names using a slightly different nomenclature than the example, but still produced with geth export:

➜  eth-blockchain ls -lah
total 25G
drwxrwxr-x  2 sancho sancho 4.0K Feb 13 16:22 .
drwxr-xr-x 43 sancho sancho 4.0K Feb 13 16:22 ..
-rwxrwxr-x  1 sancho sancho 126M Feb 13 03:01 blocks-0-200000
-rwxrwxr-x  1 sancho sancho 240M Feb 13 03:32 blocks-1000001-1200000
-rwxrwxr-x  1 sancho sancho 274M Feb 13 03:33 blocks-1200001-1400000
-rwxrwxr-x  1 sancho sancho 299M Feb 13 03:33 blocks-1400001-1600000
-rwxrwxr-x  1 sancho sancho 307M Feb 13 03:40 blocks-1600001-1800000
-rwxrwxr-x  1 sancho sancho 290M Feb 13 03:40 blocks-1800001-2000000
-rwxrwxr-x  1 sancho sancho 301M Feb 13 03:41 blocks-2000001-2200000
-rwxrwxr-x  1 sancho sancho 148M Feb 13 03:02 blocks-200001-400000
-rwxrwxr-x  1 sancho sancho 332M Feb 13 03:41 blocks-2200001-2400000
...

The code for the app I'm using:

package analytics

import collection.JavaConverters._
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.BytesWritable
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.zuinnote.hadoop.ethereum.format.common.EthereumBlock
import org.zuinnote.hadoop.ethereum.format.mapreduce.EthereumBlockFileInputFormat


object TestApp {

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setAppName("test-app")
      .setMaster("local[*]")

    val spark = SparkSession.builder
      .config(sparkConf)
      .getOrCreate()

    val hadoopConf = new Configuration()
    hadoopConf.set("hadoopcryptoledeger.ethereumblockinputformat.usedirectbuffer", "false")

    val rdd = spark.sparkContext.newAPIHadoopFile(
      "/home/sancho/eth-blockchain",
      classOf[EthereumBlockFileInputFormat], classOf[BytesWritable], classOf[EthereumBlock], hadoopConf
    )

    println("Number of transactions with negative gas price: " + rdd
      .flatMap(_._2.getEthereumTransactions.asScala)
      .filter(_.getGasPrice < 0)
      .count()
    )

    println("Number of transactions with negative gas limit: " + rdd
      .flatMap(_._2.getEthereumTransactions.asScala)
      .filter(_.getGasLimit < 0)
      .count()
    )

    val blockNumber = 4800251

    println(s"Number of transactions with negative gas price in block $blockNumber: " + rdd
        .filter(_._2.getEthereumBlockHeader.getNumber == blockNumber)
        .flatMap(_._2.getEthereumTransactions.asScala)
        .filter(_.getGasPrice < 0)
        .count()
    )

    println(s"Number of transactions with negative gas limit in block $blockNumber: " + rdd
      .filter(_._2.getEthereumBlockHeader.getNumber == blockNumber)
      .flatMap(_._2.getEthereumTransactions.asScala)
      .filter(_.getGasLimit < 0)
      .count()
    )
  }
}

This is the build.sbt file:

lazy val commonSettings = Seq(
  scalaVersion := "2.11.7",
  test in assembly := {}
)

lazy val ethBlockchainAnalytics = (project in file(".")).
  settings(commonSettings).
  settings(
    name := "EthBlockchainAnalytics",
    version := "0.1",
    libraryDependencies ++= Seq(
      "com.github.zuinnote" %% "spark-hadoopcryptoledger-ds" % "1.1.2",
      "org.apache.spark" %% "spark-core" % "2.2.1" % "provided",
      "org.apache.spark" %% "spark-sql" % "2.2.1" % "provided"),
    assemblyJarName in assembly := s"${name.value}_${scalaBinaryVersion.value}-${version.value}.jar",
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", xs@_*) => MergeStrategy.discard
      case PathList("javax", "servlet", xs@_*) => MergeStrategy.last
      case PathList("org", "apache", xs@_*) => MergeStrategy.last
      case x =>
        val oldStrategy = (assemblyMergeStrategy in assembly).value
        oldStrategy(x)
    }
  )

The launcher script I'm using:

#!/bin/sh

JAR=$1

/usr/local/lib/spark/bin/spark-submit \
    --class analytics.TestApp \
    --driver-memory 20G \
$JAR

And finally the command I'm using to run it:

➜  EthBlockchainAnalytics src/main/resources/launcher.sh /home/sancho/IdeaProjects/EthBlockchainAnalytics/target/scala-2.11/EthBlockchainAnalytics_2.11-0.1.jar

The output of the above application when run like this is:

Number of transactions with negative gas price: 8732263
Number of transactions with negative gas limit: 25699923
Number of transactions with negative gas price in block 4800251: 2
Number of transactions with negative gas limit in block 4800251: 8

As a quick sanity check, I ran the following:

➜  ~ geth attach
Welcome to the Geth JavaScript console!

instance: Geth/v1.7.3-stable-4bb3c89d/linux-amd64/go1.9
 modules: admin:1.0 debug:1.0 eth:1.0 miner:1.0 net:1.0 personal:1.0 rpc:1.0 txpool:1.0 web3:1.0

> var txs = eth.getBlock(4800251).transactions
undefined
> for (var i=0; i<txs.length; i++) { if (eth.getTransaction(txs[i]).gasPrice < 0) console.log(txs[i]) }
undefined

Any idea why I'm seeing so many negative gas prices and gas limits when using Spark?
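
As an aside for anyone debugging the same symptom, one quick check (a minimal sketch continuing from the code above, and assuming getGasPrice returns a signed Java long in this library version) is to reinterpret a few of the negative values as unsigned and see whether they then look like plausible gas prices:

// Diagnostic sketch only; it does not change how the library parses the field.
rdd
  .flatMap(_._2.getEthereumTransactions.asScala)
  .filter(_.getGasPrice < 0)
  .take(5)
  .foreach { tx =>
    val signed = tx.getGasPrice
    // Assumption: the negative sign is an artifact of storing an unsigned value in a signed long.
    println(s"signed=$signed unsigned=${java.lang.Long.toUnsignedString(signed)}")
  }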

Create Web scraper to fetch currency exchange rates

Create web scraper to fetch currency exchange rates for currencies based on cryptoledgers.
This should support multiple sources in a generic way, and the data should be made available together with the cryptoledger data already provided by the HadoopCryptoLedger library.

First, we need to work on a design, an implementation, and unit & integration tests (the latter probably on an embedded Jetty and/or Tomcat).

Support for Ethereum

Ethereum is a decentralized platform that runs smart contracts: applications that run exactly as programmed without any possibility of downtime, censorship, fraud or third party interference (https://www.ethereum.org/).

This implies adding additional Hadoop InputFormats to process the Ethereum blockchain. Furthermore, a Hive SerDe and a Spark data source should be created. Unit tests must be included.

Finally, an example for MapReduce, Hive and Spark should be provided. The examples should include unit and integration tests.

Ethereum classic is out of scope.

Is there a way around the currently supported maximum Ethereum block size?

We are running into exceptions when attempting to do graph analysis with GraphFrames after creating a DataFrame using spark-hadoopcryptoledger-ds. It seems there are block sizes bigger than what can fit inside an int?

java.lang.InterruptedException: org.zuinnote.hadoop.ethereum.format.exception.EthereumBlockReadException: Error: This block size cannot be handled currently (larger then largest number in positive signed int)

Bitcoin: Enhanced support for segwit

Although segwit blocks can be processed with the current hadoopcryptoledger library, dedicated support makes it easier to use.

Add unit tests for segwit2x compatible blocks:
https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki

Fix issues if needed.

Increase the default max block size according to the formula (although the current default of 2M should be sufficient to process the new segwit blocks). Set the default max block size based on block weight <= 4,000,000.

BitcoinScriptPatternParser:
Add parsing for the new witness transaction types:
pay-to-witness-public-key-hash
pay-to-witness-script-hash
support compressed p2hash p2pub addresses

BitcoinUtil (see the formula sketch after this list):
calculate segwit transaction hash
calculate transaction weight
calculate virtual transaction size
calculate base transaction size
calculate total transaction size
calculate block weight
calculate block base size
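
A minimal sketch of the BIP-141 size/weight relations referenced above (standalone helper functions for illustration, not the library's API; baseSize/totalSize stand for the serialized sizes without and with witness data):

object SegwitWeight {
  def transactionWeight(baseSize: Long, totalSize: Long): Long =
    baseSize * 3 + totalSize                      // weight = base_size * 3 + total_size

  def virtualSize(weight: Long): Long =
    (weight + 3) / 4                              // vsize = ceil(weight / 4)

  def withinBlockWeightLimit(blockWeight: Long): Boolean =
    blockWeight <= 4000000L                       // consensus limit from BIP-141
}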

Support for Monero

Monero is a secure, private, untraceable currency. It is open source and freely available to all (https://getmonero.org/home).

This implies adding additional Hadoop InputFormats to process the Monero blockchain. Furthermore, a Hive SerDe, a Flink data source and a Spark data source should be created. Unit tests must be included.

Finally, an example for MapReduce, Hive, Flink and Spark should be provided. The examples should include unit and integration tests.

Bitcoin: Convert key hash to Bitcoin address

At the moment, BitcoinScriptPatternParser returns the key hash, which can be used to identify the unique address of a transaction and to search for it in blockchain explorers.
Bitcoin also has the notion of Bitcoin addresses, which exist primarily to improve the user experience. The conversion to such addresses is described in https://github.com/bitcoin/bips/blob/master/bip-0013.mediawiki

The idea of this issue is to create a function BitcoinScriptPatternParser.getBitcoinAddress(String) that converts the output of BitcoinScriptPatternParser.getPaymentDestination to a Bitcoin address (if possible). Additionally, we create Hive UDFs. All new methods need to be unit tested and the documentation needs to be updated.
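
For illustration, a minimal sketch of the BIP-13 style conversion (version byte + key hash + 4-byte double-SHA-256 checksum, Base58-encoded). The object name and the default version byte are chosen for the example only and are not part of the library:

import java.security.MessageDigest

object BitcoinAddressSketch {
  private val Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

  private def sha256(data: Array[Byte]): Array[Byte] =
    MessageDigest.getInstance("SHA-256").digest(data)

  private def base58(input: Array[Byte]): String = {
    // Leading zero bytes map to leading '1' characters in Base58Check.
    val leadingZeros = input.takeWhile(_ == 0).length
    var num = new java.math.BigInteger(1, input)
    val fiftyEight = java.math.BigInteger.valueOf(58)
    val sb = new StringBuilder
    while (num.signum > 0) {
      val Array(quotient, remainder) = num.divideAndRemainder(fiftyEight)
      sb.append(Alphabet.charAt(remainder.intValue))
      num = quotient
    }
    ("1" * leadingZeros) + sb.reverse.toString
  }

  /** Convert a 20-byte public key hash to a Base58Check address.
    * 0x00 is the mainnet P2PKH version byte; testnet uses 0x6f. */
  def keyHashToAddress(keyHash: Array[Byte], version: Byte = 0x00): String = {
    val payload = version +: keyHash
    val checksum = sha256(sha256(payload)).take(4) // first 4 bytes of double SHA-256
    base58(payload ++ checksum)
  }
}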

Support Hyperledger

Hyperledger provides various frameworks for developing cryptoledgers, among them Fabric, Iroha and Sawtooth.

The idea of this issue is to investigate how they can be included in hadoopcryptoledger with unit tests, to provide examples of how to analyse those ledgers, and to write integration tests for them.

Possible data loss when reading transaction values

My understanding is that the value field of an Ethereum transaction can be up to 32 bytes long; however, we store this as a long after parsing the RLP-encoded data. It seems that we store 0L in case the data is longer than 8 bytes, and no warning or error is thrown when transaction values are discarded / replaced by zeros. Is this the case?
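
For reference, a value of up to 32 bytes can be held losslessly in a BigInteger rather than a long; a minimal sketch, where rawValue is a hypothetical stand-in for the RLP-decoded value bytes:

import java.math.BigInteger

// Hypothetical raw bytes of an RLP-decoded transaction value (big-endian, 9 bytes here).
val rawValue: Array[Byte] =
  Array(0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00).map(_.toByte)

// The (1, bytes) constructor interprets the bytes as an unsigned magnitude,
// so values longer than 8 bytes are preserved instead of collapsing to 0L.
val value: BigInteger = new BigInteger(1, rawValue)
println(value) // 2^64 for this example; it would not fit in a signed long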

Presto: Create Connector

Presto is a distributed SQL query engine for big data supporting a variety of sources. The idea is to add the hadoopcryptoledger library as a source to query blockchains directly, such as Bitcoin & altcoins or Ethereum & altcoins.
Currently, the hadoopcryptoledger library can be used indirectly with Presto via the Hive connector.

Relevant documentation:
https://prestodb.io/docs/current/develop/connectors.html

Issue: Spark Scala Graphx to analyze the Bitcoin transaction graph

@jornfranke
I have run the example according to the wiki:
https://github.com/ZuInnoTe/hadoopcryptoledger/wiki/Using-Spark-Scala-Graphx-to-analyze-the-Bitcoin-transaction-graph

It ran for over 4 hours, and then the issue below occurred.
I ran it again for about 4 hours, and the issue still happened.
It seems that there is not enough memory, but the memory available to my Docker container is quite large.

[root@quickstart scala-spark-graphx-bitcointransaction]# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         2.0T  1.2T  797G  60% /
tmpfs            32G     0   32G   0% /dev
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/xvda1      2.0T  1.2T  797G  60% /etc/resolv.conf
/dev/xvda1      2.0T  1.2T  797G  60% /etc/hostname
/dev/xvda1      2.0T  1.2T  797G  60% /etc/hosts
shm              64M     0   64M   0% /dev/shm
[root@quickstart scala-spark-graphx-bitcointransaction]# free
             total       used       free     shared    buffers     cached
Mem:      65960180   25773772   40186408     686156     102472   14819768
-/+ buffers/cache:   10851532   55108648
Swap:            0          0          0
[root@quickstart scala-spark-graphx-bitcointransaction]# free -h
             total       used       free     shared    buffers     cached
Mem:           62G        24G        38G       670M       100M        14G
-/+ buffers/cache:        10G        52G
Swap:           0B         0B         0B

How can I deal with this problem? Thanks very much.

18/06/08 06:17:35 INFO collection.ExternalSorter: Thread 913 spilling in-memory map of 24.6 MB to disk (9 times so far)
18/06/08 06:17:35 INFO collection.ExternalSorter: Thread 947 spilling in-memory map of 25.0 MB to disk (28 times so far)
18/06/08 06:17:55 INFO collection.ExternalAppendOnlyMap: Thread 922 spilling in-memory map of 27.4 MB to disk (3 times so far)
18/06/08 06:18:02 INFO collection.ExternalSorter: Thread 947 spilling in-memory map of 24.7 MB to disk (29 times so far)
18/06/08 06:18:04 INFO collection.ExternalSorter: Thread 950 spilling in-memory map of 24.7 MB to disk (15 times so far)
18/06/08 06:18:09 INFO collection.ExternalSorter: Thread 909 spilling in-memory map of 24.8 MB to disk (18 times so far)
18/06/08 06:18:25 INFO collection.ExternalSorter: Thread 913 spilling in-memory map of 24.7 MB to disk (10 times so far)
18/06/08 06:18:26 WARN memory.TaskMemoryManager: leak 18.3 MB memory from org.apache.spark.util.collection.ExternalSorter@12719270
18/06/08 06:18:26 ERROR executor.Executor: Managed memory leak detected; size = 19172698 bytes, TID = 7855
18/06/08 06:18:26 ERROR executor.Executor: Exception in task 168.0 in stage 8.0 (TID 7855)
java.lang.OutOfMemoryError: Java heap space
	at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
	at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:515)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:535)
	at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.org$apache$spark$util$collection$ExternalAppendOnlyMap$ExternalIterator$$readNextHashCode(ExternalAppendOnlyMap.scala:335)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:408)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:406)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:406)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:301)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
18/06/08 06:18:26 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-67,5,main]
java.lang.OutOfMemoryError: Java heap space
	at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
	at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:515)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:535)
	at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.org$apache$spark$util$collection$ExternalAppendOnlyMap$ExternalIterator$$readNextHashCode(ExternalAppendOnlyMap.scala:335)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:408)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:406)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:406)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:301)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
18/06/08 06:18:26 INFO spark.SparkContext: Invoking stop() from shutdown hook
18/06/08 06:18:26 INFO scheduler.TaskSetManager: Starting task 175.0 in stage 8.0 (TID 7862, localhost, executor driver, partition 175, PROCESS_LOCAL, 2049 bytes)
18/06/08 06:18:26 INFO executor.Executor: Running task 175.0 in stage 8.0 (TID 7862)
18/06/08 06:18:26 WARN scheduler.TaskSetManager: Lost task 168.0 in stage 8.0 (TID 7855, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space
	at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
	at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:515)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:535)
	at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.org$apache$spark$util$collection$ExternalAppendOnlyMap$ExternalIterator$$readNextHashCode(ExternalAppendOnlyMap.scala:335)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:408)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:406)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:406)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:301)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

18/06/08 06:18:26 ERROR scheduler.TaskSetManager: Task 168 in stage 8.0 failed 1 times; aborting job
18/06/08 06:18:26 INFO storage.ShuffleBlockFetcherIterator: Getting 1280 non-empty blocks out of 1280 blocks
18/06/08 06:18:26 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
18/06/08 06:18:26 INFO scheduler.TaskSchedulerImpl: Cancelling stage 12
18/06/08 06:18:26 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 12.0, whose tasks have all completed, from pool
18/06/08 06:18:26 INFO scheduler.TaskSchedulerImpl: Stage 12 was cancelled
18/06/08 06:18:26 INFO scheduler.DAGScheduler: ShuffleMapStage 12 (map at SparkScalaBitcoinTransactionGraph.scala:89) failed in 18010.838 s due to Job aborted due to stage failure: Task 168 in stage 8.0 failed 1 times, most recent failure: Lost task 168.0 in stage 8.0 (TID 7855, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space
	at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
	at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:515)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:535)
	at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.org$apache$spark$util$collection$ExternalAppendOnlyMap$ExternalIterator$$readNextHashCode(ExternalAppendOnlyMap.scala:335)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:408)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:406)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:406)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:301)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

Driver stacktrace:
18/06/08 06:18:26 INFO scheduler.TaskSchedulerImpl: Cancelling stage 9
18/06/08 06:18:26 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 9.0, whose tasks have all completed, from pool
18/06/08 06:18:26 INFO scheduler.TaskSchedulerImpl: Stage 9 was cancelled
18/06/08 06:18:26 INFO scheduler.DAGScheduler: ShuffleMapStage 9 (map at SparkScalaBitcoinTransactionGraph.scala:81) failed in 9262.181 s due to Job aborted due to stage failure: Task 168 in stage 8.0 failed 1 times, most recent failure: Lost task 168.0 in stage 8.0 (TID 7855, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space
	at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
	at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:515)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:535)
	at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.org$apache$spark$util$collection$ExternalAppendOnlyMap$ExternalIterator$$readNextHashCode(ExternalAppendOnlyMap.scala:335)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:408)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:406)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:406)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:301)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

Driver stacktrace:
18/06/08 06:18:26 INFO scheduler.TaskSchedulerImpl: Cancelling stage 8
18/06/08 06:18:26 INFO scheduler.TaskSchedulerImpl: Stage 8 was cancelled
18/06/08 06:18:26 INFO scheduler.DAGScheduler: ShuffleMapStage 8 (map at SparkScalaBitcoinTransactionGraph.scala:85) failed in 1948.098 s due to Job aborted due to stage failure: Task 168 in stage 8.0 failed 1 times, most recent failure: Lost task 168.0 in stage 8.0 (TID 7855, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space
	at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
	at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:515)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:535)
	at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.org$apache$spark$util$collection$ExternalAppendOnlyMap$ExternalIterator$$readNextHashCode(ExternalAppendOnlyMap.scala:335)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:408)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:406)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:406)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:301)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

Driver stacktrace:
18/06/08 06:18:26 INFO executor.Executor: Executor is trying to kill task 172.0 in stage 8.0 (TID 7859)
18/06/08 06:18:26 INFO executor.Executor: Executor is trying to kill task 169.0 in stage 8.0 (TID 7856)
18/06/08 06:18:26 INFO executor.Executor: Executor is trying to kill task 173.0 in stage 8.0 (TID 7860)
18/06/08 06:18:26 INFO executor.Executor: Executor is trying to kill task 170.0 in stage 8.0 (TID 7857)
18/06/08 06:18:26 INFO executor.Executor: Executor is trying to kill task 167.0 in stage 8.0 (TID 7854)
18/06/08 06:18:26 INFO executor.Executor: Executor is trying to kill task 174.0 in stage 8.0 (TID 7861)
18/06/08 06:18:26 INFO executor.Executor: Executor is trying to kill task 171.0 in stage 8.0 (TID 7858)
18/06/08 06:18:26 INFO executor.Executor: Executor is trying to kill task 175.0 in stage 8.0 (TID 7862)
18/06/08 06:18:26 INFO scheduler.DAGScheduler: Job 1 failed: top at SparkScalaBitcoinTransactionGraph.scala:100, took 18011.687877 s
18/06/08 06:18:26 INFO storage.ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1280 blocks
18/06/08 06:18:26 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 168 in stage 8.0 failed 1 times, most recent failure: Lost task 168.0 in stage 8.0 (TID 7855, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space
	at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
	at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:515)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:535)
	at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.org$apache$spark$util$collection$ExternalAppendOnlyMap$ExternalIterator$$readNextHashCode(ExternalAppendOnlyMap.scala:335)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:408)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:406)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:406)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:301)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1457)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1445)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1444)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1444)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
	at scala.Option.foreach(Option.scala:236)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1668)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1627)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1616)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1862)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1982)
	at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1025)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
	at org.apache.spark.rdd.RDD.reduce(RDD.scala:1007)
	at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1.apply(RDD.scala:1397)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
	at org.apache.spark.rdd.RDD.takeOrdered(RDD.scala:1384)
	at org.apache.spark.rdd.RDD$$anonfun$top$1.apply(RDD.scala:1365)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
	at org.apache.spark.rdd.RDD.top(RDD.scala:1364)
	at org.zuinnote.spark.bitcoin.example.SparkScalaBitcoinTransactionGraph$.jobTop5AddressInput(SparkScalaBitcoinTransactionGraph.scala:100)
	at org.zuinnote.spark.bitcoin.example.SparkScalaBitcoinTransactionGraph$.main(SparkScalaBitcoinTransactionGraph.scala:53)
	at org.zuinnote.spark.bitcoin.example.SparkScalaBitcoinTransactionGraph.main(SparkScalaBitcoinTransactionGraph.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:730)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3464)
	at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3271)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1789)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.readNextItem(ExternalAppendOnlyMap.scala:515)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.hasNext(ExternalAppendOnlyMap.scala:535)
	at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.org$apache$spark$util$collection$ExternalAppendOnlyMap$ExternalIterator$$readNextHashCode(ExternalAppendOnlyMap.scala:335)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:408)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator$$anonfun$next$1.apply(ExternalAppendOnlyMap.scala:406)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:406)
	at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.next(ExternalAppendOnlyMap.scala:301)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
18/06/08 06:18:26 INFO executor.Executor: Executor killed task 175.0 in stage 8.0 (TID 7862)
18/06/08 06:18:26 WARN scheduler.TaskSetManager: Lost task 175.0 in stage 8.0 (TID 7862, localhost, executor driver): TaskKilled (killed intentionally)
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
18/06/08 06:18:26 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
18/06/08 06:18:26 INFO ui.SparkUI: Stopped Spark web UI at http://172.17.0.2:4040
18/06/08 06:18:26 INFO executor.Executor: Executor killed task 170.0 in stage 8.0 (TID 7857)
18/06/08 06:18:26 WARN scheduler.TaskSetManager: Lost task 170.0 in stage 8.0 (TID 7857, localhost, executor driver): TaskKilled (killed intentionally)
18/06/08 06:18:26 INFO executor.Executor: Executor killed task 172.0 in stage 8.0 (TID 7859)
18/06/08 06:18:26 ERROR scheduler.TaskSchedulerImpl: Exception in statusUpdate
java.util.concurrent.RejectedExecutionException: Task org.apache.spark.scheduler.TaskResultGetter$$anon$3@3eec2833 rejected from java.util.concurrent.ThreadPoolExecutor@67fd2e27[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 7857]
	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
	at org.apache.spark.scheduler.TaskResultGetter.enqueueFailedTask(TaskResultGetter.scala:104)
	at org.apache.spark.scheduler.TaskSchedulerImpl.liftedTree2$1(TaskSchedulerImpl.scala:387)
	at org.apache.spark.scheduler.TaskSchedulerImpl.statusUpdate(TaskSchedulerImpl.scala:361)
	at org.apache.spark.scheduler.local.LocalEndpoint$$anonfun$receive$1.applyOrElse(LocalBackend.scala:66)
	at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
	at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
	at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
	at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:217)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
18/06/08 06:18:26 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/06/08 06:18:26 INFO executor.Executor: Executor killed task 171.0 in stage 8.0 (TID 7858)
18/06/08 06:18:26 INFO executor.Executor: Executor killed task 173.0 in stage 8.0 (TID 7860)
18/06/08 06:18:26 INFO executor.Executor: Executor killed task 174.0 in stage 8.0 (TID 7861)
18/06/08 06:18:27 ERROR executor.Executor: Exception in task 169.0 in stage 8.0 (TID 7856)
java.lang.IllegalArgumentException: requirement failed: File segment length cannot be negative (got -16024861)
	at scala.Predef$.require(Predef.scala:233)
	at org.apache.spark.storage.FileSegment.<init>(FileSegment.scala:28)
	at org.apache.spark.storage.DiskBlockObjectWriter.fileSegment(DiskBlockObjectWriter.scala:236)
	at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$2.apply(ExternalSorter.scala:729)
	at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$2.apply(ExternalSorter.scala:721)
	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
	at org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:721)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
18/06/08 06:18:27 INFO executor.Executor: Not reporting error to driver during JVM shutdown.
18/06/08 06:18:27 ERROR executor.Executor: Exception in task 167.0 in stage 8.0 (TID 7854)
java.io.FileNotFoundException: /tmp/blockmgr-3d6f28c7-9d39-4c84-ad36-cd4864d0474f/1a/temp_shuffle_5b3ac818-3769-4488-8a4a-1bd053c342d6 (No such file or directory)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:146)
	at org.apache.spark.util.collection.ExternalSorter$SpillReader.nextBatchStream(ExternalSorter.scala:526)
	at org.apache.spark.util.collection.ExternalSorter$SpillReader.org$apache$spark$util$collection$ExternalSorter$SpillReader$$readNextItem(ExternalSorter.scala:578)
	at org.apache.spark.util.collection.ExternalSorter$SpillReader$$anon$5.hasNext(ExternalSorter.scala:601)
	at scala.collection.TraversableOnce$FlattenOps$$anon$1.hasNext(TraversableOnce.scala:432)
	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
	at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$2.apply(ExternalSorter.scala:725)
	at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$2.apply(ExternalSorter.scala:721)
	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
	at org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:721)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
18/06/08 06:18:27 INFO executor.Executor: Not reporting error to driver during JVM shutdown.
18/06/08 06:18:33 INFO storage.MemoryStore: MemoryStore cleared
18/06/08 06:18:33 INFO storage.BlockManager: BlockManager stopped
18/06/08 06:18:33 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
18/06/08 06:18:33 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/06/08 06:18:33 INFO spark.SparkContext: Successfully stopped SparkContext
18/06/08 06:18:33 INFO util.ShutdownHookManager: Shutdown hook called
18/06/08 06:18:33 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7aa73177-a289-47ce-ab1a-f2c3432aa428
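
A hedged note on this kind of OutOfMemoryError: in local mode the executors run inside the driver JVM, so the heap itself is governed by --driver-memory on spark-submit; beyond that, a more compact serializer and more (hence smaller) shuffle partitions can reduce per-task memory. A minimal sketch with illustrative values only, not recommendations from the project:

import org.apache.spark.SparkConf

// Illustrative tuning; the heap size itself still has to be raised via --driver-memory.
val conf = new SparkConf()
  .setAppName("bitcoin-transaction-graph")
  .setMaster("local[*]")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer") // more compact shuffle records
  .set("spark.default.parallelism", "512")                               // more, smaller shuffle partitions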

Add support for Ethereum Name Service (ENS)

The Ethereum Name Service (ENS) allows registration of Internet addresses on the Ethereum blockchain. In some sense, it is similar to Namecoin for the Bitcoin blockchain, but based on the smart contract VM.

https://ens.domains/

The idea here is to add additional utility functionality and Hive UDFs to the library.
Of course, everything is supplied with an example and unit tests/integration tests.

Example for Namecoin

Namecoin is an experimental open-source technology which improves decentralization, security, censorship resistance, privacy, and speed of certain components of the Internet infrastructure such as DNS and identities. (https://namecoin.org/)

Bitcoin is the underlying technology, and Namecoin shares a lot of its data structures. Hence, an example application for processing Namecoin will be provided, based on the HadoopCryptoLedger library's Bitcoin support.

This example should include unit tests and integration tests. It will be available for MapReduce, Hive and Spark.

Investigate and implement support for JDK9

JDK 9 is supposed to be released on the 21st of September.

The goal of this task is to integrate JDK 9 compilation and unit testing into the CI chain to validate early that it works.

BufferUnderflowException in BitcoinBlockReader.readBlock

Downloaded the entire blockchain using the canonical bitcoind, copied it to a Hadoop (2.7.3) cluster on AWS, and ran a Spark job over the data. It was OK for about 900 partitions, then failed with:

Caused by: java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:506)
at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:361)
at org.zuinnote.hadoop.bitcoin.format.common.BitcoinBlockReader.readBlock(BitcoinBlockReader.java:115)
at org.zuinnote.hadoop.bitcoin.format.mapreduce.BitcoinBlockRecordReader.nextKeyValue(BitcoinBlockRecordReader.java:83)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:207)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$GroupedIterator.fill(Iterator.scala:1126)
at scala.collection.Iterator$GroupedIterator.hasNext(Iterator.scala:1132)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
.
.

Using version 1.0.4.
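
It is not clear from the trace alone what the cause is; one setting worth ruling out is the reader's maximum block size. The configuration key below is assumed by analogy with the Ethereum key used earlier on this page and should be verified against the project wiki for version 1.0.4:

import org.apache.hadoop.conf.Configuration

val hadoopConf = new Configuration()
// Key name is an assumption (pattern borrowed from the Ethereum option above); check the wiki.
hadoopConf.set("hadoopcryptoledeger.bitcoinblockinputformat.maxblocksize", "8388608") // 8 MiB, illustrative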

sbt version

Hi,

I have difficulty running the

sbt +clean +assembly +it:test

command. It seems to be because of the sbt version. Could you please point out the sbt version you used?

Thanks.

Flume Source to live stream Blockchain data into HDFS

Live streaming of Bitcoin blockchain data into HDFS for immediate analysis, but also into other applications (e.g. via Kafka).

It could be done as a Flume source or a Kafka producer.
This Flume source should:

  1. Provide Bitcoin Blocks to any Flume Channel
  2. Provide Bitcoin Block metadata (e.g. number of confirmations, validation of checksums etc.) to any Flume Channel. Metadata should relate to one block and describe not deltas but full values. For example, the number of confirmations is always the currently known total number of confirmations, not only the newly added confirmations. The reason is that otherwise the consuming application would have to maintain this information itself, which usually leads to inconsistent information (e.g. the stored number of confirmations differs from the real number of confirmations). However, the Flume source would then need a backend to manage state, which should ideally be configurable. Via JDBC one could connect to a variety of NoSQL databases (e.g. HBase, Ignite etc.).

Unit and integration tests must be provided.
An example manual needs to be provided for integrating the Flume source into any cluster that has Flume support deployed. As a baseline, it should show that Bitcoin blocks are stored in HDFS files using append mode with a configurable file size (e.g. 128M) and that metadata is stored in an updatable fashion in HBase.

Example for Litecoin

Litecoin is a peer-to-peer Internet currency that enables instant, near-zero cost payments to anyone in the world. (https://litecoin.org/)

Bitcoin is the underlying technology and it shares a lot with respect to data structures. Hence, an example application for processing Litecoin will be provided, based on the HadoopCryptoLedger Bitcoin support.

This example should include unit tests and integration tests. It will be available for MapReduce, Hive and Spark.

error: not found: value assemblyJarName in assembly

I am new to big data and Hadoop. I am trying to make use of the hadoopcryptoledger library to do some Bitcoin graph analysis, and I followed this tutorial: Using Spark Scala Graphx to analyze the Bitcoin transaction graph (https://github.com/ZuInnoTe/hadoopcryptoledger/wiki/Using-Spark-Scala-Graphx-to-analyze-the-Bitcoin-transaction-graph)

While executing the command

sbt clean assembly test it:test 

I ran into an issue:

/home/jnikhil/hadoopcryptoledger/examples/scala-spark-graphx-bitcointransaction/build.sbt:30: error: not found: value assemblyJarName
assemblyJarName in assembly := "example-hcl-spark-scala-graphx-bitcointransaction.jar"
[error] Type error in expression

Does anyone know why I am facing this issue?
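
The assemblyJarName key comes from the sbt-assembly plugin, so this error usually means that the plugin is not available to the build (or that the sbt version in use cannot load it). A minimal sketch of enabling the plugin is below; the plugin version is an assumption and should match what the example's project/plugins.sbt expects:

// project/plugins.sbt - makes the sbt-assembly plugin (and thus assemblyJarName) available
// the plugin version 0.14.5 is an assumption; use whatever the example's build requires
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")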

gradle problem: testIntegration mapReduceGenesisBlock() fails

@jornfranke
Hi, thanks for your great project. I ran into this issue when running gradlew build.

╷
├─ JMockit integration ✔tcoinblock:testIntegration
└─ JUnit Jupiter ✔
   └─ MapReduceBitcoinBlockIntegrationTest ✔
      ├─ mapReduceGenesisBlock() ✘ Successfully executed mapreduce application ==> expected: <0> but was: <1>
      └─ checkTestDataGenesisBlockAvailable() ✔

Failures (1):
  JUnit Jupiter:MapReduceBitcoinBlockIntegrationTest:mapReduceGenesisBlock()
    MethodSource [className = 'org.zuinnote.hadoop.bitcoin.example.MapReduceBitcoinBlockIntegrationTest', methodName = 'mapReduceGenesisBlock', methodParameterTypes = '']
    => org.opentest4j.AssertionFailedError: Successfully executed mapreduce application ==> expected: <0> but was: <1>
       org.zuinnote.hadoop.bitcoin.example.MapReduceBitcoinBlockIntegrationTest.mapReduceGenesisBlock(MapReduceBitcoinBlockIntegrationTest.java:187)

I have tried for over a day but still can't work it out. I need your help, thanks very much.

My JDK is 1.8.0_171.
hadoop version

Hadoop 2.6.0-cdh5.14.2
Subversion http://github.com/cloudera/hadoop -r 5724a4ad7a27f7af31aa725694d3df09a68bb213
Compiled by jenkins on 2018-03-27T20:40Z
Compiled with protoc 2.5.0
From source with checksum 302899e86485742c090f626a828b28
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.14.2.jar
spark-shell --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

ERROR BitcoinBlockReader:476 - Cannot skip block in InputStream

Hi Jorn,

As discussed via email, I was hesitant to raise an issue as I think this may be down to an error on my part.

For the magic, I've tried "F9BEB4D9" and also "F9BEB4D9,FABFB5DA", as well as not setting it.

For maxblocksize I've tried not setting it at all, as well as "2M", "2", "8M" and "8". Not sure if the M is supposed to be there.
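
For reference, a minimal sketch of how these options are usually set on the Hadoop configuration; the exact keys are taken from the library's Bitcoin input format configuration as I understand it and should be verified against the wiki. The magic is a hex string without a 0x prefix, and the maximum block size is expected as a plain number of bytes rather than "2M":

import org.apache.hadoop.conf.Configuration

val hadoopConf = new Configuration()
// network magic(s) as comma-separated hex strings without a "0x" prefix
hadoopConf.set("hadoopcryptoledeger.bitcoinblockinputformat.filter.magic", "F9BEB4D9")
// maximum block size as a plain number of bytes (8 MiB here), not "8M"
hadoopConf.set("hadoopcryptoledeger.bitcoinblockinputformat.maxblocksize", "8388608")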

I'm using the Maven module. Spark 2.10. When using the method shown in SparkBitcoinBlockDataSource I get the following error; I think it might actually be a problem with my Spark setup:

java.lang.NoSuchMethodError: scala.runtime.ObjectRef.create(Ljava/lang/Object;)Lscala/runtime/ObjectRef;

I'll try using gradle to build your project and run the examples.

Cannot find Transaction hash

Hi,

Thanks a lot for developing this library. I was going through blocks and the transactions contained within them. When I tried to get the hashes of transactions, I could not find them. By hash I mean the tx id: d77435177fe87b96d8e751221bfd1dfba9dc23e0cd521df59ed1725dba2c74f3 as in https://blockchain.info/tx/d77435177fe87b96d8e751221bfd1dfba9dc23e0cd521df59ed1725dba2c74f3

My code is stuck at this point:

val noOfTransactionPair = bitcoinBlocksRDD.map(e => {
  val txList = new ListBuffer[(String, Int, Int)]()
  val blk: BitcoinBlock = e._2
  val transactions: util.List[BitcoinTransaction] = blk.getTransactions
  for (x <- 0 to transactions.size() - 1) {
    val transaction: BitcoinTransaction = transactions.get(x)
    transaction.getTxHash // this method does not exist?
    txList += ((blk.getTime.toString, transaction.getListOfInputs.size, transaction.getListOfOutputs.size()))
  }

  txList
})

Please let me know what I am missing.
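
For reference, a hedged sketch of how the tx id can be derived: BitcoinTransaction does not expose a precomputed hash, but the project's Graphx example computes it with the BitcoinUtil helper, and block explorers display that double-SHA256 hash in reversed byte order. The method names below are taken from that example and should be double-checked against the current API.

import org.zuinnote.hadoop.bitcoin.format.common.{BitcoinTransaction, BitcoinUtil}

// assumption: BitcoinUtil.getTransactionHash computes the double-SHA256 hash of a transaction,
// as done in the project's Graphx example; reversing the bytes gives the tx id shown by explorers
def txId(transaction: BitcoinTransaction): String = {
  val hash = BitcoinUtil.getTransactionHash(transaction)
  BitcoinUtil.convertByteArrayToHexString(BitcoinUtil.reverseByteArray(hash)).toLowerCase
}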

Support for Emercoin

Emercoin is another cryptocurrency, but at the same time it enables - similar to Namecoin - storing key/value pairs.

Emercoin seems to be largely Bitcoin-based, so we may already be able to read it out of the box.

We need to add special utility functions (similar to the ones we added for Namecoin) to read key/value pairs in a user-friendly way, and provide UDFs.

Finally, we should provide examples including unit/integration tests.

https://github.com/emercoin/emercoin

Bitcoin: Support AuxPow for AltCoins

AuxPow is a system used by altcoins (mainly Namecoin) that changes the structure of a block (cf. https://en.bitcoin.it/wiki/Merged_mining_specification).

This would require parsing additional information. It seems that AuxPow can probably be detected by checking that there is only "one transaction" (in fact it refers to one txinput) whose previous transaction hash consists of 32 bytes of 0x00 and whose ending (prevTxOutIndex) is FF FF FF FF.
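
Purely as an illustration of the byte pattern described above (not the library's API), a self-contained check might look like this:

// illustrative check of the pattern described above: a single input whose previous
// transaction hash is 32 bytes of 0x00 and whose prevTxOutIndex is 0xFFFFFFFF
def looksLikeAuxPowCandidate(prevTransactionHash: Array[Byte], prevTxOutIndex: Long): Boolean =
  prevTransactionHash.length == 32 &&
    prevTransactionHash.forall(_ == 0x00.toByte) &&
    prevTxOutIndex == 0xFFFFFFFFL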

Support for Ripple

Ripple "is a network of computers which use the Ripple consensus algorithm to atomically settle and record transactions on a secure distributed database, the Ripple Consensus Ledger (RCL). Because of its distributed nature, the RCL offers transaction immutability without a central operator. The RCL contains a built-in currency exchange and its path-finding algorithm finds competitive exchange rates across order books and currency pairs":
https://github.com/ripple/rippled

This implies adding additional Hadoop InputFormats to process the Ripple blockchain. Furthermore, a Hive SerDe and a Spark data source should be created. Unit tests must be included.

Finally, an example for MapReduce, Flink, Hive and Spark should be provided. The examples should include unit and integration tests.

Drop support for JDK 7

JDK 7 has been out of maintenance for a long time, hence we need to drop support for it.
This also means we need to check new features of JDK 8, such as lambdas for the Spark examples in Java, unsigned longs and, where applicable, parallel collections.

Quality Management Release

This is a quality management release that includes:

  • Unit tests for coverage up to 80%
  • Evaluate quality of selected unit tests by enabling Mutation Testing using Pitest
  • Unit tests for the example applications
  • Integration tests for the example applications
  • Automated publishing of test reports to Github pages
  • Fix of bugs and code smells reported by static code analyzers
  • Versioning of build tool (gradle wrapper)

Automated cloud testing

The goal of this task is to automate the system testing of this library by including automated testing:

  • Automatically load the full blockchains into cloud storage (We start with the Bitcoin blockchain)
  • Execute the following test scenarios on the full blockchain
    • Get the balance of each wallet (using Hive on TEZ/ Flink / Spark)
    • Determine the average transaction fees / block (using Hive on TEZ / Flink / Spark)
    • Create Bitcoin Transaction graph and execute a PageRank on it
    • ... to be extended

Automated testing should also report the time needed for each step in the test scenario.

Ethereum: Utility function for EIP-721 (non-fungible tokens)

EIP-721 has been created in the spirit of EIP-20, but focuses on non-fungible tokens. This allows you to trade a huge variety of assets on the Ethereum blockchain, such as physical properties, virtual collectibles or negative-value assets (loans, burdens etc.).

The goal of this utility function is to facilitate analysis of those standardized types of tokens.
It should include unit tests and an example application.

spark-submit NoSuchMethodError org.zuinnote.spark.bitcoin.example.SparkScalaBitcoinTransactionGraph$.extractTransactionData

@jornfranke

I am trying to use spark-submit according to this wiki:
https://github.com/ZuInnoTe/hadoopcryptoledger/wiki/Using-Spark-Scala-Graphx-to-analyze-the-Bitcoin-transaction-graph

spark-submit --class org.zuinnote.spark.bitcoin.example.SparkScalaBitcoinTransactionGraph --master local[8] ./target/scala-2.10/example-hcl-spark-scala-graphx-bitcointransaction.jar /user/bitcoin/input /user/bitcoin/output

It appears that methods such as extractTransactionData are not found. How can I solve it? Thanks very much.

18/06/07 09:54:24 ERROR executor.Executor: Exception in task 10.0 in stage 0.0 (TID 10)
java.lang.NoSuchMethodError: scala.runtime.IntRef.create(I)Lscala/runtime/IntRef;
	at org.zuinnote.spark.bitcoin.example.SparkScalaBitcoinTransactionGraph$.extractTransactionData(SparkScalaBitcoinTransactionGraph.scala:108)
	at org.zuinnote.spark.bitcoin.example.SparkScalaBitcoinTransactionGraph$$anonfun$1.apply(SparkScalaBitcoinTransactionGraph.scala:60)
	at org.zuinnote.spark.bitcoin.example.SparkScalaBitcoinTransactionGraph$$anonfun$1.apply(SparkScalaBitcoinTransactionGraph.scala:60)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:192)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Bitcoin: Add example on how to calculate transaction fee

This is about adding examples for calculating transaction fees:

  • Hive: explode the transactions of a block into a table, calculate their current hash and join the inputs of each transaction with the referenced outputs based on that hash and the output index. The total sum of input values minus the total sum of output values corresponds to the transaction fee (see the sketch after this list).
  • Flink: similar to Hive, using Flink SQL.
  • Spark: similar to Hive, using Spark SQL.
  • MapReduce: here a more detailed MapReduce job needs to be specified.
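
A hedged Spark Scala sketch of the join described above, assuming an RDD txRDD of BitcoinTransaction has already been extracted with this library; the getter names for output values, previous transaction hash and previous output index follow the library's Bitcoin data model as I understand it and should be verified, and satoshi values are treated as Long:

import scala.collection.JavaConverters._
import org.zuinnote.hadoop.bitcoin.format.common.BitcoinUtil

// txRDD: RDD[BitcoinTransaction] containing all transactions (assumed to have been
// extracted from the blocks beforehand); output values are treated as satoshi Longs

// every output value keyed by (tx hash, output index)
val outputValues = txRDD.flatMap { tx =>
  val hash = BitcoinUtil.convertByteArrayToHexString(BitcoinUtil.getTransactionHash(tx))
  tx.getListOfOutputs.asScala.zipWithIndex.map { case (out, idx) => ((hash, idx.toLong), out.getValue) }
}

// every input keyed by the (hash, index) of the output it spends, carrying the spending tx hash
val inputRefs = txRDD.flatMap { tx =>
  val hash = BitcoinUtil.convertByteArrayToHexString(BitcoinUtil.getTransactionHash(tx))
  tx.getListOfInputs.asScala.map { in =>
    ((BitcoinUtil.convertByteArrayToHexString(in.getPrevTransactionHash), in.getPreviousTxOutIndex), hash)
  }
}

// sum of referenced input values per spending transaction
val inputSums = inputRefs.join(outputValues).map { case (_, (spender, value)) => (spender, value) }.reduceByKey(_ + _)

// sum of own output values per transaction
val outputSums = txRDD.map { tx =>
  val hash = BitcoinUtil.convertByteArrayToHexString(BitcoinUtil.getTransactionHash(tx))
  (hash, tx.getListOfOutputs.asScala.map(_.getValue).sum)
}

// fee = referenced input values - own output values (coinbase transactions simply do not join)
val feePerTx = inputSums.join(outputSums).mapValues { case (in, out) => in - out }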
