Handel

Handel is a fast multi-signature aggregation protocol for large Byzantine committees. This is the reference implementation in Go.

The protocol

Handel is a Byzantine fault tolerant aggregation protocol that allows for the quick aggregation of cryptographic signatures over a WAN. Handel has both logarithmic time and network complexity and needs minimal computing resources. For more information about the protocol, we refer you to the following presentations:

We have a paper in submission available here: https://arxiv.org/abs/1906.05132 Please note that the slides are not up-to-date with the latest version of the paper.

The reference implementation

Handel is an open-source Go library implementing the protocol. It includes many extension points to allow plugging in different signature schemes, or even other forms of aggregation besides signature aggregation. We implemented extensions to use Handel with BLS multi-signatures over the BN256 curve. We ran large-scale tests, evaluating Handel on 2000 AWS nano instances located in 10 AWS regions and running two Handel nodes per instance. Our results show that Handel scales logarithmically with the number of nodes in both communication and resource consumption. Handel aggregates 4000 BN256 signatures with an average completion time of 900ms and an average network consumption of 56KBytes.

Installation

This library requires Go version 1.11+.

This library uses Go modules, so make sure you either clone it outside your $GOPATH or set GO111MODULE=on before building it.

If you want to hack around the library, you can find more information about the internal structure of Handel in the HACKING.md file.

License

The library is licensed with an Apache 2.0 license. See LICENSE for more information.

Contributors

bkolad, nikkolasg, nkeywal


Issues

Master node should start slaves

Currently the AWS platform starts both master and slave nodes; we could use the master to fully control the lifetime of the slave nodes.

discussion - how many packets per level

We should discuss the two approaches the two Nicolas have implemented:

  • Each time there is a better signature for a given level, we send this new signature to candidateCount peers. The periodic update only sends the current best signature at each level. This sends a lot of "low quality" signatures but homogenizes all nodes.
  • When there is a better signature, it is sent only during the next iteration of the periodic update, to one single peer. Only when there is a full signature do we send to candidateCount peers. This reduces the load of "low quality" signatures but can produce more heterogeneous signatures between peers.
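The contrast between the two approaches can be sketched as follows; `eagerPolicy`, `lazyPolicy` and the returned peer counts are hypothetical simplifications for discussion, not Handel's actual code.

```go
package main

import "fmt"

// eagerPolicy: any improvement goes to candidateCount peers immediately.
func eagerPolicy(candidateCount int) int {
	return candidateCount
}

// lazyPolicy: a partial improvement waits for the periodic update and goes
// to a single peer; only a full signature fans out to candidateCount peers.
func lazyPolicy(candidateCount int, isComplete bool) int {
	if isComplete {
		return candidateCount
	}
	return 1
}

func main() {
	fmt.Println(eagerPolicy(10))       // 10 peers on every improvement
	fmt.Println(lazyPolicy(10, false)) // 1 peer for a partial improvement
	fmt.Println(lazyPolicy(10, true))  // 10 peers for a full signature
}
```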

remote execution will hang if we print lines too long

The remote execution will hang if we print lines that are too long. It works if they are shorter (e.g. split with \n).
This can be reproduced with the following patch in the simul package:

diff --git a/processing.go b/processing.go
index ea7de83..514059b 100644
--- a/processing.go
+++ b/processing.go
@@ -7,6 +7,7 @@ package handel
 import (
        "errors"
        "fmt"
+       "os"
        "sync"
        "time"
 )
@@ -315,10 +316,11 @@ func (f *evaluatorProcessing) processStep() bool {
 func (f *evaluatorProcessing) verifyAndPublish(sp *incomingSig) {
        startTime := time.Now()
        err := (error)(nil)
-       if f.sigSleepTime <= 0 {
+       if f.sigSleepTime <= 0 && false {
                err = verifySignature(sp, f.msg, f.part, f.cons)
        } else {
-               time.Sleep(time.Duration(f.sigSleepTime * 1000000))
+               //time.Sleep(time.Duration(f.sigSleepTime * 1000000))
+               os.Stdout.WriteString("*********************************************************************************************************************************************************************************************************")
        }
        endTime := time.Now()

bn256.hashedMessage panics for some messages

For some messages M, the hashedMessage function panics.
Example:
message = []byte("I am the byzantine general.")

Reason:
Under the hood, bn256.RandomG1(reader) uses the rand.Int(rand io.Reader, max *big.Int) function.
This function assumes the io.Reader implementation can be called
many times and reads fresh data each time.

We fail to satisfy this condition because we create the reader with
reader := bytes.NewBuffer(hashed); this reader has finite capacity, and
_, err = io.ReadFull(reader, bytes) will return an error once the reader is exhausted. We don't handle this error, hence the bug.
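One possible fix is to give the sampler an inexhaustible deterministic reader instead of a finite buffer. The sketch below assumes re-hashing the internal state is an acceptable way to extend the stream; `hashReader` and `newHashReader` are hypothetical names, not the actual fix in the codebase.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
)

// hashReader derives an endless deterministic byte stream from a seed by
// repeatedly re-hashing its state, so callers like rand.Int can read as
// much as they want without ever hitting EOF.
type hashReader struct {
	state [sha256.Size]byte
	buf   []byte // unread bytes of the current state
}

func newHashReader(seed []byte) *hashReader {
	return &hashReader{state: sha256.Sum256(seed)}
}

func (r *hashReader) Read(p []byte) (int, error) {
	n := 0
	for n < len(p) {
		if len(r.buf) == 0 {
			// Refill: hash the previous state to get 32 fresh bytes.
			r.state = sha256.Sum256(r.state[:])
			r.buf = r.state[:]
		}
		c := copy(p[n:], r.buf)
		r.buf = r.buf[c:]
		n += c
	}
	return n, nil
}

func main() {
	r := newHashReader([]byte("I am the byzantine general."))
	buf := make([]byte, 100) // more than one SHA-256 output
	if _, err := io.ReadFull(r, buf); err != nil {
		panic(err) // cannot happen: the reader is never exhausted
	}
	fmt.Printf("read %d bytes without EOF\n", len(buf))
}
```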

Simulation framework

We need a simulation implementation strategy. There are different ways we could do this; please add to the list if you think of other ways.

Architecture

There are probably many ways to design a simulation framework; we should list the different options here. Here is the one we implemented at my previous job, in Go (we could probably take some pieces here and there to drastically reduce dev time):

  1. One "sink" node receives measurements from all nodes running the experiment. Every node knows how to contact that sink - via a separate TCP connection or UDP datagrams. At the end, the sink computes the average, min, deviation, etc. and outputs the result in a CSV file. You can find the relevant code / packages here

Interfaces

First, in order to collect relevant measurements, we need configurable network, store, and processing interfaces, as well as Handel structs, so we can wrap them with measurement-related functionality. I.e., envision something like:

// MeasurementNetwork wraps an existing Network and counts traffic.
type MeasurementNetwork struct {
    packetsSent     uint32
    packetsReceived uint32
    Network // embedded: all other methods are forwarded
}

func (m *MeasurementNetwork) Send(ids []Identity, p *Packet) {
    m.packetsSent++
    m.Network.Send(ids, p)
}

Multiple solutions possible:

  1. Export Handel's fields of general interfaces (processing, store, etc.) so one
    can wrap them into another interface. The simul/ package can contain the
    wrappers.
    • PRO: Very easy to wrap interfaces around.
    • CON: Public fields of Handel are exposed.
  2. Have a "constructor" function for each interface, put into the
    config struct. We could even make the current implementations public.
    • PRO: Quite modular.
    • CON: Larger config; difficult to know in advance which fields are
      required when creating an interface.
  3. Set up a "SimulationHandel" struct, with its own interfaces, inside the
    handel package.
    • PRO: Every implementation detail could be kept hidden but still usable
      for collecting results; the code should be separable from the main
      logic.
    • CON: Simulation code is separated but still in the same package, so not so
      "production-ready".

CI started to fail

Without any change to the code.
It seems we can reproduce the problem locally:
FAIL github.com/ConsenSys/handel/simul/p2p/libp2p [build failed]
@nikkolasg any insight?

Add threshold flexibility to start a level

Today we start a level only when we have all the signatures for it.
We could do something smarter when the missing signature comes from a node that should have communicated long ago.

Technically, we could have a module to identify suspicious nodes. If the missing sig comes from a suspicious node, we start the level. A node would become suspicious if it hasn't sent its signature after a given delay, or if it hasn't responded when we use TCP or QUIC to communicate.

Add metrics about signatures

We should track, per node:

  • the number of signatures removed from the queue (because a better signature already exists)
  • the time taken to check a signature
  • the length of the queue of signatures to verify

The last one will be useful to check that we don't overload the CPU if we run more than one Handel node per two cores.

Monitoring: in the report we have more messages received than sent

For example:
network,nodes,run,threshold,net_rcvd_min,net_rcvd_max,net_rcvd_avg,net_rcvd_sum,net_rcvd_dev,net_sent_min,net_sent_max,net_sent_avg,net_sent_sum,net_sent_dev,sigen_system_min,sigen_system_max,sigen_system_avg,sigen_system_sum,sigen_system_dev,sigen_user_min,sigen_user_max,sigen_user_avg,sigen_user_sum,sigen_user_dev,sigen_wall_min,sigen_wall_max,sigen_wall_avg,sigen_wall_sum,sigen_wall_dev,sigs_sigCheckedCt_min,sigs_sigCheckedCt_max,sigs_sigCheckedCt_avg,sigs_sigCheckedCt_sum,sigs_sigCheckedCt_dev,sigs_sigCheckingTime_min,sigs_sigCheckingTime_max,sigs_sigCheckingTime_avg,sigs_sigCheckingTime_sum,sigs_sigCheckingTime_dev,sigs_sigQueueSize_min,sigs_sigQueueSize_max,sigs_sigQueueSize_avg,sigs_sigQueueSize_sum,sigs_sigQueueSize_dev,sigs_sigSuppressed_min,sigs_sigSuppressed_max,sigs_sigSuppressed_avg,sigs_sigSuppressed_sum,sigs_sigSuppressed_dev,store_replaceTrial_min,store_replaceTrial_max,store_replaceTrial_avg,store_replaceTrial_sum,store_replaceTrial_dev,store_successReplace_min,store_successReplace_max,store_successReplace_avg,store_successReplace_sum,store_successReplace_dev
udp,202,0,200,148.000000,422.000000,252.185000,50437.000000,62.003978,140.000000,376.000000,245.085000,49017.000000,50.381736,0.432000,1.476000,0.961380,192.276000,0.262533,10.992000,47.940000,28.775040,5755.008000,8.814224,5.350673,19.115064,13.303494,2660.698744,3.087293,21.000000,237.000000,65.805000,13161.000000,37.153586,13.296296,79.000000,38.689162,7737.832348,11.845670,0.307692,82.571429,14.384738,2876.947551,12.854892,15.000000,361.000000,126.305000,25261.000000,53.479268,1.000000,190.000000,38.105000,7621.000000,35.554530,13.000000,34.000000,23.345000,4669.000000,3.400558

Insecure hashing in bn256/sign method

The method used to hash a message to a point, m -> scalar s -> s * G, is insecure; it was chosen because no easy method is provided by the Go or CF packages, and because of time pressure. We should try to implement a correct method, maybe by following the ideas in this paper: https://www.di.ens.fr/~fouque/pub/latincrypt12.pdf. That will probably require forking Go's or CF's package in order to access the lower-level methods.

QUIC network implementation thrashes sessions

I did a quick review of the code for the QUIC network implementation. It seems to be creating a QUIC session and dropping it for every incoming packet. Is this intentional? If yes, why?

Of course thrashing sessions will penalise performance – and a 3x slowdown when compared to UDP is not even bad in that circumstance.

To run fair UDP vs QUIC comparisons, this aspect should be fixed.

(BTW – apologies in advance if I misread the code – I did a very quick pass)

Relevant code: https://github.com/ConsenSys/handel/blob/master/network/quic/net.go#L127

CC @marten-seemann

BinomialPartitioner: optimizations

The binomial partitioner currently performs heavy computations each time it computes the partitioning of a level, etc. These computations could be greatly optimized and maybe even cached. For the former, using simple binary operations on the IDs to compute the common prefix length should be sufficient, for example.
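The binary-operations idea can be sketched as follows: one XOR plus a leading-zeros count gives the common prefix length directly. This is an illustration of the suggested optimization under the assumption of a fixed-width integer ID space, not the partitioner's actual code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// commonPrefixLen returns the length of the common binary prefix of two
// node IDs in a bitLen-bit ID space.
func commonPrefixLen(a, b uint32, bitLen int) int {
	x := a ^ b // differing bits; the first set bit ends the common prefix
	if x == 0 {
		return bitLen // identical IDs share the whole prefix
	}
	return bits.LeadingZeros32(x) - (32 - bitLen)
}

func main() {
	// IDs 0b1010 and 0b1001 share the prefix "10" in a 4-bit ID space.
	fmt.Println(commonPrefixLen(0b1010, 0b1001, 4)) // 2
}
```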

Add flag for platform-specific configuration in simul/main.go

For now we pass platform-specific parameters as flags to the generic
simul/main.go launcher. This can be confusing, as the user has to understand which flags are needed for a given platform. For example, the -regions flag is required for the AWS platform but not for localhost.

Solution:
Add a platform-specific config file.

Libp2p - weird behaviors

We now have a comparative baseline simulation using libp2p where each peer connects to a few other peers (designated by a "Count" parameter in a config file), subscribes to the "handel" topic, broadcasts its signature and waits to receive enough signatures.
Unfortunately, this simulation exhibits weird behaviors (~failures) of the libp2p pubsub library. We can test these failures in two different ways, in the fail_libp2p branch:

  1. Running the test TestGossipMeshy in simul/p2p/libp2p, which is directly inspired by the tests found in the libp2p/pubsub repo.
  2. Running the simulation in simul/ with go run main.go -config config_gossip.toml -platform localhost - it's the generalization of the tests. Even with a large number of connected peers, the simulation fails most often.

Please note that sometimes these tests pass, but most often they don't - repeat the experiment!

For the tests, using a Neighbor connector (which makes each peer connect only to some "neighbors" in the ID space, modulo, so all the peers' connections form a circle and the graph is fully connected) works. On the contrary, using the Random connector that randomly connects peers (as in the libp2p pubsub tests) fails most of the time.

fifoProcessing: deprecated

fifoProcessing is no longer used in the main code, only in two places in the tests. It should be removed.

Processing + Partitioner still uses fmt.Printf

github.com/ConsenSys/handel.logf (NOT the logger interface) is called from these 5 sites:

partitioner.go|231 col 8| static function call from (*github.com/ConsenSys/handel.binomialPartitioner).Combine
partitioner.go|244 col 7| static function call from (*github.com/ConsenSys/handel.binomialPartitioner).Combine
processing.go|364 col 7| static function call from github.com/ConsenSys/handel.verifySignature
processing.go|457 col 7| static function call from (*github.com/ConsenSys/handel.fifoProcessing).verifySignature
processing.go|418 col 8| static function call from (*github.com/ConsenSys/handel.fifoProcessing).processIncoming

Handel constructor simplification

At the moment, the Handel constructor looks like the following:

func NewHandel(n Network, r Registry, id Identity, c Constructor, msg []byte, s Signature, conf ...*Config) *Handel

I see two problems with that:

  1. It is very long: 7 arguments is a lot, even more so for Go. It adds quite a cognitive
    load to understand all these arguments and set them properly.
  2. The Config contains the "Contributions" field, which may need to be changed. Of
    course one can take the default value, but if a user sets it once, they have to set it
    every other time as well, whenever the number of ids changes.
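One way to address both problems is functional options with defaults derived from the number of ids. The sketch below is only an illustration of that pattern; Config here is a stand-in, and Option, WithContributions and NewConfig are hypothetical names, not the library's actual API.

```go
package main

import "fmt"

// Config is a stand-in for the real config struct.
type Config struct {
	Contributions int // how many ids are expected to contribute
}

type Option func(*Config)

func WithContributions(n int) Option {
	return func(c *Config) { c.Contributions = n }
}

// NewConfig derives the default from the number of ids, so a user who
// changes the committee size no longer has to remember to update the field.
func NewConfig(nbIDs int, opts ...Option) *Config {
	c := &Config{Contributions: nbIDs}
	for _, o := range opts {
		o(c)
	}
	return c
}

func main() {
	fmt.Println(NewConfig(16).Contributions)                        // 16
	fmt.Println(NewConfig(16, WithContributions(12)).Contributions) // 12
}
```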

Real / Faked Latency for AWS ?

How do we simulate latency for AWS instances within one region? Do we need to simulate it at all, given the time conditions? We should at least explore the naive solution of adding a time.Sleep(100 * time.Millisecond) at the network level and see how it compares to not doing so.
