
narwhal's People

Contributors

asonnino


narwhal's Issues

Committee update

What is the best way to:

  • Update the committee (change of authorities)
  • Change authorities network info (not the authorities per se, but for instance their ip addresses)
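One possible shape for both cases, sketched in Rust with hypothetical names (`Committee`, `update_authorities`, `update_address` are illustrative, not the project's actual API): a change of authorities is a consensus-visible event and bumps the epoch, while a pure network-info change edits an address in place.

```rust
use std::collections::HashMap;

/// Hypothetical committee snapshot: authority identity (a stable
/// public-key string here) mapped to its current network address.
struct Committee {
    epoch: u64,
    authorities: HashMap<String, String>, // pubkey -> "ip:port"
}

impl Committee {
    /// Replacing the authority set is a consensus-visible change,
    /// so it bumps the epoch.
    fn update_authorities(&mut self, authorities: HashMap<String, String>) {
        self.epoch += 1;
        self.authorities = authorities;
    }

    /// Changing only a peer's network info keeps the same epoch:
    /// the authority set itself is unchanged.
    fn update_address(&mut self, pubkey: &str, addr: String) -> bool {
        match self.authorities.get_mut(pubkey) {
            Some(slot) => {
                *slot = addr;
                true
            }
            None => false,
        }
    }
}

fn main() {
    let mut committee = Committee {
        epoch: 0,
        authorities: HashMap::from([("alice".to_string(), "10.0.0.1:3000".to_string())]),
    };
    assert!(committee.update_address("alice", "10.0.0.2:3000".to_string()));
    assert_eq!(committee.epoch, 0); // address change: same committee
    committee.update_authorities(HashMap::new());
    assert_eq!(committee.epoch, 1); // authority change: new epoch
}
```

Keying authorities by a stable public key is what makes the second operation possible without touching the committee itself.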

Redundant DAG Traversal

Hello there,

I am new to this project and have one question about the consensus crate.

I feel that `Consensus::order_leaders` is redundant. The DAG is traversed twice (or more, if some leader has no link to the later blocks) across both `order_leaders` and `order_dag`. Is there any purpose for doing that?
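For readers unfamiliar with the two passes the question refers to, here is a toy paraphrase (not the crate's actual code) of why both exist: the first pass only answers a reachability question between leaders, while the second collects and deduplicates each leader's causal history for delivery.

```rust
use std::collections::{HashMap, HashSet};

// Toy DAG: vertex id -> parent ids, and vertex id -> round.
struct Dag {
    parents: HashMap<u64, Vec<u64>>,
    round: HashMap<u64, u64>,
}

// `order_leaders`-style pass: is `to` in the causal history of `from`?
fn linked(dag: &Dag, from: u64, to: u64) -> bool {
    let mut stack = vec![from];
    let mut seen = HashSet::new();
    while let Some(v) = stack.pop() {
        if v == to {
            return true;
        }
        if seen.insert(v) {
            stack.extend(dag.parents.get(&v).into_iter().flatten().copied());
        }
    }
    false
}

// `order_dag`-style pass: flatten a leader's causal history, skipping
// vertices already delivered by an earlier leader.
fn flatten(dag: &Dag, leader: u64, delivered: &mut HashSet<u64>) -> Vec<u64> {
    let mut out = Vec::new();
    let mut stack = vec![leader];
    while let Some(v) = stack.pop() {
        if delivered.insert(v) {
            out.push(v);
            stack.extend(dag.parents.get(&v).into_iter().flatten().copied());
        }
    }
    out.sort_by_key(|v| dag.round[v]); // deliver in round order
    out
}

fn main() {
    let dag = Dag {
        parents: HashMap::from([(1, vec![]), (2, vec![1]), (3, vec![2])]),
        round: HashMap::from([(1, 1), (2, 2), (3, 3)]),
    };
    assert!(linked(&dag, 3, 1)); // pass 1: leader 3 reaches leader 1
    let mut delivered = HashSet::new();
    assert_eq!(flatten(&dag, 3, &mut delivered), vec![1, 2, 3]); // pass 2
}
```

Even in this toy form the two traversals overlap, which is the redundancy the question points at; the `delivered` set is what keeps the second pass from emitting duplicates.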

Authenticate nodes' channels

We need some (lightweight) way to ensure only members of the committee can talk to each other, and that bad nodes cannot impersonate good ones.
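A minimal sketch of such a gate (names hypothetical): membership in the committee plus a challenge response. The `verify` closure stands in for a real signature check over a fresh nonce, which would come from a crypto library in practice.

```rust
use std::collections::HashSet;

// Hypothetical gate: accept a peer only if its claimed key is in the
// committee AND it answers a fresh challenge. `verify` is a stand-in
// for real signature verification.
struct Gate {
    committee: HashSet<String>,
}

impl Gate {
    fn authenticate<F>(&self, claimed_key: &str, nonce: u64, response: u64, verify: F) -> bool
    where
        F: Fn(&str, u64, u64) -> bool,
    {
        self.committee.contains(claimed_key) && verify(claimed_key, nonce, response)
    }
}

fn main() {
    let gate = Gate {
        committee: HashSet::from(["alice".to_string()]),
    };
    // Toy "signature": the correct response is nonce + 1.
    let verify = |_key: &str, nonce: u64, response: u64| response == nonce + 1;
    assert!(gate.authenticate("alice", 7, 8, verify));
    assert!(!gate.authenticate("mallory", 7, 8, verify)); // not in committee
    assert!(!gate.authenticate("alice", 7, 9, verify)); // bad response
}
```

The membership check alone is not enough, since identities can be claimed; the challenge is what prevents a bad node from impersonating a good one.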

Ensure crash recovery

Have we thought already about crash and recovery?
For example, we need to persist the headers that we already signed so as not to lose this information when we crash and recover. Is it possible to read the important information from the DB every time we recover?

  • Our round.
  • All certificates from previous round.
  • All headers we signed in current round.
  • Digests that we need to re-include?
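A sketch of the startup read, with a `HashMap` standing in for the persistent store (keys like `"round"` and `"signed/<round>"` are invented for illustration; in the project this would be the RocksDB-backed store):

```rust
use std::collections::HashMap;

// What we recover after a crash: our round, plus the header digests we
// already signed in that round (so we never sign twice).
struct RecoveredState {
    round: u64,
    signed_this_round: Vec<String>,
}

fn recover(store: &HashMap<String, String>) -> RecoveredState {
    let round: u64 = store
        .get("round")
        .and_then(|r| r.parse().ok())
        .unwrap_or(0);
    let signed_this_round = store
        .get(&format!("signed/{round}"))
        .map(|v| v.split(',').map(|s| s.to_string()).collect())
        .unwrap_or_default();
    RecoveredState { round, signed_this_round }
}

fn main() {
    let mut store = HashMap::new();
    store.insert("round".to_string(), "7".to_string());
    store.insert("signed/7".to_string(), "h1,h2".to_string());
    let state = recover(&store);
    assert_eq!(state.round, 7);
    assert_eq!(state.signed_this_round, vec!["h1", "h2"]);
}
```

Writing these keys durably before replying with a signature is what makes the recovery reads meaningful.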

Reliable sender's connection replies have a potential ordering issue

Hello.

I think there is a bug in reliable sender's connection's pending replies queue ordering, but please let me know if it is my understanding that is lacking.

If we send multiple messages to a peer, there is no reason the acknowledgements will be received in the same order. Yet the code seems to assume they are, because it sends each received ACK to the first handler in the queue (`pop_front`).
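One way to sketch the fix the report suggests (types and names invented): tag each outgoing message with an id, echo it in the ACK, and look the handler up by id instead of pairing by arrival order.

```rust
use std::collections::HashMap;

// Pending reply handlers keyed by message id rather than arrival order.
struct PendingReplies<T> {
    waiting: HashMap<u64, T>,
    next_id: u64,
}

impl<T> PendingReplies<T> {
    fn new() -> Self {
        Self { waiting: HashMap::new(), next_id: 0 }
    }

    /// Register a handler for an outgoing message; the returned id is
    /// attached to the message and echoed back in the ACK.
    fn register(&mut self, handler: T) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.waiting.insert(id, handler);
        id
    }

    /// An ACK arrived: hand back the matching handler, if any.
    fn on_ack(&mut self, id: u64) -> Option<T> {
        self.waiting.remove(&id)
    }
}

fn main() {
    let mut pending = PendingReplies::new();
    let a = pending.register("handler-a");
    let b = pending.register("handler-b");
    // ACKs arrive out of order, yet each reaches the right handler.
    assert_eq!(pending.on_ack(b), Some("handler-b"));
    assert_eq!(pending.on_ack(a), Some("handler-a"));
    assert_eq!(pending.on_ack(99), None); // unknown id
}
```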

How do the replicas respond to a client?

Hello! I am working on a project built on top of this one.

Reading the code, I see that the client (from `benchmark_client.rs`) sends data to the replicas here. A replica accepts it while making and committing a new block. During this call, the client does not wait for a reply from the replica, and thus receives no data from it.

Consider a scenario where the client needs to fetch the data stored on the state machine replication. Could you please give me some hints on how I can modify your code so that replicas could respond to client requests?

Thanks a lot!
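One possible approach to the question above (a sketch, not part of the codebase; `ReplyRouter` and the request-id scheme are invented): give each client request an id, park a reply channel under that id when the request arrives, and complete it once the transaction commits.

```rust
use std::collections::HashMap;
use std::sync::mpsc;

// Parked reply channels, keyed by client request id.
struct ReplyRouter {
    pending: HashMap<u64, mpsc::Sender<String>>,
}

impl ReplyRouter {
    fn new() -> Self {
        Self { pending: HashMap::new() }
    }

    /// A client request arrived: park a reply channel and hand the
    /// receiving end to whoever serves that client connection.
    fn on_request(&mut self, request_id: u64) -> mpsc::Receiver<String> {
        let (tx, rx) = mpsc::channel();
        self.pending.insert(request_id, tx);
        rx
    }

    /// The transaction committed: route the result back to the client.
    fn on_commit(&mut self, request_id: u64, result: String) {
        if let Some(tx) = self.pending.remove(&request_id) {
            let _ = tx.send(result); // client may have gone away
        }
    }
}

fn main() {
    let mut router = ReplyRouter::new();
    let rx = router.on_request(42);
    router.on_commit(42, "committed".to_string());
    assert_eq!(rx.recv().unwrap(), "committed");
}
```

For reads against the replicated state, the reply would be produced by the execution layer at commit time rather than by the mempool itself.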

Panic upon storage failure

There is no point in keeping the system running if the storage fails (this is in fact dangerous). We currently panic upon storage failure, but only at a late stage.

Smarter sync mechanism

There are many ways to improve our current sync strategy.

For instance, if we get the same parent from more than one peer, we can ask these peers first before selecting at random.

Another example could be:

  • Ask all nodes for missing data
  • All nodes start streaming chunks (FC coded) of the data
  • Stop streaming once we can reconstruct the data

All nodes keep the already-FC-coded data in memory in case others need to sync.
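The first idea above can be sketched as a target-selection helper (names invented; the fallback here is deterministic rather than random, purely to keep the sketch self-contained):

```rust
use std::collections::HashMap;

// When a digest is missing, ask the peers that already referenced it
// before falling back to arbitrary ones, up to a fixed fanout.
fn pick_sync_targets(
    advertisers: &HashMap<String, Vec<String>>, // digest -> peers that sent it
    digest: &str,
    all_peers: &[String],
    fanout: usize,
) -> Vec<String> {
    let mut targets: Vec<String> =
        advertisers.get(digest).cloned().unwrap_or_default();
    for peer in all_peers {
        if targets.len() >= fanout {
            break;
        }
        if !targets.contains(peer) {
            targets.push(peer.clone());
        }
    }
    targets.truncate(fanout);
    targets
}

fn main() {
    let advertisers = HashMap::from([(
        "digest-1".to_string(),
        vec!["peer-2".to_string()],
    )]);
    let all = vec![
        "peer-1".to_string(),
        "peer-2".to_string(),
        "peer-3".to_string(),
    ];
    // peer-2 advertised the digest, so it is asked first.
    assert_eq!(
        pick_sync_targets(&advertisers, "digest-1", &all, 2),
        vec!["peer-2", "peer-1"]
    );
}
```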

Configuration and logs of test results

Hello,
We conducted a 100-node test on the WAN; the test configuration and results are as follows:

+ CONFIG:
Faults: 33 node(s)
Committee size: 100 node(s)
Worker(s) per node: 1 worker(s)
Collocate primary and workers: True
Input rate: 234,500 tx/s
Transaction size: 200 B
Execution time: 41 s
Header size: 1,000 B
Max header delay: 200 ms
GC depth: 50 round(s)
Sync retry delay: 10,000 ms
Sync retry nodes: 33 node(s)
Batch size: 200,000 B
Max batch delay: 200 ms
+ RESULTS:
Consensus TPS: 213,986 tx/s
Consensus BPS: 42,797,208 B/s
Consensus latency: 4,771 ms
End-to-end TPS: 207,890 tx/s
End-to-end BPS: 41,577,920 B/s
End-to-end latency: 7,852 ms

We have some questions about the test log and configuration.

  1. First, according to the configuration, the sending rate of each client is 3,500 tx/s, and our test time is 30 s. But according to the client's log, each client sends only about 800 tx in this period of time.
  2. According to the worker's log, every 4 tx make up a batch, but the batch is displayed as containing 140,000 B, which doesn't seem to match the configured tx_size = 200 B.
Batch jOiahFVevxMc4+RQEIlZfEjFHha/oBesYqcEHBKSZiU= contains sample tx 786
Batch jOiahFVevxMc4+RQEIlZfEjFHha/oBesYqcEHBKSZiU= contains sample tx 787
Batch jOiahFVevxMc4+RQEIlZfEjFHha/oBesYqcEHBKSZiU= contains sample tx 788
Batch jOiahFVevxMc4+RQEIlZfEjFHha/oBesYqcEHBKSZiU= contains sample tx 789
Batch jOiahFVevxMc4+RQEIlZfEjFHha/oBesYqcEHBKSZiU= contains 140000 B
  3. I would also like to ask about the meaning of the value after each "Committed B" entry in the primary log.
Committed B97(mZFTSr1a8XJoClO4) -> WK0oFGTH44pm3PYAtejZ05EDysdIuDJ1MZuphZQe3m4=
Committed B97(mZFTSr1a8XJoClO4) -> isS9EtiKzZ2qm3DfL37mt8o02TPS519+/aEDggnzTTE=
Committed B97(mZFTSr1a8XJoClO4) -> vI69CmxG2PlTMDc5GYZreMZVUIFIhS8zSIQCQmBxAlk=
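On the second question, a quick arithmetic check (this only verifies internal consistency; whether the benchmark logs only specially tagged "sample" transactions should be confirmed in the client code): at 200 B per transaction, the 140,000 B batch in the log holds 700 transactions, so a handful of "sample tx" lines need not contradict the batch size.

```rust
// Consistency check: number of 200-byte transactions in a 140,000-byte batch.
fn txs_per_batch(batch_bytes: u64, tx_bytes: u64) -> u64 {
    batch_bytes / tx_bytes
}

fn main() {
    assert_eq!(txs_per_batch(140_000, 200), 700);
    // The configured 200,000 B batch-size cap would hold at most 1,000 txs.
    assert_eq!(txs_per_batch(200_000, 200), 1_000);
}
```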

No store for synchronizer

Do not use the store in the synchronizer; it can be much faster to keep the data in memory (we have a lot of memory).

Accounting for sync replies

We currently reply to any sync request we receive, which costs us resources (specifically for the worker). We need to do some accounting to prevent bad nodes from monopolizing our resources.
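A sketch of what such accounting could look like (a per-peer token bucket; the name `SyncBudget` and the numbers are illustrative):

```rust
use std::collections::HashMap;

// Per-peer budget of sync replies, so no single peer can monopolize
// the worker's resources between refills.
struct SyncBudget {
    tokens: HashMap<String, u32>,
    per_peer: u32,
}

impl SyncBudget {
    fn new(per_peer: u32) -> Self {
        Self { tokens: HashMap::new(), per_peer }
    }

    /// Returns true if we should serve this peer's sync request.
    fn allow(&mut self, peer: &str) -> bool {
        let left = self.tokens.entry(peer.to_string()).or_insert(self.per_peer);
        if *left == 0 {
            return false;
        }
        *left -= 1;
        true
    }

    /// Called periodically to restore everyone's budget.
    fn refill(&mut self) {
        self.tokens.clear();
    }
}

fn main() {
    let mut budget = SyncBudget::new(2);
    assert!(budget.allow("peer-1"));
    assert!(budget.allow("peer-1"));
    assert!(!budget.allow("peer-1")); // budget exhausted
    assert!(budget.allow("peer-2")); // other peers unaffected
    budget.refill();
    assert!(budget.allow("peer-1"));
}
```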

Re-include missed txs into our blocks

Currently, we never do it. But it might be the case that the upper-layer consensus declined our block, and thus we need to re-include its txs. We need to think about the API with the consensus layer: it should tell us which blocks we can move to cold storage and which we need to retry.

We currently re-include digests until they appear in a certified header. However, a certified header might still not get into the DAG, so we need to think of a more accurate condition for when to stop re-including digests.

Protect primary against DoS

A bad node may make us run out of memory by sending many headers with very high round numbers. An easy fix is to add one parent certificate (not just its hash) to the header, and only sign headers whose round equals `certificate.round + 1`.
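The proposed check is small; as a sketch (assuming the header indeed embeds one full parent certificate whose round we can read):

```rust
// Only sign a header whose round is exactly one past the round of a
// parent certificate it carries; headers at arbitrary future rounds
// are rejected before we allocate anything for them.
fn should_sign(header_round: u64, parent_certificate_round: u64) -> bool {
    header_round == parent_certificate_round + 1
}

fn main() {
    assert!(should_sign(6, 5));
    assert!(!should_sign(100, 5)); // inflated round: refuse to sign
    assert!(!should_sign(5, 5)); // stale round: refuse to sign
}
```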

Support read operations

What is the best way to support read operations for clients? Remember that the state is sharded amongst the workers.
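One option (a sketch; the hashing scheme is illustrative and says nothing about how Narwhal actually assigns transactions to workers): route each single-key read to the worker shard that owns the key, e.g. by hash modulo the worker count.

```rust
// FNV-1a hash of the key, reduced modulo the number of workers, picks
// the shard to ask. A single-key read then goes to exactly one worker.
fn owning_worker(key: &str, num_workers: usize) -> usize {
    let h: u64 = key
        .bytes()
        .fold(14695981039346656037, |h, b| {
            (h ^ b as u64).wrapping_mul(1099511628211)
        });
    (h % num_workers as u64) as usize
}

fn main() {
    let w = owning_worker("account-123", 4);
    assert!(w < 4); // always a valid shard index
    assert_eq!(w, owning_worker("account-123", 4)); // deterministic routing
}
```

Reads that span multiple shards would instead need a scatter-gather across all workers.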
