
exonum's Issues

Add basic peer discovery mechanism.

It is proposed to broadcast a newer Connect message, sent by a given PublicKey, to all existing peers in exonum::node::NodeHandler.handle_connect(&mut self, message: Connect).
This would make it easy to add new full nodes to the blockchain network.
The change should be coordinated with #14 and include a limit on how frequently a new Connect message from the same PublicKey is handled.
https://*************/projects/22/tasks/1307.

Implement new leader election algorithm

Update the leader election algorithm to provide weak censorship resistance.

The changes are as follows:

  1. Every author of an accepted proposal moves to a disabled state for F blocks (during the next F blocks it has no right to create new block proposals). The node behaves as usual in all other activities, including voting for a new block, signing messages, etc.
  2. We need to shuffle the possible leader nodes in a deterministic manner. To do so, we take a permutation over M = N - F validators. The permutation number is calculated as T = Hash(H) mod M!. Such a calculation provides a uniform distribution of orders, so that Byzantine validators end up randomly distributed among the leader positions for the current height H (see the sketch below).
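A minimal sketch (not the exonum implementation) of picking the T-th permutation of the M eligible validators via the factorial number system; here `t` is assumed to be Hash(H) mod M! already reduced to a u64 for illustration (a real implementation needs big-integer arithmetic for M > 20):

fn nth_permutation(m: usize, mut t: u64) -> Vec<usize> {
    // factorials[i] = i!
    let mut factorials = vec![1u64; m];
    for i in 1..m {
        factorials[i] = factorials[i - 1] * i as u64;
    }
    let mut pool: Vec<usize> = (0..m).collect(); // validator indices 0..M-1
    let mut order = Vec::with_capacity(m);
    for i in (0..m).rev() {
        // pick the (t / i!)-th remaining validator, then continue with the remainder
        let idx = (t / factorials[i]) as usize;
        t %= factorials[i];
        order.push(pool.remove(idx));
    }
    order
}

Since T is derived from Hash(H), every honest validator computes the same leader order for height H.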

LevelDB doesn't link on Ubuntu 12.04

To compile on Travis we need to use the trusty image (`dist: trusty`).
The errors look like:

/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/libleveldb.a(env_posix.o): In function `leveldb::(anonymous namespace)::PosixEnv::Schedule(void (*)(void*), void*)':

(.text+0xaf2): undefined reference to `operator delete(void*)'

/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/libleveldb.a(env_posix.o): In function `leveldb::(anonymous namespace)::PosixEnv::Schedule(void (*)(void*), void*)':

(.text+0xb8b): undefined reference to `std::__throw_bad_alloc()'

The error is probably in the C++ runtime linkage (libstdc++ is not being linked).

Use `Duration` for timeouts

Currently we are using SystemTime for timeouts, for example:

pub fn add_status_timeout(&mut self) {
    let time = self.channel.get_time() + Duration::from_millis(self.status_timeout());
    self.channel.add_timeout(NodeTimeout::Status, time);
}

Duration can be used instead:

add_timeout(NodeTimeout::Status, Duration::from_millis(self.status_timeout()));

However, a straightforward implementation will change timeout behavior because timeouts are handled through the same channel as other events. Perhaps we need separate channels/queues.
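A minimal sketch of what the call site could look like if the channel resolved absolute deadlines internally (this signature is an assumption, not the current API):

pub fn add_status_timeout(&mut self) {
    // the channel is assumed to compute `now + duration` itself,
    // so callers only deal with relative Duration values
    self.channel
        .add_timeout(NodeTimeout::Status, Duration::from_millis(self.status_timeout()));
}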

Managing node through managing API ("RPC")

Proposed methods:

  • blockchain-info () - current info on the blockchain (current height, etc.); covered by #131 and earlier
  • block-hash () - block hash by height; covered by #131 and earlier
  • block-header () - block header with signatures, by height/hash; covered by #131
  • tx () - return transaction by hash; probably a separate method or parameter to omit the proof of inclusion into the block; covered by #131
  • mempool-info () - private; mempool size. Closed as part of #224
  • mempool-list (hash) - private; get the status (mempool, committed, unknown) of a transaction, its body, and its location if committed. Closed as part of #224
  • stop () - stops the server; deferred to #149
  • help () - command help. Not relevant for a RESTful API; documentation instead.
  • generate () - generate one or more blocks in demo mode?? Not needed; we're in demo mode by default 🚸
  • propose-get () - return a propose; deferred to #151
  • propose-submit () - broadcast a propose; deferred to #151
  • (prevote|precommit)-(get|submit) - handles to atomic operations of Byzantine behavior; deferred to #151
  • viewing consensus messages from other nodes? Currently these aren't persisted, apart from precommits; deferred to #152
  • prioritize () - change tx priority in the pool; deferred to #150
  • network-info () - private; static protocol_id, static network_id, list of mounted services. Closed as part of #224
  • network-totals () - traffic etc. (we don't collect this info now); deferred to #153
  • peer-info () - private; get all incoming/outgoing connections, the peers map of state, and actual reconnects. Closed as part of #224
  • peer-add (ip-address) - private; generate Event(ConnecTo(ip-address)). Closed as part of #224
  • peer-remove () - deferred; to be controlled by reading the config during node restart (#14)

Generic fabric for clap configuration

There should be a way to create a configuration for a real network:

In an internal discussion we decided to split configuration creation into the following steps (a CLI sketch follows the list):

  1. Create a config template.
  2. Each validator adds itself to this template in some order.
  3. The template is then propagated to all validators.
  4. Each validator finalizes this template into its own config.
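A rough CLI sketch of these steps, assuming clap 2.x; the binary and subcommand names are hypothetical, and step 3 (propagating the template) happens out of band:

extern crate clap;
use clap::{App, Arg, SubCommand};

fn main() {
    let matches = App::new("exonum-config")
        .subcommand(SubCommand::with_name("generate-template")
            .about("Step 1: create the common config template"))
        .subcommand(SubCommand::with_name("add-validator")
            .about("Step 2: append this validator's keys to the template")
            .arg(Arg::with_name("TEMPLATE").required(true)))
        .subcommand(SubCommand::with_name("finalize")
            .about("Step 4: produce this node's own config from the shared template")
            .arg(Arg::with_name("TEMPLATE").required(true)))
        .get_matches();

    match matches.subcommand_name() {
        Some("generate-template") => { /* write an empty template */ }
        Some("add-validator") => { /* merge this validator's public keys into the template */ }
        Some("finalize") => { /* emit the node-local config */ }
        _ => {}
    }
}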

Make the consensus algorithm time-independent

  1. Remove time from Propose, Block, and Request* messages, together with the logic that validates it.
  2. Add time to Precommit messages (the current time at the moment of message creation; it is not validated in any way upon receipt) — see the sketch below.
  3. Update the tests and make sure everything works.
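A rough sketch of the proposed Precommit layout in the message! notation used elsewhere in these issues; the constants and byte offsets are hypothetical:

message! {
    Precommit {
        const MESSAGE_TYPE = 4;  // hypothetical
        const SIZE = 92;         // hypothetical

        validator:      u32         [00 => 04]
        height:         u64         [04 => 12]
        round:          u32         [12 => 16]
        propose_hash:   &Hash       [16 => 48]
        block_hash:     &Hash       [48 => 80]
        time:           SystemTime  [80 => 92]  // set at creation; not validated on receipt
    }
}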

The merge request and the code are being pushed to GitLab for now.

Investigate external/internal ip for network discovery.

The fact that a node initially sends a statically defined addr of itself in the Connect message may be problematic for deploying nodes across different networks/organizations.
In the general case, a node cannot know its own IP, as seen by another peer, without using external services.
Moreover, a node's IP may differ from different peers' perspectives.
#16
https://*********/projects/22/tasks/1307

precommits verification

@defuz @alekseysidorov: found one likely bug in precommit verification.
It seems there is no code to verify that precommits are from distinct validators. Replicating a single precommit self.state.majority_count() times would suffice to pass verification (a possible fix is sketched after the snippet below).

        let precommits = msg.precommits();
        if precommits.len() < self.state.majority_count() ||
           precommits.len() > self.state.validators().len() {
            error!("Received block without consensus, block={:?}", msg);
            return;
        }
        let precommit_round = precommits[0].round();
        for precommit in &precommits {
            let r = self.verify_precommit(&block_hash, block.height(), precommit_round, precommit);
            if let Err(e) = r {
                error!("{}, block={:?}", e, msg);
                return;
            }
        }
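A possible fix sketch: count only precommits from distinct validators before comparing against the majority threshold (field and method names follow the snippet above):

use std::collections::HashSet;

let mut seen = HashSet::new();
for precommit in &precommits {
    // reject a block whose precommits repeat the same validator
    if !seen.insert(precommit.validator()) {
        error!("Duplicate precommit from validator {:?}, block={:?}",
               precommit.validator(), msg);
        return;
    }
}
if seen.len() < self.state.majority_count() {
    error!("Received block without consensus, block={:?}", msg);
    return;
}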

Test time-independent consensus algorithm in denial scenarios 5/1 and 6/2 (no txs load).

propose_timeout=500 in all cases. These are empty blocks.

  1. For
  • 6 and 8 nodes
  • round_timeout = 3000
  • status_timeout = 5000
    performance is poor: many (1-6) rounds per empty block.

5-12 blocks per minute on 6 nodes.

  2. For
  • 6 and 8 nodes
  • round_timeout = 3000
  • status_timeout = 3000

74 blocks per minute on 6 nodes.

  3. For
  • 5/1 and 6/2 nodes (5/1 means 6 validators total, 1 stopped for maintenance or due to denial).
  • round_timeout = 3000
  • status_timeout = 1000
    No data.

This is related to #2 and #29.

Refactor error handling

As proposed in #39, it would be convenient to change functions returning () that contain code like

if some_bad_case {
  error!("ERROR MESSAGE");
  return;
}

or

let val = match get() {
    Ok(val) => val,
    Err(err) => {
        error!("{:?}", err);
        return;
    }
};

into functions returning Result<(), Error>, so the code above could be rewritten as:

let val = get()?;
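A minimal sketch of a handler after the change, assuming a handler-level Error type (handle_msg, do_something, and the Error variant are hypothetical):

fn handle_msg(&mut self) -> Result<(), Error> {
    if some_bad_case {
        // return the error to the caller instead of logging and silently returning
        return Err(Error::Other("ERROR MESSAGE".into())); // Error variant is hypothetical
    }
    let val = get()?; // propagates the error upward
    do_something(val);
    Ok(())
}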

Core schema and its use in services

I've run into a problem understanding how the core schema interacts with service code (using the anchoring service as an example). I'm in the dark here, so a clarification would be helpful.

I don't quite understand why the core exposes its schema as a part of its public interface. (Which leads to some questionable choices, such as having the notion of configurations and especially configuration changes embedded into the core - whereas there is a separate service for that.) It could be more developer-friendly to have a pseudo-service interface for the core. Furthermore, this hypothetical interface is similar in its goal to the one used now for service HTTP GET requests; only the middleware could automatically decide not to provide Merkle proofs in the case of inter-service interaction within a full node. This interface could return, via dedicated methods:

  • a list of current validators/admins
  • current height
  • block at given height
  • tx with given hash
  • block with given hash

And so on. Now, the anchoring service has an optional dependency on the configuration change service (e.g., in order to change the anchoring address), and it should probably:

  • understand if the config change service is available
  • interact with that service (again, via GET-methods) if it is available, in order to get the following config

Perhaps, I'm misunderstanding something, but I would describe the current approach as hacking the core (e.g., with get_following_configuration and the like) just in case it runs with one particular service. Is this done for efficiency reasons?

Proposed solution: A good solution would require inter-process communication. A good place to start seems to be treating a View passed to the transaction's execute method as the execution context of the transaction. Then, it could be passed to other service calls (ideally implicitly - middleware should take care of that). Behind the scenes, an execution context would correspond to many things, including the DB view, but we would want to hide these details from service developers, right?

So, instead of

pub fn execute(&self, view: &View) {
    let schema = Schema::new(view);
    let actual_cfg = schema.get_actual_configuration().unwrap();
    let validators = actual_cfg.validators;
}

it would look like

pub fn execute(&self, context: &ExecutionContext) {
    // narrow() notation is taken from CORBA
    let service = context.get_service(CoreService::SERVICE_ID).narrow::<CoreService>();
    let validators = service.get_validators(context);
}

Sorry for my Rust, but you probably get the idea.

Network discovery failure via RequestPeers

This was observed to sometimes result in

  1. network discovery problems: not all validators received transaction updates, which resulted in them having empty tx pools and broadcasting empty proposals.
  2. network partitioning (stop/start all validators and watch them fail to continue making progress on blocks).

    pub fn handle_request_peers(&mut self, msg: RequestPeers) {
        let peers: Vec<Connect> = self.state.peers().iter().map(|(_, b)| b.clone()).collect();
        for peer in peers {
            self.send_to_peer(*msg.from(), peer.raw());
        }
    }
------
    pub fn send_to_peer(&mut self, public_key: PublicKey, message: &RawMessage) {
        if let Some(conn) = self.state.peers().get(&public_key) {
            trace!("Send to addr: {}", conn.addr());
            self.channel.send_to(&conn.addr(), message.clone());
        } else {
            warn!("Hasn't connection with peer {:?}", public_key);
        }
    }

If node A missed node B's Connect, node A won't send its peers to B upon being requested.

Proposed fix: add addr and time fields to RequestPeers, effectively combining Connect and RequestPeers (and combining their handling logic, too); see the handler sketch below.

        addr:           SocketAddr  [32 => 38]
        time:           SystemTime  [38 => 50]
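A sketch of the corresponding handler change (accessor names are assumptions): reply to the address carried in the request itself, so a missed Connect no longer blocks the answer:

    pub fn handle_request_peers(&mut self, msg: RequestPeers) {
        let peers: Vec<Connect> = self.state.peers().iter().map(|(_, b)| b.clone()).collect();
        for peer in peers {
            // send directly to the addr from the message (the new field proposed above)
            // instead of looking the sender up in state.peers()
            self.channel.send_to(&msg.addr(), peer.raw().clone());
        }
    }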

Get rid of unwrap in fn public_key_of(&self, id: ValidatorId)

@alekseysidorov Accidentally spotted this tiny method. https://github.com/exonum/exonum-core/blob/master/exonum/src/node/consensus.rs#L766
It's likely to cause panics when verifying incoming consensus messages from rogue nodes (a panic-free sketch follows the call-site list):

   `handle_propose`
   --> src/node/consensus.rs:106:28
    |
106 |             let key = self.public_key_of(msg.validator());
    |                            ^^^^^^^^^^^^^

   `handle_prevote`
   --> src/node/consensus.rs:246:28
    |
246 |             let key = self.public_key_of(prevote.validator());
    |                            ^^^^^^^^^^^^^

   --> src/node/consensus.rs:285:32
    |
285 |                 let key = self.public_key_of(validator);
    |                                ^^^^^^^^^^^^^

   `handle_precommit`
   --> src/node/consensus.rs:340:25
    |
340 |         let peer = self.public_key_of(msg.validator());
    |                         ^^^^^^^^^^^^^

   --> src/node/consensus.rs:676:28
    |
676 |             let key = self.public_key_of(validator);
    |                            ^^^^^^^^^^^^^
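A panic-free sketch: return an Option and let each call site bail out on out-of-range validator ids (the exact validator-list accessor is an assumption):

pub fn public_key_of(&self, id: ValidatorId) -> Option<&PublicKey> {
    self.state.validators().get(id as usize)
}

// at a call site:
let key = match self.public_key_of(msg.validator()) {
    Some(key) => key,
    None => {
        error!("Message from unknown validator id={}, msg={:?}", msg.validator(), msg);
        return;
    }
};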

Separate build instructions from readme.md and update & translate them

Currently there is a bug. Scenario:

  • Clean repository
  • Mac OS X 10.12.3
  • run cargo test --all
Compiling lazy_static v0.2.2
error: linking with `cc` failed: exit code: 1
****
  = note: ld: library not found for -lleveldb
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Add whitelist support

For now, anyone can connect to a node with a self-generated (public_key, secret_key) pair.
We should add a filter that rejects connections from nodes with an unauthorized public_key; a minimal sketch is below.
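A minimal sketch, assuming the whitelist is a set of allowed public keys kept in the node configuration (field and method names are hypothetical):

use std::collections::HashSet;

fn is_authorized(whitelist: &HashSet<PublicKey>, peer_key: &PublicKey) -> bool {
    whitelist.contains(peer_key)
}

// in handle_connect (hypothetical placement):
// if !is_authorized(&self.whitelist, message.pub_key()) {
//     warn!("Rejected connection from non-whitelisted key {:?}", message.pub_key());
//     return;
// }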

Exonum does not support big-endian architectures

At least two modules implicitly assume that the current hardware is little-endian:

  • Storage
  • Messages

This does not seem to be a critical issue because most modern hardware is little-endian; a sketch of making the byte order explicit is below.
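A sketch of making the byte order explicit with the byteorder crate instead of relying on the host being little-endian (the helper below is illustrative, not existing code):

extern crate byteorder;
use byteorder::{ByteOrder, LittleEndian};

// read a u64 field out of a message/storage buffer with a fixed byte order
fn read_u64_le(buf: &[u8]) -> u64 {
    LittleEndian::read_u64(&buf[0..8])
}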

Refactor message! and storage_value!

  1. message! and storage_value! share the same code to generate packed-like structures, so we need to
    move the shared code into a common module.
  2. message! and storage_value! have borrowed fields, so we can't derive Deserialize for them
  3. message! should semantically depend on the service
  4. there is a lot of boilerplate code the user needs to write: const MESSAGE_TYPE, const SIZE, const SERVICE_ID, and [from => to] for each field

For the first iteration we decided to:

  • Make separate traits that implement the exonum JSON Deserialize and Serialize aspects (partially implemented in #71)
  • Implement Field for arrays of Field (fixes #32)
  • The main idea is that the code should be well documented, so we should not use associated types for return values.

Use tuple struct instead of simple `type`

Currently we have "typedefs" for some things like height, round, etc.:

pub type Round = u32;
pub type Height = u64;
pub type ValidatorId = u32;

Instead, they can be made into tuple structs, as sketched below.
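A sketch of what that could look like:

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Round(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Height(pub u64);

// fn foo(round: Round, validator: ValidatorId) can no longer be called
// with the arguments swapped, at the cost of writing round.0 for the raw value.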

Advantages:

  • Prevents possible (though probably unlikely) errors, for example swapping arguments in fn foo(Round, ValidatorId).
  • Forces consistency: currently we have many places where raw types are used instead of our typedefs.
  • "Cool typesafety". 😆

Disadvantages:

  • Additional "boilerplate": round.0 instead of round if we need the underlying value.
  • ?

I can make such refactoring if we decide that we need it.

Remove profiler_service

Why?

  • This is the only service that remains in exonum-core
  • All it does is call flame_dump
  • Calling flame_dump from the execute method of a transaction is a good example of bad design :)

What to do:

  • Use conditional compilation for calling flame_dump (see the sketch below)
  • The right place for doing it is the (currently non-existent) handle_terminate method of Node (look at this for more info)
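A sketch of the conditional compilation, with a hypothetical feature name:

#[cfg(feature = "flame_profile")]
fn dump_profile() {
    flame_dump();
}

#[cfg(not(feature = "flame_profile"))]
fn dump_profile() {
    // no-op in normal builds
}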

Tracking issue for 0.1 release

Features & code changes:

@defuz:

  • Implement iterators for storage, refactoring storage key #7 #58
  • Panic handling during transaction execution #59
  • Add developer notes #8

@alekseysidorov:

  • Combine public/private API endpoints from services #66 #53
  • Dynamic IDs for services #65

@gisochre:

  • Network discovery failure via RequestPeers #73
  • Transaction location within a block #77

@DarkEld3r:

  • "Propose timeout" refactoring #49
  • Review methods and functions naming into whole project #55
  • Sending status message after every block and reset timeout #63
  • Separate keys for signing consensus messages and transactions #62
  • Documenting consensus messages #48

@vldm:

  • Ser/de, refactoring messages #17 #32
  • Whitelist support for full nodes #14
  • Generic fabric for clap configuration (assistant @alekseysidorov) #61
  • Tx generator, running benchmarks #54
  • Handling of mempool filling #64
  • Verify profiling #52
  • Managing node through managing API ("RPC") (assistant @alekseysidorov) #60

@deniskolodin:

  • Modifying block structure #138

Documentation #111:

Each responsible person provides a separate PR which adds #![deny(missing_docs)] to their modules. After that, we add #![deny(missing_docs)] for exonum as a whole.

Release process

TBD

Documenting consensus messages

It was determined in #46 that consensus messages (e.g., Propose) are not sufficiently documented at the moment. Each such message could be commented like this:

// Request connected peers from the node `to`.
//
// ### Processing
//   * The message is authenticated by the pubkey `from`. 
//     It must be in the receiver's full node list
//   * If the message is properly authorized, the node responds with...
//
// ### Generation
// A node generates `RequestPeers` under such and such conditions...
message! {
    RequestPeers {
        // ... fields ...
    }
}

Note that consensus messages are slightly different from transaction messages defined by services; neither the Processing nor the Generation section can be straightforwardly translated to transaction messages (although these messages should probably be documented too). This is because tx message processing is encapsulated in the execute() method of the transaction (i.e., it can be documented there), and there are no specific rules as to when ordinary tx messages are generated.

Proposed solution: I think some documentation for consensus messages is needed both here and in general Exonum docs. Message descriptions here could be useful in order to verify that messages are processed and generated as intended without needing to consult an external source. And they can be copy-pasted to the general docs if necessary.

Consensus on the threshold of 1/3 sleeping validators

After merging PR #6 (issue #2), we are going to have a problem with consensus at the threshold of 1/3 sleeping validators. Proposed solution: if 1/3 of the validators send messages for round R or higher, then validators in lower rounds should jump to round R (a sketch is below).
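A sketch of the proposed rule; the state accessors are hypothetical:

fn maybe_jump_round(&mut self, observed_round: Round) {
    // number of distinct validators seen sending messages for observed_round or higher
    let senders = self.state.validators_seen_at_or_above(observed_round); // hypothetical
    // if at least 1/3 of all validators are already there, catch up to that round
    if senders * 3 >= self.state.validators().len() && observed_round > self.state.round() {
        self.state.jump_to_round(observed_round); // hypothetical
    }
}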
