reth's People

Contributors

abnerzheng, alessandromazza98, allnil, chirag-bgh, clabby, danipopes, dothebesttogetthebest, emhane, evalir, gakonst, github-actions[bot], i-m-aditya, int88, joshiedo, jsvisa, jtraglia, lambdaclass-user, leruaa, loocapro, mattsse, megaredhand, mempirate, onbjerg, rakita, rjected, rkrasiuk, shekhirin, supernovahs, tcoratger, techieboy

reth's Issues

Peer reputation system

#194 introduced a rudimentary reputation system.

Some parts are missing:

  • Make thresholds/weights configurable
  • Custom membership sets of peers?

Add dedicated channel between `TransactionsManager` and `NetworkManager`

Currently the NetworkManager emits a series of events via an mpsc channel, including all incoming requests for Transactions, which include a oneshot::Sender to send back the result.

However, oneshot::Sender is not Clone, so an event that carries one cannot be cloned either.

Instead of using the mpsc event channel, we should use a dedicated channel for all tx-related events.
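
A minimal sketch of what such a dedicated channel could look like, using only the standard library. `NetworkTransactionEvent`, `handle_event` and `round_trip` are hypothetical names, and `std::sync::mpsc::Sender` stands in for the async `oneshot::Sender`; the point is only that the event is moved to a single consumer, so it never needs to be `Clone`.

```rust
use std::sync::mpsc;

/// Hypothetical transaction hash type for this sketch.
type TxHash = [u8; 32];

/// Hypothetical tx-related events sent from the network to the transactions manager.
enum NetworkTransactionEvent {
    /// A peer asked for pooled transactions; the reply goes back through `response`.
    GetPooledTransactions {
        hashes: Vec<TxHash>,
        // Stand-in for `oneshot::Sender`: used once, then dropped.
        response: mpsc::Sender<Vec<Vec<u8>>>,
    },
}

/// Handles a single event the way a transactions manager might.
fn handle_event(event: NetworkTransactionEvent) {
    match event {
        NetworkTransactionEvent::GetPooledTransactions { hashes, response } => {
            // Pretend every requested hash resolves to a one-byte transaction.
            let txs: Vec<Vec<u8>> = hashes.iter().map(|h| vec![h[0]]).collect();
            // The requester may have gone away; ignore the error in that case.
            let _ = response.send(txs);
        }
    }
}

/// Sends one request over the dedicated channel and returns the reply.
fn round_trip(hashes: Vec<TxHash>) -> Vec<Vec<u8>> {
    // Dedicated channel: only tx-related events travel over it.
    let (to_tx_manager, from_network) = mpsc::channel::<NetworkTransactionEvent>();
    let (response, reply) = mpsc::channel();
    to_tx_manager
        .send(NetworkTransactionEvent::GetPooledTransactions { hashes, response })
        .unwrap();
    handle_event(from_network.recv().unwrap());
    reply.recv().unwrap()
}
```

In a real implementation both ends would be async and the handler would run in its own task; the ownership story stays the same.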

Use `UnauthedEthStream` pattern for creating an `EthStream`

A P2PStream is created by first constructing an UnauthedP2PStream (which wraps an underlying Stream + Sink) and awaiting on the UnauthedP2PStream::handshake method, returning a P2PStream:

let unauthed_stream = UnauthedP2PStream::new(stream);
let p2p_stream = unauthed_stream.handshake(client_hello).await.unwrap();

This ensures that all P2PStreams in use are with peers that have successfully completed the p2p handshake.

EthStream currently uses an authed flag to determine if a stream is authenticated, when instead we can take the above approach to construct an EthStream. Similar to #157, we should also return the incoming Status with the authenticated EthStream.
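
A rough, synchronous sketch of the pattern, assuming heavily simplified types. `MockP2PStream`, `HandshakeError` and the fields of `Status` are made up for illustration, and a real handshake would be async and exchange messages over the wire. What it shows is that an `EthStream` can only be obtained from a successful `handshake`, which also hands back the peer's `Status`.

```rust
/// Hypothetical, heavily simplified `Status` message.
#[derive(Debug, Clone, PartialEq)]
struct Status {
    version: u8,
    chain_id: u64,
}

/// Hypothetical handshake failure.
#[derive(Debug, PartialEq)]
enum HandshakeError {
    ChainMismatch { ours: u64, theirs: u64 },
}

/// Stand-in for the underlying p2p stream: it just yields the peer's `Status`.
struct MockP2PStream {
    peer_status: Status,
}

/// A stream that has not completed the eth handshake yet.
struct UnauthedEthStream {
    inner: MockP2PStream,
}

/// A stream that can only be created by a successful handshake.
struct EthStream {
    inner: MockP2PStream,
}

impl UnauthedEthStream {
    fn new(inner: MockP2PStream) -> Self {
        Self { inner }
    }

    /// Consumes the unauthenticated stream and returns the authenticated
    /// stream together with the peer's `Status`.
    fn handshake(self, ours: Status) -> Result<(EthStream, Status), HandshakeError> {
        let theirs = self.inner.peer_status.clone();
        if theirs.chain_id != ours.chain_id {
            return Err(HandshakeError::ChainMismatch {
                ours: ours.chain_id,
                theirs: theirs.chain_id,
            });
        }
        Ok((EthStream { inner: self.inner }, theirs))
    }
}
```

Because `handshake` takes `self` by value, the unauthenticated stream cannot be used after the handshake, and no `authed` flag is needed.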

Tracking: Execution/Validation of blocks

Cover all needed validation and building for Ethereum block/header inside one library so it can be reused if needed. It will contain the main execution with EVM and all checks required for validating and creating blocks.

Some checks can differ depending on the consensus in place.

Common tasks:

  • Implement rlp for all primitive types.

Validator

Have utilities for stages to check parts of header/block/transactions fields. Those functions need to be aligned with the need (performance) of every stage. Some of those checks can be placed inside reth-primitives.

  • #152
  • Transaction signature recovery: #179
  • Transaction verification

Builder

Given a list of transactions, have the ability to create the state change and all fields needed for the header.
TODO tasks

Transaction execution

  • Storage provider: #172
  • Transaction execution with storage provider. #238

Have the ability to execute:

  • Past transactions
  • A given transaction on:
    • The latest state
    • A state from the past

Tracking: Staged sync

Stage abstraction

This abstraction should be mostly done, pending changes related to how the database abstractions evolve - e.g. instead of taking a raw MDBX transaction, we will likely receive another type in the future.

Pipeline

  • Better unwind priorities (@onbjerg): The current unwind priority system is based on Akula's method, but it can and should be simplified to prevent footgunning
  • Error and skip events (@onbjerg): The pipeline emits events that are currently only used for testing, but may be useful later on for metrics or other things. In some cases Ran and Unwound events are emitted with "special" values that denote that a stage either failed or was skipped. We should just add events for these cases
  • Commit intervals (@onbjerg): Currently data is committed to the database every time a stage returns from Stage::execute, but realistically this behavior should be tuneable to only commit meaningful progress

Tooling

  • Benchmarking helpers: We want to benchmark stages, so we will probably end up needing some utilities to make that easier
  • Profiling: We want insight into what the stages are doing to find paths to optimize. Currently we use tracing to mark out spans and emit events - we might be able to leverage this info in conjunction with e.g. tracing_tracy to be able to use Tracy. However, there may be tools that are better suited for profiling in our case.
  • Metrics: While not only a thing for staged sync (we need them in general), tools to expose metrics should be provided as well.

Stages

Initially we will use the good learnings from Akula, which is based on good learnings from Silkworm and Erigon, and essentially delineate the stages around the same boundaries as they have. As we progress, we might need more stages than listed here (or fewer).

For the more complex stages I propose we create separate tracking issues that link back to this one.

  • HeaderDownload (@rkrasiuk): Downloads headers over P2P
  • TotalGasIndex [1]: Builds an index of BlockNumber -> TotalGas. Seems to mostly be used for reporting.
  • BlockHashes [1]: Builds an index of BlockHash -> BlockNumber from the BlockNumber -> BlockHash table built in the HeaderDownload stage
  • BodyDownload: Downloads block bodies and saves a minimal structure containing ommers, the first transaction ID in the block and the number of transactions. Also builds a table of TxId -> Tx.
  • TotalTxIndex [1]: Builds an index of BlockNumber -> TotalTx. Seems to only be used for reporting in the next stage.
  • SenderRecovery: Recovers sender addresses in each transaction
  • Execution: Executes blocks
  • HashState: Hashes accounts and account storage
  • Interhashes: Builds trie hashes
  • AccountHistoryIndex [1]: Builds indexes related to account histories/changesets
  • StorageHistoryIndex [1]: Builds indexes related to storage histories/changesets
  • TxLookup [1]: Builds an index of TxHash -> BlockNumber, used in the RPC to look up transactions by hash.
  • CallTraceIndex [1]: Builds indexes that specify where an account has been the origin or destination of a message
  • Finish: Sets the chain tip (used in the RPC to figure out what our latest synced block is)

Footnotes

  1. These stages are generally what I would categorize as indexes, which we may be able to generalize somewhat.

Sender Recovery Stage

Tracking issue - Staged Sync #40

Implement the stage for recovering transaction senders.

Steps:

  1. Collect transactions for the block range [previous progress, previous stage progress]
  2. Recover transaction senders
  3. Insert the data into the tx senders table

Considerations:

  • Chunk block ranges and insert in batches. Determine the batch size.
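
As a small illustration of the chunking consideration, the helper below splits an inclusive block range into batches. `batch_ranges` is a hypothetical name and the batch size is left to the caller, since choosing it is still an open question.

```rust
use std::ops::RangeInclusive;

/// Splits the inclusive block range `[start, end]` into consecutive chunks
/// of at most `batch_size` blocks each.
fn batch_ranges(start: u64, end: u64, batch_size: u64) -> Vec<RangeInclusive<u64>> {
    assert!(batch_size > 0, "batch size must be non-zero");
    let mut batches = Vec::new();
    let mut from = start;
    while from <= end {
        // `saturating_add` avoids overflow near `u64::MAX`.
        let to = from.saturating_add(batch_size - 1).min(end);
        batches.push(from..=to);
        if to == u64::MAX {
            break;
        }
        from = to + 1;
    }
    batches
}
```

Each returned range could then be processed as one unit: read its transactions, recover the senders, and write them in a single batch insert.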

Related PRs:

Fix `Ord` usage when determining shared capabilities

Currently we use a BTreeSet to sort shared capability names in P2PStream, and the sorting is used to determine message IDs. However, since the element is a String, it sorts lexicographically rather than alphabetically:

    // TODO: the Ord implementation for strings says the following:
    // > Strings are ordered lexicographically by their byte values. This orders Unicode code
    // points based on their positions in the code charts. This is not necessarily the same as
    // "alphabetical" order.
    // We need to implement a case-sensitive alphabetical sort
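
The byte-wise ordering is easy to reproduce with the standard library alone. In the sketch below `sorted_names` is a hypothetical helper and the capability names are only sample input; uppercase ASCII letters sort before lowercase ones because their byte values are smaller.

```rust
use std::collections::BTreeSet;

/// Returns the names in the order a `BTreeSet<String>` iterates them,
/// which is the byte-wise order of `String`'s `Ord` implementation.
fn sorted_names(names: &[&str]) -> Vec<String> {
    let set: BTreeSet<String> = names.iter().map(|name| name.to_string()).collect();
    set.into_iter().collect()
}
```

For the input `["eth", "Zeta", "les"]` the result starts with `"Zeta"`, because `'Z'` (0x5A) is smaller than `'e'` (0x65).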

Replace secp256k1 with k256 crate

The p2p net code currently relies on the secp256k1 crate, since some parts are adapted from other codebases.

This should be replaced with the k256 crate; see ethers-signers for an example.

DB refactor Error enum

Currently, we are wrapping eyre around enum Error items. We should probably define the list of errors inside reth-interfaces and convert between the internal errors MDBX has and the errors defined externally in reth-interfaces.
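
One possible shape for this, sketched with made-up types: `MdbxError`, `DbError`, `NOT_FOUND_CODE` and both functions are hypothetical and do not mirror the actual MDBX bindings. The conversion lives in a `From` impl so that `?` maps the internal error to the public one.

```rust
use std::fmt;

/// Made-up error code used by this sketch for "key not found".
const NOT_FOUND_CODE: i32 = -1;

/// Hypothetical stand-in for an error coming out of the MDBX bindings.
#[derive(Debug)]
struct MdbxError {
    code: i32,
}

/// Hypothetical public error type, as it might be defined in `reth-interfaces`.
#[derive(Debug, PartialEq, Eq)]
enum DbError {
    KeyNotFound,
    Internal(i32),
}

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DbError::KeyNotFound => write!(f, "key not found"),
            DbError::Internal(code) => write!(f, "internal database error (code {})", code),
        }
    }
}

impl std::error::Error for DbError {}

/// The single place where internal errors are mapped to public ones.
impl From<MdbxError> for DbError {
    fn from(err: MdbxError) -> Self {
        match err.code {
            NOT_FOUND_CODE => DbError::KeyNotFound,
            other => DbError::Internal(other),
        }
    }
}

/// Pretend low-level read: succeeds for code 0, fails otherwise.
fn raw_get(code: i32) -> Result<u64, MdbxError> {
    if code == 0 {
        Ok(42)
    } else {
        Err(MdbxError { code })
    }
}

/// Public API: `?` converts `MdbxError` into `DbError` via `From`.
fn get(code: i32) -> Result<u64, DbError> {
    Ok(raw_get(code)?)
}
```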

Tracking: P2P

P2P Networking proposal and tracking

RLPx Peer Connection

The RLPx peer connection should implement framed encryption as specified by RLPx.

Client

The ECIES transport should implement AsyncRead and AsyncWrite, so the p2p connection can use it.
Long term, it would be nice to interact with framed encryption like this (similar to TLS libraries):

let tcp_conn = TcpStream::connect("127.0.0.1:30303").await?;
let rlpx = RLPxConnector::new(secret_key, peer_id)?;
let mut client: ECIESStream = rlpx.connect(tcp_conn).await?;

client.write_all(b"hello").await?;

let mut buffer = [0; 16];
let n = client.read(&mut buffer[..]).await?;

High priority tasks:

  • Implement AsyncRead on ECIESStream<Io>
  • Implement AsyncWrite on ECIESStream<Io>

Lower priority:

  • Refactor ECIESStream<Io> to support the above UX.

Server

The RLPx Server should also work with any transport that implements AsyncRead and AsyncWrite.
Longer term, it should be possible to serve RLPx like this:

let acceptor = RLPxAcceptor::new(secret_key);
let listener = TcpListener::bind(&addr).await?;

// roughly based off of the design of tokio::net::TcpListener
loop {
    let (client, remote_addr): (ECIESStream, _) = acceptor.accept(&listener).await?;
    process_socket(client).await;
}

Low priority tasks:

  • Implement accept pattern on ECIESStream for an arbitrary Io that implements AsyncRead and AsyncWrite.

p2p Peer Connection

The RLPx peer connection will contain a server and client portion, and will take care of capability negotiation (the p2p capability), pings, and disconnects.
Both the client and server should be capable of working with a type that implements AsyncRead and AsyncWrite, meaning the transport does not need to implement framed encryption.
This makes it slightly easier to test, since it would not require creating an ECIESStream.

Client

We should be able to do something roughly similar to the above designs, but we need an extra task to handle driving the p2p state.
This is roughly inspired by the design of hyper's client::conn.

// it would be nice to have utilities to properly generate a `Hello` message
let hello = Hello { ... };

let tcp_conn = TcpStream::connect("127.0.0.1:30303").await?;
let p2p_config = P2PConfig::new(hello);
let (mut client, p2p_state): (P2PStream, _) = p2p_config.handshake(tcp_conn).await?;

tokio::spawn(async move {
    if let Err(e) = p2p_state.await {
        println!("Error! maybe ping error, etc: {}", e);
    }
});

// just an example, real messages are likely to be much more concrete
let message: Message = MessageBuilder::new()
    .with_hello(hello)
    .request_id(0xf)
    .message("hello");

// todo: may want to consider a message size limit on read
// ECIESStream can implement AsyncRead/AsyncWrite, but `p2p` changes the interface from working mainly with bufs to also including a message id
// luckily the stream after being decrypted should have a header that tells us the length
client.send(message).await?;
let peer_message = client.read().await?;

Tasks:

  • Implement RLPx connection with the above UX.

Server

This is roughly inspired by the design of hyper's server::conn.

let mut listener = TcpListener::bind(&addr).await?;
let server = P2PServer::new(hello);
loop {
    let (stream, remote_addr) = listener.accept().await?;
    let mut client: P2PStream = server.accept(stream).await;
    process_message(client).await;
}

Tasks:

  • Implement P2PServer

eth Peer Connection

This contains a client and server, just like the RLPx and p2p connection.
Instead of a p2p handshake, both a p2p and eth handshake are performed.

Client

// would also be nice to have a sensible way to generate `Status` messages
let status = Status { ... };

let tcp_conn = TcpStream::connect("127.0.0.1:30303").await?;
// this should create a `P2PConfig` under the hood
let eth_config = EthConfig::new(hello, status);
let (mut client, p2p_state): (EthStream, _) = eth_config.handshake(tcp_conn).await?;

tokio::spawn(async move {
    if let Err(e) = p2p_state.await {
        println!("Error! maybe ping error, maybe disconnect, etc: {}", e);
    }
});

// since we are in a subprotocol now we can create a more concrete message
let get_pooled_transactions: GetPooledTransactions = vec!["0x7cab7b2c72d1c6cc8539ee5e4b8af9b86a130d63b1428c2c52c4454ead266ff4"].into();

// TODO: should the client impl just take care of choosing request ids and abstract out multiplexing?
let pooled_transactions = client.send_request(get_pooled_transactions).await?;
let mut hashes = client.stream_hashes().await?;
// if there were another client its stream could be joined
while let Some(hash) = hashes.next().await {
    // ...
}

Tasks:

  • Implement EthConfig
  • Implement EthStream

Server

The eth server will wait for incoming connections and stream incoming

let mut listener = TcpListener::bind(&addr).await?;
// this call should make sure that the `Hello` message includes `eth` as a capability
let server = EthServer::new(hello, status);
loop {
    let (stream, remote_addr) = listener.accept().await?;
    // this should create a `P2PServer` under the hood
    let mut client: EthStream = server.accept(stream).await;
    process_message(client).await;
}

Tasks:

  • Implement EthServer

eth Wire Protocol

NOTE See: https://github.com/rjected/ethp2p

We need to provide RLP encoding and decoding for each of the eth protocol messages and each of the types they contain.

  • RequestPair type
  • Status (#20)
  • NewBlockHashes
  • Transactions
  • GetBlockHeaders
  • BlockHeaders
  • GetBlockBodies
  • BlockBodies
  • NewBlock
  • NewPooledTransactionHashes
  • GetPooledTransactions
  • PooledTransactions
  • GetReceipts
    • What is this request/response pair used for?
  • Receipts

Once this is done, we can also create a Message type that is the sum type of all the protocol messages.

eth Network abstraction

Finally, the Network abstraction will integrate all of the above components to provide a reliable connection to a set of peers on the network.
This will implement interfaces that other components (e.g. staged sync, txpool) will use.

The network abstraction will also integrate with discovery mechanisms and manage the discovery state.
This will act as a Server (an eth server) if the user is able to accept incoming connections and will use discovery methods to create outgoing connections, which will yield a set of EthStreams.

Current questions

How will we implement the Engine API? It prompts the execution layer to request blocks, which happens over eth p2p.
It could also just use the API provided by the Network abstraction.

Fix `Sink` API usage in `P2PStream`

Ref #114, currently the stream uses send from SinkExt in multiple places, which completes when the Sink has fully processed the item. Instead, it should be using the Sink API like this, buffering requests when the Sink is not ready:

if poll_ready().is_ready() {
   start_send(msg)
} else {
  // need to buffer until sink ready.
}
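
A synchronous sketch of the buffering idea, using a mock in place of a real `Sink`. `MockSink` and `BufferedSender` are hypothetical, and the real `Sink::poll_ready` takes a pinned `self` and a `Context`, which are left out here. Messages are always queued first so that ordering is preserved when the sink is temporarily not ready.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Hypothetical stand-in for a `Sink`: accepts items only while it has capacity.
struct MockSink {
    capacity: usize,
    sent: Vec<u32>,
}

impl MockSink {
    fn poll_ready(&self) -> Poll<()> {
        if self.sent.len() < self.capacity {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }

    fn start_send(&mut self, msg: u32) {
        self.sent.push(msg);
    }
}

/// Buffers outgoing messages until the sink reports readiness.
struct BufferedSender {
    sink: MockSink,
    queued: VecDeque<u32>,
}

impl BufferedSender {
    /// Queues the message, then forwards as much as the sink accepts.
    fn send(&mut self, msg: u32) {
        self.queued.push_back(msg);
        self.flush_queued();
    }

    /// Drains the queue in FIFO order while the sink is ready.
    fn flush_queued(&mut self) {
        while self.sink.poll_ready().is_ready() {
            match self.queued.pop_front() {
                Some(msg) => self.sink.start_send(msg),
                None => break,
            }
        }
    }
}
```

In the async version, `flush_queued` would run from the stream's poll loop, and a `Pending` result would rely on the waker to resume once the sink has capacity again.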

Discuss: How to handle block reward to miner

If there are transactions present inside the block, we could just add the block reward (2 ETH) to the last transaction's changeset.

The problem is how to handle it if there are no transactions inside the block: we still need a transaction index to index the validator's account balance change.

Do we always include +1 transaction changeset for the validator block reward? In that case the index of transactions would always be body.len()+1. This could probably work, but I want others to be aware of it if we do it this way.

Allow to configure block propagation in devp2p

With POS, block propagation via devp2p is considered invalid: https://eips.ethereum.org/EIPS/eip-3675#devp2p

   looks good, although there is still no forkid for the merge yet afaik. for example geth drops block broadcasts when its [`merger.PoSFinalized()`](https://github.com/ethereum/go-ethereum/blob/ae42148093fdfd72749ff3dda2b986cef543510f/eth/handler_eth.go#L121) returns true. this is set by the [handler for `engine_forkchoiceUpdatedV1`](https://github.com/ethereum/go-ethereum/blob/ae42148093fdfd72749ff3dda2b986cef543510f/eth/catalyst/api.go#L251)

Originally posted by @Rjected in #205 (review)

ref merger.PoSFinalized()

Tracking: Transaction Pool

Progress here: #22

Todos

  • Implement the new Erigon-style pool design, with several subpools
  • Good abstractions and interfaces for p2p and RPC
  • Implement tx pool traits for reth's concrete pool types
  • simplify types where possible
  • Good Test Coverage
  • Fuzzing?

Lower prio

  • Integrate metrics (prometheus)

Questions

There are currently two generic abstractions, over:

  • The Transaction type itself: this would allow adding arbitrary additional context on top of required info like nonce, gas price, etc. (maybe like a marker that this tx is part of a bundle).
  • Priority, which is an arbitrary value that determines the best transactions in the pending pool (which contains txs that can be executed on the current state).

Currently, the priority is expected to be determined like fn Ordering::priority(tx: &Transaction) -> Priority. I think ideally this should also allow gaining access to other transactions that are currently pending, so priority can be determined in relation to other txs.

Perhaps sorting should be changed to be an operation on the entire pending pool instead.

I think "multiple pools" would basically be this: instead of dividing the pending pool, we add pay for order flow as an Ordering function applied to the entire pool, like fn Ordering::best(&pendingpool) -> impl Iterator<Item = Transaction>

The representation of dependencies in transactions is probably a kind of graph; we need to take a closer look at how MEV bundles are handled now.
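
To make the "ordering over the entire pool" idea concrete, here is a hedged sketch. The `Transaction` and `PendingPool` types are simplified placeholders, `ByGasPrice` is a hypothetical ordering, and the trait returns a boxed iterator of references rather than the `impl Iterator` from the text so that it stays usable as a trait method. It ignores nonce dependencies between transactions entirely.

```rust
/// Simplified placeholder transaction.
#[derive(Debug, Clone, PartialEq)]
struct Transaction {
    hash: u8,
    gas_price: u64,
}

/// Simplified placeholder for the pending pool.
struct PendingPool {
    txs: Vec<Transaction>,
}

/// An ordering that sees the whole pending pool instead of a single transaction.
trait Ordering {
    fn best<'a>(&self, pool: &'a PendingPool) -> Box<dyn Iterator<Item = &'a Transaction> + 'a>;
}

/// Hypothetical ordering: highest gas price first.
struct ByGasPrice;

impl Ordering for ByGasPrice {
    fn best<'a>(&self, pool: &'a PendingPool) -> Box<dyn Iterator<Item = &'a Transaction> + 'a> {
        let mut sorted: Vec<&Transaction> = pool.txs.iter().collect();
        // Stable sort, descending by gas price; ties keep insertion order.
        sorted.sort_by(|a, b| b.gas_price.cmp(&a.gas_price));
        Box::new(sorted.into_iter())
    }
}
```

A different ordering, for example one based on pay for order flow, would be another type implementing the same trait and could inspect relationships between pending transactions.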

Add transaction type conversion for rpc

The rpc_types::Transaction type is different from primitives::Transaction.

Where the former has optional fields, the latter does not.

Conversions are needed:

get_transaction -> pool -> primitives::Transaction -> rpc_types::Transaction

For unmined transactions, fields like block will be None. To unify this, a function like https://github.com/foundry-rs/foundry/blob/master/anvil/src/eth/backend/mem/mod.rs#L1837 can be used, which accepts the optional block.

And for eth_sendTransaction -> rpc_types::Transaction -> sign -> primitives::Transaction -> pool. This conversion can fail. https://github.com/foundry-rs/reth/blob/a7cf915677fdeedb55796f39bb28cdb0435033b5/crates/net/rpc-types/src/eth/transaction/request.rs#L49

ref https://github.com/foundry-rs/foundry/blob/master/anvil/core/src/eth/transaction/mod.rs
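
A sketch of the two directions under simplified assumptions: `PrimitiveTx`, `RpcTx`, `ConversionError` and `to_rpc` are placeholder names with only two fields each, not the real `primitives::Transaction` or `rpc_types::Transaction`. Going to the RPC type is infallible and takes the optional block, while going back can fail when a field is missing. Signing is left out.

```rust
use std::convert::TryFrom;

/// Placeholder for the primitives transaction: every field is required.
#[derive(Debug, Clone, PartialEq)]
struct PrimitiveTx {
    nonce: u64,
    gas_limit: u64,
}

/// Placeholder for the RPC transaction: fields are optional.
#[derive(Debug, Clone, PartialEq)]
struct RpcTx {
    nonce: Option<u64>,
    gas_limit: Option<u64>,
    block_number: Option<u64>,
}

#[derive(Debug, PartialEq)]
enum ConversionError {
    MissingField(&'static str),
}

/// primitives -> rpc: always succeeds; `block_number` is `None` for unmined txs.
fn to_rpc(tx: &PrimitiveTx, block_number: Option<u64>) -> RpcTx {
    RpcTx {
        nonce: Some(tx.nonce),
        gas_limit: Some(tx.gas_limit),
        block_number,
    }
}

/// rpc -> primitives: fails if a required field was not provided.
impl TryFrom<RpcTx> for PrimitiveTx {
    type Error = ConversionError;

    fn try_from(tx: RpcTx) -> Result<Self, Self::Error> {
        Ok(PrimitiveTx {
            nonce: tx.nonce.ok_or(ConversionError::MissingField("nonce"))?,
            gas_limit: tx
                .gas_limit
                .ok_or(ConversionError::MissingField("gas_limit"))?,
        })
    }
}
```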

Discussion: 1 body per request vs multiple bodies per request

Creating this issue to unblock #220 and continue the discussion here.

The question is: What are the tradeoffs between requesting 1 body per request to a peer vs requesting multiple bodies per request to a peer?

Currently:

  • The body downloader requests 1 body per call to the P2P layer, but does multiple calls at the same time
  • This is done because the downloader returns bodies in order of block number, not the order in which requests are fulfilled
  • This means that the downloader assumes that the body returned is valid (but as soon as it is handed over to the stage, it is validated)

Alternatively:

  • We request multiple bodies per call to the P2P layer
  • There is no order guarantee on the bodies returned by the P2P layer, so we have to match ourselves
  • A way to match is to calculate the transactions root of the body, however, in order to match this root with a header, we need to store a mapping of TxRoot => HeaderHash, and the downloader would need a RO transaction to the database (which is not the case currently)

In other words: Is the added complexity (extra database calls, more computation) worth the upside? What is the upside?

Worth noting:

  • If we compute the transaction root as soon as the request is fulfilled, we might compute the transaction root for a large number of bodies we later discard, if they are after an invalid body. This is not currently the case, since the transaction root is calculated in the order the blocks are handed over to the stage
  • The alternative requires a new table for only that part of sync (it is not used by e.g. RPC)
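
For discussion purposes, the matching step of the alternative could look roughly like the sketch below. All types are placeholders, and `tx_root` is a toy stand-in that is not a real transactions root. The in-memory `HashMap` plays the role of the TxRoot => HeaderHash mapping mentioned above.

```rust
use std::collections::HashMap;

/// Placeholder types for this sketch.
type TxRoot = u64;
type HeaderHash = u64;

#[derive(Debug, Clone, PartialEq)]
struct Body {
    txs: Vec<u64>,
}

/// Toy stand-in for the transactions root. NOT a Merkle root.
fn tx_root(body: &Body) -> TxRoot {
    body.txs
        .iter()
        .fold(0u64, |acc, tx| acc.wrapping_mul(31).wrapping_add(*tx))
}

/// Pairs each returned body with the header it belongs to, in response order.
/// Bodies whose root is unknown are dropped.
fn match_bodies(index: &HashMap<TxRoot, HeaderHash>, bodies: Vec<Body>) -> Vec<(HeaderHash, Body)> {
    bodies
        .into_iter()
        .filter_map(|body| index.get(&tx_root(&body)).map(|hash| (*hash, body)))
        .collect()
}
```

Note that every response body has its root computed before it can be matched, which is the extra computation the first "worth noting" point refers to.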

Tracking: DB

TODO:

  • Abstraction layer
  • Fuzz test (proptest) ser/de
  • Benchmarks speed vs size - types and codecs
  • conditional dependencies for db/codecs #51

Continuous work:

  • More tables
  • Add proper codec annotations to tables

Codec Abstraction: #51
DataApi [discussion]: #29
MDBX table implementation: #15

Tracking: Eth chain tests

Run all chain tests from ethereum/tests: https://github.com/ethereum/tests/tree/develop/BlockchainTests
This is one of two ways to check that a client is consistent with Ethereum (the second is running on mainnet).

Run test

  • Parse json files: #38
  • Transfer json models to primitives and load pre-state to mocked database.
  • Run on stages (maybe modify it if needed) and prepare checks (roots,hashes) that are going to be optional/disabled at first.

Mocking

To run tests in stages we would need an in-memory database.

  • Define and clean up the DB abstraction we discussed.
  • Integrate the abstraction inside stages.
  • Mock the Database/Transaction interface with BTreeMap.
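
A minimal sketch of what a `BTreeMap`-backed mock could look like. The `DbTx` trait and its three methods are hypothetical and much smaller than a real database abstraction; `seek` imitates a cursor positioned at the first key at or after the given one, which a sorted map supports directly.

```rust
use std::collections::BTreeMap;

/// Hypothetical, minimal transaction interface that stages could be generic over.
trait DbTx {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&mut self, key: Vec<u8>, value: Vec<u8>);
    /// Returns the first entry whose key is greater than or equal to `key`.
    fn seek(&self, key: &[u8]) -> Option<(Vec<u8>, Vec<u8>)>;
}

/// In-memory mock: a sorted map keeps keys ordered like a B-tree database would.
#[derive(Debug, Default)]
struct MockTx {
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl DbTx for MockTx {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.entries.get(key).cloned()
    }

    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) {
        self.entries.insert(key, value);
    }

    fn seek(&self, key: &[u8]) -> Option<(Vec<u8>, Vec<u8>)> {
        self.entries
            .range(key.to_vec()..)
            .next()
            .map(|(k, v)| (k.clone(), v.clone()))
    }
}
```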

Tracking: RPC support

RPC interfaces and types are already added.

Todo

Server

  • Implement Server instances and all API handlers
  • #1098

Handlers

Tests

  • Payload testing via rpctestgen

Current Q

  • Database and Pool access: Database abstraction will likely be provided by #29
  • IPC support: not supported by jsonrpsee, could modify IPC support from anvil and use the Server impl as well.

Add test implementation for BlockProvider

Having a Noop impl for BlockProvider will be useful for things that depend on a type that implements that trait.

For example, inside the Network we have Arc<BlockProvider>. For tests that don't use that type, we can use an implementation that does nothing but allows us to create types with fields of type Arc<BlockProvider>.
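
A sketch of the idea, assuming a trait with two made-up methods; the real `BlockProvider` trait has its own set of methods, and `Network` here is a bare placeholder.

```rust
use std::sync::Arc;

/// Hypothetical, reduced version of the trait; the methods are made up.
trait BlockProvider: Send + Sync {
    fn block_hash(&self, number: u64) -> Option<[u8; 32]>;
    fn best_block_number(&self) -> u64;
}

/// Test implementation that knows nothing and does nothing.
#[derive(Debug, Default, Clone, Copy)]
struct NoopBlockProvider;

impl BlockProvider for NoopBlockProvider {
    fn block_hash(&self, _number: u64) -> Option<[u8; 32]> {
        None
    }

    fn best_block_number(&self) -> u64 {
        0
    }
}

/// Placeholder for a type that needs a provider even when a test never uses it.
struct Network {
    provider: Arc<dyn BlockProvider>,
}
```

A test can then build the type with `Network { provider: Arc::new(NoopBlockProvider) }` without setting up a database.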

Improve broadcasting variants

Broadcasting messages like Transactions depends on the peer (does the peer already know the transaction?).

Right now the message type is Arc<Vec<Tx>>, but it should rather be Vec<Arc<Tx>>.
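
The difference shows up when the list has to be filtered per peer. In the sketch below, `Tx`, `Peer` and `txs_for_peer` are placeholders: with `Vec<Arc<Tx>>` each peer gets its own list, and only the pointers are cloned, not the transactions.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Placeholder transaction, identified by a made-up numeric hash.
#[derive(Debug, PartialEq)]
struct Tx {
    hash: u64,
}

/// Placeholder peer that tracks which transaction hashes it already knows.
struct Peer {
    known: HashSet<u64>,
}

/// Builds the per-peer message: only transactions the peer has not seen.
/// Cloning an `Arc<Tx>` bumps a reference count; the `Tx` itself is shared.
fn txs_for_peer(peer: &Peer, txs: &[Arc<Tx>]) -> Vec<Arc<Tx>> {
    txs.iter()
        .filter(|tx| !peer.known.contains(&tx.hash))
        .cloned()
        .collect()
}
```

With `Arc<Vec<Tx>>`, the same filtering would either send the full list to every peer or require copying transactions into a new `Vec<Tx>` per peer.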

Pre-open source checklist

A few things we need to sort out before open sourcing:

  • Code of conduct
  • Contributing guidelines
  • Issue and PR templates
  • Good contributor docs
  • A manual/book
  • Responsible disclosure (cc @gakonst I think this is something you are best suited to figure out?)
  • Release flow

In terms of release workflow, we need to discuss what our direction is in three-ish areas: means of distribution, how often we release, and if we provide LTS.

On the first area, @gakonst expressed that we should have something like foundryup. I think generally this works OK for developer tooling, but I'm unsure if this will work for nodes, or if we should go a more traditional route and publish on popular package managers?

On the second area, I have no opinion, and on the third I don't have a strong opinion, but I do not have a lot of experience providing LTS and it might not make sense to do.

Discuss: where to store ommers

Akula, for example, stores them inside the Header as Vec<Header>. Alternatively, since we have the num|hash mapping for headers, we could store them inside the Header table.

More on discussion here: #190 (comment)

Return incoming `Hello` on `p2p` handshake

Currently we initialize a p2p connection by creating an UnauthedP2PStream, which returns the authenticated P2PStream<S> when UnauthedP2PStream::handshake completes successfully.

Instead, UnauthedP2PStream should return the successful Hello message in addition to the P2PStream<S>.

Add metered channel abstractions

Different parts of the codebase will exchange information via (unbounded) channels.

The benefit of an unbounded channel is that a send will always succeed. This is important if we want to ensure that a value will arrive as long as the channel is not closed.

Values will be buffered, however, which makes the system memory the implicit upper bound of channels. A stalled channel could cause issues here. For debugging/testing purposes we could write abstractions that keep track of how many values are being sent/received via a global counter or even via metrics.
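
A sketch of such an abstraction on top of `std::sync::mpsc`, which is unbounded. All type names are hypothetical, and a real version would wrap the async channel in use and probably export the counters as metrics. It counts sends and receives so the number of buffered values can be observed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{self, Receiver, RecvError, SendError, Sender};
use std::sync::Arc;

/// Counters shared by both halves of the channel.
#[derive(Default)]
struct ChannelMetrics {
    sent: AtomicUsize,
    received: AtomicUsize,
}

impl ChannelMetrics {
    /// Approximate number of values sent but not yet received.
    fn in_flight(&self) -> usize {
        let sent = self.sent.load(Ordering::Relaxed);
        let received = self.received.load(Ordering::Relaxed);
        // The counters are updated independently, so guard against underflow.
        sent.saturating_sub(received)
    }
}

struct MeteredSender<T> {
    inner: Sender<T>,
    metrics: Arc<ChannelMetrics>,
}

struct MeteredReceiver<T> {
    inner: Receiver<T>,
    metrics: Arc<ChannelMetrics>,
}

impl<T> MeteredSender<T> {
    fn send(&self, value: T) -> Result<(), SendError<T>> {
        self.inner.send(value)?;
        self.metrics.sent.fetch_add(1, Ordering::Relaxed);
        Ok(())
    }
}

impl<T> MeteredReceiver<T> {
    fn recv(&self) -> Result<T, RecvError> {
        let value = self.inner.recv()?;
        self.metrics.received.fetch_add(1, Ordering::Relaxed);
        Ok(value)
    }
}

/// Creates an unbounded channel whose traffic is counted.
fn metered_channel<T>() -> (MeteredSender<T>, MeteredReceiver<T>, Arc<ChannelMetrics>) {
    let (tx, rx) = mpsc::channel();
    let metrics = Arc::new(ChannelMetrics::default());
    (
        MeteredSender { inner: tx, metrics: Arc::clone(&metrics) },
        MeteredReceiver { inner: rx, metrics: Arc::clone(&metrics) },
        metrics,
    )
}
```

A steadily growing `in_flight` value would point at a stalled consumer before memory becomes a problem.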
