paradigmxyz / reth
Modular, contributor-friendly and blazing-fast implementation of the Ethereum protocol, in Rust
License: Apache License 2.0
The client receives blocks via engine_newPayloadV{1,2}:
https://github.com/ethereum/execution-apis/blob/main/src/engine/specification.md#engine_newpayloadv1
The network needs: the latest block number as part of the Status message, and to handle Get{Header,Blocks} requests.
Tracing support can be very useful, especially for testing and debugging.
Keep track of things that can be simplified once stabilized:
- Arc::unwrap_or_clone: rust-lang/rust#93610
- BTreeMap::pop_first: rust-lang/rust#62924
- $seek:ty is defined as part of the macro input, but is not used inside the macro. We should remove it and flatten the macro into a single pattern with $seek:ty removed.
The refactored header downloader introduced in #249 has control flows that can bubble up the DownloadError::RequestError variant. Currently, this is not handled in the headers stage. Define proper behavior for unexpected request errors:
https://github.com/foundry-rs/reth/blob/6e7928ab84eb8ca895a23c717336146761f3102b/crates/stages/src/stages/headers.rs#L88-L89
Currently only full transactions are propagated. Implement rules for propagating either full transactions or just hashes.
Appending more than 240 headers with a cursor results in MDBX_EKEYMISMATCH: The given key value is mismatched to the current cursor position.
Steps to reproduce:
- Set the end value in tests::headers_stage_prev_progress to 10241
- Run cargo test --package reth-stages --lib -- stages::headers::tests::headers_stage_prev_progress --exact --nocapture
#194 introduced a rudimentary reputation system.
Some parts are missing:
Ref #93
double check encoding/decoding
See https://github.com/ethereum/EIPs/blob/master/EIPS/eip-155.md
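As a concrete reference for the encoding/decoding double check, EIP-155 folds the chain id into the signature's v value as v = chain_id * 2 + 35 + recovery_id. A small sketch of that arithmetic (helper names here are my own, not reth's API):

```rust
/// EIP-155 replay protection: v = chain_id * 2 + 35 + recovery_id.
fn eip155_v(chain_id: u64, recovery_id: u64) -> u64 {
    chain_id * 2 + 35 + recovery_id
}

/// Recover (chain_id, recovery_id) from an EIP-155 `v` value.
/// Returns None for legacy (pre-155) values like 27/28.
fn eip155_parse_v(v: u64) -> Option<(u64, u64)> {
    if v < 35 {
        return None;
    }
    let recovery_id = (v - 35) % 2;
    let chain_id = (v - 35 - recovery_id) / 2;
    Some((chain_id, recovery_id))
}

fn main() {
    // Mainnet (chain id 1): v is 37 or 38.
    assert_eq!(eip155_v(1, 0), 37);
    assert_eq!(eip155_v(1, 1), 38);
    // Round-trips back to (chain_id, recovery_id).
    assert_eq!(eip155_parse_v(38), Some((1, 1)));
    assert_eq!(eip155_parse_v(27), None);
}
```

Decoding should reject (or treat as legacy) any v below 35, as shown by the None branch.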
Currently the NetworkManager emits a series of events via an mpsc channel, including all incoming requests for transactions, which include a oneshot::Sender to send back the result. However, oneshot::Sender is not Clone. Instead of using an mpsc event channel, we should use a dedicated channel for all tx-related events.
A P2PStream is created by first constructing an UnauthedP2PStream (which wraps an underlying Stream + Sink) and awaiting the UnauthedP2PStream::handshake method, which returns a P2PStream:
let unauthed_stream = UnauthedP2PStream::new(stream);
let p2p_stream = unauthed_stream.handshake(client_hello).await.unwrap();
This ensures that all P2PStreams in use are with peers that have successfully completed the p2p handshake.
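This construction is an instance of the typestate pattern: the authenticated type can only be obtained by consuming the unauthenticated one, so an un-handshaked stream can never be used as an authenticated one. A minimal synchronous sketch (all type names here are illustrative stand-ins, not reth's real types):

```rust
use std::io;

// Hypothetical stand-ins for the real transport and Hello types.
struct TcpLike;
struct Hello { client_version: String }

// The only way to get a P2pStream is to consume the unauthenticated
// wrapper via `handshake` (the typestate pattern).
struct UnauthedStream { #[allow(dead_code)] inner: TcpLike }
struct P2pStream { #[allow(dead_code)] inner: TcpLike, peer_version: String }

impl UnauthedStream {
    fn new(inner: TcpLike) -> Self {
        Self { inner }
    }

    // A synchronous sketch; the real method is async and exchanges
    // Hello messages over the wire.
    fn handshake(self, hello: Hello) -> io::Result<P2pStream> {
        if hello.client_version.is_empty() {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "empty version"));
        }
        Ok(P2pStream { inner: self.inner, peer_version: hello.client_version })
    }
}

fn main() {
    let unauthed = UnauthedStream::new(TcpLike);
    // `unauthed` is moved here, so it cannot be reused after the handshake.
    let p2p = unauthed
        .handshake(Hello { client_version: "reth/v0".into() })
        .unwrap();
    assert_eq!(p2p.peer_version, "reth/v0");
}
```

Because handshake takes self by value, the compiler enforces that no unauthenticated stream outlives its handshake.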
EthStream currently uses an authed flag to determine whether a stream is authenticated, when instead we can take the above approach to construct an EthStream. Similar to #157, we should also return the incoming Status with the authenticated EthStream.
Cover all needed validation and building for Ethereum block/header inside one library so it can be reused if needed. It will contain the main execution with EVM and all checks required for validating and creating blocks.
Some checks can differ depending on the consensus in place.
Common tasks:
Have utilities for stages to check parts of the header/block/transaction fields. These functions need to be aligned with the (performance) needs of every stage. Some of these checks can be placed inside reth-primitives.
Have the ability, given a list of transactions, to create the state change and all needed fields for the header.
TODO tasks
Have the ability to execute:
This abstraction should be mostly done, pending changes related to how the database abstractions evolve - e.g. instead of taking a raw MDBX transaction, we will likely receive another type in the future.
- Ran and Unwound events are emitted with "special" values that denote that a stage either failed or was skipped. We should just add events for these cases.
- We currently commit after each Stage::execute, but realistically this behavior should be tuneable to only commit meaningful progress.
- Use tracing to mark out spans and emit events - we might be able to leverage this info in conjunction with e.g. tracing_tracy to be able to use Tracy. However, there may be tools that are better suited for profiling in our case.
Initially we will use the good learnings from Akula, which is based on good learnings from Silkworm and Erigon, and essentially delineate the stages around the same boundaries as they have. As we progress, we might need more stages than listed here (or fewer).
For the more complex stages I propose we create separate tracking issues that link back to this one.
- BlockNumber -> TotalGas. Seems to mostly be used for reporting.
- BlockHash -> BlockNumber, from the BlockNumber -> BlockHash table built in the HeaderDownload stage.
- TxId -> Tx.
- BlockNumber -> TotalTx. Seems to only be used for reporting in the next stage.
- TxHash -> BlockNumber, used in the RPC to look up transactions by hash.
- Add missing trait impl for FetchClient.
As mentioned in #103, we should implement a pubkey / address recovery method and add the following test:
https://github.com/foundry-rs/foundry/blob/870da6f73ee6ede429ed5742bb91eed3121071e3/anvil/core/src/eth/transaction/mod.rs#L1347
cargo-fuzz supports coverage, which cargo-test-fuzz does not appear to. Should we move?
Tracking issue - Staged Sync #40
Implement the stage for recovering transaction senders.
Steps:
[previous progress, previous stage progress]
Considerations:
Related PRs:
https://github.com/paritytech/jsonrpsee does not have an IPC client yet, and it doesn't look like it's on the roadmap. The older jsonrpc crate has one, though: https://github.com/paritytech/jsonrpc/blob/master/ipc/src/server.rs
It should be possible to add one following the jsonrpsee ws client:
https://github.com/paritytech/jsonrpsee/blob/master/client/ws-client/src/lib.rs
Support Prometheus metrics in the reth binary:
- Install the metrics-exporter-prometheus recorder with the above configuration
- Use a PrefixLayer to group all metrics under the reth. namespace

Currently we use a BTreeSet to sort shared capability names in P2PStream, and the sorting is used to determine message IDs. However, since the element is a String, it sorts lexicographically by byte value rather than alphabetically:
// TODO: the Ord implementation for strings says the following:
// > Strings are ordered lexicographically by their byte values. This orders Unicode code
// > points based on their positions in the code charts. This is not necessarily the same as
// > "alphabetical" order.
// We need to implement a case-sensitive alphabetical sort
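A quick illustration of the byte-value ordering in question (plain Rust, no reth types): ASCII uppercase letters have lower byte values than lowercase ones, so a BTreeSet of capability names puts any capitalized name first regardless of alphabetical position.

```rust
use std::collections::BTreeSet;

fn main() {
    // Rust's Ord for strings compares byte values, so every ASCII
    // uppercase letter (0x41..) sorts before any lowercase one (0x61..).
    assert!("Z" < "a");

    let mut caps = BTreeSet::new();
    caps.insert("Snap".to_string());
    caps.insert("eth".to_string());

    // "Snap" comes first even though "e" < "s" alphabetically.
    let ordered: Vec<&str> = caps.iter().map(|s| s.as_str()).collect();
    assert_eq!(ordered, vec!["Snap", "eth"]);
}
```

This is why message ID assignment based on the raw BTreeSet order can disagree with a peer that sorts capability names case-insensitively.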
The p2p net code currently relies on the secp256k1 crate, since some parts are adapted from other codebases. This should be replaced with the k256 crate; see ethers-signers for an example.
Currently, we are wrapping eyre around enum Error items. We should probably define a list of errors inside reth-interfaces and cast between the internal errors mdbx has and the errors defined externally in reth-interfaces.
Provide some utils to spawn geth/erigon instance(s) to interact with (ref #113). Intended to check compatibility with network-related implementations.
Currently, the headers stage inserts headers from the response without any checks. Add basic validation for the correctness of the response:
https://github.com/foundry-rs/reth/blob/6e7928ab84eb8ca895a23c717336146761f3102b/crates/stages/src/stages/headers.rs#L71-L73
The RLPx peer connection should implement framed encryption as specified by RLPx. The ECIES transport should implement AsyncRead and AsyncWrite, so the p2p connection can use it.
Long term, it would be nice to interact with framed encryption like this (similar to TLS libraries):
let tcp_conn = TcpStream::connect("127.0.0.1:30303").await?;
let rlpx = RLPxConnector::new(secret_key, peer_id)?;
let mut client: ECIESStream = rlpx.connect(tcp_conn).await?;
client.write_all(b"hello").await?;
let mut buffer = [0; 16];
let n = client.read(&mut buffer[..]).await?;
High priority tasks:
- Implement AsyncRead on ECIESStream<Io>
- Implement AsyncWrite on ECIESStream<Io>
Lower priority:
- Refactor ECIESStream<Io> to support the above UX.
The RLPx server should also work with any transport that implements AsyncRead and AsyncWrite.
Longer term, it should be possible to serve RLPx like this:
let acceptor = RLPxAcceptor::new(secret_key);
let listener = TcpListener::bind(&addr).await?;
// roughly based off of the design of tokio::net::TcpListener
loop {
let (client, remote_addr) = acceptor.accept(&listener).await?; // client: ECIESStream
process_socket(client).await;
}
Low priority tasks:
- Implement the accept pattern on ECIESStream for an arbitrary Io that implements AsyncRead and AsyncWrite.

p2p Peer Connection

The RLPx peer connection will contain a server and client portion, and will take care of capability negotiation (the p2p capability), pings, and disconnects.
Both the client and server should be capable of working with a type that implements AsyncRead and AsyncWrite, meaning the transport does not need to implement framed encryption. This makes it slightly easier to test, since it would not require creating an ECIESStream.
We should be able to do something roughly similar to the above designs, but we need an extra task to handle driving the p2p state. This is roughly inspired by the design of hyper's client::conn.
// it would be nice to have utilities to properly generate a `Hello` message
let hello = Hello { ... };
let tcp_conn = TcpStream::connect("127.0.0.1:30303").await?;
let p2p_config = P2PConfig::new(hello);
let (mut client, p2p_state) = p2p_config.handshake(tcp_conn).await?; // client: P2PStream
tokio::spawn(async move {
if let Err(e) = p2p_state.await {
println!("Error! maybe ping error, etc: {}", e);
}
});
// just an example, real messages are likely to be much more concrete
let message: Message = MessageBuilder::new()
.with_hello(hello)
.request_id(0xf)
.message("hello");
// todo: may want to consider a message size limit on read
// ECIESStream can implement AsyncRead/AsyncWrite, but `p2p` changes the interface from working mainly with bufs to also including a message id
// luckily the stream after being decrypted should have a header that tells us the length
client.send(message).await?;
let peer_message = client.read().await?;
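The "extra task to drive the p2p state" can be sketched with a plain thread and channel standing in for the async driver: one side owns the connection state and reacts to protocol events (pings, disconnects), the other side just enqueues commands. Everything here is illustrative, not reth's API:

```rust
use std::sync::mpsc;
use std::thread;

// Commands the connection handle can send to the driver task.
enum Command {
    Ping,
    Shutdown,
}

fn main() {
    let (cmd_tx, cmd_rx) = mpsc::channel::<Command>();
    let (event_tx, event_rx) = mpsc::channel::<&'static str>();

    // The driver owns the protocol state and runs until shutdown,
    // loosely in the spirit of hyper's client::conn driver future.
    let driver = thread::spawn(move || {
        for cmd in cmd_rx {
            match cmd {
                Command::Ping => event_tx.send("pong").unwrap(),
                Command::Shutdown => break,
            }
        }
    });

    // The handle side only enqueues commands; it never blocks on
    // protocol bookkeeping.
    cmd_tx.send(Command::Ping).unwrap();
    assert_eq!(event_rx.recv().unwrap(), "pong");

    cmd_tx.send(Command::Shutdown).unwrap();
    driver.join().unwrap();
}
```

In the async version the driver would be a future spawned with tokio::spawn, as in the sketch above.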
Tasks:
This is roughly inspired by the design of hyper's server::conn.
let mut listener = TcpListener::bind(&addr).await?;
let server = P2PServer::new(hello);
loop {
let (stream, remote_addr) = listener.accept().await?;
let mut client = server.accept(stream).await?; // client: P2PStream
process_message(client).await;
}
Tasks:
- P2PServer

eth Peer Connection

This contains a client and server, just like the RLPx and p2p connection. Instead of a p2p handshake, both a p2p and eth handshake are performed.
// would also be nice to have a sensible way to generate `Status` messages
let status = Status { ... };
let tcp_conn = TcpStream::connect("127.0.0.1:30303").await?;
// this should create a `P2PConfig` under the hood
let eth_config = EthConfig::new(hello, status);
let (mut client, p2p_state) = eth_config.handshake(tcp_conn).await?; // client: EthStream
tokio::spawn(async move {
if let Err(e) = p2p_state.await {
println!("Error! maybe ping error, maybe disconnect, etc: {}", e);
}
});
// since we are in a subprotocol now we can create a more concrete message
let get_pooled_transactions: GetPooledTransactions = vec!["0x7cab7b2c72d1c6cc8539ee5e4b8af9b86a130d63b1428c2c52c4454ead266ff4"].into();
// TODO: should the client impl just take care of choosing request ids and abstract out multiplexing?
let pooled_transactions = client.send_request(get_pooled_transactions).await?;
let mut hashes = client.stream_hashes().await?;
// if there were another client its stream could be joined
while let Some(hash) = hashes.next().await {
// ...
}
Tasks:
- EthConfig
- EthStream

The eth server will wait for incoming connections and stream incoming messages.
let mut listener = TcpListener::bind(&addr).await?;
// this call should make sure that the `Hello` message includes `eth` as a capability
let server = EthServer::new(hello, status);
loop {
let (stream, remote_addr) = listener.accept().await?;
// this should create a `P2PServer` under the hood
let mut client = server.accept(stream).await?; // client: EthStream
process_message(client).await;
}
Tasks:
- EthServer

eth Wire Protocol

NOTE: See https://github.com/rjected/ethp2p
We need to provide RLP encoding and decoding for each of the eth protocol messages and each of the types they contain.
- RequestPair type
- Status (#20)
Once this is done, a Message type can be created that is the sum type of all the protocol messages.
eth Network Abstraction

Finally, the Network abstraction will integrate all of the above components to provide a reliable connection to a set of peers on the network.
This will implement interfaces that other components (e.g. staged sync, txpool) will use.
The network abstraction will also integrate with discovery mechanisms and manage the discovery state.
This will act as a Server (an eth server) if the user is able to accept incoming connections, and will use discovery methods to create outgoing connections, which will yield a set of EthStreams.
How will we implement the Engine API? It prompts the execution layer to request blocks, which happens over eth p2p. It could also just use the API provided by the Network abstraction.
Ref #114: currently the stream uses send from SinkExt in multiple places, which completes when the Sink has fully processed the item. Instead, it should use the Sink API like this, buffering requests when the Sink is not ready:
if poll_ready().is_ready() {
start_send(msg)
} else {
// need to buffer until sink ready.
}
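The buffering behavior described above can be sketched without the futures Sink trait: when the sink reports "not ready", items go into a queue instead of blocking, and the queue is drained once readiness returns. The ready flag stands in for poll_ready, and all names are hypothetical:

```rust
use std::collections::VecDeque;

// Illustrative sketch of the poll_ready/start_send discipline.
struct BufferedSink {
    ready: bool,
    buffer: VecDeque<String>,
    sent: Vec<String>,
}

impl BufferedSink {
    fn new() -> Self {
        Self { ready: false, buffer: VecDeque::new(), sent: Vec::new() }
    }

    fn send(&mut self, msg: String) {
        if self.ready {
            // poll_ready().is_ready(): hand the item straight to the sink.
            self.sent.push(msg);
        } else {
            // Sink not ready: buffer until it is.
            self.buffer.push_back(msg);
        }
    }

    // Called once the underlying sink reports readiness again.
    fn flush_ready(&mut self) {
        self.ready = true;
        while let Some(msg) = self.buffer.pop_front() {
            self.sent.push(msg);
        }
    }
}

fn main() {
    let mut sink = BufferedSink::new();
    sink.send("ping".into());
    assert!(sink.sent.is_empty()); // buffered, not sent

    sink.flush_ready();
    assert_eq!(sink.sent, vec!["ping".to_string()]);
}
```

The real implementation would drain the buffer inside poll_flush/poll_ready rather than a synchronous flush_ready call.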
The current trait expects a 1:1 hash:body relationship; however, the network request accepts Vec<H256> (ref #226). Having Vec<H256> in place makes it easier to go from one to many without breaking a lot of interfaces.
If there are transactions present inside the block, we could just add the block reward (2 ETH) to the last transaction's change set. The problem is how to handle blocks with no transactions: we would still need a transaction index against which to record the validator's balance change. Do we always include one extra transaction change set for the validator block reward? In that case the number of transaction change sets would always be body.len()+1. This could probably work, but I want others to be aware of it if we do it this way.
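A tiny sketch of the "always include one extra change set" option (illustrative names only, not reth's types): every block gets a reward entry appended after its transaction change sets, so even an empty block has an index for the validator balance change.

```rust
// Hypothetical per-block change-set layout for the proposal above.
#[derive(Debug, PartialEq)]
enum ChangeSet {
    Transaction(usize), // change set of the tx at this index
    BlockReward,        // validator balance change (e.g. the 2 ETH reward)
}

fn block_change_sets(tx_count: usize) -> Vec<ChangeSet> {
    let mut sets: Vec<ChangeSet> = (0..tx_count).map(ChangeSet::Transaction).collect();
    // Always append the reward entry, even for empty blocks.
    sets.push(ChangeSet::BlockReward);
    sets
}

fn main() {
    // An empty block still has an index for the validator balance change.
    assert_eq!(block_change_sets(0), vec![ChangeSet::BlockReward]);

    let sets = block_change_sets(2);
    assert_eq!(sets.len(), 3); // body.len() + 1 entries
    assert_eq!(sets[2], ChangeSet::BlockReward);
}
```

The invariant this buys is that the reward entry's position is always derivable from the body length alone.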
Currently, the senders stage looks up the unwind block hash and then unwinds to its latest tx index. To prevent undefined behavior in case of a corrupted database, look up the earliest available block:
https://github.com/foundry-rs/reth/blob/fb2861f1125e8184f0e9f213814b1941f602d799/crates/stages/src/stages/senders.rs#L120
Currently, we use node/NodeId and peer/PeerId interchangeably. We should use peer/PeerId consistently.
cc @Rjected
With POS, block propagation via devp2p is considered invalid: https://eips.ethereum.org/EIPS/eip-3675#devp2p
looks good, although there is still no forkid for the merge yet afaik. for example geth drops block broadcasts when its [`merger.PoSFinalized()`](https://github.com/ethereum/go-ethereum/blob/ae42148093fdfd72749ff3dda2b986cef543510f/eth/handler_eth.go#L121) returns true. this is set by the [handler for `engine_forkchoiceUpdatedV1`](https://github.com/ethereum/go-ethereum/blob/ae42148093fdfd72749ff3dda2b986cef543510f/eth/catalyst/api.go#L251)
Originally posted by @Rjected in #205 (review)
Progress here: #22
There are currently two generic abstractions over:
- The Transaction type itself: this would allow adding arbitrary additional context on top of required info like nonce, gas price, etc. (maybe like a marker that this tx is part of a bundle).
- Priority, which is an arbitrary value that determines the best transactions in the pending pool (which contains txs that can be executed on the current state).
Currently, the priority is expected to be determined like fn Ordering::priority(tx: &Transaction) -> Priority. I think ideally this should also allow gaining access to other transactions that are currently pending, so priority can be determined in relation to other txs. Perhaps sorting should be changed to be an operation on the entire pending pool instead.
I think "multiple pools" would basically be this: instead of dividing the pending pool, we add pay-for-order-flow as an Ordering function applied to the entire pool, like fn Ordering::best(&pending_pool) -> impl Iterator<Item = Transaction>.
The representation of dependencies between transactions is probably a kind of graph; we need to have a closer look at how mev bundles are handled now.
Also, the naming ready vs pending seems a bit confusing.
ref #59 (review)
unify redundant network crate capability handling and eth-wire types
The rpc_types::Transaction type is different from primitives::Transaction: the former has optional fields, the latter has not. Conversions are needed:
- get_transaction -> pool -> primitives::Transaction -> rpc_types::Transaction: for unmined transactions, fields like block will be None. To unify this, a function like https://github.com/foundry-rs/foundry/blob/master/anvil/src/eth/backend/mem/mod.rs#L1837 can be used, which accepts the optional block.
- eth_sendTransaction -> rpc_types::Transaction -> sign -> primitives::Transaction -> pool: this conversion can fail. https://github.com/foundry-rs/reth/blob/a7cf915677fdeedb55796f39bb28cdb0435033b5/crates/net/rpc-types/src/eth/transaction/request.rs#L49
ref https://github.com/foundry-rs/foundry/blob/master/anvil/core/src/eth/transaction/mod.rs
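The fallible direction of this conversion maps naturally onto TryFrom. A toy sketch with hypothetical mini types (not the real rpc_types/primitives definitions): the RPC shape has optional fields, and converting to the primitive shape fails when a required field is unset.

```rust
// Hypothetical mini versions of the two transaction representations.
struct RpcTransaction {
    hash: String,
    block_number: Option<u64>, // None for unmined transactions
}

struct PrimitiveTransaction {
    hash: String,
    block_number: u64,
}

// Going from RPC to primitive can fail when optional fields are unset.
impl TryFrom<RpcTransaction> for PrimitiveTransaction {
    type Error = &'static str;

    fn try_from(tx: RpcTransaction) -> Result<Self, Self::Error> {
        Ok(Self {
            hash: tx.hash,
            block_number: tx.block_number.ok_or("transaction is not mined")?,
        })
    }
}

fn main() {
    let mined = RpcTransaction { hash: "0xabc".into(), block_number: Some(7) };
    assert_eq!(PrimitiveTransaction::try_from(mined).unwrap().block_number, 7);

    let pending = RpcTransaction { hash: "0xdef".into(), block_number: None };
    assert!(PrimitiveTransaction::try_from(pending).is_err());
}
```

The infallible direction (primitive to RPC) would be a plain From impl wrapping each field in Some as needed.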
Creating this issue to unblock #220 and continue the discussion here.
The question is: What are the tradeoffs between requesting 1 body per request to a peer vs requesting multiple bodies per request to a peer?
Currently:
Alternatively:
- TxRoot => HeaderHash, and the downloader would need a RO transaction to the database (which is not the case currently).
In other words: is the added complexity (extra database calls, more computation) worth the upside? What is the upside?
Worth noting:
comment to self: should prob do the same in headers downloader
Originally posted by @rkrasiuk in #220 (comment)
Run all chain tests from ethereum/tests: https://github.com/ethereum/tests/tree/develop/BlockchainTests
This is one of two ways to check whether a client is consistent with Ethereum. (The second one is running on mainnet.)
- Parse tests into primitives and load the pre-state into a mocked database.
- For running tests in stages we would need an in-memory database (e.g. backed by a BTreeMap) that would allow us to do that.
RPC interfaces and types are already added.
- Server instances and all API handlers
- Server impl as well.

Having a Noop impl for BlockProvider will be useful for things that depend on a type that implements that trait.
For example, inside the Network we have an Arc<BlockProvider>; for tests that don't use that type, we can use an implementation that does nothing but allows us to create types with fields of type Arc<BlockProvider>.
Broadcasting messages like Transactions depends on the peer (does the peer already know the transaction?). Right now the message type is Arc<Vec<Tx>>, but it should rather be Vec<Arc<Tx>>.
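A small sketch of why Vec<Arc<Tx>> is the more flexible shape (illustrative types, not reth's): each peer can receive its own filtered subset of the broadcast without cloning transaction bodies, since only the Arc pointers are cloned.

```rust
use std::collections::HashSet;
use std::sync::Arc;

// Hypothetical minimal transaction type.
#[derive(Hash, PartialEq, Eq)]
struct Tx {
    hash: u64,
}

// Build a per-peer broadcast set, skipping transactions the peer
// already knows about. Cloning here copies Arc pointers, not Tx bodies.
fn txs_for_peer(all: &[Arc<Tx>], peer_known: &HashSet<u64>) -> Vec<Arc<Tx>> {
    all.iter()
        .filter(|tx| !peer_known.contains(&tx.hash))
        .cloned()
        .collect()
}

fn main() {
    let all: Vec<Arc<Tx>> = vec![Arc::new(Tx { hash: 1 }), Arc::new(Tx { hash: 2 })];
    // This peer has already seen tx 1.
    let known: HashSet<u64> = std::iter::once(1).collect();

    let to_send = txs_for_peer(&all, &known);
    assert_eq!(to_send.len(), 1);
    assert_eq!(to_send[0].hash, 2);
}
```

With Arc<Vec<Tx>>, by contrast, the whole vector is shared as one unit, so any per-peer filtering forces a copy of the transactions themselves.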
Currently the HeadersRequest start field is of type BlockId, whereas the eth protocol requires BlockHashOrNumber. We should consider unifying this to BlockHashOrNumber; otherwise it will require an additional conversion before the request is sent, to handle the Pending, Latest, and Safe variants.
wdyt @rkrasiuk?
A few things we need to sort out before open sourcing:
In terms of release workflow, we need to discuss what our direction is in three-ish areas: means of distribution, how often we release, and if we provide LTS.
On the first area, @gakonst expressed that we should have something like foundryup. I think generally this works OK for developer tooling, but I'm unsure whether it will work for nodes, or whether we should go a more traditional route and publish on popular package managers.
On the second area, I have no opinion. On the third, I don't have a strong opinion, but I do not have a lot of experience providing LTS and it might not make sense to do.
The response objects used by the client types should include the peer that sent this message, so it can be reported if the message was bad.
Akula, for example, stores them inside Header as Vec<Header>; alternatively, since we have the num|hash mapping for headers, we could store them inside the Header table.
More on discussion here: #190 (comment)
Add missing handler for broadcasting new transactions from the pool to the network.
Currently we initialize a p2p connection by creating an UnauthedP2PStream, which returns the authenticated P2PStream<S> when UnauthedP2PStream::handshake completes successfully. Instead, UnauthedP2PStream should return the successful Hello message in addition to the P2PStream<S>.
Different parts of the codebase will exchange information via (unbounded) channels. The benefit of an unbounded channel is that a send will always succeed; this is important if we want to ensure that a value will arrive as long as the channel is not closed. Values will be buffered, however, which makes system memory the implicit upper bound of channels. A stalled channel could cause issues here. For debugging/testing purposes we could write abstractions that keep track of how many values are being sent/received via a global counter, or even via metrics.
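The per-channel counter idea could look roughly like this: an illustrative wrapper around std's mpsc sender that bumps a shared atomic on every send, making it easy to spot a stalled or runaway channel in tests (names are hypothetical):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc;
use std::sync::Arc;

// Wraps an unbounded sender so every send bumps a shared counter.
struct CountingSender<T> {
    inner: mpsc::Sender<T>,
    sent: Arc<AtomicUsize>,
}

impl<T> CountingSender<T> {
    fn send(&self, value: T) -> Result<(), mpsc::SendError<T>> {
        // Count first; the send on an unbounded channel only fails
        // if the receiver is gone.
        self.sent.fetch_add(1, Ordering::Relaxed);
        self.inner.send(value)
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let sent = Arc::new(AtomicUsize::new(0));
    let counting = CountingSender { inner: tx, sent: sent.clone() };

    counting.send(1).unwrap();
    counting.send(2).unwrap();

    // The counter can be compared against the receive count to detect
    // a channel that is filling up faster than it drains.
    assert_eq!(sent.load(Ordering::Relaxed), 2);
    assert_eq!(rx.recv().unwrap(), 1);
}
```

The same wrapper could report to a metrics recorder instead of an atomic when running the full node.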