ethereum / builder-specs
Specification for the external block builders.
Home Page: https://ethereum.github.io/builder-specs/
License: Creative Commons Zero v1.0 Universal
To compute the state root for a Capella blinded block, the full withdrawals list is required in the state transition. Currently, a blinded block contains a header with only a withdrawals root, not the list itself. That information is not sufficient to construct the state root. One solution is for the client to call get_expected_withdrawals on the parent state and pass the expected withdrawals to process_withdrawals. I think we might want to clarify this in the builder specs.
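The flow above can be sketched with a toy model. All types and the 32 ETH sweep rule here are simplified, hypothetical stand-ins for the consensus-spec get_expected_withdrawals / process_withdrawals logic, not the real state transition:

```go
package main

import "fmt"

// Withdrawal and State are illustrative; a real client operates on the
// full BeaconState defined in the consensus specs.
type Withdrawal struct {
	Index     uint64
	Validator uint64
	Amount    uint64
}

type State struct {
	NextWithdrawalIndex uint64
	Balances            []uint64 // Gwei
}

// getExpectedWithdrawals mirrors the idea of the spec's
// get_expected_withdrawals: derive the withdrawal list deterministically
// from the parent state (here: sweep any balance above 32 ETH).
func getExpectedWithdrawals(s *State) []Withdrawal {
	var ws []Withdrawal
	for v, bal := range s.Balances {
		if bal > 32_000_000_000 {
			ws = append(ws, Withdrawal{
				Index:     s.NextWithdrawalIndex + uint64(len(ws)),
				Validator: uint64(v),
				Amount:    bal - 32_000_000_000,
			})
		}
	}
	return ws
}

// processWithdrawals applies the expected list, as the spec's
// process_withdrawals would during the state transition.
func processWithdrawals(s *State, ws []Withdrawal) {
	for _, w := range ws {
		s.Balances[w.Validator] -= w.Amount
		s.NextWithdrawalIndex = w.Index + 1
	}
}

func main() {
	parent := &State{Balances: []uint64{32_000_000_000, 33_000_000_000}}
	ws := getExpectedWithdrawals(parent)
	processWithdrawals(parent, ws)
	fmt.Println(len(ws), parent.Balances[1])
}
```

The point is only that the parent state plus the deterministic expected-withdrawals rule is enough to reconstruct the list the header's withdrawals root commits to.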
I've been looking at Teku's builder endpoint recently. After sending the signed blinded block to the builder, Teku will wait 8 seconds for a payload before erroring out. Just to be clear, it will not fall back to the local payload here; it's too late for that. I'm wondering if that is an appropriate value and/or if we should define a timeout for this in the spec. I think that 8 seconds makes sense. In the spec, I only see a timeout for getting the payload header.
These are all of the builder timeouts defined in Teku:
Duration EL_BUILDER_CALL_TIMEOUT = Duration.ofSeconds(8);
Duration EL_BUILDER_STATUS_TIMEOUT = Duration.ofSeconds(1);
Duration EL_BUILDER_REGISTER_VALIDATOR_TIMEOUT = Duration.ofSeconds(8);
Duration EL_BUILDER_GET_HEADER_TIMEOUT = Duration.ofSeconds(1);
Duration EL_BUILDER_GET_PAYLOAD_TIMEOUT = Duration.ofSeconds(8);
Correct me if I'm wrong, but I believe validators only see the value of the block in the bids, not other important properties like the gas used. Thus they cannot trade off gas usage against the fee. Since larger blocks increase the orphan risk, a validator may want to specify a minimum miner tip per gas.
I propose two changes, such that the bid for which

value - minimum_tip * gas_used

is maximal wins. Otherwise, we will see more and more transactions with just a 1 wei miner tip. Block builders have no incentive not to include them (their orphan risk is negligible as they will still have the best bid in the next block). And validators don't even see the size of the block before accepting the bid.
According to the current spec, builder software can accept registrations from validators who are currently active but also slashed.
Given that slashed proposers cannot produce valid blocks, we could tighten the spec to disallow this set of possible registrations.
It is a bit of an edge case but does reduce the possible load on builder software so I wanted to open this issue to see if there is any interest to modify the spec here.
I've benchmarked encoding and decoding of signed-blinded-beacon-block payloads with SSZ vs JSON: flashbots/go-boost-utils#50
SSZ seems 40-50x faster:
BenchmarkJSONvsSSZEncoding/Encode_JSON-10 18561 62939 ns/op 50754 B/op 274 allocs/op
BenchmarkJSONvsSSZEncoding/Encode_SSZ-10 855457 1368 ns/op 9472 B/op 1 allocs/op
BenchmarkJSONvsSSZEncoding/Decode_JSON-10 7621 153292 ns/op 30433 B/op 613 allocs/op
BenchmarkJSONvsSSZEncoding/Decode_SSZ-10 276904 3968 ns/op 10116 B/op 152 allocs/op
This would be particularly relevant for getPayload calls, because of the payload size and timing constraints (but perhaps also interesting for getHeader and registerValidator).
Each getPayload call currently does 8 JSON encoding and decoding steps (if we include mev-boost in the mix):
- getPayload request body (signed blinded beacon block): encoded, decoded, re-encoded, and re-decoded along the proposer → mev-boost → relay path (4 codings)
- getPayload response body (execution payload): the same 4 codings on the way back
Considering 20-40ms per coding on average, that's up to 200-300ms of JSON latency (or more).
Sending the data SSZ encoded could reliably shave 200-250ms off each getPayload roundtrip.
I propose we add SSZ encoded payloads as first-class citizens.
Questions:
- Should mev-boost first attempt getPayload with an SSZ body, and if the relay responds with an error then try with JSON again? Or should it send an accept-encoding: ssz header to getHeader, and if it receives the header in SSZ encoding then use SSZ for getPayload too, otherwise fall back to JSON?
I think it makes sense to write the error conditions first. They are essentially the abort conditions
- if any of these, error
- otherwise, return this (with these conditions on values)
rather than
- return this!
- but don't do that if any of these error conditions
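The abort-conditions-first style argued for above could look like this; the struct fields and specific checks are purely illustrative:

```go
package main

import (
	"errors"
	"fmt"
)

// Registration is a hypothetical stand-in for a message being validated.
type Registration struct {
	Timestamp uint64
	GasLimit  uint64
	SigValid  bool
}

// validateRegistration lists the abort conditions first: if any of these,
// error; otherwise, return success.
func validateRegistration(r Registration, now uint64) error {
	if r.Timestamp > now {
		return errors.New("timestamp in the future")
	}
	if r.GasLimit == 0 {
		return errors.New("zero gas limit")
	}
	if !r.SigValid {
		return errors.New("invalid signature")
	}
	return nil
}

func main() {
	ok := Registration{Timestamp: 5, GasLimit: 30_000_000, SigValid: true}
	fmt.Println(validateRegistration(ok, 10))
}
```

Each failure path is exhausted up front, so the success path at the bottom carries no residual caveats.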
The validator registration endpoint currently takes a single registration with the understanding that relay muxers will send the registration to each relay they are aware of.
The spec for this endpoint returns a binary success/error response, which precludes the case where some relays succeed while others fail when processing a registration.
I think we want to erase this level of detail from the API, in which case we don't need to change anything. For example, a caller of this endpoint who receives an error should assume the registration has completely failed and they need to try again (e.g. the beacon node queues a retry for later). If they get a success, they should assume the responsibility for promulgating their registration now lies with the muxer software (and the implication is that internally, the muxer may retry against transient relay failures, etc.)
If this is not the intended semantics, then we should extend the response so many errors can be returned.
This proposal removes immediate signature verification of new validator registrations, making the verification asynchronous. The information about verification status would no longer be returned from the registration endpoint and would instead be queried from the data API.
This change removes a CPU bottleneck in the relay, along with possible DoS attack vectors, and allows the registration process to be resilient to high loads.
With the current process, relays don't have an even load, and relay operators need to use expensive infrastructure to cover the load spikes. As the signature verification failure status is not used in the flow, smoothing out the spikes should greatly reduce a relay's maintenance costs, increasing the number of people who can afford to run relays.
This change addresses a number of problems and threats to the relay ecosystem that are the effect of verifying registration signatures on register validator submission.
The mainnet right now has more than 400k validators and we expect this number to grow with Ethereum adoption. There are existing mechanisms that re-register existing validators every epoch, resending hundreds of thousands of validator registrations in the first few seconds of an epoch.
Verifying signatures is not a computationally trivial operation - to carry the CPU load, a single relay instance needs to run on an oversized server whose computing power sits idle for the rest of the epoch.
The current implementation of [mev-boost](https://github.com/flashbots/mev-boost/blob/main/server/service.go#L276-L284) does not return error contents back to the validator on failed registration.
The verification of registration signatures is also not immediately used as the registrations are only used to be returned later by the api endpoint ([link](https://github.com/flashbots/mev-boost-relay/blob/174a4a66280aa0289551f61dbabbb17ec202c18d/services/api/service.go#L1420)).
Current network traffic characteristics are similar to a DDoS attack, and the current mechanism creates an attack vector where a bad actor sends its own registration slightly ahead of time and then floods the server with incorrect registrations. It's also possible to completely clog the relay with just the volume of new registrations.
The recurrent registration problem was reported a few times before, but the core of the problem was never satisfactorily resolved.
Relays need to be designed in a way that can handle the load. Some variance might be nice but couldn't be relied upon anyway.
This change is meant to eliminate the CPU-bound performance problems, without changing the entire network’s behavior.
Current Validator registration process:
This proposal aims to make the signature verification (step 4) asynchronous.
The only time a good, lawful, and honest validator may be concerned about its signature state is at its initial or subsequent deployments, i.e. when configuration can change. There is no benefit to knowing it is still correct on every deployment.
Therefore, the information about the state of verification may be offloaded into a separate endpoint and removed from (POST) /relay/v1/builder/validators. It can be achieved by extending /relay/v1/data/validator_registration with an additional enum field - status.
Go (possible implementation):

type Status int64

const (
	Unverified Status = iota
	Verified
	Invalid
)

// SignedValidatorRegistration https://github.com/ethereum/beacon-APIs/blob/master/types/registration.yaml#L18
type SignedValidatorRegistration struct {
	Message   *RegisterValidatorRequestMessage `json:"message"`
	Signature Signature                        `json:"signature" ssz-size:"96"`
	Status    Status                           `json:"status"`
}
The endpoint would then return:
{
  "message": {
    "fee_recipient": "0xabcf8e0d4e9587369b2301d0790347320302cc09",
    "gas_limit": "1",
    "timestamp": "1",
    "pubkey": "0x93247f2209abcacf57b75a51dafae777f9dd38bc7053d1af526f220a7489a6d3a2753e5f3e8b1cfe39b56f43611df74a"
  },
  "signature": "0x1b66ac1fb663c9bc59509846d6ec05345bd908eda73e670af888da41af171505cc411d61252fb6cb3fa0017b679f8bb2305b26a285fa2737f175668d0dff91cc1b66ac1fb663c9bc59509846d6ec05345bd908eda73e670af888da41af171505",
  "status": 1 // or possibly in string form, "verified"
}
The new process would assume that after a successful initial pre-check (e.g. the submission time is plausible and the validator is known), every registration would be persisted with the default unverified state.
It should be left for the relay development team to decide on implementation details, however, the goal for the verification itself is to become “eventually verified”. This new flow would allow various improvements not limited to verifying signatures in the background process, throttling the number of parallel calculations, or calculating signatures only upon request.
For existing relay implementations it would still be possible to preserve verification calculations on submission and save as already verified.
This change introduces a weak inconsistency for people who were expecting to find an error on incorrect signature - that should no longer be returned.
The change to the /relay/v1/data/validator_registration endpoint is additive - meaning there are no protocol-breaking changes.
In existing codebases, the value can still be calculated in the same place as it was before and saved as already verified. So no big immediate changes should be needed.
This proposal doesn’t depend on any other work.
There is no standard for the multi-validator registration process - it is unclear how relays should behave upon the failure of one validator. From the user perspective, it's undesirable to fail all validators in the payload if one has a broken signature. This change allows all validators that passed the pre-check to be registered and discarded only when needed.
As described above this change targets performance improvements for good, lawful and honest validators, as signatures may only change during the deployment of a new configuration, followed by a process restart. There is no benefit to verifying that one’s signature is still correct for every request. Furthermore, current implementations of processes like mev-boost would not return this information back to the validator either.
This simple change can allow relays to use smaller servers, utilizing more of the CPU idle time - as we no longer have a 3s window for verifying >400k signatures - allowing more people to afford running relays.
Issue for discussing various parts of the error responses.
A proposer may submit the SignedBlindedBeaconBlock in SSZ format, using a Content-Type: application/octet-stream request header. If mev-boost / the relay does not accept that format, should it return status code 415? https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/415
This is a bit of an edge case but I think it is worth specifying that if a proposer (with the boost software acting as a proxy) receives multiple signed builder bids with the same value but differing payload headers, the proposer should break the tie with the hash tree root of the signed bid (e.g. choose the lesser root, interpreting the root as a 256-bit number).
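The tie-break described can be as simple as an unsigned big-endian comparison of the two roots, which bytes.Compare performs directly:

```go
package main

import (
	"bytes"
	"fmt"
)

// pickBid implements the proposed tie-break for equal-value bids: keep
// the bid whose hash tree root is the lesser 256-bit number. Roots are
// big-endian, so a lexicographic byte comparison gives the same order.
func pickBid(rootA, rootB [32]byte) [32]byte {
	if bytes.Compare(rootA[:], rootB[:]) <= 0 {
		return rootA
	}
	return rootB
}

func main() {
	a := [32]byte{0x01}
	b := [32]byte{0x02}
	fmt.Printf("%x\n", pickBid(a, b)[0])
}
```

Being deterministic, both the proxy and the proposer converge on the same bid without any coordination.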
Here in builder-specs/builder-oapi.yaml (line 22 in b66471c), either at release time or in master, the spec should indicate the release version.
follow up from the work in #76 to ensure the capella and deneb specs make sense
There is an ambiguity right now with the pubkey contained in the SignedBuilderBid and the signature included in that message.
We want to know which builder produced a given bid/block for accountability reasons and so the signed message should include an identifier (and ideally a signature) for the builder.
Proposers also want to be able to know (and importantly, prove to others) which relays they use when obtaining blocks from the external builder network. We ideally have a separate identifier and signature to bind a particular relay to a bid they forward to validators.
Right now, there is only one set of (pubkey, signature) in the SignedBuilderBid message and it is not clear in the specs which set of actors they refer to.
To remedy, one solution is:
- the builder signs a BuilderBid message that signs over the header and value data
- the relay wraps this into a SignedBuilderBid message
And ideally the relay validates the signature from the builder and the proposer validates the signature from the relay and builder. Proposers could skip builder verification if they want, or perhaps do it offline in situations where it becomes relevant.
Another solution that reduces the overhead here compared to the status quo is to make the builder opaque to the proposer (so that the SignedBuilderBid only has the pubkey and signature for the relay), with an option to also have the relay sign over the builder identity (so include a second pubkey for the builder in the BuilderBid message). If it is not clear from the messages exchanged in the builder APIs which builder was responsible for a block, then I think there should be a separate protocol put in place to increase accountability between builders and relays, but if we go this route then this concern can be handled elsewhere.
If we refer to the current execution Engine API spec for the build inputs: https://github.com/ethereum/execution-apis/blob/main/src/engine/specification.md#payloadattributesv1 we see that there is no way to specify a gas limit.
It is critical for the security of the network that this parameter remains under the control of the validator set and not the builder set (which should be much smaller and possibly not as aligned with ethereum overall).
I think an easy fix is to specify a PayloadAttributesV2 message (h/t @mkalinin) that simply appends a gas_limit field:
* gasLimit: QUANTITY, 64 bits - value for the block gas limit of the new payload
The current design of the API is focused on simplicity, with the goal of providing an interface for receiving blocks from a single, trusted builder. This scenario is contrived though, because in reality there will be additional software involved, such as a multiplexer like mev-boost and relays which create a buffer between builders and proposers.
Expanding the scope of the API beyond the contrived architectural view may make sense to better accommodate these designs, so I would like others to weigh in.
status Endpoint
The status endpoint is defined to return whether the builder is operating normally and able to produce blocks. It's undefined what the response for this endpoint should be when called against a multiplexer. It could return "OK" as long as one connected relay is up, or maybe only return "OK" if all configured relays are up.
I am okay with this being an implementation detail of the multiplexer. As long as that software believes it can retrieve blocks, it should return "OK".
Bid
Expanding the definition of the Bid object to include signatures from both the relay and the builder would give the proposer additional insight into the entity that actually built the block.
I think an issue with this is that the relay can simply fill out the builder signature with another key pair they have on hand and continue masking the builder's real identity.
--
My current opinion is to not expand the scope, but rather focus on the functionality provided by the API to make sure it is adequate to build more complicated systems on top of. Right now, only the most basic protocol for retrieving external blocks is defined. This allows a great deal of flexibility. For example, CLs who want to implement mev-boost natively do not have specialized fields for mev-boost that they need to worry about.
If there are other compelling examples / reasons for expanding the scope, please share them here.
The above validation results in duplicates of valid messages returning: {"code":400,"message":"invalid timestamp"}
I was under the impression registrations needed to be re-signed infrequently, and the once-per-epoch registration we are sending would be a duplicate.
This timestamp requirement also creates a once-per-epoch re-signing requirement (or however frequently we re-send registrations), which could add a lot of load for a client running a lot of validators.
The spec currently says validators should submit their registrations once per epoch.
However, it seems like the mev-boost component has the responsibility to keep registrations it learns about and upstream them to connected relays, in which case the validator client may be posting data every ~6 mins for no additional gain.
I'm opening this issue to seek feedback on expectations around this process and if we can reduce the frequency I'm happy to change the spec to reflect that.
Based on my current understanding, the validator client only needs to call the registration API when they boot or if any of their registration data changes (which practically may be synonymous with boot if clients only allow changes to config during validator client process boot).
The signing routines for the Builder API messages are under-specified.
The spec currently says to compute a message to sign over using compute_signing_root, which takes the SSZ data and a Domain. Rather than specify a Domain, the spec says to simply use the DomainType 0x00000001.
There are two degrees of freedom to go from a DomainType to a Domain which we should decide how to handle:
The basic question to answer is if we want to version messages across instances of the protocol or not. The degrees of freedom here are to allow for messages (like deposits) where the relevant information either is not known or we don't want to restrict a message to a given fork.
If we follow the current layout of the builder spec here, we should version (so use state.genesis_validators_root and the appropriate fork_version) the builder bid types but not version (so use "default" values as defined in the spec) the validator registration types. Moreover, this maps to the "deposits vs. other protocol messages" usage in the consensus specs.
the ethereum consensus specs are executable, such that the specs are testable, and test vectors can be generated for consensus clients to consume. the builder specs should be executable in a similar way.
the scope of this issue is to set up the basic scaffolding for executable builder specs.
for example:
- Makefile
Right now we're trusting that the bid from the builder will pay the value it claims. This can and should be provable with a state proof.
Suppose we're building block N+1 -- the builder bid object can be extended with proofs of the balance of the fee_recipient at blocks N and N+1. The proofs would be rooted against the headers for N and N+1. This would allow the validator to verify that their account does receive the value as expected.
Currently block proposers outsource block building to remote builders via relays to maximize block proposal income. While this makes sense for both solo stakers and pooled stakers as they want to have access to MEV income, it comes at a great cost as it provides relays/builders full control over what transactions can be included in a payload. Block builders might choose to censor certain types of transactions even if they are perfectly valid and economically attractive. The end result would be that block proposers would inadvertently also censor these transactions as full control over transaction inclusion has been handed over to builders as the current APIs currently stand.
The current state of the builder APIs can be improved to allow for stronger censorship resistance guarantees while still allowing validators to access MEV income. Instead of the block proposer allowing the builder to build the whole payload, the block proposer now submits a list of full transactions that it wants added to the payload by the builder when requesting the execution payload header in GET /eth/v2/builder/header/{slot}/{parent_hash}/{pubkey}. This would be a new endpoint, as the expected request/response is different (even if it is easily extendable from the current data structures).
The builder now has to include these transactions in the payload; they act as a constraint on the block building process. Once the builder has built the execution payload, it returns the execution payload header along with a multiproof that verifies the existence of the desired transactions under the payload's transactions root.
Once the block proposer receives the response back it verifies that the transactions were all included by the builder:
Once this is verified, the block proposer can then sign the blinded block and request the full block from the builder and then broadcast it. This proposal accomplishes a few things:
The transactions to be force-included can simply be retrieved from the local node's mempool, e.g. the top N most valuable transactions seen. In the event that the relay/builder attempts to censor, the proposer simply falls back to local block building. This allows honest validators to detect when relays attempt to censor and fall back to building locally.
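Conceptually, the proposer-side check is a subset test of the forced transactions against the payload's transaction list. In practice the proposer only sees the header, so this test is carried out via the multiproof against the transactions root; the direct form below is an illustrative simplification:

```go
package main

import "fmt"

// allIncluded reports whether every forced transaction appears in the
// payload's transaction list. Transactions are compared by raw bytes
// here; a production check compares the leaves proven by the multiproof
// against the payload header's transactions root instead.
func allIncluded(forced, payloadTxs [][]byte) bool {
	seen := make(map[string]bool, len(payloadTxs))
	for _, tx := range payloadTxs {
		seen[string(tx)] = true
	}
	for _, tx := range forced {
		if !seen[string(tx)] {
			return false
		}
	}
	return true
}

func main() {
	payload := [][]byte{[]byte("tx1"), []byte("tx2"), []byte("tx3")}
	fmt.Println(allIncluded([][]byte{[]byte("tx2")}, payload))
	fmt.Println(allIncluded([][]byte{[]byte("tx9")}, payload))
}
```

If the check fails, the proposer discards the bid and builds locally, which is exactly the censorship-detection fallback described above.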
An advantage of this proposal is that it is pretty straightforward for consensus clients to implement on top of the existing builder api.
Remaining testnets before the Merge:
@metachris and I would like to consider CL clients running version v0.1.0 of these specs during the upcoming Sepolia merge as a dress rehearsal for later testnet merges and ultimately mainnet.
There is a trade-off here between shipping a safe merge with reduced code surface and deploying an untested software stack a few epochs into the finalized mainnet merge.
I'm raising this issue as a place for discussion of these various trade offs.
I'll open with the following extension to the builder-specs:
`MERGE_DELAY_EPOCHS` | `16` | `epoch`
Proposers **MUST** wait until `MERGE_DELAY_EPOCHS` epochs after the merge transition has been finalized before they can start querying the external builder network.
* NOTE: if the merge transition happens in epoch N and is finalized in epoch N+2, then proposers **MUST NOT** use the external builder network until epoch N + 2 + `MERGE_DELAY_EPOCHS`.
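The rule above can be implemented as a simple epoch check, e.g.:

```go
package main

import "fmt"

// MergeDelayEpochs mirrors the proposed MERGE_DELAY_EPOCHS constant.
const MergeDelayEpochs = 16

// canQueryBuilders returns true once MERGE_DELAY_EPOCHS epochs have
// passed since the epoch in which the merge transition was finalized.
func canQueryBuilders(currentEpoch, mergeFinalizedEpoch uint64) bool {
	return currentEpoch >= mergeFinalizedEpoch+MergeDelayEpochs
}

func main() {
	// Merge in epoch 100, finalized in epoch 102: builders usable from 118.
	fmt.Println(canQueryBuilders(117, 102), canQueryBuilders(118, 102))
}
```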
Repo seems to be missing a /dist dir with CSS, image, etc. assets.
If you look at the beacon-APIs repo, this folder is just checked in (so I don't think it can be generated locally by some tooling).
I fixed the website render by cloning this repo along w/ the beacon APIs submodule and just copying /dist next to index.html.
We should probably do the same here.
A builder provides an ExecutionPayloadHeader when offering bids to a given proposer.
The proposer (and specifically the local beacon node the proposer is running) should check the header against their local execution state to ensure the header is not inaccurate or even malicious.
We should specify the set of validations that should be performed in the validator guide.
There are some validations relevant for how implementers of the spec use the builder network that are "hidden" in the API documentation.
This organization is a bit different than how similar spec repos (like the consensus-specs) are styled, so it may lead to some confusion.
e.g. I suspect the root cause for this: flashbots/mev-boost#245
Let's pull all validations out of the API docs (or at least put them into the "literate specs")
I wanted to ask to what extent (or whether it is up to the discretion of validator clients) we should warn users about invalid gas limits, or whether the builder will substitute something more sensible and the responsibility is on the builder.
As discussed previously, I believe client teams are using 30M as the default, but perhaps that's simply an implementation detail (instead of the previous block's limit? or whichever is higher?).
note: we would need a release to avoid lowering the limit by default if the network settles on something different.
For user overrides, what were we thinking (I just hope an error isn't thrown back to us) for the intended flow on a bad range for the gas limit? Sending something very large will result in not being able to include that many transactions in a block, and sending too little makes the block unprofitable. Will those be adjusted somehow, or is it totally up to the validator client?
there is a well-known attack on the BLS signature scheme called a "rogue public key" attack
you can read more about it here: https://hackmd.io/@benjaminion/bls12-381#Rogue-key-attacks
the mitigation is straightforward: publish a "proof of possession" along w/ the public key.
given that this spec currently requires builders to sign over their messages, we should also specify that builders publish a "proof-of-possession" alongside their public key and any other configuration info required to connect.
concretely, the "proof-of-possession" can just sign over the message that is the encoding of the builder's BLS public key according to the SSZ spec defined in this repo: https://github.com/ethereum/consensus-specs
Validators are expected to periodically (once per epoch? once per N epochs?) submit a SignedValidatorRegistration to builder software, but I do not think this is documented anywhere outside of a comment in https://github.com/ethereum/beacon-APIs
We should craft a spec/validator.md that contains some light prose on the expected validator flow to be able to participate in the external builder network.
not sure what is going on but I think something broke the rendering of https://ethereum.github.io/builder-specs/ in this commit: 58e2c66