spacemeshos / protocol Goto Github PK

View Code? Open in Web Editor NEW

38.0 25.0 14.0 1.62 MB

This repo contains the Spacemesh protocol specifications and related documentation

Home Page: http://protocol.spacemesh.io/

License: Apache License 2.0

HTML 26.40% CSS 73.60%

blockmesh spacemesh abft post poet

protocol's People

Contributors

Stargazers

Watchers

Forkers

nunofernandes-plight cryptohuang danymill sbnair nkryuchkov forkkit chlin501 cocoonventures nenadvasic helpfulcryptominer errit kemalmehmet111 0xwhalerush elvin-du

protocol's Issues

s/miner/smesher/

Do we want to update the terminology throughout these docs? CC @avive

Need some clarification about the "faulty" aspect of hare

While reading the white paper of spacemesh, I've raised a few questions, hope here's the right place to find the answer!

The abstract of "Tortoise and Hares Consensus" thesis says:

We propose Meshcash, a new framework for cryptocurrency protocols that combines a
novel, proof-of-work based, permissionless byzantine consensus protocol (the tortoise) that
guarantees eventual consensus and irreversibility, with a possibly-faulty but quick consensus
protocol (the hare).

What exactly does the faulty mean here? Does it mean that the hare protocol can't reach consensus(liveness) or the consensus is wrong(safeness)?

In spacemesh's paper, the hare protocol is iterative, and it's not clear how the tortoise protocol could help when the hare protocol is stuck in iterations.

Add P2P/networking docs

https://docs.google.com/document/d/1fyASTE83bcSyLecyKOFS4CNeMhmjtwWxu-_ymLI2S2M/edit

Should replace https://github.com/spacemeshos/go-spacemesh/wiki/p2p2 and https://github.com/spacemeshos/go-spacemesh/wiki/P2P-Architecture

Yosher shared these newer notes, check whether they contain anything relevant for this doc:

NIPoST validity

https://github.com/spacemeshos/protocol/blob/master/mining/04-nipost.md should be updated to explain exactly how a NIPoST is validated.

need a good visual for NIPoST

Add it here: https://github.com/spacemeshos/protocol/pull/5/files/5a09f8a52a687194ce127a29c9ed635dc1945d68#diff-3ce2aee4b1e938c9c876b8e205b9f867R3

NIPoST may be thought of as the glue that chains multiple, sequential proofs together, or alternatively as a second order or meta-proof which wraps or contains, as constituent parts, a Proof of Elapsed Time and multiple Proofs of Space-time

Add flows for each protocol

Add high-level flow for each component so tests may be designed from this flow. e.g.

Transaction processing
State transition function
Mesh sync
etc...

@ilans

NIPoST animation

Add it here: https://github.com/spacemeshos/protocol/pull/5/files/5a09f8a52a687194ce127a29c9ed635dc1945d68#diff-3ce2aee4b1e938c9c876b8e205b9f867R14

CREATE ANIMATION INSTEAD OF AN IMAGE; consider improving image by numbering the arrows 1, 2, 3

Add sync docs

I can't remember where we left this off, @antonlerner is this something you still intend to work on?

Write a wiki page explaining Spacemesh consensus in context

We need to explain how the hare protocol is different than Hotstuff and why we chose it over Hotstuff. This is a common question by protocol researchers / dev looking at Spacemesh protocol for the first time.

See #28

Resources

Transaction example

Here is a detailed output of all the buffers during a transaction. This might be helpful for someone who wants to implement transactions in a wallet

Restarted application in 425ms.
flutter: >any bip39 library
flutter: >generate mnemonic
flutter: mnemonic: bus object report ask kind torch rule swamp observe crowd worry say
flutter: >get seed of mnemonic
flutter: Seed from mnemonic: [145, 185, 165, 25, 157, 154, 229, 127, 168, 252, 138, 212, 128, 138, 235, 13, 133, 145, 213, 72, 57, 162, 43, 123, 6, 32, 240, 28, 84, 138, 195, 17]
flutter: >Ed25519Lib.newDerivedKeyFromSeed(Uint8List.fromList(seed),Uint8List.fromList([0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]),Uint8List.fromList(utf8.encode('Spacemesh blockmesh')));
flutter: privateKey bytes: 2e45f3bf151774762d01947ab550bec31ac9f852b7e70eebc9aee94ef3ce9fcdbedd6066b5fa9725f53e27381fc61195318e92d223521607b3e93f4ba8588c10
flutter: >Ed25519Lib.sign2(privateKey, dummyMessage);
flutter: derive public key from dummy message and signature: [200, 242, 219, 170, 34, 249, 111, 156, 243, 22, 106, 225, 76, 144, 179, 113, 247, 58, 114, 37, 173, 226, 115, 145, 84, 154, 159, 33, 209, 5, 91, 118, 141, 97, 15, 210, 69, 125, 103, 23, 143, 60, 232, 111, 186, 173, 127, 166, 134, 161, 192, 80, 248, 133, 169, 195, 129, 1, 36, 206, 252, 93, 110, 12]
flutter: >Ed25519Lib.extractPublicKey(dummyMessage, signature!);
flutter: publicKey bytes: bedd6066b5fa9725f53e27381fc61195318e92d223521607b3e93f4ba8588c10
flutter: >hex.encode(publicKeyList).substring(24)
flutter: address derived from publickey: 1fc61195318e92d223521607b3e93f4ba8588c10
flutter: create GlobalStateServiceClient grpc client
flutter: accountQueryId = new AccountId(address: privateKey.sublist(24));
flutter: accountQueryFilter = new AccountDataFilter(accountId: accountQueryId,accountDataFlags: AccountDataFlag.ACCOUNT_DATA_FLAG_ACCOUNT.value);
flutter: accountQuery = new AccountDataQueryRequest(filter: accountQueryFilter, maxResults: 1);
flutter: AccountDataQueryResponse accountQueryResponse = await accountClient.accountDataQuery(accountQuery);
flutter: accountQueryResponse: [accountWrapper: {
  accountId: {
    address: [31, 198, 17, 149, 49, 142, 146, 210, 35, 82, 22, 7, 179, 233, 63, 75, 168, 88, 140, 16]
  }
  stateCurrent: {
    counter: 4
    balance: {
      value: 999999999956
    }
  }
  stateProjected: {
    counter: 12
    balance: {
      value: 999999999861
    }
  }
}
]
flutter: accountNonce = accountQueryResponse.accountItem.first.accountWrapper.stateProjected.counter.toInt()
flutter: create TransactionServiceClient grpc client
flutter: recipient size: 20
flutter: >InnerTx(_accountNonce, _address, _gasLimit, _fee, _amount)
flutter: create inner transaction : [0, 0, 0, 0, 0, 0, 0, 12, 198, 76, 190, 17, 82, 89, 180, 221, 149, 2, 230, 31, 2, 218, 203, 212, 253, 91, 15, 76, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 11]
flutter: inner transaction size: 52
flutter: >Ed25519Lib.sign2(privateKey, Uint8List.fromList(unsignedOutStream));
flutter: create transaction signature: [142, 87, 68, 197, 219, 109, 61, 214, 238, 46, 237, 2, 230, 52, 96, 158, 79, 139, 90, 245, 196, 84, 132, 26, 221, 4, 95, 50, 251, 167, 62, 38, 237, 195, 125, 237, 225, 75, 67, 69, 212, 77, 161, 118, 127, 175, 105, 210, 95, 218, 11, 91, 36, 46, 174, 122, 120, 93, 175, 254, 217, 185, 123, 9]
flutter: signature size: 64
flutter: origin length: 20
flutter: origin: [179, 233, 63, 75, 168, 88, 140, 16]
flutter: >_hashTxId is sha256 over _innerTransactionStream + _signature + _origin
flutter: >OuterTx(_innerTransactionStream, _signature, _origin, _hashTxId)
flutter: create outer transaction packet: [0, 0, 0, 0, 0, 0, 0, 12, 198, 76, 190, 17, 82, 89, 180, 221, 149, 2, 230, 31, 2, 218, 203, 212, 253, 91, 15, 76, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 11, 142, 87, 68, 197, 219, 109, 61, 214, 238, 46, 237, 2, 230, 52, 96, 158, 79, 139, 90, 245, 196, 84, 132, 26, 221, 4, 95, 50, 251, 167, 62, 38, 237, 195, 125, 237, 225, 75, 67, 69, 212, 77, 161, 118, 127, 175, 105, 210, 95, 218, 11, 91, 36, 46, 174, 122, 120, 93, 175, 254, 217, 185, 123, 9, 201, 174, 233, 78, 243, 206, 159, 205, 190, 221, 96, 102, 181, 250, 151, 37, 245, 62, 39, 56, 31, 198, 17, 149, 49, 142, 146, 210, 35, 82, 22, 7, 179, 233, 63, 75, 168, 88, 140, 16, 242, 92, 107, 193, 117, 116, 142, 112, 2, 44, 58, 127, 165, 226, 203, 165, 32, 93, 143, 172, 215, 222, 255, 36, 187, 80, 248, 223, 214, 70, 177, 163]
flutter: outer transaction packet size: 188
flutter: sent grpc transaction request

Improve README

It's pretty spare right now. It should probably include deep links into the various content sections.

Add transactions docs

See #24

create an animation that will explain the DAG construction

Add here:

https://github.com/spacemeshos/protocol/pull/5/files/3a1d9aa565447762aa0b2a1373730229a1a73627..0367a3101a9c925e3114dc655754d248a9b1ae80#diff-d3cd360fbadbb5acc3d7167a0709f432R47

Add consensus mechanism docs

https://docs.google.com/document/d/10Z6Ysdbre-PQ7YXo6ZOUSiscKCwMN32420SgMIExR_Q/edit

Add section on data structures

We'll want to be able to break these out in more detail than we can do inline in existing docs, see e.g. #5 (comment)

Fix relative links

Relative links, e.g., the ones in mining/01-overview.md, which point to other md files in the same directory, are broken in the http://protocol.spacemesh.io/#/mining/01-overview doc. Turn these into absolute links.

CREATE VERTICAL ANIMATION TO ILLUSTRATE THE NIPOSTS CHAINING: initialization -> poet -> execution -> poet -> execution -> poet -> ….

Document genesis flow

This is less critical because it only happens once per network but it should be documented

add a step by step description of sync protocol to wiki

describe the finalised sync protocol in the project wiki

No distributed system can assume clocks are synchronized

From the protocol paper

Our model assumes honest parties have synchronized clocks.

Clock synchronization among nodes in a distributed system is, as far as I understand, literally impossible — edit: if the system should be available and/or consistent. Am I mistaken? How can this invariant be confirmed?

Add section on identity

Nodes in Spacemesh have multiple personality disorder :) They each use multiple, distinct identities. Per @noamnelke (#22 (comment)):

We use 3 distinct keypairs:

For P2P auth - ephemeral, changes every time the node restarts, only used to encrypt communication with peers.
For mining - used for signing blocks, ATXs and hare messages, as well as generate eligibility proofs. We actually have two keypairs for mining using different signing schemes (ED vs. BLS) used for different purposes, but that's the topic for another discussion.
For wallets - used for signing transactions.
Keeping the p2p keypair separate from the others is a privacy enabling feature, since p2p IDs are considered not private - anyone on the network can tell the IP address of any p2p ID. While traffic analysis can help associate the two IDs, there are steps one can take to regain some privacy and we want to add more privacy enabling features in the future (e.g. dandelion).

As an aside, we keep the wallet and node keypairs distinct by default because the security model of these two uses is very different. Stealing one's mining private key without their PoST data can enable disqualifying them in the worst case. With the PoST data, which is much harder to steal since it's huge, it can allow one to steal future revenue, but not covertly.

Stealing one's wallet private key, as you surely understand, allows taking away all of their savings. So while a miner's private key must be kept "hot", the wallet private key can and should be kept in cold storage.

I added the relevant P2P info here to the P2P doc. Consider creating a new doc to talk more about identity and the decision to use distinct keypairs for different purposes.

Redundant transactions in blocks in the same layer

Following on our conversation in this thread:

barakshani
we get redundancy, but how much would depend on the number of transactions waiting in the mempool. Given the number of (average) blocks in a layer and the limit of transaction per block (so basically given the limit of transactions in a layer), there is some number of transactions in the mempool for which we don't expect to get duplications at all (if the limit of transactions in a layer is n, then once there are roughly n^2 transactions in the mempool, we expect no duplications)

@barakshani what happens when there is < n transactions in the mempool? Don't all of the blocks in a layer basically contain the same set of transactions then? Nothing would break, but it just seems like a lot of redundant data storage doesn't it?

Consensus

~~What would happen if we had a Hare but no Tortoise?~~
- We could still achieve consensus but it would be a different kind of consensus. Tortoise gives us 1. irreversibility (the property that once something is valid it becomes harder and harder to reverse it with more blocks in the mesh, i.e., the way Bitcoin works), 2. self-healing, and we'd lose these properties. We'd also need to store Hare vote data on the mesh somehow so that future participants could verifiably calculate the global state.
~~Without the Hare, would pbase lag further behind? I.e., does the Hare help pbase "keep up"?~~
- Yes, that's the main reason we have the Hare: it "helps convergence" since all honest parties vote the same way. All honest parties might vote the same way anyway (or very close to it), but the Hare makes a balancing attack harder to pull off.
~~How do inactive miners achieve Hare consensus with the active Hare committee participants? Do they just passively watch gossiped Hare protocol messages?~~
- There is no single "committee"; the miners active in each round change and a miner must be ready to become active at any moment.
- In theory you could run Hare entirely "off chain", without gossip, in private P2P channels, but then you'd need the participants (at least those in the final round) to sign and commit to the results and publish that (i.e., adding an extra round).
- If this final, signed statement were published, any miner could just validate its signatures to validate the result.
- As it stands, however, everyone needs to follow the messages from each round in order to validate the results.
How close do we expect pbase to stay to the current top layer? Is there some threshold that we expect it to stay within - let's say, assuming normal assumptions hold and there is no attack. Another way of phrasing the question: is there an N such that, if the difference between pbase and the current top layer exceeded N, we'd worry, or say there's an issue/some assumption had failed? Is there some formula we can use to calculate how large we expect N to be at any given time?
- Barak: I don't have an exact formula, but one can be made. There should be some layer, after which the probability that pbase does not advance is reducing exponentially.
The Hare expects all of the blocks for a given layer to propagate throughout the network before the Hare kicks off, after the hare-wakeup-delta parameter, right? Doesn't this mean that it punishes blocks that arrive just a little bit late - i.e., blocks that could've arrived later if there were no Hare and were only a Tortoise?
- yes, it does. that's the main point.
~~How, exactly, is the leader chosen in Hare round 2? Does each active participant in this round include the output of a VRF in their proposal? What does the Hare expected_leaders flag do?~~
- All participants send VRF output, and lowest one is chosen as leader. We use a probabilistic threshold to maximize the likelihood that we have at least one leader.
~~When exactly, and how, does each node know its participation status (active/passive) in each Hare round, for each Hare execution cycle, in each layer?~~
- Barak: Once a miner have the Hare beacon and the number of total eligible parties for some layer, it can calculate its role for all possible rounds in that layer. The computation is explained in the protocol paper. In practice, each miner computes its role at the beginning of each round.
~~Does Hare round two actually broadcast an "accept" message at the end of round two (as documented here)? This doesn't seem to line up with the ADDNR18 paper (and it may be my mistake).~~
- No, fixed this
~~What happens if there's a problem in the Hare pre-round, e.g., there's no (or insufficient) overlap in the pre-round views? Would Hare terminate with an empty set?~~
- Barak: yes, it should terminate with an empty set, according to the Hare validity properties.

~~from #27 (comment): Tortoise doesn't involve message-passing among nodes, does it? So what does it mean to say that Hare is more efficient in terms of communication complexity?~~
- The block votes are the messages
- Barak: I am guessing that means that overall, less messages are needed in order to get (irreversible) consensus on a specific layer.
~~from #27 (comment): Without Hare, is the network more vulnerable to attacks? If so, how? How would this play out in practice?~~
- yes, a balancing attack
~~Blocks point to previous blocks, via the view and the Hare votes. Which layer should the block point to blocks in? Is there a limit? The white paper says “recent layer”, what does that mean?~~
- Another way of asking the question: a miner would be incentivized to be “safe” by voting for older blocks, i.e., blocks before pbase - what incentivizes them to vote for more recent blocks? Don’t we want them to vote for the most recent ones they’ve seen? Due to latency won’t there be blocks that show up late and some miners consider contextually invalid? So I want to point at blocks where I’m certain there’s no chance someone would think it’s invalid
- A block's view should contain all, and only, orphan blocks. View should contain all syntactically valid blocks that a miner knows about, regardless of contextual validity. Tortoise counts votes of all syntactically valid blocks - if a block arrived late, then other miners won't know about it and won't count its votes.
- Explicit votes should be on layers above pbase.
~~Voting: Does a block have to vote for other blocks? What if it doesn’t? (And if not, why would I bother?) Why does a block need to include a view/“Visible Mesh”?~~
- Barak: No. Nothing happens. It's part of the protocol and honest miners should follow it. We want to guarantee that all honest miners count votes the same.
~~Is there a limit to how many blocks a block can vote for? Is there a minimum? IOW, what exactly does the protocol say about voting?~~
- Barak: No limit (possibly self healing needs unlimited voting distance?)
~~Is PoST used for anything other than generating an ATX/blocks? E.g., do you need an ATX (for this epoch) to be eligible for Hare?~~
- ATX gives eligibility for blocks and Hare. PoST is only used for ATX generation.

Other

~~Is the node ID keypair distinct from an ephemeral P2P identity keypair that's regenerated when a node restarts? Do nodes have no knowledge of the node ID public key of their neighbors?~~
- yes, distinct
~~Fees~~
- Are fee payments implicit (i.e., no corresponding transactions, they are just implied by the protocol) or explicit (i.e., transactions corresponding to each fee payment are inserted into a block)?
- When, exactly, are fees paid? At the end of the layer? When can they be spent by the recipient miner?
- Implicit, paid right away when a layer is finalized by the Hare
~~What happens if a miner creates and gossips an invalid block? (I.e., the miner is not eligible to produce a block in the current epoch, or layer)~~
- The block is discarded and not gossiped any further, so it would not propagate. We'll likely also want to disconnect from and blacklist a peer that sends us a syntactically invalid block.
The number of blocks per layer is an average, right? (I.e., total blocks per epoch divided by number of layers per epoch gives us average blocks per layer in that epoch.) What's the probability distribution of the number of blocks per layer? Is it thus hypothetically possible that a given layer could have very few, or even zero blocks?
- If there are more miners than blocks in an epoch, every miner is eligible to produce one block and there will be "more blocks than planned."
- For each block eligibility a miner randomly draws a layer from the epoch, with replacement, so the distribution is binomial.
- This is all in the honest/expected case. An attacker could produce more blocks than they're supposed to, in which case they'd receive a smaller reward per block. An eligible miner may also miss a block if it's e.g. offline.

What happens if a second tx is received by a miner with the same nonce as a tx that’s already in their mempool - what happens? Does it compare fees and drop the tx with the lower fee?
When does a miner drop a tx from its mempool after that tx gets mined into a block (by another miner)? Right away when it sees a block that includes that tx, or only after the tx makes it into global state? Is there only one mempool, or is there a second pool for tx that have been mined but aren't finalized yet? What happens if a tx gets mined into a block but that block ends up getting dropped later?
~~Does the miner submission/input/request to a PoET server include/is specific/tied to to a specific epoch? Or could the PoET proof be used for any epoch?~~
- Barak: PoET servers should be considered as external service - they have no notion of epochs. Yet, the miner's registration challenge contains (implicitly, and at the moment also explicitly) an epoch_id. Note, however, that there is no "correct" epoch for a PoET proof; 2 miners can register to the same PoET round for different epochs (for whatever reason), and both ATXs, that include the same PoET proof, should be valid.
Would gossip protocol gossip e.g. an ATX before it had seen the PoST proof that it references? Is it possible that a node receives an ATX without having seen the PoST proof it references? What happens then?
- Barak: If a node cannot validate the ATX, for example since it does not hold the PoET proof (so it cannot validate the NIPoST), it won't propagate it. The miner should try and fetch the missing parts, and if it fails, it considers this ATX syntactically invalid. Note that PoST is sent as part of the ATX, but the PoET proof is sent separately.
~~Does an ATX have contextual validity?~~
- Barak: Yes, for example if another ATX with the same sequence number exists (this means that the miner tries to re-use their space-time for different ATXs). The protocol paper has more details.
Blocks
- The white paper says that a block includes a “Tick t” field. What is this for? Is this actually used? I don’t see it in the code.
- ~~Why does the block contain both the minerId and a Signature? Why can’t we just extract the minerId from the Signature?~~
  - I believe that in our implementation blocks do not contain the miner's id.
- ~~And why doesn’t the whitepaper include the Signature under block?~~
  - Barak: I don't know "why", but signatures are not really part of blocks (you sign the block, once finalised), and anyway it is clear (to the cryptographers among us) that one also signs their block.
- ~~What “happens” to a contextually invalid block? Does it just disappear?~~
  - If some miner does see it, it'll be part of that miner's views and that miner will consider its votes. Other than this, nothing happens. It does not affect the state.
- ~~Do blocks include transactions or just pointers to them? Just txid, right?~~
  - Yes, just txid
- Why does a block have a timestamp? What’s it used for? Isn’t this unnecessary with the layer ID?
~~What was the exact bug that caused the testnet fork? How do we know it won’t happen again?~~
- Barak: we don't know the exact bug, but a syntactically valid block was not propagated to all miners. The subsequent events that actually caused a fork (or maybe "mesh inconsistency" is better, since there was no fork in the global state) are not super important because it was a block from Genesis (for which we have a "stupid" implementation) and there are several known open tasks, that should have prevented this fork. We don't know that.
Sync
- Is fully/weakly synced per layer, i.e., can I be fully synced for layer 100 and weakly for layer 101?
  - Barak: The only situation, in our current implementation, that being weakly but not fully synced may happen is for one layer after a node finishes syncing (e.g. a miner finished syncing in layer 100, and during layer 101 it only listens, but does not actively participate).
- What is the precise definition of weakly and strongly synced?
- ~~Do you ask one peer or multiple/all peers for a block, tx, atx, etc. as part of sync?~~
  - Barak: As far as I know you ask each item from one peer at a time, but you switch between peers.
- ~~The syncer will never getLayerFromNeighbors for an older layer, right? Only the top/current layer? And everything else happens recursively~~
  - Barak: No, you may specify which layer you want from your peers. Doing things recursively is inefficient and probably will blowup your memory. There's no reason of doing so, when you know exactly what layers you are missing.
- Why does a node need to wait until it’s fully synced to generate an ATX/begin to participate in mining? Or at least to submit to PoET?
  - Barak: The miner probably doesn't have to, but it is more hermetic. E.g. we would like blocks to contain in their view only orphan blocks and to vote correctly; ATXs need positioning ATX, so they have to be somewhat synced. Keep in mind that in practice, if it's the first time the miner syncs, it has to wait a full epoch before it can generate blocks or actively participate in the Hare protocol.
Params
- ~~Why are hare-wakeup-delta and sync-validation-delta different? Why do we set/tweak the first for the testnet but not the second?~~
  - Barak: Implementation decision, guess to give more flexibility. Don't expect consistency, these were developed - and tested - at different times. We should probably set neither.

Related: #27
Re: #29
CC @ilans

Discovering all of the nodes in the network

Continuing the conversation from #22 (comment)...

@noamnelke pointed out:

While I agree that it would be nice if it wasn't possible to discover the entire network, unfortunately, this is not the case. If one wants to, they can simply keep running the discovery protocol and eventually be aware of all the nodes. Moreover, even a benevolent node is likely to be aware of most of the network pretty fast even without making special effort.

Which reminds me a lot of this attack against MimbleWimble/Grin.

How concerned are we about this? Is there anything we can do about this?

CC @avive @y0sher @tal-m