Comments (2)
@hannahhoward I think this is a great direction to explore! A couple of thoughts/extensions here:
One factor we might use to "estimate" how far out we can safely go is agreement among multiple remote peers about what is contained in the CID list for a selector query
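One way to picture the agreement heuristic is the sketch below. It assumes each peer's manifest is an ordered list of CID strings and that "agreement" means at least `quorum` peers return the same CID at every position so far. The function name and the quorum rule are illustrative assumptions, not part of graphsync or Bitswap.

```python
from collections import Counter
from typing import Sequence


def agreed_prefix_length(manifests: Sequence[Sequence[str]], quorum: int) -> int:
    """Return how many leading CIDs at least `quorum` manifests agree on.

    Hypothetical helper: manifests are ordered CID lists returned by
    different remote peers for the same cid+selector query.
    """
    depth = 0
    active = list(manifests)  # manifests that still match the agreed prefix
    while True:
        votes = Counter(m[depth] for m in active if len(m) > depth)
        if not votes:
            return depth  # every remaining manifest has ended
        cid, count = votes.most_common(1)[0]
        if count < quorum:
            return depth  # not enough peers agree on the next CID
        # Keep only the manifests that agree with the majority so far.
        active = [m for m in active if len(m) > depth and m[depth] == cid]
        depth += 1


manifests = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "x"],
    ["a", "b", "y"],
]
safe_depth = agreed_prefix_length(manifests, quorum=2)
```

The returned depth could then serve as the estimate of how far ahead it is reasonably safe to request unverified blocks.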
An alternative approach is to build up trust in a manifest over time
- inspired by @petar in a related context, the idea is that given a manifest where we haven't yet verified all the parent nodes, we gradually increase the amount of unverified data we're willing to request at once
- for example, we could grow the window geometrically: first allow 10 unverified blocks at once, then if those all come back good allow 20, then 40, 80, etc. This would guarantee that roughly half of the data we have received at any given time is already known to be good
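The geometric growth described above can be sketched roughly as follows. This is a minimal illustration, assuming a caller-supplied `verify_batch` callback that fetches a batch and reports whether every block in it checked out; the function names and the stop-on-first-bad-batch policy are assumptions for the example.

```python
from typing import Callable, List, Sequence, Tuple


def fetch_with_growing_window(
    cids: Sequence[str],
    verify_batch: Callable[[List[str]], bool],
    initial_window: int = 10,
    growth: int = 2,
) -> Tuple[List[str], List[int]]:
    """Request CIDs from an unverified manifest in geometrically growing batches.

    Returns the CIDs that were verified and the size of each batch requested.
    Stops trusting the manifest as soon as one batch fails verification.
    """
    verified: List[str] = []
    requested_sizes: List[int] = []
    window = initial_window
    pos = 0
    while pos < len(cids):
        batch = list(cids[pos:pos + window])
        requested_sizes.append(len(batch))
        if not verify_batch(batch):
            break  # manifest is no longer trusted; discard this batch
        verified.extend(batch)
        pos += len(batch)
        window *= growth  # 10, 20, 40, 80, ...
    return verified, requested_sizes


cids = [f"cid{i}" for i in range(100)]
bad = {"cid35"}  # pretend this block does not belong to the graph
verified, sizes = fetch_with_growing_window(
    cids, lambda batch: not any(c in bad for c in batch)
)
```

In this run the first two batches (10 and 20 blocks) verify, and the third batch of 40 is rejected, so the wasted download is bounded by the size of the last window.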
This proposal also has a bit of an issue whereby an unverified manifest can force us to download nodes from an unrelated graph in a delegated DoS attack (i.e. malicious peers convince honest nodes to download bits from a target, thereby wasting both parties' bandwidth at minimal expense to the attackers). This can be mitigated by asking that peer for a manifest before asking it for the associated unverified nodes.
- this means we won't ask a peer for a block unless we either A) want it (e.g. normal bitswap operation) or B) think we want it and they've confirmed they have it
- we can maintain compatibility with non-upgraded nodes by simply not doing the above check if those peers do not support graphsync and/or a newer version of Bitswap. It makes them somewhat attackable, but it's not too bad and any affected node could upgrade.
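The request rule above, including the fallback for non-upgraded peers, might look like this in code. It is a sketch only: `PeerState`, `supports_manifests`, and `confirmed_cids` are hypothetical names for whatever bookkeeping a real implementation would keep per peer.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PeerState:
    """Hypothetical per-peer bookkeeping for this example."""
    supports_manifests: bool  # peer speaks graphsync and/or a newer Bitswap
    confirmed_cids: Set[str] = field(default_factory=set)  # CIDs in the peer's own manifest


def may_request_block(
    cid: str,
    peer: PeerState,
    verified_wants: Set[str],
    unverified_wants: Set[str],
) -> bool:
    """Decide whether it is acceptable to ask `peer` for `cid`."""
    if cid in verified_wants:
        return True  # case A: we know we want it (normal bitswap operation)
    if cid not in unverified_wants:
        return False  # we have no reason to want this block at all
    if not peer.supports_manifests:
        return True  # legacy peer: skip the check to stay compatible
    return cid in peer.confirmed_cids  # case B: peer confirmed it has the block


upgraded = PeerState(supports_manifests=True, confirmed_cids={"cid-b"})
legacy = PeerState(supports_manifests=False)
verified_wants = {"cid-a"}
unverified_wants = {"cid-b", "cid-c"}
```

With this rule an upgraded peer is never asked for `cid-c` until it has listed that CID in its own manifest, which is what blocks the delegated DoS.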
from beyond-bitswap.
Another extension that would solve the trust issue and let us parallelize safely:
What if we could prove that the manifest returned by the cid+selector for the graphsync/do-no-send-blocks was correct? I think this is possible (but a bit tricky) with a snark. Here's how I imagine it could work:
- A client asks a provider for all the cids/links traversed for a cid+selector.
- The provider returns a list of cids along with a small proof that they contain some block with the given cid and when traversed with the selector returns the given list of cids.
- The client can then ask the network in parallel for all those cids since it knows for sure that they are related to the cid+selector. No trust required.
- These cidlist+proofs can be cached for a given cid+selector, and we can even store the caches on a separate set of machines since we know they are correct.
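The caching step in the flow above could be sketched like this. The proof check is injected as a plain callable, since no concrete snark circuit or proof format exists in this discussion; `ManifestCache`, `offer`, `lookup`, and the toy verifier are all hypothetical names used for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (root cid, selector), both as strings for simplicity
# Hypothetical verifier signature: (root, selector, cids, proof) -> bool
Verifier = Callable[[str, str, List[str], bytes], bool]


class ManifestCache:
    """Stores cidlists only after their accompanying proof has been checked."""

    def __init__(self, verify_proof: Verifier) -> None:
        self._verify = verify_proof
        self._entries: Dict[Key, List[str]] = {}

    def offer(self, root: str, selector: str, cids: List[str], proof: bytes) -> bool:
        """Cache the cidlist if the proof verifies; report whether it was accepted."""
        if not self._verify(root, selector, cids, proof):
            return False
        self._entries[(root, selector)] = list(cids)
        return True

    def lookup(self, root: str, selector: str) -> Optional[List[str]]:
        """Return a previously proven cidlist, or None if nothing is cached."""
        return self._entries.get((root, selector))


# Toy stand-in for a real proof system: accepts only the literal proof b"ok".
def toy_verifier(root: str, selector: str, cids: List[str], proof: bytes) -> bool:
    return proof == b"ok"


cache = ManifestCache(toy_verifier)
accepted = cache.offer("root-cid", "selector-all", ["c1", "c2"], b"ok")
rejected = cache.offer("other-cid", "selector-all", ["c9"], b"forged")
```

Because entries are only stored after verification, any machine holding the cache can serve them without being trusted itself, which matches the idea of hosting the caches on a separate set of machines.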
In practice right now it's a little trickier I think (we may not be able to implement ipld traversal in a snark circuit for example). But I bet something simpler along the same idea could work today.