dalek-cryptography / bulletproofs
A pure-Rust implementation of Bulletproofs using Ristretto.
License: MIT License
The paper on Duplex Keccak defines the duplex mode as absorbing and squeezing <= r (rate) bits (including padding) at each Keccak invocation. Our current implementation allows writing an arbitrary string and then reading an arbitrary string without absorbing any input, with padding applied only to the last block, and only in the absorbing phase.
Fixing this would require a new Keccak API designed specifically to support duplex mode. The tiny-keccak crate does not seem to support this at the moment.
I am a Bitcoin dev who is interested in creating a branch that integrates Bulletproofs. We could use Rust wrappers to integrate and use it directly from C. For me, Bulletproofs are just mind-boggling, making them ideal for a global cryptocurrency. They are light and at the same time very efficient in both CPU and memory usage. Bulletproofs look amazing for integration into anonymous coins, and I believe they would solve the entire Bitcoin pseudo-anonymity issue.
My only concerns are about the correctness of the implementation and its readiness. Could this implementation be integrated into a real-world cryptocurrency? Do you think exploits could appear?
Thanks, and I am really into Bulletproofs 🙂
The current `Generators` struct is designed to work for `m`-aggregated range proofs of `n` bits each, with both `m` and `n` powers of 2. The aggregated rangeproof generators are constructed as a single chain of size `m*n`.
To deterministically construct nothing-up-my-sleeve generators, we can produce arbitrary-length chains of generators from a label by hashing the label to a point, then hashing that point to a new point, ....
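The lazily-evaluated chain idea can be sketched with a plain iterator. This is only an illustration: `DefaultHasher` and `u64` stand in for a proper hash-to-point function and Ristretto points, and `generator_chain` is a hypothetical name, not the crate's API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a group element; a real chain would hash to a Ristretto point.
type FakePoint = u64;

fn hash_step(input: u64) -> FakePoint {
    let mut h = DefaultHasher::new();
    input.hash(&mut h);
    h.finish()
}

/// Lazily yields an arbitrary-length chain of "generators" derived from a
/// label: hash the label, then hash each element to produce the next one.
fn generator_chain(label: &str) -> impl Iterator<Item = FakePoint> {
    let mut h = DefaultHasher::new();
    label.hash(&mut h);
    let seed = h.finish();
    // Callers just `.take(n)` however many generators they end up needing.
    std::iter::successors(Some(hash_step(seed)), |prev| Some(hash_step(*prev)))
}

fn main() {
    let gens: Vec<FakePoint> = generator_chain("rangeproof A").take(4).collect();
    assert_eq!(gens.len(), 4);
    // Determinism: the same label yields the same chain.
    let again: Vec<FakePoint> = generator_chain("rangeproof A").take(4).collect();
    assert_eq!(gens, again);
    // A different label gives a different chain.
    let other: Vec<FakePoint> = generator_chain("rangeproof B").take(4).collect();
    assert_ne!(gens, other);
}
```

Because the iterator is lazy, no up-front length needs to be specified, which is what the circuit use case below wants.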
For the circuit context, it may not be convenient to specify up-front how many generators will eventually be used, especially if we're building up R1CS systems in a bellman-like way. It seems a lot more convenient to just take the head of a lazily-evaluated arbitrary-length sequence of generators.
To make this more consistent with the rangeproofs, we could change the rangeproof generators to be `m` chains of `n` points each, with a different label for each of the `m` chains. The verifier would then join the `m` chains during verification. This might also allow "rearrangeable aggregation": currently, each party's vectors are concatenated, and the entries are combined with a giant vector of powers of a random factor, so each party needs to know its index in order to determine its place in that vector. But maybe each party could just use a different `z` value (or similar), and the verifier could stitch the vectors together in the same way. This would allow each party's proof shares to be freely arranged (allowing things like blind aggregation, aggregation with variable sizes, etc.).
Review/refine the README and API docs that we have.
Things that we discussed:
- specialized range proof types (`RP16`, `RP32`, `RP64`), cf. #44
- `V` from the proof data

(was: Notes on inner-product protocol)
Make sure we have notes on the inner-product protocol:
- the `verification_scalars` and how they get into the mega multiplication -> #43

WARNING: This is an early request-for-comments version of the proposal. Please discuss.
Rust is great for engineering complex cryptographic schemes: its memory model and type system let us make complex protocols easy and safe to use (example), while not compromising on performance (example). However, people write applications in other languages, most notably C/C++, JavaScript, Go, Java, Swift, etc. So for the world to actually enjoy safe and performant cryptography, we need to integrate well with all of these environments.
Does not scale: re-implementing multiple layers of the architecture, from finite field arithmetic in curve25519-dalek to zero-knowledge proofs in ristretto-bulletproofs.
Does scale: adding a C/wasm API and language bindings for higher-level protocols, since they have a narrower API and do not expose users to raw cryptography unnecessarily.
For every project it probably makes sense to keep all language bindings in its repository, to make sure the bindings stay in sync with the Rust API. Keeping language bindings in distinct repositories is a sure path to divergence of APIs and to undermining users' trust.
How can it work? Users of the project can submit PRs to add bindings for their language, together with a test suite. Project developers will then get automatic feedback on whether their updates are compatible with all the language bindings, and can even add the necessary fixes themselves (as the types passing the FFI boundary should be pretty simple: integers, byte arrays, and opaque pointers to structures defined in Rust).
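As a sketch of how narrow such a binding surface can be, here is a hypothetical one-shot C-callable function. The name `bp_prove_range` and the placeholder "proof" bytes are invented for illustration; the point is that only an integer, a raw byte buffer, and a return code cross the boundary.

```rust
/// Hypothetical one-shot C API: only integers, raw byte buffers, and return
/// codes cross the FFI boundary. The "proof" here is a placeholder (just the
/// little-endian value); a real binding would call into the prover.
#[no_mangle]
pub extern "C" fn bp_prove_range(value: u64, proof_out: *mut u8, capacity: usize) -> i32 {
    let bytes = value.to_le_bytes();
    if proof_out.is_null() || capacity < bytes.len() {
        return -1; // error code instead of a Rust panic across FFI
    }
    unsafe { std::ptr::copy_nonoverlapping(bytes.as_ptr(), proof_out, bytes.len()) };
    bytes.len() as i32 // number of proof bytes written
}

fn main() {
    // Exercising the function from Rust the way a C caller would.
    let mut buf = [0u8; 8];
    let n = bp_prove_range(42, buf.as_mut_ptr(), buf.len());
    assert_eq!(n, 8);
    assert_eq!(u64::from_le_bytes(buf), 42);
}
```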
Do we expose a `ProofTranscript` object, or a simpler "message" buffer? Do we expose an MPC API for aggregated proofs?
Safety considerations: it's probably safer not to expose stateful pieces such as the proof transcript, instead exposing simpler interfaces like "give us a customization/msg string and secrets, and here's a proof blob for you".
For things like the aggregation MPC, the Rust type system and memory ownership allow safe use of all the Party/Dealer types, but that would not be the case if they were exposed via FFI to less strict languages. E.g., in Ruby/JS/Java/Go one would be able to accidentally reuse some values and cause a replay of the protocol, leading to recovery of secrets.
Ergonomics considerations: generally, people are used to simpler one-shot APIs and comfortable with ad-hoc hashing schemes, so an explicit proof transcript will feel overly complicated unless you know what you are doing.
Therefore, leaving the sophisticated API in Rust for now is good for safety and invites people to compose their schemes in Rust, providing their own FFI on top. We can revisit this later. Either people will do more crypto in Rust (a good choice!), and/or there will be an extra, more sophisticated FFI for composing protocols in foreign languages.
`wasm-bindgen` generates all FFI boilerplate directly from Rust declarations.

Right now we have both a `RangeProof` struct and an `AggregatedProof` struct, and more generally we have two separate implementations of both proving and verifying.
This is a tracking issue for the task of de-duplicating the proof structs and figuring out to what extent it makes sense to do so for the implementation.
PR #140 made proving and verifying work on the constraint system. The next step is to entirely remove the circuit module, move whatever code is still needed into the constraint system implementation, and pass directly from the constraints to proof creation or verification.
Currently, the inner product proof only works for vectors with lengths that are powers of two.
This was fine for the range proof code, where this was a valid assumption.
However, for circuits, the vector lengths will not always be powers of 2.
After some discussion, we decided that the best solution in that case is to pad the circuits so that `circuit.n` is a power of 2, which makes the vectors input to the IPP also have power-of-2 lengths.
Initial discussion here: #102
TODO: double check that empty IPP outputs are okay
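A minimal sketch of the padding step, using `u64` entries as stand-ins for scalars (the real code would pad with zero scalars and pad the circuit consistently; `pad_to_power_of_two` is an illustrative name, not the crate's API):

```rust
/// Pad a vector with zeros so its length is a power of two, as the
/// inner-product proof requires.
fn pad_to_power_of_two(mut v: Vec<u64>) -> Vec<u64> {
    // `next_power_of_two` rounds up (3 -> 4, 4 -> 4); for the empty
    // vector it returns 1, which is one answer to the empty-IPP question.
    let target = v.len().next_power_of_two();
    v.resize(target, 0);
    v
}

fn main() {
    assert_eq!(pad_to_power_of_two(vec![1, 2, 3]).len(), 4);
    assert_eq!(pad_to_power_of_two(vec![1, 2, 3, 4]).len(), 4); // already a power of 2
    assert_eq!(pad_to_power_of_two(vec![]).len(), 1); // edge case from the TODO above
}
```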
See comment here: https://github.com/chain/ristretto-bulletproofs/blob/main/src/inner_product_proof.rs#L62
The inputs to the function include a vector of points `H` and adjustment scalars `hprime_factors`. The proof is supposed to be formed with respect to `H'` defined as `H'[i] = hprime_factors[i] * H[i]`.
Right now, to do this, the `H` points are multiplied by the adjustment scalars, which is inefficient, since the factors could instead be combined into the formulas further down (`scalar*scalar` operations are essentially free compared to `scalar * point` operations).
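The algebraic identity behind this, `s * (f * H) = (s * f) * H`, can be checked with integers standing in for scalars and points (integer multiplication stands in for scalar multiplication; this is an illustration, not the crate's code):

```rust
/// "Point" version: materialize H'[i] = f[i] * H[i] first, then combine.
/// Each entry of `h_prime` costs a scalar*point multiplication.
fn naive(h: &[u64], f: &[u64], s: &[u64]) -> u64 {
    let h_prime: Vec<u64> = h.iter().zip(f).map(|(h, f)| f * h).collect();
    h_prime.iter().zip(s).map(|(hp, s)| s * hp).sum()
}

/// Folded version: combine the factors into the scalars first (scalar*scalar
/// is essentially free), then use H directly in one multiscalar mul.
fn folded(h: &[u64], f: &[u64], s: &[u64]) -> u64 {
    h.iter().zip(f).zip(s).map(|((h, f), s)| (s * f) * h).sum()
}

fn main() {
    let h = [3u64, 5, 7];    // the H generators
    let f = [2u64, 4, 6];    // hprime_factors
    let s = [10u64, 20, 30]; // multiscalar coefficients
    assert_eq!(naive(&h, &f, &s), folded(&h, &f, &s));
}
```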
More of a query than an issue. I'm looking into using this to compute proofs for ML applications. The ML will be done in an external language so I was wondering if there's a C/Rust API I could use to call this library.
It would be cool (but very low priority) to be able to compile to web assembly and produce proofs from a browser.
For now this ticket could just be a point of accumulation for stuff that would be useful for doing that.
Problem
People have tried to dig into the algebra behind the range proof and were confused at the very beginning.
Cf. https://twitter.com/NicolasDorier/status/984985725341806592
Also, @bobg once mentioned that a prior, more elaborate version of the notes was more educational.
Suggestions
- the relation of `v` to its bits, and the semantics of each equation
- the inner product `<a,b> = a_0*b_0 + ... + a_{n-1}*b_{n-1}`
- combining `n` statements with an `(n-1)`-degree polynomial

The inner-product proof spends lots of time in this computation https://github.com/chain/ristretto-bulletproofs/blob/main/src/inner_product_proof.rs#L102-L103 performing double-base scalar multiplications.
The verifier's side inlines all of the computation into a single check. This isn't possible for the prover, which has to produce the `L, R` points, but (looking quickly) it seems like it should be possible for the prover to do only one multiscalar mul per `L` and `R`, and drop all of the double-base scalar muls.
Possibly supersedes #80
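The idea that n double-base multiplications can be replaced by one 2n-term multiscalar multiplication is just a regrouping of the same sum. With integers standing in for scalars and points (an illustration of the algebra, not the crate's code):

```rust
/// n separate double-base multiplications a[i]*P[i] + b[i]*Q[i], summed.
fn sum_double_base(a: &[u64], b: &[u64], p: &[u64], q: &[u64]) -> u64 {
    (0..a.len()).map(|i| a[i] * p[i] + b[i] * q[i]).sum()
}

/// The same sum as one 2n-term multiscalar mul over scalars a||b and bases P||Q.
fn one_multiscalar(a: &[u64], b: &[u64], p: &[u64], q: &[u64]) -> u64 {
    a.iter()
        .chain(b)
        .zip(p.iter().chain(q))
        .map(|(s, base)| s * base)
        .sum()
}

fn main() {
    let (a, b) = ([2u64, 3, 5], [7u64, 11, 13]);
    let (p, q) = ([17u64, 19, 23], [29u64, 31, 37]);
    assert_eq!(sum_double_base(&a, &b, &p, &q), one_multiscalar(&a, &b, &p, &q));
}
```

A single large multiscalar mul wins in the real code because Pippenger-style algorithms amortize work across all the terms.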
Since we want to be able to prove statements about existing commitments, we should probably allow specifying what the generators are for those commitments. (For instance, one case where they would vary is the commitments to attributes of an anonymous credential produced by the CMZ'13 aMAC presentation).
Two issues that are related enough that they should probably be fixed at the same time:
Make it clear that the current Keccak-based `Transcript` is very yolo and should be replaced with something more solid.
The current constraint system verify implementation does multiple multiscalar multiplications; these could all be batched together into a single multiscalar multiplication using random values.
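The batching trick is to combine the checks with random weights: a random linear combination of expressions that should each be zero is itself zero, and (with overwhelming probability) nonzero if any check fails. With integers standing in for group elements (an illustration of the idea, not the crate's code):

```rust
/// Combine per-check expressions (each should satisfy lhs == rhs) into one
/// value using random weights; zero iff all checks pass (w.h.p.).
fn combine(checks: &[(i64, i64)], weights: &[i64]) -> i64 {
    checks
        .iter()
        .zip(weights)
        .map(|((lhs, rhs), w)| w * (lhs - rhs))
        .sum()
}

fn main() {
    let weights = [3i64, 5, 11]; // stand-ins for random challenge scalars
    // All checks pass: the combination collapses to zero, so the verifier
    // needs one combined multiscalar mul instead of one per check.
    assert_eq!(combine(&[(10, 10), (42, 42), (7, 7)], &weights), 0);
    // One failing check makes the combination nonzero (except with
    // negligible probability for truly random weights).
    assert_ne!(combine(&[(10, 10), (42, 41), (7, 7)], &weights), 0);
}
```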
`ConstraintSystem.prove()` currently returns `(CircuitProof, VerifierInput)`. However, this is redundant with the `ConstraintSystem::prover_new()` function, which returns the committed points that comprise the verifier input. Therefore, we should update the R1CS API so that `ConstraintSystem.prove()` only returns `CircuitProof`, and also remove the `create_verifier_input()` private function.
It seems like we can benefit from grouping together the value, blinding factor, and generators in a single struct. Let's call it `OpenCommitment` (or `DeferredCommitment`, or something like that).
Benefits:
WDYT?
Given N rangeproofs it is possible to verify them in a single multi-scalar multiplication by merging the relevant iterators.
Make sure that the challenges are bound to all data in the proof transcript.
We should have some CI.
`curve25519-dalek` uses Travis, which is nice because if someone who also uses Travis forks the repo, the config carries over to their fork perfectly.
I'm happy to set up Travis if others don't mind, or to do something else if people would prefer a different CI.
We should add benchmarks for proof aggregation and try running them on `zoomzoom`.
This probably isn't worth doing until we're mostly finished writing the constraint system code, but we should do it before we finish it.
Distinct from the notes on the protocol, we have motivational notes on why/how it works.
The `criterion` config used for all benchmarks is the default one. For things that are expensive (e.g., aggregated proving), this can take a long time:
Benchmarking Aggregated 32-bit rangeproof creation/16: Collecting 100 samples in estimated 685.74 s (5050 iterations)
Maybe there's a way to turn this down.
Before we can have a 1.0 version, we need stable dependencies:
- `curve25519-dalek`: https://github.com/dalek-cryptography/curve25519-dalek/milestone/1

Implement a scheme for aggregating M range proofs in one proof for M distinct value commitments.
Once we have benchmarks (#68), we should try parallelizing the parties' work for the single-party case using rayon.
("Galaxy brain: parallelize your code by doing MPC with yourself over a threadpool")
Note from a discussion with @cathieyun @oleganza about aggregation of circuit proofs (cf #97):
The existing documentation describes aggregation for rangeproofs as multiple parties who want to produce a joint proof that each of their values is in range. This is correct, but it obscures a distinction that's useful in the circuit context.
In fact each of the provers is working independently to produce a proof of their statement (that their value is in range), and the dealer is aggregating these statements into a proof that each party's statement holds. But no two parties collaborate to produce a joint proof of a single statement. Because each of the parties is proving the same range statement, this distinction isn't really important in the rangeproof setting.
However, in the circuit setting, it does matter: the distinction is between
having each party prove satisfiability of their own circuit (the circuits could possibly be distinct, as long as there's some higher-level way to signal to the verifier which circuits are being proved in which part of the proof);
having parties collaborate to prove satisfiability of a single circuit without having to reveal secrets to each other.
For the first one, we can use basically the same framework that we already have to produce an aggregated proof of m
parties. For the second, we would need a different aggregation framework.
(Why is the second one useful? Because it allows, e.g., proving mixes between parties that don't trust each other.)
When we have notes on circuit proving, we should change the existing rangeproof notes to clarify this distinction.
`n` is not 0 or a power of 2.

Other tasks:
- `ProverInput` and `VerifierInput`
- `mod` is more balanced

Add `#[derive(Serialize)]` for the proofs and commitments.
Moving from #57 (comment)
The dealer logic should verify proof shares internally.
We need to have a set of error types, probably using `failure`.
These should cover stuff like:
Currently, the range proof module uses the `Generators` struct, and may be making assumptions that `Generators` enforces that `n` and `m` are powers of two. However, there are two problems with this: the `Generators` file doesn't actually perform any checks on `n` and `m`, and for circuit proofs we want to use the `Generators` with an `n` (and maybe also an `m`) that is not a power of 2.
TODO: go through the range proof code and find where it assumes that `Generators` enforces that `n` and `m` are powers of two, and do those checks in the range proof itself.
Initially discussed here: #102
PR work here: #134
First discussed here:
#111
Distinct from the notes on the protocol, we have motivational notes on why/how it works.
Make sure we have adequate notes on the range proof protocol.
Easy subset of #89: there are a bunch of single-base scalar multiplications that could be avoided by unrolling the first iteration of the IPP loop and combining the base-adjustment scalars into the multiscalar multiplications.
The `RangeProof` struct holds all of its points as `RistrettoPoint`s, which means it's bigger than it needs to be, and we also waste a little bit of time, since we decompress during deserialization, then forget the compressed data, and recompress during verification.
(This was pointed out as a possible flaw when we posted our blog post, but there's no actual difference in our benchmarks, since decompression cost equals compression cost and we didn't count the decompression cost.)
We also use Serde for the messages used in the MPC protocol, but I'm not sure it's a big deal there.
The `curve25519-dalek` Serde support doesn't work with compressed points at all: the idea is that the user defines a struct with some points, and either it deserializes successfully (with all points valid) or fails entirely, so there's no way to create a "halfway-valid" state, and there's a clear validation boundary.
If we'd like to work with `CompressedRistretto` points, I think we would need to implement our own serialization logic, but we may need that anyway to get compact proofs.
Sub-components:
#35 : Notes on RP protocol
#36 : Notes on IPP protocol
#37 : Notes on RP motivation
#38 : Notes on IPP motivation
#39 : API docs and README
We can keep the motivational notes in the `notes` submodule, and keep the RP / IPP notes in the `range_proof` / `inner_product_proof` modules. Since the public structs are re-exported from the main part of the crate, this gives the following separation:
External docs (doc.dalek.rs):
Internal docs (doc-internal.dalek.rs):
- `notes` module

A `Commitment` that is an enum of `Open(Scalar, Scalar)` and `Closed(RistrettoPoint)` would be a nice addition for working with Pedersen commitments. It could have a `close(PedersenGenerators)` function that generates the `RistrettoPoint` commitment.
(see the `r1cs-commitments` branch for my first pass at doing this)
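A minimal sketch of the proposed enum, with `u64` standing in for `Scalar`/`RistrettoPoint` and integer arithmetic standing in for the group operation (the field names and `close` behavior here are illustrative, not the crate's API):

```rust
// Stand-in for PedersenGenerators: B commits the value, B_blinding the blinding.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PedersenGens {
    b: u64,
    b_blinding: u64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Commitment {
    Open(u64, u64), // (value, blinding factor)
    Closed(u64),    // the committed "point"
}

impl Commitment {
    /// Closing an open commitment computes value*B + blinding*B_blinding;
    /// closing an already-closed commitment is a no-op.
    fn close(self, gens: &PedersenGens) -> Commitment {
        match self {
            Commitment::Open(v, r) => Commitment::Closed(v * gens.b + r * gens.b_blinding),
            closed => closed,
        }
    }
}

fn main() {
    let gens = PedersenGens { b: 5, b_blinding: 7 };
    let c = Commitment::Open(3, 2).close(&gens);
    assert_eq!(c, Commitment::Closed(3 * 5 + 2 * 7));
    assert_eq!(c.close(&gens), c); // closing twice is a no-op
}
```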
The `scalar` module had batch inversion code copied from `curve25519-dalek`, but that's now upstreamed, so we can get rid of it.
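For reference, the technique in question is Montgomery's batch-inversion trick: n field elements are inverted with a single inversion plus O(n) multiplications. A self-contained sketch over a small prime field (a stand-in for the Ristretto scalar field, which is what the upstreamed code operates on):

```rust
const P: u64 = 65537; // small prime standing in for the scalar field modulus

fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1u64;
    b %= P;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

fn inv(x: u64) -> u64 {
    pow_mod(x, P - 2) // Fermat inversion: x^(P-2) = x^-1 mod P
}

/// Montgomery's trick: invert all of `xs` in place using one inversion
/// plus 3(n-1) multiplications.
fn batch_invert(xs: &mut [u64]) {
    let n = xs.len();
    if n == 0 {
        return;
    }
    // prefix[i] = x_0 * ... * x_i
    let mut prefix = Vec::with_capacity(n);
    let mut acc = 1u64;
    for &x in xs.iter() {
        acc = acc * x % P;
        prefix.push(acc);
    }
    // Invert the total product once, then peel off one factor at a time:
    // xs[i] = (product of all)^-1 * (product of x_0..x_{i-1}) = x_i^-1.
    let mut inv_acc = inv(acc);
    for i in (1..n).rev() {
        let x = xs[i];
        xs[i] = inv_acc * prefix[i - 1] % P;
        inv_acc = inv_acc * x % P;
    }
    xs[0] = inv_acc;
}

fn main() {
    let mut xs = [3u64, 10, 12345, 999];
    let orig = xs;
    batch_invert(&mut xs);
    for (x, ix) in orig.iter().zip(&xs) {
        assert_eq!(x * ix % P, 1); // each output is the inverse of its input
    }
}
```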
Make sure that the prover operates in constant time.
Use Rayon to exploit parallelism when we can. This issue corresponds to #10.
Having separate types for `RangeProof64`, `RangeProof32`, etc. means we can know how big the vectors are, domain-separate, ensure that range bitsizes are powers of 2, etc.