dalek-cryptography / bulletproofs
A pure-Rust implementation of Bulletproofs using Ristretto.
License: MIT License
The paper on Duplex Keccak defines the duplex mode as absorbing and squeezing <= r (rate) bits (including padding) at each Keccak invocation. Our current implementation allows writing an arbitrary string and then reading an arbitrary string without absorbing any input, with padding applied only to the last block, and only in the absorbing phase.
Fixing this would require a new Keccak API designed specifically to support duplex mode. The tiny-keccak crate does not seem to support this at the moment.
I am a Bitcoin dev who is interested in creating a branch that integrates Bulletproofs. We could use Rust wrappers to integrate and use it directly from C. For me, Bulletproofs are just mind-boggling, making them ideal for a global cryptocurrency. They are light and at the same time very efficient in both CPU and memory usage. Bulletproofs look amazing for integration into anonymous coins, and I believe they would solve the entire Bitcoin pseudo-anonymity issue.
My only concerns are about the correctness of the implementation and its readiness. Could this implementation be integrated into a real-world cryptocurrency? Do you think exploits could appear?
Thanks, and I am really into Bulletproofs 🙂
The current `Generators` struct is designed to work for `m`-aggregated range proofs of `n` bits each, with both `m` and `n` powers of 2. The aggregated rangeproof generators are constructed as a single chain of size `m*n`.
To deterministically construct nothing-up-my-sleeve generators, we can produce arbitrary-length chains of generators from a label by hashing the label to a point, then hashing that point to a new point, ....
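The lazily-evaluated chain idea can be sketched with a plain iterator. This is only an illustration: `DefaultHasher` and `u64` stand in for a proper hash-to-point function and Ristretto points, and `generator_chain` is a hypothetical name, not the crate's API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a group element; a real chain would hash to a Ristretto point.
type FakePoint = u64;

fn hash_step(input: u64) -> FakePoint {
    let mut h = DefaultHasher::new();
    input.hash(&mut h);
    h.finish()
}

/// Lazily yields an arbitrary-length chain of "generators" derived from a
/// label: hash the label, then hash each element to produce the next one.
fn generator_chain(label: &str) -> impl Iterator<Item = FakePoint> {
    let mut h = DefaultHasher::new();
    label.hash(&mut h);
    let seed = h.finish();
    // Callers just `.take(n)` however many generators they end up needing.
    std::iter::successors(Some(hash_step(seed)), |prev| Some(hash_step(*prev)))
}

fn main() {
    let gens: Vec<FakePoint> = generator_chain("rangeproof A").take(4).collect();
    assert_eq!(gens.len(), 4);
    // Determinism: the same label yields the same chain.
    let again: Vec<FakePoint> = generator_chain("rangeproof A").take(4).collect();
    assert_eq!(gens, again);
    // A different label gives a different chain.
    let other: Vec<FakePoint> = generator_chain("rangeproof B").take(4).collect();
    assert_ne!(gens, other);
}
```

Because the iterator is lazy, no up-front length needs to be specified, which is what the circuit use case below wants.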
For the circuit context, it may not be convenient to specify up-front how many generators will eventually be used, especially if we're building up R1CS systems in a bellman-like way. It seems a lot more convenient to just take the head of a lazily-evaluated arbitrary-length sequence of generators.
To make this more consistent with the rangeproofs, we could change the rangeproof generators to be `m` chains of `n` points each, with a different label for each of the `m` chains. The verifier would then join the `m` chains during verification. This might also allow "rearrangeable aggregation": currently, each party's vectors are concatenated, and the entries are combined with a giant vector of powers of a random factor, so each party needs to know its index in order to determine its place in that vector. But maybe each party could just use a different `z` value (or similar), and the verifier could stitch the vectors together in the same way. This would allow each party's proof shares to be freely arranged (allowing things like blind aggregation, aggregation with variable sizes, etc.).
Review/refine the README and API docs that we have.
Things that we discussed:
- specialized range proof types (`RP16`, `RP32`, `RP64`), cf. #44
- `V` from the proof data

(was: Notes on inner-product protocol)
Make sure we have notes on the inner-product protocol:
- the `verification_scalars` and how they get into the mega multiplication -> #43

WARNING: This is an early request-for-comments version of the proposal. Please discuss.
Rust is great for engineering complex cryptographic schemes: its memory model and type system let us make complex protocols easy and safe to use (example), while not compromising on performance (example). However, people write applications in other languages, most notably C/C++, JavaScript, Go, Java, Swift, etc. So for the world to actually enjoy safe and performant cryptography, we need to integrate well with all of these environments.
Does not scale: re-implementing multiple layers of the architecture, from finite field arithmetic in curve25519-dalek to zero-knowledge proofs in ristretto-bulletproofs.
Does scale: adding a C/wasm API and language bindings for higher-level protocols, since they have a narrower API and do not expose users to raw cryptography unnecessarily.
For every project it probably makes sense to keep all language bindings in its repository, to make sure the bindings stay in sync with the Rust API. Keeping language bindings in distinct repositories is a sure path to divergence of APIs and to undermining users' trust.
How can it work? Users of the project can submit PRs to add bindings for their language, together with a test suite. Project developers will then get automatic feedback on whether their updates are compatible with all the language bindings, and can even add the necessary fixes themselves (as the types passing the FFI boundary should be pretty simple: integers, byte arrays, and opaque pointers to structures defined in Rust).
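As a sketch of how narrow such a binding surface can be, here is a hypothetical one-shot C-callable function. The name `bp_prove_range` and the placeholder "proof" bytes are invented for illustration; the point is that only an integer, a raw byte buffer, and a return code cross the boundary.

```rust
/// Hypothetical one-shot C API: only integers, raw byte buffers, and return
/// codes cross the FFI boundary. The "proof" here is a placeholder (just the
/// little-endian value); a real binding would call into the prover.
#[no_mangle]
pub extern "C" fn bp_prove_range(value: u64, proof_out: *mut u8, capacity: usize) -> i32 {
    let bytes = value.to_le_bytes();
    if proof_out.is_null() || capacity < bytes.len() {
        return -1; // error code instead of a Rust panic across FFI
    }
    unsafe { std::ptr::copy_nonoverlapping(bytes.as_ptr(), proof_out, bytes.len()) };
    bytes.len() as i32 // number of proof bytes written
}

fn main() {
    // Exercising the function from Rust the way a C caller would.
    let mut buf = [0u8; 8];
    let n = bp_prove_range(42, buf.as_mut_ptr(), buf.len());
    assert_eq!(n, 8);
    assert_eq!(u64::from_le_bytes(buf), 42);
}
```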
Do we expose a `ProofTranscript` object, or a simpler "message" buffer? Do we expose an MPC API for aggregated proofs?
Safety considerations: it's probably safer not to expose stateful pieces such as the proof transcript, instead exposing simpler interfaces like "give us a customization/msg string and secrets, and here's a proof blob for you".
For things like the aggregation MPC, the Rust type system and memory ownership allow safe use of all the Party/Dealer types, but that would not be the case if they were exposed via FFI to less strict languages. E.g., in Ruby/JS/Java/Go one would be able to accidentally reuse some values and cause a replay of the protocol, leading to recovery of secrets.
Ergonomics considerations: generally, people are used to simpler one-shot APIs and comfortable with ad-hoc hashing schemes, so an explicit proof transcript will feel overly complicated unless you know what you are doing.
Therefore, leaving the sophisticated API in Rust for now is good for safety and invites people to compose their schemes in Rust, providing their own FFI on top. We can revisit this later. Either people will do more crypto in Rust (a good choice!), and/or there will be an extra, more sophisticated FFI for composing protocols in foreign languages.
`wasm-bindgen` generates all FFI boilerplate directly from Rust declarations.

Right now we have both a `RangeProof` struct and an `AggregatedProof` struct, and more generally we have two separate implementations of both proving and verifying.
This is a tracking issue for the task of de-duplicating the proof structs and figuring out to what extent it makes sense to do so for the implementation.
PR #140 made proving and verifying work on the constraint system. The next step is to entirely remove the circuit module, move whatever code is still needed into the constraint system implementation, and pass directly from the constraints to proof creation or verification.
Currently, the inner product proof only works for vectors with lengths that are powers of two.
This was fine for the range proof code, where this was a valid assumption.
However, for circuits, the vector lengths will not always be powers of 2.
After some discussion, we decided that the best solution in that case is to pad the circuits so that `circuit.n` is a power of 2, which makes the vectors input to the IPP also have power-of-2 lengths.
Initial discussion here: #102
TODO: double check that empty IPP outputs are okay
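A minimal sketch of the padding step, using `u64` entries as stand-ins for scalars (the real code would pad with zero scalars and pad the circuit consistently; `pad_to_power_of_two` is an illustrative name, not the crate's API):

```rust
/// Pad a vector with zeros so its length is a power of two, as the
/// inner-product proof requires.
fn pad_to_power_of_two(mut v: Vec<u64>) -> Vec<u64> {
    // `next_power_of_two` rounds up (3 -> 4, 4 -> 4); for the empty
    // vector it returns 1, which is one answer to the empty-IPP question.
    let target = v.len().next_power_of_two();
    v.resize(target, 0);
    v
}

fn main() {
    assert_eq!(pad_to_power_of_two(vec![1, 2, 3]).len(), 4);
    assert_eq!(pad_to_power_of_two(vec![1, 2, 3, 4]).len(), 4); // already a power of 2
    assert_eq!(pad_to_power_of_two(vec![]).len(), 1); // edge case from the TODO above
}
```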
See comment here: https://github.com/chain/ristretto-bulletproofs/blob/main/src/inner_product_proof.rs#L62
The inputs to the function include a vector of points `H` and adjustment scalars `hprime_factors`. The proof is supposed to be formed with respect to `H'` defined as `H'[i] = hprime_factors[i] * H[i]`.
Right now, to do this, the `H` points are multiplied by the adjustment scalars, which is inefficient, since the factors could instead be combined into the formulas further down (`scalar*scalar` operations are essentially free compared to `scalar * point` operations).
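The algebraic identity behind this, `s * (f * H) = (s * f) * H`, can be checked with integers standing in for scalars and points (integer multiplication stands in for scalar multiplication; this is an illustration, not the crate's code):

```rust
/// "Point" version: materialize H'[i] = f[i] * H[i] first, then combine.
/// Each entry of `h_prime` costs a scalar*point multiplication.
fn naive(h: &[u64], f: &[u64], s: &[u64]) -> u64 {
    let h_prime: Vec<u64> = h.iter().zip(f).map(|(h, f)| f * h).collect();
    h_prime.iter().zip(s).map(|(hp, s)| s * hp).sum()
}

/// Folded version: combine the factors into the scalars first (scalar*scalar
/// is essentially free), then use H directly in one multiscalar mul.
fn folded(h: &[u64], f: &[u64], s: &[u64]) -> u64 {
    h.iter().zip(f).zip(s).map(|((h, f), s)| (s * f) * h).sum()
}

fn main() {
    let h = [3u64, 5, 7];    // the H generators
    let f = [2u64, 4, 6];    // hprime_factors
    let s = [10u64, 20, 30]; // multiscalar coefficients
    assert_eq!(naive(&h, &f, &s), folded(&h, &f, &s));
}
```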
More of a query than an issue. I'm looking into using this to compute proofs for ML applications. The ML will be done in an external language so I was wondering if there's a C/Rust API I could use to call this library.
It would be cool (but very low priority) to be able to compile to web assembly and produce proofs from a browser.
For now this ticket could just be a point of accumulation for stuff that would be useful for doing that.
Problem
People have tried to dig into the algebra behind the range proof and were confused at the very beginning.
Cf. https://twitter.com/NicolasDorier/status/984985725341806592
Also, @bobg once mentioned that a prior, more elaborate version of the notes was more educational.
Suggestions
- the relation of `v` to its bits, and the semantics of each equation
- the inner product `<a,b> = a_0*b_0 + ... + a_{n-1}*b_{n-1}`
- combining `n` statements with an `(n-1)`-degree polynomial

The inner-product proof spends lots of time in this computation https://github.com/chain/ristretto-bulletproofs/blob/main/src/inner_product_proof.rs#L102-L103 performing double-base scalar multiplications.
The verifier's side inlines all of the computation into a single check. This isn't possible for the prover, which has to produce the `L, R` points, but (looking quickly) it seems like it should be possible for the prover to do only one multiscalar mul per `L` and `R`, and drop all of the double-base scalar muls.
Possibly supersedes #80
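The idea that n double-base multiplications can be replaced by one 2n-term multiscalar multiplication is just a regrouping of the same sum. With integers standing in for scalars and points (an illustration of the algebra, not the crate's code):

```rust
/// n separate double-base multiplications a[i]*P[i] + b[i]*Q[i], summed.
fn sum_double_base(a: &[u64], b: &[u64], p: &[u64], q: &[u64]) -> u64 {
    (0..a.len()).map(|i| a[i] * p[i] + b[i] * q[i]).sum()
}

/// The same sum as one 2n-term multiscalar mul over scalars a||b and bases P||Q.
fn one_multiscalar(a: &[u64], b: &[u64], p: &[u64], q: &[u64]) -> u64 {
    a.iter()
        .chain(b)
        .zip(p.iter().chain(q))
        .map(|(s, base)| s * base)
        .sum()
}

fn main() {
    let (a, b) = ([2u64, 3, 5], [7u64, 11, 13]);
    let (p, q) = ([17u64, 19, 23], [29u64, 31, 37]);
    assert_eq!(sum_double_base(&a, &b, &p, &q), one_multiscalar(&a, &b, &p, &q));
}
```

A single large multiscalar mul wins in the real code because Pippenger-style algorithms amortize work across all the terms.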
Since we want to be able to prove statements about existing commitments, we should probably allow specifying what the generators are for those commitments. (For instance, one case where they would vary is the commitments to attributes of an anonymous credential produced by the CMZ'13 aMAC presentation).
Two issues that are related enough that they should probably be fixed at the same time:
Make it clear that the current Keccak-based `Transcript` is very yolo and should be replaced with something more solid.
The current constraint system verify implementation does multiple multiscalar multiplications; these could all be batched together into a single multiscalar multiplication using random values.
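The batching trick is to combine the checks with random weights: a random linear combination of expressions that should each be zero is itself zero, and (with overwhelming probability) nonzero if any check fails. With integers standing in for group elements (an illustration of the idea, not the crate's code):

```rust
/// Combine per-check expressions (each should satisfy lhs == rhs) into one
/// value using random weights; zero iff all checks pass (w.h.p.).
fn combine(checks: &[(i64, i64)], weights: &[i64]) -> i64 {
    checks
        .iter()
        .zip(weights)
        .map(|((lhs, rhs), w)| w * (lhs - rhs))
        .sum()
}

fn main() {
    let weights = [3i64, 5, 11]; // stand-ins for random challenge scalars
    // All checks pass: the combination collapses to zero, so the verifier
    // needs one combined multiscalar mul instead of one per check.
    assert_eq!(combine(&[(10, 10), (42, 42), (7, 7)], &weights), 0);
    // One failing check makes the combination nonzero (except with
    // negligible probability for truly random weights).
    assert_ne!(combine(&[(10, 10), (42, 41), (7, 7)], &weights), 0);
}
```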
`ConstraintSystem.prove()` currently returns `(CircuitProof, VerifierInput)`. However, this is redundant with the `ConstraintSystem::prover_new()` function, which returns the committed points that comprise the verifier input. Therefore, we should update the R1CS API so that `ConstraintSystem.prove()` only returns `CircuitProof`, and also remove the `create_verifier_input()` private function.
It seems like we can benefit from grouping together the value, blinding factor, and generators in a single struct. Let's call it `OpenCommitment` (or `DeferredCommitment`, or something like that).
Benefits:
WDYT?
Given N rangeproofs it is possible to verify them in a single multi-scalar multiplication by merging the relevant iterators.
Make sure that the challenges are bound to all data in the proof transcript.
We should have some CI.
`curve25519-dalek` uses Travis, which is nice because if someone who also uses Travis forks the repo, the config carries over to their fork perfectly.
I'm happy to set up Travis if others don't mind, or to do something else if people would prefer a different CI.
We should add benchmarks for proof aggregation and try running them on `zoomzoom`.
This probably isn't worth doing until we're mostly finished writing the constraint system code, but we should do it before we finish it.
Distinct from the notes on the protocol, we have motivational notes on why/how it works.
The `criterion` config used for all benchmarks is the default one. For things that are expensive (e.g., aggregated proving), this can take a long time:
Benchmarking Aggregated 32-bit rangeproof creation/16: Collecting 100 samples in estimated 685.74 s (5050 iterations)
Maybe there's a way to turn this down.
Before we can have a 1.0 version, we need stable dependencies:
- `curve25519-dalek`: https://github.com/dalek-cryptography/curve25519-dalek/milestone/1

Implement a scheme for aggregating M range proofs in one proof for M distinct value commitments.
Once we have benchmarks (#68), we should try parallelizing the parties' work for the single-party case using rayon.
("Galaxy brain: parallelize your code by doing MPC with yourself over a threadpool")
Note from a discussion with @cathieyun @oleganza about aggregation of circuit proofs (cf #97):
The existing documentation describes aggregation for rangeproofs as multiple parties who want to produce a joint proof that each of their values is in range. This is correct, but it obscures a distinction that's useful in the circuit context.
In fact each of the provers is working independently to produce a proof of their statement (that their value is in range), and the dealer is aggregating these statements into a proof that each party's statement holds. But no two parties collaborate to produce a joint proof of a single statement. Because each of the parties is proving the same range statement, this distinction isn't really important in the rangeproof setting.
However, in the circuit setting, it does matter: the distinction is between
having each party prove satisfiability of their own circuit (the circuits could possibly be distinct, as long as there's some higher-level way to signal to the verifier which circuits are being proved in which part of the proof);
having parties collaborate to prove satisfiability of a single circuit without having to reveal secrets to each other.
For the first one, we can use basically the same framework that we already have to produce an aggregated proof of m
parties. For the second, we would need a different aggregation framework.
(Why is the second one useful? Because it allows, e.g., proving mixes between parties that don't trust each other.)
When we have notes on circuit proving, we should change the existing rangeproof notes to clarify this distinction.
`n` is not 0 or a power of 2.

Other tasks:
- `ProverInput` and `VerifierInput`
- `mod` is more balanced

Add `#[derive(Serialize)]` for the proofs and commitments.
Moving from #57 (comment)
The dealer logic should verify proof shares internally.
We need to have a set of error types, probably using `failure`.
These should cover stuff like:
Currently, the range proof module uses the `Generators` struct, and may be making assumptions that `Generators` enforces that `n` and `m` are powers of two. However, there are two problems with this: the `Generators` file doesn't actually perform any checks on `n` and `m`, and for circuit proofs we want to use the `Generators` with an `n` (and maybe also an `m`) that is not a power of 2.
TODO: go through the range proof code and find where it assumes that `Generators` enforces that `n` and `m` are powers of two, and do those checks in the range proof itself.
Initially discussed here: #102
PR work here: #134
First discussed here:
#111
Distinct from the notes on the protocol, we have motivational notes on why/how it works.
Make sure we have adequate notes on the range proof protocol.
Easy subset of #89: there are a bunch of single-base scalar multiplications that could be avoided by unrolling the first iteration of the IPP loop and combining the base-adjustment scalars into the multiscalar multiplications.
The `RangeProof` struct holds all of its points as `RistrettoPoint`s, which means it's bigger than it needs to be, and we also waste a little bit of time, since we decompress during deserialization, then forget the compressed data, and recompress during verification.
(This was pointed out as a possible flaw when we posted our blog post, but there's no actual difference in our benchmarks, since decompression cost equals compression cost and we didn't count the decompression cost.)
We also use Serde for the messages used in the MPC protocol, but I'm not sure it's a big deal there.
The `curve25519-dalek` Serde support doesn't work with compressed points at all: the idea is that the user defines a struct with some points, and either it deserializes successfully (with all points valid) or fails entirely, so there's no way to create a "halfway-valid" state, and there's a clear validation boundary.
If we'd like to work with `CompressedRistretto` points, I think we would need to implement our own serialization logic, but we may need that anyway to get compact proofs.
Sub-components:
#35 : Notes on RP protocol
#36 : Notes on IPP protocol
#37 : Notes on RP motivation
#38 : Notes on IPP motivation
#39 : API docs and README
We can keep the motivational notes in the `notes` submodule, and keep the RP / IPP notes in the `range_proof` / `inner_product_proof` modules. Since the public structs are re-exported from the main part of the crate, this gives the following separation:
External docs (doc.dalek.rs):
Internal docs (doc-internal.dalek.rs):
- `notes` module

A `Commitment` that is an enum of `Open(Scalar, Scalar)` and `Closed(RistrettoPoint)` would be a nice addition for working with Pedersen commitments. It could have a `close(PedersenGenerators)` function that generates the `RistrettoPoint` commitment.
(see the `r1cs-commitments` branch for my first pass at doing this)
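A minimal sketch of the proposed enum, with `u64` standing in for `Scalar`/`RistrettoPoint` and integer arithmetic standing in for the group operation (the field names and `close` behavior here are illustrative, not the crate's API):

```rust
// Stand-in for PedersenGenerators: B commits the value, B_blinding the blinding.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PedersenGens {
    b: u64,
    b_blinding: u64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Commitment {
    Open(u64, u64), // (value, blinding factor)
    Closed(u64),    // the committed "point"
}

impl Commitment {
    /// Closing an open commitment computes value*B + blinding*B_blinding;
    /// closing an already-closed commitment is a no-op.
    fn close(self, gens: &PedersenGens) -> Commitment {
        match self {
            Commitment::Open(v, r) => Commitment::Closed(v * gens.b + r * gens.b_blinding),
            closed => closed,
        }
    }
}

fn main() {
    let gens = PedersenGens { b: 5, b_blinding: 7 };
    let c = Commitment::Open(3, 2).close(&gens);
    assert_eq!(c, Commitment::Closed(3 * 5 + 2 * 7));
    assert_eq!(c.close(&gens), c); // closing twice is a no-op
}
```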
The `scalar` module had batch inversion code copied from `curve25519-dalek`, but that's now upstreamed, so we can get rid of it.
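For reference, the technique in question is Montgomery's batch-inversion trick: n field elements are inverted with a single inversion plus O(n) multiplications. A self-contained sketch over a small prime field (a stand-in for the Ristretto scalar field, which is what the upstreamed code operates on):

```rust
const P: u64 = 65537; // small prime standing in for the scalar field modulus

fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1u64;
    b %= P;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

fn inv(x: u64) -> u64 {
    pow_mod(x, P - 2) // Fermat inversion: x^(P-2) = x^-1 mod P
}

/// Montgomery's trick: invert all of `xs` in place using one inversion
/// plus 3(n-1) multiplications.
fn batch_invert(xs: &mut [u64]) {
    let n = xs.len();
    if n == 0 {
        return;
    }
    // prefix[i] = x_0 * ... * x_i
    let mut prefix = Vec::with_capacity(n);
    let mut acc = 1u64;
    for &x in xs.iter() {
        acc = acc * x % P;
        prefix.push(acc);
    }
    // Invert the total product once, then peel off one factor at a time:
    // xs[i] = (product of all)^-1 * (product of x_0..x_{i-1}) = x_i^-1.
    let mut inv_acc = inv(acc);
    for i in (1..n).rev() {
        let x = xs[i];
        xs[i] = inv_acc * prefix[i - 1] % P;
        inv_acc = inv_acc * x % P;
    }
    xs[0] = inv_acc;
}

fn main() {
    let mut xs = [3u64, 10, 12345, 999];
    let orig = xs;
    batch_invert(&mut xs);
    for (x, ix) in orig.iter().zip(&xs) {
        assert_eq!(x * ix % P, 1); // each output is the inverse of its input
    }
}
```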
Make sure that the prover operates in constant time.
Use Rayon to exploit parallelism when we can. This issue corresponds to #10.
Having separate types for `RangeProof64`, `RangeProof32`, etc. means we can know how big the vectors are, domain-separate, ensure that range bitsizes are powers of 2, etc.