
aut-ct's Introduction

Anonymous usage tokens from curve trees


Introduction

(Caveat: read the caveat, please.)

  • If you are interested in proof of assets, not anonymous usage tokens, please go to this page.

  • If you are time constrained and just want to see it run, or check the environment is set up correctly, then: go to Installation and then Worked Example.

Goal: Be able to use a privacy-preserving proof of ownership of a public key in a set of public keys, as a kind of token with scarcity. In particular, it should be possible to create such a token from a very large anonymity set (tens of thousands up to millions of keys), with a verification time short enough that it can be used practically in real systems. In practice this code already allows such verifications in about 40-60ms on commodity hardware for up to 2.5M pubkeys (at least).

More specifically, we imagine these public keys to be those of Bitcoin utxos (or perhaps just txos).

The basis of this is Curve Trees, implemented here. That construction allows one to prove set membership in zero knowledge, with pretty good computation and verification complexity, by essentially using an algebraic version of a Merkle tree, and using bulletproofs to get a ZKP of the validity of the "Merkle" proof.

Specifically, it can be made to work for Bitcoin public keys because of the existence of the 2-cycle of curves secp256k1 and secq256k1; we need such a 2-cycle to implement a Curve Tree construct (technically it works with a bigger cycle but the 2-cycle is optimal, since multiple bulletproofs over the same base curve can be aggregated).

That construction is not quite enough for using ownership of public keys as tokens; for that, there needs to be a mechanism to ensure no "double spend" of tokens. This line of thinking tends to lead to a "zerocoin" type of construction, and indeed the above paper introduces "Vcash" as doing exactly that. But this requires a globally synchronised accumulator to record (in a ZK way) the spending of each coin/commitment. Without such a "virtual ledger", the other way you could record "usage" of a key in the set is with a key image, a la Cryptonote/Monero ring signatures. This allows more localized and "loosely-coupled" verification of no-double-spend, depending on the use case.

This "key image" approach can be implemented as described here in this repo, in this document. It's a simple additional ZK proof of knowledge of the opening of a Pedersen commitment, tied to a key image, using completely standard Sigma protocol techniques.

An initial draft of a workable protocol for "receive tokens in exchange for proven utxo ownership" is in this document.

Caveat

Everything here is completely experimental and not safe in any way. Importantly, even the underlying Curve Trees code was only written as a benchmarking tool, and therefore even that is not safe to use in anything remotely resembling a production environment.

If you choose to play around with this stuff for Bitcoin projects I suggest using signet for now.

Installing

Set up Rust, if you haven't, using rustup.

Install this project:

git clone https://github.com/AdamISZ/aut-ct

Check if it's working with

cargo test

from inside the repository. If there are no errors, you are good to go.

If you see an error like:

linker `cc` not found

... e.g., on Debian, your Rust installation is not functioning because it doesn't have the C compiler toolchain. This should fix it:

sudo apt install build-essential

Running

Build the project with cargo build --release (without the release flag, the debug build is very slow); the executable is then at target/release/autct.

Start with target/release/autct --help for the summary of the syntax. Note that two flags are required (TODO: they should be arguments), namely -M for the mode/method and -k for the keyset.

Taking each of the four modes in turn:

"serve":

target/release/autct -M serve --keysets \
my-context:testdata/autct-203015-500000-0-2-1024.aks,my-other-context:some-other-filepath -n signet

The application's architecture is based around the idea of a (potentially always-on) daemon acting as an RPC server, for clients to request actions from. This allows easier usage by external applications written in different environments/languages etc., and means that such clients do not need to have implemented any of the custom cryptographic operations.

Currently the application supports three distinct RPC requests, represented by the modes prove, verify and newkeys.

The RPC server often takes considerable time to start up (1-2 minutes, say), as it loads precomputation tables and constructs the Curve Tree; it then serves on the port specified with -p at the host specified with -H (default 127.0.0.1:23333).

Additionally, a server can serve the proving and verification function for multiple different contexts simultaneously, by extending the comma-separated list as shown above. Each item in the list must have a different context label, and the keyset files for each can be the same or different, as desired.

"newkeys":

./autct -M newkeys --keysets none -n mainnet -i privkey-file

If you need a new taproot address and its corresponding private key, this convenience method allows that. The private key is written in WIF format to the file specified with the -i flag, and can be imported into other taproot-supporting wallets; the address is the 'standard' p2tr type. The network (-n) should be one of mainnet, signet or regtest.

"prove":

target/release/autct -M prove --keysets \
my-context:testdata/autct-203015-500000-0-2-1024.aks \
-i privkeyfile

As per newkeys above, the private key is read in as WIF format from the file specified with -i (the default is a local directory file called privkey).

Note that the keysets (or -k) option takes a particular format: contextlabel:filename, and that this can be a comma-separated list for the serve operation, but for proving and verifying you must use just one. The idea of the context label is that usage tokens' scarcity depends on context; you can use the same utxo twice in different contexts (i.e. in different applications), but only once in the same context.

The file autct-203015-500000-0-2-1024.aks, or whatever else is specified (see Appendix 1 for the filename structure), should contain public keys in the following format: compressed, hex encoded, separated by whitespace, all on one line.
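For illustration only (these truncated keys are hypothetical, not taken from any real keyset), the single line of such a file looks like:

028a34... 03b1c2... 02d9e0... (etc.)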

The output of the proving algorithm is sent to the file specified by -P, which should usually be around 2-3kB. The program will look for the pubkey corresponding to the given private key, in the list of pubkeys in the pubkey file, in order to identify the correct index to use in the proof.

Note that, in contrast to verification as described below, proving can take a non-trivial time (15 seconds for large keysets is not atypical).

"verify":

target/release/autct -M verify --keysets \
my-context:testdata/autct-203015-500000-0-2-1024.aks -P proof1

This client connects to the above server and calls the verify() function with a binary string taken directly from the file specified with -P; it returns the success or failure of verification in the field accepted. If something is wrong, for example if the key image is reused, you will see an error message describing the condition.

In the directory testdata there is an example pubkey file containing approximately 330K pubkeys taken from all taproot utxos on signet at block 203015, which you can use to test if you like. For this pubkey set, the private key cRczLRUHjDTEM92wagj3mMRvP69Jz3SEHQc8pFKiszFBPpJo8dVD is for one of those pubkeys (signet!), so if you use it, the proof should verify, and the key image you get as output from the verifier should be: 2e7b894455872f3039fb734b42534be410a2a2237a08212b4c9a5bd039c6b4d080 (with the default test labels as per the worked example below).

Configuring

Use target/release/autct --help for documentation of the options available; these are the same settings you can set in the config file:

The config file is auto-generated in ~/.config/autct/default-config.toml (or similar).

Precedence operates as you would expect: command line options take precedence over config file values, but be aware that updates (i.e. just choosing a different option in a command line call) are persisted to that config file. Third in order of precedence is the default value. As noted, two "options" (-M and -k) must always be specified.

The depth and branching factor are the parameters of the curve tree. The generators_length_log_2 option may be removed in future, but it should correspond to the smallest power of 2 that is bigger than D(912+L-1), where D is the depth and L is the branching factor (see Issue #19 for detailed discussion). If it helps: for keysets up to 500K in size, the defaults should be fine. The RPC port can also be configured here.
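As a worked example of that rule (illustrative only; this helper is not part of the codebase, and it assumes, as the option's name suggests, that the value stored is the base-2 logarithm):

// Smallest power of 2 strictly bigger than D*(912+L-1), as its base-2 log.
// e.g. D=2, L=1024: 2*(912+1023) = 3870 -> 4096 -> 12.
fn generators_length_log_2(depth: u64, branching: u64) -> u32 {
    let n = depth * (912 + branching - 1);
    (n + 1).next_power_of_two().trailing_zeros()
}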

Finally, one may need to set the user_string with -u to a hex-serialized BIP340 pubkey or an alternative user string (see Appendix 2). This defines "who" can use the resources accessed by "consuming" the utxo in that context.

Worked Example

Paths here assume you are in the root of the repository.

Put a WIF encoded private key into a file in the current directory called privkey (by default; change with -i):

echo cRczLRUHjDTEM92wagj3mMRvP69Jz3SEHQc8pFKiszFBPpJo8dVD > privkey

Then encrypt it:

target/release/autct -M encryptkey --keysets none -i privkey -n signet

You'll be prompted for a password. Delete the file privkey afterwards; the encrypted key will be in privkey.enc.

This particular private key corresponds to an existing signet utxo with more than 500k sats in it. Please don't spend it!

Next, start the RPC server:

target/release/autct -M serve --keysets \
my-context:testdata/autct-203015-500000-0-2-1024.aks -n signet

... as noted above, it may take a minute or two to start up. Once you see Starting server at 127.0.0.1:23333, it is ready.

Then switch to a new terminal. Request computation of the proof:

target/release/autct -M prove --keysets my-context:testdata/autct-203015-500000-0-2-1024.aks \
-n signet -i privkey.enc -P default-proof-file

This will likely take around 15 seconds; at the end you should see Proof generated successfully, and the file default-proof-file will contain the proof.

If you check the other terminal you will see some debug output as the proof was created.

Next make a request to verify the proof and deliver a resource:

target/release/autct -M verify -P default-proof-file -k \
my-context:testdata/autct-203015-500000-0-2-1024.aks

Output in the server's terminal should look similar to this:

Elapsed time for selrerand paramater generation: 58.00ns
Elapsed time for verifier gadget call: 2.15ms
Elapsed time for verifier calls: 46.28ms
Root is odd? false
Elapsed time for verify_curve_tree_proof: 49.00ms
Verifying curve tree passed and it matched the key image. Here is the key image: "2e7b894455872f3039fb734b42534be410a2a2237a08212b4c9a5bd039c6b4d080"

Output of rpcclient should look like:

Configuration file: '/home/user/.config/autct/default-config.toml'
	mode = "request"
	... other config settings
	
Request was accepted by the Autct verifier! The proof is valid and the (unknown) pubkey is unused.

Note that if you repeat this test, you will get instead:

Request rejected, proofs are valid but key image is reused.

which is correct; the key image was stored in the file autct-v1.0my-contextkeyimages.aki, and reuse will not be allowed unless that file is deleted.

(Process currently verified working on Ubuntu 22.04, Debian 12 and Windows 10)

Testing

See more info here.

Making an RPC client

As an example for the case of Python, see here.

Conceptually, developers should be able to run this application in server mode, locally, and then their own app can make proof and verification calls quickly, without any custom or non-standard crypto code, to the RPC API provided here, consisting of the three methods prove, verify and createkeys. The calls are made over websockets. See example.py in the src directory of that repo for concrete examples.

As long as websockets are supported, the same should be very easy to do for other programming languages/environments.

Read here for the definitions of the methods/objects used in the RPC-API.

Keysets

Apart from the small test keysets discussed in the testing document, you might want to use real-world keysets, e.g. the mainnet taproot keys. This document explains how that can be done, but be aware that these data sets are large (e.g. it could easily take 30 minutes to do this, even after you figure out the process!).

Security

See more here.


aut-ct's Issues

`J` - counters for key images

Following the example of "PoDLE" in Joinmarket - it is very likely that many potential applications will want multiple usages available per (u)txo/pubkey. For this reason it makes sense to do:

  • J = hash(application-specific-label, counter)

(see #7 for some related concepts)

... where counter is just an incrementing integer (or anything similar). For now, this is already technically possible if both sides are prepared to fold the counter into the label; messy but it works.

An interesting practical point here: if we have to check a given token against multiple counters, does it degrade performance too much? I think the answer is no: this calculation is completely orthogonal to the (expensive) curve tree construction and the (cheap but non-zero) curve tree proof verification. Those can be done first, and then we can simply iterate the Ped-DLEQ proof verification with different counters (exactly as is done in Joinmarket), because it's very cheap/quick; see the sketch below.
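A minimal sketch of that ordering (verify_curve_tree_proof, verify_ped_dleq and nums_point_for are hypothetical stand-ins, not the real autct API):

struct Proof;
struct Point;
fn verify_curve_tree_proof(_p: &Proof) -> bool { unimplemented!() } // expensive, run once
fn nums_point_for(_label: &str, _counter: u32) -> Point { unimplemented!() } // J = hash(label, counter)
fn verify_ped_dleq(_p: &Proof, _j: &Point) -> bool { unimplemented!() } // cheap, can iterate

// Returns the first counter (if any) for which the Ped-DLEQ check passes;
// the expensive curve tree verification is done once, up front.
fn accepted_counter(proof: &Proof, label: &str, max: u32) -> Option<u32> {
    if !verify_curve_tree_proof(proof) {
        return None;
    }
    (0..max).find(|&c| verify_ped_dleq(proof, &nums_point_for(label, c)))
}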

Private key format

Currently using hex, needs to be WIF (indeed, absent wallet integration, there is no version of this that isn't at the very least inconvenient, but raw hex 32 byte privkeys, while in some ways appropriate because only taproot is supported, are even more ridiculously inconvenient for anyone who isn't a bitcoin developer ...).

Algorithm for permissibility function for input pubkeys

Both the original paper and the code assume that the input elements of the set used as leaves of the curve tree are already permissible points.

Thus if your input values (curve points) are not permissible, converting them to permissible is "left as an exercise for the reader", so to say.

However the conversion of points created as the tree is traversed, into permissible, is handled in the code using this simple algorithm:

define f(pt):
    while pt is not permissible:
        pt = pt + H
    return pt

where H is the generator for the randomness of the Pedersen commitment.

This function f is not viable as a hash of the initial public key for the leaf, however, since it isn't in any sense a cryptographic hash - it doesn't have second preimage resistance. Speaking less fancily, if the value of f(P) is P+3H, then clearly the value of f(P+H) is also P+3H. So we have second preimages.

Does this matter? I think the answer is similar to the concern in #2 , i.e. it is not a problem for the full protocol including Ped-DLEQ but it may or may not be a problem for other protocols using a Curve Tree.

If H is generated as NUMS, which it must be, then the user does not know the private key of P+H (staying in the hypothetical scenario where f() maps P to P+3H, so that P+H is a second preimage, as are P+2H and P+3H). So if the algorithm here includes a proof of knowledge of the private key of P, which it does in the form of the attached Ped-DLEQ protocol, then this second preimage cannot be used to create an overall valid proof.

All the same, it is worth considering whether an alternative to the f() described above might be better, if it avoids such preimages.

Prover slowed by creating permissible points

The proving time is currently dominated by the call to create_permissible_points_and_randomness. Sample data:

Depth 2, branching factor 1024:
250K keys, total proving time 48s, CPPR call 31s
795K keys, total proving time 148s, CPPR call 103s

Notice that the CPPR call is roughly linear, and the rest of the operation is sublinear (though it probably could be better). But also, CPPR is not really part of the proving time at all; it is a preprocessing step, necessary for keys derived from real-world usage (like Bitcoin, nostr, etc.) that are not deliberately chosen to be permissible (and, in line with the philosophy of this project, which is to allow "spontaneous" proof creation, we do not expect non-users of the protocol to choose their pubkeys to suit our use-case).

To convert the code into rough English: the permissible_point function in permissible.rs in the Curve Trees repo takes a given point on the curve (though it is called a "commitment", note that we don't need it to have a blinding property at the leaves of the tree for this use case; we pass through the pubkey directly), then increments +H, +2H, +3H, ... until a permissible point is reached, as checked by running the Universal Hash. As noted in the paper, a point has roughly a 1 in 4 chance of being permissible, so we expect about 4 recalculations on average (see the sketch below).
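A sketch of that loop, with stand-in types (the real routine is permissible_point, called from create_permissible_points_and_randomness; this is not the actual code):

use std::ops::Add;

#[derive(Clone, Copy)]
struct Point; // stand-in for a secp256k1 point
impl Add for Point {
    type Output = Point;
    fn add(self, _rhs: Point) -> Point { unimplemented!() } // EC point addition
}
fn is_permissible(_pt: &Point) -> bool { unimplemented!() } // the Universal Hash check

// Try-and-increment: with roughly 1 in 4 points permissible, expect ~4 rounds.
// The count r is returned because each +H shifts the commitment randomness,
// which the prover needs to track (presumably the "randomness" in the CPPR name).
fn make_permissible(mut pt: Point, h: Point) -> (Point, u64) {
    let mut r = 0u64;
    while !is_permissible(&pt) {
        pt = pt + h;
        r += 1;
    }
    (pt, r)
}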

Investigation is needed into how and to what extent this overhead can be reduced.

My best guess is that the culprit is a combination of two things: 1/ the factor of 4 mentioned above, and 2/ the field operations required in each of those steps for checking permissibility, as found in the is_permissible function call.

If there isn't an obvious way to optimize that, this overhead might be unavoidable - either way, as has been noted, this overhead is and must be linear, even if we make it small. However, as it is a preprocessing step, it could be cached upfront, or perhaps "streamed" by doing these (individually very cheap) transformations up front as new pubkeys come into the system (e.g. new blocks on Bitcoin Core). Or alternatively we could treat 50s as an acceptable cost for proving in some situations, where creating proofs is a rare event, but this isn't very attractive.

A note on proof size

"Executive summary" because this is way too in the weeds for most:

There's no realistic way of reducing the proof size much below 2.7kB. By making them slightly bigger we can make proving nontrivially faster, if that is helpful.

The proof size varies more or less logarithmically with the branching factor (which we have as default 1024), and (in a very complicated way) approximately linearly with the depth (which we have as default 2). As a result, since the set size is (branching factor) ** depth, we want to use the shallowest tree possible for the smallest proof size (which is what we are already doing), though it is worth noting that the size doesn't change very much if we switch from e.g. 1024 ** 2 to 32 ** 4. However, for faster proving it might be better to go with deeper trees, at a small extra cost in proof size.


This is more of an "informational" Issue for now. The idea is to write out in some detail where the proof sizes come from, so that it becomes possible to know whether improvements in this area are possible, and to estimate more accurately what the proof sizes are for different parameters.

The full proof serialization produced by the rpc.RPCProver.prove function contains the following:

$D$ - the serialization of the point that is the reblinded form of the prover's taproot key
$E$ - the key image of that same key.
proof - this is the Ped-DLEQ proof of correctness of the above as explained in autct.pdf
p0proof - this is the Bulletproof of the circuit for the secp256k1 curve (the root and the leaves)
p1proof - this is the Bulletproof of the circuit for the secq256k1 curve (the intermediate layer)
path - the (reblinded) path to the root from the prover's leaf
root - the root of the Curve Tree

The only reason for this note/Issue is the 4th and 5th entries, the bulletproofs. Calculating their size as a function of the application variables is non-trivial; the rest is easy, but we'll write it here for completeness:

$D$, $E$, root - curve points on secp256k1 serialized as 33 bytes, so 99 bytes total
proof - this, as per the referenced pdf, requires 2 curve points and 2 scalars, so $2 \times 33 + 2 \times 32 = 130$ bytes, and this will not vary with proving parameters
path - $d=2$ ($d$ is depth) points on secp and secq from the leaf to the root (with reblinding of course). This is the serialization of a SelectandRerandomizePath object, which consists of 2 vectors, one each for the even and odd curves. In the case of $d=2$, each vector has length 1, but carries 8 extra bytes for vector serialization, giving a total size of $(8+33) \times 2 = 82$ bytes.

Each of the two bulletproofs is more or less the same size, but:

  • as noted in the original paper, the size of one such proof depends logarithmically (assuming use of the inner product argument, which is of course the main advantage of Bulletproofs, so yes, we do use it) on the number of multiplication gates in the circuit. Parallel arithmetic constraints (from $1 \ldots q$ as per the paper) don't affect the total size. So to know the size of the proofs produced by autct, you need to count the number of multiplication gates used.

Here is an attempt to audit that number:

This section:

https://github.com/AdamISZ/curve-trees/blob/08700f59427a36d8fb2b49a7e8bc7aaf6d27454e/relations/src/single_level_select_and_rerandomize.rs#L111-L117

... from the curve trees (my fork, but unchanged) repo shows the operations needed to do a single level Select and Rerandomize:

  • First, the select operation, which requires one multiplication gate per branch, so $L$ multiplication gates. This is because the polynomial constructed has degree $L$, with one root per set member.
  • Second, the permissible operation on the chosen point from the set. This requires 4 multiplication gates.
  • Third, the rerandomize operation. This is the part that does elliptic curve arithmetic. The code splits the field element into 3-bit windows, so a 256-bit value is split into $\lceil 256/3 \rceil = 86$ windows. This requires **$7 \times 86 + 3.5 \times 84 + 10 = 906$ multiplication gates**.

Taking all this together, and including the fact that we need one Select and Rerandomize per (even/odd) depth in a single bulletproof, gives us $\frac{d}{2}(L + 911)$ multiplication gates in each of the two bulletproofs.

That means that for the concrete case, in this application, of $L=1024$, we have a total of $n=1935$ multiplication gates.

Referring to the proof structure (R1CSProof) in curve-trees/bulletproofs/proof.rs, we can see that for the generalized version of bulletproofs used here, the size formula is not quite as simple as

$$(2 \times \lceil \log_{2}(n)\rceil + 8) \ \textrm{group elements} + 5 \ \textrm{scalar elements}$$

(where $n$ is the number of multiplication gates), as in the original Bulletproofs paper. It is actually:

$$2 \times \lceil \log_{2}(n)\rceil + 3 + \textrm{len}(T)\ \textrm{group elements} + 5 \ \textrm{scalar elements}$$

Now, we can assume size(group element) = 33 and size(scalar) = 32, but we must also account for the vector serialization used, which always adds 8 bytes to the length of each vector.

The vector $T$ is derived as part of "Generalized Bulletproofs" (link above), based on the number of additional Pedersen (vector) commitments made; in the case of $d=2$, we have only one additional commitment per bulletproof (one for the even depth(s), one for the odd). So we start with ncomm=1 in the notation of the source file bulletproofs/prover.rs, giving op_degree=2, giving a t_poly of length 6, and a vector $T$ of length 7.

As a function, if length(T) = $f_T(d)$:

$$ f_T(d) = 2\left(2 + 2 \lfloor \frac{d}{4} \rfloor + 1\right) + 1 $$

Actually, in the case where the parent rerandomization scalar in the single-level select and rerandomize is zero, as is always the case for the first proof from the root to the next layer, we have zero additional commitments; but as per the formula for $f_T(d)$, this still results in len($T$) = 7.

Thus the overall size of the two bulletproofs uses $f_T(d) + f_T(d-1)$ in place of the two len($T$) terms, always, a fact we'll use at the end.

Back to the concrete $L=1024, d=2$ example: this results in a total size of a single bulletproof of:

$$ 2 \times \lceil (\log_{2}(1935))\rceil +3 + 7 \ \textrm{group elements} + 5 \ \textrm{scalar elements} + 1 + 8 + 8 + 8 $$

where the additional 8s account for the vector serialization of the terms $T$ and the two vectors of $\lceil \log_{2}(1935)\rceil$ items each. The additional 1 byte encodes a flag for the inclusion of extra data (unused for the moment, I believe, i.e. it's just a zero byte).

In brief, this gives: (22 + 3 + 7) * 33 + 5 * 32 + 24 + 1 = 1241 bytes.

There are two such bulletproofs, resulting in 2482 bytes.

Then we add the other components from the start, giving $1241 \times 2 + 99 + 130 + 82 = 2793$

Which agrees with size on disk of proof files generated by autct -M prove: 2793 bytes.

Taking a high level view, the formula should be something like (where $L$ is branching factor and $d$ is depth; they are related by $N= L^d$, where $N$ is the size of the set we're proving over):

$$ 2 \times \left(2 \times \left\lceil \log_{2}\left(\frac{d}{2}(L+911)\right)\right\rceil + 3\right) + f_T(d) + f_T(d-1) + (5 + d) \ \textrm{group elements} + (5 \times 2 + 2) \ \textrm{scalar elements} + 2 + 8 \times 8 $$

... where $f_T$ is as defined above:

$$ f_T(d) = 2 \times \left(2 + 2 \left\lfloor \frac{d}{4} \right\rfloor + 1\right) + 1 $$

If approximately correct, one may wonder if a branching factor $L=32$ is a good idea, with $(2^5)^4 = (2^{10})^2$.

Proof size would then be:

$$ 2 \times \left(2 \times \left\lceil \log_{2}\left(2 \times (32+911)\right)\right\rceil + 3 + \frac{f_T(4) + f_T(3)}{2}\right) + (5 + 4) \ \textrm{group elements} + (5 \times 2 + 2) \ \textrm{scalar elements} + 2 + 8 \times 8 $$

where $f_T(4) = 11$ and $f_T(3) = 7$. This gives a total proof size of 2991 bytes (confirmed in tests), which is bigger. However, it can reduce the proving time according to my tests, which is arguably more important given that the proving time is 15-30 seconds (especially if you include the deserialization of points from file storage). So this may be worth looking into further.
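To make the accounting checkable, here is a small self-contained Rust sketch of the size estimate (assuming even depth $d$ and the byte counts used throughout this note); it reproduces both concrete figures above:

fn f_t(d: u64) -> u64 {
    2 * (2 + 2 * (d / 4) + 1) + 1
}

// Estimated total proof size in bytes for branching factor l and (even) depth d:
// 33-byte group elements, 32-byte scalars, 8 bytes per serialized vector
// (6 vectors across the two bulletproofs, 2 in the path) and 2 flag bytes.
fn estimated_proof_size(l: u64, d: u64) -> u64 {
    let gates = (d / 2) * (l + 911); // multiplication gates per bulletproof
    let log_n = 64 - (gates - 1).leading_zeros() as u64; // ceil(log2(gates))
    let group_els = 2 * (2 * log_n + 3) + f_t(d) + f_t(d - 1) + 5 + d;
    let scalars = 5 * 2 + 2;
    group_els * 33 + scalars * 32 + 2 + 8 * 8
}

fn main() {
    assert_eq!(estimated_proof_size(1024, 2), 2793); // the default parameters
    assert_eq!(estimated_proof_size(32, 4), 2991); // the L=32, d=4 alternative
}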

Finding optimal values is non-trivial given the use of the non-differentiable functions floor and ceil, though certainly possible. It seems that the shallowest tree structure is usually optimal for proof size, while deeper trees would be faster (how to quantify this?); I would like to put a bit more analysis work into it.

Performance of point deserialization

After resolving #10, I realized that the biggest bottleneck in the proving time is having to call the deserialization routine on each pubkey (curve point) that was serialized into the keyset file (pre-converted to permissible, but that's not relevant here; it's still one curve point per pubkey).

The deserialization call here:

let x = <Affine<P0>>::deserialize_compressed_unchecked(

... now takes about 9.5 seconds, whereas calling deserialize_compressed, which includes extra sanity checks per point deserialization, takes over 100 seconds.

This 90% speedup has, by definition, solved "most" of the problem, but I am still not satisfied that a linear cost in simply reading in public keys should end up being the overwhelming part of the proof cost; this would mean that (probably) running a proof over a 1M pubkey set will have a 30 second cost just for reading the data, which will overwhelm the 2-5 seconds needed to actually run the cryptographic proof.

For now, the situation is acceptable as long as we are not working with keysets larger than a few hundred thousand, but it's very suboptimal.

PS.

Just to check, I tried two different Rust syntaxes for reading in the keys, but unsurprisingly it made no noticeable difference in performance:

let leaf_commitments: Vec<Affine<P0>> = pts_bin.iter().map(|x| <Affine<P0>>::deserialize_compressed_unchecked(
        &x[..]).expect("Failed to deserialize point")
    ).collect();

or

    let mut leaf_commitments: Vec<Affine<P0>> = Vec::new();
    for a in pts_bin.into_iter(){
        let x = <Affine<P0>>::deserialize_compressed_unchecked(
            &a[..]).expect("Failed to deserialize point");
        leaf_commitments.push(x);
    }

User labelling

In a practical usage of this primitive, you would almost always want the proof to "sign over" an ephemeral identity. Indeed, without this element, the protocol has a danger:

Imagine that you own a private key x for a utxo U of the utxo set S, and you generate a proof P_a(S, U), where the subscript _a means 'for the context of application a' (this is currently handled by our context_label). This ensures that the proof cannot be reused in a different application b, but it does not tie the proof to anything else. Therefore, if the proof is to be used as a token for access to a service Q for application a, it is not safe to broadcast it, since another user can use P just as easily as you can, even if they cannot generate a new P themselves.

Since we are focused on anonymity, the solution is to attach an ephemeral user identity to the proof, as a message that is being signed over. Hence this ephemeral identity should be inside the challenge hash of the Pedersen-DLEQ proof (but note: it should not be in the preimage of the J point, since that would allow reuse of key images; that preimage should always be universal at the level at which we want to enforce scarcity).
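Schematically (the exact transcript contents here are an assumption, not a specification):

$$ e = \mathcal{H}(\textrm{Ped-DLEQ transcript} \,\|\, \textrm{user message}), \qquad J = \textrm{hash-to-point}(\textrm{application label}) $$

i.e. the ephemeral user message enters only the Fiat-Shamir challenge $e$, while the preimage of $J$ contains only the label at the level at which scarcity is enforced.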

Of course an ephemeral identity string might be the most common usage of this extra message field inside the Ped-DLEQ hash, but it makes sense for it to be an arbitrary length string whose semantics are defined by the application a. But it MUST include an identifying "user" string (or similar) if you want to allow users to gain benefit of it without the risk of it being stolen by a different user.

Finally, there is no reason I can see for now (?) to also include this message string in the Curve Tree proof's challenge hash, since the mechanism already exists to tie the curve tree proof to the Ped-DLEQ proof, so the former can't be successfully reused by a non-key-owner (they would fail to create a new Ped-DLEQ proof).

Keyset file location vs name

The code currently reads the keysets config var as context_label:filepath but this is dumb: we want the second part of that string to represent the name of the keyset, shared by client and server (or peer and peer), whereas the location on disk is a different thing. We could try to force that to be the same for both parties using a default directory location for keysets, but it seems unnecessary. Instead, we'll add another config var with an easy default for the location of keysets, and then include only the name (the *.aks string as described here) in the keysets string.

Python bindings

I was thinking about creating Python bindings for aut-ct so that it can be easily used in joinstr. Does that make sense, or do we have better alternatives?

Encryption of private key used for proofs

If proving is an RPC call, it leaves an issue to be addressed about the threat model w.r.t. secret key material.

In the case of a wallet, we obviously encrypt the secret "at rest", in something like a wallet file. It's unlocked in memory and used only within-process.

The architectural problem here, however, is non-trivial. Consider that the most plausible usage scenario for the app would be:

a wallet application exists, call it W.

W wants to support this as a feature for its user.

W calls AUTCT with a proving request; it needs to provide the private key to AUTCT to allow the proving request to work.

This means that AUTCT has to be kind of "owned" by W from a security point of view (W could run it as a subprocess or similar), and trusted with either a raw private key or a password to decrypt a wallet file. The latter might seem more normal, but AUTCT would need to "speak the language" of the decrypted wallet file of W (and be able to decrypt it), which doesn't seem to make sense.

But the alternative, that W passes the raw private key to AUTCT as part of the RPC call, also seems dangerous. A well-known example of this insecure pattern: imagine that W called AUTCT as a subprocess with a flag -i <privkey> - this would make the private key visible in the OS process list.

If the RPC prove call is via an encrypted/secure channel (TLS), then this problem is addressed, so long as that channel is actually authenticated and encrypted properly, preferably with forward secrecy. I think this option is a little annoying in terms of extra baggage (self signed certs will be required), but probably correct.

Opinions welcome, hence the label.

Re-use of key images across protocol instantiations

Considering two possible definitions for the independent NUMS generator, here J that is used for deriving key images:

  • J = hash(application-specific-label,curve-tree-root)
  • J = hash(application-specific-label)

The latter seems more appropriate for the most intuitively appealing use cases. As an example, consider making proofs over the utxo set. This is a set, and therefore a tree, that you may want to update regularly, e.g. every week or every day. Every time you update, you change the tree and therefore the root. With the first of the above options, you would be able to create new, unlinkable key images for a single utxo every time you update. It's hard to see when that would be preferable to a key image, and therefore a token (or tokens), fixed to the same utxo over time.

For this reason, the second option will be chosen, for now.
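As a minimal sketch of the chosen option (using the sha2 crate; the final hash-to-point step to obtain the NUMS generator J, e.g. by try-and-increment, is elided, and the function name is hypothetical):

use sha2::{Digest, Sha256};

// Option 2 above: derive J's preimage from the application label alone, with
// no curve tree root, so key images stay stable across tree/root updates.
fn j_preimage(app_label: &[u8]) -> [u8; 32] {
    Sha256::digest(app_label).into()
}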

About the null entries used to complete the leaf set

This isn't technically an "Issue" if I understood it correctly, but it needs to be written down somewhere:

Obviously it is normal that the number of elements of the set you want to use (the number of bitcoin pubkeys, let's say) will not be a power of 2.

As can be seen here:

... the strategy taken to deal with this case is that a value of '0' is inserted for the x-coordinate of the commitment point (for the leaf), if there is no actual leaf available.

This would tend to suggest that an issue might arise when using '0' for x-coordinates. But it isn't that clear. Suppose there are four children, points on secp: P1, P2, P3, P4, and suppose P4 is actually empty, represented in code by None (see the x_coordinate() function). Then the scalar it is mapped to in secq for the next layer up the tree (here "x4") is zero (suppose "C" is the parent of P1..P4). But C is constructed as (x1G1 + x2G2 + x3G3 + x4G4 + rH) with, here, x4=0. So clearly that will work fine (indeed, it must, for this fill-out strategy to work!).

(An additional complication is permissible points: when you construct a curve point in this algorithm, it is incremented with +H (i.e. r+=1) as many times as needed, deterministically, then passed through the Universal Hash to see if we get a permissible point or not. But this is a deterministic process carried out by prover and verifier.)

However, all this is moot if we consider that the Ped-DLEQ component of our proof in aut-ct is in fact checking knowledge of the private key; more specifically, for a rerandomised point D, we check:

  • the prover must prove knowledge of the secret witness (d, r) for D = dG + rH
  • the prover must prove that E has same DL wrt J as D has wrt G (i.e. d)

The second, trivially, cannot work with d=0, because it would output the point at infinity, irrespective of the choice of any J on the curve. Most implementations of ECC just outright reject this, but it wouldn't matter if they allowed it.

To summarize: because of the output of the key image there's no way a forged proof using a zero private key can work. To analyze, instead, what could happen in a different Curve Trees-using protocol would need one to specify the exact way the Curve Tree was being used there, distinct from how it is being used here.

(I'm relegating this additional info to an "Addendum" because I think it happens not to be relevant; but it's still interesting and tangentially related:

secp256k1 does not have a point with x-coordinate 0. Put another way, over the base finite field of secp256k1, there is no square root of 7 (reminder: y^2 = x^3 + 7, and we want x == 0).
It just so happens that this is also true of secq256k1:

sage: F = FiniteField(2**256-2**32-2**9 -2**8 - 2**7 - 2**6 - 2**4 - 1)
sage: a = 0
sage: b = 7
sage: n = 115792089237316195423570985008687907852837564279074904382605163141518161494337
sage: Fn = FiniteField(n)
sage: F(7).sqrt()
sqrt7 //<-- this is the return format for values without roots in the field
sage: F(9).sqrt()
3 //<-- if there is a root it is shown
sqrt115792089237316195423570985008687907852837564279074904382605163141518161494327
sage: Fn(7).sqrt() //<-- Fn is the scalar field of secp, so it is the base field of secq
sqrt7

Why is this irrelevant? Because the operation (x-coordinate of a point on secp -> scalar in secq) doesn't have to pay attention to whether the input x-value is actually valid. To illustrate, consider what the code is doing if you provide an encoding of an x value which is not actually a valid x-coordinate on secp256k1 (this is true for approximately half of the integers in 0..N, where N is the order of the secp256k1 group).
)

Update test cases for default key image labelling

Currently the J generator used for the DLEQ (and therefore the key image) is a hash of a random string. It needs to be standardised with a customisable label taken from the label field in the config file, and then the test cases need updating to include the default value of this label.

Support multiple concurrent key image stores

Specified in config/command line for serve method with a comma separated list.

At the same time, change the specification from a file path to just a name, with a default location for the keyset file (because the server and client need to communicate it).

Add TLS to rpc connection

IIRC there is an example in toy-rpc, so this should be fairly easy, at least with self-signed certs.

versioning

Currently we have a --version option in the arguments, but here I was thinking specifically of the ability of the user to choose a communication protocol version (see docs/protocol-utxo.md for the current first draft).

The application version should be separate and --version should be reserved for this. So the probable action to take is: remove --version from the explicit command line, ensure that clap is taking version from the Cargo.toml setting, then add another option for communication protocol version.
