Giter Club home page Giter Club logo

bbs-signature's Introduction

The BBS Signature Scheme

This repository is home to multiple internet drafts around the BBS Signature scheme, detailed below:

BBS Signature Scheme (Core Draft)

This draft defines the core operations, cryptographic structures and overall protocol for the BBS Signature scheme

Blind Sign BBS Signature Scheme Extension

This draft defines an extension to core draft enabling the ability for blind bbs signatures, including describing the required operations, cryptographic structures and sub-protocol.

Meetings

Regular meetings are held bi-weekly on Mondays, on the same weeks as the Applied Crypto Working Group call is held.

Meeting agendas and minutes can be found in /meetings, the next up and coming meetings agenda can be found in here.

Tooling

To assist the development of the specification a set of tooling is co-located in this repository and can be found here.

Generating the output documents

The text and HTML versions of the specifications can be generated by running make; this requires having the xml2rfc and mmark packages installed.

bbs-signature's People

Contributors

alessandroguggino avatar alessandropino avatar andrewwhitehead avatar basileioskal avatar christianpaquin avatar dev0x1 avatar hackmd-deploy avatar jovfer avatar kdenhartog avatar lucagiorgino avatar mikelodder7 avatar nembal avatar schanzen avatar tmarkovski avatar tplooker avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

bbs-signature's Issues

[Proposal] An interop profile document

Given the discussion in #8 around how the core bbs signature draft should relate to underlying cryptographic primitive selection, I wanted to propose an approach that may be inline with this emerging consensus.

Instead of mentioning in the core bbs signature draft anything around concrete curves or digest algorithms we could define a seperate document which is an opinionated "bbs signature profile" aimed at promoting interoperable implementations. The profile would also host test vectors to facilitate this process.

If this approach is accepted by the community then I would suggest we immediately start work on one profile that uses the following.

  • BLS12-381 as the pairing friendly curve
  • SHAKE256 as the digest algorithm

Alternatively if having multiple document is too complex we could elect to merge these, however we would need to be careful in how we structure the document.

[USECASE] Privacy Preserving Authorization/Access Tokens

Currently the most prolific model for expressing authorization/access tokens on the web are those like JWT based ones as used by the OAuth2.0 protocol. In order to provide these tokens with the necessary authenticity and integrity guarantees the tokens are secured using conventional cryptographic primitives like traditional digital signatures and or structures like HMACs. To make use of one of these access tokens issued by a protocol like OAuth2.0, a relying party most commonly includes the access token as issued, in the Authorization header of an HTTP request they make to a resource server to prove authority. Some limitations of this model are:

  • That repeated requests to the same resource server correlates the relying party.
  • These access tokens are most often bearer in nature, which means if the token is intercepted by a malicious third party, they then obtain the same level of access as the relying party at the resource server.
  • Finally when authorization has been granted to a relying party from an authorization server that gives it access across multiple resource servers, having a single access token can leak certain information un-intentionally to the various resource servers as the access token is used.

An access token secured with a BBS signature could help to solve some of these challenges in the following ways:

  1. An access token secured with a BBS signature would never be directly revealed via a request to a resource server, thus mitigating the correlation risk. Instead each time a relying party were to make a request to a resource server they instead would compute a derived proof of the access token and include that result in the request to the resource server. The resource server still obtains the same level of confidence (if not better) around proof of authority associated to the request, without strongly correlating the client via the signature on the access token.
  2. Because an access token secured with a BBS signature is never directly revealed via a request to a resource server, instead only a derived proof which cannot be replayed, the likely hood of the access token being intercepted by a malicious third party is strongly mitigated.
  3. Finally because BBS Signatures enable multi-message signing and selective disclosure during the deriving of a proof, a relying party could leverage this to derive proofs from a single access token that proves authority to perform actions at independent resource servers without the risk of un-intentionally information leakage.

Generalize the presentation proof nonce

The spec currently specifies a presentation nonce to prevent replay attack, and recommends it to be a random value. I suggest we had a few details and requirements around nonce generation, to accommodate various scenarios:

  • For replay protections, the nonce only need to be unique, so a verifier identifier + timestamp would work (to create proofs non-interactively).
  • If the holder keys are held in hardware that the user only has access for a limited period of time, then making the nonce random prevents the user pre-generating proofs to be used later after the authorized period.
  • A BBS+ attribute credential can be used to create a digital signature on some data, disclosing a subset of the attributes. In this case, the nonce can be the digest of the signed data.

I can create a PR with suggested updates.

Improve documentation around BBS Signature message generator procedure

The specification currently has limited detail on how an important aspect of existing implementations of BBS associated to this draft works, that is the deterministic generation of what are known as message generators (a name that perhaps should be revisited).

For context a message generator is effectively a "public parameter" that is used to commit a digested message that will be protected by the resulting BBS signature. All of the message generators for all of the messages are required during sign, verify, deriveProof and verifyProof (even for those messages that remain un-revealed). For implementations of BBS signatures outside / not aligned to this draft, these generators are often communicated out of band or in conjunction with the signature or proof.

Implementations following this draft however derive these generators via a deterministic process from the public key during the sign, verify, deriveProof and verifyProof operations thus removing the need to manage these generators as additional public parameter information associated to the signature.

This technique/procedure should be better documented and discussed by the working group.

This issue is also partially related to #10 in the sense that it explains why the revealed messages have to be ordered in reference to how they were signed as this provides the means to associate the correct message to its generator.

Domain separation of generated signatures

At the moment within the scope of this specification there are several points of extension, such as

  • How messages are mapped to their appropriate scalar values
  • How the generators used to commit each message are created (e.g from a global, issuer specific or signature specific seed)

Many of these above decisions will be limited or at least formally defined in a concrete ciphersuite to promote interoperability. However with multiple possible cipher suites overtime that make different choices, how do we ensure that a signature generated under one cipher suite is not confused with one generated under another cipher suite.

One proposal to limit the possibility of this occurring is to define that the first message in a BBS signature MUST always be revealed further more this message could contain a range of information OR something more limited like a cipher suite identifier. As raised by @BasileiosKal on the call, putting too much information in this revealed message is likely to create a correlation risk that would erode some of the benefit that BBS signatures provide, so perhaps a better middle ground would be to instead encode the cipher suite identifier (e.g for the current BLS12381 cipher suite in the spec this could be the hash scalar produced from the string BBS_BLS12381G1_XOF:SHAKE-256)

Supporting use cases for blind signing

Currently the spec documents a procedure for constructing BBS signatures where by some / all of the messages signed by the signer are unknown to them. This issue is to discuss the usecases for this feature and whether it should remain in scope for the core spec of whether it could be added in the form of a seperate extension specification. The tradeoff being that electing to keep blind signing in scope significantly increases deliverables for the spec in the form of multiple cryptographic operations and data structures.

Note - related to this is a discussion on how we should handle "bound" bbs signatures captured in issue #28

Parameterise the bytes drawn from the XOF so they are defined by the cipher suite

Throughout the operations defined by the spec there are numerous calls to the XOF where a certain number of bytes are read, via the .read() function. How many bytes are read is dependent on several decisions at the cipher suite level (e.g the curve used) hence the current "magic numbers" like 64 being used in these operations needs to be parameterised in the cipher suite definition.

Proof of knowledge revealed message structure

Related to #2

Currently the implementations of a BBS proof of knowledge signature includes the encoding of a binary structure at the start of the proof which captures which of the messages being verified and how they correspond to the original ordering of the message set signed. Currently this structure takes the form of a bit array.

For example assume the following set of messages were originally signed

[
  "message1"
  "message2",
  "message3",
  "message4",
  "message5"
]

And a proof revealing only the 1st 4th and 5th message was generated, meaning the revealed messages supplied during the verify proof operation would appear in the following order

[
  "message1",
  "message4",
  "message5"
]

The resulting bit array in the proof would take the following form

+-----------------------------+-----------------------------+---------------------------------+
| Total messages length (16bit)| Revealed messages bit array |..... remaining proof structure |
+-----------------------------+-----------------------------+---------------------------------+

+-----------+--------------+-----------------------------------+
| 5 (integer) | 1 | 0 | 0 | 1 | 1  | ..... remaining proof structure |
+-----------+--------------+-----------------------------------+

However, this mechanism of expression does limit options around the data canonicalisation algorithm that is used on top of the cryptographic scheme. Namely it does not support a re-ordering of the revealed messages away from the order in which they were originally signed. To accomodate this we could instead elect to change the expression of this aspect of the proof from a bit array to a byte/multi-byte array, where instead the array contents is a set of un-signer integers (16bit each) where each integer corresponds to the index the message occupied in the original ordering of the signed message set

For example assume the following set of messages were originally signed

[
  "message1",
  "message2",
  "message3",
  "message4",
  "message5"
]

And a proof revealing only the 1st 4th and 5th message was generated, BUT the messages in the resulting proof are re-ordered so that they present as 4, 5, 1

[
  "message4",
  "message5",
  "message1"
]

The resulting bit array in the proof would take the following form

+-----------------------------+----------------------------------------+-------------------------------+----------------+
| Total messages length (16bit)| Revealed messages array length (16bit) | Revealed messages byte array | remaining proof |
+-----------------------------+----------------------------------------+-------------------------------+----------------+

+--+--+--------+------------------------------+
| 5  | 3 | 4 | 5 | 1 | ..... remaining proof structure |
+--+--+--------+------------------------------+

The tradeoff here is that the overhead of this structure is more than in the simpler case with the bit array.

NOTE It is probably also worth considering a more standard structured binary format like CBOR rather than a custom structure

Blinded signatures revision

As it is right now, the commitment mechanic (blinded signatures) could lead to some inconsistencies and possible forged signatures. For example, if the Issuer signs the messages m1, m2, a "standard" (non-blinded) signature would be, (A, e, s) with

A = (g0 + h0 * s + h1 * m1 + h2 * m2)*(1/(SK+e))

If the Holder commits a message m3 using h1, i.e.,

commit = h0 * s’ + h1 * m3

The blinded signature, created by the Issuer, would be

A_blind = (commit + g0 + h0 * s + h1 * m1 + h2 * m2)*(1/(SK+e)) = 
  = (g0 + h0 * (s+s’) + h1 * (m1+m3) + h2 * m2) * (1/(SK+e)). 

Notice that (A_blind, s+s’, e) is a valid BBS+ signature in the messages (m1+m3) and m2, which is not the desired outcome and a potential forgery (A_blind, s, s’ and e are known to the Holder).

The main problem is that currently there is no mechanism for the Issuer to validate that the Holder does not create commitments using the same G2 elements as themself (h1 in the example above). My proposal is (at least for now) to define a blinded signature that would only contain committed from the Holder messages.

Element encoding when passed to HASH or XOF.

See discussion in PR #95

In the spec, we need a way to pass elements of different types (EC points, integers, scalars etc) to a hash function. In that case, the encoding of those elements becomes important. IMO there are two options:

  1. Explicitly making the transformation to octets for each hashed element. That would look something like
HASH(point_to_octets(PK), I2OSP(L), point_to_octets(H_1),..., point_to_octets(H_L),...)
  1. Implicitly make the conversion (in the definition of the HASH for example or in a separate section etc.). This would look something like,
HASH(PK, L, H_1,..., H_L,... )

Personally I prefer 2. This would mean that we can also specify rules like the one Tobias pointed out here i.e., that each element should also be appended with its length, which i agree with. A consideration here is if we need to specify this rule for every element (even the point_to_octets outputs) or it is sufficient to specify it for general octet strings and non-negative integers (lengths of lists)??. IMO the latter will suffice, since there is not really any ubiquities that come to mind regarding the output of point_to_octets (which is also an octet string but with well defined length), but i could be mistaken.

Indexes and messages notation

The spec makes heavy use of lists of messages and indexes. We need a consistent way to represent those.

- Option 1:
Represent indexes with a consisted format like I_t, I_d corresponding to the total and disclosed messages. Represent the messages with something like,

(msg[i] for i in I_d)   // or I_t

As such spkGen for example will be,

spkGen(spk, PK, I_t, I_d, (msg[i] for i in I_d), pm)

- Option 2:
Define that all signatures on L messages will use the L first generators. With that, instead of the indexes of all the signed messages we can pass just the totalSignedMessages. That will also allow for a solution like here with

spkGen(spk, PK, revealedMessages, totalSignedMessages, pm)

where revealedMessages is a map between index and message. The drawback here is that when creating commitments for blind signatures, the user will need to choose the n first generators, with n = Total_Committed_Messages (i.e., always follow the example flow from the blind signatures spec), which could be restrictive for some applications.

- Option 3:
A compination of both, i.e.,

spkGen(spk, PK, Indexes, revealedMessages, pm)

where again revealedMessages is a map between index and message and Indexes the indexes of all the messages

Elaborate on background around group signatures

Because the concept of a "group signature" is a lesser known term in the wider community that uses digital signature technology it would provide a good opportunity for education around some of the novel features of these types of cryptographic schemes provide if the spec included some of this background

Named message generators

I'm finding that it would be beneficial to define generators more generally, instead of just having H_0 for blind signatures and H_1 ... H_n for indexed messages. Extensions to the BBS signature scheme may want to reserve one or more message indexes for particular purposes (wallet binding, revocation, etc), but this leads to conflict when multiple extensions are used at once. You would likely need to define a new signature scheme for every supported combination in order to specify what message indexes are used for what purpose.

Instead, I believe we can define special-purpose message generators which have unique identifiers rather than indexes. You can also think of these as 'unordered' generators, with standard usage that is consistently defined across various credential schemas. A few potential uses that I've identified:

  • a 'linked secret'/'wallet binding' generator for preventing use of credentials without the secret (Indy's use case)
  • a 'device binding' generator (for a secret value that can be authenticated as generated by a trusted module)
  • revocation entry and revocation group generators for linking to an accumulator or other revocation mechanism
  • a schema identifier generator
  • a credential ID generator
  • a credential set ID generator (for a collection of credentials that can be presented in combination)
  • a credential creation time generator
  • a credential expiration time generator

We could also reformulate a couple things in these terms:

  • H_0 becomes the 'blinded signature' or 'commitment' generator
  • P1 could be replaced by a 'base point' generator - freeing it up to be used as a holder binding generator as in #37 (NB. in this case the associated message would always be 1)

These generators might be defined either specific to an issuer key or globally by adding a unique identifier to the hash-to-curve input. Some thought would be needed here to define the process exactly. For debugging purposes, messages tied to these generators might also be represented as a dictionary while numbered messages are represented as a list, for example:

{
  "revocation index": 1,
  "revocation group": "mydomain.com/users",
  "schema": "http://schema.org/Person"
}

Naming of the draft BBS vs BBS+

I propose we rename the draft to "The BBS Signature Scheme" and elect to drop the + from all mentions of BBS. Through appropriate citations of the academic work I think we can still achieve the right level of association to the schemes origins. The rationale for dropping the + is that in many visible places that character is not allowed (e.g naming repositories, naming packages and programming language syntax)

Security Review of Differences

The work on the specification is progressing well, but there seem to be a number of changes being considered that will cause this specification to deviate from the published and peer-reviewed academic papers.

While there may be more, I am concerned in particular about:
#19
#37
#70
#78

If these changes, or any others, are made which result in a deviation from the security proofs of the source material, then new security proofs need to be written before this spec is submitted as a draft to IETF.

Document rationale for why BBS Signatures are only efficient when the signatures generated are in G1 and public keys are in G2

Unlikes BLS signatures which as per the spec define two possible variations in the scheme based on which sub-group is used for the signatures and public keys. Because of the unique mathamatical computations involved for proof generation (SPK gen) being only efficient when performed in G1, the effective variant of BBS signatures that would instead use G2 is impractical. We should document this in the spec.

More flexible default generators

At the moment, the default generators are defined using roughly the formula hash-to-curve(PK || index || count), meaning that the generators are dependent on both the public key of the issuer and the number of signed messages.

There is also discussion of wanting to use a set of predetermined random generators instead of the default generators, in which case the generators would be independent of both the public key and the number of messages in the signature.

In the interest of aligning these approaches somewhat, I think it's worth considering removing the message count from the hash input, so that the default generators depend only on the public key of the issuer. This could have positive effects on performance given that all the generators could be cached up to a maximum number of signed messages.

This change might also be relevant in the case of JSON-LD credentials. In order to commit to one or more hidden messages (to implement holder binding for instance) a holder would need to have the relevant message generators. But the total number of messages is often not known at this point in the credential exchange protocol, because the JSON-LD credential itself (including the proof block?) would need to be generated and normalized first. If the generators were independent of the number of messages and the hidden messages were among the first messages (not added to the end) then this would be made possible.

Correspondence between proof and public key

In general, I was under the impression that a signature can be used to derive proofs that would be validated with only 1 specific key (the issuers PK).

That does not seem to be the case. The holder can present proofs for credentials signed from an issuer with public key PK, that will be validated with PK * y = P2 * (x*y), where y a scalar chosen by the holder. My question is: Is this dangerous?? It is not really a forgery. It may compromise the issuer's and/or holder's accountability though, i.e., an issuer may falsely claim that they never signed a specific credential or a holder that they never possessed (and used to create an spk) a credential signed from that issuer.

You can find the detailed mathematics here: https://hackmd.io/@Vasileios/HyuXt_kf9
and a reproduction here: https://github.com/BasileiosKal/dif-bbs-issue

In general the method is the following; Let the holder having the signature (A, s, e) on some messages. Let also the holder wanting to create a proof that would be validated with PK * y instead of PK, for some scalar y. During spkGen they will calculate Abar_f = A’ * y * e + B * r1 * y and D_f = B * r1 * y + H0 * r2. Note that e(Abar_f, P2) = e(A', PK * y). Changing some additional values in the proof, the holder can present the spk; (A’, Abar_f, D_f, …) that will be validated with PK * y.

A similar method works with a signature. The holder can present (A*(y^-1), s, e*y) which is a valid signature for the key PK * y (which is why the above method doesn't go against the security proofs given in the literature, i.e., a bbs proof, proves knowledge of a signature not only from the secret key x but also for a signature from the "secret key" y * x).

If a mitigation is needed, maybe including the issuer's PK as one of the signed messages will be an efficient solution, that solves the issuer’s and/or holder’s accountability issue.

Seperate the message generators from the public key

Currently the specification uses the term "deterministic public key" to refer to the underlying public key + message generators required to commit a series of messages to in order to construct a BBS signature signed by an issuer. This association has been found to be confusing and the suggested clarification is to remove the concept of a "deterministic public key" and instead just have a public key and a set of message generators with a defined process under which they are generated.

Updates to blind signature extension draft

We made significant progress on the core draft over the past couple of months, which is likely to have created some in-consistencies in how the blind sign draft maps onto it as an extension, this issue is just open to track this work and re-alignment.

Holder Binding

Currently the spec is missing a way to bind a signature to a user. The proposal from here is to use BLS keys. Essentially the process will be the following,

  1. User creates BLS keys i.e., BLS_SK and BLS_PK in G1 (BLS_PK = P * BLS_SK ) where P a point in G1
  2. The Issuer will use the BLS_PK of the user as a commitment i.e.,
A = (P1 + BLS_PK + h0 * s + h[1] * msg[1] + h[2] * msg[2] + … + h[L] * msg[L]) * (1/(e + x))
  1. The user will have to know BLS_SK to derive a BBS+ proof creating the binding.

An issue is what base points to use i.e., P (base point for use with BLS) and P1 (base point for use with BBS+). Those points need to be different. A choice is to use P as defined by the BLS signatures spec, which is the base point of G1 as defined from the pairing-friendly-curves-draft spec.

The advantage is that BLS will most likely use the base point as defined by the pairing-friendly-curves spec, so we will be in spec when it comes to BLS keys. The negative is that we will need to define a different base point for BBS+ (i.e, P1). My proposal is to just use P1 = P*2. Another option is to use a random point in G1 as the base point for BBS+ if there is some security reason to do so.

NIZK's must use fiat-shamir hash comparison

When verifying any NIZK, it is necessary to check if the fiat-shamir heuristics are the same i.e. the challenges.
Comparing whether the commitments are the same is not enough.

Change digest algorithm used to SHAKE256

There are three different ways a digest algorithm is used in a BBS Signature

  1. Message generators
  2. Doing deterministic signature generation
  3. Proof - for the fiat shamir heuristic

Current implementations use BLAKE2B however this is a non-NIST approved digest algorithm making alignment more difficult the suggestion is to shift to using SHAKE256 for the above applications

BBS spk security against an untrusted/compromised issuer

Let’s say that the issuer gets compromised, i.e., both their secret key x and the records with the signatures they signed will become known. Let’s assume a holder has a bbs+ signature (A, e, s) from that issuer with A = B * (1/(e+x)) where B = P1 + H0*s + H_1*msg_1 + … + H_L*msg_L. Any spk will be of the form, spk = (A', Abar, D, ...) where

A' = A * r1 =  B * r1 * (1/(e+x))

and

Abar = A' * x =  B * r1 * (x/(e+x))

The values (B, B*x, A', Abar) essentially form a DDH tuple, which is easy to solve using pairings. Specifically, it will hold,

e(B*x, A’) = e(B*x, B*r1*(1/(e+x))) = e(B, B) ^(r1*(x/(e+x))) = e(B, B*r1*(x/(e+x)))

meaning that,

e(B*x, A') = e(B, Abar)

So, if an adversary who knows x and s, sees (or has already seen) a proof with A’ and Abar, they can just try different messages, construct B and check the above equality between the pairings. Since the messages will not be many this is not hard to do and will result in the secret messages of the holder been revealed.

If the Issuer is untrusted, they can use the above equation to track the holder by correlating the spk's with specific signatures.

This is mitigated a little by using blind or bound signatures. If the commitment is not revealed an adversary will not be able to construct B by trying possible hidden messages. However, it is reasonable to assume here that if the issuers secret key and the signatures they signed are revealed so will the commitments (they will be known -and possibly recorded- by the issuer). Furthermore, if the issuer is untrusted. they can still correlate the holder.

As a result, the bbs+ proofs are not secure against an untrusted or compromised issuer. This is not that much of an issue and is like what happens in most PKIs but it might be worth documenting it in the security considerations in the spec so we will have an accurate security model.

Crypto agility

The current draft is partially agnostic to the cryptographic details and in other parts very specific.
For example, the choice of curve is left open: https://github.com/schanzen/bbs-signature/blob/main/spec.md#choice-of-signature-primitive
But, the curve is implicitly enforced in https://github.com/schanzen/bbs-signature/blob/main/spec.md#terminology along with other primitives such as the hash functions.
This will make #7 a bit awkward to fix.

I guess the specification could be very specific to only define a concrete instantiation or abstract to allow crypto agility.
The latter may require additional metadata in the objects.

Related to #5 as well. Maybe hashfunctions should be harmonized for the initial recommendation/instantiation?

Cite relevant prior academic works

The BBS Signature scheme is based on a wealth of prior academic work that should be appropriately referenced. Any deviations from these works should also be documented in an informative appendix while the scheme continues to develop. Lets use this issue to collate these sources

Variable and function naming

The current draft uses the caret and tilde characters in variable naming, as they are in some of the source research. This causes issues with the markdown format of the specification in that it is burdensome and/or impossible to properly escape these characters.

I don't see any clear standards on variable or function naming in other IETF drafts. Variables may be upper-, lower-, camel-, or snake-cased. Functions can be camel-cased, snake-cased, or kebab-case (ie. HKDF-Extract).

Some suggestions for discussion:

  • name spec-internal variables and functions in snake case unless they represent an initialism (like DST) or are commonly capitalized (such as a field F or the set of integers Z). I think there is a tendency to use capitalized variables for elliptic curve points like A in the signature
  • name external functions according to their respective standards when possible, for example hash_to_curve
  • a trailing apostrophe is allowed in variable names, such as s' - precedent in hash-to-curve draft for example
  • for variables currently using ~ as a suffix use _b short for blinding factor
  • for variables currently using ^ as a suffix use _c short for commitment

Test vectors

In order to facilitate interoperability this specification should contain a set of test vectors for the cryptographic structures defined

Conceptually seperate message generators from public key definition

At the moment in the draft a BBS signature public key is defined as an ordinary public key (using the based generator for the curve group) + a set of message generators used for committing messages to a BBS signature. Given the conversation occurring in #19 even if the outcome is to stick with a set of message generators that are derived through a deterministic process from the ordinary public key, I think we should remove the concept of a BBS signature public key and regard message generators as "public parameters" instead.

Deterministic SPK

Although deterministic proof generation should not be used in practice it may be useful to support it, so we can more easily create and update test vectors. This MUST NEVER be used in practice. The main idea is to incorporate a secret key managed by the holder of the bbs signature into spkGen that is going to be used for the generation of the random elements. This secret MUST be unique for every proof generation procedure. One possible method then, is to use the same algorithm as in deterministic signing, i.e.,

H = XOF(Holders_SK || msg_i1 ||…||msg_iR || pm)
r1~ = OS2IP(H.read(64)) mod q
r2~ = OS2IP(H.read(64)) mod q
              .
              .
              .

In spkGen, maybe we can also use RFC 8937, which gives good security properties and is easy to use, since you don't need direct access to the secret key itself. Each random element (in the non-deterministic case) could be

random_element_i = OS2IP(HKDF-Expand(HKDF-Extract(HASH(Sig(Holders_SK, DST)), PRF(L)), i, 64)) mod q

To get a deterministic spk, we could just replace the PRF with an XOF or hash. That said, this does "bound" the holder to use a secret, even in the non-deterministic case.

Message Generators Creation Method

Since the consensus seems to be to move to global fixed message generators shared between issuers (see #19 ) the next step would be to decide how to create those generators. Opening this issue to create some discussion around possible proposals. My understanding is that they will be deterministically generated by a common seed (created from some nothing-up-my-sleeve value).

I would also propose for the process to be such that it will create a strong binding between index and generator (i.e., something like hash-to-curve(seed, dst, index)) to help with indexing in applications and avoiding ambiguities.

Update core operations to clarify the form of input messages

Currently the core sign, verify, spkGen and spkVerify operations are silent on the form the messages take into the algorithm, based on conversations with @mikelodder7 these are most often inputed in the form of scalars as outputed by the MapToScalar function being proposed in #61. We should clarify this further in the spec.

Message Encoding

Data can be encoded multiple ways depending on how it will be used with other ZKPs. For example, it will not be useful to hash an integer to a 32 byte value using SHA256 and expect it to work properly with Range Proofs. The following is a list of encodings and the use cases they can be used:

  1. Hashed: for aribitrary length data fields that will not fit in the base field e.g. images, strings, biometrics, blockchain transaction info. Use hash2field or some other method to hash the data into a 32 byte base field value.
  2. Bytes: a value that is already in the base field
  3. Numbers: for range proofs. The value is zero-centered by take computing the base field modulus z=q / 3 integer division as the zero center and adding the positive number or substracting the negative number. To support complex numbers like decimal, the number is converted to fixed point arithmetic. The Q format is used for these numbers as Q64.160. This leaves two bytes to avoid numbers greater than q. While it does not represent the full breadth of IEEE754 numbers, it does give a considerable resolution -263 to 263-2-160 signed and 0 to 264-2-160 unsigned or about 48 decimal places of precision. If decimals are rounded to the nearest integer, then Q format is not necessary.
  4. Null: Hard coded as 5. Means the value is not included or not used.
  5. Empty: Hard coded as 11. Means the value is included but is an empty string or zero bytes. For example, a person does not have a middle name but a value is required to be entered.
  6. Ignored: Hard coded as 23. For data fields that were ignored or not answered by the person. This is different than empty where the person's answer is literally nothing vs they chose not to answer it at all.

Small safe primes were used for the last three but can really be any value that is not 0 or 1.

Comparing numbers to Null, Empty, Ignored

No DLEQ proof is needed to check for this since the verifier gets to pick the bounds and in theory this could be done but isn't meaningful

Convert challenge hash to scalar value

Currently in the in spkgen and spkverify operations, the challenge is not appropriately being converted to its scalar representation, this needs to be rectified, issue originally raised during review of #84 see here

Terminology clarification

Clarify terminology for

Domain Parameters
Generators (and their relationship to domain parameters)
Generators with specific purposes (e.g message generators and blinding factor generators)

Consider renaming nonce in the proof to presentation message

As discussed in the WG call on the 24th of Jan, the usage of the term nonce was found to confuse reviewers of #21, upon discussing this the suggestion was to potentially rename this to presentation message as was the case with uProve. This is to convey the structure we are talking about may not be limited to being used as just a cryptographic nonce, instead it may be used to convey other information accompanying the proof generated

Adding the revealed messages to the chalenge to avoid forgery

The following is a method that could possibly allow the prover to reveal messages for which they don't have a signature, taking advantage of the Fiat–Shamir challenge not containing the revealed messages. Let the prover with a signature (A, e, s) on messages msg_1, msg_2 and msg_3, creating an spk revealing only msg_1. Going through the spkGen process the prover will calculate

C2_prover = D*(-r3~) +H0 * (s~) + H_2 * m~_2 + H_3 * m~_3

The proof value will be,

spk = (Abar, A’, D, c, e^, r2^, r3^, s^, m^_2, m^_3)

From the verification process we know that if T = P1 + H_1 * msg_1 then

#EQ1: T*c + D * (-r3^) + H0*s^ + H_2 * m^_2 + H_3 * m^_3 = C2_prover

Let the prover now calculate the following,

  1. c’ = c ^ -1 mod q
  2. msg_adv = m^_2 * c'

and instead of revealing spk, they send spk2,

spk2 = (Abar, A’, D, c, e^, r2^, r3^, s^, m^_3)

with revealed messages msg_1 and msg_adv.

The verifier receiving spk2 and msg_1, msg_adv will calculate C1_verifier, C2_verifier and check the pairing equality. Note that since Abar, A’, D, c, e^, r2^ remain the same between spk and spk2, C1_prover = C1_verifier and e(A', PK) = e(Abar, P2). To calculate C2_verifier the verifier will first calculate T2 as,

T2 = P1 + H_1 * msg_1 + H_2 * msg_adv = P1 + H_1 * msg_1 + H_2 * (m^_2 * c')

From that the verifier will calculate,

C2_verifier = T2*c + D * (-r3^) + H0*s^ + H_3 * m^_3 =
   = D * (-r3^) + H0*s^ + (P1 + H_1 * msg_1) * c +  H_2 * (m^_2 * c' * c)  H_3 * m^_3 =  
   = D * (-r3^) + H0*s^ + T * c +  H_2 * m^_2 +  H_3 * m^_3  = C2_prover

The last equality follows from #EQ1. As a result, the verification will succeed for msg_1 and msg_adv.

Obviously, this should not be possible, since this proves knowledge of a signature that the prover doesn't have (which contains msg_adv). If I'm correct on the above, most likely the only issue is with the Fiat–Shamir heuristic. Note that if the protocol was interactive, that method will not work, since the prover would have to "commit" to the messages prior of getting the challenge c. Hence we should also pass the revealed messages to the hash of the challenge during spkGen and spkVerify, to "commit" them before the challenge is created.

Supporting bound BBS signature usecases

As has been discussed before in related communities, the loose intended definition of bound BBS signatures refers to a signature that features a binding to a secret key (private key) possessed by the holder, which is required to be known by the holder in order to derive a proof using the signature. Effectively it is the introduction of a proof of key possession factor to a BBS signature which increases the assurances a relying party or verifier has when presented with a derived proof.

Mechanically this "feature" could be realised in multiple ways which affect the scope of this draft:

  1. Would be to leverage the blind sign functionality and treat the bound secret key as a blinded message.
  2. Would be to doing something more opinionated at the level of this draft introducing the concept of bound signatures and potentially removing the more flexible blind sign functionality.

Remove KeyGen definition

This may be defined as a property of the profile along with details of the pairing-friendly curve in use. For BLS curves it probably makes sense to reference the BLS Signatures IETF draft (hopefully that is getting an update?).

How to map data to be cryptographically signable

Describe the process for how to map abitrary data to be in a sign-able form.

The current way in which this is done is the data is hashed and the first 48 bytes are used, this is then reduced to the first sub-group order, mod Q.

In future there may be new options for this process:

  1. Hash the data (approach depended on #5) take the 64 byte output and then reduce the data to the first sub-group order, mod Q.
  2. Mapping integers - provides the basis for range proofs
  3. Mapping floating point numbers
  4. Sortable data, e.g anything that has lexiographical ordering

@mikelodder7 can provide further details

Supporting deterministic signatures

BBS Signatures as defined in academic literature are non-deterministic in nature. In general non-deterministic digital signatures can be difficult from an implementation perspective due to their dependency on a sound RNG. It causes complexities around confirming implementation correctness because the RNG often has to be mocked or seeded in a particular way. There is the potential to modify the signature construction in this draft to make signatures deterministic, @mikelodder7 could you share your proposal for this?

The main questions I think we as a WG should be asking are:

  • Do we want deterministic signatures?
  • Do we want to take on the scope of deterministic signatures in the current draft?
  • Given we are deviating from academic literature, what security proofs do we need?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.