Giter Club home page Giter Club logo

distaff's People

Contributors

bobbinth avatar nkeywal avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

distaff's Issues

Ideas for assembly instruction improvements

Some of the instructions below would require support for 2-parameter instructions.

Crypto instructions

  • digest.n to support hashing long sequences of values (as in #38)

Bit manipulations

  • bext.n.m to extract mth bit from an n-bit value. This could replace the current isodd. For example, bext.128.0 would extract the least significant bit of the field element.
  • btrunc.n.m to truncate an n-bit value to m bits. For example, btrunc.128.64 would truncate a field element to a 64-bit value.
  • bsplit.n.m to split an n-bit value into m-bit chunks. For example, bsplit.64.32 would split a 64-bit value into two 32-bit values.

Stack manipulation

  • Reverse roll instructions - e.g. uroll.4 to be the opposite of roll.4
  • More roll instructions - e.g. roll.3

Define elliptic curve for the VM

It would be great if the VM supported EC operations (point addition, doubling, multiplication). This would open the door for verifying EC signatures (e.g. Schnorr) within the VM as well as many other useful things.

The biggest challenge is that the VM operates in a 128-bit base field, but for acceptable security we need elliptic curves in ~256-bit field. One way to address this is by defining an elliptic curve in a quadratic extension of the base field. This would be similar to Microsoft's FourQ curve.

Currently, the base field is defined by prime modulus m = 2128 - 45 * 240 + 1. This can be changed if needed, but the modulus should comply with the following constraints:

  1. m should be around 128 bits in size (could be a bit less if needed - e.g. 126 bits).
  2. m = k * n + 1, where n is a power of 2 greater than 232. This ensures that the field has enough roots of unity for STARK operations.
  3. (m - 1) % 3 != 0, and also ideally, (m - 1) % 5 != 0. This ensures that the field can be used with algebraic hash functions (e.g. Rescue, Poseidon).

Add support for multi-path programs

Currently, a program is just a linear sequence of instructions. This makes writing programs with complex conditional logic challenging.

One way to address this is to change the definition of a program to be a set of alternative execution paths. In such a case, a commitment to a program would be a root of a Merkle tree where each leaf is a linear execution path.

A proof of execution for such a program would need to include a Merkle branch to the execution path which was taken during program execution.

Add mpath instruction with public index input

Currently, mpath assembly instruction accepts node index as secret input. There are scenarios when supplying node index as a public input is required. So, there should be an additional instructions (maybe pmpath) which would take the node index as public input.

Implement VERIFY operation

VERIFY instruction should be added to the instruction set.

  • Opcode: 00001000
  • Description: Execution fails if the top of the stack is not zero.
  • Effect on stack: stack state does not change.

Add ALT stack

Currently, all programs execute against a single stack. Adding an alt stack would enable:

  1. More expressive programs where a program could cache values on the alt stack
  2. A relatively simple way of implementing full random access memory (this would also require operation mirroring).

Adding of ALT stack (without mirroring) would require 2 new instructions:

  • TOALT - to move a value from the main stack to the alt stack
  • FROMALT - to move a value from the alt stack to the main stack

Impact on performance should be relatively small - maybe up to 5% increase in proof generation time. Proof size should remain un-affected for programs which don't make use of the alt stack.

Implement INV operation

INV instruction should be added to the instruction set.

  • Opcode: 00001001
  • Description: Pops the top item from the stack, computes a multiplicative inverse, and pushes the result back onto the stack.
  • Effect on stack: the top item of the stack is replaced with its multiplicative inverse. The rest of the stack remains unchanged.

This instruction will also enable division. But rather than having a dedicated division instruction, division can be computed as a sequence of 2 instructions: INV MUL

Add more boolean assembly instructions

Currently, the only boolean operation is not. It would be useful have or and and operations as well. It would probably make sense to add these in the core instruction set (not just assembly).

Implement hashing operations

Distaff VM should support hashing. Specifically, the following forms of hashing need to be supported:

  • b = hash(a) - basically just hashing of a single 256-bit value.
  • c = hash(a, b) - hashing of two 256-bit values into one 256-bit value. This will enable such use cases as Merkle proof verification.

Ideally, this would be achieved with a single instruction - e.g. HASH. The instruction could behave as follows:

Assuming stack registers are: s0, s1, s2, s3 . . ., where each register holds a 128-bit field element and s0 is the top of the stack, we can view a = (s3, s2) and b = (s1, s0), then:

  1. We pop top 4 values from the stack,
  2. Hash them,
  3. Push the result as 2 elements onto the stack.

Hashing of b = hash(a) can be emulated by setting the top two values of the stack to 0. This, in effect will mean that hash(a) = hash(a, 0).

However, we can't describe the entire hash operation as a single instruction of degree less than 6. So, the solution is to have the operation be a single round of hashing. For example, if HASH described a round of a hash function which requires 10 rounds, hashing of 2 values - e.g. hash(1, 2) may look like so:

PUSH 0, PUSH 1, PUSH 0, PUSH 2,
HASH, HASH, HASH, HASH, HASH, HASH, HASH, HASH, HASH, HASH, HASH,
DROP, DROP

Basically, we push our values onto the stack, then we apply hash round operation 10 times, then we drop 2 elements from the top of the stack (hash result is in the next 2 elements). Excluding PUSH operations, execution of the above sequence of instructions has the same effect as the idealized HASH instruction.

Based on this, the criteria for selecting a hash function is as follows:

  1. AIR for a single round should be an algebraic function of degree less than 6,
  2. Number of rounds needed to achieve 120-bits of security should be small.
  3. All rounds should be uniform - otherwise, we'd need more than 1 hashing instruction,

It seems like Rescue hash function is the best candidate. It meets 2 criteria out of the 3:

  1. AIR for describing a single round of Rescue can have degree 3 or 5,
  2. Number of rounds needed for 120-bits of security can be either 10 or 12

However, it does require special handling of the first and the last rounds. It may be possible to get rid of this special handling - though, security implications of this modification need to be better understood.

Other functions considered were Poseidon and GMiMC - but they both have a fairly large number of rounds, and Poseidon also has 2 different types of rounds (full round and partial round).

Introduce high-degree operations

Currently, all instructions are encoded using 5 bit representation. This has several implications:

  1. There can be at most 32 operations
  2. All operations must have degree 3 or less

However, most operations have degree 2 or less with HASHR and CMP being two exceptions.

We can expand the encoding of instructions to 7 bits, such that the upper 2 bits encode high-degree operations, while lower 5 bits encode low-degree operations. This would enable the following:

  1. The number of operations would increase to 34. Out of these:
    1. 2 operations would be of degree 6 or less
    2. 30 operations would be of degree 3 or less
    3. 2 operations: BEGIN and NOOP would span all 7 bits (would be shared between two groups of instructions)
  2. We could move HASHR operation into the high-degree group and increase S-Box degree to 5.
  3. We could move CMP operation to the high-degree group and simplify the constraints so that they don't need to rely on aux stack register.
  4. We would be able to use the 8th bit in all opcodes to enable instruction mirroring in the alt stack.

Refactor conditional execution

Preventing exponential explosion of execution paths due to conditional branching will require refactoring of program representation as described here.

Add support for value comparison

New instructions should be added to the VM to enable comparing two values (less than, greater than, etc.). This cannot be done as a single operation, but doing this in a cycle of 128 operations seems doable. This will probably require adding 2 - 3 new instructions to the instruction set.

Add assembly instruction for hashing long sequences of values

Current hash instructions allow hashing between 1 and 4 field elements. Hashing of more than 4 elements is possible but requires the user to manually calculate the number of hash instructions needed. Ideally, we should have a single digest.n instruction which can be used to hash arbitrarily many elements.

Add support for secret inputs

To support secret inputs a READ operation should be added to the instruction set.

The READ operation will read an secret input from the secret input tape and push it onto the stack.

The length of the secret input tape should not be limited - so, this will enable providing an unlimited number of secret inputs to a program.

Add support for random access memory

Currently, stack is the only place to store values. This is rather limiting, and can be improved by introducing random access memory. Adding RAM would require:

  1. Implementing ALT stack #19
  2. Limiting total constraint degree to 7. This can be achieved by splitting instructions into high-degree and low-degree instructions #18
  3. Implementing instruction mirroring
  4. Adding an extra register to store root of the RAM state
  5. Adding LOAD and STORE instructions to access RAM state register

Overall performance impact of the above changes would probably result in over 50% execution slowdown with instruction mirroring being the biggest driver.

Improve code organization

To improve code organization the repo should be split up into multiple crates. Potential crates are:

  • Math with all the functionality that currently lives under /src/math
  • Crypto for /src/crypto
  • Core to include core components e.g.: opcodes, constraint definitions, most of components from /src/programs etc.
  • Processor with most of the code currently under /src/processor but also including prover components from /src/stark
  • Verifier with verifier code and related components

Implement CHOOSE operations

The purpose of a CHOOSE operation is to select between two values based on a third value. Specifically:

Assuming stack registers are: s0, s1, s2, . . ., where each register holds a 128-bit field element and s0 is the top of the stack, CHOOSE operation will do the following:

  1. Pop s0, s1, s2 from the stack
  2. If s2 = 1, push s0 onto the stack,
  3. If s2 = 0, push s1 onto the stack.

Semantically, this is the same as popping 3 elements from the top of the stack and then pushing (s0 * s2) + (s1 * (1 - s2)) back onto the stack.

It also might make sense to add CHOOSE2 variation of this instruction. It will do the following:

  1. Pop top 5 elements from the stack,
  2. Compute (s0 * s4) + (s2 * (1 - s4)) and (s1 * s4) + (s3 * (1 - s4)) and push these values back onto the stack.

The effect of CHOOSE2 operation would be to select 2 out of the top 4 stack values based on the value of the 5th stack value. This well simplify Merkle branch verification use case.

Add support for macro operations

Currently, there are several operations which are meaningful only if executed as a part of a mini-program. These operations include CMP, BINACC, and HASHR. Knowing how to arrange these operations into mini-programs to achieve common tasks places an undue burden on users of the VM.

A better approach is to have a simple assembly-like set of instructions which may reduce either to a single operation or to a pre-defined sequence of operations.

For example, new operations could be HASH, GT, 'LT` etc. which would be reduce to a mini-programs upon execution.

Remove AUX stack register

Currently, several operations rely on existence of an AUX stack register. These operations are:

  1. EQ
  2. CMP
  3. BINACC

Removing AUX register would simply the design of the VM. The only downside is that EQ operation would require 2 cycles to execute.

Add support for bounded loops

Bounded loops could be a purely compile-time construct for Distaff assembly. The loops would get unfolded into repeated sequences of instruction. The syntax could look like so:

repeat.10
  a_0 ... a_n
end

where:

  • a0 . . . an are the instructions that define loop body.
  • 10 is a parameter that specifies how many times the body of the loop should be executed.

Implement multi-threaded proof generation

Currently, proof generation is done in a single thread. Prooving time can be sped-up significantly by taking advantage of multi-threading. The following areas could be parallelized (ranked in order from most to least impact):

  1. FFT evaluation / interpolation
  2. Constraint evaluation
  3. Merkle tree construction
  4. Everything else

Clean up machine instruction set

Candidates for addition:

  • CSWAP to maintain consistency with CSWAP2 instruction.

Candidates for removal:

  • CHOOSE2 operation can be emulated by using CSWAP2 operation.
  • CHOOSE operation can be emulated by using CSWAP operation.

Add isodd assembly instruction

The semantic of isodd instruction could be as follows:

Pop the top value from the stack, if it is odd, push 1 onto the stack. Otherwise, push 0 onto the stack.

The format for the instruction would be isodd.n where n is the max number of bits needed to represent the top value of the stack. If top value of the stack is > 2n, the operation will fail.

Add grinding to proof generation

Currently, query positions are derived pseudo-randomly from roots of Merkle trees. Instead, we should derive them for root || nonce where nonce meets a configurable PoW parameter. This will improve proof security without increasing proof size.

Might make sense to do this after #1 is addressed.

Change default security level to ~100 bits

Current default security level of the proofs is around 110 bits (not 120 as mentioned in the docs). Getting more security than this would require running FRI in a quadratic extension of the base field. This should be a long-term goal, but for now, capping security at 100 bits should be sufficient.

This will allow us to reduce proof size by at least 30% because:

  1. We can reduce number of queries in each proof from 50 to 40,
  2. We can reduce size of nodes in a Merkle tree from 32 bytes to 24 bytes.

Add support for unbounded loops

Adding unbounded loops will require refactoring of program representation as described here. Syntax for unbounded loops could look like so:

while.true
   a_0 . . . a_n
end

where:

  • a0 . . . an are the instructions that define loop body.

Loop body will be executed until the top of the stack is not 0.

Implement EQ operation

EQ instruction should be added to the instruction set.

  • Opcode: 00011100
  • Description: Pops two items from the top of the stack, compares them, and pushes 0 to the stack if the values are equal, or 1 otherwise.
  • Effect on stack: top of the stack is replaced with the result of the comparison. The rest of the stack is shifted up by one position.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.