anoma / alucard Goto Github PK

View Code? Open in Web Editor NEW

18.0 18.0 1.0 2.38 MB

A common lisp DSL for writing zero knowledge circuits

Home Page: https://anoma.github.io/alucard

Common Lisp 99.91% Makefile 0.09%

blockchain zkp zkproof

alucard's People

Contributors

Stargazers

Watchers

Forkers

sskav7

alucard's Issues

Add String and String Literals

We should support string and string literals

Strategies

Using Arrays

One way to compile strings is to have a byte array of ascii values, where each character is represented by their numerical value (E.G. #\D = 100).

packing by byte concatenation

Another interesting technique, if the array is inefficinet. Is to pack the string bitwise. E.G:

"df"
;; this is really
'(100 102)

;; We can now pack this into one elemnt by bit shifting the 100
;; and adding the 102 to the bitshifted value, concatenating them!
(cl:+ (ash 100 7) 102)
#b110010 01100110

However it needs to be noted that we should know the field size, and make sure we don't overflow this when shifting. Thus we should have a small array of these values if we do go over size

Array Compilation

preamble

This is a followup issue to #5. Issue #5 demonstrates an array compilation technique for strings. Namely we can encode a string into a field element, by simply concatenating the binary representation padded to the nearest byte. The code for this is quite simple and can be seen below

(defconstant +byte-size+ 8)

(-> string-to-number (string) integer)
(defun string-to-number (string)
  "converts a string to a numerical encoding"
  (let ((cont 0))
    (sum (map 'list
              (lambda (c)
                (prog1
                    (ash (char-code c) (* cont +byte-size+))
                  (incf cont (char-byte-size c))))
              string))))

(-> char-byte-size (character) fixnum)
(defun char-byte-size (char)
  "Calculates how many bytes is needed to model the current char"
  (ceiling (integer-length (char-code char))
           +byte-size+))

We can thus view strings as a subset of a proper array compilation technique

Encoding Technique

The string code shows a good strategy for encoding utf8 style strings, for arrays we can actually simplify the technique quite a bit as instead of working on values with variable size dimensons (我 has a size of 15 bits, thus 2 bytes to encode. 🤕 takes 3 bytes to encode proper.), we can work on a fixed sized type to properly encode.

Thus if we have an array of size arr we can enocde it as the following

(defun sequence-to-number (size arr)
  "converts an array to a numerical encoding"
  (sum (map 'list
            (lambda (ele count) (ash ele (* count size)))
            arr
            (alexandria:iota (length arr)))))

Since the field element may overflow, we require arrays to be of a fixed size, as if we overflow, we have to split it into multiple field elements.

An amusing fact is that we can grow the array in O(1) given no overflow by simply noting the size and adding by the appropriate value.

(For I8's we can only fit 30 field elements before overflow as (ash 2 (* 8 31)) = 250 bits. Thus we can't even use the full size of the array due to byte alignment).

Indexing technique

So far we've only dealt with constructing an array, but how do we index into it? We can do this with a carry bit along with a condition that returns 1 when we hit the condition we want

carry = (condition * value_at_index_i) + (1 - condition) * carry

[TODO write more about the condition value and how we can compute it]

Simplify The Language through removing custom binders

I thought of a really good trick that should simplify the alucard compiler quite a bit.

Currently I do some hacks to get around the fact I can't use CL's let. I have my own version def.

however what if, when I create a term of the language, instead of just creating it.

What if I gave each an unique object number, and thus when it gets duplicated

(let ((foo (+ 2 3)))
  (+ foo foo))

We can know the user really let it.

Thus I can have a pass that redoes the let with some basic analysis.

We can remove def from the language with this change in!

Note that we still have to emit to a real body as if a user writes

(progn
   (+ 2 3)
   (* 3 5))

the (+ 2 3) would get thrown away, if the (+ 2 3) does not emit, This means that we will likely lose information that our current strategy works with.

Tracking Types down the compiler

with #22 we have operations to introduce type variables to the typing context. We should use this for the rest of compilation to better compile addition mod some number.

Better Array Inference

With the advent of #22 I've left some holes, namely:

;; src;typechecker;typecheck.lisp
(-> annotate-term-no-binder
    (ir:term-type-manipulation typing-context)
    (values typing-result typing-context))
(defun annotate-term-no-binder (term context)
  "Annotating a term can either end up with the following results:

1. Error
  - If the system is in an invalid state with the binder, an error is
    thrown.

2. `hole-conditions'
  - If the system needs more information to fully determine the type a
    `hole-conditions' is returned.

3. `type-info'
  - If unification is completely successful, then we get back a
    `type-info'"
  (with-accessors ((holes holes) (info hole-info)
                   (dep dependency) (closure typing-closure))
      context
    (match-of ir:term-type-manipulation term
      ....
      ((ir:array-set :arr arr :value value)
       (let ((lookup-val (normal-form-to-type-info value context))
             (lookup-arr (normal-form-to-type-info-not-number-err arr context)))
         (dispatch-case ((lookup-arr lookup-type)
                         (lookup-val lookup-type))
           ;; We will leave some holes, as currently we force the
           ;; submission of the type with an array. This should
           ;; change, and we should be able to understand the length
           ;; of a hole
           ((hole type-info)
            (error "not implemented yet"))
           ((hole hole)
            (error "not implemented yet"))
          ...))))

These holes make it so the user has to give the type of an array when allocating it. If we can omit that information, and get it here then we can have the type of the array be inferred (though length can not be).

Example circuits to implement in Alucard

Simple Merkle tree proof verification
- Can be implemented with a list initially
Tendermint light client
- ed25519 verification (which will suck performance-wise, we should figure out how much it sucks)
- bls12-381 verification
- implement light client verification algorithm, also see this function
- we should start with a simple model, such as verifying a threshold multisignature over headers
  - circuit takes as public input a start set of keys and weights and an end set of keys and weights and some hashes (state DB Merkle roots) associated with heights
  - circuit takes as private input a list of headers and signatures
  - headers must be verified sequentially
  - each header can update the keys and weights and has a hash (state DB root) associated with the particular height
  - circuit should check
    - that all of the header transitions are valid, i.e. for each i from start to end,
    - the signatures for h_i on the signer set for h_i+1 are valid, if it changed
    - the signatures for h_i on the hash are valid
    - and that if the height is in the list of heights we want to verify the state DB root for, the root matches

rough initial sense:

type Multisig = [(PublicKey, Int)]
type Threshold = Int
type Height = Int
type Hash = Bytes32
type Header {
  multisig: Multisig
  height : Int
  dbStateRootHash: bytes32
}

valid : 
  // public
  Threshold -> 
  Multisig ->
  Header ->
  [(Height, Hash)] ->
  // private
  [(Header, [Signature])] ->
  boolean
valid threshold startingMultisig startingHeader rootsToValidate headers = 
  foldl (validateHeaderTransition rootsToValidate) startingHeader headers
  where validateHeaderTransition rootsToValidate header1 (header2, sigs) = 
                 validThresholdSig (multisig header1) (hash header2) sigs &&
                 (if (height header2) in rootsToValidate then (lookup rootsToValidate height == dbStateRootHash header2) else True)
             validThresholdSig multisig sigs = sum (filterMap ecverify (zip multisig sigs)) > threshold

if x then y else z`

(x && y || z)

if x then y else z : Int

(= 2 (if x then y else z : Int)) --> ((x && y = 2) || z == 2)

Validity predicates for Taiga
- check here for an idea of the inputs
- For assets
  - Allow mint or burn under specific conditions
- For users
  - Spending conditions (different keys for different amounts), intent conditions (private bartering)
    - e.g. key X for up to amount Y of asset Z, key X1 for up to amount Y1, etc.
    - e.g. allow spend of this note with amount X of asset Y iff. a note is minted to me with amount X1 of asset Y1

Properly handle loading code (unfreezing)

Currently I don't eval any code until It's time to extract. One runs the pipeline, and the function is transformed on the spot!

However it would be nice if code the user writes is ran right away all the way down to vampir and cached. However for testing purposes I'd prefer evals to freeze by default so I can run the pipeline myself manually to certain stages.

In order to accompany this i propose the following

(defparameter alu::*freeze* nil
   "determines if circuit definitions freeze by default or not")

this variable being set to true means that the circuit freezes as it does now, however if we set it to nil (which for the user it is by default!). Then the circuit will try to evaluate on the spot, warning the user if a function is missing. We should assume a function call is a function call and try to catch the warning with replacing it with an application call at the Alucard level.

If the type can be inferred then great, don't error, otherwise ask for more information.

This means we should also increase our caching system for new definitions. For circuits we define this is trivial, as we can setup a system that will notify the circuit on next dump if the cache needs to get renewed. For user defined functions this is also trivial, as we can make it also update this system.

However, if the user uses the base defun the situation is different as we can't determined if it's been changed, maybe with engineering even this too can be updated properly.

Thus we should define the following functions

(redefine-circuit-type new-type)
(redefine-function function-name)
(valid-cache? circuit-in-storage)
(recaluate-cache circuit-in-storage)

where redefine-circuit-type gets called if a defcircuit call changes type, as we must recommit type checking.

We can retype-check all locations now, and warn the user on type errors elsewhere on the system. Or simply delay it, while also giving the user a way to check what functions are affected..

redefine-function denotes that an inlined function has been redefined, and we need to re-run our generator. We can use this same feature for user redefined macros.

valid-cache? simply checks if the cache is still valid. A call to either of these two functions should trigger a nil from this call.

reavluate-cache just forces the re-evaluation of the body, whether we like it or not.

we may want to also add a force-recompile-all-calls to force the recaching of all functions which call the current one. We likely shouldn't do this by default as it can get quite expensive, and thus have users waiting at their listener on some very commonly used redefns

Preserving Line information and Adding comments with Mixins

Preamble

Currently I don't store any source information when I do my compiler transformations. What this means is that error messages are going to be quite ugly. Further it may be nice to generate out vamp-ir code with comments to describe were certain parts come from

Solution

I can get around this issue by having the saved syntax of a defcircuit wrap a (save-expression-as-current-emited-expression ...) over each top expression in a block. Further we will note the current circuit being defined to give even better information for debugging. save-expression-as-current-emitted-expression will note where the current expression for metadata saving purposes. This will be automatically applied to nodes as they emit to the real body. This can be nested with more save-expression-as-current-emited calls, allowing us to sneak call traces in there for more context when CL functions are used.

The storage of this information is not too bad, as we can use mixin classes to represent the extra source-information and comment functionality.

It's as simple as

(defclass source-mixin ()
   ((%source :initarg :source :reader :source)))

(defclass standalone-ret (source-mixin) ...)

(defmethod add-source ....)

Update Documentation for changes due to the Vamp-IR revamp

As Seen in The latest VAMP-IR commit, the syntax and capabilities of VAMP-IR has changed quite a bit.

In response I've made #38 comment 1 to note about how records should be handled in the future.

Further I should note the following in the documentation

with-constraint produces invalid vampir
- vampir will later add new primitives that should make this compile again
arrays do not work
- This is due to with-constraint being broken.

Line Number and Custom Backtrace lead

https://www.snellman.net/blog/archive/2007-12-19-pretty-sbcl-backtraces.html

This link seems to show me how to get line numbers from CL

Mark public parameters show up on the vampir dump

Currently I ignore public private parameters. However with little work we can get public parameters to functions to emit when we dump the entry point to the circuit.

Care has to be taken care not to have any other functions public parameters be marked as public when we compile down the circuit however.

We can either chose to inject this in the primitive-circuit phase of the compiler

;; src/pass/pass.lisp
(-> primtitve-circuit (spc:fully-expanded-list spc:circuit) spc:prim-circuit)
(defun primtitve-circuit (terms circuit)
  (~>> (spc:make-prim-circuit :name (spc:name circuit) :body terms)
       (fill-in-arguments circuit)
       (fill-in-output    circuit)))

or rather as special behavior for the main function to compile.

Make documentation for intended ways to test circuits

Leo has a good section on how to test given circuits. We too can give out good documentation on how to test circuits. With #6 we can just run circuits as if they were normal CL functions.

Thus we can rely on any CL testing framework to test our circuits.

Thus we should document a page showing off how to use 5am for our use case, and link to their documentation.

Thankfully we don't have to bake in any syntax for this, as we can rely on CL to do this sort of functionality for us.

Add Documentation Strings to defcircuit and deftype

Currently I don't add documentation strings to my defcircuit form, would be good if we had those.

Interpreter for circuits

It would be nice to have an interpeter for circuits, so we can submit code run the interpeter and give back data.

This way we can write tests over our custom gates and other such niceities.

Automatic behavior

If the current gate is invoked on a top level setting, we should run the interpreter and give back a result.

However if the current gate is being executed in a defcircuit or a defgate then we should not execute it, rather having the normal behavior of emitting a call.

I think this can be easy to do with a dynamic variable that marks that we are compiling within a circuit, so have normal calling behavior

Remove binders to void

With the last commit, explciit voids can be generated from standalone with-constraint.

This means that the code currently generates out a void with the following code

(defcircuit range-check ((private input int)
                         (output void))
  (with-constraint (b1 b0)
    (with-constraint (b2 b3)
      (= input (+ b1 b2)))))

ALU-TEST> (pipeline:pipeline (storage:lookup-function :range-check))
def range_check input {
  g71716 = b1 + b2
  input = g71716
  g71717 = input
  vg71718 = void
}

It would be good if we removed the void as it would be an improper binding.

This task should be simple, since we generate out a typing map, in the past we had

;; Update logic so that we can get inference on this.
(-> remove-void-bindings (ir:fully-expanded-list) ir:fully-expanded-list)
(defun remove-void-bindings (terms)

which would scan the syntax manually for applications to remove the bindings.

Thus this should be updated to use the typing map to check if a value is void and do the transformations then.

Light weight syntax for circuits, types, and publicity

We should have light weight syntax for defining circuits along with privacy and type information.

Currently we do everything in defcircuit but most functions are not the entry point and thus the publicity of the arguments go ignored.

Further it would be nice to declare types in a separate call if need be. Thus I propose a few abstractions

sig

I believe we should have a sig macro that updates the type signature of the circuit. This should not be redefined by the circuit unless it has explicit type signatures on the form.

Sig should look like

(sig constrain (-> nested bool))

where -> is the function syntax.

defgate

Defgate allows us to not include the privacy information when declaring a circuit. Like defcircuit, types are needed but they can be deferred to a sig call.

(sig constrain (-> nested bool))
(defgate constrain (nest)
  (def ((plane (plane nest))
        (time  (time nest)))
    (= (* (x plane)
          (y plane))
       (* (x time)
          (y time)))))

(defgate constrain ((nest nested) &out bool)
  (def ((plane (plane nest))
        (time  (time nest)))
    (= (* (x plane)
          (y plane))
       (* (x time)
          (y time)))))

public

We should be able to declare parameters public, making them private by default for defgate, if we wish for a defgate to be the real entry point to the circuit.

(public constrain nest)

Proper Array setting Interface

Currently the setf interface for array setting is quite imperative. which makes it clash with the compilation model, however @paulcadman suggested a more functional form like

(defcircuit array-creation-check ((output (array int 10)))
  (def ((bar  (array 10 (int 32)))
        (init (initalize (array 10 (int 32))
                         0 (check 23 (int 32))
                         1 25
                         2 30)))
    (+ 25 (get init 0) (get init 1))))

which will replace

(defcircuit array-creation-check ((output (array int 10)))
  (def ((foo (array 10 (int 32)))
        (bar (array 10 (int 32))))
    (setf (prld:get foo 0) (prld:check 23 (int 32)))
    (setf (prld:get foo 1) 25)
    (prld:+ 25 (prld:get foo 0) (prld:get foo 1))))

which gives arrays a much better interface that is true to the model

Refactor New Object Creation to Copy Instance

With the addition of mixins for meta information per node code like:

(defun return-last-binding (constraint-list)
  "This transforms the last let into a straight binding, since we aren't
going to be using a let"
  (and constraint-list
       (append (butlast constraint-list)
               (let ((term (car (last constraint-list))))
                 (cons term
                       (etypecase-of ir:expanded-term term
                         (ir:standalone-ret  nil)
                         (ir:bind            (list (ir:make-standalone-ret
                                                     :var (list (ir:var term)))))
                         (ir:bind-constraint (list (ir:make-standalone-ret
                                                     :var (ir:var term))))))))))

should have a way of not making a new construtctor with no information, but rather have a way to copy over meta information between the nodes, so that way when we error we can give proper messages

Redesign Record Processing

Currently I have two big pieces of logic in pass

relocation
expansion

Relocations job is to find the records which should be expanded away, and give a map and a closure related to do this task.

Expansion has the job of searching where things should be expanded and placing the proper information.

I think this model worked in the past before we had to consider packing data.

Now that we've had to deal with wanting packed and unpacked data, a much better model can be formed.

Use the closure as a context mapping of names.
- Thus it contains the following information
  1. the type
  2. the unpacked names
  3. the packed names
  4. What state the value is naturally in
  5. Requesting to generate the other state?
Many operations may care about the packed or unpacked status of a value

look into pretty printing for our data structures

Pretty pritning can be read about:

https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node253.html

https://gist.github.com/wobh/2c1a109e1e6dc2cdbc28

It would serve us much better than my current pretty printing techniques

Type format for CL.

It would be nice if we could request type information from alucard terms in CL functions

(defun reshape (array new-element-type) 
  (with-requested-type array array-type
    (match array-type
      ((array length type)
       (coerce array
              (array
               (* length
                  (cl:- (element-size element-type) (element-size type)))
               new-element-type)))))

with that this code can be made. Meaning that we can offer a more safe way to reshape an array form CL. rather than just doing the math in our head.

Gensym on All let bindings

If a user writes a function like

(defun square-root-func (p)
  (def ((with-constraint (x)
          (= p (* x x))))
    x))

Then procedes to use it in code that has x in it

(def ((x 5))
   (square-root-func x))

This will end up with unhygenic binding conflicts.

There is a very easy solution to this problem, gensym even on defs and with-constraints.

References still get used properly, as we generate out a let above the function, and thus lexical scoping will be obeyed if we do this.

Remove Excess let bindings

Prologue

Currently when one compiles code like

(defcircuit constrain ((public nest nested)
                       (output void))
  (def ((plane (plane nest))
        (time  (time nest)))
    (= (* (x plane)
          (y plane))
       (+ (x time)
          (y time)))))

One gets an output like

def constrain nest_plane_x nest_plane_y nest_time_x nest_time_y {
   plane_plane_x = nest_plane_x
   plane_plane_y = nest_plane_y
   time_time_x = nest_time_x
   time_time_y = nest_time_y
   vg56299 = plane_plane_x
   vg56300 = plane_plane_y
   vg56301 = vg56299 * vg56300
   vg56302 = time_time_x
   vg56303 = time_time_y
   vg56304 = vg56302 + vg56303
   vg56301 = vg56304
   vg56305 = vg56301
 }

Which we can see that vg56300 is really just plane_plane_y which is really nest_plane_y. 8 of the 12 instructions are just excess references that have no use.

Since we are just interested in references refering to other references, I think it would be best we ran this any time after record-expansion

Implementation Spot

Thus it should go at the end of

(defun to-expand-away-records (circuit)
  (~> circuit
      pass:linearize
      (pass:expand-away-records circuit)
      pass:remove-void-bindings))

which means we are operating on

(deftype fully-expanded-term ()
  "A fully expanded term is a `expanded-term' with the records part
removed."
  `(or spc:term-normal-form
       spc:application
       bind
       multiple-bind
       multi-ret
       ret))

Which means that we just need to deal with the binders which is bind and below (multiple-bind multi-ret ret).

Pass description

Now a good way to go about this is doing 3 concrete paths

The first pass should just fill a closure from the redundant bind (a bind with a name pointing to a reference) to the name they really reflect. Some care is needed to make vg56300 to point to nest-plane-y
Have a pass that replaces the old refernece (E.G. vg56300) with the new one (E.G. nest-plane-y) every place it occurs
Remove the old bind from the list, removing any dead let.

Typing primitives

Alucard as a language is quite limited, further the type system is also quite limited. Due to this I'd like to introduce the folloiwng primitives

(type-of foo int)
- this states the value foo is of type int, and that we should check if this is consistent
(assume foo int)
- This states the value foo is of type int and we are to just assume this as fact
(coerce foo int)
- This state that the value foo will now be considered in the format of int

These three have distinct roles. 1. will serve as :: in haskell, 2. will serve as a non checked version of this. And finally 3. will allow us to convert types between each other.

1. and 2. are quite easy to implement as they just tell the type checker of types that can be propagated and checked against.

3. on the other hand can be more troublesome to compile. This is because if we consider the situation where we coerce an integer into a record or vice versa.

For records in particular this is difficult, as records get desugared into multiple values with no packing. Thus to say an integer is a record or a record is an integer would incur a packing cost where we have to shuffle the data to fit the layout of the target data types.

This kind of situation will happen often as to get quick array iteration we will have to resort to this kind of trick [1].

Thus I propse a pass after type checking that looks at these coercesions and determines if any packing or unpacking needs to occur. If no shifting of data is required, we can silently just turn the coerce into a let binding to get removed later. If not we can expand it to the packing or unpacking logic and have the system handle it from there.

[1] Since Alucard's type system distinguishes between sizes of arrays and we lack any sort of depdendent typing, we can't have our recursive primitives work on an array that is decreasing in size. That would be invalid type wise. Thus for efficient array lookup, we'll likely convert the array to an int, and mod off values. Meaning that we'll then coerce the modded off value to the original type the array was composed of.

Add a global scope.

If a user writes some code like

(deflex custom-transfer
    (transfer :from-address #xFFFFFFFFF
              :to-address   #x111111111
              :amount       250))

at the top level, then they will just recieve a reference to a let binding that does not exist.

It would be good if when these functions are run without a proper circuit body to catch them that they are added to a global scope that can be compiled to VAMP-IR.

We can run dead code elimination if this costs performance.

Radically Change compilation strategy to support native feeling lists/arrays

Preamble

The current model of Alucard works well enough given the current data structures and constraints we face (user defined structures, and field elements.), however there are serious issue with integration with native Types. For example we want to define a list type that allows the user of alucard to return a list of n values. these lists should be reflected as a real list, as what this will allow us to do is

(defcircuit reveal-bit ((private preimage (list 16 int32))
                        (public  bit-num  (int 9))
                        (output (tuple (list 8 int32)
                                       bool)))
  (def ((new-image
         ;; mapcan will flatten the list given back, thus it's `flatmap'
         ;; this will be 512 boolean bits, since
         ;; 16 * 32 = 512
         (mapcan #'int32-to-bit-list preimage)))
      (tuple (sha256 (subseq preimage 0 8)
                     (subseq preimage 8))
             (nth bit-num new-image))))

However with my current method this is not possible, due to the mapcan getting a literal application node, rather than the result type of a list of.

Implementation Plan

Thus I propose that when we define a circuit we freeze the circuit. We then eval it at circuit compilation time. During this eval. the circuit functions will push the fact that they were called. This forms the real body.

the circuit functions will also return a value, so that normal CL functions can operate over them. Thus I propose the following return schema:

field ---> field

struct --> struct

;; functions with a list return type will emit a let of the names that will go here
list     ---> (list return-1 ... return-n)

Basically all types will act the same as they do now, but lists will expand into a CL list, so we can call function literals easily enough.

For native functions this is no issue, and will be compiled out in the ANF step.

We will store the results of the entire body in the true-body slot of a circuit. This has the benefit of that we can retry compilation from the same users code if a differnet component updates (as we save the original code!)

Downside

Non Oblivious Arrays or How Indexing ruins this strategy (Big issue)

Runtime operations of arrays seem quite difficult. Mainly if we look at (nth bit-num new-image), the value of bit-num is a runtime value of the circuit. This creates massive issues where the array doesn't exist at runtime, yet we have to index a random element out of it. Circ calls this Oblivious Array Elimination [1].

This goes to imply that operations like subseq are also out when values are field elements at run time.

If these limitations are fine, then the strategy above should work, however if not then, I think more thought should be put into how we should compile lists/arrays. And once that is done, we can see how much of an impedance mismatch there is between the meta language (CL) and alucard. Hopefully whatever we come up with allows us to reuse the host language's lambda and treat it either as a native type, or a type that we can implement the same abstractions over

Representaiton (potentinal issue)

I'm not sure how this structure would turn the 512 bitwise list into what I want, however I think what I posted is optimal

def bits[4] a -> a3 a2 a1 a0 {
    bool a3
    bool a2
    bool a1
    bool a0
    a = 8 * a3 + 4 * a2 + 2 * a1 + a0
}

is code from vampir that shows getting 4 bits from a. Thus this should emit optimal code, as vamp-ir will worry about turning the 512 values into an effective relation

References

[1] https://eprint.iacr.org/2020/1586.pdf

Proper Constraint Compilation

Currently Alucard does not express constraints very well in vampir. Namely in our def macro we only let the user bind one value, and the value can't even show up in the body it is defined over.

This is rather unforutnate as many basic techniques rely on this (see ###Example).

Since this is a fundimental operation in vampir, we should be able to properly support it. Further I'd argue that constraints of this nature, work much like pattern matching.

In pattern matching we often have forms like

match foo with
| Cons x1 x2 -> code-with-x1-and-x2
| [        ] -> empty-case

With circuits it's much the same, just instead of data structures, we can do this over numerical operations. Making it much like prolog (see ###Example for the Prolog example).

match x3 with
| 100 * x1 + x2 -> code-with-x1-and-x2

Syntax Additions

We should be able to support this form

(defcircuit foo ((public x₃ int8)
                 (output tuple))
  ;; def can be used as before
  ;; we just augment it with multiple bindings
  (def ((foo   (* x₃ 5))
        (x₂ x₁ (= x₃ (+ (* x₂ (expt 2 8)) x₁))))
    (tuple :fst x₁ :snd x₂)))

where our def form binds x₁ and x₂ and uses them in the matched equation .

Example

For example array lookup compilation may look like

array = div * 2 ^(8 * 2) + rem
div = unused * 2^8 + ans
ans

where we have some array of i8's and try to extract out the third element, by divmodding away the values until we are left with the ans

Example:

For simplicity we will work with a base 10 number, and try to get the 3rd value in this array.

x₃ = 131459

x₃ = 100 * x₁ + x₂
x₁ 1314 x₂ = 59

x₁ = 10 * x₅ + x₆
x₅ = 131 x₆ = 4

x₆

To furhter my previous point about pattern matching, consider prolog

| ?- X = 131459, X #= (100 * Y) + Z, Y #= Foo * 10 + Ans, Ans = 4.

Ans = 4
Foo = _#80(0..131)
X = 131459
Y = _#21(4..1314)
Z = _#38(59..131059)

yes

Prolog does not get the constraints like a circuit (mainly due to me not having used Prolog before), however it does show this kind of pattern matching is used in the wider world.

Fix returning a constant

When writing the imperative example, I noticed a bug

(defcircuit imperative-example ((output int))
  (let ((accumulator 0))
    (loop for i from 0 to 9
          do (setf accumulator (+ i accumulator)))
    0))

If we return an explicit number

we get an error

The value
  0
is not of type
  SYMBOL
when binding PASS::SYMB
   [Condition of type TYPE-ERROR]

Restarts:
 0: [RETRY] Retry SLY mREPL evaluation request.
 1: [*ABORT] Return to SLY's top level.
 2: [ABORT] abort thread (#<THREAD "sly-channel-1-mrepl-remote-1" RUNNING {1001BB8073}>)

Backtrace:
 0: (PASS::RENAMING-SCHEME 0) [external]
 1: (PASS:RENAME-PRIMITIVE-CIRCUIT #<ALU.IR.PRIMITIVE-GLOBAL:PRIM-CIRCUIT IMPERATIVE-EXAMPLE -> 0 = ..)
 2: (ALU.PIPELINE.PIPELINE:TO-VAMPIR Circuit IMPERATIVE-EXAMPLE = ..)
 3: (ALU.PIPELINE.PIPELINE:PIPELINE Circuit IMPERATIVE-EXAMPLE = ..)
 4: (SB-INT:SIMPLE-EVAL-IN-LEXENV (VAMPIR IMPERATIVE-EXAMPLE) #<NULL-LEXENV>)
 5: (EVAL (VAMPIR IMPERATIVE-EXAMPLE))

; Returning var 0 of frame 1
#<ALU.IR.PRIMITIVE-GLOBAL:PRIM-CIRCUIT IMPERATIVE-EXAMPLE -> 0 =
(#<LET G1441 = #<#<REFERENCE +> 0 0>>
 #<LET G1442 = #<#<REFERENCE +> 1 #<REFERENCE G1441>>>
 #<LET G1443 = #<#<REFERENCE +> 2 #<REFERENCE G1442>>>
 #<LET G1444 = #<#<REFERENCE +> 3 #<REFERENCE G1443>>>
 #<LET G1445 = #<#<REFERENCE +> 4 #<REFERENCE G1444>>>
 #<LET G1446 = #<#<REFERENCE +> 5 #<REFERENCE G1445>>>
 #<LET G1447 = #<#<REFERENCE +> 6 #<REFERENCE G1446>>>
 #<LET G1448 = #<#<REFERENCE +> 7 #<REFERENCE G1447>>>
 #<LET G1449 = #<#<REFERENCE +> 8 #<REFERENCE G1448>>>
 #<LET G1450 = #<#<REFERENCE +> 9 #<REFERENCE G1449>>> #<STANDALONE-RET 0>) : (0)>

It seems the 0 doesn't get renamed for some reason, should be an easy fix

Pipeline Object

Would be nice to use a standard pipeline object that I can pass around information along with a body

`constant` and `compile-time` argument types

Idea

It would be great to support vamp-ir's [] abstraction

// k-bit xor and
def xor[k] x y -> z { ... }
def and[k] x y -> z { ... }

where [k] means the xor function takes a constant integer to compile with. For Alucard, this would ideally be inferred, however we can explicitly declare it with the constant field in the circuit signature.

Further I think as we wish to write more complex circuits a compile-time will allow us to serve a circuit that is paramatarized over a common lisp value, that we must supply when extracting.

This can be very useful for testing certain behaviors when extracting or for grabbing round data from disk and supplying them before generating out the circuit.

For example

(deftype point ()
  (x int)
  (y int))

(deftype nested ()
  (plane point)
  (time  point))

(defcircuit root-test ((public root int)
                       (constant field))
  (= (bit-or field (* 3 root) 214)
     0))

(defcircuit record-test ((public  root int)
                         (private planer nested)
                         (constant bit-size)
                         (compile-time hash-round-data)
                         (output nested))
  ;; apply is a normal CL function
  (apply #'and
         (bit-xor bit-size root (x (plane planer)))
         ;; mapcar is also a CL function, the rest are all alucard
         ;; functions/values!
         (mapcar (lambda (constant-from-disk)
                   (root-test constant-from-disk root))
                 hash-round-data)))

Note :: at the bottom of this issue we discuss alternative syntax to compile-time.

could be an example of using constant to setup a root-test. And a compile-time hash-round-data to read hash round constants off disk to plant them in the circuit, which we then require to hold true by the and that is called upon the given equations.

Compilation Technique

For constant not much has to change, simply the vamp-ir layer needs to be notified of this, and we need to make sure that we can't place an intermediate value in the place where a constant value is taken.

Some shuffling will have to be done if constant are not the first argument due to how vamp-ir holds it's arguments, but that is less of an issue.

For compile-time a bit more will have to change. As our storage mechanism will have to be either a constraint as it is now, or a CL lambda which evaluates to a constraint. This means that we have to be careful during extraction, and extract with only the compile-time arguments given.

This makes the given syntax misleading as we have to supply only the compile-time values before extracting.

However once that issue is solved, all that has to be done is to make sure each circuit with compile-time arguments are handed them before running the pipeline

Syntax Challanges

This makes the proposed syntax misleading, as what we are doing is akin to

(defun generate (hash-round-data)
  (defcircuit record-test ((public  root int)
                           (private planer nested)
                           (constant bit-size)
                           (output nested))
    (apply #'and
           (bit-xor bit-size root (x (plane planer)))
           (mapcar (lambda (constant-from-disk)
                     (root-test constant-from-disk root))
                   hash-round-data))))

(generate (read-from-disk ....))

which would take no infrastructure change on my side to support (In fact this pattern works currently, but sadly redefines what the circuit is on every generate call...).

I'm unsure of how to effectively communicate this with the current diesgn, maybe something along the lines of a let over the cirucit would be better?

(def-compile-time (hash-round-data)
    (defcircuit record-test ((public  root int)
                             (private planer nested)
                             (constant bit-size)
                             (output nested))
      (apply #'and
             (bit-xor bit-size root (x (plane planer)))
             (mapcar (lambda (constant-from-disk)
                       (root-test constant-from-disk root))
                     hash-round-data))))

Propagating static type information for meta language compilation

Currently no propagation of static type information for Common lisp is needed for proper integration. However, with the addition of arrays, common lisp will need to know the length of the arrays before generating any constraints.

Due to this we need a solid plan to compile type information on the inputs the HOF take.

We can do this strategy in a few ways.

Run a type checker early for bindings

We can run a type checker over the base terms, after doing so, we can add this information to the

anoma / alucard Goto Github PK

alucard's People

Contributors

Stargazers

Watchers

Forkers

alucard's Issues

Strategies

Using Arrays

packing by byte concatenation

preamble

Encoding Technique

Indexing technique

Preamble

Solution

Automatic behavior

sig

defgate

public

Prologue

Implementation Spot

Pass description

Preamble

Implementation Plan

Downside

Non Oblivious Arrays or How Indexing ruins this strategy (Big issue)

Representaiton (potentinal issue)

References

Syntax Additions

Example

Idea

Compilation Technique

Syntax Challanges

Run a type checker early for bindings

Recommend Projects

Recommend Topics

Recommend Org