
redtt's Issues

Rationalize and refactor closures, machine states, etc.

I have resolved the contradiction between closures and computing with systems by embedding in each closure a control stack which I can edit. In this sense, I am not really using closures anymore so much as suspended machine configurations. This idea should scale up to the really hard cases, like computing coe in V types, etc.; the key insight is that each kind of "edit" that needs to be done in the subterms can be added to the semantic domain as a new kind of stack frame. This is a kind of defunctionalization of the naive algorithm.

But currently it is a little messy. I need to rationalize and unify the structure...

A final test of this idea will be to switch from the "dimension families" (a Böhm-tree-style representation of semantic dimension binders) to a syntactic representation using these suspended machine states. If I can code everything that I have done in the Böhm-tree style this way, that will validate the idea. The suspended machine state formulation is preferable to the Böhm trees because it is purely syntactic, and hence easier to justify (and, potentially, easier to optimize).
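A minimal sketch of the shape in mind, with all names hypothetical: each kind of "edit" becomes a frame constructor, and a closure is a term plus an environment plus a suspended control stack.

type tm   (* core terms, abstract for this sketch *)
type dim = Dim0 | Dim1 | DimVar of int

(* one constructor per kind of "edit" that must be performed in subterms *)
type frame =
  | KRestrict of dim * dim     (* evaluate under an assumed r = r' *)
  | KCoeFace of dim * dim      (* hypothetical: the edit arising from coe in V types *)

(* not really a closure anymore: a suspended machine configuration *)
type clo =
  { body : tm
  ; env : env
  ; stk : frame list }
and env = value list
and value =
  | VLam of clo
  | VNeu of int                (* placeholder for neutral values *)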

Add kinds (pre, kan) to start

It will be very useful to add kinds like pre and kan now, so that I can start using things like the real restriction types, which are generally only pretypes.

support metavariables

  • some notion of metacontext (for now, perhaps just use global state, since the typechecker does not need to locally change the metacontext)
  • typecheck metavariables and their suspended substitutions (expensive, but unavoidable)
  • evaluation and quotation for the suspended substitutions (gnarly)
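For the first point, a minimal sketch of a global metacontext (names and representation hypothetical; tm stands in for the core term type):

type tm   (* core term type, abstract for this sketch *)

type meta_entry =
  { ty : tm            (* the type of the metavariable *)
  ; sol : tm option }  (* None while the hole is unsolved *)

(* global state, keyed by metavariable name *)
let metactx : (string, meta_entry) Hashtbl.t = Hashtbl.create 32

let declare_meta name ty =
  Hashtbl.replace metactx name {ty; sol = None}

let solve_meta name tm =
  match Hashtbl.find_opt metactx name with
  | Some entry -> Hashtbl.replace metactx name {entry with sol = Some tm}
  | None -> failwith ("unknown metavariable: " ^ name)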

Once this is in place, it will be possible to unleash the elaborator (#16) and develop a usable source language. I have so many ideas I want to try out!

Fix extension types

@cangiuli and @mortberg and I discovered today that the extension types are not Kan. We have an alternative version that is Kan, and which more faithfully generalizes the path types:

  1. We need an n-ary extension type (# [i ...] A [xi N...])
  2. To be Kan, we need xi... to be valid with respect to just i... (i.e., it cannot use dimensions from Γ).
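Roughly, the intended formation rule would be something like the following (a sketch only, eliding the adjacency conditions among the N's):

   Γ, i ... ⊢ A type     ξ ... mentions only i ...     Γ, i ..., ξ_k ⊢ N_k : A   (for each k)
 ----------------------------------------------------------------------------------------------
                            Γ ⊢ (# [i ...] A [ξ N ...]) type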

Parse errors?

@freebroccolo When the parser fails, I'd like it to emit some minimal message that tells me what went wrong; it doesn't need to be a good message (for instance, I don't yet have the bandwidth to develop fancy custom error messages like you seem to have done in yggdrasil). Ideally, it would just be a kind of crappy message like you get by default with these parser generators.

Do you know how to enable that behavior? There was some commented-out code in Parse.ml, but I didn't really understand it.
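For reference, a minimal sketch of the default behavior one can get from menhir + ocamllex (the module names Grammar and Lex are hypothetical; the generated parser raises an Error exception on failure):

let parse_channel chan =
  let lexbuf = Lexing.from_channel chan in
  try Grammar.prog Lex.token lexbuf with
  | Grammar.Error ->
    (* assumes the lexer calls Lexing.new_line, so that pos_lnum is accurate *)
    let pos = lexbuf.Lexing.lex_curr_p in
    Printf.eprintf "Parse error at line %d, column %d\n"
      pos.Lexing.pos_lnum
      (pos.Lexing.pos_cnum - pos.Lexing.pos_bol);
    exit 1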

Notation for composition and filler

A frequent occurrence in yacctt and RedPRL is that we need to construct a composite and its filler in close proximity, sometimes with many complex tube faces, and often with reference to many ambient (dimension and term) variables. Here's some code from yacctt:

let ctr : fiber VB B projB b =
      (Vin i (w.2.2 b).1.1
             (hcom 1->0 B [(i=1) -> <_> b
                          ,(i=0) -> (w.2.2 b).1.2] b)
      ,<j> hcom 1->j B [(i=1) -> <_> b
                       ,(i=0) -> (w.2.2 b).1.2] b)

One might imagine let-binding the filler and using the filler @ 0 for the composite. The problem is that we need to give the filler an iterated path type -- its faces are relevant, so line types do not suffice -- and we need to explicitly specify its 0 side as the composite. Ultimately we would write each tube face and cap three times, so it's actually more ergonomic to copy-paste the hcom itself.

I suggest adding the following notation:

let (hc,hf) : B ~= hcom 1~>0 B [i=0 -> foo , i=1 -> bar] baz

which would elaborate into

let hf = <z> hcom 1~>z B [i=0 -> foo , i=1 -> bar] baz
       : (z : dim) -> B [i=0 -> foo @ z , i=1 -> bar @ z , z=1 -> baz]
    hc = hf @ 0
       : B [i=0 -> foo @ 0 , i=1 -> bar @ 0]

where B [ i=0 -> foo ] is a restriction type. (Note that this can't be an extension type because i is ambient.) (Please suggest better concrete notation.)

Fix neutral quotation

Neutral quotation isn't quite right: it seems to need to be able to return a checkable term, but then the inductive cases don't exactly work out.

I want to somehow move the complexity into reflection, and I'm not yet sure how to do it.

Replace restriction types with extension types :)

Had a meeting with @cangiuli in which we thought for a while about how to make the singleton-style restriction types Kan, and as far as we can tell, there is no useful definition of the restriction types which would be Kan.

Our new idea is to replace these with extension types; extension types are types of the form "(i : dim) -> A [ sys ]". We would require that the system be "valid" or "tautological" in @favonia's words.

This idea would replace the decomposition of the path type into pi + extension with a single connective that combines the abstraction over the dimension and the restriction. Combined with line types (in the style of RedPRL), we'd have a language to express the specification of higher cubes by their boundaries.

The part of evaluation where the boundary is projected would now move from eta expansion into semantic application. In practice, this would not look that different from what I had proposed before, except that by "restricting" the places where you are allowed to place restrictions (essentially, to the codomains of certain pi types), we make it possible/plausible to build the appropriate Kan structure.

Consider hereditary substitutions

I wonder if all this would be easier using hereditary substitutions and a purely syntactic approach, rather than the NbE machinery.

I'm not exactly sure, but I want to consider it as an option (because it came up while I was implementing the unifier).

Sketch of a source language

I want to immediately unleash a source language, which will at first not have many features; in fact, almost everything will be done at the beginning by just writing down a full proof term (with no holes!). As we improve the elaborator and the unifier, we will be able to incrementally support more and more forms of interactive programming.

DECL ::=
 | theorem foo (x : E_TERM) ... : E_TERM    (* declare a problem / goal *)
 | refine foo x ... GADGET                  (* refine the goal using a "gadget" *)
 | done                                     (* check that all holes are filled in *)

GADGET ::=
 | => E_TERM  (* return a proof *)

E_TERM ::=
 | ?foo                   (* hole *)
 | fun x -> E_TERM        (* lambda abstraction *)
 | < E_TERM, E_TERM >     (* pair *)
 | `M                     (* embed a real proof term M *)

M ::=
 | (lam [x] M)
 | (cons M M)
 (* ... *)

Later on, we will extend the grammar of GADGET and E_TERM (especially the latter) to include more programming facilities. GADGET will ultimately be extended in the Epigram style with a way to interactively program with eliminators, whereas E_TERM will be extended with constructs analogous to those of cubical type theory.

I think that the sketch above is implementable almost right away, and will let us start to get off the ground without requiring us to solve every problem at once.
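A hypothetical OCaml rendering of the grammar above, just to show that it is directly implementable (constructor names are made up; tm stands in for the core language M):

type tm   (* the core proof-term language M, abstract for this sketch *)

type eterm =
  | Hole of string                 (* ?foo *)
  | Fun of string * eterm          (* fun x -> e *)
  | Pair of eterm * eterm          (* < e0, e1 > *)
  | Quote of tm                    (* `M *)

type gadget =
  | Ret of eterm                   (* => e *)

type decl =
  | Theorem of string * (string * eterm) list * eterm   (* theorem foo (x : E) ... : E *)
  | Refine of string * string list * gadget             (* refine foo x ... GADGET *)
  | Done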

Conveniences for shifting under binders in dev calculus

During elaboration, we have a pattern of the following kind:

claim "foo" ty0 begin
  claim "bar" ty1 begin
    fill_hole @@
     some term mentioning 'var 0' and 'var 1',
     which are the things introduced by the claim binders
  end
end

What would be sweet is if we could write instead:

claim "foo" ty0 @@ fun ref0 ->
claim "bar" ty1 @@ fun ref1 ->
shift ref0 >>= fun tm0 ->
shift ref1 >>= fun tm1 ->
(* use tm0, tm1 *)

The idea is that in the former, we have to do a lot of de Bruijn arithmetic, which is not so nice. But we clearly have enough information to have this done automatically for us, since the monad controls every move under a binder. So, we could instead keep a map of "references" to terms in the proof state, along with the binding depth at which they were created; then, when we want to use one (with shift), we just apply a shift corresponding to the difference between the current binding level and the original binding level.

This would make programming elaborators a lot simpler.
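A minimal sketch of that bookkeeping (names hypothetical; in the real thing the current depth would be tracked by the proof-state monad):

type reference = { ix : int; depth : int }   (* de Bruijn index + depth at creation *)

type tm = Var of int   (* stand-in for the dev-calculus term type *)

(* [weaken n tm] shifts free indices by [n]; trivial for this stand-in *)
let weaken n (Var i) = Var (i + n)

(* shift a reference up to the current binding level before using it *)
let shift ~current_depth {ix; depth} =
  weaken (current_depth - depth) (Var ix)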

Restriction pretype formation must check adjacency

Semantically / in the models, there is no condition on the restriction except well-typedness. However, with open terms, you can easily destroy definitional equivalence by making a false assumption. Consider the "doubleton" pretype (bool [0=0 tt] [0=0 ff]): if you have one of these fuckers in the context, then it is definitionally equal to both tt and ff (by the eta laws), so by transitivity, we have trivialized definitional equality.

This is very similar to the deviations that Conor McBride reported in his famous Ripley Cupboard.

By requiring the restriction to line up with itself / checking the adjacency conditions, we can avoid this.

I noticed this when I realized I could define an exact equality type using restriction types which appeared to make equality reflection derivable (LOL).

add support for comments

I prefer C++-style notation for comments, as in RedPRL: // for single-line comments, and /* ... */ for block comments.

But I am open to any other ideas.

Elaborator, holes, etc.!

I want to have an elaborator that supports interactive development by refinement, using holes. Eventually we can do more fancy things like unification and top-level implicit arguments (in Epigram style) too (but later!).

One thing I have been wondering is whether it will be possible to keep the core term language completely ignorant of the user-supplied names, but somehow keep track of these in the elaboration environment.

[domain] Consider adding neutral hcom

It is strange that in our semantic domain we have "non-neutral" values that inhabit neutral semantic types. Nothing seems broken, but it doesn't seem right either. A solution is to have ncom and ncoe (for "neutral" composition and coercion), analogous to fcom.

Higher order pattern unification for terms

In order to implement refinement proofs in which you can see the required boundary at each step (such as when going under a path abstraction), it is necessary to be able to determine the new boundary of the subgoals.

For instance, in an easy case like lambda, the subgoal's boundary is just the application of the main goal's boundary to the bound variable. It works this way for anything negative, like sigma and extension.

But for positive things (including refining a Univ subgoal with Pi), it is harder. In the case of refining Univ by Pi, we have to project out the domain and codomain from each face of the boundary, each of which needs to be a pi type. In the general case, this amounts to pattern matching on the term in the boundary.

Metalanguage

It will likely be easier to fire up a "RedML" style metalanguage than it was in RedPRL. A very basic one of these could be a priority, as it would make it easier to fiddle with stuff (like, "print this definition", "evaluate this code", "quote this value", "normalize", etc.).

Define global context/signature

I need a global context/signature for top-level declarations and skolemized holes. Things in here should be given string names or something like that, so that we can freely add new declarations without needing to shift de Bruijn indices, etc.
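A minimal sketch of a string-keyed signature (representation and names hypothetical):

type tm   (* core terms, abstract for this sketch *)
type ty   (* core types, abstract for this sketch *)

type entry =
  | Def of ty * tm   (* a top-level definition *)
  | Hole of ty       (* a skolemized hole: declared, not yet solved *)

module Sig = Map.Make (String)

(* adding a declaration never forces a shift of existing terms *)
let declare name entry sg = Sig.add name entry sg
let lookup name sg = Sig.find_opt name sg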

Parameterize everything by information about which dimensions are identified

I need a substitution-free way to typecheck and evaluate and quote under a restriction / assumption of r=r'. I think one way to do it is to pass around some object that will provide comparisons of two dimensions. Then, in the evaluator when I compare two dimensions (which I currently do in a context-free way), I would just ask this object.

This object would be augmented when checking systems, and when checking elements of restriction types.

The first step is to update the evaluator with this information; I think that it will be desirable to include this information in the closures.

Then, the quotation apparatus can likewise be updated, as can the typechecker. For the quotation and typechecking modules, this information should be held within the context object.
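A minimal sketch of the comparison object (names hypothetical; a naive list of assumed equations here, not closed under transitivity -- a union-find would do better):

type dim = Dim0 | Dim1 | DimVar of string

module DimRel : sig
  type t
  val emp : t
  val restrict : dim -> dim -> t -> t    (* record an assumption r = r' *)
  val compare : t -> dim -> dim -> bool  (* are r and r' identified? *)
end = struct
  type t = (dim * dim) list
  let emp = []
  let restrict r r' rel = (r, r') :: rel
  let compare rel r r' =
    r = r' || List.exists (fun (a, b) -> (a, b) = (r, r') || (b, a) = (r, r')) rel
end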

coe in fcom

Why is this one missing from the issue list? Or did I miss something?

Pretty printer for values

Crucial not only for debugging: we will actually need this to compose certain error messages.

Labeled types a la Epigram

In Epigram, there is a notion of labeled type < lbl t ... : T >; it is just a wrapper or 'phantom' type around T, but it is extremely useful for programming, especially when programming with eliminators as we will need to do.

The idea is that you have a couple of rules like the following:

    T type    t_i => S_i...
-------------------------------
   < lbl t ... : T > <= type

     M <= T
----------------------------------
   return M <= <lbl t ... : T>

  F => <lbl t ... : T>
-----------------------------------
  call(F) => T

Then, when you are writing a program using an elimination rule, such as addition, instead of defining:

plus : (-> Nat Nat Nat)
plus x y = (elim [_] Nat y [z ih] (succ ih) x)

you define a version using the labeled type <plus x y : Nat>; the point of doing this is that if you are using holes, in your subgoals you can see (e.g.) that you are trying to implement <plus zero y : Nat>, and the induction hypothesis will actually say the name of the recursive call it stands for! You would write:

plus : (-> [x : Nat] [y : Nat] <plus x y : Nat>)
plus x y = (elim [z] (<plus z y : Nat>) (return y) [z ih] (return (succ (call ih))) x)

This seems a little more busy-looking, but it is great for interactive programming. And it is not so hard to elaborate to this form from a more visually appealing version, as Conor and his crew did in Epigram:

plus : (-> [x : Nat] [y : Nat] <plus x y : Nat>)
plus x y <= elim x {
  plus zero y => y;
  plus (succ x) y => succ (plus x y)
}

Note that this is not real pattern matching; there is no need for matching, termination checking, etc. What is really happening is the following:

  1. first, an induction motive is chosen; in this case it is automatic, because you are eliminating a variable.
  2. the first branch is the first case of the elimination rule; the left hand side is checking/asserting that the current goal is the labeled type <plus zero y : Nat>, and the right hand side is elaborated to return y.
  3. The lhs of the second branch is just checking that the current goal is <plus (succ x) y : Nat>, but the rhs is a bit trickier. If you look in the local context, there is actually some anonymous variable _ih : <plus x y : Nat>, which is the induction hypothesis; this means that you are allowed to make this "recursive call". In the source language, such a recursive call plus x y is translated into a goal to find in the context some hypothesis ih : <plus x y : Nat>. If we find such a hypothesis (which we do), then it elaborates to return (succ (call ih)).

More sophisticated derived elimination rules can be used to enable more flexible forms of "pattern matching"; for instance, something called a 'memo structure' can be used to ensure that you are allowed to make not only the immediate recursive call, but any structurally smaller recursive call. All this without needing special support in the type theory for termination checking, etc.

Locally nameless term representation

I determined that it will be too difficult to implement #56 and #16 without a locally nameless representation of syntax. The De Bruijn arithmetic that is necessary there owns you really bad.

Now, this is at first a shame, because to write a good typechecker and evaluator, you should do the syntax in a nameless style with de Bruijn indices everywhere---it's stupid to unbind and bind things in the typechecker and evaluator.

So, my thinking is that we would have a non-abstract locally nameless representation; that is, rather than hiding the unbinding under an abstract type, we would just allow you to choose to unbind a variable or not. So in the core (typechecker, nbe, etc.) we never unbind or bind, but in the elaborator, unification machinery etc., we would use that facility liberally.

This would generalize the Global x things that we have for accessing the global signature.
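A minimal sketch of the non-abstract version (constructor names hypothetical): bound variables are de Bruijn indices, free variables are names, and nothing forces the core to ever unbind.

type name = string

type tm =
  | Ix of int          (* bound variable: de Bruijn index *)
  | Free of name       (* free variable, e.g. a Global or an opened binder *)
  | Lam of tm          (* binds index 0 in the body *)
  | App of tm * tm

(* replace index [k] with the free name [x]; used by the elaborator, unifier, etc. *)
let rec open_tm k x =
  function
  | Ix i -> if i = k then Free x else Ix i
  | Free y -> Free y
  | Lam body -> Lam (open_tm (k + 1) x body)
  | App (t0, t1) -> App (open_tm k x t0, open_tm k x t1)

(* the inverse: re-abstract the free name [x] as index [k] *)
let rec close_tm k x =
  function
  | Ix i -> Ix i
  | Free y -> if y = x then Ix k else Free y
  | Lam body -> Lam (close_tm (k + 1) x body)
  | App (t0, t1) -> App (close_tm k x t0, close_tm k x t1)

let unbind x tm =
  match tm with
  | Lam body -> open_tm 0 x body
  | _ -> invalid_arg "unbind: expected a binder"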

Rework values

Trying to design a correct and non-silly eval/quote semantics for coercions has led me to the following design ideas:

  • the semantic values of type expressions should probably be whnfs: in particular, not only binders but really any subterm should be a closure
  • we need explicit substitutions in the term language

I believe that this is the easiest way to avoid interleaving evaluation and quotation in a very bad way...
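Concretely, something like the following shape (all names hypothetical):

(* terms carry explicit substitutions *)
type tm =
  | Var of int
  | Pi of tm * tm            (* the binder is in the second argument *)
  | Lam of tm
  | App of tm * tm
  | Sub of tm * sub          (* explicit substitution *)
and sub =
  | Id
  | Shift
  | Ext of sub * tm

(* values are whnfs whose subterms -- not just binders -- are closures *)
type clo = Clo of tm * env
and env = value list
and value =
  | VPi of clo * clo         (* even the domain stays a closure *)
  | VLam of clo
  | VNeu of int              (* placeholder for neutrals *)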

Subtyping rules for restriction types probably not sufficient

Right now, in the bidirectional style, subtyping fires only when you try to check an inferrable term against a type. But consider the case of applying the "function" f : (A -> B) [ xi :-> N ] to an element of type A. This does not invoke subtyping, since it is not at the mode switch, so the application would be ill-typed.

One way to make it well-typed is to wrap f in an annotation, like (f : A -> B). Then, we are really doing a double mode switch: we switch from inference to checking and back to inference again. In the middle, we invoked subtyping, so we are OK.

But this is not really what we want for a few reasons. First, it's dumb to have to put the annotation. Second, it causes the information from the restriction to be lost; we would like it to be the case that f(M) can be seen to have the restriction xi :-> N(M).

This is related to how in singleton types, you allow the singleton specification to float through all the connectives.
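For reference, a stripped-down sketch of where subtyping currently fires (names hypothetical): only at the single switch from inference to checking, which an application like f M never passes through on f's behalf.

type ty   (* semantic types, abstract for this sketch *)
type tm   (* terms, abstract for this sketch *)

let subtype : ty -> ty -> bool = fun _ _ -> failwith "sketch"
let infer : tm -> ty = fun _ -> failwith "sketch"

(* the one and only place subsumption happens today *)
let check tm expected =
  let inferred = infer tm in
  if subtype inferred expected then () else failwith "type error"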

@cangiuli Any thoughts?

More immortal decomposition of syntax (heads, spines, etc.)

I'm finding when writing the elaborator for interactive development that a more immortal decomposition of the syntactic terms would be preferable. Right now we have two sorts: chk and inf (for checkable and inferrable terms, respectively).

[Sidenote: Often these are referred to as "normal" and "neutral", but I prefer to regard those names as having meaning in the semantic domain, and not the syntax--indeed, in a language which allows redexes, the check-infer distinction does not really match the normal-neutral distinction.]

What I want is to have at least the following syntactic categories:

  1. chk for checkable terms. These are subject to a judgment like Γ+ |- {M} <= A+.

  2. head for inferrable terms that are not in left focus, but can be the target of left focus. This would include variables x, meta applications α[σ], coe, hcom, com, if, (M : A), etc. These are subject to a judgment like Γ+ |- {H} ~> A-.

  3. stk for stacks (or spines): these correspond to the moves taken during left focus. This includes things like [] (empty/identity stack), app N :: S, @r :: S, fst :: S, snd :: S, etc. These are subject to a left-focus style judgment, Γ+ [A+] |- {S} > C+, which says that the stack S takes the type A to C. For instance, we have Γ [A*B] |- {fst :: []} > B.

Observe that the stack is always in checking mode, and that the mode switch from bidirectional typechecking is deferred until you get to the end of the stack, where you must see that the type in focus is a subtype of the output type.

A head can be cut with a stack to create a checkable term.
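A hypothetical OCaml sketch of the three categories (constructor names made up):

type chk =
  | Lam of chk
  | Pair of chk * chk
  | Cut of head * frame list       (* a head cut with a stack is checkable *)

and head =
  | Var of int
  | Meta of string * sub           (* α[σ] *)
  | Ann of chk * chk               (* (M : A) *)
  | Coe of dim * dim * chk * chk   (* and similarly hcom, com, if, ... *)

and frame =                        (* one "move" of left focus *)
  | FunApp of chk                  (* app N *)
  | ExtApp of dim                  (* @ r *)
  | Fst
  | Snd

and sub = chk list                 (* stand-in for suspended substitutions *)
and dim = Dim0 | Dim1 | DimVar of int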

The benefit of this scheme is that it is more suitable for interactive/refinement-based development than the usual formulation.

Dump/prettyprint unifier/elaborator state

A prerequisite for learning how to write an elaborator using the Conor-style monad I implemented is being able to print out the full state. Then I can better understand what effect the various transitions have, and hopefully build up some combinators for elaborating interactive proofs.

split utility code into separate library

I'd like to have a library like red-basis or something like that, where I put things like the definition of a monad and some general-purpose data structures (persistent hash tables, etc.). I want to keep it in this repository, but just have it in a different folder and then depend on it from redtt.

My dune/jbuild/opam fu is not yet strong enough to do this properly.
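For what it's worth, with a recent dune the layout could be roughly the following (file contents are a guess, not tested against this repository):

;; basis/dune -- the utility library, kept in the same repository
(library
 (name red_basis))

;; src/dune -- the main redtt code, depending on it
(executable
 (name redtt)
 (libraries red_basis))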

actual spine form ;-)

Because of my perfectly symmetric mind, I failed to realize that spines and stacks are opposites 😆 A spine is like a queue. I implemented stacks, because I was most comfortable with these.

It turns out that in the unification algorithm, it is more convenient to have things as genuine spines. I should probably switch it over. LOL.

😎

Write new notes

Need to replace the current notes, as soon as I have cleaned up the experimental implementation. This is a priority, since until I write the notes, nobody has any idea what I'm doing 😆

"Corestriction"/partial pretypes

I was thinking that for the proof theory & intermediate states of the elaborator, it would be very helpful to have a type like (r = r' ...) -> A, in which you give an element of A under the assumption that r=r'.... The elimination rule would let you get an element of A if you are in a context where r=r'... holds.
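Roughly, the intro and elim would be something like the following (a sketch only, in the informal rule style used above):

       Γ, r=r' ... ⊢ M : A
 --------------------------------
    Γ ⊢ M : (r = r' ...) -> A

    Γ ⊢ M : (r = r' ...) -> A      r=r' ... holds in Γ
 --------------------------------------------------------
                    Γ ⊢ M : A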
