We used this issue to discuss and track what remained to be included in the Primer 1.0 definition.
Terminology
Status: tracked in #333
I would like to be precise about terminology for concepts in Primer. This will be necessary to align our code documentation, help system, curricula, and marketing material.
Examples of terminology we need to nail down:
- Functions: do they take "inputs"? "Parameters"? "Arguments"? Should we use different terms depending on whether we're talking about a function definition or the application of a function in an expression?
- Variables: this term is overloaded in programming, and variables are also notoriously confusing to beginners. We are potentially even more vulnerable to this confusion as most students coming to Primer with previous programming experience will likely have learned that variables are things that you can modify. The mini-Haskell course we developed in 2020 did not use the term "variable" at all, in favor of "bindings," "names," and "values." Should we adapt this terminology for Primer?
Should shadowing be permitted?
Status: tracked in #332
Our implementation guarantees that node IDs are unique, and therefore names need not be. However, this is probably bad UX: in the UI, how does the student tell the difference between `foo` with ID 5 and a "shadowing" `foo` with ID 100?
This could potentially be solved via the UI itself: for example, when selecting a node containing a name, we could literally draw an arrow back to the definition. (But what about names imported from other modules? Presumably these would be shown with their namespaced name and therefore not strictly shadowing?)
At any rate, we need to decide whether this is permitted in the language definition, or whether we should not allow it for UX reasons.
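If we do permit shadowing, name resolution stays unambiguous because lookup simply finds the innermost binder. A minimal Python sketch; the names, IDs, and scope representation here are invented for illustration and are not Primer's actual data model:

```python
# Hypothetical sketch: names resolve to the innermost binder, so a
# "shadowing" `foo` (ID 100) hides the outer `foo` (ID 5).

def resolve(name, scope):
    """Return the node ID of the innermost binder for `name`.

    `scope` is a list of (name, node_id) pairs, outermost binder first.
    """
    for bound_name, node_id in reversed(scope):
        if bound_name == name:
            return node_id
    raise KeyError(f"unbound name: {name}")

scope = [("foo", 5), ("bar", 7), ("foo", 100)]  # inner `foo` shadows outer
assert resolve("foo", scope) == 100  # the UI would draw its arrow to node 100
assert resolve("bar", scope) == 7
```

The "draw an arrow back to the definition" idea amounts to the UI surfacing the node ID this lookup returns.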
Audit evaluation rules
Status: tracked in #325
More flexible pattern matching
Status: definitional issues are tracked in #334, #335, #336. UI issues will be tracked in primer-app
when the time comes.
At the moment, we only support the most basic pattern matches. This may have to suffice for a 1.0, but if we stick with that plan, it's something we should address soon after shipping 1.0.
Time permitting, we may try to implement the following at a later stage, but pre-1.0; they are not currently supported:
- Wildcard patterns.
- Pattern-matching on primitive constructors.
However, these would require writing an overlap checker (and, for the special case of primitive constructors, an exhaustivity checker, though that could probably be handled by simply ensuring that a wildcard pattern appears among the patterns). They would also require some UX/UI features that we don't currently support, since at the moment, patterns are provided automatically by the backend based on the scrutinee's type.
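For the special case of a primitive scrutinee, the exhaustivity and overlap checks could be quite simple. A hypothetical Python sketch (the function name and representation are invented; a real checker would work over Primer's AST):

```python
WILDCARD = "_"

def check_primitive_match(patterns):
    """Check a match on a primitive scrutinee (e.g. Int literals).

    Exhaustive iff a wildcard appears; a pattern overlaps if it repeats
    an earlier literal or follows a wildcard (and so is unreachable).
    """
    seen = set()
    overlapping = []
    wildcard_seen = False
    for p in patterns:
        if wildcard_seen or p in seen:
            overlapping.append(p)
        if p == WILDCARD:
            wildcard_seen = True
        else:
            seen.add(p)
    return {"exhaustive": wildcard_seen, "overlapping": overlapping}

assert check_primitive_match([0, 1, WILDCARD]) == {"exhaustive": True, "overlapping": []}
assert check_primitive_match([0, 0, 1]) == {"exhaustive": False, "overlapping": [0]}
assert check_primitive_match([WILDCARD, 3])["overlapping"] == [3]
```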
Are constructors synthesizable or checkable?
Status: done
Should constructors be treated as synthesizable, or checkable? See @brprice's comment below for more details: #132 (comment)
We decided that it's helpful to the student if these are synthesizable.
`let type`
Status: tracked in #337
We have support for this in the core language because we use it for eval mode. Should we make it available to the student for use in their programs?
This should be reasonably easy to support in core, but it has implications for error messages in the type-checker, and it's probably not worth spending time on pre-1.0. This feature is not particularly useful to beginners anyway.
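For reference, the core of `let type` is just substitution of the bound type into the body. A toy Python sketch, using an invented tuple encoding of types rather than Primer's actual AST; this encoding has no binders inside types, so no capture-avoidance is needed here:

```python
def subst_type(ty, name, replacement):
    """Substitute `replacement` for type variable `name` in `ty`.

    Types are modelled as nested tuples: ("var", n), ("fun", a, b),
    or ("con", c). `let type a = T in body` then reduces to
    subst_type(body, "a", T).
    """
    tag = ty[0]
    if tag == "var":
        return replacement if ty[1] == name else ty
    if tag == "fun":
        return ("fun",
                subst_type(ty[1], name, replacement),
                subst_type(ty[2], name, replacement))
    return ty  # "con": no variables inside

# let type a = Bool in (a -> a)   reduces to   Bool -> Bool
body = ("fun", ("var", "a"), ("var", "a"))
assert subst_type(body, "a", ("con", "Bool")) == ("fun", ("con", "Bool"), ("con", "Bool"))
```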
Editable typedefs
Status: in progress
Note: at this point, we believe that no Primer language changes will be needed to support this, and it can all be done via a series of edits with the action system & the type-checker's smart holes implementation. Therefore, I may remove this section from the document at a later date.
We would be remiss if we didn't provide some way for students to edit (or at least delete) existing user-defined typedefs. This has lots of knock-on UX effects that we need to consider, like what happens to existing code when a type changes: there is probably no way to ensure the student's program still type-checks after a change is made, so what do we do with code that used to work but now doesn't?
A "cheap" way to do this would be to create a new type and keep the old one around until the student has time to make the changes & the old definition is no longer in use, at which time Primer would allow the old definition to be deleted (no more uses). Perhaps, given our goal of wrapping up Primer definition by the end of Q1 2022, this is the safe option for an initial implementation. However, students will probably be changing their types frequently, so it should really have better UX than "sorry, it's your problem."
Holes in type definitions
Status: tracked in #270
Should we support holes in type definitions? It may be useful pedagogically and it may be necessary to support editable typedefs. See #270 for more details.
I think we have arrived at a consensus that we should support these.
Naming rules
Status: tracked in #339
Related to the semantic typography bit I posted about above, do we want to adopt any of Haskell's naming rules? Here I mean things like, "(most) constructor names must be capitalized," "variables must begin with a lower-case letter," etc.
Also, per @brprice: can type constructors or value constructors be named the same thing as term definitions?
There is quite a bit to consider here, and it mostly comes down to how easy we want to make it to write a plaintext version of Primer. Personally, my thinking on this is shifting towards not having any particular naming rules in the language definition. Conformant implementations would then be free to implement stricter naming rules based on their own needs; e.g., a plaintext implementation might want to disallow spaces in names. Such implementations would still need to accept programs from implementations with no naming restrictions in order to be conformant! However, so long as they can mangle these more liberal names in such a way that the program semantics are preserved, there should be no issue here.
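To illustrate the mangling idea, here is a hypothetical injective encoding in Python: every name from a liberal implementation maps to a distinct ASCII identifier, so name distinctness (and hence program semantics) is preserved. The scheme itself is invented; a real spec would need to fix one:

```python
import re

def mangle(name):
    """Injectively encode an arbitrary name as an ASCII-safe identifier,
    so a plaintext implementation can accept programs whose names it
    couldn't otherwise lex. (Hypothetical scheme, for illustration only.)
    """
    def enc(ch):
        if re.fullmatch(r"[A-Za-z0-9]", ch):
            return ch
        # Escape everything else, including '_' itself, so that the
        # encoding is injective and distinct names stay distinct.
        return f"_u{ord(ch):04x}_"
    return "n_" + "".join(enc(c) for c in name)

assert mangle("my function") != mangle("my_function")
assert mangle("élan") == "n_" + "_u00e9_" + "lan"
```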
Internationalization support
Status: tracked in #339
[Related: hackworthltd/hackworthltd#25, but here I'm not speaking about UI support for internationalization (i18n), but about the core language itself.]
We have decided that for resource constraint reasons, we're not going to focus on i18n support in 1.0. However, I do want to make sure the core language is ready for i18n when the time comes.
What exactly does this mean? ... I'm not exactly sure yet! But we should think through some of the implications.
One example: keywords in our language (e.g., `match ... with`) should be generalized. When we say `match with` in the language spec, we shouldn't mean the literal words "match" and "with". Maybe in a Hanzi-based writing system, the best way to express the same concept isn't even two words (maybe it's one, maybe it's more). So while the core language spec might use `match with`, this should refer to the semantics of pattern matching, divorced from its lexical syntax.
Another example, again involving keywords: suppose `match with` is best expressed in French as `assortir avec`. Now imagine the following (silly) program, "projected" in French:
```
-- Assume we don't enforce capitalization here, for the sake of argument
data preposition = to | from | with

f : preposition -> Bool
f x = assortir x avec
  to -> True
  from -> False
  with -> True
```
Now the student shares this program with another student who speaks English, who sees it as:
```
data preposition = to | from | with

f : preposition -> Bool
f x = match x with
  to -> True
  from -> False
  with -> True
```
Capitalization bikeshedding aside, we should definitely accept this program. But this raises some potential pedagogical issues about mixing up which `with` is meant. So there are some consequences to being i18n-friendly about language keywords.
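One way to make the keyword question concrete is to treat the spec's keywords as abstract roles and let each locale supply lexemes; the same AST then projects differently per locale. A toy Python sketch, reusing the French lexemes from the example above (the table and function names are invented):

```python
# The spec fixes abstract keyword roles (MATCH, WITH); each locale maps
# roles to concrete lexemes. Projection renders one AST under any locale.
KEYWORDS = {
    "en": {"MATCH": "match", "WITH": "with"},
    "fr": {"MATCH": "assortir", "WITH": "avec"},
}

def render_match_header(scrutinee, locale):
    """Render the header line of a pattern match for a given locale."""
    kw = KEYWORDS[locale]
    return f"{kw['MATCH']} {scrutinee} {kw['WITH']}"

assert render_match_header("x", "en") == "match x with"
assert render_match_header("x", "fr") == "assortir x avec"
```

Note that the student-defined constructor `with` lives in a different namespace from the keyword role, which is what lets the English projection above remain unambiguous to the implementation, if not to the reader.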
Text syntax
Status: tracked in #339
Do we need to define a canonical text syntax for everything, including typedefs? Previously, I might have said, "No," to this question. However, recent feedback we've received during demos makes me think that we need to treat text view in Primer as a first class citizen, and therefore we should probably also specify a text syntax in the language definition that matches what the student sees in text view.
If Primer catches on, people may take our open source bits and want to write a traditional parser/interpreter/compiler for it, in which case, it might be helpful to have an agreed text representation, with no "cheating."
Personally, I am increasingly leaning towards just specifying a canonical AST representation (e.g., as JSON) and not worrying about a canonical text syntax. This would open the door to Primer implementations in languages other than English, for example. (See Internationalization support above.)
Bounded recursion
Status: tracked in #338
Should we create a language subset ("Proto-Primer"?) with bounded recursion?
At the very least, I like the idea from Ethereum of giving a computation some "gas." I'm also curious about Conor McBride's "potato-powered proofs," but I don't know much about how it works, or how accessible it would be to beginners.
On a related note, I'd like to think about how to implement a decent termination checker in the standard language.
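The "gas" idea can be sketched as an evaluator that spends one unit of gas per reduction step and aborts when the budget runs out, so even divergent programs halt. A toy Python sketch over an invented three-form expression language, not Primer:

```python
class OutOfGas(Exception):
    """Raised when the evaluation budget is exhausted."""

def eval_gas(expr, gas):
    """Evaluate a tiny expression language with a step budget.

    Expressions: int literals, ("add", a, b), and ("loop",), which
    never terminates on its own but is cut off by the gas budget.
    """
    def step(e, gas):
        if gas <= 0:
            raise OutOfGas
        gas -= 1  # every reduction step costs one unit
        if isinstance(e, int):
            return e, gas
        if e[0] == "add":
            a, gas = step(e[1], gas)
            b, gas = step(e[2], gas)
            return a + b, gas
        if e[0] == "loop":
            return step(e, gas)  # diverges, but gas bounds it
        raise ValueError(f"unknown expression: {e!r}")
    value, _ = step(expr, gas)
    return value

assert eval_gas(("add", 1, ("add", 2, 3)), gas=10) == 6
try:
    eval_gas(("loop",), gas=100)
except OutOfGas:
    pass  # divergence is cut off after 100 steps
```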
Model top-level definitions as `letrec`s
Status: dropped. We went a different way.
Not only would this mean that top-level definitions aren't special, but it should also make it easy to represent a block of `letrec`s as a canvas-within-a-canvas.
But what about types?
Module system
Status: in progress, though the module system for 1.0 will probably be fairly basic
Could we implement modules via `letrec`? As with the top-level definition question above, we might not be able to do this for exported types.
Do we need fancy ML-style modules? Something Backpack-ish? Probably not, but before reaching for the most obvious and/or simplest implementation, I'd like to review what the PL community has learned about module systems over the past 30-ish years.
Should the Prelude be part of the language definition?
Status: yes
This is pretty much just a yes/no question. Personally, I think the Prelude should be part of the language definition, as the language landscape is littered with languages that suffered for lack of a standard library (e.g., the Lisp+Scheme family of languages).
Edit: Having given this some more thought since our meeting this morning, I think it makes sense just to add the Prelude to the language. So long as we keep it relatively minimal and learn from Haskell's mistakes (e.g., no partial functions!), this seems uncontroversial to me.
Haskell-style (or -ish) type classes
Status: tracked in #343
Do we need them, or will our module system be powerful enough that we won't? I believe our current thinking is that we don't really know how to solve the coherence problem with a parameterized module system, except maybe if we go the fully dependently-typed route, which we're apprehensive about for a pedagogical language (or at least one designed for beginners, anyway).
We are aware of some attempts to improve upon ML's module system in order to obviate the need for type classes, but for various reasons, none of them have caught on, and meanwhile, Haskell developers seem more or less pleased with Haskell type classes.
Therefore, our current tentative plan is to do a relatively straightforward non-dependently-typed parameterized module system, and then consider whether we need to also add a type class system.
Built-ins/primitive types
Status: tracked in #272 and #344
We now support `Int` and `Char` primitives, which is probably good enough for 1.0. `String` will be implemented as `List @Char`.
We still need to decide the following for 1.0:
- What primitive functions will be provided?
- Where will they live?
- How do we handle partial application of primitive functions? See #272.
Presumably the answer to the 2nd question is "In the Prelude."
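On the partial-application question (#272), one possible representation is to carry a primitive's arity alongside the arguments collected so far, and only run the underlying operation once saturated. A hypothetical Python sketch; this is not necessarily the design #272 will land on:

```python
def prim(name, arity, fn, args=()):
    """A primitive function value: arity plus arguments collected so far."""
    return {"name": name, "arity": arity, "fn": fn, "args": args}

def apply_prim(p, arg):
    """Apply a primitive to one more argument.

    Returns either a new (still partial) primitive value, or the
    result of running the underlying operation once saturated.
    """
    args = p["args"] + (arg,)
    if len(args) == p["arity"]:
        return p["fn"](*args)  # saturated: actually compute
    return prim(p["name"], p["arity"], p["fn"], args)  # still partial

add = prim("Int.add", 2, lambda a, b: a + b)
inc = apply_prim(add, 1)  # partially applied: a first-class value
assert inc["args"] == (1,)
assert apply_prim(inc, 41) == 42
```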
Scoped type definitions
Status: dropped, no current use case.
Is there any reason why we shouldn't support scoped type definitions? We already have `let type` for type variables in eval. Shouldn't we expose this to the user language? Can we do something similar for type definitions?
Saving the student's work when a type changes
Status: irrelevant, as we don't think there are any definitional issues here.
There are a few scenarios in which, in response to the student changing the type of a node in the tree, the student may lose work. I'll try to enumerate them here. (Note: at the moment, this list is not exhaustive.)
Scrutinee type changes in `match _ with` expressions
In `match x with ...`: if the type of `x` changes, or if the student deletes `x` from the scrutinee location, causing the scrutinee to have type `?`, then any expressions the student has written in her pattern-match branches will disappear. This may be jarring. Also, if the student has written one or more non-trivial expressions in these branches, it is probably bad UX to simply delete them.
Could we/should we allow non-empty holes in pattern matches? Offhand, this seems like it would be pretty tricky. If the scrutinee is just a hole `?`, then fine: the idea of keeping around some old patterns for type `A` under the `match with` expression seems simple enough. But then when the student fills the scrutinee hole again with a value of type `B`, what would we do with the old patterns for type `A`?
(Side note: I believe that Hazel (the editor, anyway, if not the Hazelnut calculus) allows something like this?)
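Purely as a thought experiment, the non-empty-hole idea could amount to stashing the old branches keyed by the old scrutinee type, and offering to restore them if that type ever comes back. A speculative Python sketch with an invented representation; this is not a proposal, just the shape of the bookkeeping:

```python
def retype_match(match, new_type, new_branches):
    """Change a match expression's scrutinee type without losing work.

    The old branches are stashed under the old type; if `new_type` was
    previously stashed, its branches are restored instead of starting
    from `new_branches`.
    """
    stash = dict(match.get("stash", {}))
    stash[match["type"]] = match["branches"]  # keep the old work around
    restored = stash.pop(new_type, new_branches)
    return {"type": new_type, "branches": restored, "stash": stash}

m = {"type": "A", "branches": {"MkA": "42"}, "stash": {}}
m = retype_match(m, "B", {"MkB": "?"})  # work on type A is stashed
assert m["stash"] == {"A": {"MkA": "42"}}
m = retype_match(m, "A", {"MkA": "?"})  # type A returns: restore it
assert m["branches"] == {"MkA": "42"}
```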
Semantic typography
Status: never seriously considered for inclusion. This sort of thing has now been incorporated into the discussions above on text syntax, names, and internationalization support.
It seems obvious we should use typography (different typefaces, weights, styles, colors, etc.) to distinguish types, terms, etc. This is pretty standard fare in syntax highlighters for conventional languages.
However, during our 2021-05-11 developer meeting, we discussed whether it might be a good idea to make typography semantic. For example, if we want to distinguish between a local variable `a` and a global variable `a`, perhaps we should call the global variable **a** instead, where the boldface is part of global variable **a**'s name.
One concern I have about this is that it might be too subtle, especially for students with vision impairments. Some of these students might use screen readers, which could solve the problem, but not all will.
Another concern I have about this idea is that it won't transfer. No other language I'm aware of makes such a distinction. From the perspective of transfer, we'd be better off sticking with using namespaces to distinguish global variables from local ones, assuming we want to do that at all (as obviously most languages don't support this distinction, either).