Comments (8)
@evincarofautumn is experimenting with a build system style of query-driven architecture:
You can model it as a build system, where the artifacts are data structures and the tasks are compiler passes—although I don’t know if that’s new or just new to me. This naturally extends to things like caching and incremental builds as long as your data structures are granular enough; that is, you don’t necessarily need explicitly incremental data structures, you just need to be able to divide the work into small chunks, even if each chunk is rebuilt from scratch rather than transformed incrementally. This model also seems to make things like concurrency and logging/tracing easy to add, especially in Haskell.
By the first point I mean that, instead of “this function changed, therefore recompile the whole file”, you get “this function changed, therefore recompile that function”, but you still have “relink the whole file”—if you had fully incremental data structures, you’d get “modify just the part of the linker output that changed”. This seems like a good compromise to me: the implementation stays easy to understand while still getting pretty good performance and interactivity. But we’ll see.
Incrementurtles all the way down sounds appealing but I don’t know if I want to go down that rabbit hole…turtle hole…you know what I mean. Especially because my intuition is that the overhead of using pervasively incremental data types and figuring out how to apply changes might be more costly than just doing full rebuilds (of smallish things).
https://gitter.im/pikelet-lang/Lobby?at=5b58fe9632d98c2ed2b58eac
from pikelet.
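A minimal Rust sketch of the granularity idea above: a per-function cache where an unchanged chunk is reused, a changed chunk is rebuilt from scratch, and linking stays whole-module. All names and the toy "compile"/"link" steps are hypothetical, not Pikelet's actual implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical artifact: the compiled form of one function.
#[derive(Clone, PartialEq, Debug)]
struct Compiled(String);

struct Cache {
    entries: HashMap<String, (u64, Compiled)>, // fn name -> (source hash, artifact)
    recompiles: usize,                         // how much work we actually redid
}

impl Cache {
    fn new() -> Self {
        Cache { entries: HashMap::new(), recompiles: 0 }
    }

    // Recompile a function only if its source hash changed.
    fn compile_fn(&mut self, name: &str, source: &str) -> Compiled {
        let mut h = DefaultHasher::new();
        source.hash(&mut h);
        let key = h.finish();
        if let Some((hash, artifact)) = self.entries.get(name) {
            if *hash == key {
                return artifact.clone(); // unchanged chunk: reuse as-is
            }
        }
        self.recompiles += 1; // changed chunk: redo just this function
        let artifact = Compiled(format!("code({})", source));
        self.entries.insert(name.to_string(), (key, artifact.clone()));
        artifact
    }

    // Linking stays whole-module: every artifact is concatenated each time.
    fn link(&mut self, module: &[(&str, &str)]) -> String {
        module
            .iter()
            .map(|(name, src)| self.compile_fn(name, src).0)
            .collect::<Vec<_>>()
            .join(";")
    }
}

fn main() {
    let mut cache = Cache::new();
    cache.link(&[("f", "a"), ("g", "b")]);  // both functions compiled
    cache.link(&[("f", "a"), ("g", "b2")]); // only g recompiled; whole relink
    assert_eq!(cache.recompiles, 3);
}
```

Editing one function's body invalidates only that function's artifact, while the coarse "relink everything" step keeps the implementation simple.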
The (very slowly emerging) consensus I’m seeing is to structure this middle section roughly like some variant of an efficient Datalog engine, to whatever degree of sophistication in caching, parallelism, and incrementalism your implementation can manage.
I think it's tempting to try to get away with a multi-pass, straight-line compilation model, and tbqh I'm still sympathetic enough to it to try to design languages that are amenable to it. But if the language design doesn't fit, it's a costly mistake to build the compiler that way.
(It's not at all clear to me that users are that hostile to the constraints imposed on a language by designing for it. Suspect the benefits in speed and simplicity of compilation might persuade them! But we often write cheques for ease-of-use features that we must then cash.)
Do you count Dotty’s “compilers are databases” under this consensus? Dotty also fits this thread because basically the whole program is a giant mutually recursive object, so your typechecker needs to be (I suspect) a productive corecursive program:
The type of each decl. is a lazy thunk that processes the given type or infers one. Forcing it might force other thunks (either to infer the type from the body, or to force declarations used in the type). It’s an error if the thunk forces itself before creating a result.
To some extent that’s already captured by DOT’s typing rule for objects, tho this rule should be coinductive instead.
Γ, self : T ⊢ decls : T
────────────────────────────────────────
Γ ⊢ new { self => decls } : mu (self => T)
Having reverse-engineered this from Dotty, I wonder how it compares with other compilers. But duplicating validateDecl doesn’t seem a problem here. OTOH, blog posts on Kentucky Mule (rsc’s father) suggest that such laziness is a significant performance problem...
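The thunk discipline described above can be sketched in Rust: each declaration's type is a thunk, forcing a thunk may force the thunks it mentions, and a thunk that forces itself before producing a result is reported as an illegal cycle. The `Checker` type, its fields, and the toy type strings are all hypothetical, not Dotty's actual code.

```rust
use std::collections::HashMap;

// State of a declaration's type thunk. Absence from the map = not yet forced.
#[derive(Debug)]
enum Thunk {
    InProgress,   // currently being forced
    Done(String), // the computed type
}

struct Checker {
    deps: HashMap<String, Vec<String>>, // decl -> decls its type mentions
    states: HashMap<String, Thunk>,
}

impl Checker {
    fn force(&mut self, name: &str) -> Result<String, String> {
        match self.states.get(name) {
            Some(Thunk::Done(t)) => return Ok(t.clone()),
            // A thunk that forces itself before creating a result is an error.
            Some(Thunk::InProgress) => {
                return Err(format!("cyclic type for `{}`", name))
            }
            None => {}
        }
        self.states.insert(name.to_string(), Thunk::InProgress);
        // Forcing this decl may force the decls used in its type.
        let ds = self.deps.get(name).cloned().unwrap_or_default();
        let mut parts = Vec::new();
        for d in ds {
            parts.push(self.force(&d)?);
        }
        let ty = format!("ty({}[{}])", name, parts.join(","));
        self.states.insert(name.to_string(), Thunk::Done(ty.clone()));
        Ok(ty)
    }
}

fn main() {
    let deps = HashMap::from([
        ("a".to_string(), vec!["b".to_string()]),
        ("b".to_string(), vec![]),
    ]);
    let mut checker = Checker { deps, states: HashMap::new() };
    assert!(checker.force("a").is_ok()); // forcing a transitively forces b

    let cyclic = HashMap::from([("x".to_string(), vec!["x".to_string()])]);
    let mut checker = Checker { deps: cyclic, states: HashMap::new() };
    assert!(checker.force("x").is_err()); // x's type mentions x: rejected
}
```

Productive corecursion corresponds to a thunk producing (part of) its result before any self-force, which this all-or-nothing sketch does not model; it only shows the cycle detection.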
More Twitter chats:
Graydon:
I would say that most compilers I've seen are very embarrassingly structured and don't adequately isolate/track dependencies at the crude level they "ought" to be able to from the language semantics. So wind up redoing vast amounts of work every run.
Much of this, in turn, derives from the extremely convoluted rules for name lookup in virtually every real language. If you can't easily and precisely figure out what a source entity depends on name-wise, you're cooked, gotta back off to the whole TU. Which is really common!
Like if I had one piece of advice for language design it'd be to keep your name lookup rules as simple as possible, not intertwined with type checking, not involving any forms of global overload disambiguation, argument-dependence or whatever. It's a hairball almost everywhere.
Me:
Hum, interesting. Might using dependent records for my module system make this horrific?
Graydon:
Might. Not sure. Worth considering in some detail: you want def/use dependence relation to snap into focus cheaply, not involve either "searching an unbounded number of places" or "doing a lot of work -- importing or deduction -- at each place you have to look" for each name.
Like do a thought experiment where you change only one def with only one use in a library with a million defs, and figure out how small you can make the constant (ideally zero) on the O(million) work you're going to do to figure out there's only one use, and exclude the rest.
Blaisorblade:
So @gkossakowski’s Kentucky Mule relates exactly to this problem, as it’s about building the “symbol table” very quickly, which is sequential, and then compiling bodies independently. No chance for return type inference tho! (And he complains about Scala lookup rules, too!)
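Graydon's thought experiment above amounts to maintaining a reverse def/use index: resolve names once up front, then when one def changes, look up its users directly instead of scanning a million defs. A minimal Rust sketch, with hypothetical names:

```rust
use std::collections::HashMap;

// Reverse index: for each def, the sites that use it. Built once from
// resolved name references; editing one def then invalidates only its
// users, independent of the total number of defs in the library.
struct UseIndex {
    users: HashMap<String, Vec<String>>, // def name -> use sites
}

impl UseIndex {
    // `uses` pairs a use site with the def it resolved to.
    fn build(uses: &[(&str, &str)]) -> Self {
        let mut users: HashMap<String, Vec<String>> = HashMap::new();
        for (site, def) in uses {
            users.entry(def.to_string()).or_default().push(site.to_string());
        }
        UseIndex { users }
    }

    // Everything that must be rechecked when `def` changes.
    fn invalidated(&self, def: &str) -> &[String] {
        self.users.get(def).map(|v| v.as_slice()).unwrap_or(&[])
    }
}

fn main() {
    let index = UseIndex::build(&[
        ("main", "helper"),
        ("test", "helper"),
        ("main", "util"),
    ]);
    assert_eq!(index.invalidated("helper").len(), 2); // recheck these two
    assert_eq!(index.invalidated("other").len(), 0);  // exclude the rest
}
```

The catch, as the thread notes, is building this index cheaply at all: convoluted lookup rules (overloading, argument-dependence, lookup entangled with type checking) make the up-front def/use resolution itself expensive or imprecise.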
Looking at Salsa - it's still early days for it, but it looks like it will be close to what we'll need! Here's an example.
Speaking to Niko, it seems that Salsa doesn't handle 'sub-term' incrementalism. This could be an issue given we are going with a 1ML-style language, where modules are just part of the term language.
Just added a link to @ollef's description of incremental compilation in Sixten.
Lucent by @Techno-coder is moving to a query-driven approach: https://github.com/Techno-coder/lucent/blob/next-prototype/src/query/context.rs
It's a hand-rolled version of the rustc approach, rather than using something like Salsa.
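The hand-rolled style can be sketched as a query context with revision stamps: inputs record when they last changed, derived results record when they were computed, and a derived query reruns only if its input changed afterwards. This is a deliberately tiny sketch with invented names; real engines like Salsa or the rustc query system also track per-query dependency graphs.

```rust
use std::collections::HashMap;

struct QueryCtx {
    revision: u64,                          // global change counter
    inputs: HashMap<String, (u64, String)>, // name -> (changed_at, text)
    derived: HashMap<String, (u64, usize)>, // name -> (computed_at, result)
    computations: usize,                    // derived work actually performed
}

impl QueryCtx {
    fn new() -> Self {
        QueryCtx {
            revision: 0,
            inputs: HashMap::new(),
            derived: HashMap::new(),
            computations: 0,
        }
    }

    // Setting an input bumps the revision and stamps the input with it.
    fn set_input(&mut self, name: &str, text: &str) {
        self.revision += 1;
        self.inputs
            .insert(name.to_string(), (self.revision, text.to_string()));
    }

    // Toy derived query (length of the input): recomputed only when the
    // input changed after the cached result was produced.
    fn length(&mut self, name: &str) -> usize {
        let (changed_at, text) = self.inputs[name].clone();
        if let Some((computed_at, val)) = self.derived.get(name) {
            if *computed_at >= changed_at {
                return *val; // cached result is still up to date
            }
        }
        self.computations += 1;
        let val = text.len();
        self.derived.insert(name.to_string(), (self.revision, val));
        val
    }
}

fn main() {
    let mut q = QueryCtx::new();
    q.set_input("main", "abc");
    assert_eq!(q.length("main"), 3);
    assert_eq!(q.length("main"), 3); // second call hits the cache
    assert_eq!(q.computations, 1);
}
```

Demand-driven evaluation falls out naturally: nothing is computed until some consumer asks for it.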
Interesting comment from @matklad: https://lobste.rs/s/afqbdk/individual_element_thinking_vs_grouped#c_gns5uo
Finally, to speak about something I actually do, rust-analyzer is rather game-like. There’s a bunch of “data”, there’s a main loop accepting input from the client, doing “game update” and “rendering” the end result (completions, highlighting and such) to the client. It pains me that today this works by having millions of individually allocated things pointing at each other. I envy sorbet’s (type checker for Ruby) architecture, where they have a `GlobalState`, which holds a bunch of flat vectors, and that’s basically it. Sadly, doing just that in rust-analyzer is not trivial, as we have a pretty intricate incremental computation setup, which, in the current incantation, pretty much forces a tree of individually owned objects.
Links to Sorbet's `GlobalState` type:
- https://github.com/sorbet/sorbet/blob/master/core/GlobalState.h
- https://github.com/sorbet/sorbet/blob/master/core/GlobalState.cc
And here is a blog post that talks about some of the design decisions: Why the Sorbet typechecker is fast.
Doing something like this might be nicer than using Salsa if we want the Pikelet compiler to be embeddable in other programs.
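The "flat vectors plus ids" layout can be sketched in Rust: every symbol lives in one `Vec`, and references between symbols are plain indices rather than pointers into individually allocated nodes. The names here are hypothetical and much simplified relative to Sorbet's actual `GlobalState`.

```rust
// An index into GlobalState::symbols, not a pointer. Copyable, cheap,
// and trivially serializable, which also helps with snapshotting state.
#[derive(Clone, Copy, PartialEq, Debug)]
struct SymbolId(u32);

struct Symbol {
    name: String,
    owner: Option<SymbolId>, // enclosing scope, again just an index
}

struct GlobalState {
    symbols: Vec<Symbol>, // one flat vector; ids are positions in it
}

impl GlobalState {
    fn new() -> Self {
        GlobalState { symbols: Vec::new() }
    }

    // Intern a new symbol and hand back its id.
    fn enter(&mut self, name: &str, owner: Option<SymbolId>) -> SymbolId {
        let id = SymbolId(self.symbols.len() as u32);
        self.symbols.push(Symbol { name: name.to_string(), owner });
        id
    }

    fn name(&self, id: SymbolId) -> &str {
        &self.symbols[id.0 as usize].name
    }

    fn owner(&self, id: SymbolId) -> Option<SymbolId> {
        self.symbols[id.0 as usize].owner
    }
}

fn main() {
    let mut gs = GlobalState::new();
    let root = gs.enter("Object", None);
    let method = gs.enter("to_s", Some(root));
    assert_eq!(gs.name(method), "to_s");
    assert_eq!(gs.owner(method), Some(root));
}
```

The appeal over a pointer-heavy tree is locality and simple ownership: the whole table is one allocation-dense structure, at the cost of ids being meaningless without the `GlobalState` they index into.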