zxul767 / lox
An interpreter for the Lox language
clox currently supports relative paths when running files as follows:
make run NAME=../samples/basic_class.lox
we should make jlox support the same. at present it throws this error:
Exception in thread "main" java.nio.file.NoSuchFileException: ../samples/basic_class.lox
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
at java.base/java.nio.file.Files.newByteChannel(Files.java:375)
at java.base/java.nio.file.Files.newByteChannel(Files.java:426)
at java.base/java.nio.file.Files.readAllBytes(Files.java:3272)
at dev.zxul767.lox.Lox.runFile(Lox.java:49)
at dev.zxul767.lox.Lox.main(Lox.java:42)
this is a simple function that is most useful for discovering the signatures of standard library functions.
there's an initial (albeit incomplete) version in jlox that can serve as a guide for the implementation in clox.
at the moment we use fprintf(stderr, "...", ...) everywhere, but some patterns begin to emerge and it's clear that we can save some tedious work.
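One way to factor out the repeated fprintf calls is a small variadic helper. The following is only a sketch: the names format_error and report_error and the "[line N] Error: " prefix are hypothetical and are not taken from the clox sources.

```c
#include <stdarg.h>
#include <stdio.h>

// Hypothetical helper: writes "[line N] Error: <message>" into `buffer`.
// Returns the number of characters the full message needs, or a negative
// value if formatting failed.
int format_error(char *buffer, size_t size, int line, const char *format, ...) {
  int prefix = snprintf(buffer, size, "[line %d] Error: ", line);
  if (prefix < 0 || (size_t)prefix >= size) return prefix;

  va_list args;
  va_start(args, format);
  int body = vsnprintf(buffer + prefix, size - (size_t)prefix, format, args);
  va_end(args);
  return body < 0 ? body : prefix + body;
}

// Hypothetical helper: same message layout, written straight to stderr.
// This is the call that would replace the scattered fprintf(stderr, ...).
void report_error(int line, const char *format, ...) {
  va_list args;
  va_start(args, format);
  fprintf(stderr, "[line %d] Error: ", line);
  vfprintf(stderr, format, args);
  fputc('\n', stderr);
  va_end(args);
}
```

A call site would then shrink to something like report_error(line, "Operands must be numbers").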
they started as statements initially because it was important to print things as soon as possible (i.e., before we implemented functions)
it would be very useful to have an assert function in the standard library, which can serve as a poor man's testing facility (e.g., see pytest):
assert("hello".starts_with("hell"), "starts_with failed!")
assert("hello".ends_with("llo"), "ends_with failed!")
There are various places in clox where the push .. pop pattern shows up. This pattern keeps the garbage collector from freeing memory it shouldn't: a newly created object is temporarily placed on the value stack, and then popped once it's stored in the place where it belongs (e.g., the interned strings table, or the global variables table).
This is typically necessary when the newly created object is being pushed to a container which can trigger a memory allocation right before the object is stored in the container (a potential time for the garbage collector to kick in).
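The hazard can be reproduced with a toy collector. This is a deliberately simplified sketch and not the clox implementation: all names are hypothetical, objects are only flagged as freed instead of being released, and table_set always triggers a collection to stand in for "the container had to allocate, and the allocation kicked off the GC".

```c
#include <stdbool.h>

#define MAX_OBJECTS 16

typedef struct {
  bool marked;
  bool freed;  // set by the collector instead of actually releasing memory
} Obj;

static Obj heap[MAX_OBJECTS];    // every object ever allocated
static int heap_count = 0;
static Obj *stack[MAX_OBJECTS];  // value stack: a GC root
static int stack_top = 0;
static Obj *table[MAX_OBJECTS];  // e.g. the globals table: a GC root
static int table_count = 0;

void push(Obj *object) { stack[stack_top++] = object; }
Obj *pop(void) { return stack[--stack_top]; }

Obj *new_object(void) {
  Obj *object = &heap[heap_count++];
  object->marked = false;
  object->freed = false;
  return object;
}

// Mark everything reachable from the roots, then "free" the rest.
void collect_garbage(void) {
  for (int i = 0; i < stack_top; i++) stack[i]->marked = true;
  for (int i = 0; i < table_count; i++) table[i]->marked = true;
  for (int i = 0; i < heap_count; i++) {
    if (!heap[i].marked) heap[i].freed = true;
    heap[i].marked = false;
  }
}

// The collection runs BEFORE the object is stored, which is the dangerous
// window described above.
void table_set(Obj *object) {
  collect_garbage();
  table[table_count++] = object;
}

// Without protection the new object is unreachable during table_set.
Obj *store_unprotected(void) {
  Obj *object = new_object();
  table_set(object);
  return object;
}

// The push .. pop pattern keeps the object reachable via the value stack.
Obj *store_protected(void) {
  Obj *object = new_object();
  push(object);
  table_set(object);
  pop();
  return object;
}
```

In this toy, store_unprotected returns an object that has already been flagged as freed, while store_protected returns a live one.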
the goal is to be able to unify the following methods (in LoxNativeClass):
static LoxString assertString(LoxInstance instance) {
  assert (instance instanceof LoxString)
      : "instance was expected to be a LoxString";
  return (LoxString)instance;
}
static LoxList assertList(LoxInstance instance) {
  assert (instance instanceof LoxList)
      : "instance was expected to be a LoxList";
  return (LoxList)instance;
}
but using generics directly doesn't seem to work due to generic type erasure.
Column numbers can be computed either eagerly for each token (i.e., stored as an attribute on the token right as it is created) or on demand (i.e., computed from the line and the source code offset only when it's time to display an error).
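The on-demand option can be sketched as a backwards scan from the token's offset to the start of its line. The function name column_of is hypothetical, and the sketch assumes that tokens record a byte offset into the source buffer.

```c
#include <stddef.h>

// Returns the 1-based column of the character at `offset` in `source`,
// found by walking back to the previous newline (or the start of the buffer).
size_t column_of(const char *source, size_t offset) {
  size_t line_start = offset;
  while (line_start > 0 && source[line_start - 1] != '\n') {
    line_start--;
  }
  return offset - line_start + 1;
}
```

The scan costs time proportional to the length of the line, but it only runs when an error is actually reported, so tokens stay small.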
the :globals command should print a list of all the globals in the current session, organized by type:
CoffeeMaker <class>
maker <CoffeeMaker instance>
start <function>
counter1 <closure>
counter2 <closure>
count <primitive:number>
program <primitive:string>
flag <primitive:bool>
for clox, even though we always wrap functions in closures, by inspecting the upvalues array we can tell if a function is an actual closure.
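As a rough sketch of that check, assuming the closure object records how many upvalues it captured (the struct layout and the names upvalues_count, is_actual_closure and describe_callable are hypothetical, not the actual clox definitions):

```c
#include <stdbool.h>

// Hypothetical, heavily reduced versions of the clox object types.
typedef struct {
  const char *name;
} ObjFunction;

typedef struct {
  ObjFunction *function;
  int upvalues_count;  // number of captured variables
} ObjClosure;

// A function is an "actual" closure only if it captured something.
bool is_actual_closure(const ObjClosure *closure) {
  return closure->upvalues_count > 0;
}

// Label that a :globals listing could print next to the name.
const char *describe_callable(const ObjClosure *closure) {
  return is_actual_closure(closure) ? "<closure>" : "<function>";
}
```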
the general current pattern of parsing functions in clox is as follows:
...
if (match(TOKEN_FOR)) { // consumes current token if it matches
  for_statement(...)
}
...
void for_statement(...) {
  consume(TOKEN_LEFT_PAREN);
  ...
}
i believe this can be a bit confusing since the parsing of the whole construct (in this example the for statement) is spread over two places (or more in some cases).
i think it might be better to structure parsing functions so that they are self-contained, as follows:
...
if (current_is(TOKEN_FOR)) {
  for_statement(...)
}
...
void for_statement(...) {
  consume(TOKEN_FOR);
  consume(TOKEN_LEFT_PAREN);
  ...
}
this restructuring adds no additional runtime overhead but can make code more straightforward to grok since all logical steps for parsing a construct are in its corresponding function.
currently, access is achieved via:
var l = list()
l.append(1)
print l.at(0)
but given how common such an operation is, and that list is a primitive, we should support the following syntax as well:
print l[0]
where the index can, of course, be any valid expression that evaluates to an integer.
Python is quite lenient when it comes to slices: e.g., list[0:10] for a list with fewer than 10 elements simply returns all the elements that exist instead of raising an error, and a slice that lies entirely out of range (e.g., list[10:20]) returns an empty list.
I suppose that this has a good rationale but I don't know it off the top of my head, so I think we should research it and see if it makes sense to adopt it for Lox. The current behavior simply emulates the Java API (String::substring), which throws an exception for out-of-range indices.
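For reference while researching this, the lenient behavior amounts to clamping both bounds into [0, length] before slicing. The sketch below covers only the step-1 case; SliceRange and clamp_slice are hypothetical names and are not part of either interpreter.

```c
// Half-open range [start, end) of valid indices.
typedef struct {
  long start;
  long end;
} SliceRange;

// Python-style normalization of slice bounds: negative indices count from
// the end, and anything out of range is clamped instead of raising an error.
SliceRange clamp_slice(long start, long end, long length) {
  if (start < 0) start += length;
  if (end < 0) end += length;
  if (start < 0) start = 0;
  if (start > length) start = length;
  if (end < 0) end = 0;
  if (end > length) end = length;
  if (end < start) end = start;  // empty slice
  return (SliceRange){start, end};
}
```

With a 3-element list, the bounds 0:10 become 0:3 (the whole list) and 10:20 become 3:3 (an empty list).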
In the book, incorrect usage of the return statement (i.e., outside of function definitions) is detected using the Resolver class (i.e., after parsing). However, this strikes me as inappropriate given that return statements are meant to be used strictly inside function definition blocks. In my opinion, it'd be better to modify the grammar to detect the error as a parsing error (e.g., Python 3.8+ does this).
The modifications necessary for this are likely to be minimal (e.g., removing returnStmt from the top-level production for declaration, and adding a rule for functionBody that includes the union of declaration and returnStmt).
it would be interesting to add generators to Lox, with syntax similar to other languages (i.e., adding the yield keyword and using the resulting objects with an iterator protocol)
fun fib(limit) {
  var a = 0
  var b = 1
  while (a <= limit) {
    yield a
    var previous_a = a
    a = b
    b = previous_a + b
  }
}
var iter = fib(/* limit: */10)
while (!iter.at_end()) {
  println(iter.value)
  iter.more()
}
for clox, the feature is not likely to require large runtime restructuring, as it is a matter of adding a new opcode and being able to save/restore a function's local environment (along with the program counter for properly resuming a function after the last yield).
for jlox, the feature unfortunately requires more extensive changes since the way code is currently executed (a recursive traversal of the AST) doesn't allow for easy resuming of a function's execution. one way to implement it would be by creating a family of "resumable" statement classes which store the current state of a statement's execution (e.g., this gist has a proof of concept of such an implementation).
Currently there is a difference between primitive types (i.e., numbers, booleans and strings) and objects, as far as their representation goes. this prevents primitive types from having methods (e.g., we cannot say "string".length()).
while it is desirable to keep primitives as lightweight as possible (e.g., to avoid costly overhead when holding collections of them), it is also quite convenient to be able to write expressions such as "string".ends_with("ing") instead of string__ends_with("string", "ing").
we should explore the possibility of having an expression such as:
"string".starts_with("str")
be equivalent to:
str("string").starts_with("str")
where str(.) is the cast operator (or alternatively, the str class constructor overloaded for various types) that returns an object.
while it might also be desirable to do something similar for numbers and booleans, the need is much less clear (e.g., writing 12.sin() or (-12).abs() adds no more clarity than their usual counterparts sin(12) and abs(-12); on the contrary, they are arguably less clear.)
i believe there's some misconfiguration in the REPL's "readline" library that causes the following odd behavior:
>>> "hi" < "hello"
Runtime Error: Operands must be numbers
[line 1]
>>> !"hi"
>>> "hi" < "hello"
Runtime Error: Operands must be numbers
[line 1]
notice how upon pressing ENTER the !"hi" expression gets automatically expanded to the last command/expression instead of being evaluated normally.
the same issue is present when pressing TAB, although in this case the behavior is just fine, as TAB is often used for autocompletion commands in many apps.
there is now a handy function help which can be passed any value and it will print whatever help is available. for example:
>>> help(sin)
{ sin(n:number) -> number } : <built-in function>
>>> help("hello".starts_with)
{ starts_with(prefix:str) -> bool } : <built-in method>
we should advertise it in the initial help banner of the REPL as well as in the READMEs
clock(1) should raise a runtime error saying that "expected 0 arguments but got 1"
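A minimal sketch of such a check, run before a native function is invoked. The name check_arity and its signature are hypothetical; only the message wording comes from the text above.

```c
#include <stdbool.h>
#include <stdio.h>

// Returns true when the call has the right number of arguments; otherwise
// writes the runtime error message into `message` and returns false.
bool check_arity(int expected, int got, char *message, size_t size) {
  if (expected == got) return true;
  snprintf(message, size, "expected %d arguments but got %d", expected, got);
  return false;
}
```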
The term "closure" is bandied about very casually but I feel like we should be more precise to avoid confusion in comments and documentation for clox.
Technically, a closure should refer only to functions that close over non-local, non-global variables (i.e., local variables in parent scopes). This happens when a function extends its lifetime beyond that of its defining scope by being returned. This means that any such captured local variables need to be migrated from the stack to a more permanent place (the heap or the stack of a top-level function guaranteed to exist until interpretation is done).
Functions that refer to non-local, non-global variables, but whose lifetime doesn't go beyond their parent's, are simply nested functions, not closures.
jlox has both print and println statements, but clox only has print
today we have a bunch of places that look like this:
switch (OBJECT_TYPE(callee)) {
  case OBJECT_CLOSURE: {
#ifdef DEBUG_TRACE_EXECUTION
    if (vm->trace_execution) {
      debug__print_callframe_divider(vm);
    }
#endif
    bool result = call(AS_CLOSURE(callee), args_count, vm);
#ifdef DEBUG_TRACE_EXECUTION
    if (vm->trace_execution) {
      debug__show_callframe_names(vm);
    }
#endif
    return result;
  }
  ...
}
one possibility might be to follow the way of assert and make debug__ functions macros which hide all the ugliness behind them and which are compiled away when DEBUG_TRACE_EXECUTION is not defined.
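A sketch of what that could look like. The DEBUG_TRACE macro, the one-field VM struct, the trace_events counter and call_closure are hypothetical stand-ins; only the debug__ function names and the DEBUG_TRACE_EXECUTION flag come from the snippet above.

```c
#include <stdbool.h>

// Hypothetical VM reduced to the single field this sketch needs.
typedef struct {
  bool trace_execution;
} VM;

int trace_events = 0;  // counts trace calls so the effect is observable

void debug__print_callframe_divider(VM *vm) { (void)vm; trace_events++; }
void debug__show_callframe_names(VM *vm) { (void)vm; trace_events++; }

#ifdef DEBUG_TRACE_EXECUTION
// Tracing build: run the call only when tracing is switched on at runtime.
#define DEBUG_TRACE(vm, call)        \
  do {                               \
    if ((vm)->trace_execution) {     \
      call;                          \
    }                                \
  } while (0)
#else
// Regular build: the call disappears; `vm` is referenced only to avoid
// unused-parameter warnings.
#define DEBUG_TRACE(vm, call) ((void)(vm))
#endif

// The call site no longer needs any #ifdef blocks.
bool call_closure(VM *vm) {
  DEBUG_TRACE(vm, debug__print_callframe_divider(vm));
  bool result = true;  // stands in for call(AS_CLOSURE(callee), args_count, vm)
  DEBUG_TRACE(vm, debug__show_callframe_names(vm));
  return result;
}
```

Compiling with -DDEBUG_TRACE_EXECUTION turns the two trace calls back on without touching call_closure.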
debugging clox when there's a segmentation fault or other fatal error is hard because there is no stacktrace to examine. if you look it up online, you'll find a few ways, but we need to research a little bit more to find one that's as effective as possible.
jlox already implements the list primitive type (although it doesn't have literal syntax yet). clox should implement it too to have feature parity.
For its expected usage and API, look at samples/list.lox. The only additional piece of syntax needed is for indexing, which takes the form list[index]. In the jlox implementation we experimented with desugaring to a method call to __setitem__ and it worked quite well. We should document this in the (not yet written) spec for Lox.
currently, the settings file .loxrc must be in the current working directory for it to load automatically upon starting the REPL. we should relax this restriction so the file can be in $HOME/.loxrc or in any ancestor directory starting from the current working directory.
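A possible lookup for the clox side, as a sketch only. It assumes a POSIX system (it uses access from unistd.h) and an absolute starting directory, and it checks the ancestors before $HOME, which is an assumption about the desired precedence. The names strip_last_component and find_loxrc are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// Removes the last component of an absolute path in place
// ("/a/b" -> "/a", "/a" -> "/"). Returns false once the root is reached.
bool strip_last_component(char *path) {
  if (strcmp(path, "/") == 0) return false;
  char *slash = strrchr(path, '/');
  if (slash == NULL) return false;
  if (slash == path) {
    path[1] = '\0';  // keep the root slash
  } else {
    *slash = '\0';
  }
  return true;
}

// Writes the path of the first readable .loxrc into `out`, searching
// `start_dir`, then each of its ancestors, then $HOME.
bool find_loxrc(const char *start_dir, char *out, size_t size) {
  char dir[4096];
  if (strlen(start_dir) >= sizeof dir) return false;
  strcpy(dir, start_dir);

  do {
    const char *separator = strcmp(dir, "/") == 0 ? "" : "/";
    int n = snprintf(out, size, "%s%s.loxrc", dir, separator);
    if (n > 0 && (size_t)n < size && access(out, R_OK) == 0) return true;
  } while (strip_last_component(dir));

  const char *home = getenv("HOME");
  if (home != NULL) {
    int n = snprintf(out, size, "%s/.loxrc", home);
    if (n > 0 && (size_t)n < size && access(out, R_OK) == 0) return true;
  }
  return false;
}
```

The REPL would call find_loxrc with the result of getcwd and load the returned path if the function reports success.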
the idea is to have the same set of tests for both interpreters. each test should be a pair of input/output files. for example, the following program tests shadowing of global variables by local variables:
var i = 0;
{ var i = 1; println i; }
println i;
the expected output is:
1
0
The REPLs for both clox and jlox now have new visible features that'd be good to showcase.
In jlox we already have all the backend support required (introduced when list was implemented), so it should be relatively easy to add (minus support for literals).
In clox the implementation shouldn't be too hard either, but it's a bit more involved since we haven't added list yet. The backing infrastructure is in place, though (namely, table.c).
consider the following code:
>>> var l = list()
>>> l.append(1)
>>> l[1]
Runtime Error: tried to access index 1, but valid range is [0..0] or [-1..-1]
[line 1, token: 'at']
the last error is expected to say [line 1, token: '[']; the current message happens because internally, index access is desugared to the list.at function, but this can be quite confusing for users.
the :load command would take a single parameter: the path to a lox source file.
as a bonus, we might configure the line readers in each REPL to autocomplete source files.
the feature has already been implemented in clox. we can do a similar implementation in jlox and remove the kludge that's currently implemented there.