Giter Club home page Giter Club logo

basil's People

Contributors

elucent avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

basil's Issues

Add composite types to user-level code.

Basil internally supports values of product and sum types, and uses them to represent certain internal data - for example, the tuple of arguments passed into a function when called.

We should support the construction of products and sums in userspace code via the inclusion of some new built-in operators and functions, or possibly even new grammar. These values should be implemented both as compile-time values and as runtime nodes.

In particular, the following operations should be defined on the aforementioned types:

  • Random-access by index of an element of a tuple.

  • Find the actual type of a sum type value.

  • Access the current value of a sum type by type (for example, we might write int of union: int|string to access the int value of a union of int and string).

We should also consider adding arrays, along with a built-in function or syntax extension to write array literals.

Problem with $help command

I ran this on the http://repl.it site

I tried the $help command at the ? prompt of Basil, and only made it work if I typed a non-intuitive EOF

$ make basil
$ ./basil

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ โ”‚
โ”‚ ๐ต๐‘Ž๐‘ ๐‘–๐‘™ โ€” version 0.1 โ”‚
โ”‚ โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Enter any Basil expression at the prompt, or '$help' for a list of commands!

? $help (I Typed CONTROL-D)
. (I typed CARRIAGE RETURN)

Commands:

  • $help => prints this command list.
  • $quit => exits REPL.
  • $print-tokens => toggles token printing.
  • $print-parse => toggles parse tree printing.
  • $print-ast => toggles AST printing.
  • $print-ssa => toggles SSA printing.
  • $print-asm => toggles assembly printing.

? ? $quit ( typed CARRIAGE RETURN)
. (I typed CARRIAGE RETURN)
$ (FINALLY GOT LINUX PROMPT)

Note: I found that I didn't need to type CONTROL-D I could simply type CARRIAGE RETURN and then another CARRIAGE RETURN at the prompt.
I could also use CONTROL-M or CONTROL-J instead of CARRIAGE RETURN

Thoughts about a flexible yet compile-time turing complete language

Crazy braindump:

Today I opened again the repo https://github.com/zeroflag/punyforth and it triggered unexpected brainstorming.

What if Basil became a language with something like punyforth's goodies:

  1. the "dictionary" semantics (hyper static global environment)
  2. the unique exception handling
  3. the seamless yet clear distinction between "immediate words" (executed in compile time; suitable for metaprogramming - especially control structures and similar), "parsing words" (also compile time; suitable for metaprogramming by "looking ahead"), "deferred words" (executed in run time; copy-on-write for keys & values in the "dictionary"), "overriding" (executed in run time; suitable for aliasing and thus redefinitions or value changes etc.), and "quotations" ("recorded code in run time to execute in run time")
  4. terse syntax due to missing need for variables (but with dead-easy way to have them for clarity - e.g. by deferred words or overriding)
  5. unit testing etc.
  6. seamless cooperative multitasking

but would significantly improve on:

  1. syntax to become (much) better and ordinary
    1. the need of low-level manual shuffling with dup swap etc. is a no-go
    2. it shall support "stateful" code writing - most languages have some form of lexical scoping (delimited e.g. by a file beginning and end) in which one doesn't have to write some "prefix" etc. significantly lowering the length of combinators/methods/function_names/variable_names/etc.
    3. if possible all 3 notation variants shall be supported seamlessly (prefix, infix, postfix)
  2. performance to become (much) better on modern CPUs (stack is the bottleneck, low-level orientation is also a bottleneck, etc.)
  3. compile-time safety checking (including safer defaults and higher-level APIs & concepts)

No. (1) is already on its way in Basil, but compile-time stuff is not yet smooth enough and it doesn't seem as universal as ponyforth. With others, it's still an open question ๐Ÿ˜‰.

Thoughts?

String formatting

Basil should have some way to easily format strings. Here are a couple syntax proposals.

The percent symbol could be swapped out for something like # or $.

  • ("hello %0 %1" format "world" true) -> yields "hello world true"
  • `hello %{"world"} %{true}` -> yields "hello world true"

XL lang with some similarities to Basil

I've just come across XL and am quite surprised by its capabilities and simplicity. It's default compiler is written in C++ and it can compile to native binary as well as to bytecode (for which there is a VM).

XL seems a bit similar to Basil when it comes to the goals, so it might be a good source of inspiration. XL is being actively developed, so there is also plenty of room to influence XL itself ๐Ÿ˜‰.

Rework compiler backend with register allocation and SSA-based mid-level IR.

The current compiler backend goes directly from an abstract syntax tree to a list of instructions. It'd be easier to perform analyses and implement new compiler optimizations on an SSA form, that gets lowered to the instruction list we currently have. In particular, this addition should help us implement register allocation, which should significantly improve language performance.

Attributes/annotations

Basil would benefit from some sort of generalized user-level annotation feature. Some uses include:

  • Toggles for compiler optimizations (automatic struct field alignment)
  • Static checks (function overriding, pure function, etc)
  • Automated tooling, such as documentation generation or code analysis (coverage, timing, etc)

Bug in allowing elided ':' in infix operator/macro calls

infix (x added-to y :and z)
    x + y + z

display (1 added-to 2 and 3)

Expected output:

6

Actual output:

Expected term in binary expression.
Expected term in binary expression.
Could not evaluate term 'error'.

However, the above code will run correctly if the call to added-to includes a quoted and:

display (1 added-to 2 :and 3)

Support multiple return values

Support for functions returning multiple values has many implications in terms of potential language features:

  • Being able to index intro lists using lists (list[1 .. 5], where .. returns the range from 1 to 5)
  • Comprehensions

Add sub-64-bit integer types as primitives.

Basil should support integer types with widths less than 64 bits.

Similar to Go's primitive int types, I propose redefining the current Int type as a generic "best size" integer type equal to the word size of the target architecture. We'll augment this with types with explicit sizes: I propose I8, I16, I32, and I64 for brevity's sake.

Some open questions associated with this feature:

  • Should we include unsigned integers in the language? What should their semantics be (specifically w.r.t. type coercion)?
  • Is there a compelling reason to define a distinct equivalent of C's intptr_t? Word-sized Int should be equivalent to this on most modern systems - perhaps we could even specifically define Int as pointer-sized?
  • Should we consider generalizing ints for arbitrary bit-widths? I128 could have some uses. What about atypicalwidths like I7 or I50?

Forms versus macros

Recent commits added "macros". Naturally the question arises why Basil needs macros if it already has forms (which is a more generic concept capable of everything macros can and much more).

@elucent could you elaborate?

Note I didn't do any deep investigation on macros nor forms in Basil as it's all still very much in flux.

Vaporization instead of reference counting to jump to the close-to-metal language battleground

Languages with advanced compile-time evaluation & compile-time syntax extensibility make it much easier to overcome restrictions imposed by some highly efficient memory management techniques.

Basil is such a language and thus I'd like to point out the Vaporization technique which might actually allow Basil to get rid of reference counting altogether if certain constraints will be implemented.

Vaporization also allows for real-time guarantees which makes the language more attractive in general.

Thoughts?

Better pretty printing

Currently functions and macros are printed very opaquely.

Printing a function yields <#procedure> and printing a macro yields <#macro>.

Ideally these prints would include more info, e.g. printing the plus function yields <#procedure +: (int * int) -> int> or something similar.

AsmJit C++ library for easy and super-fast compilation to ASM

I've heard that Basil might want to generate its own assembly to allow for some tricks not applicable to other languages.

For that purpose I'd propose a library which I myself consider as the best from state-of-the-art libraries dealing with ASM on the highest possible level (including semi-automated register allocation/deallocation etc.).

https://asmjit.com/

Create standard library

One of the most important elements of making Basil more accessible to new users is providing those users with access to useful library functions. In its first form, this will look like:

  • data structure operations (lists, strings, etc.)
  • math
  • control structures (for-in loops, etc.)

Each separate category of operation will be defined in it's own file, allowing users to import library functions as they're needed.

The overall goal for the Basil standard library is to be easy to use and to have enough breadth that it covers the majority of programmers' average use cases.

Add more primitive types of different sizes.

Basil currently only supports word-size types, and has a small list of primitives: int, bool, symbol, and string. Since Basil is working towards being a viable systems-programming language, it would probably be best to support a variety of other common primitives, including:

  • 8, 16, and 32-bit signed integer types.

  • 32 and 64-bit floating point numbers.

  • raw pointers.

Type coercion between some of these types would be useful, but not entirely necessary for the 0.2 update. We can revisit it later.

External functions, foreign function interface.

Support the definition of functions with undefined bodies. These functions must have full type annotations, and represent external symbols that will be resolved at link time. This will allow Basil to interact with native libraries and substantially broaden the kinds of applications Basil can be used in.

In the current Basil runtime, code is JIT-compiled, so we'd need to create our own linker of sorts to find the implementations of these functions. For a start, we should consider just supporting shared objects, to avoid having to parse ELF files. We can consider full-on AOT compilation for a later issue.

Add floating-point primitives.

Basil should support floating-point primitives, specifically a 32-bit float and 64-bit double, as primitives. Floating-point types should support implicit conversion from smaller integer types, and to larger floating-point types.

Continuation Passing Style (CPS) and compile-time evaluation (and parallelism/concurrency)

After reading nim-lang/RFCs#295 (comment) I immediately thought about Basil as this might come handy for code generation (see C implementation of CPS on few tens of lines a bit similar to Scheme).

The interesting thing I didn't realize until today (despite my craziness into parallelism & concurrency) is that CPS might make parallelism & concurrency easier to implement with guarantees (e.g. memory bounds) which are otherwise hard to achieve when using other techniques and looking for performance. This seems to be true though only if used extensively in compile time.

So I'm writing this as some of the links in https://github.com/weavers-guild/weave-io/tree/master/research might be really useful for Basil.

Add garbage-collected references and reference types.

Basil is planned to be a garbage-collected language, whereas most types in the language are value types (with the exceptions of strings, lists, and closures). In order to support more desirable semantics for certain types of objects, such as arbitrarily-long lifetimes and minimal copying, Basil should have an easy-to-use reference type that allows the allocation and management of arbitrary values.

I propose introducing a new type kind to Basil, "Reference". We'll refer to a reference to some type T as a T ref. Overall, the semantics of references are rather similar to those of raw pointers (#30), but since references are safer and fall quite naturally out of any GC, we'll be prioritizing their development. At minimum, references should support a few unique operations:

  • ref : Type -> Type: to construct a reference type value, i.e. Int ref is the type of a reference to an integer.
  • new : T? -> ref T?: allocates a value of type T on the heap and returns a managed reference to it.
  • deref : ref T? -> T?: dereferences a managed reference and returns the pointed-to value. This pattern also functions as an lvalue!

Some open questions:

  • While pointer arithmetic is distinctly not a goal of reference types, we might want similar syntax for managed and non-managed pointers: perhaps deref foo could be represented as *foo or foo.* for brevity's sake?
  • Should references be nullable? While it would be relatively easy to represent this on a value level, nullability has famous downsides. Perhaps we can require the use of optional types or unions, and optimize out tags when possible.
  • Since reference types don't change the semantics of a value too greatly, we might want their use to be as seamless as possible. Some features that would fall into this category would be syntactic sugar to define implicit reference types (such as ref Foo = Int for an automatically-boxed integer), or perhaps automatic dereference via type coercion (T <: T ref?).

Dreaming about end-user/programmer use-cases

I was (again ๐Ÿ˜‰) dreaming lately about:

  1. making teishi-like safety/validation/assert lib work in (mostly) compile-time instead of specifying types for each argument of a function separately (yeah, it's pretty radical, but it's vastly more powerful and readable than relying on just argument types in which case one would need to create duplicates of existing functions and existing types just to represent an instance of the tuple function, types, values like in Haskell to get about the same expressiveness but at the huge cost of verbosity and "intent obfuscated and scattered all over the app source")

  2. having the SixtyFPS syntax ported to native Basil syntax (probably also drawing some inspiration from https://github.com/fpereiro/gotoB and maybe even ustack and Jasonelle)

  3. having async stuff as in Julia (i.e. syntactically explicit and clear "what it does" and that it's all just "compile-time" macros and thus how it copes with e.g. multithreading etc.) to account for hybrid systems like "ECS & async"

  4. seamless "high-level" FFI like PythonCall (not to be confused with PyCall) to Python (yes, this is no joke - this'll allow Basil to develop & maintain only a super tiny standard library)

  5. seamless "low-level" FFI for C/C++/Rust libs (to cover the rest of the use cases) - note this FFI shall also support seamless interfacing with C/C++ structs/data_types

  6. language support for "full" hierarchy of caches (from registers through L1... caches and RAM and SSD and HDD and network attached storage up to extremely long-term archiving storage like Amazon Glacier)

    E.g. I wrote once a "persistentdict" for Python whose aim was to be a "true" Python dict with the twist that everything added/removed/changed is being automatically persisted (with the highest guarantee the given OS supports) - the only difference to Python dict was initialization which needed two arguments (path to a file used as backing storage and a "namespace string" to support multiple persistent dicts with the same backing storage) - and if one used Python context managers with multiple such persistent dicts in the code block, then the changes made were transactional (fully ACID) unlike if used separately when only operations upon each separate persistentdict instance were guaranteed transactional (fully ACID).

    Working with that was a breeze (incl. easy optimizations if needed) compared to any other persistent API I've ever seen (I've even seen some benchmarks of big databases like PostgreSQL on server HW and was surprised that my persistentdict performed about the same in multithreaded write benchmarks on my old laptop with slow HDD ๐Ÿ˜ฎ - I've checked it several times as I couldn't believe it, but yes, it was that bad with bigger DBs - probably due to their extremely complicated locking schemes etc.).

    For Basil I imagine something even more "first class" (pun intended with the recent jam ๐Ÿ˜‰).

    Maybe just make all variables/arguments/return_values persistent by default (of course not naively otherwise we'd be feeling like we're running on 0.1MHz CPU, but maybe using some escape analysis, sampling of the "important state" or just smartly/semi-automatically determining save points, etc.)? If DBs can (more or less) do that, why not Basil?

  7. Please add you own below in the comments!

(yeah, I totally admit some of these sound like true scifi, but hey, let's offer something unusual but extremely useful unlike other languages!)

Tracing garbage collection.

The langjam build of Basil ruthlessly leaked memory whenever it needed a string or list. This really shouldn't be the case for a proper release...

This issue proposes a simple tracing garbage collector for Basil. The priorities of this design are fairly straightforward:

  • The GC should be fast - we'd like to be able to minimize the amount of scanning we do, and maximize fast allocations.
  • We should support both boxed and unboxed values on the stack. Value types substantially larger than word-size should be free to coexist with garbage-collected pointers. In fact, it'd really be ideal if we could support both tracked and untracked pointers in our GC scheme.

To this end, I propose we adopt a two-space copying collector, using stack and heap value maps to locate references. Using stack maps allows us to safely ignore non-GC'd values, and using a copying collector allows us to minimize the amount of heap-traversal needed per collection, as well as allowing us to keep allocations fast.

Concurrent and generational GC are probably reasonable to consider for the future, but considering the relatively small number of roots in a typical Basil program currently (should be essentially totally constrained to the stack, and thus proportional to call depth), I'm not expecting a tremendous cost of GC right off the bat. Something simple and workable is all we need for the 1.0 release.

Add raw pointers and pointer types.

Basil compiles directly to native code, so it should be relatively straightforward to support unsafe pointer types and pointer arithmetic, at least from a codegen perspective. While the primary means of passing around reference types should be through garbage-collected pointers, supporting raw pointers would allow Basil to more easily express low-level function prototypes - useful for interacting with foreign functions.

I propose we introduce a new primitive type kind, "Pointer", that has a single parameter - the type it points to. We'll tentatively denote the type of a pointer to some type T as T ptr. Pointer types can be coerced generically to pointer types that point to a generic type like Any or a type variable. Besides this, pointer types support no other implicit coercions.

Pointers should, at minimum, support a few primary operations:

  • deref : T? ptr -> T?: dereferences a pointer and returns the value it points to.
  • addr : T? -> T? ptr: computes the address of a value and returns it as a pointer value. The parameter to addr must be an lvalue!
  • T? ptr as U? ptr: converts one pointer type to another. This is unchecked - unsafe pointer coercion is not a runtime or compile error!
  • T? ptr as Int: converts a pointer to an integer value.
  • Int as T? ptr: converts an integer to a pointer value. Between this and the previous conversion, rudimentary pointer arithmetic is achievable.

A few open questions:

  • Should we introduce new syntax for dereference and address-of operations? We could replicate C-style &val and *ptr syntax with a few new tokens. One less-invasive alternative would be Zig-style dot syntax: val.& and ptr.* would be easily expressed in the current Basil semantics.
  • Perhaps we could add some easier pointer arithmetic instructions than converting to and from Int? Maybe it could be type-based: ptr + Int could add the size of an Int to the address contained in ptr.

Killer app/lib/built-in-feature to make Basil stand out

...and maybe lead to some monetization (as language itself is not monetizable, but an end-user product is).

Motivation

There are many hardly explored areas where compile-time evaluation brings major advantages.

Recently I thought a rules engine - if made part of the standard Basil runtime - could accelerate public interest in Basil. More importantly, it'd allow for terse, provably correct (assuming the runtime itself is correct, ehm), and readable/understandable programming.

This would substitute all reactive programming, ECS libraries, etc. Game development, UI development, etc. shall then become a breeze in Basil.

Summary

Build something like https://github.com/paranim/pararules but without the few downsides pararules have.

Details

  1. Basil should focus on the staticRuleset functionality (incl. the wrapper macro to avoid forced centralization of rules in the source code) instead of the alternative naive implementation (uses mutable dynamic arrays/tables under the hood) pararules offer.
  2. The rule engine shall be fully parallel to take advantage of tens, hundreds, etc. processing cores.
  3. The syntax shall be "safe by default" - i.e. probably making then = false the default to guarantee no cycles can happen.
  4. session.retract( ... ) shall be discouraged and instead a "cascading" session.retractCascade( ... ) shall be introduced to guarantee the fact DB will always stay consistent.
  5. Because Basil uses value-passing, the API shall automatically always use references (in whatever form - not necessarily a pointer) to all bigger values (usually most non-primitive types or anything having more than two or four bytes) passed to the rules engine.
  6. Leverage the low-level capabilities of Basil and optimize the rules engine leveraging all the information available in compile time (especially to achieve cache locality & coherency as that seems to be what the rules engine algorithms suffer from).
  7. Improved syntax (this I'm not much afraid of as Basil is nicely readably by default, so it's only a matter of "not screwing it up much" ๐Ÿ˜‰).
  8. ... (this list is not exhaustive)

Add bitwise operators

Add bitwise operators including:

  • & (bitwise and)
  • | (bitwise or)
  • >> (bitwise left shift)
  • << (bitwise right shift)

Support optional type annotations.

Currently Basil code is either dynamically-typed in compile-time evaluation, or has its types inferred before code generation. It would be nice to support explicit type annotations so that it can be clearer to the reader what type a value is supposed to be.

This is currently built into the grammar with the infix : operator, and usage of this operator is transformed by the parser into a call to the built-in annotate function. This function needs to be implemented, though, both as a compile-time operation and as an AST node. The behavior of this function should be to assert the type of a value, emitting a compile error if a type inconsistency is found.

The right-hand side of a type annotation should be a type value. Type values are currently supported, but no primitives exist for them at the moment. We should add built-in variables for primitive types, such as int, symbol, and string.

Definitions also need special support for type annotations:

  • A definition of the form def x: int 1 should transform into def x 1: int, ensuring the type of the initial value.

  • A definition of the form def (f x: int): int x is a bit more complicated. The first annotation should produce an annotation node in the function body, asserting the type of whatever value was provided as an argument. The second annotation specifies the return type of the function, and should assert that the function body returns a value of said type.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.