The Pretty Laughable Programming Language

An educational C-like toy programming language that compiles to x64 binary.

The compiler is a self-contained Python program of about 1000 lines of code. It is part of an online tutorial on compilers and interpreters.

Introduction

The hello world looks like this:

; the write() syscall:
; ssize_t write(int fd, const void *buf, size_t count);
(syscall 1 1 "Hello world!\n" 13)
0

Compile and run the program:

$ ./pl_comp.py ./samples/hello.txt -o ./hello
$ ./hello
Hello world!

The output is a tiny freestanding x64 Linux ELF binary.

$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, no section header
$ wc -c hello
288 hello
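
The properties that `file` reports come from the 64-byte ELF file header at the start of the binary. As a rough sketch (not taken from pl_comp.py; `describe_elf_header` and `ELF64_HEADER` are hypothetical names), the same fields can be decoded with Python's `struct` module:

```python
import struct

# ELF64 file header layout (little-endian), 64 bytes in total.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

ET_EXEC = 2      # e_type value for an executable file
EM_X86_64 = 62   # e_machine value for x86-64


def describe_elf_header(data: bytes) -> dict:
    """Decode the ELF64 header fields that `file` summarizes."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short to be an ELF64 file")
    (ident, e_type, e_machine, _version, e_entry, _phoff, _shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = ELF64_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    return {
        "is_64bit": ident[4] == 2,           # EI_CLASS == ELFCLASS64
        "is_little_endian": ident[5] == 1,   # EI_DATA == ELFDATA2LSB
        "is_executable": e_type == ET_EXEC,
        "is_x86_64": e_machine == EM_X86_64,
        "entry": e_entry,
        "program_headers": e_phnum,
        "section_headers": e_shnum,          # 0 corresponds to "no section header"
    }
```

Reading the first 64 bytes of `./hello` in binary mode and passing them to `describe_elf_header` should agree with the `file` output above, assuming the binary is laid out as `file` describes it.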

The Language

The syntax is just S-expressions, because parsing strings is too boring for me.
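
To illustrate why S-expressions keep parsing simple, here is a minimal sketch of a reader in Python. It is not the parser from pl_comp.py; the function names are hypothetical, and it assumes only what the samples in this README show: `;` starts a line comment, strings are double-quoted with backslash escapes, and everything else is a parenthesis or an atom.

```python
def tokenize(src: str) -> list:
    """Split source text into parentheses, string literals, and atoms."""
    tokens = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
        elif ch == ";":                       # comment runs to end of line
            while i < len(src) and src[i] != "\n":
                i += 1
        elif ch in "()":
            tokens.append(ch)
            i += 1
        elif ch == '"':                       # string literal, kept with its quotes
            j = i + 1
            while j < len(src) and src[j] != '"':
                j += 2 if src[j] == "\\" else 1   # skip escaped characters
            if j >= len(src):
                raise SyntaxError("unterminated string")
            tokens.append(src[i:j + 1])
            i = j + 1
        else:                                 # atom: name, number, or operator
            j = i
            while j < len(src) and not src[j].isspace() and src[j] not in '();"':
                j += 1
            tokens.append(src[i:j])
            i = j
    return tokens


def parse(src: str) -> list:
    """Parse source text into a list of top-level expressions (nested lists)."""
    tokens = tokenize(src)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing ')'")
            pos += 1                          # consume the closing ")"
            return items
        if tok == ")":
            raise SyntaxError("unexpected ')'")
        return tok

    exprs = []
    while pos < len(tokens):
        exprs.append(read())
    return exprs
```

Applied to the hello world program, `parse` yields two top-level expressions: the `syscall` list and the atom `0`. Atoms are left as strings here; turning them into numbers and types is a separate step.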

The semantics are C-like. The only data types are integers and pointers, which should be enough to write any program.

01. Pointers

The peek command reads data from a pointer and the poke command writes to a pointer.

; copy data byte by byte
(def (memcpy void) ((dst ptr byte) (src ptr byte) (n int)) (do
    (loop n (do
        (poke dst (peek src))
        (set dst (+ 1 dst))
        (set src (+ 1 src))
        (set n (- n 1))
    ))
))
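
For readers who want to trace the loop by hand, the following Python sketch models the same byte-by-byte copy. The `Memory` class is a hypothetical stand-in for the process address space (pointers become plain indices into a `bytearray`); it is an illustration of peek and poke, not how the compiler implements them.

```python
class Memory:
    """A hypothetical flat, byte-addressed memory; pointers are indices."""

    def __init__(self, size: int):
        self.cells = bytearray(size)

    def peek(self, addr: int) -> int:
        """Read one byte from the given address."""
        return self.cells[addr]

    def poke(self, addr: int, value: int) -> None:
        """Write one byte to the given address (value truncated to 8 bits)."""
        self.cells[addr] = value & 0xFF


def memcpy(mem: Memory, dst: int, src: int, n: int) -> None:
    """Copy n bytes from src to dst, one byte at a time."""
    while n:                      # like (loop n ...): run while n is nonzero
        mem.poke(dst, mem.peek(src))
        dst += 1
        src += 1
        n -= 1
```

As in the original, `n` is assumed to be non-negative and the two regions are assumed not to overlap in a way that matters for a forward copy.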

02. Control Flow

List of control flow structures:

(? cond yes no)
(if cond (then yes blah blah) (else no no no))
(do a b c...)
(loop cond body)
(break)
(continue)
(call f a b c...)
(return val)

Some examples:

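; recursive version
; note: despite the name, this returns n + (n-1) + ... + 1, not a Fibonacci number
(def (fib int) ((n int))
    (if (le n 0) (then 0) (else (+ n (call fib (- n 1))))))

; iterative version of the same computation
(def (fib int) ((n int)) (do
    (var r 0)
    (loop (gt n 0) (do
        (set r (+ r n))
        (set n (- n 1))
    ))
    (return r)
))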
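
For comparison, here is a rough Python transcription of the two `fib` definitions above, under the assumption that `if`, `loop`, `le`, and `gt` behave like their C counterparts. The names `fib_recursive` and `fib_loop` are made up for this sketch. Both versions compute the sum n + (n-1) + ... + 1 rather than a Fibonacci number.

```python
def fib_recursive(n: int) -> int:
    """Mirror of the (if cond (then ...) (else ...)) version."""
    if n <= 0:
        return 0
    return n + fib_recursive(n - 1)


def fib_loop(n: int) -> int:
    """Mirror of the (loop cond body) version."""
    r = 0
    while n > 0:
        r = r + n
        n = n - 1
    return r
```

The two functions agree for every input, which makes them a convenient pair for checking that recursion and loops are compiled consistently.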

03. Data Types

The only data types are:

  • byte: unsigned 8-bit integer.
  • int: signed 64-bit integer.
  • ptr elem_type: pointer to elem_type.

Variable types are automatically inferred:

(var a 123)         ; int
(var b 45u8)        ; byte
(var p (ptr int))   ; a null pointer to int
(var s "asdf")      ; ptr byte
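
The inference for literals can be pictured as a small classification step. The sketch below is an assumption based only on the examples in this README (decimal and `0x` integers, the `u8` suffix, and double-quoted strings); `literal_type` is a hypothetical helper, and the real compiler may accept more forms or apply different rules. Type expressions such as `(ptr int)` are not literals and are not handled here.

```python
import re

_INT = re.compile(r"-?(0x[0-9a-fA-F]+|[0-9]+)")      # e.g. 123, -1, 0x1234
_BYTE = re.compile(r"(0x[0-9a-fA-F]+|[0-9]+)u8")     # e.g. 45u8


def literal_type(token: str) -> str:
    """Return the inferred type name for a literal token."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return "ptr byte"        # string literals are pointers to bytes
    if _BYTE.fullmatch(token):
        return "byte"            # unsigned 8-bit integer
    if _INT.fullmatch(token):
        return "int"             # signed 64-bit integer
    raise ValueError(f"not a literal: {token!r}")
```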

The return type and the argument types of a function must be specified explicitly:

(def (memcpy void) ((dst ptr byte) (src ptr byte) (n int)) (do
    ; ...
))

An int can be cast to any pointer type and vice versa:

(var i 0x1234)                  ; int
(var p (cast (ptr int) i))      ; ptr int
(var a (cast (int) (+ 1 p)))    ; int

04. Memory Management

Memory management is very simple at this point, because it doesn't exist at all.

However, the language doesn't prevent you from building your own memory management routines. This usually starts with the mmap syscall.

(var heap (ptr byte))

; a fake malloc
(def (malloc ptr byte) ((n int)) (do
    (if (not heap) (do
        ; create the heap via mmap()
        (var heapsz 1048576)    ; 1M
        (var prot 3)            ; PROT_READ|PROT_WRITE
        (var flags 0x22)        ; MAP_PRIVATE|MAP_ANONYMOUS
        (var fd -1)
        (var offset 0)
        (var r (syscall 9 0 heapsz prot flags fd offset))
        (set heap (cast (ptr byte) r))
    ))
    ; just move the heap pointer forward
    (var r heap)
    (set heap (+ n heap))
    (return r)
))

; TODO: figure out how to recycle the memory
(def (free void) ((p ptr byte)) (do))
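
The same bump-allocation idea can be sketched in Python using the standard `mmap` module. This is an illustration only: `BumpAllocator` is a hypothetical class, addresses are offsets into the mapping rather than real pointers, and the bounds check is an addition that the version above does not have. Note also that `mmap.mmap(-1, size)` creates an anonymous mapping with Python's default flags, which is not necessarily the `MAP_PRIVATE|MAP_ANONYMOUS` combination used above.

```python
import mmap


class BumpAllocator:
    """A fake malloc: hand out consecutive chunks and never reuse them."""

    def __init__(self, size: int = 1048576):     # 1M, as in the sample above
        self.size = size
        self.heap = None                          # created lazily on first malloc
        self.top = 0                              # offset of the next free byte

    def malloc(self, n: int) -> int:
        """Reserve n bytes and return the offset of the chunk."""
        if self.heap is None:
            # Anonymous, readable and writable mapping.
            self.heap = mmap.mmap(-1, self.size)
        if n < 0 or self.top + n > self.size:
            raise MemoryError("heap exhausted")   # extra check, not in the original
        offset = self.top
        self.top += n                             # just move the heap pointer forward
        return offset

    def free(self, offset: int) -> None:
        """Do nothing: memory is never recycled, as in the original."""
```

Each call returns the offset right after the previous chunk, so two allocations of 16 and 8 bytes land at offsets 0 and 16. Recycling freed chunks would need extra bookkeeping, such as a free list.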

05. The stdlib

The Pretty Laughable Language comes with the world's smallest standard library: none at all. There is not even a builtin print function.

But with the ability to make arbitrary syscalls and peek-poke the memory, you can build your own stdlibs. Let's add the print function:

(def (strlen int) ((s ptr byte)) (do
    (var start s)
    (loop (peek s) (set s (+ 1 s)))
    (return (- s start))
))

(def (print void) ((s ptr byte)) (do
    (syscall 1 1 s (call strlen s))
))

(call print "Yes!\n")
0
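
The logic of `strlen` and `print` can be mirrored in Python to see what the loop does. This sketch assumes string literals are NUL-terminated byte sequences, which is what the `(loop (peek s) ...)` scan relies on. The names `strlen` and `print_cstr` are chosen for illustration, and `os.write` stands in for the raw `write` syscall.

```python
import os


def strlen(buf: bytes, start: int = 0) -> int:
    """Count the bytes from start up to (not including) the first NUL."""
    end = start
    while buf[end] != 0:          # like (loop (peek s) ...): stop at the 0 byte
        end += 1
    return end - start


def print_cstr(buf: bytes, fd: int = 1) -> int:
    """Write a NUL-terminated string to fd; return the number of bytes written."""
    n = strlen(buf)
    return os.write(fd, buf[:n])
```

A buffer without a terminating NUL raises `IndexError` here, whereas the compiled program would keep reading past the end of the string.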

Here is a more sophisticated program you can play with.

Roadmap

Language features:

  • int, byte
  • pointer
  • syscall
  • if-then-else, loop
  • function
  • nested function, nonlocal variable
  • array
  • struct, class
  • function pointer

Explorations:

  • module or include directive
  • macro?
  • alternative syntax?
  • Windows
  • ARM64
  • WASM

Optimizations:

  • register allocation
  • constants
  • tail call

The Design

01. The Goal

TBA

02. The IR (Intermediate Representation)

TBA

03. Machine Code Generation

TBA

The Implementation

To be added.

But you can learn how to do it by reading the source code.

Or you might like the book From Source Code To Machine Code, which this repo is based on.
