mini-typescript's Introduction

mini-typescript

A miniature model of the Typescript compiler, intended to teach the structure of the real Typescript compiler

This project contains two models of the compiler: micro-typescript and centi-typescript.

micro-typescript began while I was reading Modern Compiler Implementation in ML to learn more about compiler backends. When I started building the example compiler, I found I disagreed with the implementation of nearly everything in the frontend. So I wrote my own, and found that I had just written a small Typescript.

I realised a small Typescript would be useful to others who want to learn how the Typescript compiler works. So I rewrote it in Typescript and added some exercises to let you practise with it. micro-typescript is the smallest compiler I can imagine, implementing just a tiny slice of Typescript: var declarations, assignments and numeric literals. The only two types are string and number.

So that's micro-typescript: a textbook compiler that implements a tiny bit of Typescript in a way that's a tiny bit like the Typescript compiler. centi-typescript, on the other hand, is a 1/100 scale model of the Typescript compiler. It's intended as a reference in code for people who want to see how the Typescript compiler actually works, without the clutter caused by real-life compatibility and requirements. Currently centi-typescript is most complete in the checker, because most of Typescript's complexity is there.

To get set up

git clone https://github.com/sandersn/mini-typescript
cd mini-typescript
code .

# Get set up
npm i
npm run build

# Or have your changes instantly happen
npm run build -- --watch

# Run the compiler:
npm run mtsc ./tests/singleVar.ts

To switch to centi-typescript

git checkout centi-typescript
npm run build

Limitations

  1. This is an example of the way that Typescript's compiler does things. A compiler textbook will help you learn compilers. This project will help you learn Typescript's code.
  2. This is only a tiny slice of the language, also unlike a textbook. Often I only put in one instance of a thing, like nodes that introduce a scope, to keep the code size small.
  3. There is no laziness, caching or node reuse, so the checker and transformer code do not teach you those aspects of the design.
  4. There's no surrounding infrastructure, like a language service or a program builder. This is just a model of tsc.

Exercises

  • Add EmptyStatement.
  • Make semicolon a statement ender, not statement separator.
    • Hint: You'll need a predicate to peek at the next token and decide if it's the start of an element.
    • Bonus: Switch from semicolon to newline as statement ender.
  • Add string literals.
  • Add let.
    • Make sure the binder resolves variables declared with var and let the same way. The simplest way is to add a kind property to Symbol.
    • Add use-before-declaration errors in the checker.
    • Finally, add an ES2015 -> ES5 transform that transforms let to var.
  • Allow var to have multiple declarations.
    • Check that all declarations have the same type.
  • Add objects and object types.
    • Type will need to become more complicated.
  • Add interface.
    • Make sure the binder resolves types declared with type and interface the same way.
    • After the basics are working, allow interface to have multiple declarations.
    • Interfaces should have an object type, but that object type should combine the properties from every declaration.
  • Add an ES5 transformer that converts let -> var.
  • Add function declarations and function calls.
  • Add arrow functions with an appropriate transform in ES5.
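
As a starting point for the let -> var exercise, here is a minimal sketch of such a transform. The `Statement` and `Expression` shapes and the name `transformLetToVar` are assumptions made up for this illustration; the project's real node types will differ. The sketch also ignores the scoping differences between let and var, which a full ES2015 -> ES5 transform would have to handle.

```typescript
// Hypothetical AST shapes for illustration; the real project's node types may differ.
type Expression =
  | { kind: "NumericLiteral"; value: number }
  | { kind: "Identifier"; text: string };

type Statement =
  | { kind: "Var"; name: string; init: Expression }
  | { kind: "Let"; name: string; init: Expression }
  | { kind: "ExpressionStatement"; expr: Expression };

// Rewrite every `let` declaration as `var`, leaving other statements untouched.
// Assumption: there are no nested blocks, so scoping differences do not matter here.
function transformLetToVar(statements: Statement[]): Statement[] {
  return statements.map((s): Statement =>
    s.kind === "Let" ? { kind: "Var", name: s.name, init: s.init } : s
  );
}

const input: Statement[] = [
  { kind: "Let", name: "x", init: { kind: "NumericLiteral", value: 1 } },
  { kind: "Var", name: "y", init: { kind: "Identifier", text: "x" } },
];
const output = transformLetToVar(input);
```

The transform returns new nodes instead of mutating the input, so the tree produced by the parser stays usable for error reporting.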

mini-typescript's People

Contributors

morzel85, orta, sandersn

mini-typescript's Issues

Test failures due to EOL conversion on Windows

It is likely that Git on Windows applies EOL conversion from LF to CRLF when the repository is cloned (controlled by the core.autocrlf config).
When that happens, all tests fail due to EOL differences between the reference (CRLF) and local (LF) files.

File difference example:

$ xxd baselines/reference/basicLex.lex.baseline

00000000: 5b0d 0a20 205b 0d0a 2020 2020 2249 6465  [..  [..    "Ide
00000010: 6e74 6966 6965 7222 2c0d 0a20 2020 2022  ntifier",..    "
00000020: 7822 0d0a 2020 5d0d 0a5d                 x"..  ]..]

$ xxd baselines/local/basicLex.lex.baseline

00000000: 5b0a 2020 5b0a 2020 2020 2249 6465 6e74  [.  [.    "Ident
00000010: 6966 6965 7222 2c0a 2020 2020 2278 220a  ifier",.    "x".
00000020: 2020 5d0a 5d                               ].]

I've faced this issue on Win 10 Home with Git v2.31.1.windows.1 (Git Bash). On Ubuntu (WSL) all worked fine.
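
One way to make the comparison insensitive to this conversion is to normalize line endings on both sides before comparing. The sketch below is only an illustration: `normalizeEol` and `baselinesMatch` are hypothetical names, not functions from the project's test runner.

```typescript
// Hypothetical helper: convert CRLF (and lone CR) line endings to LF.
function normalizeEol(text: string): string {
  return text.replace(/\r\n?/g, "\n");
}

// Hypothetical comparison that ignores EOL differences between two baselines.
function baselinesMatch(actual: string, expected: string): boolean {
  return normalizeEol(actual) === normalizeEol(expected);
}

// The same content as in the hex dumps above, once with CRLF and once with LF.
const reference = '[\r\n  [\r\n    "Identifier",\r\n    "x"\r\n  ]\r\n]';
const local = '[\n  [\n    "Identifier",\n    "x"\n  ]\n]';
const sameContent = baselinesMatch(reference, local);
```

Another option, which avoids touching the test code, would be a `.gitattributes` entry that forces LF for the relevant files (for example `* text=auto eol=lf`), so that the checkout is the same on every platform.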

Clarification for semicolon as a statement ender

Hello again @sandersn 👋, I'm working on the second exercise "Make semicolon a statement ender, not statement separator." I found an approach but am unsure if it's the right way to go. I just wanted to check with you about your thoughts on this.

Here's the implementation I've done https://github.com/imteekay/mini-typescript/pull/7

My idea is:

  • Rather than parsing statements with the semicolon as the separator, we can parse all statements until it gets to the EOF node.
  • Return the EOF node similar to how it works in TS*
  • When parsing, if it gets to the semicolon (and it's not an empty statement**), the parser skips the semicolon and starts parsing the next statement

*see example

[Screenshots: source code and the resulting AST]

**empty statements should have at least two semicolons together (is it a fair statement?). In the example above, we can see the empty statement is there because it has two semicolons between variable declarations and at the end of the ast, it has the eof node.
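
The idea described above can be sketched roughly as follows. This is not the code from the linked pull request: the `Token` and `Statement` shapes and the `parseProgram` and `tok` names are assumptions made for the illustration, and the sketch leaves out the EOF node in the result.

```typescript
// Hypothetical token and node shapes, for illustration only.
type Token = {
  kind: "Var" | "Identifier" | "Equals" | "Number" | "Semicolon" | "EOF";
  text: string;
};
type Statement =
  | { kind: "Var"; name: string; value: number }
  | { kind: "EmptyStatement" };

// Parse statements until EOF. Assumes the token list always ends with an EOF token.
function parseProgram(tokens: Token[]): Statement[] {
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];
  const expect = (kind: Token["kind"]): Token => {
    const t = next();
    if (t.kind !== kind) throw new Error(`Expected ${kind} but got ${t.kind}`);
    return t;
  };

  const statements: Statement[] = [];
  while (peek().kind !== "EOF") {
    if (peek().kind === "Semicolon") {
      // A semicolon with no statement in front of it is an empty statement.
      next();
      statements.push({ kind: "EmptyStatement" });
      continue;
    }
    expect("Var");
    const name = expect("Identifier").text;
    expect("Equals");
    const value = Number(expect("Number").text);
    expect("Semicolon"); // the semicolon ends the statement
    statements.push({ kind: "Var", name, value });
  }
  return statements;
}

const tok = (kind: Token["kind"], text = ""): Token => ({ kind, text });

// Tokens for: var x = 1;; var y = 2;
const parsed = parseProgram([
  tok("Var"), tok("Identifier", "x"), tok("Equals"), tok("Number", "1"), tok("Semicolon"),
  tok("Semicolon"),
  tok("Var"), tok("Identifier", "y"), tok("Equals"), tok("Number", "2"), tok("Semicolon"),
  tok("EOF"),
]);
```

In this sketch the second of two adjacent semicolons becomes an EmptyStatement, which matches the "two semicolons together" reading above; a lone semicolon at the very start of the input would also produce one.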

Thanks 🙇

Clarification for the "Add type aliases" exercise

Hey! I'm working on the next exercise for the mini-typescript and it's the "Add type aliases" one.
But I'm not sure if it's already implemented.

  • The lexer already creates a Type token
  • The parser already parses the Type token and creates the TypeAlias AST node
  • The binder adds the type alias to the symbol table, so the compiler can refer to it later in the program
  • And the checker checks the type based on the type name

What should I do next in this exercise, @sandersn? Thank you!

11 test failures on Windows (fresh clone at e8cbe96)

More joys of Windows (works fine on Ubuntu WSL)... Sorry ;)
Many tests fail on a freshly cloned repo:

redeclare failed: Expected baselines to match
 - result   - baselines/local/redeclare.tree.baseline
 - expected - baselines/reference/redeclare.tree.baseline
 - run: diff baselines/local/redeclare.tree.baseline baselines/reference/redeclare.tree.baseline

redeclare failed: Expected baselines to match
 - result   - baselines/local/redeclare.errors.baseline
 - expected - baselines/reference/redeclare.errors.baseline
 - run: diff baselines/local/redeclare.errors.baseline baselines/reference/redeclare.errors.baseline

redeclare failed: Expected baselines to match
 - result   - baselines/local/redeclare.js.baseline
 - expected - baselines/reference/redeclare.js.baseline
 - run: diff baselines/local/redeclare.js.baseline baselines/reference/redeclare.js.baseline

singleIdentifier failed: Expected baselines to match
 - result   - baselines/local/singleIdentifier.errors.baseline
 - expected - baselines/reference/singleIdentifier.errors.baseline
 - run: diff baselines/local/singleIdentifier.errors.baseline baselines/reference/singleIdentifier.errors.baseline

singleTypedVar failed: Expected baselines to match
 - result   - baselines/local/singleTypedVar.errors.baseline
 - expected - baselines/reference/singleTypedVar.errors.baseline
 - run: diff baselines/local/singleTypedVar.errors.baseline baselines/reference/singleTypedVar.errors.baseline

twoStatements failed: Expected baselines to match
 - result   - baselines/local/twoStatements.tree.baseline
 - expected - baselines/reference/twoStatements.tree.baseline
 - run: diff baselines/local/twoStatements.tree.baseline baselines/reference/twoStatements.tree.baseline

twoStatements failed: Expected baselines to match
 - result   - baselines/local/twoStatements.errors.baseline
 - expected - baselines/reference/twoStatements.errors.baseline
 - run: diff baselines/local/twoStatements.errors.baseline baselines/reference/twoStatements.errors.baseline

twoStatements failed: Expected baselines to match
 - result   - baselines/local/twoStatements.js.baseline
 - expected - baselines/reference/twoStatements.js.baseline
 - run: diff baselines/local/twoStatements.js.baseline baselines/reference/twoStatements.js.baseline

twoTypedStatements failed: Expected baselines to match
 - result   - baselines/local/twoTypedStatements.tree.baseline
 - expected - baselines/reference/twoTypedStatements.tree.baseline
 - run: diff baselines/local/twoTypedStatements.tree.baseline baselines/reference/twoTypedStatements.tree.baseline

twoTypedStatements failed: Expected baselines to match
 - result   - baselines/local/twoTypedStatements.errors.baseline
 - expected - baselines/reference/twoTypedStatements.errors.baseline
 - run: diff baselines/local/twoTypedStatements.errors.baseline baselines/reference/twoTypedStatements.errors.baseline

twoTypedStatements failed: Expected baselines to match
 - result   - baselines/local/twoTypedStatements.js.baseline
 - expected - baselines/reference/twoTypedStatements.js.baseline
 - run: diff baselines/local/twoTypedStatements.js.baseline baselines/reference/twoTypedStatements.js.baseline

Sample file diff:
baselines\reference\redeclare.errors.baseline

[
  {
    "pos": 14,
    "message": "Cannot redeclare x"
  }
]

baselines\local\redeclare.errors.baseline

[
  {
    "pos": 11,
    "message": "Expected identifier or literal but got Unknown"
  },
  {
    "pos": 15,
    "message": "parseToken: Expected EOF but got Var"
  }
]

OS: Microsoft Windows 10 Home 10.0.19041

Test failure: missing baselines/local directory (ENOENT: no such file or directory)

Running the tests on a newly cloned repo fails with an error because the baselines/local directory is missing.

If an empty baselines/local directory is added before npm test is run, the tests are handled properly (local files are created)...

Example (run on Ubuntu WSL):

xxx@yyy:~/projects/gh/sandersn/mini-typescript$ npm test

> [email protected] test
> rm baselines/local/*; tsc && node test.js

rm: cannot remove 'baselines/local/*': No such file or directory
node:internal/fs/utils:343
    throw err;
    ^

Error: ENOENT: no such file or directory, open 'baselines/local/basicLex.lex.baseline'
    at Object.openSync (node:fs:582:3)
    at Object.writeFileSync (node:fs:2143:35)
    at test (/home/xxx/projects/gh/sandersn/mini-typescript/test.js:13:12)
    at /home/xxx/projects/gh/sandersn/mini-typescript/test.js:78:68
    at Array.map (<anonymous>)
    at Object.<anonymous> (/home/xxx/projects/gh/sandersn/mini-typescript/test.js:78:46)
    at Module._compile (node:internal/modules/cjs/loader:1109:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1138:10)
    at Module.load (node:internal/modules/cjs/loader:989:32)
    at Function.Module._load (node:internal/modules/cjs/loader:829:14) {
  errno: -2,
  syscall: 'open',
  code: 'ENOENT',
  path: 'baselines/local/basicLex.lex.baseline'
}

Maybe the dir could be added as a postinstall step with mkdir -p baselines/local ?
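
As an alternative to a postinstall step, the test script itself could create the directory before writing to it. The following is a minimal sketch using Node's `fs.mkdirSync` with `recursive: true`; the `ensureDir` helper is a hypothetical name, and the demonstration runs in a temporary directory rather than in the repository.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Create the directory (and any missing parents) if it does not exist yet.
// With `recursive: true`, mkdirSync does not throw when the directory already exists.
function ensureDir(dir: string): void {
  fs.mkdirSync(dir, { recursive: true });
}

// Demonstration in a temporary directory, standing in for the repository root.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "baselines-"));
const localDir = path.join(root, "baselines", "local");
ensureDir(localDir);
ensureDir(localDir); // calling it again is harmless
const baselinePath = path.join(localDir, "basicLex.lex.baseline");
fs.writeFileSync(baselinePath, "[]\n");
```

Creating the directory from Node also works on systems where `mkdir -p` is not available in the default shell.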

Clarification on `String` literals

Hi @sandersn, I'm working on the third exercise "Add string literals".

Does the exercise require adding type-checking to the string literals too?
Or adding a token and parsing it is enough?

This is the wip PR: imteekay/mini-typescript#4

Edit: I'm studying how the type checker works and was able to type check string literals using typeof for the expression values (imteekay/mini-typescript@a1df2c7). I also extended the singleTypedVar test to cover this implementation. But I'm still not sure if using typeof is the right approach here.
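
For comparison, one alternative to typeof is to pick the type from the kind of node the parser produced. The sketch below only illustrates that idea and is not necessarily what the maintainer has in mind; the `Type` and `Expression` shapes, the `errorType` value and the `scope` map are assumptions made for the example.

```typescript
// Hypothetical shapes; the project's real Type and Expression definitions may differ.
type Type = { id: string };
const stringType: Type = { id: "string" };
const numberType: Type = { id: "number" };
const errorType: Type = { id: "error" };

type Expression =
  | { kind: "NumericLiteral"; value: number }
  | { kind: "StringLiteral"; value: string }
  | { kind: "Identifier"; text: string };

// Decide the type from the node kind the parser produced, rather than
// applying `typeof` to the literal's runtime value.
function checkExpression(expr: Expression, scope: Map<string, Type>): Type {
  switch (expr.kind) {
    case "NumericLiteral":
      return numberType;
    case "StringLiteral":
      return stringType;
    case "Identifier":
      // Unknown names get a placeholder error type in this sketch.
      return scope.get(expr.text) ?? errorType;
  }
}

const scope = new Map<string, Type>([["s", stringType]]);
const literalType = checkExpression({ kind: "StringLiteral", value: "hi" }, scope);
const identifierType = checkExpression({ kind: "Identifier", text: "s" }, scope);
```

Switching on the node kind keeps the checker independent of how the lexer stores literal values, which is the main design difference from the typeof approach.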

Kudos: Learnings from working on the mini-ts exercises

Hey! 👋

Just wanted to say thank you for the repo. I've been learning a lot about compilers and the TypeScript compiler.
I've been studying it and writing a series of posts about my learnings:

I added a thank you at the end of the fourth post.

(feel free to close this issue! just wanted share this with you)

Clarification on `EmptyStatement`

Hi @sandersn,
I just wanted to clarify the first exercise "Add EmptyStatement".

Is an EmptyStatement a semicolon in JavaScript/TypeScript (link)? So the idea is to stop the execution of a statement?
