ratel-rust / ratel-core

High performance JavaScript to JavaScript compiler with a Rust core

License: Apache License 2.0

Rust 97.39% JavaScript 1.49% CSS 0.44% HTML 0.61% Shell 0.07%

javascript rust transpiler compiler ast parser performance

ratel-core's People

schultyy cmtt tpraxl s-coimbra21 luozijun fdionisi ishitatsuyuki pitaj forsakenharmony amilajack yoric kdy1 tzvipm kamijin-fanta fmpomar lukasbombach ajunlonglive

ratel-core's Issues

Sparse arrays

This is valid syntax:

let foo = [,,];
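
One way to represent such holes in an AST is to store each array element as an `Option`. The sketch below is an illustration only, not ratel's parser: `parse_array_elements` is a hypothetical helper that handles flat arrays of simple tokens (no nested commas, strings, or nested arrays).

```rust
/// Splits a flat array literal into its elements, using `None` for holes.
/// Hypothetical helper for illustration; it does not handle nested commas.
fn parse_array_elements(src: &str) -> Option<Vec<Option<&str>>> {
    let inner = src.trim().strip_prefix('[')?.strip_suffix(']')?;

    let mut items: Vec<Option<&str>> = inner
        .split(',')
        .map(|item| {
            let item = item.trim();
            if item.is_empty() { None } else { Some(item) }
        })
        .collect();

    // A trailing comma does not add an element, and `[]` has no elements,
    // so an empty final piece is dropped.
    if items.last() == Some(&None) {
        items.pop();
    }
    Some(items)
}
```

With this representation `[,,]` yields two holes, which matches the length of 2 that JavaScript reports for that literal.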

Bad link in README.md

The following link
https://camo.githubusercontent.com/1ea558cf5608a653c9e3b85bc9953181ee80d2d2/687474703a2f2f7465726869782e636f6d2f726174656c2d706572662d312e706e67

returns "Cannot proxy the given URL"

Support spread expression in object

Currently we support:

[...elems]
f(...elems)

But we do not support {...props}.

Document what happens with superfluous parens

While superfluous parens have no official semantics, they are actually used by browsers to perform laziness tricks:

var foo = function() { ... }; // Parsed lazily by the browser.
var bar = (function() { ... }); // Parsed eagerly by the browser.

It would be very useful to keep superfluous parens in such a case (or perhaps always). I don't know if that's what ratel does currently, so documenting the choice would be useful, too!

Perhaps it should be a parsing option?

More doc comments

A lot of the code is meant to be self explanatory, but some parts could use extra documentation, such as the different enum variants contained in grammar.rs.

Numbers with scientific notation

Currently the tokenizer only parses integers and floats. It needs to be expanded to allow for scientific notation; the exponent part has the regex form [eE][+-]?[0-9]+.
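
A minimal sketch of how a scanner could consume the optional exponent, assuming a byte-oriented tokenizer. `scan_number` is a hypothetical name and is not taken from ratel's code; it only measures the literal's length and ignores hex, octal, and binary forms.

```rust
/// Returns the byte length of the decimal literal at the start of `src`,
/// or `None` if `src` does not start with a digit.
/// Hypothetical helper for illustration, not ratel's tokenizer.
fn scan_number(src: &str) -> Option<usize> {
    let bytes = src.as_bytes();
    let mut i = 0;

    // Integer part: [0-9]+
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        i += 1;
    }
    if i == 0 {
        return None;
    }

    // Optional fraction: \.[0-9]*
    if i < bytes.len() && bytes[i] == b'.' {
        i += 1;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
    }

    // Optional exponent: [eE][+-]?[0-9]+
    // It is only consumed when at least one digit follows the marker.
    if i < bytes.len() && (bytes[i] == b'e' || bytes[i] == b'E') {
        let mut j = i + 1;
        if j < bytes.len() && (bytes[j] == b'+' || bytes[j] == b'-') {
            j += 1;
        }
        let digits_start = j;
        while j < bytes.len() && bytes[j].is_ascii_digit() {
            j += 1;
        }
        if j > digits_start {
            i = j;
        }
    }

    Some(i)
}
```

The matched slice can then be handed to `str::parse::<f64>()`, which accepts the same exponent syntax.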

I've been thinking about coming up with a name that isn't already taken on npm but also doesn't deviate from the badger-ness. One option I got, which sounds kinda cute and should be easy to remember, is badgeroo.

Edit: we now have access to ratel on npm :).

The GitHub organization can then be changed to:

ratel-rust / ratel-cli -> ratel on npm
ratel-rust / ratel-core -> ratel on crates.io

Sidenote: project logo could be a yellow JS-esque square with a head profile (plain black + white) of a honey badger in the bottom right corner.

Website and compiler server in Rust

I think one of the issues we might have right now that isn't super obvious is that it's actually hard to see anything working. Having to download, install and compile things just to play around with Ratel can be a lot to ask for, especially when our target is a JS crowd that's used to having online demos of everything.

That being said, because Ratel is insanely performant, putting it on a cheap EC2 with some very basic HTTP server to spit out compiled JS should be trivial. The server itself can be pure Rust and just pull ratel-core as a dependency. A simple S3 website with a try-it-out page that talks to the server should also be pretty easy to do.

Need to finally decide on a domain.

Release 0.8?

I'm currently working as a background task on a bridge between BinAST and Ratel. However, Ratel 0.7.0 and Ratel 0.8.0 seem to be very different beasts, with very different ASTs.

Which version should I target? Is Ratel 0.8.0 meant to be released soon?

Sourcemap support

Source maps are undoubtedly a necessity for development tools. We currently have some span helpers but don't have any code related to emitting sourcemaps yet.

Get rid of OwnedSlice, split project into crates

Attach 'src lifetime to all AST structs and enums.
Replace all instances of OwnedSlice throughout the code with &'src str.
Remove source code from Program struct. Instead of trying to self-contain borrowing, which is difficult to impossible with the way borrowck works, we should rather embrace it and have the AST be an immutable borrow on the source. More in rationale below.
Some unsafe code might be necessary for transmuting lifetimes when borrowck gets in the way, particularly when making slices from a source ref stored inside parser (borrowck can confuse parser lifetime with source lifetime).

Rationale

OwnedSlice is a footgun. It's efficient and rustc seems to optimize transforming it into &str at will without issues, however the method for creating OwnedSlice from non 'static slices introduces issues when it comes to explaining to end users of the AST when and why they should or shouldn't use it. Having some self-contained unsafe code in the parser is fine, having that unsafe code spread over virtually all parts of the code (transform and codegen) is not so great. Forcing users to use &'static str in transformer is, for the most part, a good limitation.

This should also allow us to separate the project into separate crates. While work is done on transformer, the parser and AST by themselves can be useful for other projects, e.g. should anyone want to build a JS linter with ratel-core.
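
The borrowed-AST idea above can be sketched as follows. This is a simplified illustration under stated assumptions: `Expression`, `Parser`, and `identifier` are hypothetical names, not ratel's actual types, and the identifier rule is deliberately loose (it would accept a leading digit).

```rust
/// AST nodes borrow string slices directly from the source text.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Expression<'src> {
    Identifier(&'src str),
}

/// The parser holds a reference to the source, but never owns it.
struct Parser<'src> {
    source: &'src str,
    index: usize,
}

impl<'src> Parser<'src> {
    fn new(source: &'src str) -> Self {
        Parser { source, index: 0 }
    }

    /// Reads an identifier-like run of characters at the current position.
    /// The returned node is tied to `'src`, not to the borrow of `self`.
    fn identifier(&mut self) -> Option<Expression<'src>> {
        // Copy the reference out first so the slice keeps the `'src` lifetime.
        let source: &'src str = self.source;
        let rest = &source[self.index..];
        let len = rest
            .find(|c: char| !(c.is_alphanumeric() || c == '_' || c == '$'))
            .unwrap_or(rest.len());
        if len == 0 {
            return None;
        }
        self.index += len;
        Some(Expression::Identifier(&rest[..len]))
    }
}
```

Because the node carries `'src`, it can outlive the parser as long as the source string is still alive. In this small case no unsafe code is needed; whether that holds for the full parser is a separate question.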

For in with initial value

Reproduce:

for (var i = 0 in {}) {}

Escaped unicode identifiers support

let \u0050 = 0;
console.log(P) // 0
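
A minimal sketch of decoding such escapes into the identifier's actual name, covering both the `\uXXXX` and `\u{...}` forms. `decode_identifier` is a hypothetical helper, not ratel's API, and it does not check that the decoded character is legal in an identifier.

```rust
/// Decodes `\uXXXX` and `\u{X...}` escapes in an identifier name.
/// Returns `None` on a malformed escape. Hypothetical helper for illustration.
fn decode_identifier(raw: &str) -> Option<String> {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();

    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // Inside an identifier, a backslash must start a `\u` escape.
        if chars.next()? != 'u' {
            return None;
        }
        let rest = chars.as_str();
        let (hex, consumed) = if rest.starts_with('{') {
            // `\u{...}` form: hex digits up to the closing brace.
            let end = rest.find('}')?;
            (&rest[1..end], end + 1)
        } else {
            // `\uXXXX` form: exactly four hex digits.
            (rest.get(..4)?, 4)
        };
        if hex.is_empty() || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let code = u32::from_str_radix(hex, 16).ok()?;
        out.push(char::from_u32(code)?);
        // Continue scanning after the escape sequence.
        chars = rest[consumed..].chars();
    }
    Some(out)
}
```

With this, the declaration `\u0050` and the reference `P` resolve to the same name.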

Add Clippy and maybe rustfmt

To make sure that we write clean, idiomatic Rust, adding Clippy to the pipeline could be beneficial.

Rustfmt, unless it breaks something (aligning => in long match statements?) could be a nice addition as well.

Failure parsing an ordinary minified JS file

https://cdnjs.cloudflare.com/ajax/libs/vue/2.5.17/vue.runtime.min.js

SyntaxError: Unexpected token at 6:83

> 6 | !function(t,e){"object"==typeof exports&&"undefined"!=typeof module?module.exports=e() // minified long line omitted...

Spread operator support

https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Operators/Spread_operator

new.target support

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/new.target

Hosted documentation

As follow-up task of #25, ratel-core should use rustdoc for documentation.

Generated content could be hosted using GitHub Pages or in scope of #26.

Keywords allowed for object keys and member accessors.

This is valid JavaScript:

let foo = { function() { } };
foo.function();

We need to allow keywords to extend ObjectKey as described here: #21 (comment)

Additionally Expression could use a KeywordMember variant where a keyword can be a property.

Lexicon needs to be changed in order to make keywords into their own enum, separate from tokens:

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum KeywordKind {
    Break,
    Do,
    Case,
    Else,
    Catch,
    Export,
    Class,
    Extends,
    Return,
    While,
    Finally,
    Super,
    With,
    Continue,
    For,
    Switch,
    Yield,
    Debugger,
    Function,
    This,
    Default,
    If,
    Throw,
    Import,
    Try,
    Static,
}

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Token {
    EndOfProgram,
    Semicolon,
    Colon,
    Comma,
    ParenOpen,
    ParenClose,
    BracketOpen,
    BracketClose,
    BraceOpen,
    BraceClose,
    Keyword(KeywordKind), // replaces all above
    Operator(OperatorType),
    Declaration(VariableDeclarationKind),
    Reserved(ReservedKind),
    Identifier(OwnedSlice),
    Literal(Value),
    Template(TemplateKind),
}

Bonus 1: FatArrow should be a first class token, not a variant of OperatorType.
Bonus 2: OperatorType should be renamed to OperatorKind to keep the naming scheme consistent.

Destructuring for assignments, parameters and declarations

We need support for destructuring assignments, e.g.

const [ x, y ] = [ 1, 2 ];

Universal node type

If I read the code correctly, there is no universal node type.

This is the kind of thing that would be useful for writing a converter between two ASTs:

fn ratel_to_binast(source: &ratel::Universal) -> Result<binast::Universal, ?> {
  ...
}

There may be alternatives, I'll think about that.

Default parameters

Parser needs to be able to handle parameters with default values.

what's the status of this project?

ratel is a cool project. It seems that the rewrite branch is still quite active.
Do you have any roadmap or plan to integrate back to master branch?

Support for comments

As follow-up for #99, this issue is about adding optional support for comment nodes.

As @maciejhirsz suggested, comments should be gathered by the lexer as an internal list.
This way, for example, all comments of a function can be pulled before the declaration.

Please note that there are three types of valid comments in JavaScript:

Line comments // foo
Block comments /* bar */
HTML comments <!-- baz --> (:angry:)

Support debugger statement

ECMAScript 2015: https://www.ecma-international.org/ecma-262/6.0/#sec-debugger-statement
ECMAScript 2017: https://www.ecma-international.org/ecma-262/8.0/#prod-DebuggerStatement

Document ratel::grammar::Statement::Transparent

Currently looking at converting ratel -> binast. What's ratel::grammar::Statement::Transparent?

Support `super()` and `super.*` in constructors/class methods.

super() and super.* support, self explanatory.

Try & catch

Missing support for try & catch.

Generator functions

Allow for Multiplication token after Function keyword.

Deploy `ratel-wasm` using CI, MIME type

In order to have the REPL up-to-date with each master deployment, it should be deployed to a static web host during CI.

Additionally, wasm files should be served with application/wasm as MIME type. This ensures that browsers can compile the application while streaming the resource.

Currently, the following error is logged:

wasm streaming compile failed: TypeError: Failed to execute 'compile' on 'WebAssembly':
Incorrect response MIME type. Expected 'application/wasm'.
falling back to ArrayBuffer instantiation
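
For a hand-rolled static file server, the fix amounts to choosing the `Content-Type` header from the file extension. A minimal sketch, assuming a hypothetical `content_type_for` helper; only the `.wasm` entry comes from the error above, the other entries are illustrative.

```rust
use std::path::Path;

/// Picks a `Content-Type` value from a file path's extension.
/// Hypothetical helper for illustration; real hosts usually configure this
/// in the server or CDN settings instead.
fn content_type_for(path: &str) -> &'static str {
    match Path::new(path).extension().and_then(|ext| ext.to_str()) {
        // Needed for streaming compilation of WebAssembly modules.
        Some("wasm") => "application/wasm",
        Some("js") => "text/javascript",
        Some("html") => "text/html; charset=utf-8",
        Some("css") => "text/css",
        _ => "application/octet-stream",
    }
}
```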

Putting the JS AST specification on your radar

I'm currently working on the JavaScript Binary AST TC39 proposal. Part of this proposal is an official AST for JavaScript. While this AST is not final yet, it is stabilizing, so I figured it might be of interest to you: WIP specs.

Extracting a Rust ADT from the WIP specs is pretty easy. If you need, I have code that does it already, and I figure I'll publish it as a separate crate soon.

ASI for `++` and `--` tokens

x 
++ 
y

and

x 
-- 
y

Should be parsed as x; ++y; and x; --y; respectively.
They are currently parsed as x++; y; and x--; y;
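
The ECMAScript grammar forbids a line terminator between an operand and a postfix `++` or `--`, which is why the operator binds to the following operand here. A minimal sketch of that decision, assuming the tokenizer records whether a newline preceded the operator; `UpdateKind` and `classify_update` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum UpdateKind {
    Postfix,
    Prefix,
}

/// Decides how a `++` or `--` token should be parsed.
/// `operand_before`: an expression was just parsed on the left.
/// `newline_before`: a line terminator separates that expression from the operator.
fn classify_update(operand_before: bool, newline_before: bool) -> UpdateKind {
    if operand_before && !newline_before {
        UpdateKind::Postfix
    } else {
        // After a newline the previous statement ends (ASI), and the
        // operator becomes a prefix of whatever follows.
        UpdateKind::Prefix
    }
}
```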

`in operator` code minify

JavaScript Syntax:

in operator

let obj = {"a": 1, "b": 2};
if ( "a" in obj === false ) throw new Error('Ooops ...');

Codegen:

var obj={"a":1,"b":2};if("a"inobj===!1)throw new Error('Ooops ...');

"a"inobj needs a space between in and obj: "a"in obj.

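One way a minifying code generator can avoid the bug above is to insert a space only when two adjacent tokens would otherwise merge. The sketch below is an assumption about how this could be done, not ratel's codegen: `emit_minified` is a hypothetical function working on pre-split token strings.

```rust
/// True for characters that can appear inside identifiers and keywords.
fn is_ident_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_' || c == '$'
}

/// Joins tokens with the minimum whitespace needed to keep them distinct.
/// Hypothetical helper for illustration.
fn emit_minified(tokens: &[&str]) -> String {
    let mut out = String::new();
    for tok in tokens {
        if let (Some(prev), Some(next)) = (out.chars().last(), tok.chars().next()) {
            // `in` followed by `obj` would otherwise become `inobj`.
            let glue_words = is_ident_char(prev) && is_ident_char(next);
            // `+` followed by `+` would otherwise become `++` (same for `-`).
            let glue_signs = (prev == '+' || prev == '-') && prev == next;
            if glue_words || glue_signs {
                out.push(' ');
            }
        }
        out.push_str(tok);
    }
    out
}
```

No space is emitted between `"a"` and `in`, because a closing quote cannot merge with a keyword.
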
Missing tokens

Apparently we are missing following tokens:

~~&&= should be OperatorLogicalAndAssign~~*
~~||= should be OperatorLogicalOrAssign~~*
# ? proposed use for private fields
@ ? proposed use for decorators
:: should be OperatorBind
.. ? E4X specific, not sure if we even need it

Editing the enum will require altering all lookup tables, which is a bit of a chore.

* - those are apparently invalid in ECMAScript, some parsers recognize them internally as tokens but don't parse them.

Parse `NaN` and `Infinity / -Infinity`

Keywords:

NaN / +NaN / -NaN --> std::f64::NAN
Infinity / +Infinity --> std::f64::INFINITY
-Infinity --> std::f64::NEG_INFINITY

These keywords need to be parsed into a Literal, not an Identifier.
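
The mapping above could be sketched as a small lookup. `parse_special_number` is a hypothetical name, and the signed forms are matched as whole strings purely for illustration; a real parser would see the sign as a separate unary operator. Note also that in JavaScript `NaN` and `Infinity` are global properties that can be shadowed by local bindings, so treating them as literals is a design decision.

```rust
/// Maps the special numeric names from the list above to `f64` values.
/// Hypothetical helper for illustration.
fn parse_special_number(src: &str) -> Option<f64> {
    match src {
        "NaN" | "+NaN" | "-NaN" => Some(f64::NAN),
        "Infinity" | "+Infinity" => Some(f64::INFINITY),
        "-Infinity" => Some(f64::NEG_INFINITY),
        _ => None,
    }
}
```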

Get ideas from other projects

There are several other Rust projects (some launched recently) that deal with JavaScript sources. Let's see if we can get some ideas from their design.

https://github.com/nathan/pax by @nathan
https://github.com/swc-project/swc by @kdy1
https://github.com/FreeMasen/RESS by @FreeMasen

On this thread, let's focus on differences or ideas on:

Parser structure
Crate separation/modularization
Project goals: beyond parsers, what should we focus on? Transpiling, minification/optimization, and bundling are what I'm aware of currently.

Library authors: are you interested in joining forces with Ratel? Is there anything in the library design that we should change? (Pardon this friendly ping, sorry if you get annoyed by a notification. Feel free to unsubscribe in any case.)

TypeScript syntax support

Is there a plan for supporting TypeScript syntax? I mean only parsing the syntax, without doing actual type checking.

Support for JSX syntax

Hi, is there a plan to support JSX syntax?

Unit tests for tokenizer

Tokenizer needs to have unit tests to make sure that at least all operators are correctly turned into tokens.

[AST] Unicode Support

Hi, the parser is not ready to parse Unicode yet, right?

The master branch:

The rewrite branch:

JavaScript Unicode Name Example:

这是一个名称 = "世界 ( World )！";

console.log(这是一个名称.length);
console.log(`Hello, ${这是一个名称}`);

Regular expressions

Need to add support for RegExp literals.

In the parser, if an expression begins with a Division operator, call a tokenizer method that reads the body and flags of a regular expression.
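
A minimal sketch of such a tokenizer method, assuming it is handed the source text that starts right after the opening `/`. `read_regex` is a hypothetical name; it tracks escapes and character classes (where `/` does not end the literal) but does not validate the pattern or the flags.

```rust
/// Reads a regular expression literal whose opening `/` was already consumed.
/// Returns `(body, flags, bytes_consumed)`, or `None` if the literal is
/// unterminated, empty, or broken by a line terminator.
/// Hypothetical helper for illustration.
fn read_regex(src: &str) -> Option<(&str, &str, usize)> {
    let mut in_class = false;
    let mut escaped = false;
    let mut body_end = None;

    for (i, c) in src.char_indices() {
        // Line terminators are never allowed inside the literal.
        if c == '\n' || c == '\r' {
            return None;
        }
        if escaped {
            escaped = false;
            continue;
        }
        match c {
            '\\' => escaped = true,
            '[' => in_class = true,
            ']' => in_class = false,
            // An unescaped `/` outside a character class ends the body.
            '/' if !in_class => {
                body_end = Some(i);
                break;
            }
            _ => {}
        }
    }

    // An empty body would be `//`, which is a line comment instead.
    let body_end = body_end.filter(|&end| end > 0)?;
    let after = &src[body_end + 1..];
    let flags_len = after
        .find(|c: char| !c.is_ascii_alphabetic())
        .unwrap_or(after.len());

    Some((&src[..body_end], &after[..flags_len], body_end + 1 + flags_len))
}
```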

Do ... while support

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/do...while
http://www.ecma-international.org/ecma-262/5.1/#sec-12.6.1

One-line `for` loops, `if` statements

`for` loops and `if` statements should allow the body to be a single non-block statement.

Function call expression parse error

Line: https://github.com/ratel-rust/ratel-core/blob/master/ratel/src/parser/expression.rs#L276-L277

Parsing f(1,2) works,
but parsing f(1,) fails.

Ternary expression error

Reproduce:

const a = true ? console.log('foo') : null;

Template literals

Template strings are actually quite complicated, since they can create 4 different grammar elements:

* `foobar`
* `foo${
* }bar${
* }baz`
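
All four elements can be produced by one scanning routine that is called after either a backtick or a closing `}` and reports how the chunk ended. A minimal sketch, with hypothetical names `ChunkEnd` and `scan_template_chunk`; escape sequences are skipped, not decoded.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum ChunkEnd {
    /// The chunk ended with a closing backtick.
    Backtick,
    /// The chunk ended with `${`, so an expression follows.
    Substitution,
}

/// Scans template text that starts right after a backtick or after the `}`
/// closing a substitution. Returns `(raw_text, how_it_ended, bytes_consumed)`,
/// or `None` if the template is unterminated.
/// Hypothetical helper for illustration.
fn scan_template_chunk(src: &str) -> Option<(&str, ChunkEnd, usize)> {
    let bytes = src.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            // Skip the backslash and the start of the escaped character.
            b'\\' => i += 2,
            b'`' => return Some((&src[..i], ChunkEnd::Backtick, i + 1)),
            b'$' if bytes.get(i + 1) == Some(&b'{') => {
                return Some((&src[..i], ChunkEnd::Substitution, i + 2));
            }
            _ => i += 1,
        }
    }
    None
}
```

Which of the four elements a chunk represents then follows from how the scan was entered (backtick or `}`) combined with the returned `ChunkEnd`.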

Class Expressions

Currently we only allow for class statements, but class expressions are also possible:

const foo = class { }

Just like with functions, a class expression may have a name, while a class statement must have one. Other than that, parsing and transformation for class expressions and statements need to be DRY.

Getters and setters support

https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Functions/get
https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Functions/set

Good way to know parent of Node in a Visitor?

Hey guys, I'm using this crate to build a control flow graph for JS code and I'm wondering how you would implement node parents. I'm reasonably new to Rust and I'm not sure where to start. I'm basing my work on the rewrite branch.
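
One common approach, sketched here on a toy AST and not on ratel's actual node types or visitor trait, is to keep a stack of ancestors while walking the tree. The parent of the current node is then the last entry on the stack, and no back-pointers need to be stored in the nodes. `Node` and `walk` are hypothetical names.

```rust
/// Toy AST used only for this illustration.
enum Node {
    Ident(String),
    Call { callee: Box<Node>, args: Vec<Node> },
}

/// Walks the tree depth-first and calls `visit` with each node together with
/// the chain of its ancestors (root first, direct parent last).
fn walk<'ast>(
    node: &'ast Node,
    ancestors: &mut Vec<&'ast Node>,
    visit: &mut dyn FnMut(&Node, &[&Node]),
) {
    visit(node, ancestors.as_slice());

    // The current node becomes an ancestor of everything below it.
    ancestors.push(node);
    if let Node::Call { callee, args } = node {
        walk(callee, ancestors, visit);
        for arg in args {
            walk(arg, ancestors, visit);
        }
    }
    ancestors.pop();
}
```

Inside the callback, `ancestors.last()` gives the direct parent and `ancestors.len()` gives the depth, which is often enough for building a control flow graph without storing parent links in the AST.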