Giter Club home page Giter Club logo

Comments (3)

kamadorueda avatar kamadorueda commented on May 28, 2024 2

rnix-parser is faster (at least for now), but much less correct, so that's the trade-off

rnix-parser is based on a less generic algorithm: recursive descent, while Nixel is based on a modified Earley. This is an important difference with big implications becase Nix requires GLR, which is in the same category as Earley (they can parse ambiguous languages like Nix), which are 1 level above the category of deterministic parsers like recursive descent, LALR, LR, PEG, etc. If you use an algorithm unable to handle ambiguous grammars you'll always encounter code that is incorrectly parsed, as you can find in rnix-parser issues. On the other hand, NixEL performs the same lexical analysis as the original Nix, and uses similar parse tables, and that allows to handle correctly (as in exactly equal to Nix) logic like unescaping strings, dedenting multiline strings, the funny stuff around the or variable name, path interpolations and disambiguation and precedence, which rnix-parser does not do as of today, not to mention the complex logic behind the lexer states, which is very hard (if not very tedious) to get correctly on a hand-written parser like rnix-parser. On NixEl everything is declarative, much easier to maintain

So in short, correctness, that's the comparison, and probably types, since rnix-parser only has nodes and tokens, and here we have a richly typed data-structure, much easier to work with

from nixel.

oberblastmeister avatar oberblastmeister commented on May 28, 2024

One important difference is that rnix-parser is error resistant, you can give the parser gibberish and it will always return a tree. This is very important for things like IDEs and formatters. rnix-parser will soon have a typed api nix-community/rnix-parser#105.

If you use an algorithm unable to handle ambiguous grammars you'll always encounter code that is incorrectly parsed, as you can find in rnix-parser issues.

Those issues are just normal issues with the parser, not because the algorithm is unable to handle an ambiguous grammar. Of course using a parser generator makes the parser easier to maintain, but in return we get more control. I would be interested to see an example that cannot be parsed due to being ambiguous. As for speed, recursive descent is $O(n)$ while GLR and Earley are both $O(n^3)$. GLR has the advantage of being $O(n)$ for most inputs.

from nixel.

kamadorueda avatar kamadorueda commented on May 28, 2024

Since version 5.0 NixEL is the closest parser to the original Nix implementation, essentially because now NixEL is a copy-paste of the original lexer and parser definitions, and uses the same lexer and parser generator (Flex and GNU Bison).

Since version 5.0 NixEL also uses GLR instead of Earley, it parses the entirety of Nixpkgs in under 25 seconds (single-threaded) while keeping memory safety, a typed API, offering single-line comments, multi-line comments, whitespace, string unescaping, multi-line string indentation handling, starting and end positions of everything, and excellent documentation.

All of that is without the bugs inherent to using a hand-crafted backend as the one rnix-parser uses

Those issues are just normal issues with the parser, not because the algorithm is unable to handle an ambiguous grammar. Of course using a parser generator makes the parser easier to maintain, but in return we get more control.

At NixEL I don't consider parsing issues "normal" because the whole point of a parser for Nix is parsing Nix correctly, ideally exactly equal to the reference implementation. I don't see the point in having "more control" if it cannot be used for that purpose.

That being said, the feature of parsing partially valid Nix files is perfectly possible with the current architecture of Nixel, and first-class citizen of GNU Bison: #3. So I foresee a future where NixEL supports that as well.

I think some interesting future work would be to upstream back the knowledge I got from working on NixEL to Nix, since in the original parser implementation of Nix they mention some memory leaks, which I was able to find, confirm and fix while working on NixEL. This has the potential of being very beneficial, especially when evaluating big projects like Nixpkgs, which requires a considerable amount of memory right now, so let me know if that's something the Nix Core Developers would be interested in merging.

Lastly, I'm closing this issue. I'm not interested in creating a discussion between library authors here, but I understand that putting excellent software out there can create some tension between the alternatives, as happened when I published Alejandra. Let's not take it personally. We all are members of the same community, and the community benefits from the software we publish. That's how FOSS works.

from nixel.

Related Issues (3)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.