predictive's People

Contributors

dtpreda

predictive's Issues

Parser annotation support

Currently, the parser does not support annotations. Support should be added so that richer parsing trees can be produced. Changes to the Parser and NonTerminal classes should suffice.

Skip Expression Support

The program does not yet support skip expressions. The file passed to the program should be opened by a dedicated function that takes into account the skip expressions declared in the new grammar.

Symbol Existence Verification

Every terminal must be properly declared (with the exception of EOF), and every non-terminal that appears on a right-hand side must also appear, at some point, on a left-hand side.
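A minimal sketch of this check, under assumptions that are not taken from the project: terminals are written as `<NAME>` inside rules, the grammar is a plain dictionary, and the function name `undeclared_symbols` is hypothetical.

```python
def undeclared_symbols(terminals, rules):
    """Return (undeclared_terminals, undeclared_non_terminals).

    terminals: set of declared terminal names.
    rules: dict mapping each left-hand-side non-terminal to a list of
           right-hand sides, each a list of symbol names.
    Assumed convention (hypothetical): terminals appear as <NAME> in rules.
    """
    declared_non_terminals = set(rules)
    missing_terminals, missing_non_terminals = set(), set()
    for alternatives in rules.values():
        for rhs in alternatives:
            for symbol in rhs:
                if symbol.startswith("<") and symbol.endswith(">"):
                    name = symbol[1:-1]
                    # EOF is the one terminal that needs no declaration.
                    if name != "EOF" and name not in terminals:
                        missing_terminals.add(name)
                elif symbol not in declared_non_terminals:
                    missing_non_terminals.add(symbol)
    return missing_terminals, missing_non_terminals


example_rules = {
    "Start": [["Expr", "<EOF>"]],
    "Expr": [["<NUM>", "Tail"]],  # NUM and Tail are never declared
}
missing = undeclared_symbols({"PLUS"}, example_rules)
print(missing)
```

Returning both sets, instead of raising on the first problem, lets a semantic check report every missing declaration in one pass.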

Token Ambiguity

The MATCH_ALL and REGEX_EXPR tokens can easily be mistaken for the ID token. This ambiguity should be resolved in order to ease the parser development.

End of file for new grammars

A rule and a symbol should be created to detect and process end of file: Start' -> Start <EOF>.

This should be done before computing the grammar sets.
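As a rough illustration, the augmentation could look like the sketch below. The dictionary representation of the grammar and the name `augment_with_eof` are assumptions made for this example only.

```python
def augment_with_eof(rules, start="Start"):
    """Return (augmented_rules, new_start) with Start' -> Start <EOF> added.

    rules: dict mapping a non-terminal to a list of right-hand sides
           (hypothetical representation, not the project's own).
    """
    new_start = start + "'"
    if new_start in rules:
        raise ValueError(f"{new_start} is already defined in the grammar")
    # Put the new rule first so the augmented start symbol leads the grammar.
    augmented = {new_start: [[start, "<EOF>"]]}
    augmented.update(rules)
    return augmented, new_start


augmented, new_start = augment_with_eof({"Start": [["<ID>"]]})
print(new_start, augmented[new_start])
```

Because the original dictionary is left untouched, the set computations can run on the augmented copy while other phases keep using the grammar as written.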

Nullable, First and Follow

To generate the parser, we need a parsing table. For that, we need to determine the Nullable, First and Follow sets for each of the symbols.

The Nullable set requires support for empty rules. While the closure * creates empty rules, it only does so in a specific context. This can be solved by making the Expansion symbol in the prediCtive grammar nullable.
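The three sets are usually computed together with a fixed-point iteration. The sketch below shows that textbook approach, not the project's code; it assumes a grammar stored as a dictionary, an empty list standing for an empty rule, and an explicit `<EOF>` in the augmented start rule.

```python
def compute_sets(rules):
    """Compute Nullable, First and Follow by iterating to a fixed point.

    rules: dict mapping a non-terminal to a list of right-hand sides.
    A symbol is a non-terminal iff it is a key of rules; [] is an empty rule.
    """
    nullable = set()
    first = {nt: set() for nt in rules}
    follow = {nt: set() for nt in rules}

    def first_of(symbol):
        # First of a terminal is the terminal itself.
        return first[symbol] if symbol in rules else {symbol}

    changed = True
    while changed:
        changed = False
        for lhs, alternatives in rules.items():
            for rhs in alternatives:
                # Nullable: every symbol of the right-hand side is nullable.
                if lhs not in nullable and all(s in nullable for s in rhs):
                    nullable.add(lhs)
                    changed = True
                # First: scan left to right until a non-nullable symbol.
                for symbol in rhs:
                    before = len(first[lhs])
                    first[lhs] |= first_of(symbol)
                    changed |= len(first[lhs]) != before
                    if symbol not in nullable:
                        break
                # Follow: scan right to left, tracking what may come next.
                trailer = set(follow[lhs])
                for symbol in reversed(rhs):
                    if symbol in rules:
                        before = len(follow[symbol])
                        follow[symbol] |= trailer
                        changed |= len(follow[symbol]) != before
                    if symbol in nullable:
                        trailer = trailer | first_of(symbol)
                    else:
                        trailer = set(first_of(symbol))
    return nullable, first, follow


grammar = {
    "Start'": [["Start", "<EOF>"]],
    "Start": [["<ID>", "Tail"]],
    "Tail": [["<COMMA>", "<ID>", "Tail"], []],
}
nullable, first, follow = compute_sets(grammar)
print(nullable, first["Tail"], follow["Tail"])
```

In this example only `Tail` is nullable, because it is the only non-terminal with an empty rule.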

Regex Validation

To prevent errors in the parser generation phase, each terminal's regex should be validated at the semantic check level.
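One possible shape for this check is sketched below. It assumes the terminals' patterns follow the dialect of Python's `re` module, which the project does not state, and it adds a second, optional check (rejecting patterns that match the empty string) purely as an assumption about what a lexer would need.

```python
import re


def invalid_regexes(terminals):
    """Return a dict of terminal name -> error message for bad patterns.

    terminals: dict mapping a terminal name to its regex source
               (hypothetical representation).
    """
    errors = {}
    for name, pattern in terminals.items():
        try:
            compiled = re.compile(pattern)
        except re.error as exc:
            errors[name] = str(exc)
            continue
        # Assumption: a terminal that matches "" could stall a lexer.
        if compiled.match("") is not None:
            errors[name] = "pattern matches the empty string"
    return errors


problems = invalid_regexes({"NUM": "[0-9]+", "BAD": "(", "EMPTY": "a*"})
print(sorted(problems))
```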

Node and node-related instances refactor

The node class should always hold pointers as references to other nodes, whether children or parents, and should therefore only work with Nodes in that format. A refactor is required to enforce this requirement on the class.

Parsing table

After determining the Nullable, First and Follow sets, the parsing table should be built to generate the parser. As part of this, it must be verified that the grammar is, indeed, LL(1).
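The sketch below shows the usual table construction, with a conflict treated as proof that the grammar is not LL(1). The function name, the dictionary representation, and the hand-written example sets are all assumptions for illustration.

```python
def build_ll1_table(rules, nullable, first, follow):
    """Return a dict (non_terminal, terminal) -> right-hand side.

    Raises ValueError when two rules compete for the same table cell,
    which means the grammar is not LL(1).
    """
    table = {}
    for lhs, alternatives in rules.items():
        for rhs in alternatives:
            lookahead = set()
            rhs_nullable = True
            for symbol in rhs:
                lookahead |= first[symbol] if symbol in rules else {symbol}
                if symbol not in nullable:
                    rhs_nullable = False
                    break
            # A nullable right-hand side is also chosen on Follow(lhs).
            if rhs_nullable:
                lookahead |= follow[lhs]
            for terminal in lookahead:
                key = (lhs, terminal)
                if key in table:
                    raise ValueError(f"not LL(1): conflict at {key}")
                table[key] = rhs
    return table


grammar = {
    "Start'": [["Start", "<EOF>"]],
    "Start": [["<ID>", "Tail"]],
    "Tail": [["<COMMA>", "<ID>", "Tail"], []],
}
# Sets written out by hand for this small example grammar.
nullable = {"Tail"}
first = {"Start'": {"<ID>"}, "Start": {"<ID>"}, "Tail": {"<COMMA>"}}
follow = {"Start'": set(), "Start": {"<EOF>"}, "Tail": {"<EOF>"}}

table = build_ll1_table(grammar, nullable, first, follow)
for key in sorted(table):
    print(key, "->", table[key])
```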

Wrap-up

All that remains now is to create a main function responsible for handling all phases of the program.

It should take two arguments: the grammar file and the file to be parsed.

It should output the parsing tree obtained.

Symbol Table Creation

New visitors should be created for the purpose of establishing a symbol table, required for semantic verification of the program. The actual symbol table class should also be created.

Project Vision

A project vision should be written to make the end goals of this project clear. It should be added to the root README, with a second section for the main features.

Parser Grammar

The first step is to properly document the grammar of the parser. A good documentation should provide:

  • The CFG for the parser grammar;
  • The First and Follow sets for each terminal and non-terminal;
  • The Nullable value for each terminal and non-terminal;
  • The LL(1) parsing table.

Nested Closures

The `ClosureSimplifierVisitor` does not currently support nested closures. The fix should be straightforward.

AST printing

For debugging purposes (and other unforeseen ones), the AST could be printed out as text to standard output, for easier visualization.

Every node should take up a line, with each depth level increase getting an extra tab at the beginning of its line. Annotations should be included, next to the node's name.
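The layout described above could be produced along these lines. The `Node` class here is a hypothetical stand-in for the project's own node type, and the square brackets around annotations are an arbitrary choice.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node shape: a name, an optional annotation, children.
    name: str
    annotation: str = ""
    children: list = field(default_factory=list)


def format_ast(node, depth=0):
    """Return one line per node, indented with one tab per depth level."""
    label = node.name
    if node.annotation:
        label = f"{node.name} [{node.annotation}]"
    lines = ["\t" * depth + label]
    for child in node.children:
        lines.extend(format_ast(child, depth + 1))
    return lines


tree = Node("Start", children=[Node("Expr", "sum", [Node("NUM"), Node("NUM")])])
print("\n".join(format_ast(tree)))
```

Returning the lines instead of printing them directly keeps the formatter easy to test and lets the caller decide where the output goes.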

Bundle visiting for AST conversion

The visitors are now capable of converting the parsing tree into an AST. Their behaviour should be bundled together in a class which provides the conversion, while hiding away specific details.

Token recognition

With the grammar now properly defined, the first step should be to allow the program to properly recognize the terminal symbols. This recognition requires the establishment of priority levels for the terminals list for disambiguation (e.g., "TOKENS" could be recognized as the TOKENS terminal symbol or as an ID terminal symbol).
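One common way to get such priority levels is to prefer the longest match and break ties by declaration order. The sketch below illustrates that idea; the terminal list, the whitespace skip pattern and the function name are assumptions, not the project's definitions.

```python
import re

# Earlier entries have higher priority when two matches have equal length.
TERMINALS = [
    ("TOKENS", r"TOKENS"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
]


def tokenize(text, terminals=TERMINALS, skip=r"\s+"):
    """Split text into (terminal_name, lexeme) pairs."""
    compiled = [(name, re.compile(pattern)) for name, pattern in terminals]
    skip_re = re.compile(skip)
    tokens, pos = [], 0
    while pos < len(text):
        skipped = skip_re.match(text, pos)
        if skipped and skipped.end() > pos:
            pos = skipped.end()
            continue
        best = None
        for name, regex in compiled:
            match = regex.match(text, pos)
            # Strictly longer matches win; ties keep the earlier terminal.
            if match and match.end() > pos and (
                best is None or match.end() > best[1].end()
            ):
                best = (name, match)
        if best is None:
            raise ValueError(f"unrecognized input at offset {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens


result = tokenize("TOKENS foo TOKENSX")
print(result)
```

With this rule, the input `TOKENS` is recognized as the TOKENS terminal, while a longer identifier that merely starts with those letters is still recognized as an ID.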

Recursive Descent Parsing

Now that tokens can be properly recognized, the next step is to parse the grammar through recursive descent. The basic algorithm for LL(1) must be implemented. This will allow the correctness of the input to be verified.

Furthermore, the algorithm should already be responsible for generating a basic parsing tree with the overall structure of the file. Node annotating and flattening of certain branches should be left for later.
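A compact sketch of a predictive parser that also builds a basic parsing tree is shown below. It is driven by an LL(1) table instead of one hand-written function per non-terminal, and the table, the tuple-based tree and the `ParseError` class are all assumptions for this example.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


def parse(table, start, tokens):
    """Parse a list of terminal names and return a (symbol, children) tree.

    table: dict (non_terminal, lookahead) -> right-hand side (list of symbols).
    tokens: terminal names in input order, ending with "<EOF>".
    """
    non_terminals = {non_terminal for non_terminal, _ in table}
    pos = 0

    def expand(symbol):
        nonlocal pos
        lookahead = tokens[pos] if pos < len(tokens) else None
        if symbol not in non_terminals:
            # Terminal: it must match the current token exactly.
            if symbol != lookahead:
                raise ParseError(f"expected {symbol}, found {lookahead}")
            pos += 1
            return (symbol, [])
        rhs = table.get((symbol, lookahead))
        if rhs is None:
            raise ParseError(f"unexpected {lookahead} while parsing {symbol}")
        return (symbol, [expand(child) for child in rhs])

    tree = expand(start)
    if pos != len(tokens):
        raise ParseError("input continues after the end of the grammar")
    return tree


# Hand-written table for: Start' -> Start <EOF>, Start -> <ID> Tail,
# Tail -> <COMMA> <ID> Tail | (empty).
table = {
    ("Start'", "<ID>"): ["Start", "<EOF>"],
    ("Start", "<ID>"): ["<ID>", "Tail"],
    ("Tail", "<COMMA>"): ["<COMMA>", "<ID>", "Tail"],
    ("Tail", "<EOF>"): [],
}
tree = parse(table, "Start'", ["<ID>", "<COMMA>", "<ID>", "<EOF>"])
print(tree)
```

The tree produced here is deliberately raw: it mirrors the rules one to one, leaving annotation and flattening to later passes.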

Parsing Tree Simplification

In order to return to the user the parsing tree they asked for, there should be a visitor that correctly removes all the Intermediate_NonTerminal_X nodes, while keeping the annotations carried by those nodes.
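The removal can be pictured as splicing each intermediate node's children into its parent. In the sketch below, the `Node` class is hypothetical, and the way annotations are kept (a removed node passes its annotation to any child that has none) is only one possible reading of the requirement.

```python
from dataclasses import dataclass, field

PREFIX = "Intermediate_NonTerminal_"


@dataclass
class Node:
    # Hypothetical node shape: a name, an optional annotation, children.
    name: str
    annotation: str = ""
    children: list = field(default_factory=list)


def flatten(node):
    """Remove intermediate nodes below `node`, splicing in their children."""
    new_children = []
    for child in node.children:
        flatten(child)  # simplify the subtree first (post-order)
        if child.name.startswith(PREFIX):
            for grandchild in child.children:
                # Assumption: the annotation moves down to unannotated children.
                if not grandchild.annotation:
                    grandchild.annotation = child.annotation
                new_children.append(grandchild)
        else:
            new_children.append(child)
    node.children = new_children
    return node


tree = Node("Expr", children=[
    Node("NUM"),
    Node("Intermediate_NonTerminal_0", "rest", [Node("PLUS"), Node("NUM", "rhs")]),
])
flatten(tree)
print([(child.name, child.annotation) for child in tree.children])
```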

Keywords

Keywords must be defined. The list should include:

  • A mandatory Start non-terminal, which indicates the grammar's start point. Consequently, no START token should be allowed;
  • The representation of symbol names on the AST should have an associated keyword;
  • Consumed tokens should also have an associated keyword;
  • All prediCtive keywords, such as TOKENS, SKIP or RULES, should likewise not be allowed.

More entries may be added to this list.

Visitor(s)

With parsing now finished, the parsing tree needs to be converted into a syntax tree. For this, a generic visitor should be created to allow for contextual approaches to each node type.

Context-specific visitors should also be created, in order to convert the parsing tree into a syntax tree.
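A generic visitor of this kind is often written with name-based dispatch, as in the sketch below. The `Node` and `Visitor` classes and the `visit_<name>` naming scheme are assumptions for illustration, not the project's design.

```python
class Node:
    # Hypothetical node shape: a name and a list of children.
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


class Visitor:
    """Generic visitor: calls visit_<name> when defined, else walks children."""

    def visit(self, node):
        handler = getattr(self, "visit_" + node.name, self.generic_visit)
        return handler(node)

    def generic_visit(self, node):
        for child in node.children:
            self.visit(child)


class IdCounter(Visitor):
    """Context-specific visitor that only cares about ID nodes."""

    def __init__(self):
        self.count = 0

    def visit_ID(self, node):
        self.count += 1


tree = Node("Start", [Node("ID"), Node("Tail", [Node("ID")])])
counter = IdCounter()
counter.visit(tree)
print(counter.count)  # prints 2
```

Each context-specific visitor only overrides the node types it cares about, and the generic traversal handles everything else.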
