dtpreda / predictive
A generic purpose C++ parser generator
License: MIT License
Currently, the parser does not support annotations. It should be updated to support them, allowing for better parsing trees. Changes to the `Parser` and `NonTerminal` classes should suffice.
Skip expressions are not yet supported by the program. The file passed to the program should be opened by a dedicated function that takes into account the skip expressions passed down to the new grammar.
All terminals should be properly declared (with the exception of `EOF`), and every non-terminal used on the right-hand side should be defined, at some point, on the left-hand side.
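The check described above could be sketched as follows. This is a minimal illustration, not the project's code: the `Rule` struct, the `undeclaredSymbols` function and the convention of writing terminals as `<NAME>` (borrowed from the `<EOF>` notation) are all assumptions.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified grammar rule (not the project's actual class).
struct Rule {
    std::string lhs;               // non-terminal being defined
    std::vector<std::string> rhs;  // symbols; terminals are written as "<NAME>"
};

// Assumption: a symbol wrapped in angle brackets is a terminal.
bool isTerminal(const std::string& s) {
    return s.size() >= 2 && s.front() == '<' && s.back() == '>';
}

// Returns, in sorted order, every symbol that is used on a right-hand side
// but was never declared (terminals) or defined on a left-hand side
// (non-terminals). "<EOF>" is exempt from declaration.
std::vector<std::string> undeclaredSymbols(const std::set<std::string>& terminals,
                                           const std::vector<Rule>& rules) {
    std::set<std::string> defined;
    for (const Rule& r : rules) defined.insert(r.lhs);

    std::set<std::string> missing;
    for (const Rule& r : rules) {
        for (const std::string& sym : r.rhs) {
            if (isTerminal(sym)) {
                if (sym != "<EOF>" && terminals.count(sym) == 0) missing.insert(sym);
            } else if (defined.count(sym) == 0) {
                missing.insert(sym);
            }
        }
    }
    return std::vector<std::string>(missing.begin(), missing.end());
}
```

An empty result means the grammar passes this particular check; a non-empty one lists the offending symbols for an error message.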
The `MATCH_ALL` and `REGEX_EXPR` tokens can easily be mistaken for the `ID` token. This ambiguity should be resolved in order to ease parser development.
A rule and symbol should be created to detect and process the end of file: `Start' -> Start <EOF>`. This should be done before computing the grammar sets.
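A minimal sketch of this augmentation step is shown below. The `Grammar` and `Rule` structs and the `augmentWithEof` function are hypothetical stand-ins for whatever the project uses internally.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified grammar representation.
struct Rule {
    std::string lhs;
    std::vector<std::string> rhs;
};

struct Grammar {
    std::string start;        // name of the start symbol
    std::vector<Rule> rules;
};

// Adds the rule  Start' -> Start <EOF>  and makes Start' the new start symbol.
// Intended to run before the Nullable, First and Follow sets are computed.
void augmentWithEof(Grammar& g) {
    const std::string newStart = g.start + "'";
    g.rules.insert(g.rules.begin(), Rule{newStart, {g.start, "<EOF>"}});
    g.start = newStart;
}
```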
To generate the parser, we need a parsing table. For that, we need to determine the Nullable, First and Follow sets for each of the symbols.
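One common way to obtain these sets is a fixed-point iteration over the rules, repeated until nothing changes. The sketch below assumes a simplified `Rule` struct, terminals written as `<NAME>`, and an empty right-hand side standing for an empty rule; none of these names come from the project.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

struct Rule {
    std::string lhs;
    std::vector<std::string> rhs;  // an empty vector is an empty rule
};

using SymbolSet = std::set<std::string>;

struct GrammarSets {
    SymbolSet nullable;
    std::map<std::string, SymbolSet> first;
    std::map<std::string, SymbolSet> follow;
};

// Assumption: a symbol wrapped in angle brackets is a terminal.
bool isTerminal(const std::string& s) {
    return s.size() >= 2 && s.front() == '<' && s.back() == '>';
}

// Copies `from` into `to`; returns true if `to` grew.
bool merge(SymbolSet& to, const SymbolSet& from) {
    if (&to == &from) return false;  // merging a set into itself adds nothing
    const std::size_t before = to.size();
    to.insert(from.begin(), from.end());
    return to.size() != before;
}

GrammarSets computeSets(const std::vector<Rule>& rules) {
    GrammarSets s;
    // First of a terminal is the terminal itself.
    for (const Rule& r : rules)
        for (const std::string& sym : r.rhs)
            if (isTerminal(sym)) s.first[sym].insert(sym);

    bool changed = true;
    while (changed) {
        changed = false;
        for (const Rule& r : rules) {
            // First(lhs) takes First of each symbol up to the first non-nullable one.
            bool allNullable = true;
            for (const std::string& sym : r.rhs) {
                changed |= merge(s.first[r.lhs], s.first[sym]);
                if (s.nullable.count(sym) == 0) { allNullable = false; break; }
            }
            if (allNullable) changed |= s.nullable.insert(r.lhs).second;

            // Follow of each non-terminal on the right-hand side.
            for (std::size_t i = 0; i < r.rhs.size(); ++i) {
                if (isTerminal(r.rhs[i])) continue;
                bool restNullable = true;
                for (std::size_t j = i + 1; j < r.rhs.size(); ++j) {
                    changed |= merge(s.follow[r.rhs[i]], s.first[r.rhs[j]]);
                    if (s.nullable.count(r.rhs[j]) == 0) { restNullable = false; break; }
                }
                if (restNullable) changed |= merge(s.follow[r.rhs[i]], s.follow[r.lhs]);
            }
        }
    }
    return s;
}
```

The loop terminates because every set can only grow and the number of symbols is finite.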
The Nullable set requires support for empty rules. While the closure `*` creates empty rules, it only does so in a specific context. This can be solved by making the `Expansion` symbol of the prediCtive grammar nullable.
To prevent errors on the parser generation phase, each terminal's regex should be validated at a semantic check level.
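If terminals are compiled with `std::regex` (an assumption; the project may use a different engine or syntax), the validation can be as small as attempting the construction and catching `std::regex_error`. The function name `isValidRegex` is hypothetical.

```cpp
#include <regex>
#include <string>

// Returns true if `pattern` compiles as a std::regex (ECMAScript syntax by
// default). Construction throws std::regex_error for malformed patterns.
bool isValidRegex(const std::string& pattern) {
    try {
        std::regex compiled(pattern);
        (void)compiled;  // only the side effect of compiling matters here
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

For example, `isValidRegex("[a-z]+")` is expected to succeed, while a pattern with an unclosed group such as `"(abc"` is rejected.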
The node class should always hold pointers as references to other nodes, be they children or parents. Therefore, it should only work with Nodes in that format. A refactor is required to enforce this requirement in the class.
After determining the Nullable, First and Follow sets, the parsing table should be built to generate the parser. For this, it must be verified that the language is, indeed, LL(1).
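As a sketch of this step, the table can map each pair (non-terminal, lookahead terminal) to a rule index, and the grammar can be reported as not LL(1) as soon as two rules compete for the same cell. The types and the `buildTable` function below are assumptions made for illustration, and the sets are taken as already computed.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Rule {
    std::string lhs;
    std::vector<std::string> rhs;  // an empty vector is an empty rule
};

using SymbolSet = std::set<std::string>;

// Precomputed sets; First is expected to contain entries for terminals too.
struct GrammarSets {
    SymbolSet nullable;
    std::map<std::string, SymbolSet> first;
    std::map<std::string, SymbolSet> follow;
};

// (non-terminal, lookahead terminal) -> index of the rule to expand.
using ParsingTable = std::map<std::pair<std::string, std::string>, std::size_t>;

// Fills `table` and returns true if the grammar is LL(1); returns false as
// soon as two different rules land in the same cell.
bool buildTable(const std::vector<Rule>& rules, const GrammarSets& sets,
                ParsingTable& table) {
    table.clear();
    for (std::size_t i = 0; i < rules.size(); ++i) {
        const Rule& rule = rules[i];

        // Lookaheads: First of the right-hand side, plus Follow(lhs) when the
        // whole right-hand side can derive the empty string.
        SymbolSet lookaheads;
        bool allNullable = true;
        for (const std::string& sym : rule.rhs) {
            const auto first = sets.first.find(sym);
            if (first != sets.first.end())
                lookaheads.insert(first->second.begin(), first->second.end());
            if (sets.nullable.count(sym) == 0) { allNullable = false; break; }
        }
        if (allNullable) {
            const auto follow = sets.follow.find(rule.lhs);
            if (follow != sets.follow.end())
                lookaheads.insert(follow->second.begin(), follow->second.end());
        }

        for (const std::string& terminal : lookaheads) {
            const bool isNew =
                table.emplace(std::make_pair(rule.lhs, terminal), i).second;
            if (!isNew) return false;  // conflict: not LL(1)
        }
    }
    return true;
}
```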
All that remains now is to create a main function responsible for handling all phases of the program.
It should take two arguments: the grammar file and the file to be parsed.
It should output the parsing tree obtained.
New visitors should be created for the purpose of establishing a symbol table, required for semantic verification of the program. The actual symbol table class should also be created.
A project vision should be defined in order to clearly state the end goals of this project. It should be added to the root README, with a second section for the main features.
The first step is to properly document the grammar of the parser. Good documentation should provide:
The `ClosureSimplifierVisitor` does not currently support nested closures. Fixing this should be straightforward.
For debugging purposes (and other unforeseen ones), the AST could be printed out as text to standard output, for easier visualization.
Every node should take up a line, with each depth level increase getting an extra tab at the beginning of its line. Annotations should be included, next to the node's name.
The visitors are now capable of converting the parsing tree into an AST. Their behaviour should be bundled together in a class which provides the conversion, while hiding away specific details.
With the grammar now properly defined, the first step should be to allow the program to properly recognize the terminal symbols. This recognition requires the establishment of priority levels for the terminals list for disambiguation (e.g., "TOKENS" could be recognized as the `TOKENS` terminal symbol or as an `ID` terminal symbol).
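One way to realise such priorities, sketched here under assumptions, is to keep the terminals in an ordered list, pick the longest match at the current position, and let the earliest entry win ties. The longest-match policy, the use of `std::regex`, and the names `TerminalDef`, `Match` and `matchToken` are all illustrative choices rather than the project's design.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical terminal definition; position in the list is the priority.
struct TerminalDef {
    std::string name;
    std::regex pattern;
};

struct Match {
    std::string name;    // empty when nothing matched
    std::size_t length;  // number of characters consumed
};

// Finds the terminal matching the longest prefix of `input` starting at `pos`.
// On equal length the terminal listed first wins, so keywords such as TOKENS
// should be listed before ID.
Match matchToken(const std::vector<TerminalDef>& terminals,
                 const std::string& input, std::size_t pos) {
    Match best{"", 0};
    const auto begin = input.begin() + static_cast<std::string::difference_type>(pos);
    for (const TerminalDef& t : terminals) {
        std::smatch m;
        // match_continuous anchors the match at `begin`.
        if (std::regex_search(begin, input.end(), m, t.pattern,
                              std::regex_constants::match_continuous)) {
            const std::size_t len = static_cast<std::size_t>(m.length(0));
            if (len > best.length) best = Match{t.name, len};
        }
    }
    return best;
}
```

With `TOKENS` listed before `ID`, the input "TOKENS" resolves to the `TOKENS` terminal, while a longer identifier such as "TOKENSX" still resolves to `ID`.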
Now with the ability to properly parse Tokens, the next step is to parse the grammar through recursive descent. The basic algorithm for LL(1) must be implemented. This will allow for input correctness verification.
Furthermore, the algorithm should already be responsible for generating a basic parsing tree with the overall structure of the file. Node annotating and flattening of certain branches should be left for later.
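To illustrate the idea, here is a recursive descent parser for a toy LL(1) grammar, not the prediCtive grammar itself. The grammar (`List -> <ID> Tail`, `Tail -> <COMMA> <ID> Tail | empty`), the `Node` struct and the `Parser` class are assumptions made for the example; tokens are given as a list of terminal names that has already been produced by a lexer.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

class Parser {
public:
    explicit Parser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {
        tokens_.push_back("EOF");  // sentinel so that peek() is always valid
    }

    // Start' -> List <EOF>
    std::unique_ptr<Node> parse() {
        std::unique_ptr<Node> root = makeNode("Start'");
        root->children.push_back(parseList());
        match("EOF", *root);
        return root;
    }

private:
    static std::unique_ptr<Node> makeNode(const std::string& name) {
        std::unique_ptr<Node> node(new Node());
        node->name = name;
        return node;
    }

    const std::string& peek() const { return tokens_[pos_]; }

    // Consumes the expected terminal and records it as a leaf, or throws.
    void match(const std::string& terminal, Node& parent) {
        if (peek() != terminal)
            throw std::runtime_error("expected " + terminal + ", found " + peek());
        parent.children.push_back(makeNode(terminal));
        if (pos_ + 1 < tokens_.size()) ++pos_;
    }

    // List -> <ID> Tail
    std::unique_ptr<Node> parseList() {
        std::unique_ptr<Node> node = makeNode("List");
        match("ID", *node);
        node->children.push_back(parseTail());
        return node;
    }

    // Tail -> <COMMA> <ID> Tail | (empty)
    std::unique_ptr<Node> parseTail() {
        std::unique_ptr<Node> node = makeNode("Tail");
        if (peek() == "COMMA") {  // one token of lookahead selects the rule
            match("COMMA", *node);
            match("ID", *node);
            node->children.push_back(parseTail());
        }
        return node;
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};
```

Each non-terminal becomes one function, and the single lookahead token decides which rule to expand. The resulting tree keeps every intermediate node, leaving annotation and flattening for a later pass.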
In order to return to the user the parsing tree they asked for, there should be a visitor that correctly removes all the `Intermediate_NonTerminal_X` nodes, while keeping the annotations taken by those nodes.
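The flattening could look roughly like the sketch below, written as a plain recursive function rather than a full visitor class. The `Node` struct, the helper names, and in particular the choice to copy a removed node's annotations onto each of its promoted children are assumptions; the project may preserve annotations differently.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type.
struct Node {
    std::string name;
    std::vector<std::string> annotations;
    std::vector<std::unique_ptr<Node>> children;
};

std::unique_ptr<Node> makeNode(const std::string& name) {
    std::unique_ptr<Node> node(new Node());
    node->name = name;
    return node;
}

bool isIntermediate(const Node& node) {
    static const std::string prefix = "Intermediate_NonTerminal_";
    return node.name.compare(0, prefix.size(), prefix) == 0;
}

// Removes every intermediate node below `node`, splicing its children into
// the parent at the same position. Assumption: the removed node's annotations
// are appended to each promoted child.
void flatten(Node& node) {
    std::vector<std::unique_ptr<Node>> result;
    for (std::unique_ptr<Node>& child : node.children) {
        flatten(*child);  // children first, so nested intermediates are gone
        if (isIntermediate(*child)) {
            for (std::unique_ptr<Node>& grandchild : child->children) {
                grandchild->annotations.insert(grandchild->annotations.end(),
                                               child->annotations.begin(),
                                               child->annotations.end());
                result.push_back(std::move(grandchild));
            }
        } else {
            result.push_back(std::move(child));
        }
    }
    node.children = std::move(result);
}
```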
Keywords must be defined. The list should include:
- `Start`: non-terminal responsible for indicating the grammar's start point. Therefore, no `START` token should be allowed.
- `TOKENS`, `SKIP` or `RULES` should also not be allowed.
More entries may be added to this list.
With parsing now finished, the parsing tree needs to be converted into a syntax tree. For this, a generic visitor should be created to allow for contextual approaches to each node type.
Context-specific visitors should also be created, in order to convert the parsing tree into a syntax tree.
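A generic visitor is commonly built with double dispatch: each node type implements `accept`, and each concrete visitor overrides one `visit` overload per node type. The sketch below uses two hypothetical node types and a hypothetical `TerminalCollector` as the context-specific visitor; the project's real node hierarchy is likely richer.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct TerminalNode;
struct NonTerminalNode;

// Generic visitor: one overload per node type.
class Visitor {
public:
    virtual ~Visitor() = default;
    virtual void visit(TerminalNode& node) = 0;
    virtual void visit(NonTerminalNode& node) = 0;
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& visitor) = 0;
};

struct TerminalNode : Node {
    std::string text;
    explicit TerminalNode(std::string t) : text(std::move(t)) {}
    void accept(Visitor& visitor) override { visitor.visit(*this); }
};

struct NonTerminalNode : Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    explicit NonTerminalNode(std::string n) : name(std::move(n)) {}
    void accept(Visitor& visitor) override { visitor.visit(*this); }
};

// Example of a context-specific visitor: gathers terminal texts in
// left-to-right order.
class TerminalCollector : public Visitor {
public:
    std::vector<std::string> terminals;

    void visit(TerminalNode& node) override { terminals.push_back(node.text); }

    void visit(NonTerminalNode& node) override {
        for (std::unique_ptr<Node>& child : node.children) child->accept(*this);
    }
};
```

New behaviour, such as converting the parsing tree into a syntax tree, can then be added as another `Visitor` subclass without touching the node classes.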