Comments (7)
Hello :-),
Is the PP language mainly intended for creating new DSLs rather than porting existing ones?
PP aims at describing all LL(*) grammars. The language is mainly inspired by JavaCC and YACC, but it has additional unique constructions, such as unification (please see the README.md or the documentation). Consequently, PP is more expressive than the (E)BNF formalism.
Why was the PP language created rather than a compiler for (E)BNF?
Please see the previous answer. We needed new constructions to go further. Moreover, the (E)BNF formalism has several limitations.
The PP language, along with the Hoa\Compiler\Llk library, is the result of my recent research for my PhD thesis. This work was published at A-MOST 2012 (please see the details, the full research paper or the presentation). One of the goals was to generate data based on a grammar. We proposed three different algorithms; see the article for more information.
Would it be worth porting EBNF to PP or would I be better off using another compiler?
No, porting is very easy and natural. Taking your example:
QueryLanguage ::= SelectStatement | UpdateStatement | DeleteStatement
becomes
QueryLanguage:
    SelectStatement() | UpdateStatement() | DeleteStatement()
And
SelectStatement ::= SelectClause FromClause [WhereClause] [GroupByClause] [HavingClause] [OrderByClause]
UpdateStatement ::= UpdateClause [WhereClause]
DeleteStatement ::= DeleteClause [WhereClause]
becomes
SelectStatement:
    SelectClause() FromClause() WhereClause()? GroupByClause()? HavingClause()? OrderByClause()?
UpdateStatement:
    UpdateClause() WhereClause()?
DeleteStatement:
    DeleteClause() WhereClause()?
Also:
IdentificationVariable ::= identifier
becomes
IdentificationVariable:
    <identifier>
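The mapping shown in these examples is mechanical enough to sketch in a few lines of Python. This is only an illustration, not part of Hoa: the function name ebnf_rule_to_pp and its tokens parameter are hypothetical, and it only handles what the examples above use (sequences, | alternatives, and [Symbol] optionals around a single symbol).

```python
import re


def ebnf_rule_to_pp(rule: str, tokens: frozenset = frozenset()) -> str:
    """Convert one simple 'Name ::= ...' EBNF rule to PP syntax.

    Hypothetical helper: supports sequences, '|' alternatives and
    '[Symbol]' optionals around a single symbol, nothing else.
    """
    name, body = (part.strip() for part in rule.split("::=", 1))

    def convert_symbol(symbol: str) -> str:
        # Names listed in `tokens` become lexemes (<name>);
        # every other name is treated as a rule call (name()).
        return f"<{symbol}>" if symbol in tokens else f"{symbol}()"

    converted = []
    for item in body.split():
        if item == "|":
            converted.append("|")
            continue
        optional = re.fullmatch(r"\[(\w+)\]", item)
        if optional:
            # [X] in EBNF becomes X()? in PP.
            converted.append(convert_symbol(optional.group(1)) + "?")
        elif re.fullmatch(r"\w+", item):
            converted.append(convert_symbol(item))
        else:
            raise ValueError(f"unsupported EBNF construct: {item!r}")
    # The rule body is indented under the 'Name:' line.
    return f"{name}:\n    " + " ".join(converted)


print(ebnf_rule_to_pp("UpdateStatement ::= UpdateClause [WhereClause]"))
print(ebnf_rule_to_pp("IdentificationVariable ::= identifier",
                      frozenset({"identifier"})))
```

Anything beyond these constructs (nested groups, repetitions, quoted literals) raises an error here instead of being guessed at.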
The documentation is still in French, but we are working hard on translating it. The README.md and the full paper should help you.
It is also possible to build a compiler that transforms an (E)BNF grammar into PP. It would be really easy. However, the Hoa\Compiler\Llk library builds an AST at the end of the analysis if it has succeeded. The shape of the AST can be controlled by using sharps (again, see the #node construction in the README.md & co.). This information does not appear in an (E)BNF grammar.
Finally, you're welcome on #hoaproject
to get help!
Thanks for the very extensive reply!
It's great news that PP is more expressive than (E)BNF; as you show, porting won't be a problem then. It's a very interesting suggestion to use the compiler to parse (E)BNF itself and then convert it to PP.
Closing issue: Answer covered original question ...
Off Topic:
By the way, I already looked at the full research paper another time, but I was just scratching my head about it o_O The presentation looks a bit easier to follow.
Generating unit tests is also pretty cool; the phpunit-skeleton-generator is pretty limited in what it does, so it could use some help.
When looking at the presentation I don't understand most of it, but what I think is super valuable is having an @invariant annotation (I know this from a presentation on Domain-Driven Design). So I guess the other ones are also useful :) I'm not sure why you implemented a @throwable annotation, as phpdoc already describes a @throws annotation, so it might be a duplicate? http://www.phpdoc.org/docs/latest/for-users/phpdoc/tags/throws.html
I've been wondering for a long time how to test all possible inputs for a unit test (without writing them all yourself), or at least a set that gives you enough probability to cover things. I guess this can now be done with the software in the presentation?
It's totally off-topic, but Praspel is a contract language. It's not just writing API documentation, it's writing a specification. Read the presentation and maybe take a look at https://github.com/hoaproject/Praspel and https://github.com/hoaproject/Hoathis-Atoum. I'm working hard on it since I am reaching the end of my PhD thesis and I have to finish this work before June ;-).
So Praspel is integrated with atoum through the Hoathis-Atoum library? So it cannot be used directly with PHPUnit?
PHPUnit is a unit testing tool, just like atoum. But atoum is more powerful, faster and more modular. Please watch https://github.com/atoum/atoum, https://github.com/Hywan/atoum-instrumentation/, https://github.com/jubianchi/atoum, https://github.com/Hywan/atoum and https://github.com/hoaproject/Hoathis-Atoum to get more information about the activity. We will publish a big article, slides & co. to explain our current work. Stay tuned :-).
I tried to port an EBNF file to PP, but this is not easy. Done manually, the process is too error-prone and extremely tedious. Automatic conversion is not easy either. I managed to do it with a bunch of regular expressions and other tricks, but encountered many cases where special things needed to be done:
- remove comments from the EBNF
- replace the equal sign with a colon to start each rule
- replace things like '[' with a placeholder to prevent difficulties with bracket replacement later on
- find groups in brackets: single items should not get parentheses, several items need parentheses. Group types: normal, one-or-more, zero-or-more
- remove commas and semi-colons (they have no purpose in PP)
- replace quoted literals with the PP notation ::literal::
- replace placeholders with PP literals
- find all literals and make a unique list so that a token alias can be assigned
- escape characters in literals, because PP uses regexes to denote them (and I will also use regexes to find and replace them)
- replace literals with token aliases (::literal:: -> ::token1::)
- further replace literals and escape all characters that fall under \h (horizontal whitespace), because the PP parser will eat them otherwise. See also #19
- make each rule a node in the grammar (by adding #)
- add the list of tokens to the output file (%token alias regex)
- add parentheses to every rule use
- manually (too much effort to automate) replace ANY - (literals) with the regex [^token_literals]
- manually (too much effort to automate) replace the quantifier 4 * literal with the regex literal{4,4}
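As a rough illustration of the literal-handling steps in this list (unique literals, token aliases, escaping, and the %token alias regex lines), here is a minimal Python sketch. It is not the script used above: the names escape_literal and assign_token_aliases are hypothetical, only single-quoted literals are handled, and writing a space as \x20 is an assumption about one way to keep horizontal whitespace from being eaten.

```python
import re

# Characters escaped because they are special in a PCRE pattern.
PCRE_SPECIALS = set(r"\.^$|?*+()[]{}")


def escape_literal(literal: str) -> str:
    """Turn a literal into a regex that matches it verbatim."""
    out = []
    for ch in literal:
        if ch == " ":
            out.append(r"\x20")  # assumption: hex escape survives \h trimming
        elif ch == "\t":
            out.append(r"\t")
        elif ch in PCRE_SPECIALS:
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def assign_token_aliases(grammar: str):
    """Replace 'literal' with ::tokenN:: and build the %token lines."""
    aliases = {}  # literal -> alias, in order of first appearance

    def alias_for(match: re.Match) -> str:
        literal = match.group(1)
        # Reuse the alias if this literal was seen before.
        alias = aliases.setdefault(literal, f"token{len(aliases) + 1}")
        return f"::{alias}::"

    rewritten = re.sub(r"'([^']*)'", alias_for, grammar)
    declarations = [
        f"%token {alias} {escape_literal(literal)}"
        for literal, alias in aliases.items()
    ]
    return declarations, rewritten


declarations, body = assign_token_aliases("'[' item ']' | '[' ' ' ']'")
print("\n".join(declarations))
print(body)
```

Doing the alias assignment in a single pass avoids the placeholder step for literals such as '[': by the time brackets are rewritten, the quoted brackets are already ::tokenN:: references.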
Now the output is valid PP syntax; however, it does not follow PP best practice. For example, to be case-insensitive in EBNF, a rule has to be made with one literal for lower case and one for upper case. To parse hello, the following rules are now present:
#H:
    ::token43:: | ::token73::
#E:
    ::token40:: | ::token70::
#L:
    ::token47:: | ::token77::
#O:
    ::token50:: | ::token80::
#myHello:
    H() E() L() L() O()
Now a further optimization step is needed that will replace #myHello with ::hello_token::, where hello_token is [hH][eE][lL][lL][oO], for performance, clarity and a more idiomatic PP code style.
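The regex for such a collapsed token can be generated instead of written by hand. The following Python sketch is only an illustration: the function names are hypothetical, it accepts ASCII letters only, and detecting which letter-by-letter rules can be collapsed is left out.

```python
import re


def case_insensitive_regex(word: str) -> str:
    """Build a regex like [hH][eE]... that matches `word` in any casing."""
    parts = []
    for ch in word:
        if not (ch.isascii() and ch.isalpha()):
            raise ValueError(f"only ASCII letters are supported, got {ch!r}")
        parts.append(f"[{ch.lower()}{ch.upper()}]")
    return "".join(parts)


def pp_token_declaration(alias: str, word: str) -> str:
    """Format a '%token alias regex' line for a case-insensitive keyword."""
    return f"%token {alias} {case_insensitive_regex(word)}"


print(pp_token_declaration("hello_token", "hello"))

# Sanity check with Python's own regex engine.
assert re.fullmatch(case_insensitive_regex("hello"), "HeLLo") is not None
```

With such a token declared, the H, E, L, O and myHello rules can be dropped and every myHello() call replaced by ::hello_token::.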
Also, I now have valid PP syntax, but the test input doesn't parse, so more has to be done to the grammar.
A full-fledged EBNF-to-PP translator with optimizations would be nice to have.