Comments (7)

Hywan commented on May 17, 2024

Hello :-),

Is the PP language mainly intended for creating new DSLs rather than porting existing ones?

PP aims at describing all LL(*) grammars. The language is mainly inspired by JavaCC and YACC, but it has additional unique constructions, such as unification (please see the README.md or the documentation). Consequently, PP is more expressive than the (E)BNF formalism.

Why was the PP language created instead of a compiler for (E)BNF?

Please see the previous answer. We needed new constructions to go further. Moreover, the (E)BNF formalism has several limitations.

The PP language, along with the Hoa\Compiler\Llk library, is the result of my recent research for my PhD thesis. This work was published at A-MOST 2012 (please see the details, the full research paper or the presentation). One of the goals was to generate data based on a grammar. We proposed three different algorithms; see the article for more information.

Would it be worth porting EBNF to PP or would I be better off using another compiler?

No, porting is very easy and natural. Taking your example:

QueryLanguage ::= SelectStatement | UpdateStatement | DeleteStatement

becomes

QueryLanguage:
    SelectStatement() | UpdateStatement() | DeleteStatement()

And

SelectStatement ::= SelectClause FromClause [WhereClause] [GroupByClause] [HavingClause] [OrderByClause]
UpdateStatement ::= UpdateClause [WhereClause]
DeleteStatement ::= DeleteClause [WhereClause]

becomes

SelectStatement:
    SelectClause() FromClause() WhereClause()? GroupByClause()? HavingClause()? OrderByClause()?

UpdateStatement:
    UpdateClause() WhereClause()?

DeleteStatement:
    DeleteClause() WhereClause()?

Also:

IdentificationVariable ::= identifier

becomes

IdentificationVariable:
    <identifier>

The documentation is still in French, but we are working hard on translating it. The README.md and the full paper should help you.

It is also possible to build a compiler that transforms an (E)BNF grammar into PP. It would be really easy. However, the Hoa\Compiler\Llk library builds an AST at the end of the analyses if they have succeeded. The AST balance can be controlled by using sharps (again, see the #node construction in the README.md & co.). This information does not appear in an (E)BNF grammar.
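As a rough illustration of how small such a translator can start out, here is a minimal Python sketch that handles only the rule shapes quoted above. It is not part of Hoa\Compiler; the function name `ebnf_rule_to_pp` is hypothetical, and it assumes that capitalised names are rules, lowercase names are tokens, and that optional brackets wrap a single symbol.

```python
import re


def ebnf_rule_to_pp(rule: str) -> str:
    """Translate one `Name ::= ...` rule into PP notation (sketch only)."""
    name, body = (part.strip() for part in rule.split("::=", 1))

    # An optional single symbol `[X]` becomes `X?`.
    body = re.sub(r"\[\s*(\w+)\s*\]", r"\1?", body)

    def convert(match: re.Match) -> str:
        symbol = match.group(0)
        # Assumption: capitalised names are rules, lowercase names are tokens.
        return f"{symbol}()" if symbol[0].isupper() else f"<{symbol}>"

    body = re.sub(r"\w+", convert, body)
    return f"{name}:\n    {body}"


print(ebnf_rule_to_pp("UpdateStatement ::= UpdateClause [WhereClause]"))
print(ebnf_rule_to_pp("IdentificationVariable ::= identifier"))
```

Note that this sketch emits no #node markers, so the AST information mentioned above would still have to be added by hand.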

Finally, you're welcome on #hoaproject to get help!

from compiler.

flip111 commented on May 17, 2024

Thanks for the very extensive reply!

It's great news that PP is more expressive than (E)BNF; as you show, porting won't be a problem then. It's a very interesting suggestion to use the compiler to parse (E)BNF itself and then convert it to PP.

Closing issue: Answer covered original question ...

Off Topic:
By the way, I already looked at the full research paper before, but I was just scratching my head about it o_O The presentation looks a bit easier to follow.

Generating unit tests is also pretty cool; the phpunit-skeleton-generator is pretty limited in what it does, so it could use some help.

When looking at the presentation I don't understand most of it, but what I think is super valuable is having an @invariant annotation (I know this from a presentation on Domain Driven Design). So I guess the other ones are also useful :) I'm not sure why you implemented a @throwable annotation, as phpdoc already describes a @throws annotation, so it might be a duplicate? http://www.phpdoc.org/docs/latest/for-users/phpdoc/tags/throws.html

I've been wondering for a long time how to test all possible inputs for a unit test (without writing them all yourself), or at least a set that gives you enough probability to cover things. I guess this can now be done with the software in the presentation?

Hywan commented on May 17, 2024

It's totally off-topic, but Praspel is a contract language. It's not just writing API documentation, it's writing a specification. Read the presentation and maybe take a look at https://github.com/hoaproject/Praspel and https://github.com/hoaproject/Hoathis-Atoum. I have been working hard on it, since I am reaching the end of my PhD thesis and I have to finish this work before June ;-).

flip111 commented on May 17, 2024

So Praspel is integrated with atoum through the Hoathis-Atoum library? So it cannot be used directly with PHPUnit?

Hywan commented on May 17, 2024

PHPUnit is a unit testing tool, just like atoum. But atoum is more powerful, faster and more modular. Please watch https://github.com/atoum/atoum, https://github.com/Hywan/atoum-instrumentation/, https://github.com/jubianchi/atoum, https://github.com/Hywan/atoum and https://github.com/hoaproject/Hoathis-Atoum to get more information about the activity. We will publish a big article, slides & co. to explain our current work. Stay tuned :-).

flip111 commented on May 17, 2024

👍

flip111 commented on May 17, 2024

I tried to port an EBNF file to PP, but this is not easy. Done manually, the process is too error-prone and extremely tedious. Automatic conversion is not easy either. I managed to do it with a bunch of regular expressions and other tricks, but encountered many cases where special handling was needed.

  • remove comments from ebnf
  • replace the equals sign with a colon at the start of each rule
  • Replace things like '[' with a placeholder to prevent difficulties with bracket replacement later on.
  • Find groups in brackets, single items should not get parenthesis, several items need parenthesis. Group types: normal, one-or-more, zero-or-more
  • remove commas and semicolons (they have no purpose in PP)
  • replace quoted literals by pp notation ::literal::
  • replace placeholders to pp literal
  • find all literals, make a unique list so that a token alias can be assigned
  • escape characters in literals because pp uses regex to denote them (and also i will use regex to find and replace them)
  • replace literals with token aliases (::literal:: -> ::token1::)
  • further replacement of literals and escape all characters that fall under \h horizontal whitespace, because the pp parser will eat them otherwise. See also #19
  • make rules a node in grammar (by adding #)
  • add list of tokens to the output file (%token alias regex)
  • add parentheses to every rule invocation
  • manually (too much effort to automate this) replace ANY - (literals) with regex [^token_literals]
  • manually (too much effort to automate this) replace quantifier 4 * literal with regex literal{4,4}
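The literal-handling steps in the list above (collect unique literals, escape them, assign token aliases, emit %token lines) can be sketched in Python as follows. This is only an illustration, not the script actually used: the helper names `escape_literal` and `extract_tokens` are hypothetical, literals are assumed to be single-quoted, and writing a space as `\x20` is an assumption about how to keep the PP parser from eating horizontal whitespace.

```python
from __future__ import annotations

import re


def escape_literal(literal: str) -> str:
    """Escape a literal so it can be used as a token regex."""
    out = []
    for ch in literal:
        if ch == " ":
            out.append(r"\x20")  # assumption: hex escape survives the PP parser
        elif ch == "\t":
            out.append(r"\t")
        else:
            out.append(re.escape(ch))  # escapes regex metacharacters only
    return "".join(out)


def extract_tokens(grammar: str) -> tuple[str, list[str]]:
    """Replace quoted literals with ::alias:: and build the %token lines."""
    aliases: dict[str, str] = {}

    def alias_for(match: re.Match) -> str:
        literal = match.group(1)
        if literal not in aliases:
            # Each distinct literal gets exactly one alias.
            aliases[literal] = f"token{len(aliases) + 1}"
        return f"::{aliases[literal]}::"

    body = re.sub(r"'([^']*)'", alias_for, grammar)
    declarations = [
        f"%token {alias} {escape_literal(literal)}"
        for literal, alias in aliases.items()
    ]
    return body, declarations


body, declarations = extract_tokens("Greeting = 'hi' , 'a+b' , 'hi' ;")
print("\n".join(declarations))
print(body)
```

Keeping the alias table in one dictionary means a literal that appears several times is declared once and reused, which is the "unique list" step from the list.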

Now the output is valid PP syntax; however, it does not follow PP best practice. For example, to be case-insensitive in EBNF, a rule has to be made with one literal for lower case and one for upper case. To parse hello, the following rules are now present:

#H:
    ::token43:: | ::token73:: 

#E:
    ::token40:: | ::token70:: 

#L:
    ::token47:: | ::token77:: 

#O:
    ::token50:: | ::token80:: 

#myHello:
    H() E() L() L() O()

Now a further optimization step is needed that will replace #myHello with ::hello_token::, where hello_token is [hH][eE][lL][lL][oO], for performance, clarity, and a more idiomatic PP code style.
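A minimal Python sketch of the regex-building part of that optimization is shown below. The function names are hypothetical, only ASCII letters are expanded, and the %token line follows the "%token alias regex" form mentioned in the list above; detecting which rule chains (such as H() E() L() L() O()) spell a word is left out.

```python
import re


def case_insensitive_regex(word: str) -> str:
    """Build a regex that matches `word` in any mix of upper and lower case."""
    parts = []
    for ch in word:
        if ch.isascii() and ch.isalpha():
            parts.append(f"[{ch.lower()}{ch.upper()}]")
        else:
            parts.append(re.escape(ch))  # non-letters are matched literally
    return "".join(parts)


def token_declaration(alias: str, word: str) -> str:
    """Format a token line in the `%token alias regex` form."""
    return f"%token {alias} {case_insensitive_regex(word)}"


print(token_declaration("hello_token", "hello"))
```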

Also, I now have valid PP syntax, but the test input doesn't parse, so more has to be done to the grammar.

A full-fledged EBNF-to-PP translator with optimizations would be nice to have.
