Sage

[Version 1.0.0 - 12/15/2015]

Sage is a packrat PEG parser I created for the purpose of building my own language. It supports parsing generic PEG grammars (some examples are included in the /grammars folder).

Features

  • Full-fledged regex support
    • I am aware C++11 provides support for regexes, but I wanted to roll my own library
    • This allows me to have complete control over the scanning and parsing process
    • This was also for my own education of NFA -> DFA -> Regex conversions
  • Arbitrary Scanning
    • Scanning is provided in a manner similar to the Java Scanner
    • Allows for reading in delimited tokens (and non-delimited ones when the token is not word bounded)
  • PEG Parsing
    • By using the PEGParser class, one can construct a PEG parser from a .peg file
    • The parser can then parse an arbitrary file according to this grammar and return an AST
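
The Java-Scanner-style token reading described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the class name `TokenScanner`, its `hasNext`/`next` methods, and the default whitespace delimiter set are hypothetical and are not taken from Sage's actual Scanner.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical illustration only; this is not Sage's Scanner class.
class TokenScanner {
public:
    explicit TokenScanner(std::string input, std::string delimiters = " \t\r\n")
        : input_(std::move(input)), delims_(std::move(delimiters)) {}

    // True if at least one more token remains in the input.
    bool hasNext() const {
        return input_.find_first_not_of(delims_, pos_) != std::string::npos;
    }

    // Returns the next delimited token, or an empty string once the
    // input is exhausted.
    std::string next() {
        std::size_t start = input_.find_first_not_of(delims_, pos_);
        if (start == std::string::npos) {
            pos_ = input_.size();
            return "";
        }
        std::size_t end = input_.find_first_of(delims_, start);
        if (end == std::string::npos) end = input_.size();
        pos_ = end;
        return input_.substr(start, end - start);
    }

private:
    std::string input_;
    std::string delims_;
    std::size_t pos_ = 0;
};
```

Constructing `TokenScanner s("195 + 14")` and calling `s.next()` repeatedly yields the tokens `195`, `+`, and `14` in turn. A scanner like this only handles delimited tokens; reading tokens that are not word bounded would need a pattern-driven read instead.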

Limitations

This is primarily an experimental project, and I believe I've run the course of the experiment, especially after studying up on Unicode functionality. Below are some of the limitations of Sage if you decide to play with it:

  • Unicode Support
    • While I could perhaps look into getting this running, it would be an insane amount of work to try to compete against something like ICU or Boost.Locale. And if I were to settle on either of these libraries, I might as well use their corresponding regex libraries too (which also support Unicode).
  • Contextual Analysis
    • I would like to incorporate some means of declaring types within the PEG file so that Sage can perform the contextual analysis, but it's not something I'm too excited about jumping on quite yet.
  • Scanning
    • If a language is whitespace sensitive (something like Python), the provided Scanner offers very limited support for it.

Example

The following grammar is also written out in arithmetic.peg under the /grammars folder:

Expression' -> Sum
Sum         -> Product ("[+\-]" Product)*
Product     -> Value ("[*/]" Value)*
Value       -> "[0-9\"]+" | "\(" Expression "\)"

Here a PEG statement is broken up into a term (the nonterminal) and a definition (everything to the right of the arrow operator). To refer to another nonterminal, simply use its name. To refer to a terminal, write a regex inside a quoted string (according to the rules of the custom regex module). Note that regexes and nonterminals can also be grouped with parentheses and repeated with the Kleene star ('*') or the plus operator ('+'), or made optional ('?').
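
To make the meaning of the rules concrete, here is a hand-written recognizer that mirrors the four rules one function at a time. It is a sketch under stated assumptions, not how Sage works internally: Sage builds its parser from the .peg file, whereas the class `ArithmeticRecognizer` and its helpers are hypothetical. The sketch also assumes that whitespace between tokens is skipped, treats the first alternative of Value as one or more digits, and omits the memoization a packrat parser would add.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical hand-written recognizer for the arithmetic grammar.
// Not part of Sage's API.
class ArithmeticRecognizer {
public:
    explicit ArithmeticRecognizer(std::string text) : text_(std::move(text)) {}

    // Expression' -> Sum, and the whole input must be consumed.
    bool accepts() {
        pos_ = 0;
        if (!sum()) return false;
        skipSpaces();
        return pos_ == text_.size();
    }

private:
    // Assumption: whitespace between tokens is ignored.
    void skipSpaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
    }

    // Matches one character from `set`, restoring the position on failure.
    bool oneOf(const std::string& set) {
        std::size_t saved = pos_;
        skipSpaces();
        if (pos_ < text_.size() && set.find(text_[pos_]) != std::string::npos) {
            ++pos_;
            return true;
        }
        pos_ = saved;
        return false;
    }

    // Sum -> Product ("[+\-]" Product)*
    bool sum() {
        if (!product()) return false;
        for (;;) {
            std::size_t saved = pos_;
            if (oneOf("+-") && product()) continue;
            pos_ = saved;  // the star stops here; undo any partial match
            return true;
        }
    }

    // Product -> Value ("[*/]" Value)*
    bool product() {
        if (!value()) return false;
        for (;;) {
            std::size_t saved = pos_;
            if (oneOf("*/") && value()) continue;
            pos_ = saved;
            return true;
        }
    }

    // Value -> digits | "\(" Expression "\)"
    bool value() {
        std::size_t saved = pos_;
        skipSpaces();
        std::size_t start = pos_;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) {
            ++pos_;
        }
        if (pos_ > start) return true;  // first alternative matched
        pos_ = saved;
        // Ordered choice: the second alternative is tried only after the
        // first fails. Expression is just Sum, so sum() is called directly.
        if (oneOf("(") && sum() && oneOf(")")) return true;
        pos_ = saved;
        return false;
    }

    std::string text_;
    std::size_t pos_ = 0;
};
```

Each repetition group saves the position before trying another round and restores it when the round fails, which is how a PEG star matches greedily without consuming a dangling operator. Alternatives are tried strictly in the order written.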

The above would parse the following statement:

195 + (186 * 32) - 14 / 9

and construct the following AST:

|- Sum
|---------- Product
|-------------------- Value
|------------------------- 195
|-------------------- +
|-------------------- Product
|------------------------------ Value
|---------------------------------------- (
|---------------------------------------- Expression
|--------------------------------------------- Sum
|------------------------------------------------------- Product
|----------------------------------------------------------------- Value
|---------------------------------------------------------------------- 186
|---------------------------------------------------------------------- *
|---------------------------------------------------------------------- Value
|--------------------------------------------------------------------------- 32
|---------------------------------------- )
|-------------------- -
|-------------------- Product
|------------------------------ Value
|----------------------------------- 14
|----------------------------------- /
|----------------------------------- Value
|---------------------------------------- 9
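
A tree like the one above can be represented with a very small node type. The following is a minimal sketch, assuming a hypothetical `Node` struct and `render` function; Sage's real AST classes and its exact indentation scheme may differ (this sketch simply adds five dashes per level of depth).

```cpp
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type; Sage's real AST classes may look different.
struct Node {
    std::string label;                            // rule name or matched text
    std::vector<std::unique_ptr<Node>> children;  // empty for leaves

    explicit Node(std::string text) : label(std::move(text)) {}

    // Appends a child and returns a pointer to it for further building.
    Node* add(std::string text) {
        children.push_back(std::unique_ptr<Node>(new Node(std::move(text))));
        return children.back().get();
    }
};

// Writes one line per node, adding five dashes per level of depth.
void render(const Node& node, std::ostream& out, std::size_t depth = 0) {
    out << '|' << std::string(1 + 5 * depth, '-') << ' ' << node.label << '\n';
    for (const auto& child : node.children) {
        render(*child, out, depth + 1);
    }
}

// Convenience wrapper that returns the rendering as a string.
std::string renderToString(const Node& root) {
    std::ostringstream out;
    render(root, out);
    return out.str();
}
```

Nonterminals such as Sum or Product become nodes with children, while matched text such as `195` or `+` becomes a leaf, so a depth-first walk reproduces the input order.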
