
ironmeta's Introduction


The IronMeta parser generator provides a programming language and application for generating pattern matchers on arbitrary streams of objects. It is an implementation of Alessandro Warth's OMeta system in C#.

IronMeta is available under the terms of the BSD License.

Changelog

The changelog is available in the repo: CHANGELOG.

Using IronMeta

IronMeta is available on NuGet. To install it, open the NuGet shell and type Install-Package IronMeta, or use the NuGet tools for Visual Studio. This will install the IronMeta library in your project.

Once you have installed the NuGet package, add a grammar file with the extension .ironmeta to your project. Then generate a C# class from it. You can do this in three ways:

  • You can install a Visual Studio extension that provides a custom tool for generating C# code from IronMeta files. You must set the "Custom Tool" property of your IronMeta file to be IronMetaGenerator. Then the C# code will be generated whenever your grammar file changes. Syntax errors will appear in your Error List.
  • IronMeta.Library.dll contains an MSBuild task called "IronMetaGenerate". A simple example of how to use it:
      <UsingTask TaskName="IronMetaGenerate" AssemblyFile="path_to\IronMeta.Library.dll" />
      <Target Name="BeforeBuild">
        <IronMetaGenerate Input="MyParser.ironmeta" Output="MyParser.g.cs" Namespace="MyNamespace" Force="true" />
      </Target>
  • A command-line program IronMetaApp.exe is included in the NuGet package, in the tools directory. The program takes the following arguments:
    • -o {output} (optional): Specify the output file name (defaults to {input}_.g.cs).
    • -n {namespace} (optional): Specify the namespace to use for the generated parser (defaults to the name of the directory the input file is in).
    • -f (optional): Force generation even if the input file is older than the output file.
    • {input}: Specify the input file name (must end in .ironmeta).

To use an IronMeta-generated parser in your C# program, create a new instance of the generated parser class. Then call the function GetMatch() with the input you wish to parse and the method of the generated parser object that corresponds to the top-level rule you wish to use. This returns an object of type IronMeta.Matcher.MatchResult, which contains information about the result of the match, as well as any errors that might have occurred.

The following is a small sample program that uses the Calc demo parser that is included in the source code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace MyCalcProject
{
    class Program
    {
        static void Main(string[] args)
        {
            var parser = new Calc();
            var match = parser.GetMatch("2 * 7", parser.Expression);

            if (match.Success)
                Console.WriteLine("result: {0}", match.Result); // should print "14"
            else
                Console.WriteLine("error: {0}", match.Error); // shouldn't happen
        }
    }
}

Building from Source

When you have checked out the GitHub repository, you will need to generate your own signing key in Source\IronMeta.snk; open a Visual Studio Developer Command Prompt and type:

sn -k Source\IronMeta.snk

Features

Although the most common use for IronMeta is to build parsers on streams of text for use in compiling or other text processing, IronMeta can generate pattern matchers (more accurately, transducers) for any input and output type. You can use C# syntax directly in grammar rules to specify objects to match.

  • IronMeta-generated parsers use strict Parsing Expression Grammar semantics; they are greedy and use committed choice.
  • Generated parsers are implemented as C# partial classes, allowing you to keep ancillary code in a separate file from your grammar.
  • You can use anonymously-typed object literals in rules; they are matched by comparing their properties with the input objects'.
  • Unrestricted use of C# in semantic conditions and match actions.
  • Higher-order rules: you can pass rules (or arbitrary patterns) as parameters, and then use them in a pattern.
  • Pattern matching on rule arguments: you can apply different rule bodies depending on the number and types of parameters.
  • Flexible variables: variables in an IronMeta rule may be used to:
    • get the input of an expression they are bound to.
    • get the result or result list of an expression they are bound to.
    • match a rule passed as a parameter.
    • pass a rule on to another rule.
  • As an enhancement over the base OMeta, IronMeta allows direct and indirect left recursion, using Sérgio Medeiros et al.'s algorithm for all rules, even within parameter matching.

Current limitations

Error reporting is currently quite rudimentary: it reports only the last error that occurred at the rightmost position in the input.

Performance is quite slow, as not much optimization has been done to date.

The IronMeta Language

This section is an informal introduction to the features of the IronMeta language.

It uses the following IronMeta file named Calc.ironmeta, which is included in the IronMeta distribution. It can also be found in the Samples/Calc directory in the source.

The Calc grammar is much more complex than it needs to be in order to demonstrate some of the advanced functionality of IronMeta.

// IronMeta Calculator Example

using System;
using System.Linq;

ironmeta Calc<char, int> : Matcher<char, int>
{
    Expression = Additive;

    Additive = Add | Sub | Multiplicative;

    Add = BinaryOp(Additive, '+', Multiplicative) -> { return _IM_Result.Results.Aggregate((total, n) => total + n); };
    Sub = BinaryOp(Additive, '-', Multiplicative) -> { return _IM_Result.Results.Aggregate((total, n) => total - n); };

    Multiplicative = Multiply | Divide;
    Multiplicative = Number(DecimalDigit);

    Multiply = BinaryOp(Multiplicative, "*", Number, DecimalDigit) -> { return _IM_Result.Results.Aggregate((p, n) => p * n); };
    Divide = BinaryOp(Multiplicative, "/", Number, DecimalDigit) -> { return _IM_Result.Results.Aggregate((q, n) => q / n); };

    BinaryOp :first :op :second .?:type = first:a KW(op) second(type):b -> { return new List<int> { a, b }; };

    Number :type = Digits(type):n WS* -> { return n; };

    Digits :type = Digits(type):a type:b -> { return a*10 + b; };
    Digits :type = type;

    DecimalDigit = .:c ?( (char)c >= '0' && (char)c <= '9' ) -> { return (char)c - '0'; };
    KW :str = str WS*;
    WS = ' ' | '\n' | '\r' | '\t';
}

We will go through this example line by line to introduce the IronMeta language:

Comments

// IronMeta Calculator Example

You may include comments anywhere in the IronMeta file. They may also be in the C-style form:

/* C-Style Comment */

Preamble

using System;
using System.Linq;

You can include C# using statements at the beginning of an IronMeta file. IronMeta will automatically add using statements to its output to include the namespaces it needs.

Parser Declaration

ironmeta Calc<char, int> : Matcher<char, int>

An IronMeta parser always starts with the keyword ironmeta. Then comes the name of the parser (Calc, in this case), and the input and output types. The generated parser will take as input an IEnumerable of the input type, and return as output an object of the output type.

In this case, the Calc parser will operate on a stream of char values, and output an int value.

Note: you must always include the input and output types.

You may also optionally include a base class:

: Matcher<char, int>

If you do not include a base class, your parser will inherit directly from IronMeta.Matcher.Matcher.

Rules

Expression = Additive;

An IronMeta rule consists of a name, a pattern for matching parameters, a =, a pattern for matching against the main input, and a terminating semicolon ; (for folks used to C#) or comma , (for folks used to OMeta).

In this case, the rule Expression has no parameters, and matches by calling another rule, Additive.

Matching Input

You can use the period . to match any item of input, or you can use arbitrary C# expressions. The C# expressions may be a string literal, a character literal, a regular expression, an object created using the new keyword, or any other expression that is surrounded by curly braces:

MyPattern = 'a' "b" {3.14159} {new MyClass()};

IronMeta will use the standard C# object.Equals() method to match the inputs.
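Because matching relies on object.Equals(), a literal such as {new MyClass()} can only match inputs that compare equal to it. The following is a minimal sketch of that plain C# behavior, not of IronMeta internals. PlainToken, ValueToken, and EqualsDemo are hypothetical names, and which operand Equals() is called on is an assumption.

```csharp
using System;

// Hypothetical input type without an Equals override: two instances holding
// the same data are not equal, because object.Equals compares references.
public sealed class PlainToken
{
    public string Text { get; }
    public PlainToken(string text) { Text = text; }
}

// Hypothetical input type with value equality.
public sealed class ValueToken
{
    public string Text { get; }
    public ValueToken(string text) { Text = text; }

    public override bool Equals(object obj) =>
        obj is ValueToken other && other.Text == Text;

    public override int GetHashCode() => Text.GetHashCode();
}

public static class EqualsDemo
{
    // Assumption: the literal from the grammar is compared to the input item.
    public static bool WouldMatch(object literal, object input) =>
        literal.Equals(input);
}
```

Under these assumptions, a custom input type would need to override Equals() (and GetHashCode()) before a freshly constructed literal could match it.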

Regular Expressions

If your input type is char, you can use simple regular expressions:

MyPattern = /a?bc(def+|ghi)(kl)*/;

You can use the following constructs in regular expressions:

  • One or more single characters, e.g. /abc/. The following syntax characters must be escaped if you want to match them: |, (, ), [, ], \, +, *.
  • Categories: \s matches whitespace, \d matches any Unicode digit, \w matches any Unicode letter, and \p{Cc} matches any character in the given Unicode general category, e.g. Lu, Nd.
  • Disjunctions: /abc|def/.
  • Classes: /[abcd-g]/ (syntax characters must be escaped here as well). You can use negated classes: /[^xyz]/.
  • + matches one or more elements, * matches zero or more, and ? matches zero or one. As usual, these bind tighter than disjunction but looser than sequence.
  • () will group sequences.

Matching Anonymous Objects

You can also use anonymous object syntax (you don't need to surround the whole new expression with braces in this case):

MyPattern = new { Name="MyName", Value="MyValue" }  new { Name="MyName", Value="MySecondValue" };

Literals that you define with anonymous types will be matched according to their public properties; if an input object has the same properties with the same values, it will be considered equal to the anonymous object.
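The property-by-property comparison described above can be pictured with a small reflection sketch. This is only an illustration of the idea, not IronMeta's actual code. Field and PropertyMatchDemo are hypothetical names, and ignoring extra properties on the input object is an assumption.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical input type used only for this illustration.
public sealed class Field
{
    public string Name { get; set; }
    public string Value { get; set; }
    public int Line { get; set; }   // extra property the pattern does not mention
}

public static class PropertyMatchDemo
{
    // Returns true when every public property of `pattern` exists on `input`
    // with an equal value. Extra properties on `input` are ignored here;
    // whether IronMeta does the same is an assumption of this sketch.
    public static bool MatchesByProperties(object pattern, object input)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;
        return pattern.GetType().GetProperties(flags).All(p =>
        {
            var q = input.GetType().GetProperty(p.Name, flags);
            return q != null && object.Equals(p.GetValue(pattern), q.GetValue(input));
        });
    }
}
```

For example, new { Name="MyName", Value="MyValue" } would match a Field with the same Name and Value in this sketch, regardless of its Line.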

Matching Sequences

The pattern literal can also be an IEnumerable of the input type, including C# strings for sequences of characters.

This eliminates the need for the OMeta token function; just use a string literal, or if you are matching on something other than characters, use a list:

MyPattern = {new List<MyInputType>{ a, b, c }};

Sequence and Disjunction

Additive = Add | Sub | Multiplicative;

As is probably obvious from the other rules, you write a sequence of patterns by simply writing them one after the other, separated by whitespace.

To specify a choice between alternatives, separate them with |.

Note: unlike in other parser generator formalisms, separating expressions with a carriage return does NOT mean they are alternatives! You must always use the |.

Other Operators

You can modify the meaning of patterns with the following operators that appear after an expression:

  • ? will match zero or one time.
  • * will match zero or more times.
  • + will match one or more times.
  • {N} will match N times.
  • {Min,Max} will match at least Min times, and at most Max times.

These operators are all greedy: they will match as many times as possible and then return that result.

You can limit what they match by using the following prefix operators:

  • & as a prefix will match an expression but NOT advance the match position. This allows for unlimited lookahead.
  • ~ as a prefix will match if the expression does NOT match. It will not advance the match position.
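As a rough illustration of greedy repetition and of the two prefix operators, here is a tiny stand-alone sketch of PEG-style combinators over a string. It is not how IronMeta is implemented; PegSketch and all of its members are hypothetical helpers. A pattern returns the position after the match, or -1 on failure.

```csharp
using System;

public static class PegSketch
{
    // A pattern maps (input, start position) to the next position, or -1.
    public delegate int Pattern(string input, int pos);

    // "." : any single item.
    public static Pattern Any() => (s, i) => i < s.Length ? i + 1 : -1;

    // A literal sequence of characters.
    public static Pattern Lit(string text) => (s, i) =>
        i + text.Length <= s.Length
        && string.CompareOrdinal(s, i, text, 0, text.Length) == 0
            ? i + text.Length
            : -1;

    // Greedy "*": match as often as possible and never give anything back.
    public static Pattern Star(Pattern p) => (s, i) =>
    {
        while (true)
        {
            int next = p(s, i);
            if (next < 0 || next == i) return i;
            i = next;
        }
    };

    // Sequence: every part must match, one after the other.
    public static Pattern Seq(params Pattern[] parts) => (s, i) =>
    {
        foreach (var p in parts)
        {
            i = p(s, i);
            if (i < 0) return -1;
        }
        return i;
    };

    // "&p": succeeds if p matches, but does not advance.
    public static Pattern And(Pattern p) => (s, i) => p(s, i) >= 0 ? i : -1;

    // "~p": succeeds if p does NOT match, and does not advance.
    public static Pattern Not(Pattern p) => (s, i) => p(s, i) < 0 ? i : -1;
}
```

With these helpers, Seq(Star(Any()), Lit("test")) fails on "a test" because the star consumes the whole input, while Seq(Star(Seq(Not(Lit("test")), Any())), Lit("test")) succeeds because the negative lookahead stops the repetition just before "test".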

Conditions and Actions

DecimalDigit = .:c ?( (char)c >= '0' && (char)c <= '9' ) -> { return (char)c - '0'; };

Here things get more interesting. This rule has only one expression, the period .. This will match a single item of input. It is then bound to the variable c by means of the colon :.

Note: you can leave out the period if you are binding to a variable; that is, :c is equivalent to .:c.

However, this rule will not necessarily match any character, because it contains a condition. A condition is written with ? followed immediately by a C# expression in parentheses. The C# expression must evaluate to a bool value. Once the expression matches (in this case it will match anything), it is bound to the variable c, which is then available for use in your C# code.

The rule also contains an action. Actions are written with -> followed by a C# block surrounded by curly braces. This block must contain a return statement that returns a value of the output type, or a List<> of the output type.

Note: if you do not provide an action for the expression, it will simply return the results of its patterns, as a list. Matching a single item will return default(TResult) by default, or you can pass a delegate or lambda function to the matcher when you create it that will convert values of the input type to the output type.

Be aware that an action only applies to the last expression in an OR expression. So the action in the following:

MyRule = One | Two | Three -> { my action };

will only run if the expression Three matches! If you want an action to apply on an OR, use parentheses:

MyRule = (One | Two | Three) -> { my action };

Variables

Digits :type = Digits(type):a type:b -> { return a*10 + b; };

Upon a successful match, variables will contain information about the results of the match of the expression they are bound to. In this example, a and b are used in an integer expression, so they are implicitly converted to the results of the expressions they are bound to; this works because the result type of the Calc grammar is int.
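To make the arithmetic concrete, the following sketch unrolls what the left-recursive Digits rule computes when type is DecimalDigit. It is plain C# rather than generated parser code, and DigitsSketch and its members are hypothetical names.

```csharp
using System;
using System.Linq;

public static class DigitsSketch
{
    // Same test and conversion as the DecimalDigit rule's condition and action.
    public static bool IsDecimalDigit(char c) => c >= '0' && c <= '9';
    public static int DigitValue(char c) => c - '0';

    // Digits = Digits:a DecimalDigit:b -> a*10 + b, unrolled into a left fold.
    public static int FoldDigits(string digits)
    {
        if (digits.Length == 0 || !digits.All(c => IsDecimalDigit(c)))
            throw new ArgumentException("expected one or more decimal digits", nameof(digits));

        return digits.Select(c => DigitValue(c)).Aggregate((a, b) => a * 10 + b);
    }
}
```

For the input "123" the fold evaluates ((1 * 10) + 2) * 10 + 3, which is 123.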

IronMeta variables are very flexible. They contain implicit cast operators to:

  • A single value of the input type: this will return the last item in the list of results of the expression that the variable is bound to.
  • A single value of the output type.
  • A List<> of the input type.
  • A List<> of the output type.

If your input and output types are the same, the implicit cast operators will only return the inputs, and you will need to use the explicit variable properties:

  • c.Inputs returns the list of inputs that the parse pattern matched.
  • c.Results returns the list of results that resulted from the match.
  • c.StartIndex returns the index in the input stream at which the pattern started matching.
  • c.NextIndex returns the first index in the input stream after the pattern match ended.

You can also use variables in a pattern, in which case they will match whatever input they matched when they were bound. Or, if they were bound to a rule in a parameter pattern (see below), they will call that rule. You can even pass parameters to them.

Built-In Variables

IronMeta automatically defines a variable for use in your C# code: _IM_Result is bound to the entire expression that your condition or action applies to.

Multiple Rule Bodies

Multiplicative = Multiply | Divide;
Multiplicative = Number(DecimalDigit);

You can have multiple rule bodies; their patterns will be combined in one overall OR when that rule is called.

Parameters

Add = BinaryOp(Additive, '+', Multiplicative) -> { return _IM_Result.Results.Aggregate((total, n) => total + n); };

This rule shows that you can pass parameters to a rule. You can pass literal match patterns, rule names, or variables.

BinaryOp :first :op :second .?:type = first:a KW(op) second(type):b -> { return new List<int> { a, b }; };

This rule demonstrates how to match parameters. The parameter part of a rule is actually a matching pattern no different than that on the right-hand side of the =. Using this fact, plus the ability to specify multiple rules with the same name, you can write rules that match differently depending on the number and kind of parameters they are passed.
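Putting the two rules above together: the action of BinaryOp returns the two operand values as a list, and the action of the calling rule folds that list with Aggregate. The sketch below reproduces only that folding step in plain C#. AggregateSketch and its members are hypothetical stand-ins, not generated code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateSketch
{
    // Stand-in for the result list produced by BinaryOp: { a, b }.
    public static List<int> BinaryOpResults(int a, int b) => new List<int> { a, b };

    // The same lambdas that the Add, Sub, Multiply, and Divide actions use.
    public static int Add(int a, int b) =>
        BinaryOpResults(a, b).Aggregate((total, n) => total + n);

    public static int Sub(int a, int b) =>
        BinaryOpResults(a, b).Aggregate((total, n) => total - n);

    public static int Multiply(int a, int b) =>
        BinaryOpResults(a, b).Aggregate((p, n) => p * n);

    public static int Divide(int a, int b) =>
        BinaryOpResults(a, b).Aggregate((q, n) => q / n);
}
```

Aggregate without a seed starts from the first element, so Sub(10, 3) evaluates 10 - 3 and Divide(14, 2) evaluates 14 / 2.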

Rules as Arguments

Add = BinaryOp(Additive, '+', Multiplicative) -> { return _IM_Result.Results.Aggregate((total, n) => total + n); };
BinaryOp :first :op :second .?:type = first:a KW(op) second(type):b -> { return new List<int> { a, b }; };

These rules show that you can pass rules as parameters to other rules. To match against them, just capture them in a variable in your parameter pattern, and then use the variable as an expression in your pattern. You can pass parameters as usual.

Patterns as Arguments

You can also pass arbitrary patterns as arguments. Variables from the outer rule that you use in the argument pattern will be passed to the inner pattern when matching.

List Folding

KW :str = str WS*;

If you look at the rules that call this rule (indirectly through the BinaryOp rule), you'll see that they pass both a single character and a string:

Sub = BinaryOp(Additive, '-', Multiplicative) -> { return _IM_Result.Results.Aggregate((total, n) => total - n); };
Divide = BinaryOp(Multiplicative, "/", Number, DecimalDigit) -> { return _IM_Result.Results.Aggregate((q, n) => q / n); };

When matching against variables captured in parameters, variables containing single items or variables containing lists will match correctly.

Rule Inheritance

IronMeta parsers are regular C# classes, so they can inherit from other parsers and call their rules. To allow a rule to be overridden in a derived parser, preface it with the C# keyword virtual. You can override rules from a base class by prefixing the rule definition with the keyword override, or hide non-virtual rules by means of the new keyword.

ironmeta DerivedParser<char, Node> : BaseParser<Node>
{
    virtual Expression1 = ...;
    override Expression2 = ...;
    new Expression3 = ...;
}

Rule Encapsulation

You can also refer to rules in a completely unrelated grammar (as long as the input and output types are the same) by declaring initialized members of the other grammar's class and referring to those members' rule functions.

public partial class MyParser
{
    private OtherParser other_parser = new OtherParser();
}

ironmeta MyParser<char, int>: IronMeta.Matcher.Matcher<char, int>
{
    Rule = "foo" other_parser.OtherRule "bar";
}

ironmeta's People

Contributors

andreyzakatov, chalcolith, dependabot[bot], steve7411


ironmeta's Issues

item template for vs

Please add an item template with the custom tool and assembly built in to the VS extension.

Regexes are undocumented and non-standard

So there is support for regex in the form /regex/ but not as per .NET. In particular, the characters +-|* have to be escaped when used inside []
So what flavours of regex are used exactly?

Problems with README documentation

As best I can tell the class CharMatcher doesn't exist in the library?

// IronMeta Calculator Example

using System;
using System.Linq;

ironmeta Calc<char, int> : IronMeta.Matcher.CharMatcher<int>
{
    Expression = Additive;

General Question: Parse anything

Hi, I would like to parse a block whose body can be anything, like this:
xml { }

this works fine but when i want to try some body content with curly braces it fails:
json { "obj": { "key": true } }

my rule for the body: [^}]*
I've tried using .* instead, but it matches the closing curly brace of the root block.

how could i solve this?

Double Quotes in Regex

Hi, I have trouble getting a Regex to work which should match double quotes.

Example: CustomString = /[^"]*/; -> it should match any char from one " to the other ". (Note that the surrounding double quotes are in the parent expression. This actually just matches everything that is not a double quote)

As far as I understand, you are using https://github.com/verophyle/regexp for your regex and marking a regex with /..../ instead of new Regex("...."). This works just fine for everything BUT escaped or not-escaped double quotes. CustomString = /[^\"]*/; this also does not work and will cause a syntax error (i.e. EOL message).

This also does not work:
CustomString = /[^\u0022]/; // this encodes unicode double quotes
or
CustomString = /[^\u{0022}]/; // this encodes unicode double quotes

I think letting the user escape the double quotes would be the more consistent approach, or if that is not an option, especially list the double quotes and how to work around them in a regex in the documentation.

more docs/examples?

I might be missing something obvious, but the documentation would greatly benefit from the set of "building blocks", e.g. "how to match quoted string with escapes", "comma-separated list", "how to match right-associative infix operators" etc.

Unit testing example?

I am porting my project to .NET Standard and it is important to me to be able to unit test each grammar rule individually.

Can you please provide an example of calling a specific rule within a full grammar and verifying its output using either XUnit or similar?

Thanks in advance!

Can't pass constructed rules as arguments to rules

Given a rule that takes another as an argument:

Statement :indentation = indentation SomeOtherThing;

I would like to call this in some way that allows me to compute on the value :indentation. For example:

BlockBody :indentation = Statement('\t' indentation)*:statements;
// or
AlternateBlockBodyRule :indentation = Statement(Indent(indentation))*:statements;
Indent :indentation = '\t' indentation;

Neither of these work. The generator fails to parse my intent, and just passes null as the argument to Statement, which obviously fails.

Nested matchers

Hi, we've been using IronMeta in production for a while now, and are pretty happy with it (even more after #24). Thanks for the project!

Currently we are looking towards using it for named entity recognition (NER) task together with general pattern matching.

Consider for example phrase I want ten apples. We'd like to have a matcher that would answer if the phrase matches I want <N> apples pattern, AND would return value of N as int.

It's clear that we could write a specific grammar for this task, but consider a more complex example: I want <N> <Fruit> or I want <N> <Fruit> delivered at <Address> on <DayOrDate>.

It's still possible to write a specific grammar for every example, but it would cause huge duplication of code for specific entity matchers.

For example we already have a matcher that could transform words into integers ("ten thousands" -> 10000), so it would be great to have a way of re-using it without copy-pasting it as a part of specific grammar.

I imagine it to be something like this:

ironmeta IntMatcher<string, int>: Matcher<string, int>
{
...
}

enum Fruits
{
 Orange,
 Apple
}

ironmeta FruitsMatcher<string, Fruits>: Matcher<string, Fruits>
{
...
}

class Result
{
   int number;
   Fruits fruit;
}

ironmeta MainMatcher<string, Result>: Matcher<string, Result>
{
  Expression = "I" "want" IntMatcher:n FruitsMatcher:fr -> { return new Result() { number = n, fruit = fr }; };
}

What do you think? Is it possible to implement something like this? Or maybe there are simpler ways?

Integrating IronMetaGenerate task in Visual Studio

Can you be more specific about how to integrate this in Visual Studio 2017? Perhaps a step-by-step?

IronMeta.Library.dll contains an MsBuild task called "IronMetaGenerate". A simple example of how to use this:

      <UsingTask TaskName="IronMetaGenerate" AssemblyFile="path_to\IronMeta.Library.dll" />
      <Target Name="BeforeBuild">
        <IronMetaGenerate Input="MyParser.ironmeta" Output="MyParser.g.cs" Namespace="MyNamespace" Force="true" />
      </Target>

Regex matches broken?

I am getting results from regex matches that I don't understand. In all cases, I'd expect the program below to output the same thing for Foo that it does for Bar, and to match exactly 6 characters every time.

Ironmeta file:

ironmeta Test<char, char>
{
    Foo = /[^\r\n]+/ -> { return new string(_IM_Result.Inputs.ToArray()); };
    Bar = (~('\r' | '\n').)+ -> { return new string(_IM_Result.Inputs.ToArray()); };
}

Test harness:

    class Program
    {
        static void PrintMatch(IronMeta.Matcher.MatchResult<char, char> m)
        {
            Console.WriteLine($"{m.Success} {m.NextIndex}");
            if (m.Success)
            {
                Console.WriteLine(string.Join(',', m.Results.Select(c => (int)c)));
            }
        }
        static void Compare(string s)
        {
            var p1 = new Test();
            var m1 = p1.GetMatch(s, p1.Foo);
            Console.WriteLine("foo:");
            PrintMatch(m1);
            var p2 = new Test();
            var m2 = p2.GetMatch(s, p2.Bar);
            Console.WriteLine("bar:");
            PrintMatch(m2);
        }
        static void Main(string[] args)
        {
            Compare("Hello!");
            Compare("Hello!\n");
            Compare("Hello!\nWorld!\n");
        }
    }

Results:

foo:
True 6
72,101,108,108,111,33
bar:
True 6
72,101,108,108,111,33
foo:
True 7
72,101,108,108,111,33,10
bar:
True 6
72,101,108,108,111,33
foo:
False -1
bar:
True 6
72,101,108,108,111,33

Issues with special characters in regexp.

I have a lot of trouble getting IronMeta to work as expected with the regular expressions:
Issue cases:

  • \s does not match \n and some other whitespace characters -> this is on verophyle.regexp, not iron meta
  • CustomString = /[^\"]/; This works, but gets a weird syntax highlighting in VS Code if you set it to c# because the " starts a new string.
  • /REGEXPCODE/* is legal in the grammar but makes everything behind it be displayed as a comment in VS Code with c# syntax highlighting
  • it is not possible to have newline \n or tab \t inside of a regexp.

The reason for the latter is the following:
ws = /[\t \n\r]*/; This is my rule to match zero or more whitespace characters in the .ironmeta file. This gets parsed into new Verophyle.Regexp.StringRegexp(@"[\t \n\r]*") which looks fine at first. Actually it does not work due to the @ in front of the string which disables the tab and newline escapes.
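The effect of the @ prefix can be checked in plain C#, independent of any regexp library. The sketch below is only about how the C# compiler treats the two kinds of string literal; it makes no claim about how Verophyle.Regexp then interprets the characters it receives. VerbatimStringSketch is a hypothetical name.

```csharp
using System;

public static class VerbatimStringSketch
{
    // Regular literal: the compiler turns \t into a single TAB character,
    // so the string is '[', TAB, ']'.
    public const string Regular = "[\t]";

    // Verbatim literal: the backslash and the 't' stay two separate
    // characters, so the string is '[', '\', 't', ']'.
    public const string Verbatim = @"[\t]";
}
```

Regular has length 3 and Verbatim has length 4, so a verbatim pattern only matches a tab if the regexp engine itself understands the two-character sequence \t.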

Proposal:
move away from the /REGEXPCODE/ syntax and use a c# string with a special prefix to label it as a regexp. e.g. §"REGEXPCODE" which would be ws = §"[\t \n\r]*"; for my ws. This has the benefit of being interpreted as a string so \t, \r and \n should work as intended while also being nicely displayed in VS Code with c# syntax highlighting.
If this is not to your liking, another way to input special characters is needed. Another option would be to allow special characters (especially invisible ones like tab and newline) to be entered via Unicode escapes, which could look something like this: \u12345

Calculator Example

a) The calculator example is broken...

Will successfully parse '5+3' giving the answer '8'...

But will also successfully parse '5+' giving the answer '5'.

The parser returns a 'MatchResult' with an error condition 'not enough arguments;expected DecimalDigit or WS' which is correct but the 'Success' value is set to 'true' and not 'false'.

The match is not a success so why is it being reported as a success???

b) Also for '5+3' there is also the same error condition... just a different errorindex...

c) As an aside: Why is there no parse tree being constructed and returned as part of the 'MatchResult'? Does this have to be constructed manually using actions?

The custom tool "IronMetaGenerator" failed.

I am on Visual Studio 2017 v15.9.20, right click on resource file x.ironmenta > Run Custom Tool results in the following error message:

The custom tool 'IronMetaGenerator' failed. Could not load file or assembly 'Microsoft.VisualStudio.Shell.15.0, Version=16.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The system cannot find the file specified.

I've installed the NuGet package and the Visual Studio Extension in the project. Is this something that can be fixed?

note: IronMeta.App.Exe appears to be working.

C# library documentation from XML tags

The C# library contains lots of useful documentation, but this is not visible when debugging a target program. Is it possible to distribute documentation built from the tags, or perhaps a debug version of the software in which the source code can be browsed?

extension does not work

i want to use your generator but nothing happens. i have created a file Grammar.ironmeta with Custom Tool IronMetaGenerator but nothing happens

Solution does not build from repo

Clone the repo, open the solution, build.

Loads regex from NuGet. Then errors.

Severity Code Description Project File Line
Error Metadata file 'D:\MyDocs\Repos\ironmeta\Source\Library\bin\Debug\IronMeta.Library.dll' could not be found IronMeta.VSExtension D:\MyDocs\Repos\ironmeta\Tools\VSExtension\CSC
Error Error signing output with public key from file '..\IronMeta.snk' -- File not found. Library D:\MyDocs\Repos\ironmeta\Source\Library\CSC
Error Metadata file 'D:\MyDocs\Repos\ironmeta\Source\Library\bin\Debug\IronMeta.Library.dll' could not be found Calc D:\MyDocs\Repos\ironmeta\Samples\Calc\CSC
Error Metadata file 'D:\MyDocs\Repos\ironmeta\Source\Library\bin\Debug\IronMeta.Library.dll' could not be found IronMeta D:\MyDocs\Repos\ironmeta\Source\IronMeta\CSC
Error Metadata file 'D:\MyDocs\Repos\ironmeta\Samples\Calc\bin\Debug\Calc.exe' could not be found UnitTests D:\MyDocs\Repos\ironmeta\Tests\UnitTests\CSC
Error Metadata file 'D:\MyDocs\Repos\ironmeta\Source\Library\bin\Debug\IronMeta.Library.dll' could not be found UnitTests D:\MyDocs\Repos\ironmeta\Tests\UnitTests\CSC

Add an MSBuild task

An MSBuild task would make it easy to add IronMeta to the build process, including understanding dependencies between .ironmeta and generated .cs.

This would avoid the problem where someone changes (for example) the name of an AST node class, but forgets to rerun ironmeta.

Here's a first stab at an MSBuild Task:

using IronMeta.Generator;
using Microsoft.Build.Framework;
using Microsoft.Build.Utilities;

public class IronMetaTask : Task
{
    public override bool Execute()
    {
        var result = CSharpShell.Process(Input, Output, Namespace, Force);

        if (result.Success)
        {
            return true;
        }
        else
        {
            Log.LogError(result.Error);

            return false;
        }
    }

    public bool Force { get; set; }

    public string Namespace { get; set; }

    public string Output { get; set; }

    [Required]
    public string Input { get; set; }
}

Concurrency issues when using regex

Without regular expressions, multiple instances of the generated parser can be used concurrently. When regexps are used, they are declared as 'static' but are in fact mutable objects, and the _ParseRegexp method changes the state of those objects. This prevents using a parser with regular expressions concurrently.

Please change parser generation so that regexps are not static

Handling word-level regexes with `.*`

Hi, we are looking towards moving our custom word-level regex engine to IronMeta. Considering that at least for characters IronMeta supports regex-based rules, it seems like it should work out of the box.

Here's the issue I'm currently looking at. Let's consider the following regex: .* "test" .*. It should match all the phrases containing the word test, e.g. ["test"], ["some", "test"].
So a naive translation into IronMeta matcher would look like this:

ironmeta MatcherSpecific<string, bool>: Matcher<string, bool>
{
    Pattern_0 =  .* "test" .*;
    Expression = Pattern_0;
}

and it does not match any of the test phrases (giving error: expected end of file). This seems somewhat expected considering the greedy nature of the quantification operators in IronMeta.

So I kinda found two ways to make matcher for this sample regex:

ironmeta MatcherSpecific<string, bool>: Matcher<string, bool>
{
    Pattern_0 =  "test" .*;
    Pattern_0 =  . Pattern_0;
    Expression = Pattern_0;
}

ironmeta MatcherSpecific<string, bool>: Matcher<string, bool>
{
    Pattern_0 =  (~"test" .)* "test" .*;
    Expression = Pattern_0;
}

But it's unclear to me how to generalize these solutions to more complex, real-world regex cases. Could you please suggest something?

I also wonder whether the ordinary character-level regexes in IronMeta work this way too.
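
The greedy behaviour described above can be reproduced outside IronMeta with a toy PEG-style matcher over a list of words. The sketch below is not IronMeta's API or its generated code; the `PegSketch` class and every name in it are hypothetical, and it assumes only standard PEG semantics (a star that consumes as much as it can and never gives anything back, plus a non-consuming negative lookahead).

```csharp
using System.Collections.Generic;

// Toy PEG-style combinators over a word list. Hypothetical names throughout;
// this only illustrates greedy, non-backtracking repetition.
public static class PegSketch
{
    // A parser returns the new position on success, or -1 on failure.
    public delegate int Parser(IReadOnlyList<string> input, int pos);

    // Matches any single word.
    public static Parser Any() =>
        (input, pos) => pos < input.Count ? pos + 1 : -1;

    // Matches one specific word.
    public static Parser Word(string w) =>
        (input, pos) => pos < input.Count && input[pos] == w ? pos + 1 : -1;

    // Negative lookahead: succeeds, consuming nothing, only if p fails here.
    public static Parser Not(Parser p) =>
        (input, pos) => p(input, pos) < 0 ? pos : -1;

    // Sequence: every parser must succeed in order.
    public static Parser Seq(params Parser[] ps) => (input, pos) =>
    {
        foreach (var p in ps)
        {
            pos = p(input, pos);
            if (pos < 0) return -1;
        }
        return pos;
    };

    // Greedy star: repeats p until it fails and never gives input back.
    public static Parser Star(Parser p) => (input, pos) =>
    {
        while (true)
        {
            int next = p(input, pos);
            if (next < 0 || next == pos) return pos;
            pos = next;
        }
    };

    // True if p consumes the whole input.
    public static bool MatchAll(Parser p, params string[] words) =>
        p(words, 0) == words.Length;

    // .* "test" .*  : the first star eats every word, so "test" cannot match.
    public static readonly Parser Naive =
        Seq(Star(Any()), Word("test"), Star(Any()));

    // (~"test" .)* "test" .*  : the lookahead stops the star before "test".
    public static readonly Parser Guarded =
        Seq(Star(Seq(Not(Word("test")), Any())), Word("test"), Star(Any()));
}
```

Under these assumptions, `PegSketch.MatchAll(PegSketch.Naive, "some", "test")` is false while the same call with `PegSketch.Guarded` is true. One way to generalize the second workaround would be to guard each `.*` with a negative lookahead for whatever must follow it, though whether that covers every real-world regex is not something this sketch establishes.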

expected eof or identbody

Hi,

I have the following problem: "expected EOL or IdentBody" on line 14, the str part.
When I comment out the values part, the grammar compiles. So what is wrong with my regex? It is a valid regex.
My grammar:

using System;
using System.Linq;
using IronMeta.Matcher;

ironmeta Function<char, string>
{
    Expression = name "(" args ")";

    name = /[_a-zA-Z][_a-zA-Z0-9]*/;
    args = value | value ",";
    value = str | integer | floating | hex;

    //values
    str = /"((?:\\.|[^"\\])*)"/;
    integer = /[-+]?[0-9]+/;
    hex = /0x[0-9A-Fa-f]+/;
    floating = /[-+]?[0-9]*\.?[0-9]*/;

}

What am I doing wrong?

Non-linear performance for mixed right and left recursion

Hello, I think there is a performance problem with the current matching algorithm when right and left recursion are mixed.

When a right recursion grows the call stack, involves a left recursion, and then continues to grow (for example when it processes a nested expression), the performance of the current implementation of the algorithm is exponential.

Specifically, I think it is wrong to make non-memoizable all the productions that sit in the call stack above the earliest appearance of the current left-recursive production (https://github.com/kulibali/ironmeta/blob/master/Source/Matcher/Matcher.cs#L216).

I tested an alternative in which only the productions that sit on the stack above the most recent production matching the current left-recursive one are considered "involved" and are excluded from memoization. As far as I can see, this makes the performance of nested right recursions linear again, and it passes all of my tests.

I must admit that I do not completely understand the text of the original algorithm, and I made this change based on a hunch. There is also a small chance that my F# port of the matching algorithm is simply wrong and caused this problem in the first place.

For reference, here is the change set of the F# version, which also includes a test grammar:

pragmatrix/ScanRat@b59643c

The production addition contains a direct left recursion but is never actually used in the string being parsed. Unexpectedly, it still causes the drop in performance for the nested, right-recursive production property. This might not be a huge problem for simple grammars, but it adds up considerably for more complex ones.

build ast

Hi,

Is it possible to build an AST from the parsing result?

Missing /tools folder in releases

It seems that NuGet releases after 4.4.1 are missing the /tools folder, and therefore no IronMeta.App.exe is available. Is this an intended change?

It seems that it should still be possible to build .ironmeta files with the MsBuild task. Is this now the recommended way to use IronMeta?
