superpower's Introduction

Superpower

A parser combinator library based on Sprache. Superpower generates friendlier error messages through its support for token-driven parsers.

What is Superpower?

The job of a parser is to take a sequence of characters as input, and produce a data structure that's easier for a program to analyze, manipulate, or transform. From this point of view, a parser is just a function from string to T - where T might be anything from a simple number, a list of fields in a data format, or the abstract syntax tree of some kind of programming language.

Just like other kinds of functions, parsers can be built by hand, from scratch. This is-or-isn't a lot of fun, depending on the complexity of the parser you need to build (and how you plan to spend your next few dozen nights and weekends).

Superpower is a library for writing parsers in a declarative style that mirrors the structure of the target grammar. Parsers built with Superpower are fast, robust, and report precise and informative errors when invalid input is encountered.

Usage

Superpower is embedded directly into your C# program, without the need for any additional tools or build-time code generation tasks.

dotnet add package Superpower

The simplest text parsers consume characters directly from the source text:

// Parse any number of capital 'A's in a row
var parseA = Character.EqualTo('A').AtLeastOnce();

The Character.EqualTo() method is a built-in parser. The AtLeastOnce() method is a combinator that builds a more complex parser for a sequence of 'A' characters out of the simple parser for a single 'A'.
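
As a rough illustration of how the combined parser behaves, the sketch below runs parseA against two inputs. It assumes the Superpower package is referenced; the input strings are made up for this example.

```csharp
using System;
using Superpower;
using Superpower.Parsers;

// Same parser as above: one or more capital 'A's.
var parseA = Character.EqualTo('A').AtLeastOnce();

// Parse() returns the matched characters, or throws ParseException on failure.
char[] matched = parseA.Parse("AAA");
Console.WriteLine(matched.Length); // 3

// TryParse() reports failure through the result instead of throwing.
var failed = parseA.TryParse("BBB");
Console.WriteLine(failed.HasValue); // False
```

Parse() suits cases where invalid input is exceptional, while TryParse() is usually the better fit when failures are expected and need to be inspected.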

Superpower includes a library of simple parsers and combinators from which more sophisticated parsers can be built:

TextParser<string> identifier =
    from first in Character.Letter
    from rest in Character.LetterOrDigit.Or(Character.EqualTo('_')).Many()
    select first + new string(rest);

var id = identifier.Parse("abc123");

Assert.Equal("abc123", id);

Parsers are highly modular, so smaller parsers can be built and tested independently of the larger parsers that use them.
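
A minimal sketch of what that modularity can look like, assuming the Superpower package is referenced. The identifierList parser and the sample inputs are hypothetical and exist only for this illustration: the identifier parser is first checked on its own, then reused unchanged inside a larger parser.

```csharp
using System;
using Superpower;
using Superpower.Parsers;

// The identifier parser from the example above.
TextParser<string> identifier =
    from first in Character.Letter
    from rest in Character.LetterOrDigit.Or(Character.EqualTo('_')).Many()
    select first + new string(rest);

// Tested on its own: an identifier must start with a letter.
Console.WriteLine(identifier.TryParse("9lives").HasValue); // False

// Reused inside a larger (hypothetical) parser: a comma-separated list of identifiers.
TextParser<string[]> identifierList = identifier.ManyDelimitedBy(Character.EqualTo(','));

string[] names = identifierList.Parse("abc,x_1,y2");
Console.WriteLine(string.Join(" ", names)); // abc x_1 y2
```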

Tokenization

Along with text parsers that consume input character-by-character, Superpower supports token parsers.

A token parser consumes elements from a list of tokens. A token is a fragment of the input text, tagged with the kind of item that fragment represents - usually specified using an enum:

public enum ArithmeticExpressionToken
{
    None,
    Number,
    Plus,

A major benefit of driving parsing from tokens, instead of individual characters, is that errors can be reported in terms of tokens - unexpected identifier `frm`, expected keyword `from` - instead of the cryptic unexpected m.

Token-driven parsing takes place in two distinct steps:

  1. Tokenization, using a class derived from Tokenizer<TKind>, then
  2. Parsing, using a function of type TokenListParser<TKind, T>.

var expression = "1 * (2 + 3)";

// 1.
var tokenizer = new ArithmeticExpressionTokenizer();
var tokenList = tokenizer.Tokenize(expression);

// 2.
var parser = ArithmeticExpressionParser.Lambda; // parser built with combinators
var expressionTree = parser.Parse(tokenList);

// Use the result
var eval = expressionTree.Compile();
Console.WriteLine(eval()); // -> 5

Assembling tokenizers with TokenizerBuilder<TKind>

The job of a tokenizer is to split the input into a list of tokens - numbers, keywords, identifiers, operators - while discarding irrelevant trivia such as whitespace or comments.

Superpower provides the TokenizerBuilder<TKind> class to quickly assemble tokenizers from recognizers: text parsers that match the various kinds of tokens required by the grammar.

A simple arithmetic expression tokenizer is shown below:

var tokenizer = new TokenizerBuilder<ArithmeticExpressionToken>()
    .Ignore(Span.WhiteSpace)
    .Match(Character.EqualTo('+'), ArithmeticExpressionToken.Plus)
    .Match(Character.EqualTo('-'), ArithmeticExpressionToken.Minus)
    .Match(Character.EqualTo('*'), ArithmeticExpressionToken.Times)
    .Match(Character.EqualTo('/'), ArithmeticExpressionToken.Divide)
    .Match(Character.EqualTo('('), ArithmeticExpressionToken.LParen)
    .Match(Character.EqualTo(')'), ArithmeticExpressionToken.RParen)
    .Match(Numerics.Natural, ArithmeticExpressionToken.Number)
    .Build();

Tokenizers constructed this way produce a list of tokens by repeatedly attempting to match recognizers against the input in top-to-bottom order.
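
Because recognizers are tried in order, a recognizer for a longer token generally needs to be listed before one that matches its prefix. The sketch below shows this under stated assumptions: the CompareToken enum and the comparison operators are hypothetical, introduced only for this example, and the Superpower package is assumed to be referenced.

```csharp
using System;
using System.Linq;
using Superpower;
using Superpower.Parsers;
using Superpower.Tokenizers;

// "<=" is listed before "<", so the two-character operator is tried first.
var tokenizer = new TokenizerBuilder<CompareToken>()
    .Ignore(Span.WhiteSpace)
    .Match(Span.EqualTo("<="), CompareToken.LessOrEqual)
    .Match(Character.EqualTo('<'), CompareToken.Less)
    .Match(Numerics.Natural, CompareToken.Number)
    .Build();

var kinds = tokenizer.Tokenize("1 <= 2").Select(t => t.Kind).ToArray();
Console.WriteLine(string.Join(", ", kinds)); // Number, LessOrEqual, Number

// Hypothetical token kinds, defined only for this sketch.
public enum CompareToken { None, Number, Less, LessOrEqual }
```

If the two operator recognizers were swapped, the single-character recognizer would be expected to match first, leaving the '=' unrecognized.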

Writing tokenizers by hand

Tokenizers can alternatively be written by hand; this can provide the most flexibility, performance, and control, at the expense of more complicated code. A handwritten arithmetic expression tokenizer is included in the test suite, and a more complete example can be found here.

Writing token list parsers

Token parsers are defined in the same manner as text parsers, using combinators to build up more sophisticated parsers out of simpler ones.

class ArithmeticExpressionParser
{
    static readonly TokenListParser<ArithmeticExpressionToken, ExpressionType> Add =
        Token.EqualTo(ArithmeticExpressionToken.Plus).Value(ExpressionType.AddChecked);
        
    static readonly TokenListParser<ArithmeticExpressionToken, ExpressionType> Subtract =
        Token.EqualTo(ArithmeticExpressionToken.Minus).Value(ExpressionType.SubtractChecked);
        
    static readonly TokenListParser<ArithmeticExpressionToken, ExpressionType> Multiply =
        Token.EqualTo(ArithmeticExpressionToken.Times).Value(ExpressionType.MultiplyChecked);
        
    static readonly TokenListParser<ArithmeticExpressionToken, ExpressionType> Divide = 
        Token.EqualTo(ArithmeticExpressionToken.Divide).Value(ExpressionType.Divide);

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Constant =
            Token.EqualTo(ArithmeticExpressionToken.Number)
            .Apply(Numerics.IntegerInt32)
            .Select(n => (Expression)Expression.Constant(n));

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Factor =
        (from lparen in Token.EqualTo(ArithmeticExpressionToken.LParen)
            from expr in Parse.Ref(() => Expr)
            from rparen in Token.EqualTo(ArithmeticExpressionToken.RParen)
            select expr)
        .Or(Constant);

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Operand =
        (from sign in Token.EqualTo(ArithmeticExpressionToken.Minus)
            from factor in Factor
            select (Expression)Expression.Negate(factor))
        .Or(Factor).Named("expression");

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Term =
        Parse.Chain(Multiply.Or(Divide), Operand, Expression.MakeBinary);

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Expr =
        Parse.Chain(Add.Or(Subtract), Term, Expression.MakeBinary);

    public static readonly TokenListParser<ArithmeticExpressionToken, Expression<Func<int>>>
        Lambda = Expr.AtEnd().Select(body => Expression.Lambda<Func<int>>(body));
}

Error messages

The error scenario tests demonstrate some of the error message formatting capabilities of Superpower. Check out the parsers referenced in the tests for some examples.

ArithmeticExpressionParser.Lambda.Parse(new ArithmeticExpressionTokenizer().Tokenize("1 + * 3"));
     // -> Syntax error (line 1, column 5): unexpected operator `*`, expected expression.

To improve the error reporting for a particular token type, apply the [Token] attribute:

public enum ArithmeticExpressionToken
{
    None,

    Number,

    [Token(Category = "operator", Example = "+")]
    Plus,

Performance

Superpower is built with performance as a priority. Less frequent backtracking, combined with the avoidance of allocations and indirect dispatch, means that Superpower can be quite a bit faster than Sprache.

Recent benchmark for parsing a long arithmetic expression:

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Windows
Processor=?, ProcessorCount=8
Frequency=2533306 ticks, Resolution=394.7411 ns, Timer=TSC
CLR=CORE, Arch=64-bit ? [RyuJIT]
GC=Concurrent Workstation
dotnet cli version: 1.0.0-preview2-003121

Type=ArithmeticExpressionBenchmark  Mode=Throughput  
Method               Median        StdDev       Scaled   Scaled-SD
Sprache              283.8618 µs   10.0276 µs   1.00     0.00
Superpower (Token)   81.1563 µs    2.8775 µs    0.29     0.01

Benchmarks and results are included in the repository.

Tips: if you find you need more throughput: 1) consider a hand-written tokenizer, and 2) avoid the use of LINQ comprehensions and instead use chained combinators like Then() and especially IgnoreThen() - these allocate fewer delegates (closures) during parsing.
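
To make the second tip concrete, here is a small sketch of the same parser written both ways. The parenthesized-integer grammar is a made-up example, the Superpower package is assumed to be referenced, and no measurement is implied: whether the chained form is faster in a given parser is something to benchmark.

```csharp
using System;
using Superpower;
using Superpower.Parsers;

// LINQ comprehension form.
TextParser<int> viaLinq =
    from open in Character.EqualTo('(')
    from number in Numerics.IntegerInt32
    from close in Character.EqualTo(')')
    select number;

// Chained form: IgnoreThen() drops the '(' and Then() carries the number past the ')'.
TextParser<int> viaChaining =
    Character.EqualTo('(')
        .IgnoreThen(Numerics.IntegerInt32)
        .Then(number => Character.EqualTo(')').Value(number));

Console.WriteLine(viaLinq.Parse("(42)"));     // 42
Console.WriteLine(viaChaining.Parse("(42)")); // 42
```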

Examples

Superpower is introduced, with a worked example, in this blog post.

Example parsers to learn from:

  • JsonParser is a complete, annotated example implementing the JSON spec with good error reporting
  • DateTimeTextParser shows how Superpower's text parsers work, parsing ISO-8601 date-times
  • IntCalc is a simple arithmetic expression parser (1 + 2 * 3) included in the repository, demonstrating how Superpower token parsing works
  • Plotty implements an instruction set for a RISC virtual machine
  • tcalc is an example expression language that computes durations (1d / 12m)

Real-world projects built with Superpower:

  • Serilog.Expressions uses Superpower to implement an expression and templating language for structured log events
  • The query language of Seq is implemented using Superpower
  • seqcli extraction patterns use Superpower for plain-text log parsing
  • PromQL.Parser is a parser for the Prometheus Query Language

Have an example we can add to this list? Let us know.

Getting help

Please post issues to the issue tracker, or tag your question on StackOverflow with superpower.

The repository's title arose out of a talk "Parsing Text: the Programming Superpower You Need at Your Fingertips" given at DDD Brisbane 2015.

superpower's People

Contributors

anders-rasmussen, andrewsav, andymac4182, ar1k, atifaziz, benjaminholland, dancuriosity, ejsmith, ellested, fvbommel, gertjvr, kodraus, kthompson, liammclennan, nblumhardt, powerdude, randomc0der, spaceman1861, sq735

superpower's Issues

Long-form error messages

Superpower produces errors that carry a location, error message fragment, and expectations.

By default, the result types' ToString() formats these into one-line messages as in this test case:

AssertParser.FailsWithMessage(alternating, "123 abc 123 123", new SExpressionTokenizer(),
    "Syntax error (line 1, column 13): unexpected number `123`, expected atom.");

This is a good default, but in some applications (especially those that produce output for terminals) a multi-line error can provide more context.

This is an example from FParsec:

Failure: Error in Ln: 1 Col: 6
1.25E 3
     ^
Expecting: decimal digit

Using the information available to Superpower,

Unexpected number `123`, expected atom
 ---> line 1, column 13
  | 
1 | 123 abc 123 123
  |             ^^^ expected atom

It would be nice to show the preceding and following lines if available. The new rustc error formatting sets the benchmark here.

[Question] How do I combine these parsers?

Let's assume I have a Group 1 of parsers a, b, c and d and a Group 2 of parsers w, x, y and z.
I would like to combine these parsers so that the resulting parser accepts any combination of parsers from Group 1 and exactly one parser from Group 2, in any order.

Is this possible?

It is quite possible that I'm attacking the problem from a wrong angle (XY-problem), here is what I'm really trying to solve.

I'd like to parse a command-line switch that starts with - or / and is followed by a number of options, for example -abwdc. Among these options there can be any number of binary ones (a, b, c or d, which do not require a parameter to follow) but only one that requires a parameter (w, x, y or z). The options can be specified in any order.

Char parsing and double quote

Hello,

First of all, thank you for this great library!

There's a behavior that I don't understand. My char parser returns A as expected. But when it is used with the tokenizer I get "A" (double quotes are added). See the example below.

Would you mind explaining this ?
Thanks.

var parser =
    from first in Character.EqualTo('"')
    from content in Character.AnyChar
    from last in Character.EqualTo('"')
    select content;

var p1 = parser.TryParse(@"""A""").Value.ToString();
var p2 = parser.TryParse("\"A\"").Value.ToString();

var tokenizer = new TokenizerBuilder<char>()
    .Ignore(Span.WhiteSpace)
    .Match(parser, 'X', requireDelimiters: false)
    .Build();

var t1 = tokenizer.Tokenize(@"""A""").First().ToStringValue();
var t2 = tokenizer.Tokenize("\"A\"").First().ToStringValue();

Debug.WriteLine(p1);
Debug.WriteLine(p2);
Debug.WriteLine(t1);
Debug.WriteLine(t2);

/*
Console Display :

A
A
"A"
"A"

*/

Hard time parsing increment and decrement expressions

I'm trying to parse a text containing pre/post increment/decrement expression within the main expression.

Let's use the given text: a = b + c--;

The rest of the expression parser works very well using the Chain capabilities of the parser. But when I come to increment and decrement statements I get stuck, because an increment/decrement statement can be applied to an identifier (a, b, c) or to an increment/decrement statement itself:

--a--;

And this gets my code stuck in an infinite loop... could you please point me to right direction when parsing these kind of expressions?!

Variance

The parser delegates need to return interfaces in order to support variance on the value type T. Not sure if the JIT will be smart enough to optimize it away; guessing probably not, need to run some experiments.

Can we get OneOf please?

Pidgin has OneOf, can we get similar one in Superpower please? It is really awkward combining an array of parsers instead.

The goal: given an array of parsers, combine them with Or or TryOr (the caller should be able to specify which) to get the resulting parser.

Many with partial match consumes too much of the input

I'm trying to build a parser to match text that would be recognized by the regular expression "(ab)*ac"

var ab = Character.EqualTo('a').Then(_ => Character.EqualTo('b'));
var ac = Character.EqualTo('a').Then(_ => Character.EqualTo('c'));
var list = ab.Try().Many().Then(ac);

This should match the input "ababac". However Many(...) will consume the "a" in "ac" and thus the ac parser will fail. If I leave out the Try() then Many(...) will fail and the ac parser isn't even executed.

Am I missing something here? This seems like a bug to me, but I can't imagine that I'm the first to try this. Is there some other way to combine such parsers with partial matches? (I was originally trying to build an XML parser for educational purposes, where I had this issue with the '<' char)

Looking at the history it seems that this might have worked before the rewrite of backtracking.

I know how I can change Many() so this is working, but I'm not sure if that would be the right place.

Force .Chain() to apply at least once

I'm currently creating a DSL with Superpower, and as part of that I am using the .Chain method in my parser. However, Chain does not actually require both operands and the operator: if only one operand is present, it simply passes that item through without requiring the operator and the second item.

The parsers in question are defined as such:

public static readonly TokenListParser<Token, Operators.Operator> Boolean =
    And.Or(Or);

public static readonly TokenListParser<Token, Operators.Operator> Operator =
    Is.Or(Not).Or(LT).Or(LTE).Or(GT).Or(GTE);

public static readonly TokenListParser<Token, Expression> Unary =
    from item in Literal
    from op in Exists
    select MakeUnary(op, item);

public static readonly TokenListParser<Token, Expression> Binary =
    Parse.Chain(Operator, Literal, MakeBinary);

public static readonly TokenListParser<Token, Expression> Operand =
    Unary.Try().Or(Binary);

public static readonly TokenListParser<Token, Expression> Conditional =
    Parse.Chain(Boolean, Operand, MakeBinary);

And/Or/Not/Exists/etc. are all simple token matches defined similarly to this:

public static readonly TokenListParser<Token, Operators.Operator> And =
            Token.EqualTo(Token.And).Value(Operators.Operators.And);

and Literal is a bunch of literal values that store values in string form:

public static readonly TokenListParser<Token, Expression> Text =
            Token.EqualTo(Token.Text)
                .Select(t => (Expression)new ConstantExpression(t.ToStringValue().Trim('"')));

...

public static readonly TokenListParser<Token, Expression> Literal =
            Number.Or(Text)
                 .Or(HexNumber)
                 .Or(Reference)
                 .Or(BinaryState)
                 .Named("literal");

The issue right now is that a conditional that is just a lone Literal, without any operators, is allowed, when I need it to only allow combinations of a literal and an operator.

Illegal: foo
Legal: foo is 1 or bar exists

I assume that since .Chain handles left-recursive chains it notices there is nothing to chain and just returns the first item, but is it possible to force there to be an operator when using .Chain?

How to parse multiple TextParser from TokenListParser

I'm trying to use Superpower to parse a text file. I think my tokenizer works pretty well. However, I'm struggling to write a working parser and I'm looking for some help and advice.

My file looks like the following example:

#(
#Dictionary 
#('Foo' ' ->' 
#RefFoo) 
#('Bar' ' ->' 
#RefBar)) 

I use this code to generate the token list:

internal enum SmllToken
{
  Reference,
  BlockStatementBegin,
  BlockStatementEnd,
  Identifier,
  String,
}

internal static class SmllTokenizer
{
  private static TextParser<Unit> IdentifierToken { get; } =
    Span.Regex("[A-Za-z]+").Value(Unit.Value);

  private static TextParser<Unit> StringToken { get; } =
    Span.Regex("'[A-Za-z]+'").Value(Unit.Value);

  public static Tokenizer<SmllToken> Instance { get; } =
    new TokenizerBuilder<SmllToken>()
    .Ignore(Span.WhiteSpace)
    .Ignore(Span.EqualTo('#'))
    .Match(Character.EqualTo('('), SmllToken.BlockStatementBegin)
    .Match(Character.EqualTo(')'), SmllToken.BlockStatementEnd)
    .Match(Span.EqualTo("' ->'"), SmllToken.Reference)
    .Match(IdentifierToken, SmllToken.Identifier)
    .Match(StringToken, SmllToken.String)
    .Build();
}

After that I would like to parse the different values from my text file to get something like { Foo: RefFoo }, { Bar: RefBar }. Unfortunately, I'm not able to parse ('Foo' ' ->' #RefFoo) multiple times correctly. I always get different syntax errors with invalid identifiers. I tried a couple of different ways to write the parser, and none of them worked. Most of the time they looked like this:

private static readonly TextParser<string> Foo =
  from chars in Character.AnyChar.Many()
  select new string(chars);

private static readonly TextParser<string> Bar =
  from chars in Character.AnyChar.Many()
  select new string(chars);

private static readonly TokenListParser<SmllToken, object> None =
  Token.EqualTo(SmllToken.Reference)
  .Or(Token.EqualTo(SmllToken.BlockStatementBegin))
  .Or(Token.EqualTo(SmllToken.BlockStatementEnd))
  .Value((object)Unit.Value);

private static readonly TokenListParser<SmllToken, object> Pair =
  Token.EqualToValue(SmllToken.Identifier, "Dictionary")
  .Then(x => Token.EqualTo(SmllToken.String).Apply(Foo))
  .Then(y => Token.EqualTo(SmllToken.Identifier).Apply(Bar))
  .Select(foo => (object)foo);

private static readonly TokenListParser<SmllToken, object> Values =
  None.Or(Pair);

public static readonly TokenListParser<SmllToken, IEnumerable<object>> Instance =
  Values.Many().AtEnd().Select(value => value.AsEnumerable());

I think I'm missing the part that parses the surrounding brackets ( and ), as well as ' ->' (twice). It would be great if someone could help me understand how to parse the example text file or string correctly, and how Superpower works.

[Question] Combinator to necessarily match delimiter between parser matches

I have a token for new lines, which works fine. I then want to parse text delimited by new lines, so I'm using ManyDelimitedBy(parser, delimiter). However, whenever the delimiter matches, it expects the parser to match again, and throws an exception if it doesn't. If I use Try() before ManyDelimitedBy, parsing succeeds, but a syntax error then passes silently and the resulting array is empty.

Sample project: SuperpowerNewLine.zip

Steps

  1. Run the project, it successfully parses the text.
  2. Remove the right bracket at line 53, it will silently succeed, but the result is empty.
  3. Add the right bracket back.
  4. Remove Try() at line 33.
  5. Run the project, now it fails, because it expects an identifier, as it matched the separator (new line).

What kind of combinator do I need for this? I want something like ManyDelimitedBy, where delimiter has to match between parser matches, but it shouldn't expect parser to match after delimiter matches.

[SOLVED / EDITED] I wanted to clarify something regarding the use of OR

I have something like this in pseudo code.

property = identifier + (equalsign + value).optional

method = identifier + openparenthesis + arguments + closeparenthesis;

member = method.or(property) + terminator

This doesn't seem to work: when I have a property, I get the error "unexpected equalsign expected openparenthesis". Would that be because the identifier is common to both property and method, or am I doing something wrong?

Parser for nested balanced parentheses

Greetings and thanks for the great library. I'm struggling to come up with a parser for the set of partial inputs below (nested, balanced parentheses with '|' OR separators).

This is a part of a larger language, thus the set of tokens below, but arbitrary text can go inside the parens, including whitespace, other tokens, and "()".

Only '|', '(', ')', should have special meaning here (a newline would also end the sequence). To be valid, each (balanced) parenthesised group must have a '|' and at least one character that is not '(' or ')'.

Valid:

(a|)
(a | b)
(a | b.c())
(aa | bb cc )
(a | b | c #dd)
((a | b) | $c)
((a | b) | (c | d))
(((a | b) | c) | d)
...

Invalid:

()
())
(()
(|)
(|())
(.)
(())
(()|())
(abc)
(a bc)
(a.bc())

My tokens are as follows:

    public enum Tokens
    {
        None,
        String,
        Number,

        [Token(Description = "#identifier")]
        Label,

        [Token(Description = "$variable")]
        Symbol,

        [Token(Example = "[")]
        LBracket,

        [Token(Example = "]")]
        RBracket,

        [Token(Example = "{")]
        LBrace,

        [Token(Example = "}")]
        RBrace,

        [Token(Example = "(")]
        LParen,

        [Token(Example = ")")]
        RParen,

        [Token(Example = "?")]
        QuestionMark,

        [Token(Example = "#")]
        Hash,

        [Token(Example = "$")]
        Dollar,

        [Token(Example = "|")]
        Pipe,

        [Token(Example = "=")]
        Equal,

        [Token(Example = ",")]
        Comma,

        [Token(Example = ":")]
        Colon,

        [Token(Example = ".")]
        Dot,

        [Token(Example = "()")]
        ParenPair
    } 

Simple tokenizer question

I'm trying to match c-style identifiers that start with a '#' character, but to discard the hash character and keep the rest. In my attempts below the '#' is always included in the token:

static TextParser<Unit> HashIdent { get; } = Character.EqualTo('#').IgnoreThen(Identifier.CStyle).Value(Unit.Value);

I know I'm missing something simple...

Match a string with tokenizer only if it is first token in line

When tokenizing, how to match a token only if it is the first thing in a line?

Take the following example where I am trying to match any ident followed by a colon, ONLY if it is the first thing in the line (thus the ^):

static TextParser<Unit> Actor { get; } =
  from start in Span.Regex(@"^[A-Za-z][A-Za-z0-9_]+:")
  select Unit.Value;

It seems that if the line is "1 abc:" and Ignore(Span.WhiteSpace) is set, then the tokenizer consumes the first token ('1'), then ignores the white space as directed, then sees "abc:" as starting from position 0, thus matching.

But what I want is to only match "abc:" if it is the first token ... Is this possible ?

Parse tokens into several arrays/lists

I am struggling to find a solution to parse tokens into several different arrays/lists.
The string to tokenize/parse looks like this

(x:x1, x:x2, x:x3, y:y1, z:z1, z:z2)

The result should be stored in a node like this:

class Node
{
  List<string> xItems; // contains all items for token x: (x1, x2, x3 in the example above)
  List<string> yItems; // contains all items for token y: (y1 in the example above)
  List<string> zItems; // contains all items for token z: (z1, z2 in the example above)
}

Or, more generically, in a node like this:

class MoreGenericNode
{
  Dictionary<string,List<string>> items; //the dictionary key is the token (x:, y:, z:, ...)
}

The defined tokens are x:, y:, z:, as well as the comma and the left/right brackets.
Is it possible to get the result information by parsing the input string only once, or is it necessary to parse the string several times, for each token?

TextSpan.Until Behaviour

TextSpan.Until is documented as

Return a new span from the start of this span to the beginning of another.

So I expect this to work:

string source = "123";
Position one = Position.Zero.Advance(source[0]);
TextSpan t1 = new TextSpan(source);
TextSpan t2 = new TextSpan(source, one, 1);        
Assert.AreEqual("1",t1.Until(t2).ToStringValue());

But it does not, because the code assumes that both text spans end at the same position. This is fine, but it is not documented and it is not checked in the code. Am I missing something?

How to only return what can be parsed in a TokenizerBuilder

Let's say I want to execute the following:

            var tokenizer = new TokenizerBuilder<SqlToken>()
                .Ignore(Span.WhiteSpace)
                .Match(Character.Letter.Many(), SqlToken.Keyword)
                .Build();

            var tokens = tokenizer.Tokenize("select 1 + 23");
            foreach (var token in tokens)
            {
                Console.WriteLine(token);
            }

Currently it fails because it doesn't know how to parse the remainder ("1 + 23").

Superpower.ParseException: 'Zero-width tokens are not supported; token keyword at position 7 (line 1, column 8).'

Is there a way to tell superpower to only parse what it can and ignore anything else? Or is the only way to supply parsers for what I want to ignore?

Obsolete/replace `Tokenizer.Position` to make stateless tokenizers possible

While not all tokenizers will end up being thread-safe, this property precludes any instance of Tokenizer<T> from being thread-safe.

Instead of storing the last-yielded token in a property on Tokenizer itself, an overload of Tokenize() could be provided that accepts an additional TokenizationState parameter, or, we could extract StatefulTokenizer out into a subclass of Tokenizer and push down the property.

Where is the length?

A documentation comment says:

Compare a string span with another using source identity semantics - same source, same position, same length.

But the code does not check the length. What's going on?

string source = "123";
TextSpan t1 = new TextSpan(source, Position.Zero, 1);
TextSpan t2 = new TextSpan(source, Position.Zero, 2);
Assert.AreNotEqual(t1,t2);

Project fails to build in VS2017

Project Superpower is not compatible with netcoreapp2.0 (.NETCoreApp,Version=v2.0). Project Superpower supports: net45 (.NETFramework,Version=v4.5)

Seems the nuget restore fails... what do I need to do other than opening the .sln file ?


Running non-parallel restore.
Reading project file /Users/user/git/superpower/sample/DateTimeTextParser/DateTimeParser.csproj.
The restore inputs for 'DateTimeParser' have changed. Continuing restore.
Restoring packages for /Users/user/git/superpower/sample/DateTimeTextParser/DateTimeParser.csproj...
Restoring packages for .NETCoreApp,Version=v2.0...
Resolving conflicts for .NETCoreApp,Version=v2.0...
Checking compatibility of packages on .NETCoreApp,Version=v2.0.
Checking compatibility for DateTimeParser 1.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.App 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Superpower 2.1.0 with .NETCoreApp,Version=v2.0.
Project Superpower is not compatible with netcoreapp2.0 (.NETCoreApp,Version=v2.0). Project Superpower supports: net45 (.NETFramework,Version=v4.5)
Checking compatibility for Microsoft.NETCore.DotNetHostPolicy 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.Platforms 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for NETStandard.Library 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetHostResolver 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetAppHost 2.0.0 with .NETCoreApp,Version=v2.0.
Incompatible projects: 1
Committing restore...
Writing lock file to disk. Path: /Users/user/git/superpower/sample/DateTimeTextParser/obj/project.assets.json
Writing cache file to disk. Path: /Users/user/git/superpower/sample/DateTimeTextParser/obj/DateTimeParser.csproj.nuget.cache
Restore failed in 127.9 ms for /Users/user/git/superpower/sample/DateTimeTextParser/DateTimeParser.csproj.
Reading project file /Users/user/git/superpower/sample/IntCalc/IntCalc.csproj.
The restore inputs for 'IntCalc' have not changed. No further actions are required to complete the restore.
Committing restore...
Assets file has not changed. Skipping assets file writing. Path: /Users/user/git/superpower/sample/IntCalc/obj/project.assets.json
No-Op restore. The cache will not be updated. Path: /Users/user/git/superpower/sample/IntCalc/obj/IntCalc.csproj.nuget.cache
Restore completed in 3.65 ms for /Users/user/git/superpower/sample/IntCalc/IntCalc.csproj.
Reading project file /Users/user/git/superpower/sample/JsonParser/JsonParser.csproj.
The restore inputs for 'JsonParser' have changed. Continuing restore.
Restoring packages for /Users/user/git/superpower/sample/JsonParser/JsonParser.csproj...
Restoring packages for .NETCoreApp,Version=v2.0...
Resolving conflicts for .NETCoreApp,Version=v2.0...
Checking compatibility of packages on .NETCoreApp,Version=v2.0.
Checking compatibility for JsonParser 1.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.App 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Superpower 2.1.0 with .NETCoreApp,Version=v2.0.
Project Superpower is not compatible with netcoreapp2.0 (.NETCoreApp,Version=v2.0). Project Superpower supports: net45 (.NETFramework,Version=v4.5)
Checking compatibility for Microsoft.NETCore.DotNetHostPolicy 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.Platforms 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for NETStandard.Library 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetHostResolver 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetAppHost 2.0.0 with .NETCoreApp,Version=v2.0.
Incompatible projects: 1
Committing restore...
Writing lock file to disk. Path: /Users/user/git/superpower/sample/JsonParser/obj/project.assets.json
Writing cache file to disk. Path: /Users/user/git/superpower/sample/JsonParser/obj/JsonParser.csproj.nuget.cache
Restore failed in 34.63 ms for /Users/user/git/superpower/sample/JsonParser/JsonParser.csproj.
Reading project file /Users/user/git/superpower/src/Superpower/Superpower.csproj.
The restore inputs for 'Superpower' have not changed. No further actions are required to complete the restore.
Committing restore...
Assets file has not changed. Skipping assets file writing. Path: /Users/user/git/superpower/src/Superpower/obj/project.assets.json
No-Op restore. The cache will not be updated. Path: /Users/user/git/superpower/src/Superpower/obj/Superpower.csproj.nuget.cache
Restore completed in 3.79 ms for /Users/user/git/superpower/src/Superpower/Superpower.csproj.
Reading project file /Users/user/git/superpower/test/Superpower.Benchmarks/Superpower.Benchmarks.csproj.
The restore inputs for 'Superpower.Benchmarks' have changed. Continuing restore.
Restoring packages for /Users/user/git/superpower/test/Superpower.Benchmarks/Superpower.Benchmarks.csproj...
Restoring packages for .NETCoreApp,Version=v2.0...
Resolving conflicts for .NETCoreApp,Version=v2.0...
Checking compatibility of packages on .NETCoreApp,Version=v2.0.
Checking compatibility for Superpower.Benchmarks 1.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NET.Test.Sdk 15.5.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.runner.visualstudio 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for BenchmarkDotNet 0.10.10 with .NETCoreApp,Version=v2.0.
Checking compatibility for Sprache 2.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.App 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Superpower 2.1.0 with .NETCoreApp,Version=v2.0.
Project Superpower is not compatible with netcoreapp2.0 (.NETCoreApp,Version=v2.0). Project Superpower supports: net45 (.NETFramework,Version=v4.5)
Checking compatibility for Microsoft.TestPlatform.TestHost 15.5.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.CodeCoverage 1.0.3 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.DotNet.InternalAbstractions 1.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.core 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.assert 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.analyzers 0.7.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for BenchmarkDotNet.Core 0.10.10 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetHostPolicy 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.Platforms 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for NETStandard.Library 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.TestPlatform.ObjectModel 15.5.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Newtonsoft.Json 9.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.Extensions.DependencyModel 1.0.3 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.AppContext 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.InteropServices.RuntimeInformation 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.extensibility.core 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.extensibility.execution 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Serialization.Primitives 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XPath.XmlDocument 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Emit.Lightweight 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.DotNet.PlatformAbstractions 1.1.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.Win32.Registry 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ValueTuple 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XmlSerializer 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetHostResolver 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ComponentModel.EventBasedAsync 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ComponentModel.TypeConverter 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Metadata 1.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Loader 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Serialization.Json 4.0.2 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.Process 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.TextWriterTraceListener 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.TraceSource 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading.Thread 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.CSharp 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Dynamic.Runtime 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Linq.Expressions 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ObjectModel 4.0.12 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XDocument 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for runtime.native.System 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.abstractions 2.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Resources.ResourceManager 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Globalization 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.IO 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Extensions 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.ReaderWriter 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XmlDocument 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XPath 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Emit.ILGeneration 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Primitives 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Handles 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.InteropServices 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Linq 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Emit 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Extensions 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.TypeExtensions 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Text.RegularExpressions 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetAppHost 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections.NonGeneric 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections.Specialized 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ComponentModel 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ComponentModel.Primitives 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections.Immutable 1.2.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Private.DataContractSerialization 4.1.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.Win32.Primitives 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading.ThreadPool 4.0.10 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.Tools 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.Targets 1.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Text.Encoding 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading.Tasks 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.Debug 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.IO.FileSystem 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.IO.FileSystem.Primitives 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Text.Encoding.Extensions 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading.Tasks.Extensions 4.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Globalization.Extensions 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections.Concurrent 4.0.12 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.Tracing 4.1.0 with .NETCoreApp,Version=v2.0.
Incompatible projects: 1
Committing restore...
Writing lock file to disk. Path: /Users/user/git/superpower/test/Superpower.Benchmarks/obj/project.assets.json
Writing cache file to disk. Path: /Users/user/git/superpower/test/Superpower.Benchmarks/obj/Superpower.Benchmarks.csproj.nuget.cache
Restore failed in 2.05 sec for /Users/user/git/superpower/test/Superpower.Benchmarks/Superpower.Benchmarks.csproj.
Reading project file /Users/user/git/superpower/test/Superpower.Tests/Superpower.Tests.csproj.
The restore inputs for 'Superpower.Tests' have changed. Continuing restore.
Restoring packages for /Users/user/git/superpower/test/Superpower.Tests/Superpower.Tests.csproj...
Restoring packages for .NETCoreApp,Version=v2.0...
Resolving conflicts for .NETCoreApp,Version=v2.0...
Checking compatibility of packages on .NETCoreApp,Version=v2.0.
Checking compatibility for Superpower.Tests 1.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NET.Test.Sdk 15.5.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.runner.visualstudio 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.App 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Superpower 2.1.0 with .NETCoreApp,Version=v2.0.
Project Superpower is not compatible with netcoreapp2.0 (.NETCoreApp,Version=v2.0). Project Superpower supports: net45 (.NETFramework,Version=v4.5)
Checking compatibility for Microsoft.TestPlatform.TestHost 15.5.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.CodeCoverage 1.0.3 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.DotNet.InternalAbstractions 1.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.core 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.assert 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.analyzers 0.7.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetHostPolicy 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.Platforms 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for NETStandard.Library 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.TestPlatform.ObjectModel 15.5.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Newtonsoft.Json 9.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.Extensions.DependencyModel 1.0.3 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.AppContext 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.IO 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.IO.FileSystem 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.TypeExtensions 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Extensions 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.InteropServices 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.InteropServices.RuntimeInformation 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.extensibility.core 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.extensibility.execution 2.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetHostResolver 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ComponentModel.EventBasedAsync 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ComponentModel.TypeConverter 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Metadata 1.3.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Loader 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Serialization.Primitives 4.1.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Serialization.Json 4.0.2 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XPath.XmlDocument 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.Process 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.TextWriterTraceListener 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.TraceSource 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading.Thread 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.CSharp 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.Debug 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Dynamic.Runtime 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Globalization 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Linq 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Linq.Expressions 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ObjectModel 4.0.12 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Extensions 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Resources.ResourceManager 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Text.Encoding 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Text.Encoding.Extensions 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Text.RegularExpressions 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading.Tasks 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.ReaderWriter 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XDocument 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.DotNet.PlatformAbstractions 1.0.3 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.Targets 1.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.IO.FileSystem.Primitives 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Runtime.Handles 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Primitives 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for runtime.native.System 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for xunit.abstractions 2.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.NETCore.DotNetAppHost 2.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections.NonGeneric 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections.Specialized 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ComponentModel 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.ComponentModel.Primitives 4.1.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections.Immutable 1.2.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Private.DataContractSerialization 4.1.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XmlDocument 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XPath 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.Win32.Primitives 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for Microsoft.Win32.Registry 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading.ThreadPool 4.0.10 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Emit 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Emit.ILGeneration 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Reflection.Emit.Lightweight 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Threading.Tasks.Extensions 4.0.0 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.Tools 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Globalization.Extensions 4.0.1 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Collections.Concurrent 4.0.12 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Xml.XmlSerializer 4.0.11 with .NETCoreApp,Version=v2.0.
Checking compatibility for System.Diagnostics.Tracing 4.1.0 with .NETCoreApp,Version=v2.0.
Incompatible projects: 1
Committing restore...
Writing lock file to disk. Path: /Users/user/git/superpower/test/Superpower.Tests/obj/project.assets.json
Writing cache file to disk. Path: /Users/user/git/superpower/test/Superpower.Tests/obj/Superpower.Tests.csproj.nuget.cache
Restore failed in 333.09 ms for /Users/user/git/superpower/test/Superpower.Tests/Superpower.Tests.csproj.

NuGet Config files used:
    /Users/user/.config/NuGet/NuGet.Config

Feeds used:
    https://api.nuget.org/v3/index.json
Restore failed.


Simpler tokenization

Currently, the need to write a tokenizer makes it hard to get started with Superpower. Almost prohibitively so, except for the most motivated newcomers :-)

Generating high-performance, general-purpose tokenizers is more than we can bite off here in the short term, but that doesn't preclude us from making the experience better. At the expense of some raw performance, it's possible to generate fairly useful tokenizers using TextParser<T>s as recognizers.

In v2, this is the model I'd like to propose:

var tokenizer = new TokenizerBuilder<SExpressionToken>()
    .Ignore(Span.WhiteSpace)
    .Match(Character.EqualTo('('), SExpressionToken.LParen)
    .Match(Character.EqualTo(')'), SExpressionToken.RParen)
    .Match(Numerics.Integer, SExpressionToken.Number, requireDelimiters: true)
    .Match(Character.Letter.IgnoreThen(Character.LetterOrDigit.AtLeastOnce()),
        SExpressionToken.Atom, requireDelimiters: true)
    .Ignore(Comment.ShellStyle)
    .Build();

var tokens = tokenizer.TryTokenize("abc (123 def) # this is a comment");
Assert.True(tokens.HasValue);
Assert.Equal(5, tokens.Value.Count());

Compare this with the by-hand version in: https://github.com/datalust/superpower/blob/dev/test/Superpower.Tests/SExpressionScenario/SExpressionTokenizer.cs#L10 - at least 30 significant lines of fairly dense code, without even supporting comments.

The proposed TokenizerBuilder can accept any text parser as a recognizer, and using the requireDelimiters argument, can deal with the awkward "is null" vs "isnull" case using a one-token lookahead.

The downside is that tokenizer run-time increases linearly with respect to the number of matches attempted. This probably isn't noticeable for small grammars like the one above, but larger grammars can do much better with a hand-written replacement. We might be able to extend the builder with some optimizations to claw this perf back, down the line, or add a table-based alternative.

How to use Token.Sequence()

Hello,

I'm not sure how to use Token.Sequence() like Token.EqualTo(). Can you provide an example? I'm trying to do this:

public static TokenListParser<MyToken, int> MySpecialNumber = Token.Sequence(MyToken.Number, MyToken.Word).Apply(Numerics.IntegerInt32); 

but it fails because the text parser can't take a series of tokens. What is a valid use case for this method, or what kind of parser do I have to create to use Apply?
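
One way to approach this, sketched under the assumption that Token.Sequence() yields an array with one Token<TKind> per requested kind: instead of Apply (which works on a single token), project the array with Select and convert the token you care about. The MyToken enum and the small TokenizerBuilder-based tokenizer below are hypothetical stand-ins for the ones in the question.

```csharp
using System;
using Superpower;
using Superpower.Model;
using Superpower.Parsers;
using Superpower.Tokenizers;

// Hypothetical token kinds standing in for the ones in the question.
public enum MyToken { Number, Word }

public static class SequenceExample
{
    // Illustrative tokenizer: whitespace is skipped, digits become Number, letters become Word.
    static readonly Tokenizer<MyToken> Tokenizer = new TokenizerBuilder<MyToken>()
        .Ignore(Span.WhiteSpace)
        .Match(Numerics.Integer, MyToken.Number)
        .Match(Character.Letter.AtLeastOnce(), MyToken.Word)
        .Build();

    // Token.Sequence matches a Number followed by a Word; Select then picks
    // the first token out of the matched array and converts its text.
    public static readonly TokenListParser<MyToken, int> MySpecialNumber =
        Token.Sequence(MyToken.Number, MyToken.Word)
            .Select(tokens => int.Parse(tokens[0].ToStringValue()));

    public static int ParseSpecialNumber(string input) =>
        MySpecialNumber.Parse(Tokenizer.Tokenize(input));
}
```

With this sketch, SequenceExample.ParseSpecialNumber("42 apples") should return the integer taken from the Number token.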

Partial parsing results for completions

Hey there! Is there a way to get a list of potential parse results when there is a parse error? This could be very useful for providing auto-completions in editors by analyzing the partial AST.

Thanks!

Chain combinator question

Hi,

I am trying to find a more idiomatic way to write a token list parser for something like variable.propertyA.propertyB.
I believe this can be written recursively as something like Reference = Variable | Reference '.' Identifier (with careful use of Try), where the '.' token plays a role similar to an operator combining Reference and Identifier.

The Chain combinator can't be used, since it expects a Func<TOperator, T, T, T>, which means it combines two T's into a single T (this sounds somewhat analogous to fold and foldLeft).

I was hoping to find a built in combinator like Chain. Any suggestions?
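
A possible workaround, as a minimal sketch rather than a built-in combinator: parse the leading identifier, collect the ('.' Identifier) tail with Many(), and fold the tail from the left with LINQ's Aggregate. The RefToken enum, the Node classes, and the tokenizer are all hypothetical names made up for this illustration.

```csharp
using System.Linq;
using Superpower;
using Superpower.Parsers;
using Superpower.Tokenizers;

public enum RefToken { Identifier, Dot }

// Hypothetical AST: a variable, or a member access on another node.
public abstract class Node { }

public sealed class VariableNode : Node
{
    public string Name { get; }
    public VariableNode(string name) { Name = name; }
    public override string ToString() => Name;
}

public sealed class MemberNode : Node
{
    public Node Target { get; }
    public string Name { get; }
    public MemberNode(Node target, string name) { Target = target; Name = name; }
    public override string ToString() => $"({Target}.{Name})";
}

public static class ReferenceParser
{
    static readonly Tokenizer<RefToken> Tokenizer = new TokenizerBuilder<RefToken>()
        .Match(Character.EqualTo('.'), RefToken.Dot)
        .Match(Identifier.CStyle, RefToken.Identifier)
        .Build();

    static readonly TokenListParser<RefToken, string> Name =
        Token.EqualTo(RefToken.Identifier).Select(t => t.ToStringValue());

    // head ('.' name)* folded left: a.b.c becomes Member(Member(a, b), c).
    public static readonly TokenListParser<RefToken, Node> ReferenceExpr =
        from head in Name
        from rest in Token.EqualTo(RefToken.Dot).IgnoreThen(Name).Many()
        select rest.Aggregate(
            (Node)new VariableNode(head),
            (target, name) => new MemberNode(target, name));

    public static Node ParseReference(string input) =>
        ReferenceExpr.Parse(Tokenizer.Tokenize(input));
}
```

Because the repetition is expressed with Many(), the grammar is no longer left-recursive, and the left-associative shape is recovered afterwards by the fold.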

Recreate Sprache Csv Parser in Superpower

Hi @nblumhardt, I have been trying to recreate the Sprache Csv Parser, from the test folder in its repository, with the Superpower library.

I have successfully made a simple version that parses my Csv data by row, but I wanted to recreate the slightly more comprehensive Csv parser from Sprache's tests in Superpower.
class Program
{
    static void Main(string[] args)
    {
        var input1 = "a,b,c";
        var input2 = "d,e,f" + Environment.NewLine + "a,b,c";

        var values = CsvParser.Record.Parse(input1).ToList(); // Works for parsing individual records

        Console.WriteLine($"1:{values[0]}, 2:{values[1]}, 3:{values[2]}");

        values = CsvParser.Record.Parse(input2).ToList(); // Works just grabbing the first row

        Console.WriteLine($"1:{values[0]}, 2:{values[1]}, 3:{values[2]}");

        var test = CsvParser.Csv.Parse(input2); // Exception
    }
}

public class CsvParser
{
    private static readonly TextParser<char> CellSeparator = Character.EqualTo(',');

    private static readonly TextParser<char> QuotedCellDelimiter = Character.EqualTo('"');

    private static readonly TextParser<char> QuoteEscape = Character.EqualTo('"');

    private static TextParser<T> Escaped<T>(TextParser<T> following)
    {
        return from escape in QuoteEscape
                from f in following
                select f;
    }

    private static readonly TextParser<char> QuotedCellContent =
        Character.AnyChar.Between(QuotedCellDelimiter, Escaped(QuotedCellDelimiter));

    static readonly TextParser<char> LiteralCellContent =
        Character.ExceptIn(',', '\r', '\n');

    static readonly TextParser<string> QuotedCell =
        from open in QuotedCellDelimiter
        from content in QuotedCellContent.Try().Many()
        from end in QuotedCellDelimiter
        select new string(content);

    static readonly TextParser<string> NewLine =
        Parse.Return(Environment.NewLine);

    static readonly TextParser<string> RecordTerminator =
        Parse.Return("").AtEnd().Or(
        NewLine.AtEnd()).Try().Or(
        NewLine);

    static readonly TextParser<string> Cell =
        QuotedCell.Or(
        LiteralCellContent.Many().Select(x => new string(x)));

    public static readonly TextParser<IEnumerable<string>> Record =
        from leading in Cell
        from rest in CellSeparator.Then(_ => Cell).Try().Many()
        from terminator in RecordTerminator
        select Cons(leading, rest);

    static IEnumerable<T> Cons<T>(T head, IEnumerable<T> rest)
    {
        yield return head;
        foreach (var item in rest)
            yield return item;
    }

    public static readonly TextParser<IEnumerable<IEnumerable<string>>> Csv =
        Record.Many().AtEnd().Select(x => x.AsEnumerable());
}

I've tried following the Sprache Conversion very carefully, but, I cannot get the TextParser<IEnumerable<IEnumerable<string>>> Csv to work.

Exception:

Exception has occurred: CLR/Superpower.ParseException
An unhandled exception of type 'Superpower.ParseException' occurred in Superpower.dll: 'Many() cannot be applied to zero-width parsers; value ParserConcept.CsvParser+<Cons>d__11`1[System.String] at position 5 (line 1, column 6).'

I have a feeling this is going to be a very simple problem and hoping you can provide some insight. Anything seem wrong to you? I may try and just implement one from scratch instead of converting.

P.S. Thanks for the awesome library, it's the first combinator I've used in C# and falling in love with it. It is making parsing crazy custom files I have much easier and with less code.
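
A likely cause, offered as a reading of the code above rather than a confirmed diagnosis: Parse.Return() succeeds without consuming any input, so NewLine never actually eats the line break. After the first row, Record can then succeed at position 5 with an empty cell and a zero-width terminator, which is exactly what Many() refuses to repeat. Below is a minimal sketch of a simpler shape that avoids zero-width repetition by using ManyDelimitedBy for both cells and rows. It deliberately leaves out quoted cells, and the SimpleCsv name is made up for this example.

```csharp
using Superpower;
using Superpower.Model;
using Superpower.Parsers;

public static class SimpleCsv
{
    // A cell is any run of characters up to a comma or line break (may be empty).
    static readonly TextParser<string> Cell =
        Character.ExceptIn(',', '\r', '\n').Many().Select(chars => new string(chars));

    // Cells separated by commas; the separator always consumes input.
    static readonly TextParser<string[]> Record =
        Cell.ManyDelimitedBy(Character.EqualTo(','));

    // Unlike Parse.Return, this consumes the line break. Try() lets Or fall back to "\n".
    static readonly TextParser<TextSpan> NewLine =
        Span.EqualTo("\r\n").Try().Or(Span.EqualTo("\n"));

    // Records separated by line breaks, up to the end of the input.
    public static readonly TextParser<string[][]> Csv =
        Record.ManyDelimitedBy(NewLine).AtEnd();
}
```

One caveat of this sketch: an input that ends with a trailing line break produces a final record holding a single empty cell, which callers may want to filter out.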

Simple parser question

I'm trying to parse lines which consist of some free text (not including brackets {}), followed by JSON-like key-value pairs (without quotes).

I have the key-value pairs working (thanks to your JSONParser example), but I'm stuck on how to include chars like comma and colon in the free text part (e.g., the 3rd & 4th examples below).

This seems a very basic functionality of a parser, so I'm sure I'm missing something simple...

Some examples:

Hello {a:b}
Hello {a:b,c:d}
Hello, nice to meet you {a:b,c:d}
The List: A,B,C  {a:b,c:d}

thanks!
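
One possible shape for this, as a sketch built on a character-level TextParser rather than a tokenizer: since the free text may contain anything except brackets, it can be read as "every character up to the first '{'", and the key-value parser takes over from there. The LineParser class and the assumption that keys and values are plain letter/digit runs are illustrative only.

```csharp
using System.Collections.Generic;
using Superpower;
using Superpower.Parsers;

public static class LineParser
{
    // Free text: everything before the opening brace, commas and colons included.
    static readonly TextParser<string> FreeText =
        Character.Except('{').Many().Select(chars => new string(chars).Trim());

    // Assumption for this sketch: keys and values are unquoted letter/digit runs.
    static readonly TextParser<string> Word =
        Character.LetterOrDigit.AtLeastOnce().Select(chars => new string(chars));

    static readonly TextParser<KeyValuePair<string, string>> Pair =
        from key in Word
        from colon in Character.EqualTo(':')
        from value in Word
        select new KeyValuePair<string, string>(key, value);

    static readonly TextParser<KeyValuePair<string, string>[]> PairList =
        Pair.ManyDelimitedBy(Character.EqualTo(','))
            .Between(Character.EqualTo('{'), Character.EqualTo('}'));

    public static readonly TextParser<(string Text, KeyValuePair<string, string>[] Fields)> Line =
        from text in FreeText
        from fields in PairList
        select (text, fields);
}
```

For the fourth example line, LineParser.Line.Parse("The List: A,B,C  {a:b,c:d}") should yield the text "The List: A,B,C" together with two key-value pairs.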

Add `Matching` to `TokenListParser`

As discussed on Gitter, a parser sometimes ends up being used twice on the same piece of text: first in the tokenizer, and again via Apply in the parser. This is because, if we use an enum for TKind, there is nowhere to store the initial parsing result.

Instead of using an enum for the result of tokenization, it could be advisable to use a normal class that has the token type as a field. This way the parser has access to the parsed result and does not need to call Apply to run the parse again.

To facilitate this approach, we would use Matching instead of Token.EqualTo, so that we can select on the type, which is now a field of the object.

The Matching method could look like this:

public static TokenListParser<TKind, Token<TKind>> Matching<TKind>(Func<TKind, bool> predicate, string name)
{
    if (predicate == null) throw new ArgumentNullException(nameof(predicate));
    if (name == null) throw new ArgumentNullException(nameof(name));

    return Matching(predicate, new[] { name });
}

private static TokenListParser<TKind, Token<TKind>> Matching<TKind>(Func<TKind, bool> predicate, string[] expectations)
{
    if (predicate == null) throw new ArgumentNullException(nameof(predicate));
    if (expectations == null) throw new ArgumentNullException(nameof(expectations));

    return input =>
    {
        var next = input.ConsumeToken();
        if (!next.HasValue || !predicate(next.Value.Kind))
            return TokenListParserResult.Empty<TKind, Token<TKind>>(input, expectations);

        return next;
    };
}

Tokenizer<TKind>.Tokenize should accept TextSpan

I am pre-parsing some text to extract the parts I need to parse, and don't want to allocate more than necessary. For this reason it would be nice to be able to pass a TextSpan that I would extract from the original buffer instead of having to create a string out of it and pass it to the tokenizer.

Visibility of Result<T>.Backtrack

I have tried to build an alternative tokenizer, based on the TokenizerBuilder in this repo. I hit a problem: the code for TokenizerBuilder uses Result.Backtrack, which is an internal property. This makes me wonder: if the current SimpleLinearTokenizer needs access to this property, wouldn't other custom tokenizers need it too?

Make SelectCatch() from Serilog available in Superpower

The Serilog parser has two SelectCatch(,,string error message) extension methods that perform a select but, if the selector throws, produce an error with the given message instead.

I think these are useful enough to introduce more generally in Superpower.
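
For illustration, here is a rough sketch of what a TextParser<T> version of such a method could look like. It is written from the description above, not copied from Serilog, so the exact signature and behavior there may differ; SelectCatchExtensions and the SmallNumber parser are names invented for this example.

```csharp
using System;
using Superpower;
using Superpower.Model;
using Superpower.Parsers;

public static class SelectCatchExtensions
{
    // Like Select, but an exception thrown by the selector becomes a parse
    // failure carrying errorMessage instead of escaping to the caller.
    public static TextParser<U> SelectCatch<T, U>(
        this TextParser<T> parser, Func<T, U> trySelector, string errorMessage)
    {
        if (parser == null) throw new ArgumentNullException(nameof(parser));
        if (trySelector == null) throw new ArgumentNullException(nameof(trySelector));
        if (errorMessage == null) throw new ArgumentNullException(nameof(errorMessage));

        return input =>
        {
            var t = parser(input);
            if (!t.HasValue)
                return Result.CastEmpty<T, U>(t);

            try
            {
                return Result.Value(trySelector(t.Value), input, t.Remainder);
            }
            catch
            {
                // Report the failure at the start of the text that was being converted.
                return Result.Empty<U>(input, errorMessage);
            }
        };
    }

    // Example use: int.Parse throws on overflow, which turns into a parse error.
    public static readonly TextParser<int> SmallNumber =
        Numerics.Integer.SelectCatch(
            span => int.Parse(span.ToStringValue()),
            "the number is too large");
}
```

A token-list version would follow the same pattern using the TokenListParserResult helpers in place of Result.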

Help needed parsing time

Hi,

Sorry to ask a noob question here, but I'm struggling with some parsers. I want to create some parsers smart enough to read "2 minutes" or "2 seconds". I have this so far:

        public static TextParser<TimeSpan> Minutes { get; } = from number in Numerics.IntegerInt32
                                                              from _ in Character.WhiteSpace.IgnoreMany()
                                                              from units in Span.EqualToIgnoreCase("minutes")
                                                              select System.TimeSpan.FromMinutes(number);

        
        public static TextParser<TimeSpan> Seconds { get; } = from number in Numerics.IntegerInt32
                                                              from _ in Character.WhiteSpace.IgnoreMany()
                                                              from units in Span.EqualToIgnoreCase("seconds")
                                                              select System.TimeSpan.FromSeconds(number);

        public static TextParser<TimeSpan> TimeSpan { get; } = Minutes.Or(Seconds);

and these tests:

 [TestMethod]
        public void TimeSpanFromSeconds()
        {
            var result = WorkoutTextParser.TimeSpan.Parse("10 seconds");

            result.Should().Be(TimeSpan.FromSeconds(10));
        }
        [TestMethod]
        public void TimeSpanFromMinutes()
        {
            var result = WorkoutTextParser.TimeSpan.Parse("10 minutes");

            result.Should().Be(TimeSpan.FromMinutes(10));
        }

The seconds test fails with the message Superpower.ParseException: Syntax error (line 1, column 4): unexpected 's', expected 'minutes'.

Any suggestions would be greatly appreciated.

thanks
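
A probable explanation: Or() only tries its second alternative when the first one fails without consuming input. Here Minutes has already consumed "10 " by the time it hits the 's', so Seconds is never attempted. Wrapping the first alternative in Try() makes it backtrack to the start on failure. The sketch below repeats the parsers from the question with that one change. The DurationParser and Duration names are made up to avoid clashing with System.TimeSpan.

```csharp
using System;
using Superpower;
using Superpower.Parsers;

public static class DurationParser
{
    static readonly TextParser<TimeSpan> Minutes =
        from number in Numerics.IntegerInt32
        from _ in Character.WhiteSpace.IgnoreMany()
        from units in Span.EqualToIgnoreCase("minutes")
        select TimeSpan.FromMinutes(number);

    static readonly TextParser<TimeSpan> Seconds =
        from number in Numerics.IntegerInt32
        from _ in Character.WhiteSpace.IgnoreMany()
        from units in Span.EqualToIgnoreCase("seconds")
        select TimeSpan.FromSeconds(number);

    // Try() rewinds to the start of the input if Minutes fails part-way through,
    // so that Or() gets a chance to run Seconds on the same text.
    public static readonly TextParser<TimeSpan> Duration =
        Minutes.Try().Or(Seconds);
}
```

An alternative design is to parse the number and whitespace once and only branch on the unit, which avoids backtracking altogether.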

Question: an analog of Positioned from Sprache?

In Sprache, you provided IPositionAware and Positioned to make a parse result, well, aware of the position at which it was parsed. I find this feature useful for giving the precise position of a syntax construct in post-parse checks (like “this variable right here wasn’t declared”, as opposed to the same message without a position, which leaves users to search for the place themselves).

There is Result<T>.Location, but I don’t see how I could apply that to the resulting value via combinators. Could I achieve it here, and what approach would you advise? (Or maybe I’m looking for the wrong thing, and this should be done another way.)
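
One way this might be approached, as a sketch only: since TextParser<T> is a delegate from TextSpan to Result<T>, a small hand-written combinator can pair the parsed value with the span it was parsed from. The names PositionedExtensions and WithSpan are hypothetical and not part of Superpower; the sketch assumes that Result.Value, Result.CastEmpty, and TextSpan.Until behave as their names suggest.

```csharp
using Superpower;
using Superpower.Model;

public static class PositionedExtensions
{
    // Hypothetical combinator: returns the value together with the
    // source span that the inner parser consumed.
    public static TextParser<(T Value, TextSpan Span)> WithSpan<T>(
        this TextParser<T> parser)
    {
        return input =>
        {
            Result<T> result = parser(input);
            if (!result.HasValue)
            {
                // Propagate the failure unchanged, only adjusting its type.
                return Result.CastEmpty<T, (T Value, TextSpan Span)>(result);
            }

            // Everything between the start of the input and the remainder.
            TextSpan matched = input.Until(result.Remainder);
            return Result.Value<(T Value, TextSpan Span)>(
                (result.Value, matched), input, result.Remainder);
        };
    }
}
```

The returned span exposes Position.Line and Position.Column, which a later semantic check could use when reporting an undeclared variable.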

Printing tokens from a custom tokenizer

How can I iterate over (and print) the tokens produced by my custom tokenizer? There seems to be no access to the underlying array of tokens. I'd like to print this data out to check the tokenizing as I develop. Is there a way to do this? (Currently I can only see the tokens in the debugger.)

For example, in the following code, I'd like to be able to print the values after tokenizing and before parsing:

        var tokens = MyTokenizer.Instance.TryTokenize(json);
        if (!tokens.HasValue)
        {
            value = null;
            error = tokens.ToString();
            errorPosition = tokens.ErrorPosition;
            return false;
        }

        // print results here, then do the parsing phase

        var parsed = MyDocument.TryParse(tokens.Value);
        if (!parsed.HasValue)
        {
            value = null;
            error = parsed.ToString();
            errorPosition = parsed.ErrorPosition;
            return false;
        }
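
A minimal sketch of one way to do this, assuming that TokenList<TKind> can be enumerated as a sequence of Token<TKind>. The DemoToken enum, the TokenDump class, and its demo tokenizer are hypothetical stand-ins for the custom tokenizer in the question.

```csharp
using System;
using Superpower;
using Superpower.Model;
using Superpower.Parsers;
using Superpower.Tokenizers;

public enum DemoToken { Number, Plus }

public static class TokenDump
{
    // Hypothetical tokenizer standing in for MyTokenizer.Instance.
    public static Tokenizer<DemoToken> Tokenizer { get; } =
        new TokenizerBuilder<DemoToken>()
            .Ignore(Span.WhiteSpace)
            .Match(Character.EqualTo('+'), DemoToken.Plus)
            .Match(Numerics.Natural, DemoToken.Number)
            .Build();

    // Writes one line per token: its kind, its text, and where it starts.
    public static void Print<TKind>(TokenList<TKind> tokens)
    {
        foreach (Token<TKind> token in tokens)
        {
            Console.WriteLine(
                $"{token.Kind} '{token.ToStringValue()}' at {token.Position}");
        }
    }
}
```

In the code above, a call such as TokenDump.Print(tokens.Value) would go at the "print results here" comment, after the HasValue check.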

"Structured exception data" for superpower

When parsing fails, an exception is thrown that includes an error message. The error message normally gives the error context, such as line and column.

In certain scenarios we pass partial data to the parser (think templating, include files, etc.), and that data may not correspond to what the end user will see.

The application using Superpower then needs to translate the position given by the exception into an actual position relevant to the user.

Currently this is not possible without parsing the error message itself.

Can we employ ideas found in Serilog, and provide some structured data inside the exception that can be easily processed before passing onto the client?
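
For callers that can use the non-throwing TryParse() instead of Parse(), the returned Result<T> already carries the error position as data (ErrorPosition, with Line and Column). The sketch below maps that position into the coordinates of the enclosing document. The ErrorMapping class, its method names, and the fragmentLine/fragmentColumn parameters (the 1-based position where the fragment starts in the user's document) are hypothetical.

```csharp
using Superpower;
using Superpower.Model;

public static class ErrorMapping
{
    // Translates a position inside a fragment into document coordinates.
    // Only the fragment's first line is shifted horizontally; later lines
    // are assumed to start at column 1 of the document.
    public static (int Line, int Column) ToDocumentPosition(
        Position error, int fragmentLine, int fragmentColumn)
    {
        int line = fragmentLine + error.Line - 1;
        int column = error.Line == 1
            ? fragmentColumn + error.Column - 1
            : error.Column;
        return (line, column);
    }

    // Returns null on success, or the translated error position on failure.
    public static (int Line, int Column)? TryLocateError<T>(
        TextParser<T> parser, string fragment,
        int fragmentLine, int fragmentColumn)
    {
        Result<T> result = parser.TryParse(fragment);
        if (result.HasValue)
        {
            return null;
        }

        return ToDocumentPosition(
            result.ErrorPosition, fragmentLine, fragmentColumn);
    }
}
```

This does not address the exception path raised in the issue; it only shows that the structured position is reachable without parsing the message text when TryParse() is an option.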

Parsing left-recursive/ambiguous grammars

Hello everybody!

This is my favorite parser combinator library. In fact, it's the only one I use. But after some time, and with much more experience, I have found that only the simplest grammars can be expressed with it while keeping complexity low (and the code readable and maintainable).

For example, I tried to define a grammar for the ANSI C language and failed multiple times. The problem is that most of the grammars I've found are left-recursive and thus not handled by Superpower (with the exception of some combinators like Chain).

Well, my dream is to create a C compiler. And I would like to use Superpower. The reasons are obvious: it's .NET and it's cross-platform. Also, I love C#.

I really hope that Superpower could become more mature and maybe copy some concepts from other parsers like GLL Combinators to be able to deal with such grammars.

Also, I hope to finish my compiler some time. The current parser is here (based on Superpower, of course) https://github.com/SuperJMN/SuppaParser. Feel free to collaborate. I really want to learn!

Expressions aren't handled in any way, for now.

I leave this issue as a reminder of my wishes and hopes, since I know this request is very unlikely to be seen in the near future. Also, I know @nblumhardt is quite busy and there aren't a lot of collaborators that could implement such an improvement.

Anyways, thank you :)

Function Calls - Recursive / Circular Grammars

Hello there everyone!

I have been struggling to parse recursive function calls, as implementing both recursive and iterative grammars seems impossible.

Is there some kind of solution?
Thanks!

[Question] Compare with ANTLR

Hi
Thanks for your great library.

Is it possible to make a comparison with ANTLR (pros, cons, features, ...)?
Could you publish a roadmap?

For example, we don't have any Visitor like ANTLR's, so I think we should define an event for each step.
EBNF syntax could also be useful in some scenarios, although the fluent syntax is brilliant.

It's great that we have a library like ANTLR based on C#.NET entirely.

Binary Model?

Constraining parsing to only strings and character arrays seems unnecessarily restrictive. Parsers that operate on bytes and byte arrays could be super useful for parsing binary data into arbitrary objects. Creating a byte-array-based version of the classes in Model would be the first step toward this.

I'm wondering if this is a feature that would be appropriate to include in this project, or possibly to fork into a different one. I haven't seen a lot of discussion about doing this in ANY monadic parser library, and so I'm also wondering if I'm just way out to sea on this one.

How to check if input has been fully consumed?

When I call ParserExtensions.Parse (the TokenList overload), it seems to happily consume a prefix of the input and return success.

Is there any way to check whether the input has not been fully consumed, as that might indicate a problem?
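
A minimal sketch of one option: the AtEnd() combinator wraps a parser so that it only succeeds when nothing is left over. The example uses a text parser for brevity; AtEnd() is also available for token-list parsers. The FullInput class and its property names are hypothetical.

```csharp
using Superpower;
using Superpower.Parsers;

public static class FullInput
{
    // Succeeds on "123abc", leaving "abc" unconsumed.
    public static TextParser<int> Prefix { get; } =
        Numerics.IntegerInt32;

    // Fails on "123abc" because input remains after the integer.
    public static TextParser<int> Whole { get; } =
        Numerics.IntegerInt32.AtEnd();
}
```

When using TryParse(), it should also be possible to inspect the Remainder of a successful result (for example its IsAtEnd property) instead of failing the parse outright.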

Read token attributes?

When iterating the tokens generated by the tokenizer, is it possible to check whether any TokenAttribute (category, example, or description) is set and, if so, to get their values? It would also be interesting to be able to check directly whether the token list contains any token from a certain category.

Custom parser usage

I find several examples of custom tokenizers (here for example), but none of custom (procedural) parsers. Is there an example somewhere, or is this a bad idea?

parsing comments

Hi, I am trying to parse multiline comments with delimiters as defined for C-like languages using Superpower but, unlike with Sprache, I cannot find much out there to guide me. I have tried a few possibilities with no luck.
E.g.

/* this is a comment */
or
/*
this is a
comment */

        public static readonly TextParser<string> MultiLineComment =
            from first in OpenMultilineComment
            from rest in Character.AnyChar.ManyDelimitedBy(CloseMultilineComment)
            from last in CloseMultilineComment
            select new string(rest);

The code above returns only one char for "rest", and I cannot work out why.
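
A likely reason, stated tentatively: ManyDelimitedBy() parses a list of items separated by the delimiter (item, delimiter, item, ...), so after the first character it looks for the closing delimiter as a separator and stops when it does not find one. Two sketches follow. The first repeats "any character that does not start the closing delimiter", using Parse.Not() as a non-consuming check. The second relies on Comment.CStyle from Superpower.Parsers, which is assumed here to match a whole /* ... */ block as a span. The CommentParsers class and its member names are hypothetical.

```csharp
using Superpower;
using Superpower.Model;
using Superpower.Parsers;

public static class CommentParsers
{
    static readonly TextParser<TextSpan> Open = Span.EqualTo("/*");
    static readonly TextParser<TextSpan> Close = Span.EqualTo("*/");

    // One character of comment body: anything, as long as the closing
    // delimiter does not start here. Parse.Not() consumes no input.
    static readonly TextParser<char> BodyChar =
        from notClose in Parse.Not(Close)
        from c in Character.AnyChar
        select c;

    // Returns the text between the delimiters, newlines included.
    public static readonly TextParser<string> MultiLineComment =
        from open in Open
        from body in BodyChar.Many()
        from close in Close
        select new string(body);

    // Built-in alternative; yields the matched span rather than a string.
    public static readonly TextParser<TextSpan> BuiltIn = Comment.CStyle;
}
```

The hand-written version builds the body one character at a time, which is simple but allocates an array; the span-based built-in is probably the better fit when comments only need to be skipped.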
