graphql-dotnet / parser Goto Github PK

A lexer and parser for GraphQL in .NET

License: MIT License

C# 100.00%

parser's Introduction

GraphQL.NET Parser

This library contains a lexer and parser as well as the complete GraphQL AST model that allows you to work with GraphQL documents compatible with the October 2021 spec.

The parser from this library is used by the GraphQL.NET project and was verified by many test data sets.

Preview versions of this package are available on GitHub Packages.

1. Lexer

Generates token based on input text. Lexer takes advantage of ReadOnlyMemory<char> and in most cases does not allocate memory on the managed heap at all.

Usage:

var token = Lexer.Lex("\"str\"");

Lex method always returns the first token it finds. In this case case the result would look like following.

2. Parser

Parses provided GraphQL expression into AST (abstract syntax tree). Parser also takes advantage of ReadOnlyMemory<char> but still allocates memory for AST.

Usage:

var ast1 = Parser.Parse(@"
{
  field
}");

var ast2 = Parser.Parse(@"
{
  field
}", new ParserOptions { Ignore = IgnoreOptions.Comments });

By default ParserOptions.Ignore is IgnoreOptions.None. If you want to ignore all comments use IgnoreOptions.Comments. If you don't need information about tokens locations in the source document, then use flag IgnoreOptions.Locations. Or just use IgnoreOptions.All and this will maximize the saving of memory allocated in the managed heap for AST.

3. ASTVisitor

ASTVisitor provides API to traverse AST of the parsed GraphQL document. Default implementation traverses all AST nodes of the provided one. You can inherit from it and override desired methods to implement your own AST processing algorithm.

SDLPrinter

For printing SDL from AST, you can use SDLPrinter. This is a highly optimized visitor for asynchronous non-blocking SDL output into provided TextWriter. In the majority of cases it does not allocate memory in the managed heap at all. Extension methods are also provided for printing directly to a string, which utilize the StringBuilder and StringWriter classes.

var document = Parser.Parse("query { hero { name age } }");

// print to a string with default options
var sdl = new SDLPrinter().Print(document);

// print to a string builder
var sb = new StringBuilder();
new SDLPrinter().Print(document, sb);

// print to a string with some options
var sdlPrinter = new SDLPrinter(
    new SDLPrinterOptions
    {
        PrintComments = true,
        EachDirectiveLocationOnNewLine = true,
        EachUnionMemberOnNewLine = true,
    });
var sdl = sdlPrinter.Print(document);

// print to a stream asynchronously
using var writer = new StreamWriter(stream);
await sdlPrinter.PrintAsync(document, writer, default);
await writer.FlushAsync();

Output:

query {
  hero {
    name
    age
  }
}

SDLSorter

An AST document can be sorted with the SDLSorter using a predefined sort order. You can specify the string comparison; by default it uses a culture-invariant case-insensitive comparison. Any futher customization is possible by deriving from SDLSorterOptions and overriding the Compare methods.

var document = Parser.Parse("query { hero { name age } }");
SDLSorter.Sort(document);
var sdl = new SDLPrinter().Print(document);

Output:

query {
  hero {
    age
    name
  }
}

StructurePrinter

You can also find a StructurePrinter visitor that prints AST into the provided TextWriter as a hierarchy of node types. It can be useful when debugging for better understanding the AST structure. Consider the following GraphQL document:

query a { name age }

After StructurePrinter processing the output text will be

Document
  OperationDefinition
    Name [a]
    SelectionSet
      Field
        Name [name]
      Field
        Name [age]

Usage:

public static async Task PrintStructure(string sdl)
{
    var document = Parser.Parse(sdl);
    using var writer = new StringWriter(); 
    var printer = new StructurePrinter()
    await printer.PrintAsync(document, writer);
    var rendered = writer.ToString();
    Console.WriteLine(rendered);
}

Contributors

This project exists thanks to all the people who contribute.

PRs are welcome! Looking for something to work on? The list of open issues is a great place to start. You can help the project by simply responding to some of the asked questions.

parser's People

Contributors

Stargazers

Watchers

parser's Issues

Comments should have no significance to the semantic meaning

Current parser tries to associate comments with the entities by assuming that the comment preceding the definition belongs to that definition.

According to the specification draft:

Comments are Ignored like white space and may appear after any token, or before a LineTerminator, and have no significance to the semantic meaning of a GraphQL Document.

So we should interpret comments just like the whitespace, skipping them during lex parsing.

This issue relates to #33 as documentation should replace the current comments annotation logic

Use string.GetHashCode in ROM

https://docs.microsoft.com/en-us/dotnet/api/system.memoryextensions.equals?view=net-6.0#system-memoryextensions-equals(system-readonlyspan((system-char))-system-readonlyspan((system-char))-system-stringcomparison)

https://docs.microsoft.com/en-us/dotnet/api/system.string.gethashcode?view=net-6.0#system-string-gethashcode(system-readonlyspan((system-char)))

Should both be faster than existing implementations. Need IFDEF check for both.

9.0.1 formatting - lost indentation of arguments (fine in 8.x)

9.0.1 fixes some formatting issues -- #287
And introduces a new issue, lost indentation of field arguments.

Please find attached the project for reproducing the issue:
TryGraphQLParser.zip

Steps

x1-type.graphql was produced by 8.x formatting.
See expected indented arguments:

type Query {
  "Fetches an object given its ID."
  node(
    "ID of the object."
    id: ID!): Node
  "Lookup nodes by a list of IDs."
  nodes(
    "The list of node IDs."
    ids: [ID!]!): [Node]!

Run

dotnet run -- x1-type.graphql

This produces the new x1-type.graphql.output.graphql.
See unexpected not indented arguments:

type Query {
  "Fetches an object given its ID."
  node(
  "ID of the object."
  id: ID!): Node
  "Lookup nodes by a list of IDs."
  nodes(
  "The list of node IDs."
  ids: [ID!]!): [Node]!

Avoid parse ambiguity on types & extensions

See graphql/graphql-spec#598

Make sure the parser handles it.

Lexer can still allocate memory in the case of parsing strings with escape characters

netstandard2.0 only migration?

Recently there was a similar migration in core GraphQL repository: graphql-dotnet/graphql-dotnet#969

Just wondering if we can do here as well. Does it make sense to continue to support net45 and netstandard1.1 TFMs ?

Сomments are shuffled in the schema

Unfortunately, we forgot to read the comment in one of the ParserContext methods - ParseEnumValueDefinition. Because of this, the comments stack fills up and the following items take previous comments from it. As a result, the scheme turns into a mess. It was fun to watch it in our production apps. At first we thought it was our mistake ).

Error on __schema query with comments

I am using the latest 3.0.0-preview-1631 so parser version is 5.1.2 . The following query gives an error :

{
  __schema {
    
    types {
      kind
      
    }
  }
  
# __schema{
#   types {
#     name
#   }
# }
}

The error is :

{
  "errors": [
    {
      "message": "GraphQL.ExecutionError: ParserContext has 1 not applied comments.\r\n ---> System.ApplicationException: ParserContext has 1 not applied comments.\r\n   at GraphQLParser.ParserContext.Dispose()\r\n   at GraphQLParser.Parser.Parse(ISource source)\r\n   at GraphQL.Execution.GraphQLDocumentBuilder.Build(String body) in C:\\projects\\graphql-dotnet\\src\\GraphQL\\Execution\\GraphQLDocumentBuilder.cs:line 20\r\n   at GraphQL.DocumentExecuter.ExecuteAsync(ExecutionOptions options) in C:\\projects\\graphql-dotnet\\src\\GraphQL\\Execution\\DocumentExecuter.cs:line 86\r\n   --- End of inner exception stack trace ---",
      "extensions": {
        "code": "APPLICATION",
        "codes": [
          "APPLICATION"
        ]
      }
    }
  ],
  "extensions": {
    "tracing": {
      "version": 1,
      "startTime": "2020-07-27T09:02:21.0125943Z",
      "endTime": "2020-07-27T09:02:21.0125943Z",
      "duration": 354200,
      "parsing": {
        "startOffset": 5200,
        "duration": 303400
      },
      "validation": {
        "startOffset": 0,
        "duration": 0
      },
      "execution": {
        "resolvers": []
      }
    }
  }
}

Support for repeatable directives

Relates to graphql-dotnet/graphql-dotnet#1496

Package compatibility issues

We've been having issues with occasional bad builds of our graphql-dotnet based being generated, where a bad build will always output the following exception for all GraphQL calls:

ExecutionErrors [GraphQL.ExecutionError: Unmapped selection Field ---> GraphQL.ExecutionError: Unmapped selection Field
   at GraphQL.Language.CoreToVanillaConverter.Selection(ASTNode source)
   at GraphQL.Language.CoreToVanillaConverter.<>c__DisplayClass10_0.<SelectionSet>b__0(ASTNode s)
   at GraphQL.EnumerableExtensions.Apply[T](IEnumerable`1 items, Action`1 action)
   at GraphQL.Language.CoreToVanillaConverter.SelectionSet(GraphQLSelectionSet source)
   at GraphQL.Language.CoreToVanillaConverter.Operation(GraphQLOperationDefinition source)
   at GraphQL.Language.CoreToVanillaConverter.<>c__DisplayClass3_0.<AddDefinitions>b__0(ASTNode def)
   at GraphQL.EnumerableExtensions.Apply[T](IEnumerable`1 items, Action`1 action)
   at GraphQL.Language.CoreToVanillaConverter.Convert(String body, GraphQLDocument source)
   at GraphQL.Execution.GraphQLDocumentBuilder.Build(String body)
   at GraphQL.DocumentExecuter.ExecuteAsync(ExecutionOptions options)

Looking at the source of that exception it should never be throwing that exception when source.Kind == ASTNodeKind.Field.

Having done some digging, I am pretty sure I understand the problem now:

Our project depends (via graphql-dotnet) on GraphQL-Parser 3.0.0, and each time our CI build runs, it pulls a new copy of GraphQL-Parser 3.0.0 from nuget.

What is happening is that sometimes (maybe 5-10% of the time) the package being served by NuGet has ASTNodeKind.Field == 7, and the rest of the time ASTNodeKind.Field == 6.

It would appear that 13 days ago de9ffa2 added a new ASTNodeKind.Comment as the first field in the enum, which has shifted the int value of everything below it up by one, breaking binary compatibility.

At least one other person on StackOverflow appears to be hitting this issue, but I'm assuming most people won't notice because they have a cached 3.0.0 nuget package.

So I think there are two issues here:

Was the current pre-release version mistakenly pushed to nuget as 3.0.0 at some point? Because NuGet appears to be intermittently serving a different version as 3.0.0. Perhaps the incorrect package got onto NuGet's CDNs? We wrote a script to repeatedly pull the package from NuGet and confirmed that it was occasionally serving an incorrect version of the package.
Renumbering ASTNodeKind is a breaking change - the major version number should probably be incremented so a package doesn't get incorrectly upgraded from 3.0 to 3.1 and break in a very confusing way. Maybe it would be better to move the new Comment enum field to the bottom of the list instead rather than renumbering all existing enum fields?

Thanks!

Reading a schema file that has triple quotes?

So I have a schema file. With comments encased in triple quotes. Trying to read in the file like this:
var schemaTxt = File.ReadAllText(@".\shopify.graphql"); var schema = Schema.For(schemaTxt);
When I do so the first line with triple quotes is throwing an exception. Is there an option I can configure to treat these as comments and ignore them?

For example, """Marks an element of a GraphQL schema as having restricted access."""

Parser.Parse throws ArgumentOutOfRangeException

Parser.Parse throws an ArgumentOutOfRangeException when parsing the file from the attached archive. You can run the following code to reproduce it (the path variable should contain the path to the extracted file):

var text = File.ReadAllText(path);
var parser = new Parser(new Lexer());
parser.Parse(new Source(text));

Found via SharpFuzz.

@specifiedBy support

see graphql/graphql-spec#649

Fix readme on Nuget

Sad news. Does this mean that other projects also have this problem? I see readme in graphql-dotner but not parser 🤔 . Looks a bit weird. How about just move property out of Condition?

  <PropertyGroup>
    <PackageReadmeFile>README.md</PackageReadmeFile>
  </PropertyGroup>

The same for AnalysisMode. I understand it requires a bit more work but it should not be difficult. I propose to add CA1707 into NoWarn in Tests.props . After that I see only ~60 issues to fix 😉 . I think this is better than jumping over different order of conditions in MSBuild.

Originally posted by @sungam3r in graphql-dotnet/server#897 (comment)

Add tests to increase code coverage up to 100%

Add SDLPrinterOptions.Sorted property

Should sort all sortable AST nodes while printing.

@sungam3r Are you interested in adding this feature? I would find it very useful for tests on dynamically generated schema.

Release v7

~~Test string -> ROM in GraphQL.NET~~ DONE
~~Update README.md~~ DONE
~~Describe new option for ignoring comments.~~ DONE in #101
~~XML comments~~ PARTIALLY DONE
~~Take care of #89~~ DONE

Add support for all types of schema/type extensions

Ensure October spec compliance before v8 release and publish release

@Shane32 I won't to deal with #103 and #217. These are not so important tasks for optimizing non-critical paths. So the single issue left here before v8 release but it requires to look through the spec https://spec.graphql.org/October2021/ one more time.

Null Literal Value

graphql/graphql-spec#83

Add advanced SDLWriter tests for descriptions and comments

See https://app.codecov.io/gh/graphql-dotnet/parser/blob/master/src/GraphQLParser/Visitors/SDLWriter.cs

Parser.Parse takes around 18s to parse the 58K file

Parser.Parse takes around 18s to parse the 58K file from the attached archive. You can run the following code to reproduce it (the path variable should contain the path to the extracted file):

var text = File.ReadAllText(path);
var parser = new Parser(new Lexer());
parser.Parse(new Source(text));

Found via SharpFuzz.

4.0.0 stable release on nuget.org

Maybe it's time for stable 3.1.0 release?

Performance improvement

Hi,
I migrated your parser to VS2017 and noticed a very low hanging performance fruit you might want, I can't provide a pull request, as my branch is already too different, because all those VS2017 changes require quite a bit of change.
Anyway:
In LexerContext.GetToken() you currently have the following line 31:

var unicode = this.IfUnicodeGetString();

This line is only used in the error case, but processed each and every time.
If you move it down to the exception, your benchmark runs ~6 times faster and uses ~5 less memory.

throw new GraphQLSyntaxErrorException($"Unexpected character {ResolveCharName(code, IfUnicodeGetString())}", source, currentIndex);

All your unit tests still work obviously, because IfUnicodeGetString() does not change currentIndex.
The change itself is here:
https://github.com/Tornhoof/graphql-dotnet-parser/commit/50f5b6dca69dc7f849a54ff4a5bba254a51b6d1f

Benchmark before:

BenchmarkDotNet=v0.10.8, OS=Windows 10 Redstone 2 (10.0.15063)
Processor=Intel Core i7-4790K CPU 4.00GHz (Haswell), ProcessorCount=8
Frequency=3906243 Hz, Resolution=256.0005 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2098.0
  DefaultJob : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2098.0

Method	Mean	Error	StdDev	Gen 0	Allocated
LexKitchenSink	84.97 us	1.506 us	1.335 us	18.9209	77.77 KB

Benchmark after:

BenchmarkDotNet=v0.10.8, OS=Windows 10 Redstone 2 (10.0.15063)
Processor=Intel Core i7-4790K CPU 4.00GHz (Haswell), ProcessorCount=8
Frequency=3906243 Hz, Resolution=256.0005 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2098.0
  DefaultJob : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2098.0

Method	Mean	Error	StdDev	Gen 0	Allocated
LexKitchenSink	13.89 us	0.0297 us	0.0232 us	3.5553	14.6 KB

Unfortunately that's basically the only really low hanging performance fruit, everything else is mostly related to substring etc. and for that quite a few more changes are necessary.

Do something with newlines in SDLWriter

Unify newlines or add setting.

Namespace

Just thinking about the namespace change. I did like the GraphQL(Core).Language as the root namespace. Thoughts on changing that back to GraphQL.Language or GraphQLCore.Language?

Strange formatting of input types with descriptions

If an input type has field descriptions then formatting produces unexpected extra new lines and extra indentation of fields.

Please find attached the project for reproducing the issue.
TryGraphQLParser.zip

Test1

No issues for type types:

dotnet run -- x1-type.graphql

input x1-type.graphql contains a type with fields with descriptions.
output x1-type.graphql.output.graphql is formatted as expected.

Test2

Formatting issues with input with descriptions:

dotnet run -- x2-input.graphql

input x2-input.graphql contains an input with fields with descriptions.
output x2-input.graphql.output.graphql shows unexpected new lines and extra indentations.

Test3

No issues with input without descriptions:

dotnet run -- x3-input.graphql

input x3-input.graphql contains an input with no descriptions.
output x3-input.graphql.output.graphql is formatted as expected.

Add codecov badges after merging in master

Publish v8.1.0 release

Allow interfaces to implement other interfaces

Relates to graphql-dotnet/graphql-dotnet#1497

Specification constraints

How to deal with spec constraints? For example this one

Transitively implemented interfaces (interfaces implemented by the interface that is being implemented) must also be defined on an implementing type or interface.

I know that we implement many checks in GraphQL.NET level. There is often more context for this and more convenient API. Nevertheless, the question arises to support the validation (at least some basic constraints) at the parser level. This can be a separate class with many validation rules such as one mentioned above that can be used if necessary. @Shane32 what do you think?

Something wrong with CodeCov

Deal with <Include>[GraphQL-Parser]*</Include>.
Why some PRs have no code coverage comment (#166) but other (#168) has.

Schema language descriptions

I'm not sure that the parser handles these yet.

graphql/graphql-js#464

# Marks an element of a GraphQL schema as no longer supported.
directive @deprecated(
  # Explains why this element was deprecated, usually also including a suggestion
  # for how to access supported similar data. Formatted in
  # [Markdown](https://daringfireball.net/projects/markdown/).
  reason: String = "No longer supported"
) on FIELD_DEFINITION | ENUM_VALUE

develop branch: ParserExtensions methods should not be extensions

I don't think we should have extension methods called Parse and Lex on string.

Why should "test".Parse() return a GraphQLDocument? Why shouldn't it parse to an integer, for example? And why should "test".Lex() create a lexer for GraphQL documents? Why not a lexer for json?

When you parse an integer, you call int.Parse("234"). It's not just naming -- Microsoft didn't add a "ParseInt" extension method for strings.

I believe we should remove ParseExtensions. The static methods on Parser and Lexer are well designed. No extension methods are needed. If desired for tests, someone can add an extension method within their a test project to facilitate testing. It should not be part of the library.

Is there a way to turn the AST back into a graphql query string?

I'm interested into urning Linq Expressions into graphql queries (e.g. so a .NET client can treat a graphQL endpoint as an IQueryable), and was thinking I could use your AST to do this, but I can't find a way to turn it back into a graphql query string. Is there a way to do this?

SDLPrinter.Print in addition to SDLPrinter.PrintAsync

Is your feature request related to a problem? Please describe.

Having only SDLPrinter.PrintAsync is inconvenient in synchronous scenarios.

Not usually recommended Result or GetAwaiter().GetResult() seemingly works
(so far? by chance?) but produces the code analysis issue / warning CA2012:

ValueTask instances should not have their result directly accessed unless the
instance has already completed. Unlike Tasks, calling Result or
GetAwaiter().GetResult() on a ValueTask is not guaranteed to block until the
operation completes. If you can't simply await the instance, consider first
checking its IsCompleted property (or asserting it's true if you know that to
be the case).

The recommended above safe way of getting ValueTask result is somewhat convoluted.

Describe the solution you'd like

The synchronous SDLPrinter.Print free of potential issues would be handy.

Parsing variable values

Parsed variable values come in as literals without any associated GraphQLtypes

Example:
If I've defined a variables as {username: "peterhouse"}
and defined a query as

query($username: String) {
  Person(uname: $username, firstName: "Pete") {
    id, email
  }
}

What is the best way to parse everything i.e. query + variable definitions. Variable definition in this case being {username: "peterhouse"}. I'm not seeing any examples on how to do this - not even in the tests.
What's the prescribed way for retrieving the value of a variable as a GraphQLType i.e. same way the parser returns firstName above as a GraphQLType with ASTNodeKind as StringValue?

Setup

Glad to see this here!

I'm thinking we should rename the repo to just parser
I have a graphql-dotnet-ci account that can be used with AppVeyor
I signed up for an open-source myget account a while ago though haven't used it yet
Is there any special setup needed for AppVeyor?

Consider to split GraphQLBooleanValue into two classes for true and false

Configure LGTM

Optimize comments and block strings

Rel #216

Not supporting variables with multiple values

{
    first: Project(project_SKey:[6494, 6495])
}

When parsing the above gives:

Syntax Error GraphQL (1:43) Unexpected EOF
1: {    first: Project(project_SKey:[6494

Schemas containing type or field description fail to be parsed correctly

Description

The attempt to parse schema with description tokens results in exception

Steps to reproduce

var schema = Schema.For(@"""Root type for all queries""
type Query {
  ""Documentation for the field""
  name: String
}");

Expected result

Documentation should either be ignored or parsed into the corresponding fields

Actual result

Exception

GraphQLParser.Exceptions.GraphQLSyntaxErrorException: Syntax Error GraphQL (1:1) Unexpected String "Root type for all queries"
1: "Root type for all queries"
   ^
2: type Query {

  at at GraphQLParser.ParserContext.ParseDefinition()
  at at GraphQLParser.ParserContext.ParseDefinitionsIfNotEOF()
  at at GraphQLParser.ParserContext.ParseDocument()
  at at GraphQLParser.ParserContext.Parse()
  at at GraphQLParser.Parser.Parse(ISource source)
  at at GraphQL.Utilities.SchemaBuilder.Parse(String document)
  at at GraphQL.Utilities.SchemaBuilder.Build(String typeDefinitions)
  at at GraphQL.Types.Schema.For(String typeDefinitions, Action`1 configure)

Environment

.NET Core 3.1, graphql-dotnet 3.0.0-preview-1490

Is there a list of allowable AST node types for each container?

Like what ASTNodeKind are permitted in a GraphQlDocument.Definitions collection? What can be in a SelectionSet collection? And so on.

v8 changes

Rename IHasDescription to IHasDescriptionNode
IgnoreOptions -> flags enum

More Parse methods

          Just a note - it seems that we could make more methods public to allow parse concrete parts of document.

Originally posted by @sungam3r in #295 (comment)

Don't track branch coverage for Debug.Assert

https://docs.microsoft.com/en-us/dotnet/core/tools/dotnet-test:

@Shane32 So may be add -c Release to increase coverage. Debug.Assert never happens when code works well so those calls always decrease coverare :)

Revise AST constructors

          Internally we can still do it the other way.  But the `public` constructor would have proper NRT attributes.  We can also allow for users to pass in `null!` if they like to the constructor, but they are then knowingly breaking NRT semantics.

Originally posted by @Shane32 in #300 (comment)

Support for descriptions

So 4.0 added support for comments but it seems that comments should be ignored by the parser. Descriptions are what the spec uses for documentation. More here graphql/graphql-js#927 and here https://graphql.github.io/graphql-spec/June2018/#sec-Descriptions

Eliminate RegExp usage and ToString call in Location.cs

#101 (comment)

graphql-dotnet / parser Goto Github PK

parser's Introduction

GraphQL.NET Parser

1. Lexer

2. Parser

3. ASTVisitor

SDLPrinter

SDLSorter

StructurePrinter

Contributors

parser's People

Contributors

Stargazers

Watchers

Forkers

parser's Issues

Description

Steps to reproduce

Expected result

Actual result

Environment

Recommend Projects

Recommend Topics

Recommend Org