Comments (8)

Hejsil commented on August 25, 2024

If we can come up with something that is decently ergonomic, then I wouldn't mind the change. None of my libs will have a 1.0 release before Zig 1.0, so let's just break things while we can.

from mecha.

zenMaya commented on August 25, 2024

I think the change would require moving away from Error!Result(T) to something like:

pub fn Result(comptime T: type) type {
    return struct {
        pub const Value = union(enum) {
            Ok: T,
            Err: Error, // using the pub const Error
        };

        value: Value,
        rest: []const u8 = "",
    };
}

But that would mean rewriting all combinators and their respective functions.
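
A minimal sketch of how a caller might consume such a Result to report where parsing stopped. The Error set, the digits and pair parsers, and the input are hypothetical stand-ins written for this illustration, not mecha's actual API:

```zig
const std = @import("std");

// Hypothetical error set standing in for the library's `pub const Error`.
pub const Error = error{ParserFailed};

pub fn Result(comptime T: type) type {
    return struct {
        pub const Value = union(enum) {
            Ok: T,
            Err: Error,
        };

        value: Value,
        rest: []const u8 = "",
    };
}

// Hypothetical parser: consumes leading ASCII digits, fails if there are none.
// On failure, `rest` is the input at the point where parsing stopped.
fn digits(s: []const u8) Result([]const u8) {
    var i: usize = 0;
    while (i < s.len and std.ascii.isDigit(s[i])) : (i += 1) {}
    if (i == 0) return .{ .value = .{ .Err = Error.ParserFailed }, .rest = s };
    return .{ .value = .{ .Ok = s[0..i] }, .rest = s[i..] };
}

// Hypothetical parser for "<digits>,<digits>" built from `digits`.
fn pair(s: []const u8) Result(void) {
    const a = digits(s);
    if (a.value == .Err) return .{ .value = .{ .Err = Error.ParserFailed }, .rest = a.rest };
    if (a.rest.len == 0 or a.rest[0] != ',')
        return .{ .value = .{ .Err = Error.ParserFailed }, .rest = a.rest };
    const b = digits(a.rest[1..]);
    if (b.value == .Err) return .{ .value = .{ .Err = Error.ParserFailed }, .rest = b.rest };
    return .{ .value = .{ .Ok = {} }, .rest = b.rest };
}

pub fn main() void {
    const input = "12,x4";
    const r = pair(input);
    switch (r.value) {
        .Ok => std.debug.print("parsed ok\n", .{}),
        // `rest` is a suffix of `input`, so the length difference is the error offset.
        .Err => std.debug.print("syntax error at offset {d}\n", .{input.len - r.rest.len}),
    }
}
```

Because rest is kept even in the Err case, the caller can compute the error offset (3 for the input above) without the parser needing to track positions separately.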

zenMaya commented on August 25, 2024

Maybe it would be even better to expose the state the parser was left in, since there are recovery strategies that could then be implemented to report as many errors as possible.

For example, C string literals must fit on one line, so:

char* str = "aaaaa
";

is invalid. Most parsers would assume that the string was just improperly terminated and, although the input is invalid, continue as if the string were valid:

char* str = "aaaaa"
";";

reporting an unterminated string (x2) and a missing semicolon (x2).

Or even some smarter error recovery strategy, like having a special rule for multiline strings that fails because it is not valid in the C grammar, but continues parsing since the parser can still find more errors. Failing at the first error without knowing what the error is (just "ParserError") is not sufficient for recovery.

There could be either a "panic mode" (discard characters until a "safe character" such as }, the end of a block, is found)

or failure with continuation, so the implementor could modify the state to bring it back to a safe one. For example, if a string literal fails, continue by inserting a ghost " and then carry on parsing.
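
A rough sketch of the "panic mode" idea, assuming a toy grammar of "<digits>;" statements. The statement parser, the choice of ; and } as safe characters, and all function names are assumptions made for this illustration:

```zig
const std = @import("std");

// Panic mode: discard input up to and including the next synchronization
// byte (`;` or `}` here), or everything if no such byte remains.
fn synchronize(rest: []const u8) []const u8 {
    const i = std.mem.indexOfAny(u8, rest, ";}") orelse return rest[rest.len..];
    return rest[i + 1 ..];
}

// Hypothetical statement parser: accepts "<digits>;" and returns the
// remaining input, or null on a syntax error.
fn statement(s: []const u8) ?[]const u8 {
    var i: usize = 0;
    while (i < s.len and std.ascii.isDigit(s[i])) : (i += 1) {}
    if (i == 0 or i == s.len or s[i] != ';') return null;
    return s[i + 1 ..];
}

// Parses every statement, counting errors instead of stopping at the first one.
fn countErrors(input: []const u8) usize {
    var rest = input;
    var errors: usize = 0;
    while (rest.len > 0) {
        if (statement(rest)) |next| {
            rest = next;
        } else {
            errors += 1;
            rest = synchronize(rest);
        }
    }
    return errors;
}

pub fn main() void {
    std.debug.print("{d} syntax errors\n", .{countErrors("1;x2;3;y;")});
}
```

With the input above, the statements x2; and y; are both reported, because after each failure the loop skips to just past the next ; and resumes there.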

zenMaya commented on August 25, 2024

I have started testing the "continuation" tactic.

The Result structure would look like this:

pub fn Result(comptime T: type) type {
    return struct {
        pub const Value = union(enum) {
            Ok: T,
            InternalError: Error, // Parser failed irrecoverably
            ParserError: Parser(T), // Parser failed at this state
        };
        value: Value,
        rest: []const u8 = "",
    };
}

And, for example, the string combinator would look like this (I had to remove the comptime, since parsing must be able to recreate the parser dynamically):

pub fn string(str: []const u8) Parser(void) {
    return struct {
        fn func(_: mem.Allocator, s: []const u8) Void {
            const diff = mem.indexOfDiff(u8, s, str); // find the first characters where input differs
            if (diff) |diff_index| { // if there is one
                if (diff_index == str.len) // if the first difference is the ending of str, the parse was successful
                    return Void{ .value = .{ .Ok = {} }, .rest = s[str.len..] };
                // else create a new string, that expects the rest of the string to be parsed
                // for example: if s = "strang" and str = "string"
                // => string("ing") and rest = "ang"
                const resume_string = string(str[diff_index..]);
                return Void{ .value = .{ .ParserError = resume_string }, .rest = s[diff_index..] };
            } else return Void{ .value = .{ .Ok = {} }, .rest = s[str.len..] };
        }
    }.func;
}

@Hejsil please let me know what you think! Does it seem interesting to you? Should I continue with the implementation?

PS. I noticed that this may not be possible. I am not that well versed in every Zig implementation detail, and string not being a comptime function might keep it from working.

Hejsil commented on August 25, 2024

> PS. I noticed that this may not be possible. I am not that well versed in every Zig implementation detail, and string not being a comptime function might keep it from working.
>
> const resume_string = string(str[diff_index..]);

Yes, correct, this will not work. I'm also not sure I understand why we would want ParserError to have Parser(T) as the result. How would one use this? I think you need to show me what you imagine error recovery looking like.
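
For illustration only: a function declared inside a struct in Zig cannot capture a runtime value such as str, which is why the proposed string(str[diff_index..]) cannot build a new parser at run time. One possible way around this, sketched here under the assumption that the "parser to resume with" is stored as plain data, is shown below. StringParser, Outcome, and resume_with are hypothetical names and not part of mecha:

```zig
const std = @import("std");

// Hypothetical runtime parser: the expected text is ordinary data in a
// struct field, so no function has to capture a runtime value.
const StringParser = struct {
    expected: []const u8,

    const Outcome = union(enum) {
        // Remaining input after a full match.
        ok: []const u8,
        // Mismatch: the parser for the still-missing text, plus the input
        // from the point of the first difference.
        resume_with: struct { parser: StringParser, rest: []const u8 },
    };

    fn parse(self: StringParser, s: []const u8) Outcome {
        // indexOfDiff returns null when both slices are equal.
        const n = std.mem.indexOfDiff(u8, s, self.expected) orelse self.expected.len;
        if (n == self.expected.len) return .{ .ok = s[n..] };
        return .{ .resume_with = .{
            .parser = .{ .expected = self.expected[n..] },
            .rest = s[n..],
        } };
    }
};
```

With expected = "string" and the input "strang", this yields a resume parser expecting "ing" and rest = "ang", matching the example in the proposal. It only shows that the state can be represented; it does not answer how a caller would use it for recovery.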

zenMaya commented on August 25, 2024

> Yes, correct, this will not work. I'm also not sure I understand why we would want ParserError to have Parser(T) as the result. How would one use this? I think you need to show me what you imagine error recovery looking like.

Okay, maybe I haven't thought this through properly, sorry. I have been thinking about this for a few hours, and parser combinators aren't really suitable for what I had in mind. It would work for LL(1) parsers, but you cannot really inspect combinators the way you can inspect finite state machines with explicit transition tables. I think the opt function already does almost everything you need for error recovery. Maybe a more ergonomic wrapper for collecting errors would help, but that doesn't require drastic API changes.

I'm really sorry to have bothered you with this. I should've thought about it much longer before posting.

Maybe the original Ok/Err proposal would still be useful? Even if you don't want to modify your combinators to recover from errors using opt, you could at least report the syntax error position.

Hejsil commented on August 25, 2024

> Maybe the original Ok/Err proposal would still be useful? Even if you don't want to modify your combinators to recover from errors using opt, you could at least report the syntax error position.

Yeah, I think there is a use case for the Ok/Err thing. Being able to see the furthest point the parser reached can be quite useful for simple error reporting, but for anything more complicated, I think that is outside this library's scope.
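
One way "the furthest point the parser reached" could be tracked across alternatives, as a sketch under assumptions: literal and oneOf are hypothetical helpers, and the Outcome type is invented for this example rather than taken from mecha:

```zig
const std = @import("std");

// `ok` carries the remaining input after a match; `err` carries the
// unconsumed input at the point of the first mismatch.
const Outcome = union(enum) { ok: []const u8, err: []const u8 };

// Hypothetical matcher for a fixed piece of text.
fn literal(expected: []const u8, s: []const u8) Outcome {
    const n = std.mem.indexOfDiff(u8, s, expected) orelse expected.len;
    if (n == expected.len) return .{ .ok = s[n..] };
    return .{ .err = s[n..] };
}

// Tries each alternative in order. If all fail, reports the failure that
// consumed the most input, i.e. the one with the shortest `rest`.
fn oneOf(alternatives: []const []const u8, s: []const u8) Outcome {
    var furthest: []const u8 = s;
    for (alternatives) |alt| {
        switch (literal(alt, s)) {
            .ok => |rest| return .{ .ok = rest },
            .err => |rest| if (rest.len < furthest.len) {
                furthest = rest;
            },
        }
    }
    return .{ .err = furthest };
}

pub fn main() void {
    const input = "strang";
    const alternatives = [_][]const u8{ "stop", "string" };
    switch (oneOf(&alternatives, input)) {
        .ok => std.debug.print("matched\n", .{}),
        .err => |rest| std.debug.print("furthest failure at offset {d}\n", .{input.len - rest.len}),
    }
}
```

For the input "strang", the alternative "stop" fails at offset 2 and "string" fails at offset 3, so the reported position is 3, the deepest point any alternative reached.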

zenMaya commented on August 25, 2024

> Yeah, I think there is a use case for the Ok/Err thing. Being able to see the furthest point the parser reached can be quite useful for simple error reporting,

Okay, I'll try to get it done soon-ish and make a PR!

> but for anything more complicated, I think that is outside this library's scope.

You are right. I think there could be some combinators for opt and error messages, or for combining opt with map, but otherwise the library already has enough features. Once again, I'm sorry for all the other stuff I suggested, it was dumb.
