Giter Club home page Giter Club logo

Comments (4)

DanielOliver avatar DanielOliver commented on September 25, 2024 1

I've just made a change so that the errors with the longest parse length are returned. It's not amazing error reporting, but it's a step in the right direction.

Every Failure to parse returns the structure below.

type ClassifierError<'t> =
{ LastStatus: ClassifierStatus<'t>
TokenizerError: Tokenizer.TokenizerError option
}

The LastStatus field contains the information below. The TokenizerError field isn't used very well right now and is inconsistent, so don't rely on that.

type ClassifierStatus<'t> =
{ Consumed: Tokenizer.Token<'t> list
ConsumedWords: string list
CurrentChar: int
Remainder: string
}

With the longest error being returned, by printing out the ConsumedWords list as shown in the FLexer.Example project, I can see that the Parser got this far:

  ***   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***
-------------------------------------------------------------------------------------
Rejected "SELECT Column1, Column2,,,NoColumn FROM Contacts"

LookaheadFailure

--  Consumed Text  ------------------------------------------------------------------
                          Text     Length
-------------------------------------------------------------------------------------
                        SELECT  |           6
                  (Whitespace)  |           1
                       Column1  |           7
                             ,  |           1
                  (Whitespace)  |           1
                       Column2  |           7
                             ,  |           1

Ideally, FLexer should be able to support custom error messages being returned from any point, but I haven't quite been able to make that happen yet.

from flexer.

DanielOliver avatar DanielOliver commented on September 25, 2024

Performance comparison is an excellent question that I don't have an answer for immediately.

Error reporting is a little sparse right now beyond seeing how far the tokens could be read. The current example from the README here illustrates the lack of information.

// Rejected "SELECT Column1, Column2,,,NoColumn FROM Contacts"
// 
// LookaheadFailure
// 
// --  Consumed Tokens  ----------------------------------------------------------------
//  StartChar  |     EndChar  |                  Text  |                  Classification
// -------------------------------------------------------------------------------------
//          0  |           5  |                SELECT  |  Select

I have some features regarding better error reporting in development, as well as making it easier to write more performant code. I can't say that FLexer will ever be always super fast, but it should be easy to get right.

FLexer is ultimately a recursive descent parser with infinite backtrack (a depth first search). With this approach, I'm putting the responsibility of being optimal on the developer. So, the happy path will be "right first try"; however, the potential to backtrack to the worst case still exists.

from flexer.

DanielOliver avatar DanielOliver commented on September 25, 2024

I can't promise anything on performance comparisons, but I'll ping you back here when I get better error reporting added in a few weeks.

from flexer.

7sharp9 avatar 7sharp9 commented on September 25, 2024

from flexer.

Related Issues (1)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.