Hi, I'm just curious about the relative performance vs fsyacc/lexx and maybe the error

Curious about performance vs fsyacc/lexx (flexer issue, closed)

7sharp9 commented on September 25, 2024

Curious about performance vs fsyacc/lexx


Comments (4)

DanielOliver commented on September 25, 2024

I've just made a change so that the errors with the longest parse length are returned. It's not amazing error reporting, but it's a step in the right direction.

Every failure to parse returns the structure below.

FLexer/src/FLexer.Core/Classifier.fs

Lines 28 to 31 in 9d321d6

    type ClassifierError<'t> =
        { LastStatus: ClassifierStatus<'t>
          TokenizerError: Tokenizer.TokenizerError option
        }

The LastStatus field contains the information below. The TokenizerError field isn't used very well right now and is inconsistent, so don't rely on that.

FLexer/src/FLexer.Core/Classifier.fs

Lines 4 to 9 in 9d321d6

    type ClassifierStatus<'t> =
        { Consumed: Tokenizer.Token<'t> list
          ConsumedWords: string list
          CurrentChar: int
          Remainder: string
        }

Because the error with the longest parse is now the one returned, printing out the ConsumedWords list (as shown in the FLexer.Example project) lets me see that the parser got this far:

  ***   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***   ***
-------------------------------------------------------------------------------------
Rejected "SELECT Column1, Column2,,,NoColumn FROM Contacts"

LookaheadFailure

--  Consumed Text  ------------------------------------------------------------------
                          Text     Length
-------------------------------------------------------------------------------------
                        SELECT  |           6
                  (Whitespace)  |           1
                       Column1  |           7
                             ,  |           1
                  (Whitespace)  |           1
                       Column2  |           7
                             ,  |           1

Ideally, FLexer should be able to support custom error messages being returned from any point, but I haven't quite been able to make that happen yet.


DanielOliver commented on September 25, 2024

Performance comparison is an excellent question that I don't have an answer for immediately.

Error reporting is a little sparse right now beyond seeing how far the tokens could be read. The current example from the README here illustrates the lack of information.

// Rejected "SELECT Column1, Column2,,,NoColumn FROM Contacts"
// 
// LookaheadFailure
// 
// --  Consumed Tokens  ----------------------------------------------------------------
//  StartChar  |     EndChar  |                  Text  |                  Classification
// -------------------------------------------------------------------------------------
//          0  |           5  |                SELECT  |  Select

I have some features in development for better error reporting, as well as for making it easier to write more performant code. I can't say that FLexer will always be super fast, but it should be easy to get right.

FLexer is ultimately a recursive descent parser with unlimited backtracking (a depth-first search). With this approach, I'm putting the responsibility for being optimal on the developer. So, the happy path will be "right first try"; however, the potential to backtrack to the worst case still exists.
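To make the trade-off concrete, here is a minimal sketch of a backtracking recursive descent parser written in continuation-passing style. It is not FLexer's implementation or API: the `Parser` type and the `literal`, `andThen`, `choice` and `run` helpers are all hypothetical names chosen for illustration. On failure it reports the furthest position any branch reached, which is one simple way to get a "longest parse" error.

```fsharp
// A parser receives the input, a start position, and a continuation that
// stands for "the rest of the grammar".
// Ok    = final position of a complete parse.
// Error = furthest position reached by any branch that failed.
type Parser = string -> int -> (int -> Result<int, int>) -> Result<int, int>

// Match a fixed piece of text at the current position.
let literal (text: string) : Parser =
    fun (input: string) pos next ->
        let matches =
            pos + text.Length <= input.Length
            && System.String.CompareOrdinal(input, pos, text, 0, text.Length) = 0
        if matches then next (pos + text.Length) else Error pos

// Run p, then q from wherever p stopped.
let andThen (p: Parser) (q: Parser) : Parser =
    fun input pos next -> p input pos (fun mid -> q input mid next)

// Depth-first search over the alternatives. Each alternative is tried
// together with everything that follows it, so a failure later in the
// grammar backtracks to the next alternative here.
let choice (alternatives: Parser list) : Parser =
    fun input pos next ->
        let rec loop furthest remaining =
            match remaining with
            | [] -> Error furthest
            | p :: rest ->
                match p input pos next with
                | Ok finish -> Ok finish
                | Error reached -> loop (max furthest reached) rest
        loop pos alternatives

// A parse only succeeds if the whole input was consumed.
let run (p: Parser) (input: string) =
    p input 0 (fun pos -> if pos = input.Length then Ok pos else Error pos)

// ("a" or "ab") followed by "c".
let grammar = andThen (choice [ literal "a"; literal "ab" ]) (literal "c")

printfn "%A" (run grammar "abc") // Ok 3: "a" is tried first, fails on "c", then "ab" succeeds
printfn "%A" (run grammar "abd") // Error 2: both branches fail, the furthest reached position 2
```

For input "abc", listing `literal "ab"` first would succeed without any backtracking, which is the "right first try" ordering that is left up to the developer.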


DanielOliver commented on September 25, 2024

I can't promise anything on performance comparisons, but I'll ping you back here when I get better error reporting added in a few weeks.


7sharp9 commented on September 25, 2024

Cool thanks!

On Mon, 5 Feb 2018 at 16:55, Daniel ***@***.***> wrote: I can't promise anything on performance comparisons, but I'll ping you back here when I get better error reporting added in a few weeks.
