
zio-parser's Introduction

ZIO Parser

Library for constructing parsers and pretty printers based on invertible syntax descriptions


Introduction

Zymposium - ZIO Parser

Installation

Start by adding zio-parser as a dependency to your project:

libraryDependencies += "dev.zio" %% "zio-parser" % "0.1.9"

Getting Started

import zio.parser.*

Declare your parsing syntax:

val digitSyntax: Syntax[String, Char, Char, Char] = Syntax.digit

Parse your string:

val result: Either[StringParserError[String], Char] = digitSyntax.parseString("1")
// result: Either[StringParserError[String], Char] = Right(value = '1')

Pretty print the parsing errors:

println(digitSyntax.parseString("Hello").left.map(_.pretty).merge)
// Hello
// ^
// error: Failure at position 0: not a digit
//
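
Because a Syntax is an invertible description, the same value should also work as a printer. The following is a minimal sketch, assuming `printString` (which appears in one of the issue reports further down this page) returns an `Either` with the printed text on the right. The names `printed` and `roundTrip` are illustrative only.

```scala
import zio.parser.*

val digitSyntax: Syntax[String, Char, Char, Char] = Syntax.digit

// Print a value back to text using the same syntax that parses it.
val printed: Either[String, String] = digitSyntax.printString('7')

// Round trip: feed the printed text back into the parser.
val roundTrip: Option[Char] =
  printed.toOption.flatMap(text => digitSyntax.parseString(text).toOption)

println(printed)
println(roundTrip)
```

If the round trip does not return the original value, the syntax is not a faithful inverse of itself for that input.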

Documentation

Learn more on the ZIO Parser homepage!

Contributing

For the general guidelines, see the ZIO contributor's guide.

Code of Conduct

See the Code of Conduct

Support

Come chat with us on Discord.

License

License


zio-parser's Issues

Handling recursive syntax using `def` throws `StackOverflowError`

Consider parsing input like this:

FOO
[FOO]
[[FOO]]
[[[FOO]]]

In the above syntax, FOO can be wrapped in any number of pairs of square brackets.
The parser is implemented using the following code:

val foo = Syntax.string("FOO", "foo")

def wrappedN(foo: Syntax[String, Char, Char, String]): Syntax[String, Char, Char, String] = {
  val wrapped = foo.between(Syntax.char('['), Syntax.char(']'))
  wrapped | wrappedN(wrapped)
}

val syntax = wrappedN(foo)

val input = "[[FOO]]"

println(syntax.parseString(input)) // throws StackOverflowError

Failing Test: https://github.com/zio/zio-parser/pull/115/files

Update 1:
Using a lazy val instead of a def works properly.

lazy val syntax: Syntax[String, Char, Char, Unit] =
  (
    Syntax.string("FOO", ()) | syntax
  ).between(Syntax.char('['), Syntax.char(']'))
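
The `lazy val` above only accepts bracketed input, so a bare `FOO` (the first line of the sample input) is rejected. A possible variant that also accepts the unwrapped form is sketched below. It assumes, as the working example above suggests, that `|` evaluates its right-hand side lazily; the name `nested` is illustrative.

```scala
import zio.parser._

// Either a bare FOO, or the same syntax wrapped in one more pair of brackets.
// The self-reference relies on `nested` being a lazy val and on the
// right-hand side of `|` being evaluated lazily (an assumption).
lazy val nested: Syntax[String, Char, Char, Unit] =
  Syntax.string("FOO", ()) | nested.between(Syntax.char('['), Syntax.char(']'))

val inputs = List("FOO", "[FOO]", "[[FOO]]", "[[[FOO]]]")

inputs.foreach { in =>
  println(s"$in -> ${nested.parseString(in)}")
}
```

Unlike the `def` version, which builds a new, deeper syntax on every call, this definition refers back to a single value.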

Extremely slow parsing

As we discovered on Zymposium on the 22nd of April, this example:

import zio.Chunk
import zio.parser._
import zio.parser.internal.Debug

object Example extends App {


  case class ChessCoord(row: Char, col: Byte)
  case class ChessMove(from: ChessCoord, to: ChessCoord)

  val chessCoord =
    (Syntax.charIn("ABCDEFGH") ~ Syntax.digit.transform(
      _.toString.toByte,
      (byte: Byte) => byte.toString.head
    )).transform(
      { case (r, c) => ChessCoord(r, c) },
      (cc: ChessCoord) => (cc.row, cc.col)
    )

  val whitespaces = Syntax.whitespace.repeat.unit(Chunk(' '))

  val chessMove =
    (chessCoord ~ whitespaces ~ Syntax.string("->", ()) ~ whitespaces ~ chessCoord).transform(
      { case (from, to) => ChessMove(from, to) },
      (cm: ChessMove) => (cm.from, cm.to)
    )

  Debug.printParserTree(chessMove.asParser.optimized)

  val parsed = chessMove.parseString("""E1  -> A4""")
  val printed = chessMove.printString(parsed.toOption.get)

  println(parsed)
  println(printed)
}

runs very slowly.

Further optimize Regex

Write more benchmarks and see whether the Regex implementation can be further optimized, since that would speed up all string parsers.

Syntax.end should report where it failed

It would be nice to know the position where Syntax.end failed, and exactly what part of the input was left unconsumed.

I've found that repeat/repeatWithSep silently swallow errors. The only way to learn that an error occurred is to use Syntax.end, but it doesn't provide any information about the parsing error either.
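
The behaviour can be reproduced with a sketch along these lines. It assumes `repeat` stops at the first element that fails to parse and returns what it has collected so far, which is what the report implies. The names `digits` and `strict` are illustrative.

```scala
import zio.parser._

// One or more digits; parsing is expected to stop quietly at the first non-digit.
val digits = Syntax.digit.repeat

// The same syntax, additionally requiring that the whole input is consumed.
val strict = digits ~ Syntax.end

// The trailing 'a' is left unconsumed and nothing reports it.
println(digits.parseString("12a"))

// This one fails, but the error does not say where or what was left over.
println(strict.parseString("12a"))
```

Comparing the two results shows why a position and the unconsumed remainder would be useful in the error from `Syntax.end`.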

Reenable generic input/output types

The Syntax/Parser/Printer types are all designed to be generic in their input/output types, but currently the parser input is fixed to String, which fixes the input element type to Char. This is for performance reasons, but we could reintroduce this feature for non-performance-critical applications.

An operator (let's say >>>) can be introduced to plug Syntaxes with compatible input/outputs together.

Streaming input

Earlier versions of the parser supported streaming input, which was dropped in favor of using strings directly to maximize performance. We should keep direct string parsing for use cases that need high performance, but we could also reintroduce support for streaming input:

  • Parsing input to be a ZStream of characters
  • Syntax converted to sink/pipeline to inject parsing and printing into streams
  • Reintroduce the cut operator and the maxDistance modifier to control memory buffering required for backtracking on streams

Lucene query example fails in printer

Understand why the Lucene query example fails on pretty printing, and either fix the issue or improve Syntax so that such syntaxes cannot be created.
