gcanti / parser-ts Goto Github PK
View Code? Open in Web Editor NEWString parser combinators for TypeScript
Home Page: https://gcanti.github.io/parser-ts/
License: MIT License
String parser combinators for TypeScript
Home Page: https://gcanti.github.io/parser-ts/
License: MIT License
Apologies for opening yet another issue, but I continue to come across use cases uncovered by parser-ts
that might be beneficial to include in the main library as I work on a parser for the GraphQL query language.
Example
From the GraphQL specification:
EnumValue:
Name but nottrue
orfalse
ornull
where a valid Name
(in theory) could be the identifiers true
, false
, and null
.
Proposal
I propose that we add the two additional parsers defined below to the string
module. For examples, see the tests in the feature branch of my fork.
Let me know what you think!
/**
* Fails if the specified string is matched, otherwise succeeds with an empty result and
* consumes no input.
*
* @example
* import { run } from 'parser-ts/lib/code-frame'
* import * as S from 'parser-ts/lib/string'
*
* const parser = S.notString('foo')
*
* run(parser, 'bar')
* // { _tag: 'Right', right: '' }
*
* run(parser, 'foo')
* // { _tag: 'Left', left: '> 1 | foo\n | ^ Expected: not "foo"' }
* @category constructors
* @since 0.6.8
*/
export const notString: (s: string) => P.Parser<C.Char, string> = s => i => {
const _string: (s2: string) => P.Parser<C.Char, string> = s2 =>
pipe(
charAt(0, s2),
O.fold(
() => P.succeed(''),
c =>
pipe(
C.notChar(c),
P.chain(() => _string(s2.slice(1))),
P.chain(() => P.succeed(''))
)
)
)
return pipe(
P.expected(_string(s), `not ${JSON.stringify(s)}`)(i),
E.chain(next => success(next.value, i, i))
)
}
/**
* Fails if any of the specified strings are matched, otherwise succeeds with an empty result and
* consumes no input.
*
* @category constructors
* @since 0.6.8
*/
export function notOneOf<F extends URIS>(
F: Functor1<F> & Foldable1<F>
): (ss: Kind<F, string>) => P.Parser<C.Char, string>
export function notOneOf<F>(F: Functor<F> & Foldable<F>): (ss: HKT<F, string>) => P.Parser<C.Char, string>
export function notOneOf<F>(F: Functor<F> & Foldable<F>): (ss: HKT<F, string>) => P.Parser<C.Char, string> {
return ss => F.reduce(ss, P.succeed(''), (p, s) => fold([p, notString(s)]))
}
Hello.
I noticed that most of the examples uses the run function, that in turn requires babel code-frame. Can babel code frame be avoided? My parser is meant to run in the browser and this dependency is quite beefy (and I'm not even sure it can run properly)
I'd like to be able to raise a fatal error from a filter
combinator. Perhaps we could add some optional parameters for filter
which pass through options to error
.
Happy to contribute code if you accept the feature.
Running the string.doubleQuotedString
parser on long-ish strings causes it to hit the recursion limit. I tried getting around this by writing a regex to match such a string, but since streams of strings are arrays of characters, there is no way to match that regex to a string longer than a single character.
Is there some official workaround for this? Thanks!
@gcanti - If you are willing, I would like to have a design discussion about how the problem described below can potentially be solved using parser-ts
in the most efficient manner.
How can I create a Stream
in which I have access to the position of a given token (i.e. the line and column number) in the most efficient manner possible.
I would like to create a group of parsers that are aware of and can respond to the current indentation level, similar to how a Python parser needs to detect when the indentation level increases or decreases. Because the parser I am trying to create is context-sensitive, I am not sure the problem can be easily solved with parser-ts
. However, I can think of two approaches that would at least allow me access to the position of tokens within a Stream
.
The problem proposed above inherently lends itself to a Parser
with a string
source that parses a Stream
of Char
s. Therefore, when referring to parsers below I am actually referring to a parser of string characters:
import { Char } from 'parser-ts/Char'
type StringParser<A> = Parser<Char, A>
Within a Stream
, we have access to the current cursor
position which allows us to determine our current position within the buffer of unparsed character tokens. However, the cursor is unaware of the actual position of a character token within a given piece of source text.
Both solutions involve adding an additional piece of data to the Stream
generated for a given Parser
. In this case, the additional piece of data is the Pos
interface described below.
/**
* Represents the position of a token within the source text.
*/
export interface Pos {
readonly line: number
readonly col: number
}
Given a character and the previous position, the position of the character in question can be determined using the functions below:
/**
* Constructs a new `Pos` given a line number and a column number.
*/
export const pos = (line: number, col: number): Pos => ({
line,
col
})
/**
* The initial `Pos` within any source text.
*/
export const initialPos = pos(1, 1)
/**
* Updates the `Pos` given a character and the previous `Pos` in the source text.
*/
export const updatePosChar: (c: Char, pos: Pos) => Pos = (char, { line, col }) => {
switch (char) {
case '\n':
return pos(line + 1, 1)
case '\r':
return pos(line + 1, 1)
case '\t':
return pos(line, col + 8 - ((col - 1) % 8))
default:
return pos(line, col + 1)
}
}
The first solution that I came up with involves modifying the type of data stored in the buffer
of a Stream
.
import { Char } from 'parser-ts/Char'
import { Pos } from 'parser-ts/Pos'
import { Stream } from 'parser-ts/Stream'
/**
* Represents a character and its associated position with the source text.
*/
export interface PositionedChar {
readonly char: Char
readonly pos: Pos
}
/**
* Represents a stream of positioned characters.
*/
export type CharStream = Stream<PositionedChar>
The buffer of positioned characters must be created ahead of time by first splitting the source text into an array of character tokens, and then mapping the resulting array into an array of PositionedChar
s. The inefficiency of the solution follows from the two separate passes through the stream of character tokens.
The second solution involves modification of the source code of parser-ts
to include the Pos
module as well as a new CharStream
module given how common it is to use parser-ts
on a stream of character tokens.
The proposed code for the CharStream
module can be found below.
import { lookup } from 'fp-ts/lib/Array'
import { map, Option } from 'fp-ts/lib/Option'
import { pipe } from 'fp-ts/lib/function'
import { Char } from './char'
import { Pos, updatePosChar } from './Pos'
export interface CharStream {
readonly buffer: Array<Char>
readonly pos: Pos
readonly cursor: number
}
export const charStream: (buffer: Array<Char>, pos: Pos, cursor?: number) => CharStream = (
buffer,
pos,
cursor = 0
) => ({
buffer,
pos,
cursor
})
export const get: (s: CharStream) => Option<Char> = s => lookup(s.cursor, s.buffer)
export const atEnd: (s: CharStream) => boolean = s => s.cursor >= s.buffer.length
export const getAndNext: (s: CharStream) => Option<{ readonly value: Char; readonly next: CharStream }> = s =>
pipe(
get(s),
map(a => ({ value: a, next: charStream(s.buffer, updatePosChar(a, s.pos), s.cursor + 1) }))
)
Please let me know if you think that this type of context-sensitive parsing is out of scope for parser-ts
. I am very curious to hear your thoughts!
Do you think this combinator would fit?
export function alts<A>(...ps: Parser<A>[]): Parser<A> {
return ps.reduce((p1, p2) => p1.alt(p2), p.fail)
}
I find it improves readability over chaining .alt
s.
NB: Would it be more efficient to use reduceRight and swap the order of the parsers to alt
?
Currently there is no hexadecimal digit combinator available in parser-ts/char
.
The "core" library export a hexDigit
combinator since it's quite common e.g. Haskell and Purescript
const isHexDigit: (c: Char) => boolean = (c) =>
"0123456789abcdef".indexOf(c.toLowerCase()) !== -1;
const hexDigit: P.Parser<Char, Char> = P.expected(P.sat(isHexDigit), "a hex digit");
All users, regardless of skill would benefit from this. Not dependant on TS version.
It's easy enough to build this yourself and put into your own codebase. I've done so while attempting to port over my own uuid
and mac address
parsers written in Purescript.
Software | Version(s) |
---|---|
fp-ts | 2.16.1 |
parser-ts | 0.7.0 |
TypeScript | 5.2.2 |
for a parser, the parser should match at least once. If it does, it will return success with a combination of all immediate reoccurring matches.
I think this is parser.many1
but uses a Semigroup
as the main thing.
I found it handy in what I'm doing, might be good for someone else.
/** Repeats a value until it's no longer matching. must match at least once or it fails */
export function repeat<A>(
semigroup: SE.Semigroup<A>,
): Endomorphism<parser.Parser<string, A>> {
return (fa) =>
pipe(
fa,
parser.chain((a) =>
parser.ChainRec.chainRec(a, (a) =>
pipe(
fa,
parser.map((b) => semigroup.concat(a, b)),
parser.map(E.left),
parser.alt(() => parser.succeed(E.right(a))),
),
),
),
parser.alt(() => parser.fail()),
)
}
Suppose we have get
function of type:
const get: (s: string[]) => Option<string>
Then we compose a parser like:
pipe(
P.chain(() => P.many1(C.alphanum)),
P.map(get),
// fail if we get None, else extract value from Option monad,
// so the expected type of the pipe is Parser<string, string> and not Parser<string, Option<string>>
)
How am I am supposed to do things specified in the comment?
I have reached for the following utility parsers extensively in projects that use parser-ts
. I was wondering if inclusion in the main library would be of interest. Any feedback on the implementation or function signatures is welcome!
Inspiration taken from
import * as E from 'fp-ts/lib/Either'
import { Parser } from 'parser-ts/lib/Parser'
import { success } from 'parser-ts/lib/ParseResult'
/**
* Takes a `Parser` and tries to match it without consuming any input.
*
* @example
* import * as P from 'parser-ts/lib/Parser'
* import * as S from 'parser-ts/lib/string'
*
* const parser = S.fold([
*. S.string('hello '),
* P.lookAhead(S.string('world')),
* S.string('wor')
*. ])
*
* run(parser, 'hello world')
* // { _tag: 'Right', right: 'hello worldwor' }
*/
export const lookAhead = <I, A>(p: Parser<I, A>): Parser<I, A> => (i) =>
E.either.chain(p(i), (next) => success(next.value, i, i))
Inspiration taken from
import { not, Predicate } from 'fp-ts/lib/function'
import { many, sat, Parser } from 'parser-ts/lib/Parser'
/**
* Takes a `Predicate` and continues parsing until the given `Predicate` is satisfied.
*
* @example
* import * as P from 'parser-ts/lib/Parser'
* import * as C from 'parser-ts/lib/char'
*
* const parser = P.takeUntil((c: C.Char) => c === 'w')
*
* run(parser, 'hello world')
* // { _tag: 'Right', right: [ 'h', 'e', 'l', 'l', 'o', ' ' ] }
*/
export const takeUntil = <I>(predicate: Predicate<I>): Parser<I, I[]> =>
many(sat(not(predicate)))
As a parser-ts
user, I will need to import modules like parser-ts/lib/Parser
or parser-ts/es6/Parser
.
Ideally, it would be great to just import the module directly without prefix like parser-ts/Parser
that fp-ts
already shipped in 2.8.0.
Maybe we can follow what fp-ts
has done in gcanti/fp-ts#1241 to fix this.
parser-ts
users
Software | Version(s) |
---|---|
fp-ts | 2.8.0 |
parser-ts | 0.6.3 |
TypeScript | 3.9.2 |
(I also filed another similar issue for fp-ts-contrib
at gcanti/fp-ts-contrib#67)
Take a look at the following test (which currently fails):
const fatalParser: P.Parser<Char, Char> = i => error(i, ['expected'], true)
const parser5 = P.either(C.char('a'), () => fatalParser)
assert.deepStrictEqual(S.run('c')(parser5), error(stream(['c']), ['"a"', 'expected'], true))
I believe that the semigroup of ParserResult should use getLastSemigroup
on the fatal property, and not getFirstSemigroup
. I guess it is kind of a philosophical question as one can argue that I'm mistaken....
@gcanti thoughts?
If you think this should indeed be fixed, I have a PR ready to go
import { pipe } from "fp-ts/lib/function"
import { char, digit, oneOf } from "parser-ts/lib/char"
import { run } from "parser-ts/lib/code-frame"
import {
between,
bind,
bindTo,
many,
optional,
Parser,
} from "parser-ts/lib/Parser"
const opP = oneOf("+*")
const exprP = pipe(
digit,
bindTo("digitLeft"),
bind("op", () => opP),
bind("digitRight", () => digit)
)
const maybeWithParenths = <T extends unknown>(p: Parser<string, T>) =>
optional(between(char("("), char(")"))(p))
const mainP = many(maybeWithParenths(exprP))
const result = run(mainP, "(2+2)(3*3)")
console.log(result)
Am I doing something wrong or is this a bug?
Firstly, great work on this lib and fp-ts
in general. Definitely has changed the way I think about writing software big time. Secondly, why I'm here, hoping you can help me use parser-ts
more easily/efficiently for binary messages in a Node environment.
Any help or advice greatly appreciated!
Currently it's not easy to use a Stream<A>
with a buffer of the Node Buffer
type. I'm able to make it work by casting to an Array<number>
on construction, which works fine since the lib code is only doing lookups, length checks and slices I believe, all of which are supported by Buffer
and/or its parent interface Uint8Array
.
It'd be nice to support the Buffer
type also. It'd also be nice to have getMany
/ getManyAndNext
functions for extracting a slice of a buffer. For example in the wire protocol I'm looking at (PostgreSQL) some messages have 32 bit big endian length prefixes which indicate the length of the buffer following that pertain to that message.
I've been able to get some stuff working with a bit of casting here and there. Here's a test program to demonstrate:
import { pipe } from 'fp-ts/lib/function';
import * as O from 'fp-ts/lib/Option';
import * as E from 'fp-ts/lib/Either';
import * as P from 'parser-ts/Parser';
import * as PR from 'parser-ts/ParseResult';
import * as S from 'parser-ts/Stream';
type Byte = number;
const stream: (buf: Buffer, cursor?: number) => S.Stream<Byte> = (
buf,
cursor
) => S.stream(buf as unknown as Array<number>, cursor); // Buffer behaves enough like an Array for use by parser-ts - that is, supports: slice, at/[], length
function buffer(i: Buffer): P.Parser<Byte, Buffer>;
function buffer(i: string, enc?: BufferEncoding): P.Parser<Byte, Buffer>;
function buffer(
i: string | Buffer,
enc?: BufferEncoding
): P.Parser<Byte, Buffer> {
let buf: Buffer;
if (typeof i === 'string') {
buf = Buffer.from(i, enc);
} else {
buf = i;
}
return P.expected(
P.ChainRec.chainRec<Byte, Buffer, Buffer>(buf, (acc) =>
pipe(
O.fromNullable(acc.at(0)),
O.fold(
() => P.of(E.right(buf)),
(c) =>
pipe(
P.sat((b: Byte) => b === c),
P.chain(() => P.of(E.left(acc.slice(1))))
)
)
)
),
JSON.stringify(buf)
);
}
const getManyAndNext: <A>(
i: S.Stream<A>,
count: number
) => O.Option<{
value: A[];
next: S.Stream<A>;
}> = (i, count) => {
const endIndex = count + i.cursor;
if (endIndex <= i.buffer.length) {
return O.some({
value: i.buffer.slice(i.cursor, endIndex), // our buffer using Buffer actually outputs a Buffer here, not an Array<number>
next: S.stream(i.buffer, endIndex)
});
}
return O.none;
};
const items: <A>(count: number) => P.Parser<A, Array<A>> =
(count: number) => (i) =>
pipe(
getManyAndNext(i, count),
O.fold(
() => PR.error(i),
({ value, next }) => PR.success(value, next, i)
)
);
const uint32BE = pipe(
items<number>(4),
P.map((buf) => Buffer.from(buf).readUInt32BE()) // not ideal - creating a buf here is not necessary
// P.map((buf) => (buf as unknown as Buffer).readUInt32BE()) // this is more efficient/ works too but relies on casting
);
const bufParser = pipe(
buffer('x'),
P.chain(() => uint32BE),
P.chain((n) => items(n))
);
const buf = Buffer.allocUnsafe(11);
buf.write('x');
buf.writeUint32BE(5, 1);
buf.writeUint8(1, 5);
buf.writeUint8(2, 6);
buf.writeUint8(3, 7);
buf.writeUint8(4, 8);
buf.writeUint8(5, 9);
buf.writeUint8(6, 10);
console.log(bufParser(stream(buf)));
// Output:
// {
// _tag: 'Right',
// right: {
// value: <Buffer 01 02 03 04 05>,
// next: { buffer: <Buffer 78 00 00 00 05 01 02 03 04 05 06>, cursor: 10 },
// start: { buffer: <Buffer 78 00 00 00 05 01 02 03 04 05 06>, cursor: 0 }
// }
// }
This is for being able to create a parser that deals easily and efficiently with binary content.
Software | Version(s) |
---|---|
fp-ts | 2.11.9 |
parser-ts | 0.6.16 |
TypeScript | 4.5.2 |
const keyword = C.many1(C.notSpace);
const whitespace = S.spaces;
const lang = pipe(
keyword,
P.alt(() => whitespace)
);
const input = `hello world mickey mouse`;
const res = run(P.many(lang), input);
expectation: ['hello', ' ', 'word', ' ', 'mickey', ' ' ,'mouse']
actual: runs forever...
Is it possible to augment the API so that await
can be used for the chain
function argument?
Consider this as an example:
const space_padded = <A>(f: Parser<A>) =>
pstr.spaces.chain(_ =>
f.chain(a =>
pstr.spaces.chain(_ => p.of(a))))
Then this could be written with async
/await
instead, making it look like Haskell do-notation:
const space_padded = async <A>(f: Parser<A>) => {
const _ = await pstr.spaces.chain
const a = await f.chain
const _ = await pstr.spaces.chain
return a
}
Perhaps Parser
could extend Thenable
or Promise
to be able to skip chain
altogether:
const space_padded = async <A>(f: Parser<A>) => {
const _ = await pstr.spaces
const a = await f
const _ = await pstr.spaces
return a
}
Would this be possible?
I may be missing something fairly obvious, but I continue to struggle with using parser-ts
to compose parsers to build up a structured abstract syntax tree.
My suggestion would be to collaborate here to provide a simple example that can then be displayed in an examples
directory, similar to other fp-ts
ecosystem libraries.
The example I have been working on involves parsing simple command line statements. It should be capable of handling a command, positional arguments, and flags/named arguments, in that order. An example statement might look like:
hello ./foo -b --bar=baz
I would like to parse this into the following structure:
{
command: 'hello',
args: {
named: {
bar: "baz",
},
positional: {
0: "./foo",
}
},
flags: {
b: true
}
}
Here is what I have so far:
import { pipe } from 'fp-ts/lib/function'
import * as C from 'parser-ts/lib/char'
import * as S from 'parser-ts/lib/string'
import * as P from 'parser-ts/lib/Parser'
// -------------------------------------------------------------------------------------
// models
// -------------------------------------------------------------------------------------
export type Statement = Command | Argument
export interface Command {
_tag: 'Command'
value: string
}
export type Argument = Flag | Named | Positional
export interface Flag {
_tag: 'Flag'
value: string
}
export interface Named {
_tag: 'Named'
name: string
value: string
}
export interface Positional {
_tag: 'Positional'
value: string
}
// -------------------------------------------------------------------------------------
// constructors
// -------------------------------------------------------------------------------------
export const Command = (value: string): Command => ({ _tag: 'Command', value })
export const Flag = (value: string): Flag => ({ _tag: 'Flag', value })
export const Named = (name: string, value: string): Named => ({ _tag: 'Named', name, value })
export const Positional = (value: string): Positional => ({ _tag: 'Positional', value })
// -------------------------------------------------------------------------------------
// parsers
// -------------------------------------------------------------------------------------
const whitespaceSurrounded = P.surroundedBy(S.spaces)
const dash = C.char('-')
const doubleDash = S.string('--')
const equals = C.char('=')
const identifier = C.many1(C.alphanum)
const command = (cmd: string): P.Parser<string, Command> => pipe(S.string(cmd), P.map(Command))
const flag: P.Parser<string, Flag> = pipe(
dash,
P.chain(() => identifier),
P.map(Flag)
)
const named: P.Parser<string, Named> = pipe(
doubleDash,
P.chain(() => P.sepBy1(equals, identifier)),
P.map(([name, value]) => Named(name, value))
)
const positional: P.Parser<string, Positional> = pipe(C.many1(C.notSpace), P.map(Positional))
const argument = P.either<string, Argument>(flag, () =>
P.either<string, Named | Positional>(named, () => positional)
)
Ideally, these parsers should be able to be composed to create the abstract syntax tree, but I am having trouble finding a clean solution.
One idea I had was to derive a Monoid
instance for an array of Statement
s and then further derive a Monoid
instance for a Parser
of arrays of Statement
s. Then the array of parsed statements can be used to build the AST.
import * as A from 'fp-ts/lib/Array'
import { fold } from 'fp-ts/lib/Monoid'
import { run } from 'parser-ts/lib/code-frame'
const monoidStatements = A.getMonoid<Statement>()
const monoidStatementsParser = P.getMonoid<string, Array<Statement>>(monoidStatements)
const statement = (cmd: string) =>
fold(monoidStatementsParser)([
pipe(command(cmd), P.map(A.of)),
P.many(whitespaceSurrounded(argument))
])
run(statement('hello'), 'hello ./foo -b --bar=baz')
// {
// _tag: 'Right',
// right: [
// { _tag: 'Command', value: 'hello' },
// { _tag: 'Positional', value: './foo' },
// { _tag: 'Flag', value: 'b' },
// { _tag: 'Named', name: 'bar', value: 'baz' }
// ]
// }
However, this solution seems somewhat awkward to me and I am not sure it is the best approach.
Any guidance would be much appreciated, and I think the library would benefit from having a simple yet robust example for people to check out!
Would you think it's a good idea to add these 2 combinators into Parser
?
It might be useful for parsing things like (for example) phone number in format (123)-456-789
.
In TS, it might look like
export function between<I, A>(begin: Parser<I, A>, end: Parser<I, A>, p: Parser<I, A>): Parser<I, A> {
return Do(parser)
.do(begin)
.bind('content', p)
.do(end)
.return(({ content }) => content);
}
export function surroundedBy<I, A>(bound: Parser<I, A>, p: Parser<I, A>): Parser<I, A> {
return between(bound, bound, p);
}
For example I have multi line string with 3 types of lines:
How to skip empty lines and comment lines?
The comment claims that it parses any whitespace character, but the code only parses space:
Lines 51 to 56 in 01e0b93
The comments in lib/string
is also off in the same manner.
The manyTill
and many1Till
combinators from purescript-string-parsers
allow for writing some interesting parsers.
-- | Parse values until a terminator.
manyTill :: forall a end. Parser a -> Parser end -> Parser (List a)
-- | Parse values until the terminator matches, requiring at least one match.
many1Till :: forall a end. Parser a -> Parser end -> Parser (NonEmptyList a)
For example, if we were writing an HTML parser and wanted to define a parser that could parse HTML comments, per the spec we would have to adhere to the following:
12.1.6 Comments
Comments must have the following format:
- The string "<!--".
- Optionally, text, with the additional restriction that the text must not start with the string ">", nor start with the string "->", nor contain the strings "<!--", "-->", or "--!>", nor end with the string "<!-".
- The string "-->".
Note: The text is allowed to end with the string "<!", as in "<!--My favorite operators are > and ".
My initial thought was to utilize the new filter
combinator that we added in #30. This would work for rejecting invalid characters or strings encountered within the comment. However, because any other character is valid we would not have a reliable way of telling the parser to stop after the "-->" sequence.
Under the hood, purescript-string-parsers
uses the MonadRec
type class to handle recursively iterating through the stream and accumulating values until the terminator parser is encountered.
I implemented a similar solution using the ChainRec
type class from fp-ts
.
import { tailRec } from 'fp-ts/ChainRec'
import * as E from 'fp-ts/Either'
import { error, success, ParseResult, ParseSuccess } from 'parser-ts/ParseResult'
import * as P from 'parser-ts/Parser'
import { Stream } from 'parser-ts/Stream'
interface Next<I, A> {
readonly value: A
readonly stream: Stream<I>
}
const chainRec = <I, A, B>(a: A, f: (a: A) => P.Parser<I, E.Either<A, B>>): P.Parser<I, B> => {
const split = (result: ParseSuccess<I, E.Either<A, B>>): E.Either<Next<I, A>, ParseResult<I, B>> =>
E.isLeft(result.value)
? E.left({ value: result.value.left, stream: result.next })
: E.right(success(result.value.right, result.next, result.start))
return (i) => tailRec({ value: a, stream: i }, (state) => {
const result = f(state.value)(state.stream)
if (E.isLeft(result)) {
return E.right(error(state.stream))
}
return split(result.right)
})
}
This allows us to define manyTill
and many1Till
in terms of chainRec
.
import * as RA from 'fp-ts/ReadonlyArray'
import * as RNEA from 'fp-ts/ReadonlyNonEmptyArray'
export const manyTill = <I, A, B>(
parser: P.Parser<I, A>,
terminator: P.Parser<I, B>
): P.Parser<I, ReadonlyArray<A>> =>
pipe(
terminator,
P.map<B, ReadonlyArray<A>>(() => RA.empty),
P.alt<I, ReadonlyArray<A>>(() => many1Till(parser, terminator))
)
export const many1Till = <I, A, B>(
parser: P.Parser<I, A>,
terminator: P.Parser<I, B>
): P.Parser<I, RNEA.ReadonlyNonEmptyArray<A>> => {
return pipe(
parser,
P.chain((x) =>
chainRec(RNEA.of(x), (acc) =>
pipe(
terminator,
P.map(() => E.right(RNEA.reverse(acc))),
P.alt(() =>
pipe(
parser,
P.map((a) => E.left(RNEA.cons(a, acc)))
)
)
)
)
)
)
}
@gcanti - I am curious what you think of my approach above. I know that ChainRec
is marked for removal in fp-ts@v3
but it worked nicely for this use case. What would your thoughts be on adding these to the main library? Do you think these can be improved/cleaned up further?
I am trying to build ECMAScript modules (ESM) that is supported by TypeScript nightly build.
https://devblogs.microsoft.com/typescript/announcing-typescript-4-5-beta/
tsc
fails in *.mts (new source file extension for ESM) that imports parser-ts
module.
Error log:
node_modules/parser-ts/lib/Parser.d.ts(7,27): error TS2307: Cannot find module 'fp-ts/ChainRec' or its corresponding type declarations.
node_modules/parser-ts/lib/Parser.d.ts(410,16): error TS2665: Invalid module name in augmentation. Module 'fp-ts/lib/HKT' resolves to an untyped module at 'node_modules/fp-ts/lib/HKT/index.js', which cannot be augmented.
node_modules/parser-ts/lib/Parser.d.ts(429,40): error TS2344: Type '"Parser"' does not satisfy the constraint 'keyof URItoKind2<any, any>'.
node_modules/parser-ts/lib/Parser.d.ts(434,48): error TS2344: Type '"Parser"' does not satisfy the constraint 'keyof URItoKind2<any, any>'.
node_modules/parser-ts/lib/Parser.d.ts(439,36): error TS2344: Type '"Parser"' does not satisfy the constraint 'keyof URItoKind2<any, any>'.
node_modules/parser-ts/lib/Parser.d.ts(449,32): error TS2344: Type '"Parser"' does not satisfy the constraint 'keyof URItoKind2<any, any>'.
node_modules/parser-ts/lib/Parser.d.ts(454,48): error TS2344: Type '"Parser"' does not satisfy the constraint 'keyof URItoKind2<any, any>'.
node_modules/parser-ts/lib/Parser.d.ts(459,37): error TS2344: Type '"Parser"' does not satisfy the constraint 'keyof URItoKind2<any, any>'.
node_modules/parser-ts/lib/Parser.d.ts(459,57): error TS2344: Type '"Parser"' does not satisfy the constraint 'keyof URItoKind2<any, any>'.
node_modules/parser-ts/lib/string.d.ts(6,33): error TS7016: Could not find a declaration file for module 'fp-ts/lib/HKT'. 'node_modules/fp-ts/lib/HKT/index.js' implicitly has an 'any' type.
If the 'fp-ts' package actually exposes this module, try adding a new declaration (.d.ts) file containing `declare module 'fp-ts/lib/HKT';`
tsc
succeeds.
main.mts
import { parser as P, string as S } from 'parser-ts';
console.log(P, S);
tsconfig.json
{
"compilerOptions": {
"strict": true,
"module": "nodenext",
}
}
package.json
{
"name": "test_esm",
"type": "module",
"devDependencies": {
"typescript": "^4.6.0-dev.20211113"
},
"dependencies": {
"fp-ts": "^2.11.5",
"parser-ts": "^0.6.16"
},
"scripts": {
"build": "tsc"
}
}
This issue seems to be fixable by change the import path.
Which versions of parser-ts are affected by this issue? Did this work in previous versions of parser-ts?
Software | Version(s) |
---|---|
fp-ts | 2.11.5 |
parser-ts | 0.6.16 |
TypeScript | next |
I would like to use parser-ts
to parse PromQL of which formal definition may be found below,
You may notice that the BinaryExpr
will recursively check Expr
where BinaryExpr
is also listed with a high priority.
But with the following code,
export const BinaryExprParser: P.Parser<C.Char, BinaryExpr> = pipe(
ExprParser,
P.bindTo('l'),
P.apFirst(S.spaces),
P.bind('op', () => binaryOpParser),
P.apFirst(S.spaces),
P.bind('r', () => ExprParser),
P.map(a => ({ _tag: 'binary_expr', left: a.l, right: a.r, op: a.op })),
);
we will encounter "Maximum call stack size exceeded error".
As is suggested from https://stackoverflow.com/questions/29158248/parser-combinators-and-left-recursion, we may "memorize" the (position, parser) tuple to avoid this issue. So is that possible to do this with parser-ts
?
Thanks for this neat library! For fun I'm attempting to port Pandoc's markdown parser to parser-ts
, but I'm noticing that the library really struggles with longer strings. Each of the following examples results in an Uncaught RangeError: Maximum call stack size exceeded
error. The limit is even less if I start combining parsers.
run(S.many(C.space), " ".repeat(1000));
run(S.many(C.alphanum), "x".repeat(1000));
run(S.many(C.digit), "9".repeat(1000));
run(S.many(C.char("x")), "x".repeat(1000));
Do you have any tips for working with longer strings? Is this a fundamental limitation of this approach, or is there some hope of working around it by modifying the recursion primitives?
(related to #41)
$ npm list | grep parser-ts
1:markdown-parser-ts@ /home/benjamin/projects/markdown-parser
20:โโโ [email protected]
Thanks!
The run
function is handy for users writing their own tests, but it is not exposed to users consuming the module.
Would you accept a PR expose it?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.