Parse BNF grammar definitions
License: MIT License
There is a lot of duplication in the documentation between README.md and lib.rs
I can see two problems with this:
Maybe these problems don't outweigh some of the benefits? I am not sure! But usually reducing duplication is worth a discussion.
The documentation could be simplified down into one file via:
#![doc = include_str!("README.md")]
This would give both README.md and lib.rs the same content. It would also run the README.md code snippets as tests!
only README.md has:
only lib.rs has:
So considering the above, the only question I wanted to answer was how the badges look in rustdoc:
They seem fine!
So what do you think @shnewto ?
One good action enabling release drafting already exists. We don't have to use it, though; I just found it while searching around.
Unsure if there's any desire for them at the moment, but while doing some reading, I realized we aren't explicitly supporting nullable productions, i.e. non-terminals that can represent zero terminals...
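For reference, deciding which nonterminals are nullable can be sketched as a fixpoint computation over the rules. This is a self-contained illustration with a hypothetical rule representation (a map from nonterminal name to alternatives), not the bnf crate's API:

```rust
use std::collections::{HashMap, HashSet};

// A nonterminal is nullable if some alternative is empty, or consists only
// of symbols that are themselves already known to be nullable. Iterate
// until no new nonterminals are added (a fixpoint).
fn nullable_set(rules: &HashMap<&str, Vec<Vec<&str>>>) -> HashSet<String> {
    let mut nullable: HashSet<String> = HashSet::new();
    loop {
        let mut changed = false;
        for (nt, alts) in rules {
            if nullable.contains(*nt) {
                continue;
            }
            let is_nullable = alts
                .iter()
                .any(|alt| alt.iter().all(|sym| nullable.contains(*sym)));
            if is_nullable {
                nullable.insert(nt.to_string());
                changed = true;
            }
        }
        if !changed {
            break;
        }
    }
    nullable
}

fn main() {
    let mut rules: HashMap<&str, Vec<Vec<&str>>> = HashMap::new();
    rules.insert("a", vec![vec![]]); // <a> ::= ""
    rules.insert("b", vec![vec!["a", "a"]]); // <b> ::= <a> <a>
    rules.insert("c", vec![vec!["x"]]); // <c> ::= 'x' (terminal, never nullable)
    let nullable = nullable_set(&rules);
    assert!(nullable.contains("a") && nullable.contains("b"));
    assert!(!nullable.contains("c"));
}
```

Note that nullability can be indirect, as with b above, which is why a single pass is not enough.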
Further reading:
https://en.wikipedia.org/wiki/Chomsky_normal_form
https://web.stanford.edu/class/archive/cs/cs143/cs143.1156/handouts/parsing.pdf
Is your feature request related to a problem? Please describe.
There's no reason to include an entire BNF parser in the compiled application if I just have a fixed grammar that is known at compile time. However, this is what happens when I use the bnf crate in my application: the grammar must be parsed and validated at runtime.
Describe the solution you'd like
Allow BNF syntax to be parsed at compile-time. Make the required functions const (FromStr::from_str is not const)
Describe alternatives you've considered
N/A
Additional context
N/A
At the moment, identifying a nonterminal rule silently disallows whitespace, i.e. my name in <my name> will parse as a nonterminal identified as my. This is certainly not ideal. I do wonder whether it's reasonable to disallow whitespace by raising an error in cases like the above, or if we should just handle and allow it. Probably we need to review the BNF standard in this case (and in general) to ensure compliance.
Frequently while using this crate terms, expressions, productions, and grammars are built from their raw type parts:
let t1: Term = Term::Terminal(String::from("terminal"));
let nt1: Term = Term::Nonterminal(String::from("nonterminal"));
let e1: Expression = Expression::from_parts(vec![nt1, t1]);
Productions and Grammars require even more work. It would be convenient and powerful if instead the types could be constructed using the BNF syntax:
let term = Term::from_str("<nonterminal>").unwrap();
let expression = Expression::from_str("<nonterminal> \"terminal\"").unwrap();
let production = Production::from_str("<nonterminal> ::= <nonterminal> \"terminal\" | \"terminal\"").unwrap();
This could be achieved by implementing the FromStr trait for each type and leveraging existing parsers. Adding this to Grammar as well would remove the need for the bnf::parse function.
There is however a difficult wrinkle to this problem:
A grammar can differentiate between the two by inspecting the full set of productions, but terms in productions and expressions are essentially ambiguous. Maybe some solution brainstorming will help start the discussion:
Hopefully @Snewt will have some perspective to resolve this blocker and enable this convenient functionality.
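To make the wrinkle concrete, here is a toy sketch of what a FromStr impl for Term could look like; the Term mirror, error type, and quoting rules here are assumptions for illustration, not the crate's actual behavior:

```rust
use std::str::FromStr;

// Toy mirror of bnf's Term, for illustration only.
#[derive(Debug, PartialEq)]
enum Term {
    Terminal(String),
    Nonterminal(String),
}

#[derive(Debug, PartialEq)]
struct ParseTermError;

impl FromStr for Term {
    type Err = ParseTermError;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let s = s.trim();
        if let Some(inner) = s.strip_prefix('<').and_then(|r| r.strip_suffix('>')) {
            Ok(Term::Nonterminal(inner.to_string()))
        } else if let Some(inner) = s.strip_prefix('"').and_then(|r| r.strip_suffix('"')) {
            Ok(Term::Terminal(inner.to_string()))
        } else {
            // the wrinkle: a bare word carries no marker saying which variant it is
            Err(ParseTermError)
        }
    }
}

fn main() {
    assert_eq!(
        "<nonterminal>".parse::<Term>(),
        Ok(Term::Nonterminal("nonterminal".to_string()))
    );
    assert_eq!(
        "\"terminal\"".parse::<Term>(),
        Ok(Term::Terminal("terminal".to_string()))
    );
    assert!("bare".parse::<Term>().is_err());
}
```

A bare word has neither quotes nor angle brackets to disambiguate terminal from nonterminal, which is exactly the ambiguity discussed above.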
While working on nullable rules (#113), profiling showed potential for performance improvement. ProductionMatching owns matching productions with terms. Currently, it is implemented as:
pub(crate) struct ProductionMatching<'gram> {
    pub prod_id: ProductionId,
    pub lhs: &'gram crate::Term,
    /// "right hand side" [`TermMatching`]s which are partitioned by the matched and unmatched.
    /// For example: [Matched, Matched, Matched, Unmatched, Unmatched]
    rhs: Vec<TermMatching<'gram>>,
    /// The progress cursor used to separate [`TermMatching`]s in the "right hand side"
    matched_count: usize,
}
The rhs vector requires a fair amount of dynamic memory allocation and freeing while parsing (current location where the vector is cloned). There may be a better way to model matching terms. If you pick this up, do not take this brainstorm as "the only way"; it is just one of many possibilities.
Instead of cloning the matching vector, matching could be modeled as a trie/stemming tree, with each node a Term. E.g. <word> ::= 'B' 'A' 'D' | 'B' 'A' 'T':
graph TD
S[start matching prod] --> B
B --> A
A --> T
A --> D
Then each ProductionMatching references a single node in this matching tree.
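A minimal sketch of the trie idea, using chars as stand-ins for Terms (all names here are hypothetical, not the crate's types):

```rust
use std::collections::HashMap;

// Each node is one matched symbol; shared prefixes like 'B' 'A' in
// "BAD" | "BAT" are stored once instead of cloned per matching state.
#[derive(Default)]
struct TrieNode {
    children: HashMap<char, TrieNode>,
    terminal: bool, // a complete right-hand side ends at this node
}

impl TrieNode {
    fn insert(&mut self, rhs: &str) {
        let mut node = self;
        for c in rhs.chars() {
            node = node.children.entry(c).or_default();
        }
        node.terminal = true;
    }

    fn contains(&self, rhs: &str) -> bool {
        let mut node = self;
        for c in rhs.chars() {
            match node.children.get(&c) {
                Some(next) => node = next,
                None => return false,
            }
        }
        node.terminal
    }
}

fn main() {
    // <word> ::= 'B' 'A' 'D' | 'B' 'A' 'T'
    let mut root = TrieNode::default();
    root.insert("BAD");
    root.insert("BAT");
    assert!(root.contains("BAD"));
    assert!(root.contains("BAT"));
    assert!(!root.contains("BA")); // partial match, not a full right-hand side
}
```

A matching state would then be a reference into this tree plus a cursor, avoiding per-clone allocation of the rhs vector.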
To check results, the criterion benchmarking can be used:
cargo criterion
Tracing can be used to investigate:
CARGO_PROFILE_BENCH_DEBUG=true cargo flamegraph --bench bnf -- --bench
Iterator::size_hint helps consumers by providing a hint of the iterator's size. This benefits patterns like collecting an iterator into a vector (iter.collect::<Vec<_>>()) because the collection can reserve the correct capacity without re-allocations.
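For reference, a slice-backed iterator can report an exact size_hint, as in this hypothetical wrapper (not the crate's actual iterator types):

```rust
// A minimal slice-backed iterator. Because the remaining length is always
// known exactly, size_hint can return matching lower and upper bounds.
struct TermIter<'a> {
    slice: &'a [String],
}

impl<'a> Iterator for TermIter<'a> {
    type Item = &'a String;

    fn next(&mut self) -> Option<Self::Item> {
        let (first, rest) = self.slice.split_first()?;
        self.slice = rest;
        Some(first)
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        // exact hint: lower bound == upper bound == remaining slice length
        (self.slice.len(), Some(self.slice.len()))
    }
}

fn main() {
    let terms = vec!["a".to_string(), "b".to_string()];
    let iter = TermIter { slice: &terms };
    assert_eq!(iter.size_hint(), (2, Some(2)));
    let collected: Vec<_> = iter.collect(); // can reserve capacity 2 up front
    assert_eq!(collected.len(), 2);
}
```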
BNF's iterators currently hold slices internally, so size_hint should be quite simple to implement (iterator impl search link). There are 3 (or 4 once #92 is merged) Iterators to update:
Plenty of BNF examples, including the Wikipedia example, do not require productions to terminate with a semicolon, and instead only require a newline. I believe the parser could be updated to make the trailing semicolon optional, which would support both formats.
Describe the bug
BNF grammar should be independent of the order of the rules as long as it is otherwise well-formed.
To Reproduce
Take the dna_left_recursive test and reverse the order of the rules. The BNF parser can no longer successfully parse the input string.
#[test]
fn dna_left_recursive() {
    let grammar: Grammar = "<base> ::= 'A' | 'C' | 'G' | 'T'
    <dna> ::= <base> | <dna> <base>"
        .parse()
        .unwrap();
    let input = "GATTACA";
    let parses: Vec<_> = grammar.parse_input(input).map(|a| a.to_string()).collect();
    assert_snapshot!(parses.join("\n"));
}
If #33 is addressed and bnf becomes a real parser generator, it could also become self-hosting, which would be pretty fun.
What would you think of extracting the test-cases to separate files and then re-including them using std::include_str!("path/to/file")?
Potential benefits might include
This'd come at the cost of separating the text of the test-cases from the expected result.
The current implementation of fmt::Display is overly simple, since Terms may be parsed via 'single-quoted' or "double-quoted" strings.
#[test]
fn quote_term_to_and_from_str() {
    let quote = Term::Terminal(String::from("\""));
    let to_string = quote.to_string(); // -> \"\"\" , uh oh, that won't do
    let from_string = Term::from_str(&to_string);
    assert_eq!(Ok(Term::Terminal(String::from("\""))), from_string);
}
This test fails because the current behavior of to_string produces a string which cannot be parsed back into the original Terminal.
Some solution brainstorming may help:
- fmt::Display could inspect whether the term contains a " character, to determine if the output should use single quotes instead
- Term::Terminal could be changed to a lower-level enum of single/double quote variants

My personal gut says the first option. It would add a few lines of complexity to fmt::Display for Term and make the associated to_string less performant, but it seems like a necessary cost. But maybe @Snewt will have a cleaner idea.
We need a way to allow incorporating comments into a grammar. It seems somewhat standard to use ; to indicate the start of a comment and a \n to close it. It'd be a good addition, I think, to make that work.
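One option is a pre-pass that strips ;-to-newline comments before handing the text to the parser. This sketch assumes ; never appears inside quoted terminals, which a real implementation would need to handle:

```rust
// Strip everything from the first `;` on each line (the comment) while
// keeping the line structure intact.
fn strip_comments(input: &str) -> String {
    input
        .lines()
        .map(|line| line.split(';').next().unwrap_or(""))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let g = "<dna> ::= <base> | <base> <dna> ; a DNA sequence\n<base> ::= 'A'";
    let stripped = strip_comments(g);
    assert_eq!(stripped, "<dna> ::= <base> | <base> <dna> \n<base> ::= 'A'");
}
```

The alternative is teaching the grammar parser itself to skip comments, which avoids the quoted-terminal caveat but touches every parser rule.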
A couple initial questions: do we need to handle the ; all along the way?

When I was in university, there was a program for learning grammars in the programming languages theory course. It was very useful for learners to see how code was parsed into the grammar tree. Can we do the same thing? It must not be so hard; basically, we already have it when we output a Grammar as a tree in the README file. The same thing is needed, but with real data which comes not from the random generator but from user input - this is how learners learn to write grammars correctly.
Hello,
First of all wanted to say thank you for your hard work on this project.
What do you think about creating a step-by-step example using a simple grammar? It seems to me this would help to show the usual workflow and may convey general concepts better.
FYI: There is a template to make a good README.md https://gist.github.com/PurpleBooth/109311bb0361f32d87a2
I want to write a custom VQL parser using nom, but I found that nom does not provide grammar-definition functionality.
Can I use the bnf lib to define a VQL/SQL grammar, like an antlr4 g4 file?
I can't find an example for this. I think the process is:
I'm not sure whether this process is right or not. Could anyone point me the right way?
Thanks.
Thinking about future features that could add value somewhere down the road. Initial ideas on design might be something like the following:
extern crate bnf;
// ...
let ebnf_grammar = bnf::ebnf::parse(ebnf_input);
let abnf_grammar = bnf::abnf::parse(abnf_input);
Alternatively, we might be able to just make bnf::parse smart enough to intuit the variant used.
Related to #108 and #112: failures reproduce on 0.4.0 ≤ bnf ≤ 0.4.3, so this is not a regression from #113.
Hit more cases where nullable productions aren't handled properly.
This example looks contrived but was adapted from a larger grammar I'm writing.
As usual, thanks for all your work @CrockAgile !
Sadly, this correctness issue continues to be a total blocker preventing me from adopting bnf. Hoping my battle-testing hasn't been in vain and there's an easy fix around the corner!
use bnf::Grammar;

#[test]
fn test() {
    let fails: &str = "
    <disjunction> ::= <predicate> | <disjunction> <or> <predicate>
    <predicate> ::= <char_null_one> | <special-string> '.'
    <char_null_one> ::= <char_null_two>
    <char_null_two> ::= <char_null_three>
    <char_null_three> ::= <char>
    <or> ::= <ws> 'or' <ws>
    <ws> ::= <whitespace> | ' ' <ws>
    <whitespace> ::= ' '
    <special-string> ::= <special-char> | <special-char> <special-string>
    <special-char> ::= <char> | <whitespace>
    <char> ::= 'a'
    ";
    let input = "a or a";

    let passes_1 = fails.replace(
        // skip nullable production <char_null_two>
        "<char_null_one> ::= <char_null_two>",
        "<char_null_one> ::= <char_null_three>",
    );
    assert!(passes_1.parse::<Grammar>().unwrap().parse_input(input).next().is_some());

    let passes_2 = fails.replace(
        // replace <whitespace> with its terminal ' '
        "<ws> ::= <whitespace> | ' ' <ws>",
        "<ws> ::= ' ' | ' ' <ws>",
    );
    assert!(passes_2.parse::<Grammar>().unwrap().parse_input(input).next().is_some());

    let passes_3 = fails.replace(
        // again, replace <whitespace> with its terminal ' '
        "<special-char> ::= <char> | <whitespace>",
        "<special-char> ::= <char> | ' '",
    );
    assert!(passes_3.parse::<Grammar>().unwrap().parse_input(input).next().is_some());

    assert!(
        fails.parse::<Grammar>().unwrap().parse_input(input).next().is_some(),
        "all grammars except last one parsed: {input}"
    );
}
The examples provided take a bnf grammar description and then are able to generate text that matches the grammar. How do I give the grammar a string compliant with bnf specification and then get a list of tokens out? This is something that pyparsing does.
The example use case is parsing a bnf string to a grammar. But, what are the use cases after that?
// let input = some grammar string
let grammar = bnf::parse(input);
// now what can I do with it?
I'd like to be smarter about how we generate random sentences from a given grammar.
There's a danger that a valid grammar can result in an infinite loop when randomly generating. At the moment we just take a guess at whether we're looping in a way that doesn't look like it will stop any time soon, by checking the stack and bailing before we recurse too far.
For instance, the case used in the tests is <nonterm> ::= <nonterm>. We can't get to a terminal state, so we're doomed to failure if we try to generate a sentence.
I'm not sure what an ideal solution is here. For one, it'd be nice to figure out something that didn't have to short-circuit the potential for "maybe infinite loops".
Maybe there's a way to check for impossible grammars before the generate logic runs and error early if that's the case. Then, if we have the potential for a terminal state but aren't reaching it because of random chance, we need to do something sensible: maybe allow a configurable max generated length and do the work to find some terminal nodes when we get there?
If we can manage a max length, I like the idea of implementing a minimum length too.
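One shape a budget-limited generator could take, sketched with a hypothetical rule representation (nonterminals wrapped in <>, terminals plain). When the depth budget runs out an alternative fails, and the caller falls back to the next one, so paths that reach terminals win out; a real generator would pick alternatives randomly, while this sketch tries them in order for determinism:

```rust
use std::collections::HashMap;

fn generate(rules: &HashMap<&str, Vec<Vec<&str>>>, sym: &str, budget: usize) -> Option<String> {
    // terminals generate themselves
    let Some(name) = sym.strip_prefix('<').and_then(|s| s.strip_suffix('>')) else {
        return Some(sym.to_string());
    };
    if budget == 0 {
        return None; // budget exhausted: this path cannot terminate in time
    }
    let alts = rules.get(name)?;
    for alt in alts {
        let mut out = String::new();
        let mut ok = true;
        for s in alt {
            match generate(rules, s, budget - 1) {
                Some(piece) => out.push_str(&piece),
                None => {
                    ok = false;
                    break;
                }
            }
        }
        if ok {
            return Some(out);
        }
    }
    None
}

fn main() {
    let mut rules: HashMap<&str, Vec<Vec<&str>>> = HashMap::new();
    rules.insert("base", vec![vec!["A"]]);
    rules.insert("dna", vec![vec!["<base>"], vec!["<base>", "<dna>"]]);
    assert_eq!(generate(&rules, "<dna>", 8), Some("A".to_string()));

    // <nonterm> ::= <nonterm> can never terminate, so the budget forces a clean failure
    rules.insert("nonterm", vec![vec!["<nonterm>"]]);
    assert_eq!(generate(&rules, "<nonterm>", 8), None);
}
```

The budget here bounds recursion depth rather than output length, but the same fallback structure works for a max generated length.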
I just read thru the lalrpop documentation and it seems worth investigating the readability & performance comparison to the current nom usage. This exploration task could just be done on its own branch and never merged and it would serve its purpose. But if it stumbles into some improvements, then it could be up for consideration.
Right now we're using a tool called cargo-travis that provides a convenience wrapper around kcov for generating and publishing coverage reports to Coveralls. It does not expose the --exclude-pattern or --exclude-file options that would allow us to keep lines of test code out of the report.
It would be nice to manage building/caching/running kcov without the cargo-travis wrapper in order to get a more accurate representation of our current test coverage.
Why does Grammar::parse_input have a return type of impl Iterator<Item = ParseTree> and not Option<ParseTree> or Result<ParseTree, Error>? Is there any circumstance in which Grammar::parse_input could parse to multiple ParseTrees?
I know that I can access the data in a ParseTree by calling ParseTree::rhs_iter to get something of type ParseTreeIter, which implements Iterator<Item = &ParseTreeNode>. I can then iterate through this to get items of type &ParseTreeNode, but how can I get the data out from this? src/lib.rs only publicly exports Grammar and ParseTree from src/grammar.rs, and I cannot construct the variants of the externally private ParseTreeNode necessary to match on ParseTreeNodes and access the data stored within.
Describe the bug
A nonterminal that is not defined is not rejected during parsing.
To Reproduce
For example,
let input = "<dna> ::= <base> | <base> <dna>";
let grammar: Grammar = input.parse().unwrap();
does not panic, even though <base> is not defined in the input.
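A validation pass for this could look something like the following sketch, which uses a simplified production representation (a left-hand-side name plus right-hand-side symbols) rather than bnf's actual types:

```rust
use std::collections::HashSet;

// Collect every nonterminal defined on a left-hand side, then flag any
// right-hand-side nonterminal that is never defined.
fn undefined_nonterminals<'a>(productions: &'a [(&'a str, Vec<&'a str>)]) -> HashSet<&'a str> {
    let defined: HashSet<&str> = productions.iter().map(|(lhs, _)| *lhs).collect();
    productions
        .iter()
        .flat_map(|(_, rhs)| rhs.iter().copied())
        .filter(|sym| sym.starts_with('<')) // only nonterminal symbols
        .map(|sym| sym.trim_matches(|c: char| c == '<' || c == '>'))
        .filter(|name| !defined.contains(name))
        .collect()
}

fn main() {
    // <dna> ::= <base> | <base> <dna>   with <base> never defined
    let prods = vec![("dna", vec!["<base>", "<dna>"])];
    let missing = undefined_nonterminals(&prods);
    assert!(missing.contains("base"));
    assert!(!missing.contains("dna"));
}
```

Whether this should be a hard parse error or a separate validation method is a design question; some grammars may intentionally leave nonterminals open.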
As was discussed in #7, generating a random sequence from a BNF grammar is a good feature. The erratic.js project serves as a good example of the behavior to implement. My only question is: should the generated sentence be returned as a Vec<Terminals> or as a String?
My assumptions for this feature are:
The types Expression, Production, and Grammar all publicly expose internal vectors. I assume the intention was to allow users to traverse the nodes in the grammar. However, exposing the internal vector locks the project into supporting the vector type interface forever onward. Imagine in the future that the data structures were changed to trees, arrays, or anything besides vectors. This would cause a breaking change in the public interface for users.
Instead, if these types only exposed iterators, they would still support tree traversal, but changing the internal implementation would not be a breaking change.
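The shape of such an API could look like this minimal sketch (type and method names are hypothetical):

```rust
// The backing Vec is private; callers only ever see an iterator, so the
// storage could later become a tree or array without a breaking change.
pub struct Expression {
    terms: Vec<String>,
}

impl Expression {
    pub fn terms_iter(&self) -> impl Iterator<Item = &String> {
        self.terms.iter()
    }
}

fn main() {
    let e = Expression {
        terms: vec!["<nonterminal>".to_string(), "terminal".to_string()],
    };
    // traversal still works, without exposing Vec in the public API
    assert_eq!(e.terms_iter().count(), 2);
}
```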
Is your feature request related to a problem? Please describe.
Most uses of BNF out there are not formal, i.e. they are meant to be human-readable. As such, they contain non-standard features such as optional elements. For instance, consider this RFC that does not use standard BNF to describe the Preferred Name Syntax for DNS.
Additionally, are you interested in supporting EBNF and validating a given input?
Describe the solution you'd like
I would like to have these features. I am not sure how fine-grained the configuration can be. I'm not entirely sure how this is done in Rust, but if it's like #ifdef in C then I think it's fair to want the code to remain readable.
Describe alternatives you've considered
I think all of these features are expressible in terms of standard BNF, so users can always make that transformation manually.
Additional context
I needed this while trying to validate (Another feature I want) a given string with the BNF from above. See Wikipedia's page for BNF Variants
Dependabot has a couple PRs open to update some of this project's dependencies but it looks like updating is going to take a bit of manual work, i.e. updating the symbols that changed / went away.
Is your feature request related to a problem? Please describe.
I'm trying to catch syntax errors in a custom plaintext file format, and when the parse fails, I simply receive no parse trees, with no information about where it failed.
Describe the solution you'd like
I'd like to see a line/column or character index, or something along those lines, that I can use to show the user where parsing failed.
Is your feature request related to a problem? Please describe.
The existing Grammar::parse_input method is useful for parsing input text according to a Grammar. But the current API builds "one use" parsers. This limitation forces redundant work to be repeated when parsing multiple input texts.
// each of these builds completely new `GrammarMatching` and other parsing tables,
// which are then freed before the next `parse_input`, wasting computation
grammar.parse_input(input_1);
grammar.parse_input(input_2);
grammar.parse_input(input_3);
Describe the solution you'd like
Instead, a single GrammarParser could be made available which would reuse parser data efficiently:
let parser = grammar.input_parser();
// common immutable parser data is shared between calls
parser.parse_input(input_1);
parser.parse_input(input_2);
parser.parse_input(input_3);
Describe alternatives you've considered
Maybe this just doesn't matter! I am not sure if this new API would even make much of a performance difference. I am not sure "how much faster" would warrant a new API method. 1% faster? 5% faster? 50% faster?
Looks related to nullable productions: #108 #112 #117
This grammar fails to match when we change a production from left-recursive to right-recursive.
The left-recursive grammar failed on 0.4.0 ≤ bnf ≤ 0.4.1, and the right-recursive grammar still fails on 0.4.0 ≤ bnf ≤ 0.4.3. So this issue must have been partially addressed by @CrockAgile in #109 (0.4.2).
Happy to help battle-test any proposed fixes!
use bnf::Grammar;

#[test]
fn test() {
    let input = "aa a";
    let left_recursive: &str = "
    <conjunction> ::= <conjunction> <ws> <predicate> | <predicate>
    <predicate> ::= <string_null_one> | <special-string> '.'
    <string_null_one> ::= <string_null_two>
    <string_null_two> ::= <string_null_three>
    <string_null_three> ::= <string>
    <string> ::= <char_null> | <string> <char_null>
    <special-string> ::= <special-char> | <special-string> <special-char>
    <char_null> ::= <char>
    <char> ::= 'a'
    <special-char> ::= <char_null> | <whitespace>
    <whitespace> ::= ' '
    <ws> ::= ' ' | ' ' <ws>
    ";
    assert!(left_recursive.parse::<Grammar>().unwrap().parse_input(input).next().is_some());

    let right_recursive = left_recursive.replace(
        // rewrite production from left- to right-recursive
        "<string> ::= <char_null> | <string> <char_null>",
        "<string> ::= <char_null> | <char_null> <string>",
    );
    assert!(
        right_recursive.parse::<Grammar>().unwrap().parse_input(input).next().is_some(),
        "right-recursive grammar failed to parse: {input}"
    );
}
I'm going to be dropping our redundant Travis CI builds but still want code coverage reporting. Need to tie it to an existing GH action or make a new one...
core::ops allows implementing custom logic for operators. Would implementing these for Grammar/Production/Expression/Term enable any cleaner syntax or usage? If so, which operators should be included? For example, expr = Term | Term | Term?
These were the operators I identified as possible enhancements, but feel free to add more to the discussion that may be useful.
Also, as a possible alternative some of these use cases could be handled by macros to allow for special case syntax:
expression![ term1 | term2 | term3 ] -> Expression::from_parts(vec![ term1, term2, term3 ])
But that would require the user to import BNF's macros.
A new benchmarking framework is on the scene! Divan looks pretty great, so the work for this issue is:
Describe the bug
Empty string rules still fail to match in some situations. This is the same as #108; after its fix the failure no longer happens in all situations, but it still happens.
To Reproduce
Load grammar from tests/fixtures/bnf.bnf
Try to parse the first line of the text:
"<syntax> ::= <rule> | <rule> <syntax>\n"
This match will fail.
Try to parse the same line with whitespace before the newline
"<syntax> ::= <rule> | <rule> <syntax> \n"
This will parse properly
The first example should pass, given that <line-end> may begin with <opt-whitespace>, which has an empty string rule:
<line-end> ::= <opt-whitespace> <EOL>
<opt-whitespace> ::= "" | " " <opt-whitespace>
<EOL> ::= "
"
Unfortunately I haven't been able to find a smaller example. Weirdly, it does parse "<syntax> ::= <rule>" correctly, but not the more complicated example above.
If there were support for special characters like \n or a built-in definition for <EOL>, it would be good to have a test that a grammar made from tests/fixtures/bnf.bnf can parse the file itself.
Maybe you have already enabled this, but it would put my mind at ease if the master branch was protected: https://help.github.com/articles/configuring-protected-branches/
Looks like it's an issue with the grcov action: actions-rs/grcov#142. Making an issue so that, if it takes very long to resolve on the grcov side, we can track progress on however we decide to address it.
Currently, cargo doc on nightly Rust issues a series of warnings stemming from Rust's issue 44229. It'd be nice to pinpoint the specifics causing the warnings and fix them before they become warnings on stable Rust.
The errors start with the following report:
WARNING: documentation for this crate may be rendered differently using the new Pulldown renderer.
See https://github.com/rust-lang/rust/issues/44229 for details.
Clippy has a lot of best practices, but BNF is in violation of many of them. Some fixes would be breaking changes (like removing the public from_str functions that don't implement std::str::FromStr). Other changes require updating the parsers to nom 5 best practices.
Once that is done, we would just need to revert the previous attempt in order to re-enable clippy in CI.
Checking bnf v0.2.6
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:6:1
|
6 | / named!(pub prod_lhs< &[u8], Term >,
7 | | do_parse!(
8 | | nt: delimited!(char!('<'), take_until!(">"), ws!(char!('>'))) >>
9 | | _ret: ws!(tag!("::=")) >>
10 | | (Term::Nonterminal(String::from_utf8_lossy(nt).into_owned()))
11 | | )
12 | | );
| |__^
|
= note: `#[warn(deprecated)]` on by default
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:6:1
|
6 | / named!(pub prod_lhs< &[u8], Term >,
7 | | do_parse!(
8 | | nt: delimited!(char!('<'), take_until!(">"), ws!(char!('>'))) >>
9 | | _ret: ws!(tag!("::=")) >>
10 | | (Term::Nonterminal(String::from_utf8_lossy(nt).into_owned()))
11 | | )
12 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:14:1
|
14 | / named!(pub terminal< &[u8], Term >,
15 | | do_parse!(
16 | | t: alt!(
17 | | delimited!(char!('"'), take_until!("\""), ws!(char!('"'))) |
... |
21 | | )
22 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:14:1
|
14 | / named!(pub terminal< &[u8], Term >,
15 | | do_parse!(
16 | | t: alt!(
17 | | delimited!(char!('"'), take_until!("\""), ws!(char!('"'))) |
... |
21 | | )
22 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:24:1
|
24 | / named!(pub nonterminal< &[u8], Term >,
25 | | do_parse!(
26 | | nt: complete!(delimited!(char!('<'), take_until!(">"), ws!(char!('>')))) >>
27 | | ws!(not!(complete!(tag!("::=")))) >>
28 | | (Term::Nonterminal(String::from_utf8_lossy(nt).into_owned()))
29 | | )
30 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:24:1
|
24 | / named!(pub nonterminal< &[u8], Term >,
25 | | do_parse!(
26 | | nt: complete!(delimited!(char!('<'), take_until!(">"), ws!(char!('>')))) >>
27 | | ws!(not!(complete!(tag!("::=")))) >>
28 | | (Term::Nonterminal(String::from_utf8_lossy(nt).into_owned()))
29 | | )
30 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:42:1
|
42 | / named!(pub expression_next,
43 | | do_parse!(
44 | | ws!(char!('|')) >>
45 | | ret: recognize!(peek!(complete!(expression))) >>
46 | | (ret)
47 | | )
48 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:50:1
|
50 | / named!(pub expression< &[u8], Expression >,
51 | | do_parse!(
52 | | peek!(term) >>
53 | | terms: many1!(complete!(term)) >>
... |
63 | | )
64 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:74:1
|
74 | / named!(pub production< &[u8], Production >,
75 | | do_parse!(
76 | | lhs: ws!(prod_lhs) >>
77 | | rhs: many1!(complete!(expression)) >>
... |
86 | | )
87 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: use of deprecated item 'ws': whitespace parsing only works with macros and will not be updated anymore
--> src/parsers.rs:74:1
|
74 | / named!(pub production< &[u8], Production >,
75 | | do_parse!(
76 | | lhs: ws!(prod_lhs) >>
77 | | rhs: many1!(complete!(expression)) >>
... |
86 | | )
87 | | );
| |__^
|
= note: this warning originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
warning: unneeded return statement
--> src/grammar.rs:115:9
|
115 | return Ok(result);
| ^^^^^^^^^^^^^^^^^^ help: remove `return`: `Ok(result)`
|
= note: `#[warn(clippy::needless_return)]` on by default
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_return
warning: useless use of `format!`
--> src/error.rs:55:32
|
55 | Needed::Unknown => format!("Data error: insufficient size, expectation unknown"),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using .to_string(): `"Data error: insufficient size, expectation unknown".to_string()`
|
= note: `#[warn(clippy::useless_format)]` on by default
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#useless_format
warning: you should consider deriving a `Default` implementation for `expression::Expression`
--> src/expression.rs:16:5
|
16 | / pub fn new() -> Expression {
17 | | Expression { terms: vec![] }
18 | | }
| |_____^
|
= note: `#[warn(clippy::new_without_default)]` on by default
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#new_without_default
help: try this
|
10 | #[derive(Default)]
|
warning: defining a method called `from_str` on this type; consider implementing the `std::str::FromStr` trait or choosing a less ambiguous name
--> src/expression.rs:26:5
|
26 | / pub fn from_str(s: &str) -> Result<Self, Error> {
27 | | match parsers::expression_complete(s.as_bytes()) {
28 | | Result::Ok((_, o)) => Ok(o),
29 | | Result::Err(e) => Err(Error::from(e)),
30 | | }
31 | | }
| |_____^
|
= note: `#[warn(clippy::should_implement_trait)]` on by default
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#should_implement_trait
warning: you should consider deriving a `Default` implementation for `grammar::Grammar`
--> src/grammar.rs:20:5
|
20 | / pub fn new() -> Grammar {
21 | | Grammar {
22 | | productions: vec![],
23 | | }
24 | | }
| |_____^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#new_without_default
help: try this
|
14 | #[derive(Default)]
|
warning: defining a method called `from_str` on this type; consider implementing the `std::str::FromStr` trait or choosing a less ambiguous name
--> src/grammar.rs:32:5
|
32 | / pub fn from_str(s: &str) -> Result<Self, Error> {
33 | | match parsers::grammar_complete(s.as_bytes()) {
34 | | Result::Ok((_, o)) => Ok(o),
35 | | Result::Err(e) => Err(Error::from(e)),
36 | | }
37 | | }
| |_____^
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#should_implement_trait
warning: writing `&String` instead of `&str` involves a new object where a slice will do.
--> src/grammar.rs:74:31
|
74 | fn traverse(&self, ident: &String, rng: &mut StdRng) -> Result<String, Error> {
| ^^^^^^^
|
= note: `#[warn(clippy::ptr_arg)]` on by default
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#ptr_arg
help: change this to
|
74 | fn traverse(&self, ident: &str, rng: &mut StdRng) -> Result<String, Error> {
| ^^^^
help: change `ident.clone()` to
|
86 | let nonterm = Term::Nonterminal(ident.to_string());
| ^^^^^^^^^^^^^^^^^
error: using `clone` on a double-reference; this will copy the reference instead of cloning the inner type
--> src/grammar.rs:99:37
|
99 | Some(e) => expression = e.clone(),
| ^^^^^^^^^
|
= note: `#[deny(clippy::clone_double_ref)]` on by default
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#clone_double_ref
help: try dereferencing it
|
99 | Some(e) => expression = &(*e).clone(),
| ^^^^^^^^^^^^^
help: or try being explicit about what type to clone
|
99 | Some(e) => expression = &expression::Expression::clone(e),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When I match a grammar such as
<main> ::= "foo:" <bar>
<bar> ::= "bar"
against foo:bar
, it works. When I match a grammar such as
<main> ::= "foo:" ""
against foo:
, it works. However, when I match
<main> ::= "foo:" <baz>
<baz> ::= ""
against foo:
, it doesn't work.
Replicable using this code:
fn main() {
    let grammar1: bnf::Grammar = "
    <main> ::= \"foo:\" <bar>
    <bar> ::= \"bar\"
    ".parse().unwrap();
    let grammar2: bnf::Grammar = "
    <main> ::= \"foo:\" \"\"
    ".parse().unwrap();
    let grammar3: bnf::Grammar = "
    <main> ::= \"foo:\" <baz>
    <baz> ::= \"\"
    ".parse().unwrap();
    assert!(grammar1.parse_input("foo:bar").next().is_some(), "grammar1 failed!");
    assert!(grammar2.parse_input("foo:").next().is_some(), "grammar2 failed!");
    assert!(grammar3.parse_input("foo:").next().is_some(), "grammar3 failed!");
}
Why isn't a parse tree like this being generated?
ParseTree {
    lhs: Term::Nonterminal("main"),
    rhs: [
        ParseTreeNode::Terminal("foo:"),
        ParseTreeNode::Nonterminal(ParseTree {
            lhs: Term::Nonterminal("baz"),
            rhs: [
                ParseTreeNode::Terminal(""),
            ],
        }),
    ],
}
Is this intentional/explicable behavior, or a bug? Thanks!
Parsing a grammar can fail. Currently, if parsing fails, an empty grammar is returned and errors are printed to stdout. Users would likely be interested in detecting whether parsing failed, but capturing stdout or stderr is not ideal. The code currently required to detect an empty grammar would look something like:
let grammar = bnf::parse(input);
// errors printed to stdout
if grammar.productions.is_empty() {
    println!("empty grammar");
} else {
    println!("non-empty grammar");
}
Instead of printing error strings to stdout or stderr, they could be included in a Result<Grammar, Error> type, allowing for more traditional error handling:
let grammar_result = bnf::parse(input);
match grammar_result {
    Ok(grammar) => println!("working grammar: {}", grammar),
    Err(e) => eprintln!("grammar parsing failed: {}", e),
}
If this change were to be made, would reporters still be involved? As a library it seems strange to output to stdout or stderr instead of output being handled in the code. But as a command line tool, error reporting in terminal is definitely valuable. Maybe BNF should include an optional CLI binary intended for terminal use? I forget if cargo allows for dual binary & library crates.