Comments (4)
@soasme Hi again. This will help me a lot in gaining a better understanding of how these encoding schemes work. You can close the issue whenever you wish. Thank you for your time and effort.
from peppapeg.
@retikulum I have fixed the issue by revamping the implementation of P4_CaseCmpInsensitive. The case-insensitive literal rule "Hello Worìd" can now match the input "HELLO WORÌD" (see test).
The case https://controlc.com/74f1e9b9 passes on my local machine. Can you please revisit this issue when you're free?
Also, I think introducing fuzz testing into this project is definitely a must-do. I have added it to the roadmap.
@soasme Thanks for your great effort. I have tested both cases. "Hello Worìd" works perfectly on my machine; however, the second case is still failing. I think I shouldn't have copy-pasted the test case to controlc, because it generally breaks non-ASCII content. I have tested the second case both with and without sanitizers, and it crashes again. My crash_test case is attached.
As for introducing fuzz testing into the project, you can consider integrating it with oss-fuzz (I know it can take time to integrate for small projects), or you can use a script similar to the one I use for this campaign. It is inspired by the afl++ installation script. I haven't tested it; I just wrote it to give you a brief explanation.
# install afl++
sudo apt-get update
sudo apt-get install -y build-essential python3-dev automake git flex bison libglib2.0-dev libpixman-1-dev python3-setuptools
# try to install llvm 11 and install the distro default if that fails
sudo apt-get install -y lld-11 llvm-11 llvm-11-dev clang-11 || sudo apt-get install -y lld llvm llvm-dev clang
sudo apt-get install -y gcc-$(gcc --version|head -n1|sed 's/.* //'|sed 's/\..*//')-plugin-dev libstdc++-$(gcc --version|head -n1|sed 's/.* //'|sed 's/\..*//')-dev
git clone https://github.com/AFLplusplus/AFLplusplus && cd AFLplusplus
make distrib
sudo make install
# change directory back to the source
cd ..
# you can change the harness according to which function(s) you want to fuzz
afl-clang-fast harness.c peppapeg.c -o harness
# create one small corpus
mkdir inputs
echo "Hello World" > inputs/rand
sudo afl-system-config
mkdir sync_dir
# start afl; you can change/play with the parameters for specific scenarios
afl-fuzz -i inputs -o sync_dir -- ./harness @@
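The script above compiles a harness.c that is not shown in the thread. A minimal sketch of the file-reading part of such a harness might look like the following. The function names (parse_one_input, read_whole_file, run_one_input) are hypothetical, and parse_one_input is only a placeholder: the actual Peppa PEG calls are left out here because they depend on the library version.

```c
#include <stdio.h>
#include <stdlib.h>

/* Placeholder (assumption) for the code under test. In a real harness this
 * is where the Peppa PEG calls would go: load a grammar, parse `data`,
 * free everything. They are intentionally omitted in this sketch. */
void parse_one_input(const char *data, size_t len) {
    (void)data;
    (void)len;
}

/* Read the whole file at `path` into a freshly allocated, NUL-terminated
 * buffer. Returns NULL on failure; on success stores the size in *len. */
char *read_whole_file(const char *path, size_t *len) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;

    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return NULL; }

    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) { fclose(f); return NULL; }

    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);
    if (got != (size_t)size) { free(buf); return NULL; }

    buf[got] = '\0';
    *len = got;
    return buf;
}

/* Run the code under test on one input file. Returns 0 if the file could
 * be read and handed to the parser, 1 otherwise. */
int run_one_input(const char *path) {
    size_t len = 0;
    char *data = read_whole_file(path, &len);
    if (data == NULL) return 1;

    parse_one_input(data, len);
    free(data);
    return 0;
}

/* In harness.c, main would just forward the path that afl-fuzz
 * substitutes for @@:
 *
 *   int main(int argc, char **argv) {
 *       return argc == 2 ? run_one_input(argv[1]) : 2;
 *   }
 */
```

Keeping the file handling separate from parse_one_input means the same harness can be pointed at a different function just by changing that one placeholder.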
Thanks for the procedure. I'll take a look at it.
The exact character that caused the problem is 00 (the NUL byte), as it is used by C to mark the end of the input string. Currently, Peppa PEG only supports UTF-8 encoding; 00 is not supposed to appear in the middle of the content. I think you probably need to sanitize the input string by replacing 00 with something else, like whitespace or \uFFFD. A similar assumption can be found in the CommonMark spec.
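The sanitizing step suggested above could be sketched roughly as follows. This is only an illustration, not part of Peppa PEG: the function name replace_nul_bytes is hypothetical, and it assumes the caller knows the input length up front (for example, from reading a file), since strlen would stop at the first 00 byte.

```c
#include <stdlib.h>
#include <string.h>

/* Copy `in` (len bytes) into a new NUL-terminated buffer, replacing every
 * 0x00 byte with U+FFFD encoded as UTF-8 (bytes EF BF BD).
 * Returns NULL if allocation fails; the caller frees the result.
 * If out_len is not NULL, it receives the length without the terminator. */
char *replace_nul_bytes(const char *in, size_t len, size_t *out_len) {
    static const char replacement[] = "\xEF\xBF\xBD";
    size_t nul_count = 0;

    /* First pass: count embedded NUL bytes to size the output buffer. */
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\0') nul_count++;
    }

    /* Each NUL (1 byte) becomes 3 bytes, so the output grows by 2 per NUL. */
    size_t new_len = len + 2 * nul_count;
    char *out = malloc(new_len + 1);
    if (out == NULL) return NULL;

    /* Second pass: copy bytes, substituting the replacement sequence. */
    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\0') {
            memcpy(out + j, replacement, 3);
            j += 3;
        } else {
            out[j++] = in[i];
        }
    }
    out[j] = '\0';

    if (out_len != NULL) *out_len = j;
    return out;
}
```

The returned buffer contains no interior 00 bytes, so it can be passed to code that treats the first 00 as the end of the string.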