
Comments (4)

retikulum commented on August 10, 2024

@soasme Hi again. This will help me a lot in gaining a better understanding of how these encoding schemes work. Feel free to close the issue whenever you wish. Thank you for your time and effort.

from peppapeg.

soasme commented on August 10, 2024

@retikulum I have fixed the issue by revamping the implementation of P4_CaseCmpInsensitive. The case-insensitive literal rule "Hello Worìd" can now match the input "HELLO WORÌD", see test.

The case https://controlc.com/74f1e9b9 passes on my local machine. Can you please revisit this issue when you're free?

Also, I think introducing fuzz testing into this project is definitely a must-do. I have added it to the roadmap.


retikulum commented on August 10, 2024

@soasme Thanks for your great effort. I have tested both cases. "Hello Worìd" works perfectly on my machine; however, the second case is still failing. I think I shouldn't have copy-pasted the test case to controlc, because it generally breaks non-ASCII content. I have tested the second case both with and without sanitizers, and it crashes again. My crash_test case is attached.


crash_test.txt

As for introducing fuzz testing into the project, you could consider integrating with oss-fuzz (I know that integration can take time for small projects), or you could use a script similar to the one I used for this campaign. It is inspired by the afl++ installation script. I haven't tested it; I just wrote it to give you a brief explanation.

# install afl++
sudo apt-get update
sudo apt-get install -y build-essential python3-dev automake git flex bison libglib2.0-dev libpixman-1-dev python3-setuptools
# try to install llvm 11 and install the distro default if that fails
sudo apt-get install -y lld-11 llvm-11 llvm-11-dev clang-11 || sudo apt-get install -y lld llvm llvm-dev clang
sudo apt-get install -y gcc-$(gcc --version|head -n1|sed 's/.* //'|sed 's/\..*//')-plugin-dev libstdc++-$(gcc --version|head -n1|sed 's/.* //'|sed 's/\..*//')-dev
git clone https://github.com/AFLplusplus/AFLplusplus && cd AFLplusplus
make distrib
sudo make install
# change directory back to the source
cd ..
# you can change the harness according to which function(s) you want to fuzz
afl-clang-fast harness.c peppapeg.c -o harness
# create a small seed corpus
mkdir inputs
echo "Hello World" > inputs/rand
sudo afl-system-config
mkdir sync_dir
# start afl; you can change/play with the parameters for specific scenarios
afl-fuzz -i inputs -o sync_dir -- ./harness @@
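
The script above compiles a harness.c that is not shown in this thread. A minimal sketch of what such a file-based harness could look like is below. The names `read_input`, `parse_input`, and `run_one_input` are hypothetical, and `parse_input` is only a placeholder where the calls into peppapeg.c would go; it does not use any real Peppa PEG API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file at `path` into a NUL-terminated heap buffer.
 * Stores the number of bytes read in *len_out.
 * Returns NULL on any error; the caller frees the result. */
char *read_input(const char *path, size_t *len_out) {
    assert(path != NULL && len_out != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);
    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) { fclose(f); return NULL; }
    size_t n = fread(buf, 1, (size_t)size, f);
    fclose(f);
    buf[n] = '\0';   /* terminate so the buffer can be used as a C string */
    *len_out = n;
    return buf;
}

/* Hypothetical placeholder: replace the body with the Peppa PEG calls
 * you want to fuzz (build a grammar, parse `data`, free everything). */
int parse_input(const char *data, size_t len) {
    (void)data;
    (void)len;
    return 0;
}

/* Process one fuzz input file; returns 1 if the file could not be read. */
int run_one_input(const char *path) {
    size_t len = 0;
    char *data = read_input(path, &len);
    if (data == NULL) return 1;
    int rc = parse_input(data, len);
    free(data);
    return rc;
}
```

A `main` that forwards `argv[1]` to `run_one_input` would complete the harness, since afl-fuzz substitutes `@@` with the path of each generated input file. Keeping the byte count next to the buffer matters here because fuzzers routinely produce inputs with embedded 0x00 bytes, which C string functions would treat as the end of the input.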


soasme commented on August 10, 2024

Thanks for the procedure. I'll take a look at it.

The exact byte that caused the problem is 0x00 (NUL), which C uses to mark the end of a string. Currently, Peppa PEG only supports UTF-8 encoding, and 0x00 is not supposed to appear in the middle of the content. I think you probably need to sanitize the input string by replacing 0x00 with something else, such as whitespace or \uFFFD. A similar assumption can be found in the CommonMark spec.
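
The suggested sanitizing step could be sketched roughly as follows. `sanitize_nul` is a hypothetical helper, not part of Peppa PEG; it assumes the caller knows the input length independently of any terminator and replaces each embedded 0x00 byte with the UTF-8 encoding of U+FFFD (the bytes EF BF BD).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Peppa PEG: copy `len` bytes from `src`
 * into a newly allocated NUL-terminated string, replacing every embedded
 * 0x00 byte with U+FFFD encoded as UTF-8 (EF BF BD).
 * Returns NULL on allocation failure; the caller frees the result. */
char *sanitize_nul(const char *src, size_t len) {
    assert(src != NULL || len == 0);

    /* Count the embedded NUL bytes to size the output buffer. */
    size_t nul_count = 0;
    for (size_t i = 0; i < len; i++) {
        if (src[i] == '\0') nul_count++;
    }

    /* Each replacement adds 2 extra bytes; +1 for the final terminator. */
    char *out = malloc(len + 2 * nul_count + 1);
    if (out == NULL) return NULL;

    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        if (src[i] == '\0') {
            memcpy(out + j, "\xEF\xBF\xBD", 3);
            j += 3;
        } else {
            out[j++] = src[i];
        }
    }
    out[j] = '\0';
    return out;
}
```

The result contains no 0x00 byte before its terminator, so it can be handed to code that relies on C string conventions. Replacing with a single space instead would keep the length unchanged, at the cost of hiding where the input was altered.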

