
grayc's Introduction

Mutation-based Testing Tool for C Family Compilers and Code Analyzers

GrayC uses mutation-based fuzzing as a program generation technique (as described in our ISSTA '23 paper) and then uses the generated programs to test compilers and analysers. It can currently generate programs across the C family, i.e. C, C++, Objective-C and Objective-C++. To replicate the results presented in our ISSTA '23 paper, please check out and use the tool from the issta-2023 branch.

Features

This is a revamped version of the tool presented in our ISSTA '23 paper. It contains the following enhancements:

  1. Write-Your-Own-Mutator
  2. No dependence on libFuzzer
  3. Interface to extend the tool to the entire C family
  4. Out-of-tree implementation of the tool
  5. Rewrite of the codebase, which now relies heavily on the LLVM/Clang framework
  6. Better debugging due to reliance on ASTMatchers and Clang's internal debugging framework
  7. Per-mutation profiling mechanism for long fuzzing runs (courtesy of LLVM's clang-tidy)

Installation

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-12 main"
sudo apt-get update
sudo apt-get install -y llvm-12 llvm-12-dev llvm-12-tools clang-12 libclang-common-12-dev libclang-12-dev 

This installs both LLVM and Clang on Ubuntu from the apt.llvm.org packages. Next, clone and build GrayC:

git clone https://github.com/srg-imperial/GrayC.git
cd GrayC
mkdir build
cd build
cmake -GNinja -DCMAKE_C_COMPILER=clang-12 -DCMAKE_CXX_COMPILER=clang++-12 -DLLVM_CONFIG_BINARY=llvm-config-12 ../
ninja

Verify the installation from the build directory:

bin/grayc --list-mutations

This should produce the following output:

Enabled mutations:
    cmutation-assignment-expression-mutator
    cmutation-conditional-expression-mutator
    cmutation-duplicate-statement-mutator
    cmutation-jump-mutator
    cmutation-unary
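
For scripted sanity checks, the listing above can be parsed mechanically. The following is a minimal sketch, assuming the output keeps the format shown (a header line followed by indented names); `parse_enabled_mutations` is a hypothetical helper and not part of GrayC.

```python
def parse_enabled_mutations(output: str) -> list[str]:
    """Return the names listed after the 'Enabled mutations:' header.

    Assumption: the names follow the header one per line, and the list
    ends at the first blank line or at the end of the text.
    """
    names = []
    in_list = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped == "Enabled mutations:":
            in_list = True
            continue
        if in_list:
            if not stripped:
                break
            names.append(stripped)
    return names


# Sample text copied from the expected output above.
sample = """Enabled mutations:
    cmutation-assignment-expression-mutator
    cmutation-conditional-expression-mutator
    cmutation-duplicate-statement-mutator
    cmutation-jump-mutator
    cmutation-unary
"""

for name in parse_enabled_mutations(sample):
    print(name)
```

In practice the text would come from running `bin/grayc --list-mutations`, for example through `subprocess.run(..., capture_output=True, text=True)`.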

Example

cd build 
echo "int main(){int a=0; ++a;return 0;}" > b.cpp
bin/grayc -mutations="-*,cmutation-unary" --apply-mutation b.cpp -- 

This should produce the following program:

int main()
{
    int a = 0;
    --a;
    return 0;
}
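
The `-mutations="-*,cmutation-unary"` argument reads as "disable everything, then enable cmutation-unary". The sketch below models that selection in Python, assuming the filter follows the same left-to-right glob semantics as clang-tidy's `--checks` option; GrayC's actual handling is not shown in this README, and `select_mutations` is a hypothetical name.

```python
from fnmatch import fnmatchcase

# Mutation names as printed by `bin/grayc --list-mutations` above.
AVAILABLE = [
    "cmutation-assignment-expression-mutator",
    "cmutation-conditional-expression-mutator",
    "cmutation-duplicate-statement-mutator",
    "cmutation-jump-mutator",
    "cmutation-unary",
]


def select_mutations(filter_spec: str, available: list[str]) -> list[str]:
    """Apply a comma-separated glob list; a leading '-' removes matches.

    Assumption: globs are processed left to right, so later entries
    override earlier ones, and nothing is enabled by default.
    """
    enabled = set()
    for raw in filter_spec.split(","):
        glob = raw.strip()
        if not glob:
            continue
        negative = glob.startswith("-")
        pattern = glob[1:] if negative else glob
        matches = {name for name in available if fnmatchcase(name, pattern)}
        if negative:
            enabled -= matches
        else:
            enabled |= matches
    # Preserve the listing order of the available mutations.
    return [name for name in available if name in enabled]


print(select_mutations("-*,cmutation-unary", AVAILABLE))
```

Under these assumptions, only `cmutation-unary` remains enabled for the example command.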

GrayC: Write-Your-Own-Mutator (WYOM)

This functionality is inspired by the extensible framework introduced by clang-tidy. WYOM is implemented by the add_new_mutator.py script, which automatically updates the relevant files and provides the boilerplate code for writing a new mutation.

WYOM Example Usage

Consider developing a simple mutator that converts a + to a -. For now, we would like the mutator to work on C programs. We start by calling the add_new_mutator.py script, which sits in the grayc folder, as follows:

./add_new_mutator.py cmutation binary-operator-mutator

The script does the following tasks:

  1. Registers the binary-operator-mutator within the cmutation module
  2. Provides BinaryOperatorMutator.cpp and BinaryOperatorMutator.h files
  3. Provides a small implementation of BinaryOperatorMutator::registerMatchers and BinaryOperatorMutator::check containing a sample matcher and the corresponding callback function.
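
The naming relationship in the steps above (binary-operator-mutator on the command line, BinaryOperatorMutator in the generated files) can be sketched as follows. This only illustrates the convention visible in this example; how add_new_mutator.py actually derives the names is an assumption, and both function names are hypothetical.

```python
def mutator_class_name(mutator_name: str) -> str:
    """Turn a dash-separated mutator name into a CamelCase class name."""
    return "".join(part.capitalize() for part in mutator_name.split("-"))


def generated_files(mutator_name: str) -> list[str]:
    """File names expected for a new mutator, following the example above."""
    class_name = mutator_class_name(mutator_name)
    return [f"{class_name}.cpp", f"{class_name}.h"]


print(mutator_class_name("binary-operator-mutator"))
print(generated_files("binary-operator-mutator"))
```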

The user is then expected to refine the ASTMatcher in the BinaryOperatorMutator::registerMatchers function and the callback code in the BinaryOperatorMutator::check function.
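
To make the matcher/callback split concrete, here is an analogy built on Python's own `ast` module rather than Clang. It is not GrayC code and does not use ASTMatchers; it only mirrors the idea that one part selects AST nodes (binary operators with opcode +) and another part rewrites them (to -). The names `PlusToMinus` and `mutate` are hypothetical. It requires Python 3.9 or later for `ast.unparse`.

```python
import ast


class PlusToMinus(ast.NodeTransformer):
    """Rewrite every binary '+' in an expression into '-'."""

    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        # Visit children first so nested additions are rewritten too.
        self.generic_visit(node)
        # "Matcher" step: select only binary operators whose opcode is '+'.
        if isinstance(node.op, ast.Add):
            # "Callback" step: apply the mutation to the matched node.
            node.op = ast.Sub()
        return node


def mutate(expression: str) -> str:
    """Parse a Python expression, mutate it, and print it back as source."""
    tree = ast.parse(expression, mode="eval")
    mutated = PlusToMinus().visit(tree)
    return ast.unparse(mutated)


print(mutate("a + b * (c + 1)"))
```

In GrayC the same two roles are played by the ASTMatcher registered in registerMatchers and the rewriting code in check, operating on the Clang AST of a C-family program.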

Once refined, the mutator can be run on a sample file in the same way as in the Example section.

GrayC's mutators are divided into modules based on the language they target. cmutation is the most general module, containing mutators applicable to the entire C family, while cxxmutation houses the C++-specific mutators. We aim to extend this with modules for Objective-C and Objective-C++ in the near future.

grayc's People

Contributors

arindam-8, dependabot[bot], karineek


grayc's Issues

Are there some mutators missing?

Hi, I noticed that this codebase has some inconsistencies with your paper.
For example, I couldn't find the delete operation in the code. Your paper describes it as deleting sub-expressions from a given expression in a corpus program.

Here are the mutator files I found in this project:

utils-fuzzer
assignment-mutator
constant-mutator
delete-mutator
duplicate-mutator
expression-mutator
function-extractor
function-merger
jump-mutator
rename-transform
append-expression
extract-expression
global-extractor

I checked delete-mutator (which works on statements) and expression-mutator (which expands expressions), and neither is related to that operation.

Could you please clarify the relationship between the 'mutator file' and the 'mutator operation' as described in your paper? I am particularly interested in understanding how these two components are connected. Thank you.

Enhancing the Robustness of the Helper Scripts with Additional String Matching or Similar Changes

Hi @arindam-8 .

Here's a simple example, test.c:

int main()
{
    a = 1;
}

  1. Initially, I compiled it (gcc test.c) on an Ubuntu system without full-width single quotation marks (‘ ’). The output was as follows:

    test.c: In function 'main':
    test.c:3:5: error: 'a' undeclared (first use in this function)
    3 | a = 1;
    | ^
    test.c:3:5: note: each undeclared identifier is reported only once for each function it appears in

    Here, In function 'main': uses half-width (ASCII) single quotation marks.

  2. However, when I switched to an Ubuntu system with Chinese language support (which includes the full-width, curly single quotation marks ‘ ’) and recompiled it (gcc test.c), the output changed to:

    test.c: In function ‘main’:
    test.c:3:5: error: ‘a’ undeclared (first use in this function)
    a = 1;
    ^
    test.c:3:5: note: each undeclared identifier is reported only once for each function it appears in

    Here, In function ‘main’: uses full-width single quotation marks.

I apologize for the previous commit, which was incorrect. I think the original -ve "In function ‘bug’:" should be retained, and we can add -ve "In function 'bug':" after it to enhance the robustness of the program.

Originally posted by @Vanish-Zeng in #4 (comment)

There may be other string matching or similar changes that can enhance the robustness of the helper scripts. Feel free to contribute additional suggestions.
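
One way to make such a check independent of the quote style is to accept both forms in a single pattern. The sketch below is in Python rather than the grep-based helper scripts, purely as an illustration; `mentions_function` is a hypothetical name. The two quote styles are the ASCII apostrophe and the Unicode curly quotes U+2018/U+2019 that appear in the second output above.

```python
import re

# ASCII apostrophe plus the curly opening/closing single quotes.
OPEN_QUOTES = "'\u2018"
CLOSE_QUOTES = "'\u2019"


def mentions_function(compiler_output: str, function_name: str) -> bool:
    """Return True if the output contains "In function <name>:" in either quote style."""
    pattern = re.compile(
        f"In function [{OPEN_QUOTES}]{re.escape(function_name)}[{CLOSE_QUOTES}]:"
    )
    return pattern.search(compiler_output) is not None


ascii_output = "test.c: In function 'bug':\ntest.c:3:5: error: 'a' undeclared"
curly_output = "test.c: In function \u2018bug\u2019:\ntest.c:3:5: error: \u2018a\u2019 undeclared"

print(mentions_function(ascii_output, "bug"))
print(mentions_function(curly_output, "bug"))
```

An alternative that avoids the problem at the source is to run the compiler with `LC_ALL=C`, under which GCC uses ASCII quotes in its diagnostics.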
