MACER: A Modular Framework for Accelerated Compilation Error Repair

This repository is the official implementation of MACER: A Modular Framework for Accelerated Compilation Error Repair. The paper describing the development of this framework can be accessed at [this arXiv link].

Setup

Installing Base Packages and Curating Datasets

This project runs on Python 3.6. To install all packages and curate the datasets from TRACER and DeepFix, install conda and run

sudo make install

The provided Makefile executes several steps, which are enumerated below for the sake of clarity. If you use the Makefile, you need not execute the following steps individually.

  1. Ubuntu/Debian packages

    conda install python=3.6.5

    sudo apt install clang python3-pip unzip gzip curl sqlite3

  2. Anaconda environment

    conda create --name macer36 python=3.6
    conda activate macer36
    
  3. Python packages

    pip3 install -r requirements.txt

  4. Set Clang paths

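    Create a symbolic link (if it doesn't already exist) so that the Python bindings for Clang can locate libclang. Change into the library directory first, then create the link there.

    sudo apt-get install -y libclang-dev
    cd /usr/lib/x86_64-linux-gnu/
    sudo ln -s libclang-XX.YY.so.1 libclang.so
    

    where XX.YY is the version number of Clang installed on your system. On some systems the library is named without the .1 suffix (for example, libclang-11.so); link whichever file is present in that directory.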

  5. Pull TRACER's dataset, which is used for training and testing of MACER

    git clone https://github.com/umairzahmed/tracer.git
    unzip tracer/data/dataset/singleL/singleL_Test.zip -d tracer/data/dataset/singleL/
    unzip tracer/data/dataset/singleL/singleL_Train+Valid.zip -d tracer/data/dataset/singleL/
    
  6. Repair class creation (refer to Section 2 of our paper)

    python3 -m srcT.DataStruct.ClusterError

  7. Pull DeepFix dataset, used for testing MACER

    curl -O https://www.cse.iitk.ac.in/users/karkare/prutor/prutor-deepfix-09-12-2017.zip
    unzip prutor-deepfix-09-12-2017.zip
    gzip -d prutor-deepfix-09-12-2017/prutor-deepfix-09-12-2017.db.gz
    
  8. Extract DeepFix dataset into MACER's format

    sqlite3 -header -csv prutor-deepfix-09-12-2017/prutor-deepfix-09-12-2017.db "select * from Code where error<>'';" > prutor-deepfix-09-12-2017/deepfix_test.csv
    
    python3 -m srcT.DataStruct.PrepDeepFix
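If the sqlite3 command-line tool is unavailable, the export in step 8 can be approximated from Python with the standard sqlite3 and csv modules. The following is a minimal sketch, not part of the MACER toolchain: the function name export_error_rows is hypothetical, and only the query (select * from Code where error<>'') is taken from the step above. The output is not guaranteed to match the command-line tool byte for byte, so please check it before passing it to srcT.DataStruct.PrepDeepFix.

```python
import csv
import sqlite3


def export_error_rows(db_path, csv_path):
    """Write every row of table Code with a non-empty error column to a CSV file.

    Hypothetical helper: mirrors the sqlite3 command in step 8, with a header row.
    Returns the number of data rows written.
    """
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute("select * from Code where error<>''")
        # cursor.description holds one tuple per column; item 0 is the column name.
        header = [column[0] for column in cursor.description]
        count = 0
        with open(csv_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            for row in cursor:
                writer.writerow(row)
                count += 1
    finally:
        conn.close()
    return count
```

Under these assumptions, a call such as export_error_rows("prutor-deepfix-09-12-2017/prutor-deepfix-09-12-2017.db", "prutor-deepfix-09-12-2017/deepfix_test.csv") would write to the same output path as the command in step 8.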
    

Installing Python Library Dependencies

The MACER toolchain makes use of various standard libraries. A minimal list is provided in the requirements.txt file. If you are a pip user, please use the following command to install these libraries:

pip3 install -r requirements.txt

Note about version dependency: although the requirements file specifies version dependencies to be exact, this is to err on the side of caution. For most of the libraries (e.g. pandas or scikit-learn), a more recent version should work well too. However, the dependency on the 1.2.4 version of the edlib library seems to be strict. Having a different version of this library may cause the toolchain to malfunction at evaluation time. Similarly, for tensorflow, it seems that version 2.0 can cause issues with training. We advise caution while using versions of libraries different from those mentioned in the requirements file.
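Since a mismatched edlib or tensorflow version can cause problems that only show up later, it may help to compare the pins in requirements.txt against what is actually installed before training or evaluating. The sketch below is an illustration only: parse_pins, find_mismatches and installed_versions are hypothetical names, only lines of the form name==version are recognised, and package names are compared in lower case. importlib.metadata is part of the standard library from Python 3.8 onward; on Python 3.6 the installed versions would have to be gathered another way (for example from the output of pip3 freeze) and passed to find_mismatches directly.

```python
import re

try:
    from importlib import metadata  # standard library from Python 3.8
except ImportError:  # older interpreters such as Python 3.6
    metadata = None


def parse_pins(requirements_text):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = re.match(r"([A-Za-z0-9_.\-]+)\s*==\s*([^\s;]+)", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def find_mismatches(pins, installed):
    """Return {package: (pinned, installed_or_None)} wherever the versions differ."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    if metadata is None:
        raise RuntimeError("importlib.metadata is unavailable on this Python")
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found
```

For instance, find_mismatches(parse_pins("edlib==1.2.4"), {"edlib": "1.3.9"}) reports edlib as pinned to 1.2.4 but installed as 1.3.9.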

Training

To train the model(s) afresh on TRACER's training dataset consisting of single-line incorrect-correct C program pairs (please refer to Section 4 in the manuscript linked above), execute the command given below.

python3 train.py 

Training might take between 7 and 20 minutes depending on the machine configuration. Please note that executing the training routine will overwrite any previously trained model.
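Because retraining overwrites the previous model, you may wish to copy the existing model files aside first. The sketch below assumes the trained model lives in a single directory; this README does not state where train.py writes its output, so the directory is left as a parameter and backup_directory is a hypothetical helper rather than part of MACER.

```python
import shutil
import time
from pathlib import Path


def backup_directory(model_dir, backup_root):
    """Copy model_dir into backup_root under a timestamped name.

    Both arguments are supplied by the caller; nothing here knows where
    MACER actually stores its models. Returns the path of the copy.
    """
    source = Path(model_dir)
    if not source.is_dir():
        raise FileNotFoundError("no such directory: {}".format(source))
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / "{}-{}".format(source.name, stamp)
    # copytree creates the target (and missing parents) and fails if it exists.
    shutil.copytree(str(source), str(target))
    return target
```

Calling this before python3 train.py keeps a dated copy that can be restored by copying it back by hand.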

Testing/Evaluation

Repair individual C programs

To perform repair on a single C program using MACER, use

python3 testRepair.py <path-to-file.c> <PredK>

Example: the program provided at data/input/fig_1a.c contains errors in the specifier of a for loop statement. To repair this program with the top 5 suggestions considered, execute the following command

$ python3 testRepair.py data/input/fig_1a.c 5

Pred@K accuracy on TRACER's single-line test dataset

To obtain Pred@K accuracy on TRACER's test dataset (please refer to Table 4 in the manuscript linked above), execute the command given below. MACER offers, for any input program, a ranked list of repair suggestions; the parameter PredK specifies how many of the top-ranked suggestions should be considered. Please refer to Section 6.2 in the manuscript linked above for a description of this metric.

Evaluating Pred@K accuracy on all programs in the test dataset should take about 2-3 minutes depending on the machine configuration. Pred@K accuracy obtained will be printed at the end of execution.

python3 testPredAtK.py <PredK>

Example:

$ python3 testPredAtK.py 5
Pred@5: 0.693
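To make the role of PredK concrete, the following sketch computes a Pred@K-style score from ranked suggestions. It is an illustration under stated assumptions, not the code in testPredAtK.py: pred_at_k is a hypothetical function, and the (abstract) match against the gold-standard repair is simplified here to plain equality. Section 6.2 of the manuscript remains the authoritative definition.

```python
def pred_at_k(ranked_predictions, gold_repairs, k):
    """Fraction of programs whose gold repair appears among the top-k suggestions.

    ranked_predictions: one list of suggestions per program, best first.
    gold_repairs: the reference repair for each program, in the same order.
    Matching is simplified to equality (an assumption for this sketch).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_predictions) != len(gold_repairs):
        raise ValueError("need exactly one gold repair per program")
    if not gold_repairs:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(ranked_predictions, gold_repairs)
        if gold in ranked[:k]
    )
    return hits / len(gold_repairs)
```

Raising k can only keep the score the same or increase it, since a larger prefix of each ranked list is searched.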

Repair accuracy on TRACER's single-line test dataset

To obtain MACER's repair accuracy on TRACER's single-line test dataset (please refer to Table 4 in the manuscript linked above), execute the command given below. The parameter PredK continues to specify how many of the ranked suggestions by MACER should be considered. Please refer to Section 6.2 in the manuscript linked above for a description of this metric.

Evaluating repair accuracy on all programs in the test dataset might take 30 to 50 minutes depending on the machine configuration. This takes longer than Pred@K because, for Pred@K, the repaired program is merely checked for an (abstract) match with the gold standard (the student's own repaired program), whereas here the repaired program is recompiled. The repair accuracy obtained will be printed at the end of execution.

python3 testRepair.py tracer_single <PredK>

Example:

python3 testRepair.py tracer_single 5
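The difference from Pred@K can be sketched as follows: a program counts as repaired if at least one of its top-K candidates compiles. This is a simplified illustration, not the logic of testRepair.py. The names compiles_with_clang and repair_accuracy are hypothetical, and treating "clang reports no errors" as success is an assumption, so please consult Section 6.2 of the manuscript for the exact criterion.

```python
import subprocess


def compiles_with_clang(source_code):
    """Return True if clang accepts the C source passed on standard input.

    -fsyntax-only runs parsing and semantic checks without producing output
    files; warnings alone do not cause a non-zero exit status.
    """
    result = subprocess.run(
        ["clang", "-x", "c", "-fsyntax-only", "-"],
        input=source_code.encode("utf-8"),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def repair_accuracy(candidates_per_program, k, compiles=compiles_with_clang):
    """Fraction of programs for which at least one top-k candidate compiles.

    candidates_per_program: one ranked list of candidate sources per program.
    compiles: callable deciding whether a candidate is acceptable; it can be
    swapped out, for example to test without clang installed.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not candidates_per_program:
        return 0.0
    repaired = sum(
        1
        for candidates in candidates_per_program
        if any(compiles(candidate) for candidate in candidates[:k])
    )
    return repaired / len(candidates_per_program)
```

Each candidate costs one compiler invocation, which is consistent with this evaluation taking considerably longer than the match-based Pred@K check.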

Repair accuracy on DeepFix's test dataset

To obtain MACER's repair accuracy on DeepFix's test dataset (please refer to Table 5 in the manuscript linked above), execute the command given below. The parameter PredK continues to specify how many of the ranked suggestions by MACER should be considered.

Evaluating repair accuracy on all programs in the test dataset might take around 30 minutes depending on the machine configuration. Repair accuracy obtained will be printed at the end of execution.

python3 testRepair.py deepfix <PredK>

Example:

python3 testRepair.py deepfix 5

Expected Results

MACER should offer the following performance on the given datasets. Minor deviations are expected if training afresh due to various randomizations used during the training process.

Dataset              Repair Accuracy
TRACER SingleLine    0.805
DeepFix              0.566

Contributing

This repository is released under the MIT license. If you would like to submit a bugfix or an enhancement to MACER, please open an issue on this GitHub repository. We welcome other suggestions and comments too (please mail the corresponding author at [email protected]).

Acknowledgements

We are thankful to Bhaskar Mukhoty for helping us identify certain cross-platform dependencies of the MACER toolchain.
