Giter Club home page Giter Club logo

xgdsmileboy / genpat Goto Github PK

View Code? Open in Web Editor NEW
29.0 7.0 9.0 205.98 MB

This is an automated transformation inference tool that leverages a big code corpus to guide the abstraction of transformation patterns.

License: GNU General Public License v2.0

Java 96.94% HTML 0.19% JavaScript 2.51% Shell 0.12% CSS 0.01% XSLT 0.01% R 0.11% Python 0.11% Perl 0.01% Dockerfile 0.01%
automatic-debugging program-repair program-transformation pattern-mining

genpat's Introduction

GenPat

Tool of paper: Jiang, Jiajun, et al. "Inferring program transformations from singular examples via big code." 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2019.

I. Introduction

GenPat is an automatic program transformation inferring framework, which infers general (reusable) transformations from singular examples. In this process, it leverages a big code corpus to guide the inference process. The following figure is the workflow of our approach.

The workflow of this technique.\label{workflow}

Inferring Stage

  1. Extracting Hypergraphs : parses source code before and after changing into hypergraph representations with a set of attributes and relations in the graph.
  2. Extracting Modifications : performs hypergraph differencing and extracts modifications related to corresponding nodes in the graph. Specifically, tree-based differencing, e.g., GumTree, can be used to perfrom the differencing as this hypergraph is a super graph with the original AST as backbone.
  3. Inferring Transformations : based on the previous process, we know which nodes are modified. This step first selects a set of node according to the expansion starting from the changed nodes, which will constitute the context of the transformation. Next, a big code corpus will be utilized to select the attributes of those context nodes, i.e., frequently appeared attributes across projects will be kept in the transformation pattern, while others discarded.

As a result, the final transformation pattern contains a subset of the nodes in the hypergraph and their selected attributes.

Applying Stage

  1. Matching Hypergraph : when given a transformation pattern and an examplar code snippet, GenPat first extracts a hypergraph from the code and tries to match it with the given pattern, where all nodes and attributes in the pattern should be matched with some one in the hypergraph of the examplar code.
  2. Migrating Modifications : after matching the hypergraph with the pattern, a sequnence of mappings can be obtained to record the matching result of the hypergraphs. According to the mappings, modifications related to corresponding nodes can be transferred to the target hypergraph with replacing mapped variables and expressions.

If you want to use this project, please cite our technical paper published on ASE'19.

@inproceedings{GenPat:2019,
    author   = {Jiang, Jiajun and Ren, Luyao and Xiong, Yingfei and Zhang, Lingming},
    title    = {Inferring Program Transformations From Singular Examples via Big Code},
    series   = {ASE},
    year     = {2019},
    location = {San Diego, California, U.S.},
} 

II. Environment

  • OS: Linux (Tested on Ubuntu 16.04.2 LTS)
  • JDK: Oracle jdk1.8 (important!)
  • Mavan: Apache Maven 3.6
  • Download and configure Defects4J.
  • More runtime configurations can be found in the config-file.

III. How to run

GenPat was traditionally developed as a Maven project via IntelliJ, you can simply import this project to your workspace and run it as a common Maven program or contribute it. The main class is mfix.Main, and for the running option please refer to the Running Options.

  • As it is a maven project, a simpler method to build it is using the command line with mvn package, which will generate a runnable jar file in the artifact directory. (NOTE: Please ensure you have configured Java & Maven environment first)

Running Options

Our prototype of GenPat has multiple user interfaces for different purposes, basically you can run it within command line as:

java -jar GenPat.jar FUNC ARGS

where the FUNC denotes the function/interface you want to use and ARGS is a set of arguments corresponding to the FUNC.

The following details some of the FUNC that are important:

print: print the details of pattern
filter: filter and serialize patterns with given criteria.
repair: run porgram repair
cluster: run pattern cluster

More Details related to each FUNC:

  • print: "<command> -if <arg> [-of <arg>]", printing the details of the given pattern to console or file.

    • -if: denotes the absolute path of pattern file (e.g., xxx/example/patterns/pattern_file1.pattern, serialized file).

    • -of: denotes the absolute path of the output file.

      e.g., java -jar GenPat.jar print -if in_file -of out_file

  • filter: "<command> (-ip <arg> | -filter <arg>) [-dir <arg>] [-line <arg>] [-change <arg>] [-of <arg> | -op <arg>]", filter and serialize patterns with the given criteria.

    • Option group, one of the following options should be given

      1. -ip: denotes the #Directory# of the pairwise buggy-fixed files (e.g., xxx/example/src). #Directory# should be formatted as follows:

        |-directory
          |...|-buggy-version
          |...|-| source_code_file.java
          |...|-fixed-version
          |...|-| source_code_file.java

        One can reformat the structure of files in mfix.common.util.MiningUtils.java

      2. -filter: denotes the absolute path of the file that contains the paths of patterns to be filtered (e.g., xxx/example/records/PatternRecord.txt).

    • -dir: the output directory of the serialized patterns (needed only when option -filter is used, e.g., if the option for -filter is file PatternRecord.txt, -dir can be /home/jack/code, /home/jack and so on, which is the prefix of the path.

    • -line: max number of lines of code that were changed (filtering criterion).

    • -change: max number of change actions (filtering criterion).

    • -of | -op: the -of denotes the absolute path of output file for recording the paths of patterns, while -op denotes the absolute path of output directory (default file name is "PatternRecord.txt").

  • repair: "<command> (-bf <arg> | -bp <arg> | -xml | -d4j <arg>) (-pf <arg> | -pattern <arg>) [-d4jhome <arg>]", try to repair a buggy file or a bug in defects4j with the given pattern or list of patterns. One argument in each bracket should be given.

    • Option group, one of the following options should be given:
      1. -bf: denotes the absolute path of buggy file.
      2. -bp: denotes the absolute base directory of buggy program or file (GenPat will traverse the files in the directory for repair).
      3. -xml: no arguments for this option, denotes reading the buggy subject information from project.xml.
      4. -d4j: denotes the bug id in defects4j benchmark, e.g., chart_1.
    • Option group, one of the following options should be given:
      1. -pf: denotes the absolote path of the file which records the paths of all avaliable patterns (e.g., xxx/example/records/PatternRecord.txt).
      2. -pattern: denotes the absolote path of pattern file (e.g., xxx/example/patterns/pattern_file1.pattern).
    • -d4jhome: denotes the home directory of defects4j, used for running defects4j command (e.g., /home/jack/code/defects4j).
  • cluster: "<command> -if <arg> [-dir <arg>] [-op <arg>]", cluster same patterns together.

    • -if: denotes the absolute path of the file which contains the path information of patterns (e.g., xxx/example/records/PatternRecord.txt).
    • -dir: the home directory of pattern files, it can be /home/jack/code, /home/jack and so on, which is the prefix of the path in PatternRecord.txt).
    • -op: the output path of results.

We only list some of the most important features of GenPat, please learn more via checking its implementation.

Result Analysis

  • Typically, the output is the transformed source code after applying some pattern.
  • Particularly, when using GenPat for program repair on defects4j, it will produce two sub-folders, i.e., log and patch, where the log folder contains the log information during the repair, including used pattern, code-diffs and so on (view log for more details), while the patch folder contains the patches that can pass the test cases. (view patch for more details).

IV. Evaluation Result

Evaluations on two distinct application scenarios:

  • Systematic Editing—significantly outperforms the state-of-the-art Sydit with up to 5.5x correctly transformed cases.

  • Automatic Program Repair —successfully fixed 19 bugs in the Defects4J benchmark, 4 of which have never been repaired by any existing technique.

    Please read our paper for more evaluation results.

V. Structure of the project

  |--- README.md      :  user guidance
  |--- resources      :  resource files and configurations
  |--- doc            :  document
  |--- final          :  evaluation result
  |--- lib            :  dependent libraries
  |--- repair-result: :  evaluation result in our paper for program repair
  |--- replication    :  replication package with sampled cases
  |--- sbfl           :  fault localization tool
  |--- src            :  source code and test suite

ALL suggestions and contributions are welcomed.

genpat's People

Contributors

luyaor avatar monperrus avatar xgdsmileboy avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

genpat's Issues

How to run GenPat on my own data?

Hi @xgdsmileboy @FancyCoder0 , I understand the tool is mainly built on the top of the D4J data.
My question is, how can I run it on my own buggy and fixed code pair to get the inferred transformation pattern and apply it on my own file? Is there any easy way to do that without too many code changes? I am looking forward to hearing suggestions from you. Thx!

Wanted the steps to run this application in Windows10 System

on Repair getting below exception
[ERROR] 2021-10-08 17:36:23,511 @javaFile #readFileToString Illegal input file path : C:\XXXXX\XXXXX\Genpat_Original_code\GenPat\example\src\buggy-version
Exception in thread "main" java.lang.NullPointerException
at mfix.core.locator.FakeLocator.getLocations(FakeLocator.java:43)
at mfix.tools.Repair.repair(Repair.java:573)
at mfix.tools.Repair.repairAPI(Repair.java:703)
at mfix.Main.main(Main.java:58)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.