This repository is a fork of https://bitbucket.org/rjust/fault-localization-data/overview. For analysis data and results, see http://bit.ly/pr_research_spreadsheet and use the sheet tabs in the bottom bar to switch between results.

Fault-localization-data repository

This repository contains data files, data-collection scripts, and data-analysis scripts of the "Evaluating and Improving Fault Localization Techniques" project. Before exploring this repository, please read the technical report that describes the results.

Overview

The experiments evaluate various fault localization techniques on artificial faults and on real faults.

At a high level, here's how it all works:

  • The real and artificial faults come from the Defects4J Project.
  • For each D4J fault, the scripts in d4j_integration/ determine which lines are faulty. The resultant files are "buggy-lines" files, and live in analysis/pipeline-scripts/buggy-lines/.
  • Many fault localization techniques require coverage information. We use GZoltar to gather coverage information. The resultant files are called "matrix" and "spectra".
  • Mutation-based fault localization (MBFL) techniques require mutation analysis. Our Killmap project (which lives in killmap/) does mutation analysis on all faults. The resultant files are called "killmaps," and specify how each test behaves on each mutant. (Each killmap also has an associated "mutants-log" file, which describes all the mutants that were analyzed.)
  • Our scripts enable you to compute all the mutation and coverage information, but doing so takes a great deal of computation. The resulting mutation/coverage information is available at http://fault-localization.cs.washington.edu.
  • The "scoring pipeline" (which lives in analysis/pipeline-scripts/) determines how well each FL technique does on each fault -- that is, where the real buggy lines appear in the FL technique's ranking of the lines of the program. The results appear in data/.
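
To make the idea behind the last step concrete, here is a toy sketch of finding where the first buggy line lands in a ranking. It is not the scoring pipeline itself. The function name first_buggy_rank and both file formats (one line identifier per row) are assumptions made for illustration, and the real pipeline computes its own metrics from its own file formats.

```shell
# Hypothetical helper, not part of the repository.
# $1: ranking file, one line identifier per row, most suspicious first (assumed format)
# $2: buggy-lines file, one line identifier per row (assumed format)
# Prints the 1-based rank of the first buggy line; returns non-zero if none is ranked.
first_buggy_rank() {
  # -x -F: match whole rows as fixed strings; -n: prefix each match with its row number.
  grep -n -x -F -f "$2" "$1" | head -n 1 | cut -d: -f1 | grep .
}
```

A lower rank means the technique would point a developer at a faulty line sooner.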

How-To

Before doing anything else, run ./setup.sh. This:

  • clones the appropriate Defects4J fork (unless you've already exported a D4J_HOME directory);
  • updates your .bashrc to export some environment variables:
    • D4J_HOME and DEFECTS4J_HOME, pointing to the newly cloned defects4j repository, if a clone was needed
    • FL_DATA_HOME, pointing here
    • KILLMAP_HOME, pointing at ./killmap/
    • GZOLTAR_JAR, pointing to ./gzoltar/gzoltar.jar
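
After running ./setup.sh and opening a new shell, a quick check along these lines may help confirm that the variables listed above are exported. The function name check_fl_env is hypothetical and not part of the repository; only the variable names come from this README.

```shell
# Hypothetical helper, not part of the repository: report which of the
# variables that setup.sh adds to .bashrc are missing from the environment.
check_fl_env() {
  local missing=0 name
  for name in D4J_HOME DEFECTS4J_HOME FL_DATA_HOME KILLMAP_HOME GZOLTAR_JAR; do
    # printenv prints nothing for a variable that is unset or not exported.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_fl_env prints one line per missing variable and returns a non-zero status if any are missing, so it can guard later steps, for example: check_fl_env && echo "environment looks ready".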

How to score techniques

The workflow to score a set of FL techniques on a given fault looks like this:

  • Various pieces of fault information were generated by the tools in ./d4j_integration/ and then checked in. You don't need to generate them yourself, but if you want to, see the README.md in that directory.

  • To run GZoltar, use gzoltar/run_gzoltar.sh.

    Example invocation: bash run_gzoltar.sh Lang 37 . developer

    Creates the files ./matrix and ./spectra.

  • To run Killmap, use killmap/scripts/generate-matrix.

    Example invocation:

    killmap/scripts/generate-matrix \
      Lang 37 \
      /tmp/Lang-37 \
      Lang-37.mutants.log \
      | gzip > Lang-37.killmap.csv.gz
    

    Creates the files Lang-37.killmap.csv.gz and Lang-37.mutants.log.

  • To run the scoring pipeline, use analysis/pipeline-scripts/do-full-analysis.

    Example invocation:

    analysis/pipeline-scripts/do-full-analysis \
      Lang 37 'developer' \
      ./matrix ./spectra \
      Lang-37.killmap.csv.gz Lang-37.mutants.log \
      /tmp/Lang-37-scoring \
      Lang-37.scores.csv
    

    Creates the file Lang-37.scores.csv.

For more details on any of these scripts, see the README.md in the script's directory.

If you want to skip running GZoltar and Killmap (which can be very computationally expensive), you can download the resulting files from http://fault-localization.cs.washington.edu.
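
To repeat the workflow for other faults, the three example invocations can be parameterized by project and bug number. The sketch below only prints the commands so they can be reviewed first. The function name print_scoring_commands is hypothetical, and it assumes that the scripts take their arguments in the same order as in the examples above and that they can be run from the repository root.

```shell
# Hypothetical helper, not part of the repository: print the three workflow
# commands for one fault, using the argument order shown in the examples above.
# Assumption: all paths are relative to the repository root.
print_scoring_commands() {
  local project=$1 bug=$2
  local id="$project-$bug"
  cat <<EOF
bash gzoltar/run_gzoltar.sh $project $bug . developer
killmap/scripts/generate-matrix $project $bug /tmp/$id $id.mutants.log | gzip > $id.killmap.csv.gz
analysis/pipeline-scripts/do-full-analysis $project $bug developer ./matrix ./spectra $id.killmap.csv.gz $id.mutants.log /tmp/$id-scoring $id.scores.csv
EOF
}
```

Running print_scoring_commands Lang 37 shows the commands for the fault used in the examples. Once they look right, the output can be piped to bash, keeping in mind that the GZoltar and Killmap steps can be very computationally expensive.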

Contents

  • analysis/: Tools for analyzing the output of coverage/mutation analyses.

  • aws/: Scripts for computing killmaps on AWS.

  • cluster_scripts/: Scripts for computing killmaps on a Sun Grid cluster.

  • d4j_integration/: Scripts that build upon or extend Defects4J to populate or query its database.

  • data/: Data files for the final results and corresponding support scripts.

  • gzoltar/: Scripts for running the GZoltar tool to collect line coverage information.

  • killmap/: Mutation-analysis tool whose output is used for the MBFL techniques we study.

  • stats/: R scripts that crunch the data to produce numbers for the paper.

  • utils/: Utility programs and libraries for running/analyzing tests and parsing data files.
