SootFX

SootFX is a static code feature extraction tool for Java and Android, built to be extended with new resource providers and feature extraction units. It currently extracts method, class, and whole-program features from jar and apk files; for apk files it also extracts manifest features.

Building

SootFX is a Java project built with Maven. It has not been tested on Java versions other than 8, so we recommend using Java 8. It can be imported into your favourite IDE as a Maven project and built with the Maven plugins. It depends on soot-infoflow-android, which is not available on Maven Central; install this library into the local Maven repository by running the install_dependencies.sh script in the dependencies folder. After that, SootFX can be built with:

mvn install

Usage

SootFX can be used with its Java API, Python API or CLI.

Java API

The Java API can be accessed via api.SootFX. An example that extracts all the method features:

SootFX sootFX = new SootFX();
sootFX.addClassPath(path); //path to jar or apk file
Set<MethodFeatureSet> featureSets = sootFX.extractAllMethodFeatures();
sootFX.printMultiSetToCSV(featureSets, outPath); //path to output csv file

Python API

Python requirements are defined in requirements.txt, and can be installed by running:

pip install -r requirements.txt

The Python API can be accessed in two steps:

  1. Run api.SootFXEntryPoint, which starts a Py4J gateway server.
  2. Run main.py in SootFXPy, which accesses the Java API over the gateway.
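Because the second step only works once the gateway server from the first step is up, it can help to check that the server is reachable before connecting. The following is a minimal sketch, not part of SootFX: `wait_for_port` is a hypothetical helper, and it assumes the gateway listens on Py4J's default port 25333, which the entry point may or may not use.

```python
import socket
import time

# Assumption: api.SootFXEntryPoint uses Py4J's default gateway port.
PY4J_DEFAULT_PORT = 25333


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some server is accepting on this port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example use before creating the gateway in Python:
#     if not wait_for_port("127.0.0.1", PY4J_DEFAULT_PORT, timeout=10.0):
#         raise RuntimeError("gateway not reachable; is api.SootFXEntryPoint running?")
```

The check only confirms that something accepts TCP connections on the port; it does not verify that the listener is the SootFX gateway.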

Obtaining the API handle:

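from py4j.java_gateway import JavaGateway

gateway = JavaGateway()  # connects to the gateway server started by api.SootFXEntryPoint
sootFX = gateway.entry_point.sootFX()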

Listing all the method features:

availableFeatures = sootFX.listAllMethodFeatures()

Extracting selected method features as a pandas DataFrame:

selected_features = gateway.jvm.java.util.ArrayList()
selected_features.add('MethodAssignStmtCount')
selected_features.add('MethodBranchCount')
extracted_features = sootFX.extractMethodFeatures(selected_features)
df = converter.to_dataframe(extracted_features)

CLI

An executable jar can be built with:

mvn package

This creates SootFX-1.0-SNAPSHOT-jar-with-dependencies.jar in the target folder, which can be used as a CLI tool. The commands below refer to the jar as SootFX.jar; substitute the path of the jar you built.
For jar files:

  • to extract all the features:
    java -jar SootFX.jar "path/to/jar" "path/to/out/"
    
  • to extract features by using inclusion or exclusion lists:
    java -jar SootFX.jar "path/to/jar" "path/to/out/" "path/to/config.yaml"
    

For apk files:

  • to extract all the features:
    java -jar SootFX.jar "path/to/apk" "path/to/out/" "path/to/Android/sdk/platforms"
    
  • to extract features by using inclusion or exclusion lists:
    java -jar SootFX.jar "path/to/apk" "path/to/out/" "path/to/config.yaml" "path/to/Android/sdk/platforms"
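The four invocations above differ only in which optional arguments follow the output path. As a rough sketch for scripting them, the hypothetical helper below (not part of SootFX) assembles the argument list in the same order as the commands shown; the parameter names are illustrative assumptions.

```python
from typing import List, Optional


def build_sootfx_command(
    jar: str,
    target: str,
    out_dir: str,
    config: Optional[str] = None,
    android_platforms: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list in the order used by the commands above."""
    # apk analysis takes the Android SDK platforms path as the last argument.
    if target.lower().endswith(".apk") and android_platforms is None:
        raise ValueError("apk targets need the Android SDK platforms path")
    cmd = ["java", "-jar", jar, target, out_dir]
    if config is not None:
        cmd.append(config)  # optional inclusion/exclusion config
    if android_platforms is not None:
        cmd.append(android_platforms)
    return cmd


# Example use (runs the tool, so it needs Java and the built jar):
#     import subprocess
#     subprocess.run(build_sootfx_command("SootFX.jar", "path/to/jar", "path/to/out/"), check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.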
    

Inclusion and exclusion lists for the different types of feature extraction units can be defined in config.yaml. Provide either an inclusion list or an exclusion list, not both. An inclusion list extracts only the selected features; an exclusion list extracts all features except the selected ones.
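The difference between the two list types can be illustrated with a small sketch. This is not SootFX's implementation and does not reflect the config.yaml schema; `select_features` is a hypothetical function, and `HypotheticalFeature` is a made-up name used only to have a third entry next to the two feature names from the Python API example.

```python
from typing import Iterable, List, Optional


def select_features(
    available: Iterable[str],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[str]:
    """Apply either an inclusion list or an exclusion list to the available features."""
    # Exactly one of the two lists must be given.
    if (include is None) == (exclude is None):
        raise ValueError("provide exactly one of include or exclude")
    if include is not None:
        wanted = set(include)
        # Inclusion: keep only the selected features.
        return [name for name in available if name in wanted]
    unwanted = set(exclude)
    # Exclusion: keep everything except the selected features.
    return [name for name in available if name not in unwanted]


available = ["MethodAssignStmtCount", "MethodBranchCount", "HypotheticalFeature"]
only_branch = select_features(available, include=["MethodBranchCount"])
all_but_branch = select_features(available, exclude=["MethodBranchCount"])
```

Here `only_branch` contains just `MethodBranchCount`, while `all_but_branch` contains the other two names in their original order.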

About

SootFX is developed by the Secure Software Engineering Group at Paderborn University.

Feel free to open an issue if you find a bug or would like to implement a new feature.

Publications

For further information, have a look at the publication preprint.

If you use SootFX in your research projects, feel free to cite our paper:

@INPROCEEDINGS{9610670,
  author={Karakaya, Kadiray and Bodden, Eric},
  booktitle={2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM)},   
  title={SootFX: A Static Code Feature Extraction Tool for Java and Android},   
  year={2021},  
  volume={},  
  number={},  
  pages={181-186},  
  doi={10.1109/SCAM52516.2021.00030}
 }
