codeshield-security / spds

This project forked from crossingtud/spds

Efficient and Precise Pointer-Tracking Data-Flow Framework

License: Eclipse Public License 2.0

Java 99.99% Shell 0.01%

spds's Issues

Backward Query fails to apply flow function on first statement

Assume the BackwardFlowFunctions are overridden to kill any data-flow fact at any call to System.exit. Then consider the code below:

int i = 0;
System.exit(0);
queryFor(i);

When a backward query for i is triggered just before the queryFor(i) statement, no kill occurs and the query incorrectly reaches i = 0. The flow function is apparently not applied to the first statement the query encounters, here the System.exit(0) call.
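
The expected behavior can be illustrated with a small self-contained sketch. It does not use the Boomerang API: the class BackwardKillSketch, the method reachedBackward, and the representation of statements as strings are all hypothetical, chosen only to show that the kill check has to be consulted for the statement directly preceding the query statement as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical toy model, not Boomerang code: statements are plain strings.
final class BackwardKillSketch {

  /**
   * Walks backward from the statement at queryIndex and returns the statements
   * the fact reaches. The kill predicate is checked for every predecessor,
   * including the one directly before the query statement.
   */
  static List<String> reachedBackward(
      List<String> stmts, int queryIndex, String fact, BiPredicate<String, String> kills) {
    List<String> reached = new ArrayList<>();
    for (int i = queryIndex - 1; i >= 0; i--) {
      String stmt = stmts.get(i);
      if (kills.test(stmt, fact)) {
        break; // the fact is killed here; nothing earlier is reached
      }
      reached.add(stmt);
    }
    return reached;
  }

  public static void main(String[] args) {
    List<String> body = Arrays.asList("int i = 0;", "System.exit(0);", "queryFor(i);");
    // Kill every fact at a call to System.exit, as in the report above.
    List<String> reached =
        reachedBackward(body, 2, "i", (stmt, fact) -> stmt.contains("System.exit"));
    System.out.println(reached); // prints []
  }
}
```

With the kill applied at System.exit(0), the query never reaches int i = 0; this is the result the report expects from the real solver.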

VectorTests fail on Java >= 9

The VectorTest tests (test2, test4, test5, test6) fail on Java >= 9 with the message:

java.lang.RuntimeException: Unsound results: [MustBe [s (typestate.tests.VectorTest.<typestate.tests.VectorTest: void test2()>) @ mustBeInAcceptingState(s) in state ACCEPTING]]

On Java 8 the tests succeed, but on Java >= 9 they fail. This needs further investigation. Many java.util.Collection implementations changed between Java 8 and Java 9. We need to inspect the differences between the java.util.Vector implementations and their effect on Boomerang to see why those tests now fail.

Crash with ClassCastException

Following the README examples, I tried to run the IDEal example inference.example.Main, but it crashed with this output:

java.lang.ClassCastException: boomerang.scene.jimple.JimpleVal cannot be cast to boomerang.scene.AllocVal
	at boomerang.WeightedBoomerang.forwardSolve(WeightedBoomerang.java:1176)
        ...

The failing cast is here:

SPDS/boomerangPDS/src/main/java/boomerang/WeightedBoomerang.java

Line 1176 in 1179227

var = ((AllocVal) query.var()).getDelegate();

I changed it to

var = query.var();

and the crash no longer occurs.
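
A more defensive variant of this workaround would unwrap only when the value really is a wrapper. The sketch below is self-contained and uses hypothetical stand-in types (Val, PlainVal, WrappingVal) rather than the real boomerang.scene classes; whether a query variable that is not an AllocVal is valid input at that point in WeightedBoomerang is a question for the maintainers.

```java
// Hypothetical stand-ins for the Boomerang value types; not the real API.
final class GuardedUnwrapSketch {

  /** Stand-in for a generic value such as boomerang.scene.Val. */
  interface Val {}

  /** Stand-in for a plain value such as JimpleVal. */
  static final class PlainVal implements Val {
    final String name;

    PlainVal(String name) {
      this.name = name;
    }
  }

  /** Stand-in for AllocVal: a value that wraps another value. */
  static final class WrappingVal implements Val {
    private final Val delegate;

    WrappingVal(Val delegate) {
      this.delegate = delegate;
    }

    Val getDelegate() {
      return delegate;
    }
  }

  /** Unwraps only real wrappers; any other value is returned unchanged. */
  static Val unwrap(Val v) {
    if (v instanceof WrappingVal) {
      return ((WrappingVal) v).getDelegate();
    }
    return v;
  }

  public static void main(String[] args) {
    Val plain = new PlainVal("x");
    System.out.println(unwrap(plain) == plain); // prints true
    System.out.println(unwrap(new WrappingVal(plain)) == plain); // prints true
  }
}
```

Compared with dropping the cast entirely, the instanceof check keeps the original unwrapping behavior for wrapped values and only changes what happens for values that previously caused the ClassCastException.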

Could not find de.fraunhofer.iem:PathExpression:1.0.0 dependency

I'm attempting to use SPDS as a Gradle dependency following this and this. I have

implementation 'de.fraunhofer.iem:WPDS:3.0.8'

in my build.gradle. I don't think the issue is with Gradle because I am able to pull the base "WPDS" package, just not one of its dependencies. I have the following error when building

Execution failed for task '[...]'.
> Could not resolve all files for configuration '[...]'.
   > Could not find de.fraunhofer.iem:PathExpression:1.0.0.
     Searched in the following locations:
       - https://repo.maven.apache.org/maven2/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.pom
       - https://repo.maven.apache.org/maven2/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.jar
       - https://jcenter.bintray.com/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.pom
       - https://jcenter.bintray.com/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.jar
       - https://maven.pkg.github.com/CodeShield-Security/SPDS/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.pom
       - https://maven.pkg.github.com/CodeShield-Security/SPDS/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.jar

Dataflow gap with static setter/getter + alias

I think I've found a case where Boomerang misses some dataflow (3.1.2).

The reproducer below is based on FlowDroid Listing 2 (PLDI '14), but uses a static setter/getter instead of direct field writes.

boomerangPDS/src/main/java/boomerang/example/BoomerangExampleTarget1.java

public class BoomerangExampleTarget1 {
  public static void main(String... args) {
    Data p = new Data();
    taintIt(customSource(), p);
  }

  private static String customSource() { return "I'm tainted"; }
  private static void customSink(String sunk) { System.out.println(sunk); }

  private static void taintIt(String in, Data out) {
    Data x = out;
    Data.setter(x, in);
    customSink(Data.getter(out));
  }

  static class Data {
    private String f;
    static void setter(Data self, String value) {
      self.f = value;
    }
    static String getter(Data self) {
      return self.f;
    }
  }
}

If I change Data.setter(x, in); to Data.setter(out, in);, then I get a path from the source to the sink.

boomerangPDS/src/main/java/boomerang/example/ExampleMain1.java: createAnalysisTransformer()

 private static Transformer createAnalysisTransformer() {
    return new SceneTransformer() {
      protected void internalTransform(
          String phaseName, @SuppressWarnings("rawtypes") Map options) {
        SootCallGraph sootCallGraph = new SootCallGraph();
        AnalysisScope scope =
            new AnalysisScope(sootCallGraph) {
              @Override
              protected Collection<? extends Query> generate(Edge cfgEdge) {
                Statement statement = cfgEdge.getStart();
                if (statement.toString().contains("customSource") && statement.containsInvokeExpr()) {
                  Val arg = statement.getLeftOp();
                  return Collections.singleton(new ForwardQuery(cfgEdge,
                          new AllocVal(arg, statement, arg)));
                }
                return Collections.emptySet();
              }
            };

        Boomerang solver =
            new Boomerang(
                sootCallGraph, SootDataFlowScope.make(Scene.v()), new DefaultBoomerangOptions() {
              @Override
              public int analysisTimeoutMS() {
                return 10000;
              }
            });

        Collection<Query> seeds = scope.computeSeeds();
        for (Query query : seeds) {
          System.out.println("Solving query: " + query);

          ForwardBoomerangResults<Weight.NoWeight> res = solver.solve((ForwardQuery) query);

          if (res.isTimedout()) {
            throw new RuntimeException("Timed out");
          }

          res.asStatementValWeightTable().cellSet().forEach(cell -> {
            if (cell.getRowKey().getStart().containsInvokeExpr() &&
                    cell.getRowKey().getStart().getInvokeExpr().getMethod().getName().contains("customSink") &&
                    cell.getRowKey().getStart().uses(cell.getColumnKey())) {
              System.out.println("SOURCE: " + query.cfgEdge().getStart().toString());
              System.out.println("SINK: " + cell.getRowKey().getStart().toString());
            }
          });
        }
      }
    };
  }

handleMaps is ignored

The handleMapsBackward and handleMapsForward methods ignore the handleMaps option. It doesn't look like this option is used anywhere in the code.

Old Soot dependency

Currently all references to Soot point to ca.mcgill.sable at version 4.1.0. To ensure SPDS runs on the latest version of Soot, I suggest changing the dependencies to org.soot-oss at version 4.2.1.

Unable to install SPDS

When I try to build SPDS, the server at https://soot-build.cs.uni-paderborn.de/nexus/repository/swt-upb/ returns a 502 Bad Gateway error.
Specifically, the build fails when it tries to obtain the de.fraunhofer.iem:PathExpression:jar:1.0.0 jar file.
Could somebody help me solve this build problem?

Cheers!

3.1.2 release did not go through

The readme was updated to 3.1.2, but the release did not succeed via Actions.

Track multiple variables with the same query

Hello,

Is it possible to create a query that will track two variables?
For example:

void main() {
	foo();
	bar();
}

void foo() {
	queryTarget("a1", "b1");
}

void bar() {
	queryTarget("a2", "b2");
}

void queryTarget(String x, String y) {}

Given that my entry point is main, if I query x, I get a1 and a2. If I then query y, I get b1 and b2. The results I expect are the pairs (a1,b1) and (a2,b2), and NOT (a1,b2) or (a2,b1).
How can this be done?

Thanks,
Tom
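
One possible client-side approach is to run the two queries separately and join their results on the call site through which each value flowed. The sketch below is a self-contained illustration of that join only. It assumes the client can associate every result with its call site; the issue does not say whether or how Boomerang exposes this, and the class PairByCallSiteSketch and its string-keyed maps are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: results are modeled as call site -> value maps.
final class PairByCallSiteSketch {

  /** Joins the per-call-site results for x and y on the call site key. */
  static Map<String, List<String>> pair(
      Map<String, String> xByCallSite, Map<String, String> yByCallSite) {
    Map<String, List<String>> pairs = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : xByCallSite.entrySet()) {
      String y = yByCallSite.get(e.getKey());
      if (y != null) {
        // Only values that arrived through the same call site are paired.
        pairs.put(e.getKey(), Arrays.asList(e.getValue(), y));
      }
    }
    return pairs;
  }

  public static void main(String[] args) {
    Map<String, String> x = new LinkedHashMap<>();
    x.put("foo", "a1");
    x.put("bar", "a2");
    Map<String, String> y = new LinkedHashMap<>();
    y.put("foo", "b1");
    y.put("bar", "b2");
    System.out.println(pair(x, y)); // prints {foo=[a1, b1], bar=[a2, b2]}
  }
}
```

Because the join key is the call site, the cross combinations (a1,b2) and (a2,b1) cannot appear in the output.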

Subqueries must continue in callers

When a subquery is triggered and reaches the entry point of a method, it does not seem to propagate to any caller.

class CustomQuery1 {
    final String field;

    private CustomQuery1() {
        field = "someInfo";
    }

    public static void main() {
        new CustomQuery1().example();
    }

    public void example() {
        String info = field.toString();
        queryFor(info); // Should find "someInfo" but does not.
    }
}

class CustomQuery2 {
    final String field;

    private CustomQuery2() {
        field = "someInfo";
    }

    public static void main() {
        new CustomQuery2().example2();
    }

    public void example2() {
        CustomQuery2 c = new CustomQuery2();
        String info = c.field.toString();
        queryFor(info); // Should find "someInfo" and does so.
    }
}

Boomerang crashes when summaries are enabled

Hi,

I am using Boomerang 3.0.10 and trying to run the provided examples, but with call summaries enabled. The only change I have made to the examples is that I have replaced

          // 1. Create a Boomerang solver.
          Boomerang solver =
              new Boomerang(
                  sootCallGraph, SootDataFlowScope.make(Scene.v()), new DefaultBoomerangOptions());

with the following code:

        // 1. Create a Boomerang solver.
        Boomerang solver =
            new Boomerang(
                sootCallGraph,
                SootDataFlowScope.make(Scene.v()),
                new DefaultBoomerangOptions() {
                  @Override
                  public boolean callSummaries() {
                    return true;
                  }
                });

Running ExampleMain1 produces the following output:

<boomerang.example.BoomerangExampleTarget1: void <init>()>
<boomerang.example.BoomerangExampleTarget1: void main(java.lang.String[])>
<boomerang.example.BoomerangExampleTarget1: void staticCallOnFile(boomerang.example.BoomerangExampleTarget1$ClassWithField,boomerang.example.BoomerangExampleTarget1$NestedClassWithField)>
<boomerang.example.BoomerangExampleTarget1: void queryFor(boomerang.example.BoomerangExampleTarget1$ObjectOfInterest)>
Solving query: BackwardQuery: (queryVariable (boomerang.example.BoomerangExampleTarget1.<boomerang.example.BoomerangExampleTarget1: void staticCallOnFile(boomerang.example.BoomerangExampleTarget1$ClassWithField,boomerang.example.BoomerangExampleTarget1$NestedClassWithField)>),RS: queryFor(queryVariable))
Boomerang CRASHEDjava.lang.NullPointerException
All allocation sites of the query variable are:
{}
All aliasing access path of the query variable are:
[]

The null pointer exception occurs when Boomerang tries to retrieve a summary automaton during post*, but no summaries have been created yet at this point. I can also provide the whole stack trace if it helps, but it is quite long.

Is there some additional way that I need to configure Boomerang before I can use summaries?

Thanks,
David

Some questions

Hi Johannes! Hope you are doing well.

My name is Matan. I read your excellent Boomerang paper (over and over again) after watching the great DECA I/II course by Prof. Bodden and reading the SPDS paper. I'm trying to reproduce the paper by re-implementing Boomerang for Java in Golang.
I also looked at the code of your group's Java implementation of Boomerang, but unfortunately I find it impenetrable. This is one of the reasons I want to re-implement it (to understand it by building it).

I have a few open questions that I'm still not clear about and would greatly appreciate some clarification.

Is the client-driven context mechanism also used within subqueries?

For example, in Figure 1, when performing a forward query starting from context1 at the allocation at line #4, asking which access paths at foo point to it, we encounter a field-write POI at a.f = s. This should trigger an "AllAliases" query at this point, which should only look at context1.
How is this information represented and passed to the subquery? Using the same mechanism described in "4.2 Client-Driven Context-Resolution"?

How can the idea of call/return POIs be implemented using SPDS instead of the access graph formulation (i.e. when we have a field PDS and automaton instead of an access graph)? In the access graph formulation we recursively query for all prefixes. How is this translated into the SPDS formulation?

Can dynamic fields and calls be supported as additional POIs? By dynamic fields I mean x[f] where f is a variable (e.g. hashmaps or arrays); by dynamic calls I mean f(..) where f is a variable (e.g. lambdas). My thought is that we can set the domain of fields to be all allocation sites and treat each statement involving dynamic fields as a POI that requires a PointsTo query to obtain the set of possible fields for this statement.

Some questions when I run ExampleMain2

Hello, I wonder whether ExampleMain2, i.e. the forward query of the solver, computes all the sites on the control flow graph that are reachable from the seeds. And if I want to instantiate a data-flow analysis, should I use the class boomerang.guided.DemandDrivenGuidedAnalysis or the class ideal.IDEALAnalysis? Thank you very much!

Do not propagate into Summary when using DemandDrivenGuidedAnalysis

AllocationSites and Aliases not found when there is a call to an empty method

For the following target program:

public final class V {

  static Vector v;

  public static void main(String[] args) {
    Vector x = new Vector();
    v = x;
    foo();
    v.firstElement();
  }

  public static void foo() {
  }
}

I create an alias query at v.firstElement():

BackwardQuery:

($stack3 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
$stack3.firstElement() -> return)

After query solving I get:

getAllocationSites:
{}

getAllAliases:
[]

But if I remove the empty method call, it works as expected:

public final class V {

  static Vector v;

  public static void main(String[] args) {
    Vector x = new Vector();
    v = x;
    //foo();
    v.firstElement();
  }

  public static void foo() {
  }
}

I create an alias query at v.firstElement():

The Same BackwardQuery:

($stack3 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
$stack3.firstElement() -> return)

After query solving I get:

getAllocationSites:
{ForwardQuery: ($stack2 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
$stack2 = new Vector -> $stack2.<init>())=boomerang.results.AbstractBoomerangResults$Context@e307c342}

getAllAliases:
[$stack3 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>), 
$stack2 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>), 
x (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
StaticField: v<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>]

I am not sure whether my configuration is wrong. I use the following Boomerang options:

static class BoomerangOptions extends DefaultBoomerangOptions{
    @Override
    public boolean onTheFlyCallGraph() {
        return false;
    }

    @Override
    public StaticFieldStrategy getStaticFieldStrategy() {
        return StaticFieldStrategy.FLOW_SENSITIVE;
    }

    @Override
    public boolean allowMultipleQueries() {
        return true;
    }

    @Override
    public boolean throwFlows() {
        return true;
    }

    @Override
    public boolean trackAnySubclassOfThrowable() {
        return true;
    }
}

How to share context with follow up queries?

I have the following test program and want to perform a BackwardQuery for variable bar in line

ConstantPropagationTest.assertReachesSources(bar, "java.util.Scanner:");

The query gives me the AllocVal

        return new StringBuilder(s).toString();

and from there, I want to start a follow up query for s.

import java.util.Scanner;
import ConstantPropagationTest;

public class Test1 {

    /**
     * Our constant propagation finds the definition of the return value here
     * and knows (by looking up a list of builtin summaries) that it needs to
     * perform a follow-up {@link boomerang.BackwardQuery} for s.
     * However, this follow-up query does not share the {@link boomerang.Context} of the original
     * query, so when it steps out of {@link #loosingTrackOfThings(String)}, it does not remember
     * that it came from {@link #entryPoint()} and explores all possible call sites instead.
     *
     * @param s Some string that's being passed through.
     * @return The same string that was passed as an argument.
     */
    public static String loosingTrackOfThings(String s) {
        return new StringBuilder(s).toString();
    }


    public void entryPoint() {
        Scanner scanner = new Scanner(System.in);
        String fooTest1 = scanner.next();
        final String bar = loosingTrackOfThings(fooTest1);
        ConstantPropagationTest.assertReachesSources(bar, "java.util.Scanner:");
    }

    public void unreleatedMethod1() {
        Scanner scanner = new Scanner(System.in);
        String fooTest1 = scanner.nextLine();
        final String bar = loosingTrackOfThings(fooTest1);

    }

    public void unreleatedMethod2() {
        Scanner scanner = new Scanner(System.in);
        String fooTest1 = scanner.toString();
        final String bar = loosingTrackOfThings(fooTest1);
    }
}

Currently, the second backward query finds reaching definitions in entryPoint, unreleatedMethod1, and unreleatedMethod2 because the second query doesn't get the call stack from the first.

Is there a way to share information between queries in Boomerang?

CryptoAnalysis gets StackOverflowError when analyzing its own jar file

The problem occurs when CryptoAnalysis analyzes its own jar file. A StackOverflowError was found while investigating the issue. The snapshot of the error is here:

This issue was found while investigating the main issue in CryptoAnalysis.

Stack overflow in Apache Flink

When running BoomerangPretransformer on ~/git/flink/build-target/lib/flink-table_2.11-1.12-SNAPSHOT.jar from Apache Flink, there is a stack overflow:

java.lang.StackOverflowError
	at soot.options.Options.v(Options.java:42)
	at soot.AbstractSootFieldRef.checkStatic(AbstractSootFieldRef.java:113)
	at soot.AbstractSootFieldRef.resolve(AbstractSootFieldRef.java:131)
	at soot.AbstractSootFieldRef.resolve(AbstractSootFieldRef.java:109)
	at soot.jimple.internal.AbstractInstanceFieldRef.getField(AbstractInstanceFieldRef.java:96)
	at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:298)
	at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
	at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
	at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
	at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
	at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
	at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
	at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
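
The repeated frame at BoomerangPretransformer.java:310 suggests unbounded or very deep recursion over called methods, although the report does not state the cause. A common way to make such a traversal safe is an explicit worklist with a visited set, which terminates on cyclic call chains and does not grow the Java stack. The sketch below is self-contained and is not the actual BoomerangPretransformer code; the class name and the string-keyed maps standing in for the call graph and field writes are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: methods and fields are strings, the call graph is a map.
final class VisitedSetTraversalSketch {

  /** Collects fields written by root and by everything it transitively calls. */
  static Set<String> fieldsDefinedIn(
      String root, Map<String, List<String>> callees, Map<String, List<String>> fieldWrites) {
    Set<String> fields = new LinkedHashSet<>();
    Set<String> visited = new HashSet<>();
    Deque<String> worklist = new ArrayDeque<>();
    worklist.push(root);
    while (!worklist.isEmpty()) {
      String method = worklist.pop();
      if (!visited.add(method)) {
        continue; // already processed; this is what breaks cycles
      }
      fields.addAll(fieldWrites.getOrDefault(method, Collections.emptyList()));
      for (String callee : callees.getOrDefault(method, Collections.emptyList())) {
        worklist.push(callee);
      }
    }
    return fields;
  }

  public static void main(String[] args) {
    Map<String, List<String>> callees = new HashMap<>();
    callees.put("a", Collections.singletonList("b"));
    callees.put("b", Collections.singletonList("a")); // mutual recursion
    Map<String, List<String>> writes = new HashMap<>();
    writes.put("a", Collections.singletonList("f"));
    writes.put("b", Collections.singletonList("g"));
    System.out.println(fieldsDefinedIn("a", callees, writes)); // prints [f, g]
  }
}
```

A recursive version with the same visited set would also terminate on cycles, but the iterative form additionally avoids overflowing the stack on long acyclic call chains.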

Confused about getInvokedMethodOnInstance and exclusion behaviors

As I mentioned on the other tracker, I'm still a bit confused about how Boomerang's forward queries interact with exclusions. I've crafted a more complete small reproduction that I think illustrates my confusion.

Here's a summary:

If I craft a ForwardQuery pointing to new LinkedList, Boomerang finds the methods invoked on that instance correctly.
If I do the same thing pointing to my custom Foo class (defined below), Boomerang finds no invoked methods on it.
If I change my Soot setup to exclude Foo from analysis (just uncomment the commented line below), then Boomerang successfully finds the invoked methods on it.

For a simple example like this I can obviously just exclude Foo, but in the real world I often want to analyze aspects of Foo (and my Boomerang backward queries might jump into it) as well as getting methods invoked on its instances.

Can you help me understand why forward queries seem so sensitive to exclusions?

Harness code (Boomerang 3.0.2 from this repo + Soot 4.1)

import boomerang.Boomerang;
import boomerang.BoomerangOptions;
import boomerang.ForwardQuery;
import boomerang.scene.*;
import boomerang.scene.jimple.*;
import soot.*;
import soot.jimple.AssignStmt;
import soot.options.Options;

import java.util.Arrays;
import java.util.Collections;
import java.util.Map;

public class Repro {
  static BoomerangOptions opts = new IntAndStringBoomerangOptions();

  private static void setupSoot(String classPath) {
    Options.v().set_whole_program(true);
    Options.v().setPhaseOption("cg.spark", "on");
    Options.v().set_no_bodies_for_excluded(true);
    Options.v().set_allow_phantom_refs(true);
    Options.v().set_keep_line_number(true);

    /* ********* Uncomment this line to see methods invoked on Foo! ********* */
    // Options.v().set_exclude(Collections.singletonList("Foo"));

    Options.v().setPhaseOption("jb", "use-original-names:true");
    Options.v().set_soot_classpath(classPath);
    Options.v().set_prepend_classpath(true);
    Options.v().set_process_dir(Arrays.asList(classPath.split(":")));
    Scene.v().loadNecessaryClasses();
  }

  private static void analyze() {
    PackManager.v().getPack("wjtp").add(new Transform("wjtp.repro", new ReproTransformer()));
    PackManager.v().getPack("cg").apply();
    PackManager.v().getPack("wjtp").apply();
  }

  public static void main(String[] args) {
    setupSoot("Test.jar");
    analyze();
  }

  private static Map<Statement, DeclaredMethod> getMethodsInvokedFromInstanceInStatement(Method method, AssignStmt as) {
    Boomerang solver = new Boomerang(new SootCallGraph(), SootDataFlowScope.make(Scene.v()), opts);
    Statement stmt = JimpleStatement.create(as, method)[0];
    Val var = new AllocVal(new JimpleVal(as.getLeftOp(), method), stmt, new JimpleVal(as.getRightOp(), method));
    ForwardQuery fwq = new ForwardQuery(stmt, var);
    return solver.solve(fwq).getInvokedMethodOnInstance();
  }

  static class ReproTransformer extends SceneTransformer {
    @Override
    protected void internalTransform(String name, Map<String, String> options) {
      BoomerangPretransformer.v().reset();
      BoomerangPretransformer.v().apply();

      SootMethod m = Scene.v().getMethod("<Test: java.util.List foos()>");
      Method method = JimpleMethod.of(m);

      System.out.println("All method units:");
      for (Unit u : m.getActiveBody().getUnits()) {
        System.out.println("\t" + u.toString());
      }
      AssignStmt newFoo = (AssignStmt) m.getActiveBody().getUnits().stream().filter(
          x -> x.toString().contains("$stack2 = new Foo")).findFirst().get();
      AssignStmt newList = (AssignStmt) m.getActiveBody().getUnits().stream().filter(
          x -> x.toString().contains("$stack4 = new java.util.LinkedList")).findFirst().get();

      // This will only show results if set_exclude above gets uncommented
      System.out.println("\nFoo invoked methods:");
      for (Map.Entry<Statement, DeclaredMethod> e : getMethodsInvokedFromInstanceInStatement(method, newFoo).entrySet()) {
        System.out.println("\t" + e.getKey().toString());
        System.out.println("\t\t" + e.getValue().toString());
      }

      // This will show results regardless of the exclusion
      System.out.println("\nList invoked methods:");
      for (Map.Entry<Statement, DeclaredMethod> e : getMethodsInvokedFromInstanceInStatement(method, newList).entrySet()) {
        System.out.println("\t" + e.getKey().toString());
        System.out.println("\t\t" + e.getValue().toString());
      }
    }
  }
}

Code for analysis

import java.util.LinkedList;
import java.util.List;

class Foo {
	void bar() {
		System.out.println("zomg bar");
	}

	public void baz() {
		System.out.println("zomg baz");
	}
}

public class Test {
	public static List<Foo> foos() {
		Foo foo = new Foo();
		foo.baz();
		System.out.println(foo);
		List<Foo> x = new LinkedList<>();
		x.add(foo);
		foo.bar();
		return x;
	}

	public static void main(String[] args) {
		System.out.println(foos());
	}
}

How I produced `Test.jar` that the harness consumes

javac Test.java
jar cvf Test.jar *.class