codeshield-security / spds
This project forked from crossingtud/spds
Efficient and Precise Pointer-Tracking Data-Flow Framework
License: Eclipse Public License 2.0
Assume the BackwardFlowFunctions are overridden to kill every data-flow fact at any call to System.exit. Then, in the code below:
int i = 0;
System.exit(0);
queryFor(i);
if a query for i is triggered just before the queryFor(i) statement, no kill occurs and the backward query incorrectly reaches i = 0.
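The expected and the reported behaviour can be contrasted in a toy model. The sketch below is not Boomerang code: the class BackwardKillSketch, its methods, and the skipFirstPredecessor flag are hypothetical, and the flag only models one possible cause (the statement directly before the query never being handed to the flow function), which the report does not confirm.

```java
import java.util.List;

/** Toy model of a backward query over straight-line code; not Boomerang's API. */
final class BackwardKillSketch {

  /** Hypothetical flow function: a fact dies at any call to System.exit. */
  static boolean kills(String stmt) {
    return stmt.startsWith("System.exit(");
  }

  /**
   * Walks backward from the statement at queryIndex and reports whether the
   * fact for var reaches a definition of var.
   * skipFirstPredecessor models the suspected problem: the statement directly
   * before the query is stepped over without consulting the flow function.
   */
  static boolean reachesDefinition(
      List<String> stmts, int queryIndex, String var, boolean skipFirstPredecessor) {
    int start = queryIndex - 1;
    if (skipFirstPredecessor) {
      start--;
    }
    for (int i = start; i >= 0; i--) {
      String stmt = stmts.get(i);
      if (kills(stmt)) {
        return false; // fact killed, nothing further upstream is reachable
      }
      if (stmt.startsWith("int " + var + " =")) {
        return true; // reached the definition of var
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> program = List.of("int i = 0;", "System.exit(0);", "queryFor(i);");
    // Expected: the kill at System.exit stops the query.
    System.out.println("expected: " + reachesDefinition(program, 2, "i", false));
    // Reported: the kill is missed and the definition is reached.
    System.out.println("reported: " + reachesDefinition(program, 2, "i", true));
  }
}
```

With the flow function applied to every predecessor, the query stops at System.exit(0); when the first predecessor is skipped, it reaches int i = 0, which matches the behaviour described above.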
The VectorTest tests (test2, test4, test5, test6) fail on Java >= 9 with the message:
java.lang.RuntimeException: Unsound results: [MustBe [s (typestate.tests.VectorTest.<typestate.tests.VectorTest: void test2()>) @ mustBeInAcceptingState(s) in state ACCEPTING]]
The tests succeed on Java 8 but fail on Java >= 9; this needs further investigation. Many java.util.Collection implementations changed between Java 8 and Java 9. We need to inspect the differences between the java.util.Vector implementations and their effect on Boomerang to see why these tests now fail.
Following the README examples, I tried to run the IDEal example inference.example.Main, but it crashed with the following output:
java.lang.ClassCastException: boomerang.scene.jimple.JimpleVal cannot be cast to boomerang.scene.AllocVal
at boomerang.WeightedBoomerang.forwardSolve(WeightedBoomerang.java:1176)
...
After I changed the offending line to
var = query.var();
the crash no longer occurs.
I'm attempting to use SPDS as a Gradle dependency following this and this. I have
implementation 'de.fraunhofer.iem:WPDS:3.0.8'
in my build.gradle. I don't think the issue is with Gradle, because I am able to pull the base "WPDS" package, just not one of its dependencies. I get the following error when building:
Execution failed for task '[...]'.
> Could not resolve all files for configuration '[...]'.
> Could not find de.fraunhofer.iem:PathExpression:1.0.0.
Searched in the following locations:
- https://repo.maven.apache.org/maven2/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.pom
- https://repo.maven.apache.org/maven2/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.jar
- https://jcenter.bintray.com/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.pom
- https://jcenter.bintray.com/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.jar
- https://maven.pkg.github.com/CodeShield-Security/SPDS/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.pom
- https://maven.pkg.github.com/CodeShield-Security/SPDS/de/fraunhofer/iem/PathExpression/1.0.0/PathExpression-1.0.0.jar
I think I've found a case where Boomerang misses some dataflow (3.1.2).
The reproducer below is based on FlowDroid Listing 2 (PLDI '14), but uses a static setter/getter instead of direct field writes.
boomerangPDS/src/main/java/boomerang/example/BoomerangExampleTarget1.java
public class BoomerangExampleTarget1 {
  public static void main(String... args) {
    Data p = new Data();
    taintIt(customSource(), p);
  }
  private static String customSource() { return "I'm tainted"; }
  private static void customSink(String sunk) { System.out.println(sunk); }
  private static void taintIt(String in, Data out) {
    Data x = out;
    Data.setter(x, in);
    customSink(Data.getter(out));
  }
  static class Data {
    private String f;
    static void setter(Data self, String value) {
      self.f = value;
    }
    static String getter(Data self) {
      return self.f;
    }
  }
}
If I change Data.setter(x, in); to Data.setter(out, in);, then I get a path from the source to the sink.
boomerangPDS/src/main/java/boomerang/example/ExampleMain1.java: createAnalysisTransformer()
private static Transformer createAnalysisTransformer() {
  return new SceneTransformer() {
    protected void internalTransform(
        String phaseName, @SuppressWarnings("rawtypes") Map options) {
      SootCallGraph sootCallGraph = new SootCallGraph();
      AnalysisScope scope =
          new AnalysisScope(sootCallGraph) {
            @Override
            protected Collection<? extends Query> generate(Edge cfgEdge) {
              Statement statement = cfgEdge.getStart();
              if (statement.toString().contains("customSource") && statement.containsInvokeExpr()) {
                Val arg = statement.getLeftOp();
                return Collections.singleton(new ForwardQuery(cfgEdge,
                    new AllocVal(arg, statement, arg)));
              }
              return Collections.emptySet();
            }
          };
      Boomerang solver =
          new Boomerang(
              sootCallGraph, SootDataFlowScope.make(Scene.v()), new DefaultBoomerangOptions() {
                @Override
                public int analysisTimeoutMS() {
                  return 10000;
                }
              });
      Collection<Query> seeds = scope.computeSeeds();
      for (Query query : seeds) {
        System.out.println("Solving query: " + query);
        ForwardBoomerangResults<Weight.NoWeight> res = solver.solve((ForwardQuery) query);
        if (res.isTimedout()) {
          throw new RuntimeException("Timed out");
        }
        res.asStatementValWeightTable().cellSet().forEach(cell -> {
          if (cell.getRowKey().getStart().containsInvokeExpr() &&
              cell.getRowKey().getStart().getInvokeExpr().getMethod().getName().contains("customSink") &&
              cell.getRowKey().getStart().uses(cell.getColumnKey())) {
            System.out.println("SOURCE: " + query.cfgEdge().getStart().toString());
            System.out.println("SINK: " + cell.getRowKey().getStart().toString());
          }
        });
      }
    }
  };
}
The handleMapsBackward and handleMapsForward methods ignore the handleMaps option. It doesn't look like this option is used anywhere in the code.
Currently all references to Soot point to ca.mcgill.sable at version 4.1.0. To ensure SPDS always runs on the latest version of Soot, I suggest changing the dependencies to org.soot-oss at version 4.2.1.
When I try to build SPDS, the server at https://soot-build.cs.uni-paderborn.de/nexus/repository/swt-upb/ returns a 502 Bad Gateway error.
Specifically, it fails when I try to obtain the de.fraunhofer.iem:PathExpression:jar:1.0.0 jar file.
Could somebody help me solve this build problem?
Cheers!
The readme was updated to 3.1.2, but the release did not succeed via Actions.
Hello,
Is it possible to create a query that will track two variables?
For example:
void main() {
  foo();
  bar();
}
void foo() {
  queryTarget("a1", "b1");
}
void bar() {
  queryTarget("a2", "b2");
}
void queryTarget(String x, String y) {}
Given that my entry point is main, if I query x, I would get a1 and a2. Then if I query y, I would get b1 and b2. Obviously the results I expect are the sets (a1,b1) and (a2,b2), and NOT (a1,b2) or (a2,b1).
How can this be done?
Thanks,
Tom
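One client-side way to get the correlated pairs is to run the two queries separately and join their results on the call site each value arrived through. The sketch below is a toy model under that assumption: the class PairByCallSite and the call-site labels "foo" and "bar" are hypothetical, and whether Boomerang exposes the call site for each result is not stated in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model: correlate two per-variable results by the call site they came through. */
final class PairByCallSite {

  /**
   * Joins two result maps keyed by call site (hypothetical labels such as "foo").
   * Only values that share a call site are paired, so (a1,b2) can never appear.
   */
  static List<String> pair(Map<String, String> xByCallSite, Map<String, String> yByCallSite) {
    List<String> pairs = new ArrayList<>();
    for (Map.Entry<String, String> x : xByCallSite.entrySet()) {
      String y = yByCallSite.get(x.getKey());
      if (y != null) {
        pairs.add("(" + x.getValue() + "," + y + ")");
      }
    }
    return pairs;
  }

  public static void main(String[] args) {
    // Assumed outcome of querying x: one value per call site of queryTarget.
    Map<String, String> x = new LinkedHashMap<>();
    x.put("foo", "a1");
    x.put("bar", "a2");
    // Assumed outcome of querying y.
    Map<String, String> y = new LinkedHashMap<>();
    y.put("foo", "b1");
    y.put("bar", "b2");
    System.out.println(pair(x, y)); // prints [(a1,b1), (a2,b2)]
  }
}
```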
When subqueries are triggered, and a subquery reaches the entry point of a method, it does not seem to propagate to any callee.
class CustomQuery1 {
  final String field;
  private CustomQuery1() {
    field = "someInfo";
  }
  public static void main() {
    new CustomQuery1().example();
  }
  public void example() {
    String info = field.toString();
    queryFor(info); // Should find "someInfo" but does not.
  }
}
class CustomQuery2 {
  final String field;
  private CustomQuery2() {
    field = "someInfo";
  }
  public static void main() {
    new CustomQuery2().example2();
  }
  public void example2() {
    CustomQuery2 c = new CustomQuery2();
    String info = c.field.toString();
    queryFor(info); // Should find "someInfo" and does so.
  }
}
Hi,
I am using Boomerang 3.0.10 and trying to run the provided examples, but with call summaries enabled. The only change I have made to the examples is that I have replaced
// 1. Create a Boomerang solver.
Boomerang solver =
new Boomerang(
sootCallGraph, SootDataFlowScope.make(Scene.v()), new DefaultBoomerangOptions());
with the following code:
// 1. Create a Boomerang solver.
Boomerang solver =
new Boomerang(
sootCallGraph,
SootDataFlowScope.make(Scene.v()),
new DefaultBoomerangOptions() {
@Override
public boolean callSummaries() {
return true;
}
});
Running ExampleMain1 produces the following output:
<boomerang.example.BoomerangExampleTarget1: void <init>()>
<boomerang.example.BoomerangExampleTarget1: void main(java.lang.String[])>
<boomerang.example.BoomerangExampleTarget1: void staticCallOnFile(boomerang.example.BoomerangExampleTarget1$ClassWithField,boomerang.example.BoomerangExampleTarget1$NestedClassWithField)>
<boomerang.example.BoomerangExampleTarget1: void queryFor(boomerang.example.BoomerangExampleTarget1$ObjectOfInterest)>
Solving query: BackwardQuery: (queryVariable (boomerang.example.BoomerangExampleTarget1.<boomerang.example.BoomerangExampleTarget1: void staticCallOnFile(boomerang.example.BoomerangExampleTarget1$ClassWithField,boomerang.example.BoomerangExampleTarget1$NestedClassWithField)>),RS: queryFor(queryVariable))
Boomerang CRASHEDjava.lang.NullPointerException
All allocation sites of the query variable are:
{}
All aliasing access path of the query variable are:
[]
The null pointer exception occurs when Boomerang tries to retrieve a summary automaton during post*, but no summaries have been created yet at this point. I can also provide the whole stack trace if it helps, but it is quite long.
Is there some additional way that I need to configure Boomerang before I can use summaries?
Thanks,
David
Hi Johannes! Hope you are doing well.
My name is Matan. I read your excellent Boomerang paper (over and over again..) after also watching the great DECA I/II course by Prof. Bodden (and also reading the SPDS paper). I'm trying to reproduce this paper by re-implementing Boomerang for Java in Golang.
I also looked at the code for the Java implementation of Boomerang by your group, but unfortunately I find it impenetrable. This is one of the reasons I want to re-implement it (understand it by building it).
I have a few open questions that I'm still not clear about and would greatly appreciate some clarification.
For example, in Figure 1, when performing a forward query starting from context1 at the allocation on line 4, asking which access paths in foo point to it, we encounter a field-write POI at a.f = s.
This should trigger an "AllAliases" query at this point, which should only look at context1.
How is this information represented and passed to the subquery? Using the same mechanism described at "4.2 Client-Driven Context-Resolution"?
How can the idea about call/return POIs be implemented using SPDS instead of the access graph formulation (i.e. when we have a field-pds & automaton instead of an access graph)? In the access graph formulation we recursively query for all prefixes - How is this translated into the SPDS formulation?
Can dynamic fields and calls be supported as more POIs? By dynamic fields I mean x[f] where f is a variable (e.g. hashmaps or arrays); by dynamic calls I mean f(..) where f is a variable (e.g. lambdas). My thought is that we can set the domain of fields to be all allocation sites and treat each statement involving dynamic fields as a POI, which requires a PointsTo query to obtain the set of possible fields for this statement.
Hello, I wonder whether ExampleMain2, i.e. the forward query of the solver, computes all sites on the control-flow graph that are reachable from the seeds. And if I want to instantiate a data-flow analysis, should I use the class boomerang.guided.DemandDrivenGuidedAnalysis or the class ideal.IDEALAnalysis? Thank you very much!
For the following target program:
public final class V {
  static Vector v;
  public static void main(String[] args) {
    Vector x = new Vector();
    v = x;
    foo();
    v.firstElement();
  }
  public static void foo() {
  }
}
I create an alias query at v.firstElement():
BackwardQuery:
($stack3 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
$stack3.firstElement() -> return)
After query solving I get:
getAllocationSites:
{}
getAllAliases:
[]
But if I remove the empty method call, it works as expected:
public final class V {
  static Vector v;
  public static void main(String[] args) {
    Vector x = new Vector();
    v = x;
    //foo();
    v.firstElement();
  }
  public static void foo() {
  }
}
I create an alias query at v.firstElement():
The same BackwardQuery:
($stack3 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
$stack3.firstElement() -> return)
After query solving I get:
getAllocationSites:
{ForwardQuery: ($stack2 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
$stack2 = new Vector -> $stack2.<init>())=boomerang.results.AbstractBoomerangResults$Context@e307c342}
getAllAliases:
[$stack3 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
$stack2 (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
x (target.typestate.microbenchmark.vector.V.<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>),
StaticField: v<target.typestate.microbenchmark.vector.V: void main(java.lang.String[])>]
I am not sure whether my configuration is wrong. I use the following Boomerang options:
static class BoomerangOptions extends DefaultBoomerangOptions {
  @Override
  public boolean onTheFlyCallGraph() {
    return false;
  }
  @Override
  public StaticFieldStrategy getStaticFieldStrategy() {
    return StaticFieldStrategy.FLOW_SENSITIVE;
  }
  @Override
  public boolean allowMultipleQueries() {
    return true;
  }
  @Override
  public boolean throwFlows() {
    return true;
  }
  @Override
  public boolean trackAnySubclassOfThrowable() {
    return true;
  }
}
I have the following test program and want to perform a BackwardQuery for the variable bar in the line
ConstantPropagationTest.assertReachesSources(bar, "java.util.Scanner:");
The query gives me the AllocVal
return new StringBuilder(s).toString();
and from there I want to start a follow-up query for s.
import java.util.Scanner;
import ConstantPropagationTest;
public class Test1 {
  /**
   * Our constant propagation finds the definition of the return value here
   * and knows (by looking up a list of builtin summaries) that it needs to
   * perform a follow-up {@link boomerang.BackwardQuery} for s.
   * However, this follow-up query does not share the {@link boomerang.Context} of the original
   * query, so when it steps out of {@link #loosingTrackOfThings(String)}, it does not remember
   * that it came from {@link #entryPoint()} and explores all possible call sites instead.
   *
   * @param s Some string that's being passed through.
   * @return The same string that was passed as an argument.
   */
  public static String loosingTrackOfThings(String s) {
    return new StringBuilder(s).toString();
  }
  public void entryPoint() {
    Scanner scanner = new Scanner(System.in);
    String fooTest1 = scanner.next();
    final String bar = loosingTrackOfThings(fooTest1);
    ConstantPropagationTest.assertReachesSources(bar, "java.util.Scanner:");
  }
  public void unreleatedMethod1() {
    Scanner scanner = new Scanner(System.in);
    String fooTest1 = scanner.nextLine();
    final String bar = loosingTrackOfThings(fooTest1);
  }
  public void unreleatedMethod2() {
    Scanner scanner = new Scanner(System.in);
    String fooTest1 = scanner.toString();
    final String bar = loosingTrackOfThings(fooTest1);
  }
}
Currently, the second backward query finds reaching definitions in entryPoint, unreleatedMethod1, and unreleatedMethod2 because the second query doesn't get the call stack from the first.
Is there a way to share information between queries in Boomerang?
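Until the context can be shared directly, a client could post-filter the follow-up results by the callers the first query is known to have come through. The sketch below is a toy model of that workaround, not Boomerang's API: the classes ContextFilterSketch and Definition are hypothetical, and it assumes each definition found by the second query can be tagged with the method it was found in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model: keep only follow-up results that lie in callers seen by the first query. */
final class ContextFilterSketch {

  /** A definition found by the follow-up query, tagged with its enclosing method. */
  static final class Definition {
    final String method;
    final String statement;

    Definition(String method, String statement) {
      this.method = method;
      this.statement = statement;
    }
  }

  /** Returns the statements of all definitions whose method is in allowedCallers. */
  static List<String> restrictToCallers(List<Definition> found, Set<String> allowedCallers) {
    List<String> kept = new ArrayList<>();
    for (Definition d : found) {
      if (allowedCallers.contains(d.method)) {
        kept.add(d.statement);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Assumed output of the context-free follow-up query for s.
    List<Definition> found = List.of(
        new Definition("entryPoint", "fooTest1 = scanner.next()"),
        new Definition("unreleatedMethod1", "fooTest1 = scanner.nextLine()"),
        new Definition("unreleatedMethod2", "fooTest1 = scanner.toString()"));
    // The first query started in entryPoint, so only that caller is allowed.
    System.out.println(restrictToCallers(found, Set.of("entryPoint")));
    // prints [fooTest1 = scanner.next()]
  }
}
```

This only trims the answer after the fact; the follow-up query still explores every call site, so it does not recover the precision or the cost savings of a truly shared context.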
The problem occurs when the analysis is run on its own jar file. A StackOverflowError was found while investigating the issue.
This issue was found while investigating the main issue from CryptoAnalysis.
When running BoomerangPretransformer on ~/git/flink/build-target/lib/flink-table_2.11-1.12-SNAPSHOT.jar in Apache Flink, there is a stack overflow:
java.lang.StackOverflowError
at soot.options.Options.v(Options.java:42)
at soot.AbstractSootFieldRef.checkStatic(AbstractSootFieldRef.java:113)
at soot.AbstractSootFieldRef.resolve(AbstractSootFieldRef.java:131)
at soot.AbstractSootFieldRef.resolve(AbstractSootFieldRef.java:109)
at soot.jimple.internal.AbstractInstanceFieldRef.getField(AbstractInstanceFieldRef.java:96)
at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:298)
at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
at boomerang.scene.jimple.BoomerangPretransformer.getFieldsDefinedInMethod(BoomerangPretransformer.java:310)
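The trace shows getFieldsDefinedInMethod calling itself at line 310 until the stack is exhausted. One common remedy for this pattern is to replace the recursion with an explicit worklist and a visited set. The sketch below is a toy model under the assumption that the recursion follows callees; the class FieldsDefinedSketch and the string-based call graph are hypothetical and are not the BoomerangPretransformer implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model: collect fields written by a method and its transitive callees. */
final class FieldsDefinedSketch {

  /**
   * Iterative traversal with a visited set. Cycles in the call graph terminate,
   * and deep call chains use heap space for the worklist instead of the thread stack.
   */
  static Set<String> fieldsDefinedIn(
      String entry,
      Map<String, List<String>> callees,
      Map<String, List<String>> fieldWrites) {
    Set<String> fields = new LinkedHashSet<>();
    Set<String> visited = new HashSet<>();
    Deque<String> worklist = new ArrayDeque<>();
    worklist.push(entry);
    while (!worklist.isEmpty()) {
      String method = worklist.pop();
      if (!visited.add(method)) {
        continue; // already processed, do not descend again
      }
      fields.addAll(fieldWrites.getOrDefault(method, List.of()));
      for (String callee : callees.getOrDefault(method, List.of())) {
        worklist.push(callee);
      }
    }
    return fields;
  }

  public static void main(String[] args) {
    // Mutually recursive methods a and b: a naive recursion would never return.
    Map<String, List<String>> callees = Map.of("a", List.of("b"), "b", List.of("a"));
    Map<String, List<String>> fieldWrites = Map.of("a", List.of("f"), "b", List.of("g"));
    System.out.println(fieldsDefinedIn("a", callees, fieldWrites)); // prints [f, g]
  }
}
```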
As I mentioned on the other tracker, I'm still a bit confused about how Boomerang's forward queries interact with exclusions. I've crafted a more complete small reproduction that I think illustrates my confusion.
Here's a summary:
- If I create a ForwardQuery pointing to new ArrayList, Boomerang finds the methods invoked on that instance correctly.
- If I create one pointing to a new instance of my Foo class (defined below), Boomerang finds no invoked methods on it.
- If I exclude Foo from analysis (just uncomment the commented line below), then Boomerang successfully finds the invoked methods on it.
For a simple example like this I can obviously just exclude Foo, but in the real world I often want to analyze aspects of Foo (and my Boomerang backward queries might jump into it) as well as getting methods invoked on its instances.
Can you help me understand why forward queries seem so sensitive to exclusions?
import boomerang.Boomerang;
import boomerang.BoomerangOptions;
import boomerang.ForwardQuery;
import boomerang.scene.*;
import boomerang.scene.jimple.*;
import soot.*;
import soot.jimple.AssignStmt;
import soot.options.Options;
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
public class Repro {
  static BoomerangOptions opts = new IntAndStringBoomerangOptions();
  private static void setupSoot(String classPath) {
    Options.v().set_whole_program(true);
    Options.v().setPhaseOption("cg.spark", "on");
    Options.v().set_no_bodies_for_excluded(true);
    Options.v().set_allow_phantom_refs(true);
    Options.v().set_keep_line_number(true);
    /* ********* Uncomment this line to see methods invoked on Foo! ********* */
    // Options.v().set_exclude(Collections.singletonList("Foo"));
    Options.v().setPhaseOption("jb", "use-original-names:true");
    Options.v().set_soot_classpath(classPath);
    Options.v().set_prepend_classpath(true);
    Options.v().set_process_dir(Arrays.asList(classPath.split(":")));
    Scene.v().loadNecessaryClasses();
  }
  private static void analyze() {
    PackManager.v().getPack("wjtp").add(new Transform("wjtp.repro", new ReproTransformer()));
    PackManager.v().getPack("cg").apply();
    PackManager.v().getPack("wjtp").apply();
  }
  public static void main(String[] args) {
    setupSoot("Test.jar");
    analyze();
  }
  private static Map<Statement, DeclaredMethod> getMethodsInvokedFromInstanceInStatement(Method method, AssignStmt as) {
    Boomerang solver = new Boomerang(new SootCallGraph(), SootDataFlowScope.make(Scene.v()), opts);
    Statement stmt = JimpleStatement.create(as, method)[0];
    Val var = new AllocVal(new JimpleVal(as.getLeftOp(), method), stmt, new JimpleVal(as.getRightOp(), method));
    ForwardQuery fwq = new ForwardQuery(stmt, var);
    return solver.solve(fwq).getInvokedMethodOnInstance();
  }
  static class ReproTransformer extends SceneTransformer {
    @Override
    protected void internalTransform(String name, Map<String, String> options) {
      BoomerangPretransformer.v().reset();
      BoomerangPretransformer.v().apply();
      SootMethod m = Scene.v().getMethod("<Test: java.util.List foos()>");
      Method method = JimpleMethod.of(m);
      System.out.println("All method units:");
      for (Unit u : m.getActiveBody().getUnits()) {
        System.out.println("\t" + u.toString());
      }
      AssignStmt newFoo = (AssignStmt) m.getActiveBody().getUnits().stream().filter(
          x -> x.toString().contains("$stack2 = new Foo")).findFirst().get();
      AssignStmt newList = (AssignStmt) m.getActiveBody().getUnits().stream().filter(
          x -> x.toString().contains("$stack4 = new java.util.LinkedList")).findFirst().get();
      // This will only show results if set_exclude above gets uncommented
      System.out.println("\nFoo invoked methods:");
      for (Map.Entry<Statement, DeclaredMethod> e : getMethodsInvokedFromInstanceInStatement(method, newFoo).entrySet()) {
        System.out.println("\t" + e.getKey().toString());
        System.out.println("\t\t" + e.getValue().toString());
      }
      // This will show results regardless of the exclusion
      System.out.println("\nList invoked methods:");
      for (Map.Entry<Statement, DeclaredMethod> e : getMethodsInvokedFromInstanceInStatement(method, newList).entrySet()) {
        System.out.println("\t" + e.getKey().toString());
        System.out.println("\t\t" + e.getValue().toString());
      }
    }
  }
}
import java.util.LinkedList;
import java.util.List;
class Foo {
  void bar() {
    System.out.println("zomg bar");
  }
  public void baz() {
    System.out.println("zomg baz");
  }
}
public class Test {
  public static List<Foo> foos() {
    Foo foo = new Foo();
    foo.baz();
    System.out.println(foo);
    List<Foo> x = new LinkedList<>();
    x.add(foo);
    foo.bar();
    return x;
  }
  public static void main(String[] args) {
    System.out.println(foos());
  }
}
To build the Test.jar that the harness consumes:
javac Test.java
jar cvf Test.jar *.class