fp-benchmarks-aachen's Issues
`ld-unnormal_default.x86_64` is mis-compiled by Clang when optimizations are enabled
When ld-unnormal_default.x86_64
is built with optimizations enabled the benchmark is aggressively constant folded so many of the checks are not performed at run time. A bug in LLVM means this constant folding is done incorrectly which results in the benchmark always failing an assert when compiled with Clang and optimizations enabled.
The solution for now is to always build this benchmark without optimizations. The bug in LLVM will not be simple to fix so this will do for now.
I'll be submitting a fix soon.
Ninja & GMP
The GMP test will not build with ninja. Since it does seem to build with make, further analysis is currently deferred.
syn/atof_default.x86_64 is incorrect despite the spec saying it should be correct
Our tool just found a bug in this benchmark
Error: ASSERTION FAIL: out < 1
File: /home/user/fp-bench/benchmarks/c/aachen/syn/atof/test.c
Line: 14
assembly.ll line: 87
The generated test for this makes the "input" be
object 0: name: b'input'
object 0: size: 8
object 0:
object 0[0]: 0x00
object 0[1]: 0x00
object 0[2]: 0x32
object 0[3]: 0x65
object 0[4]: 0x31
object 0[5]: 0x00
object 0[6]: 0x00
object 0[7]: 0x00
effectively this means we pass the following C string to atof()
during execution
"0.2e1"
This is 2.0
which means assert(out < 1)
fails. This is a genuine bug which I have reproduced natively.
We never found this bug before because we previously had never let our tool execute for so long (~ 1hour).
sqr_ family of benchmarks incorrect
Aachen's fork of KLEE detected a problem with the aachen/syn/sqr_float-flow.x86_64
benchmark. This problem suggests that there is very likely to be a problem with all the variants of this benchmark. The assertion that fails is
assert(b == INFINITY || b == 0 || (a * 0.99 <= b && a >= b * 0.99) && "this should pass...");
Sidenote: Because the variables here are float
the constants should be 0.99f
. However the bug is still present even if this minor problem is removed.
Our (Imperial's) version of KLEE did not detect an error here due to the way we set up our solver timeout.
The problematic value generated for f
is -0x1.8100f8p-75
(C99 hex float format). This is approximately -3.9808458254123446e-23
.
Native replay of the generated KTest file confirms that the bug is reproducible.
The a *0.99 <= b
condition is false for the value of f
in the generated test case.
This assertion appears in the noflow (it's expected to fail here though) and long double and double variants too so I would expect those variants to fail too.
gmp_klee_inv_arg.x86_64 crashes KLEE by calling `raise()`
The gmp_klee_inv_arg.x86_64
benchmark crashes our fork of KLEE because libgmp's error handler tries to call raise(SIGFPE)
. Our version of KLEE just calls this as an external which causes KLEE is effectively call kill()
on itself.
The spec file currently contains
klee_inv_arg:
defines:
KLEE: null
INVALID_ARGUMENTS: null
dependencies:
klee_runtime: {}
description: >
Tests if gmp correctly rejects invalid arguments. >
This should never reach the assert in the main function.
verification_tasks:
no_assert_fail:
correct: true
no_reach_error_function:
correct: false
counter_examples:
-
description: 'Expected Error'
locations:
-
description: 'first method of abortion'
file: 'gmp-6.1.1/invalid.c'
line: 84
The expected failure is on line 84 which is a call to abort()
. Notice before that call is raise(SIGFPE)
.
void
__gmp_invalid_operation (void)
{
raise (SIGFPE);
abort ();
}
@danielschemmel I'm not sure how you to chose to handle raise()
but the fact you put the abort()
as the failure sounds like you are treated it as a no-op which isn't correct. KLEE doesn't currently have proper support for modelling signals so handling raise()
properly is a bit out of scope.
How do you want to handle this? The simplest thing is to remove the raise(SIGFPE)
klee_assume(f == f) is not a good way to ignore paths where symbolics are NaN
I've noticed that the syn/halve/test.c
benchmark does this
float f = 131, g;
#if defined(USE_KLEE)
klee_make_symbolic(&f, sizeof(f), "f");
#endif
klee_assume(f == f);
I presume that the use of klee_assume(f == f)
is avoid considering paths where f
is a nan. Although this will probably work I don't think this is a good idea for two reasons
- It is not very clear what is being done here.
klee_assume(isnan(f))
is much clearer. - Very aggressive compilers might optimize this to
klee_assume(true)
. I've not seen this happen but apparently is can under-ffast-math
. Although this is not a scenario we are checking I see no reason to avoid using theisnan()
here.
Mistakes in `aachen/syn/sqr/spec.yml`
The spec for this file and the corresponding source file have some problems.
For the aachen/syn/sqr_double-flow.x86_64.bc
benchmark we observe
KLEE: ERROR: /home/user/fp-bench/benchmarks/c/aachen/syn/sqr/test.c:45: ASSERTION FAIL: a * 0.99 <= b && a >= b * 0.99 && "this should have an error path..."
According to the spec we should only see an error on line 47.
Taking a closer look at ``aachen/syn/sqr/spec.yml` it suggests to me that.
- For
double-flow
theENABLE_FLOW
define should be calledENABLE_INFINITY
in the spec - For
longdouble-flow
theENABLE_FLOW
define should be calledENABLE_INFINITY
in the spec
in order for the provided counter examples to make sense. Of course you could rename the macro in the source file instead. I don't mind which.
`ld-unnormal` requires more investigation
This benchmark fails depending on which compiler I use when I try to run it natively.
# Clang 3.4
~/dev/klee/llvm/build/Release+Debug+Asserts/bin/clang -O0 -lm syn/ld-unnormal/test.c && ./a.out
No failures
# Clang 3.9
clang -O0 -lm syn/ld-unnormal/test.c && ./a.out
No failures
# gcc (GCC) 6.2.1 20160830
gcc -O0 -lm syn/ld-unnormal/test.c && ./a.out
a.out: syn/ld-unnormal/test.c:12: main: Assertion `!isnan(l)' failed.
Aborted (core dumped)
assert fails.
The specification for `real/sorting_doubles` is incorrect.
The spec for this benchmark claims it is correct w.r.t to all verification tasks. This is not correct.
Our tool reports that an assertion can fail.
real/sorting/main.c:10: ASSERTION FAIL: arr[i] >= arr[i - 1]
I've checked the generated test case. Which gives these concrete values
ktest file : 'test000018.ktest'
args : ['/tmp/sorting_doubles.x86_64.bc']
num objects: 2
object 0: name: b'a'
object 0: size: 8
object 0: data: -inf
object 1: name: b'b'
object 1: size: 8
object 1: data: inf
I've taken the ktest file and replayed it on the native binary and confirmed. The problem here I think is that when a
is -inf
and b
is inf
then you will have an element a + b
in the array being sorted which evaluates to NaN
fp-benchmarks-aachen/real/sorting/main.c
Line 20 in 4f8338a
after doing a sort the NaN will still be in the array and will not be comparable to any other element.
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. ๐๐๐
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google โค๏ธ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.