Giter Club home page Giter Club logo

fp-benchmarks-aachen's Introduction

Floating point benchmarks

This is a collection of synthetic and "real world" benchmarks written in C that use floating point numbers.

The benchmarks use the spec.yml format used by fp-bench.

Several category name conventions are used

  • ase_2017 - Benchmarks intended for study
  • aachen - Benchmarks from RWTH Aachen university
  • synthetic - Benchmarks that are synthetic
  • real_world - Benchmarks that at least partially represent real world code.

Build system

Drop this directory into the benchmarks folder in fp-bench with the name aachen.

fp-benchmarks-aachen's People

Contributors

danielschemmel avatar

Stargazers

 avatar

Watchers

 avatar  avatar  avatar  avatar

fp-benchmarks-aachen's Issues

gmp_klee_inv_arg.x86_64 crashes KLEE by calling `raise()`

@danielschemmel @afd @ccadar

The gmp_klee_inv_arg.x86_64 benchmark crashes our fork of KLEE because libgmp's error handler tries to call raise(SIGFPE). Our version of KLEE just calls this as an external which causes KLEE is effectively call kill() on itself.

The spec file currently contains

  klee_inv_arg:
    defines:
      KLEE: null
      INVALID_ARGUMENTS: null
    dependencies:
      klee_runtime: {}
    description: >
      Tests if gmp correctly rejects invalid arguments. >
      This should never reach the assert in the main function.
    verification_tasks:
      no_assert_fail:
        correct: true
      no_reach_error_function:
        correct: false
        counter_examples:
          -
            description: 'Expected Error'
            locations:
              -
                description: 'first method of abortion'
                file: 'gmp-6.1.1/invalid.c'
                line: 84

The expected failure is on line 84 which is a call to abort(). Notice before that call is raise(SIGFPE).

void
__gmp_invalid_operation (void)
{
  raise (SIGFPE);
  abort ();
}

@danielschemmel I'm not sure how you to chose to handle raise() but the fact you put the abort() as the failure sounds like you are treated it as a no-op which isn't correct. KLEE doesn't currently have proper support for modelling signals so handling raise() properly is a bit out of scope.

How do you want to handle this? The simplest thing is to remove the raise(SIGFPE)

Ninja & GMP

The GMP test will not build with ninja. Since it does seem to build with make, further analysis is currently deferred.

syn/atof_default.x86_64 is incorrect despite the spec saying it should be correct

Our tool just found a bug in this benchmark

Error: ASSERTION FAIL: out < 1
File: /home/user/fp-bench/benchmarks/c/aachen/syn/atof/test.c
Line: 14
assembly.ll line: 87

The generated test for this makes the "input" be

object    0: name: b'input'
object    0: size: 8
object    0:
object   0[0]: 0x00
object   0[1]: 0x00
object   0[2]: 0x32
object   0[3]: 0x65
object   0[4]: 0x31
object   0[5]: 0x00
object   0[6]: 0x00
object   0[7]: 0x00

effectively this means we pass the following C string to atof() during execution

"0.2e1"

This is 2.0 which means assert(out < 1) fails. This is a genuine bug which I have reproduced natively.

We never found this bug before because we previously had never let our tool execute for so long (~ 1hour).

The specification for `real/sorting_doubles` is incorrect.

The spec for this benchmark claims it is correct w.r.t to all verification tasks. This is not correct.

Our tool reports that an assertion can fail.

real/sorting/main.c:10: ASSERTION FAIL: arr[i] >= arr[i - 1]

I've checked the generated test case. Which gives these concrete values

ktest file : 'test000018.ktest'
args       : ['/tmp/sorting_doubles.x86_64.bc']
num objects: 2
object    0: name: b'a'
object    0: size: 8
object    0: data: -inf
object    1: name: b'b'
object    1: size: 8
object    1: data: inf

I've taken the ktest file and replayed it on the native binary and confirmed. The problem here I think is that when a is -inf and b is inf then you will have an element a + b in the array being sorted which evaluates to NaN

double arr[] = {10., a, b, .3, .2, 1., 3., a + b, 1., 8.};

after doing a sort the NaN will still be in the array and will not be comparable to any other element.

`ld-unnormal` requires more investigation

This benchmark fails depending on which compiler I use when I try to run it natively.

# Clang 3.4
~/dev/klee/llvm/build/Release+Debug+Asserts/bin/clang -O0 -lm syn/ld-unnormal/test.c  && ./a.out

No failures

# Clang 3.9
clang -O0 -lm syn/ld-unnormal/test.c  && ./a.out

No failures

# gcc (GCC) 6.2.1 20160830
gcc -O0 -lm syn/ld-unnormal/test.c  && ./a.out
a.out: syn/ld-unnormal/test.c:12: main: Assertion `!isnan(l)' failed.
Aborted (core dumped)

assert fails.

klee_assume(f == f) is not a good way to ignore paths where symbolics are NaN

I've noticed that the syn/halve/test.c benchmark does this

float f = 131, g;
#if defined(USE_KLEE)
klee_make_symbolic(&f, sizeof(f), "f");
#endif
klee_assume(f == f);

I presume that the use of klee_assume(f == f) is avoid considering paths where f is a nan. Although this will probably work I don't think this is a good idea for two reasons

  • It is not very clear what is being done here. klee_assume(isnan(f)) is much clearer.
  • Very aggressive compilers might optimize this to klee_assume(true). I've not seen this happen but apparently is can under -ffast-math. Although this is not a scenario we are checking I see no reason to avoid using the isnan() here.

Mistakes in `aachen/syn/sqr/spec.yml`

The spec for this file and the corresponding source file have some problems.

For the aachen/syn/sqr_double-flow.x86_64.bc benchmark we observe

KLEE: ERROR: /home/user/fp-bench/benchmarks/c/aachen/syn/sqr/test.c:45: ASSERTION FAIL: a * 0.99 <= b && a >= b * 0.99 && "this should have an error path..."

According to the spec we should only see an error on line 47.

Taking a closer look at ``aachen/syn/sqr/spec.yml` it suggests to me that.

  • For double-flow the ENABLE_FLOW define should be called ENABLE_INFINITY in the spec
  • For longdouble-flow the ENABLE_FLOW define should be called ENABLE_INFINITY in the spec

in order for the provided counter examples to make sense. Of course you could rename the macro in the source file instead. I don't mind which.

`ld-unnormal_default.x86_64` is mis-compiled by Clang when optimizations are enabled

When ld-unnormal_default.x86_64 is built with optimizations enabled the benchmark is aggressively constant folded so many of the checks are not performed at run time. A bug in LLVM means this constant folding is done incorrectly which results in the benchmark always failing an assert when compiled with Clang and optimizations enabled.

The solution for now is to always build this benchmark without optimizations. The bug in LLVM will not be simple to fix so this will do for now.

I'll be submitting a fix soon.

sqr_ family of benchmarks incorrect

Aachen's fork of KLEE detected a problem with the aachen/syn/sqr_float-flow.x86_64 benchmark. This problem suggests that there is very likely to be a problem with all the variants of this benchmark. The assertion that fails is

assert(b == INFINITY || b == 0 || (a * 0.99 <= b && a >= b * 0.99) && "this should pass...");

Sidenote: Because the variables here are float the constants should be 0.99f. However the bug is still present even if this minor problem is removed.

Our (Imperial's) version of KLEE did not detect an error here due to the way we set up our solver timeout.

The problematic value generated for f is -0x1.8100f8p-75 (C99 hex float format). This is approximately -3.9808458254123446e-23.

Native replay of the generated KTest file confirms that the bug is reproducible.

The a *0.99 <= b condition is false for the value of f in the generated test case.

This assertion appears in the noflow (it's expected to fail here though) and long double and double variants too so I would expect those variants to fail too.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.