Comments (13)

wintersteiger commented on August 28, 2024

There are various max_* options in Z3, but I don't think any of them would be a solution to your problem. Could you elaborate on why timeouts are not good enough for you?

As long as the default Z3 tactics are used, there will always be an element of non-determinism, because some of them run pre-processors under timers: depending on timing, a run may end up with an entirely different simplification of the problem, so any counter on conflicts or similar events would still be non-deterministic (see, e.g., all the try_for's in the default QF_LIA tactic: https://github.com/Z3Prover/z3/blob/master/src/tactic/smtlogics/qflia_tactic.cpp). In some cases the SMT kernel is not used at all, because some other part of the tactic framework solves the problem without ever counting a single conflict, while still taking non-deterministic, possibly long runtime. I don't think there is a simple solution to this problem that's better than using time/memory limits.
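To illustrate the underlying issue with wall-clock limits (a minimal sketch using the Python bindings; the query is arbitrary and the 'timeout' solver parameter is in milliseconds):

    from z3 import Solver, Int

    x, y = Int('x'), Int('y')
    s = Solver()
    s.set('timeout', 100)  # soft wall-clock timeout in milliseconds
    s.add(x * x + y * y == 25, x > 0, y > 0)
    # On an idle machine this may answer sat; on a loaded one, unknown.
    # The limit depends on the machine, not only on the input.
    print(s.check())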

kanigsson commented on August 28, 2024

Thanks for your answer, and sorry for the late reply.

Let me first try to motivate why a time limit/memory limit is not good enough for us.

We have a testsuite with over 10000 VCs, and we run that testsuite on 4 different platforms (Windows, Linux, Darwin; 32- and 64-bit). We have an oracle which says which VCs should be proved and which not. Any fluctuation in the results requires quite costly manual inspection ...

Sometimes, we change our build/test machine for something more powerful, and of course the developers run part of the testsuite on their own (less powerful) machines. Sometimes the test machine is under load because of some other job. So we need reproducible results from all involved tools despite these quite diverse environments, and timeouts are not a realistic way to achieve this.

If a steps limit is used then yes, an internal timeout like the ones you describe is a no-no. But this seems fixable: one can replace each timeout by (again) a counter limit. For example, the preprocessor is allowed to do n simplification steps.

Yes, when a steps limit is used, all parts of the solver that consume non-negligible time need to contribute to increasing the internal counter. This includes the preprocessors you mentioned, and the other parts of the tactic framework as well.

The Alt-Ergo and CVC4 tools both had an option for this, but initially it didn't really work, precisely because some parts were not contributing to the counter. Finding all those parts was exactly the work we went through with the Alt-Ergo and CVC4 teams. It works well now.

To get determinism, some other precautions must be taken as well. For example, Z3 allows setting various random seeds; these would also need to be fixed to obtain reproducibility, as sketched below.
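For example (a sketch using the Python bindings; the seed parameter names are taken from current Z3 and may differ across versions):

    from z3 import set_param

    # Fix the pseudo-random choices so repeated runs make the same ones.
    set_param('smt.random_seed', 0)
    set_param('sat.random_seed', 0)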

Finally, sometimes people think that steps should correspond linearly to time (e.g., 1000 steps correspond to 1 second or so), but this is probably impossible to achieve and not something we need. All we need is an internal counter that proceeds in a deterministic manner (unlike a timer), with no part of the solver allowed to hog significant CPU time without the counter being increased.
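To make that concrete, here is a minimal, hypothetical sketch (plain Python with invented names, not Z3 code) of the discipline we have in mind:

    class ResourceLimit(Exception):
        pass

    class Budget:
        # Deterministic step counter shared by all solver components.
        def __init__(self, limit):
            self.limit = limit
            self.used = 0

        def tick(self, cost=1):
            self.used += cost
            if self.used > self.limit:
                raise ResourceLimit('step budget exceeded')

    def preprocess(terms, budget):
        # Instead of "simplify for at most 200 ms", each simplification
        # step charges the shared budget, so the abort happens at the
        # same step on a fast build server and on a loaded laptop.
        out = []
        for t in terms:
            budget.tick()
            out.append(2 * t)  # stand-in for a real rewrite step
        return out

    try:
        preprocess(list(range(10)), Budget(limit=3))
    except ResourceLimit as e:
        print(e)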

Hope this explains why we need this and how it could be achieved ...

wintersteiger commented on August 28, 2024

Thanks for the detailed explanation. The random seed in Z3 is 0 by default, so that part shouldn't be an issue. Introducing counters into all relevant parts of Z3 is a non-trivial task, as it would touch pretty much every file in the codebase. And it still wouldn't get you there because of other issues, e.g., some hash functions depend on the ordering of objects in memory, which can be different in every run. I still fail to understand the reason for having full determinism: if your VCs didn't change, you can just keep yesterday's result; Z3 won't change its mind about the result (even if it does change its mind about how to get there). If there is a new or modified VC, then we don't know the resource limit and we'll have to guess one. If that limit is guessed by the engineer, they might just as well write "UNSAT" into a timestamped file at the same time.

kanigsson commented on August 28, 2024

Hello,

Yes, for the hash functions that depend on memory locations I don't see a solution. It seems that CVC4 and Alt-Ergo don't do such things.

Concerning the "caching" of VCs you describe: we have such a mechanism, but it can't always be used in nightly testing. There are two different use cases: us as a tool vendor, and the customer as a user of the tool.

As a tool vendor, we want to make sure that the tool works well without any kind of cache, and we also want to catch potential problems e.g. in the SMT solvers. So not running the SMT solver in the nightly run (because of using the cache instead) would partly defeat the purpose of nightly testing.

Most users of SPARK are in a context where some kind of certification is required. A customer explained this problem to us; unfortunately I forgot half of the explanation. What I recall is that a VC cache is problematic because it is both an input and an output of the verification process, which certification standards don't like. So they definitely can't use the cache for the "final", validating run of the tool, and they want to verify regularly that they can do this run at any time; in the end they will not use the cache in nightly testing, only as a convenience in day-to-day development.

One last thing: the main purpose of steps is not to reproduce proofs reliably, but to reproduce any tool output reliably, whether that is UNSAT or UNKNOWN or something else. When I run "cvc4 --rlimit=1000 some_vc", I will always get the same result, whether or not cvc4 manages to prove the VC.
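Concretely, the property we rely on looks like this (a sketch; the VC file name is a placeholder):

    import subprocess

    # Two runs of the same resource-limited query must produce
    # byte-identical output, whatever the answer is.
    cmd = ['cvc4', '--rlimit=1000', 'some_vc.smt2']  # placeholder file
    out1 = subprocess.run(cmd, capture_output=True, text=True).stdout
    out2 = subprocess.run(cmd, capture_output=True, text=True).stdout
    assert out1 == out2, 'result is not reproducible'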

I'm not sure I've convinced you of the usefulness of a steps limit, but I can assure you that if Z3 gets such an option, we will use it :-)

yannickmoy commented on August 28, 2024

To complete Johannes's explanation: determinism is critical in our nightly regression testing to maintain a single oracle (which VCs are proved and which are not) across all platforms, since maintaining a different set of oracles per platform would be too costly. It is also critical in the certification done by our customers, where a certification authority should be able to examine and sometimes reproduce the verification activities. To the point that neither we nor our customers submitting to certification can use a non-deterministic prover. (That does not prevent using heuristics and non-determinism internally, but the final result should be deterministic.)

wintersteiger commented on August 28, 2024

Thanks again for the detailed explanation! Your point about certification authorities is very convincing; they definitely won't like anything non-deterministic. We're thinking about potential approaches to this problem, but so far we haven't found anything that solves all of those issues. To get a better idea of what would need to be done, can you tell me what kinds of theories you use, and whether a deterministic tactic that differs from the default tactics (e.g., a slower one, along the lines sketched below) would be an acceptable solution for you?
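For instance, something along these lines (a sketch using the Python bindings; the particular composition is only an illustration, not a concrete proposal):

    from z3 import Int, Tactic, Then

    # A fixed, user-chosen strategy instead of the default tactic:
    # 'simplify' followed by the core 'smt' tactic, with no try_for
    # timers in between.
    t = Then(Tactic('simplify'), Tactic('smt'))
    s = t.solver()
    x = Int('x')
    s.add(x > 2, x < 10, x % 3 == 0)
    print(s.check())  # sat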

yannickmoy commented on August 28, 2024

Currently, we're using the theories of non-linear arithmetic over integers and reals, and the theory of bitvectors; everything else is expressed through axiomatization with uninterpreted functions. In the future, we'll also use the theory of arrays and IEEE 754 floating-point. We may want to use the theory of records at some point.

A slower deterministic tactic would not be a problem for us. It would take a looong time for something the prover knows how to handle to become slower than manual inspection or manual proof.

NikolajBjorner commented on August 28, 2024

I added a new resource limit to the unstable branch. You can call it as follows:

    z3 10u05.04.smt2 memory_max_alloc_count=10000
    (error "line 43 column 10: number of configured allocations exceeded")

Z3 prints the allocation count if you ask it to return statistics, e.g.,

    z3 10u05.04.smt2 -st
    sat
    (:cnf-encoding-aux-vars 103
     :eliminated-vars       3
     :max-memory            2.22
     :memory                1.18
     :num-allocs            9487736.00
     :time                  0.71
     :total-time            0.71)
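One possible way to calibrate the limit (a sketch; the placeholder file name, the parsing, and the 10% headroom are our own choices, not anything Z3 prescribes) is to run once with -st and reuse the reported :num-allocs:

    import re
    import subprocess

    out = subprocess.run(['z3', '-st', 'your_vc.smt2'],  # placeholder file
                         capture_output=True, text=True).stdout
    m = re.search(r':num-allocs\s+([0-9.]+)', out)
    limit = int(float(m.group(1)) * 1.1)  # 10% headroom
    print('memory_max_alloc_count=%d' % limit)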

kanigsson commented on August 28, 2024

Thanks a lot! I've run this on our testsuite and overall it seems to work well.

I have a few issues to report, but for this I need to attach relatively big VC files. Can you advise how I should do it, as one cannot attach files to github issues? One issue is related to the new option you added, but the other issues may be unrelated (mostly crashes).

NikolajBjorner commented on August 28, 2024

You can upload files using "gist". There is a menu option/link on the top of the github.com pages for creating ad-hoc repositories that can be used for single files.

kanigsson commented on August 28, 2024

Thanks! Actually I didn't find any major issues with the option you implemented. This one is a minor one.

So please look at this file slow.smt2.

Using the unstable branch, compiled on my Linux x86_64 machine, when I do:

    z3 memory_max_alloc_count=8727000 slow.smt2

it takes about 8 seconds on my machine to exceed the limit. But increasing the limit slightly

    z3 memory_max_alloc_count=8728000 slow.smt2

it seems to loop; I stopped z3 after several minutes. This probably means that some internal code does not contribute to the allocation counter.

kanigsson commented on August 28, 2024

Here is another issue, probably related to the resource limit option, because it doesn't appear without it. On this file crash.smt2, if you run z3 with a high enough limit:

    z3 memory_max_alloc_count=8728000 crash.smt2

it crashes with a segmentation fault. This is again on x86_64 Linux. It doesn't crash with a lower limit (instead one gets the resource limit error), nor without the resource limit option.

Let me know if you need more information or you want me to create a new issue.

NikolajBjorner commented on August 28, 2024

The bugs exposed by slow.smt2 and crash.smt2 are now fixed in unstable.
