Comments (12)

dbeyer commented on June 5, 2024

The problem with the pull request that I accidentally merged (sorry again!)
was that it mixes too many different things.
Please create separate pull requests, one for each concern.

For example, you made changes to the continuous integration.
That should be explained and documented by one pull request.

Then you contributed verification tasks. Those should be contributed separately,
with a description file README.txt and licence file LICENSE.txt in each directory,
such that people know that they are free to use those.
Ideally, one pull request for one category.

Then, you added some code for the preprocessing demo category.
This includes a preproc configuration file that defines how to preprocess.

Finally, there is something that you call "SV-COMP benchmark".
I do not understand what you are referring to by this name, but at least,
obviously, this name is reserved for the whole repository (sv-benchmarks).
We use the notion benchmark (subject) to refer to a set of verification tasks
to benchmark (verb) certain characteristics of verification tools.
Perhaps call it mkPreproc or something like this, if it makes
a verification task from a set of source and include files.
This would fit to the category for which it is meant.

Best regards,
Dirk

delcypher commented on June 5, 2024

@dbeyer

The problem with the pull request that I accidentally merged (sorry again!)
was that it mixes too many different things.
Please create separate pull requests, one for each concern.

This doesn't really make sense because most of these things are intertwined with each other. I can't have separate PRs in parallel because they would be interdependent.

I grant you that the TravisCI changes could be broken down further. However:

  • I wanted testing for the new SVCB infrastructure to be enabled from the beginning
  • The TravisCI changes are done in a separate commit.

For example, you made changes to the continuous integration.
That should be explained and documented by one pull request.

As noted above, the TravisCI changes were in their own commit and documented in the commit message. They appeared in the same PR because those changes are needed for testing the SVCB infrastructure.

Then you contributed verification tasks. Those should be contributed separately,
with a description file README.txt and licence file LICENSE.txt in each directory,
such that people know that they are free to use those.
Ideally, one pull request for one category.

I'm happy to add README.txt and LICENSE.txt files, but these tasks are not meant to be permanent. They are "examples" which illustrate how to use the SVCB infrastructure. My expectation is that we would remove them once we have some real benchmarks contributed.

The SVCB infrastructure also cannot be tested without these examples, which is why they were all in the same PR.

Then, you added some code for the preprocessing demo category.
This includes a preproc configuration file that defines how to preprocess.

I don't know what code you are talking about here. Are you talking about the spec.yml files?
If so, you've misunderstood their purpose. Their purpose is to define properties about the benchmarks, one of them being which source files constitute a benchmark. Other properties include the properties being checked, what the expected results are, the architecture, etc.

Read the schema to see the full description:

https://github.com/sosy-lab/sv-benchmarks/blob/799bbb9d162c8206aa72f248dc652e887dfc6ce4/svcb/svcb/schema.yml
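
To make this concrete, here is a rough sketch of what a spec.yml might contain and how it could be loaded. The field names below are illustrative guesses only, not the actual format; the linked schema.yml is the authoritative definition. The snippet assumes PyYAML for parsing.

```python
# Illustrative only: the field names are guesses at the kind of properties a
# spec.yml describes (sources, property checked, expected result,
# architecture); the linked schema.yml is the authoritative definition.
import yaml  # assumes PyYAML is installed

EXAMPLE_SPEC = """
architecture: x86_64
language: c
sources:
  - main.c
  - lib/helper.c
properties:
  - name: no_reach_error
    expected_correct: false
"""

spec = yaml.safe_load(EXAMPLE_SPEC)

# Minimal sanity checks standing in for real schema validation.
assert isinstance(spec["sources"], list) and spec["sources"]
for prop in spec["properties"]:
    assert "name" in prop and "expected_correct" in prop

print("benchmark uses", len(spec["sources"]), "source file(s) for",
      spec["architecture"])
```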

Finally, there is something that you call "SV-COMP benchmark".
I do not understand what you are referring to by this name, but at least,
obviously, this name is reserved for the whole repository (sv-benchmarks).
We use the notion benchmark (subject) to refer to a set of verification tasks
to benchmark (verb) certain characteristics of verification tools.
Perhaps call it mkPreproc or something like this, if it makes
a verification task from a set of source and include files.
This would fit to the category for which it is meant.

mkPreproc is not an appropriate name, and I think it shows a lack of understanding of what was actually in #139.

The main thing that was actually in #139 is the "SVCB" infrastructure, which is:

  • A benchmark specification format defined by a schema (the fact that these use non-preprocessed source files is an implementation detail and should not be the main focus here)
  • An svcb Python module which provides parsing and validation of benchmark specification files (spec.yml). It also provides code for generating build system target declarations, currently only for CMake (see the sketch after this list).
  • CMake code to support building declared benchmarks.
  • Benchmark statistic gathering tools. Currently there is only one (category-count.py)
  • A svcomp.h and stub implementation for building benchmarks. This is referred to as the "SV-COMP runtime". This comes with Doxygen generated documentation.
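
To sketch how these pieces are meant to fit together, here is a rough outline of the parse/validate/generate flow. Every module, function, and field name below is a placeholder for illustration, not the actual svcb API; the real code is in #139.

```python
# Placeholder sketch of the spec.yml -> validation -> build-declaration
# workflow; none of these names are the real svcb API.
import sys
import yaml  # assumes PyYAML


def load_spec(path):
    """Parse a benchmark specification (spec.yml) file."""
    with open(path) as f:
        return yaml.safe_load(f)


def validate_spec(spec):
    """Stand-in for validation against the schema (schema.yml)."""
    for key in ("architecture", "sources", "properties"):
        if key not in spec:
            raise ValueError("spec is missing required field: " + key)


def emit_cmake_target(name, spec):
    """Emit a build-system target declaration for one benchmark."""
    return "add_executable({} {})".format(name, " ".join(spec["sources"]))


if __name__ == "__main__":
    spec = load_spec(sys.argv[1])
    validate_spec(spec)
    print(emit_cmake_target("example_benchmark", spec))
```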

So, to repeat: the main focus is the "SVCB" infrastructure, not the "preprocessing".

I am happy to change the svcb name, but not to mkPreproc. `svcb` seemed like an appropriate name because it is very short and it is infrastructure for the SV-COMP benchmarks.

Anyway, here's my proposal for moving forward:

  1. I'll create a PR for most of the TravisCI changes that were in #139. Obviously the changes that test the "SVCB" infrastructure will have to be left out.
  2. We can do some "bikeshedding" to determine a name for the infrastructure currently known as "SVCB".
  3. Once we have a name I can create a PR that adds the infrastructure along with the examples and testing.

Note that the above proposes splitting #139 (along with a name change of some bits and adding licenses) into two PRs. I don't think it makes sense to split it into more PRs.

delcypher commented on June 5, 2024

A few ideas for a replacement name for the "SVCB" infrastructure:

  • SVCBS - SV-COMP Benchmark Support (python module name svcbs)
  • SVCOMP-BENCHMARK-SUPPORT - SV-COMP Benchmark Support (python module name svcomp_benchmark_support)
  • TSIFNAS - The Support infrastructure formerly known as SVCB (not a serious suggestion ;) )

delcypher commented on June 5, 2024

@zvonimir Seeing as you expressed some interest, you may wish to give your opinion here.

PhilippWendler commented on June 5, 2024

A few ideas for a replacement name for the "SVCB" infrastructure:

  • SVCBS - SV-COMP Benchmark Support (python module name svcbs)
  • SVCOMP-BENCHMARK-SUPPORT - SV-COMP Benchmark Support (python module name svcomp_benchmark_support)

Why are all these names specific to SV-COMP? As far as I see it, this repository is a benchmark suite for software verification, and SV-COMP is just one of its users. There are even tasks in this repository that are not part of SV-COMP and never will be.

PhilippWendler commented on June 5, 2024

Regarding the actual proposal, I think it is way too early to merge it. There are a lot of open questions about what we actually want, and these need to be discussed in the community. Afterwards we should talk about a design, and then an implementation could be proposed and eventually merged.

A prototype like you presented can be helpful for such discussions, so thank you for that, but we can have the discussion without it being merged.

The mailing list is probably a much better place for having general discussions than GitHub, e.g., because threading support is missing here. I won't start a discussion there, but I am going to list the first questions that came to my mind here such that the people who want to start discussing this can see what kind of questions I am referring to:

  • Do we even need a completely new infrastructure/metadata format at all? For example, C verification tasks that need pre-processing could simply be put in a directory per task, and we tell the verifier to verify *.c in that directory (see the sketch after this list). Necessary defines could be added to a header in the directory, and the architecture would be defined per category like it is now.
  • What should be the scope of the new infrastructure? Only tasks written in C(++) without preprocessing? C files that are already preprocessed? Java verification tasks? Other languages? The name and presentation of the current prototype look like it should be for all languages, but in fact it seems pretty specific to C (for example it contains svcomp.h etc., so either it is only for C or the proposal does in fact mix too many things together).
  • Do we want the metadata about the tasks to be read by the verifier, or only by the benchmarking infrastructure? Is it acceptable to force all verifiers to parse and understand a complex metadata file? Otherwise, how will we pass all the relevant information from the benchmarking infrastructure to the verifier?
  • How much metadata do we actually want/need? How much variation do we want to allow, for example regarding variants of the input language etc.? We need to consider that for the verifiers it is less effort if there is as little variation as possible, so maybe aim for as little metadata as technically necessary?
  • Do we need to be able to specify completely different metadata for every single verification task, or is it enough to have sets of files (e.g., per directory) that share the metadata except the expected verification result?
  • Do we want to have different verification tasks that consist of the same source file(s) and property, i.e., differ only in some other flag? If so, how will we present them in tables etc.? How big would be the danger of confusing these tasks? Can people not familiar with the setup identify and use such verification tasks correctly? Do different verification tasks that have the same source file(s) and property need to have different expected results?
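
To make the directory-per-task alternative from the first question above concrete, a driver for it could be roughly as simple as the following sketch. The directory layout, the defines header name, and the verifier command line are all hypothetical.

```python
# Hypothetical sketch of the "directory per task" alternative: verify every
# *.c file in a task directory and pass along an optional header of defines.
# The paths and the verifier name are made up for illustration.
import glob
import os

task_dir = "c/some-category/task_0001"  # hypothetical task directory
sources = sorted(glob.glob(os.path.join(task_dir, "*.c")))
defines_header = os.path.join(task_dir, "defines.h")

cmd = ["some-verifier"] + sources
if os.path.isfile(defines_header):
    cmd += ["-include", defines_header]  # forward the shared defines

print("would run:", " ".join(cmd))
```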

An important argument that we should have in mind when discussing this is the usability of all this outside of SV-COMP. For example, it is absolutely important that everybody can use the benchmark suite defined here as easily as possible when they do benchmarking for one of their papers, even when they need to compare against a verifier that is neither their own nor has at any time participated in SV-COMP and thus does not support any kind of metadata format defined here.

Otherwise we will again start getting papers with insufficient experimental evaluation because of a lack of (usable) benchmarks (avoiding this is one of the most important goals of SV-COMP), or worse, we would maybe even get papers with wrong experimental results because people used the benchmarks in a way that did not take all relevant metadata into account.

delcypher commented on June 5, 2024

@sfsiegel you may be interested in this.

sfsiegel commented on June 5, 2024

I really think we should go ahead and do the merge.
People can look at it, play with it, and see what they think.
All of Philipp's questions are reasonable, but Dan and I have also given many of them a lot of thought (with input from others), and we think we have come up with a solution that addresses them.
So I would like to see the basic system up and running with a few examples before making more suggestions. Then we will have something concrete to discuss.

delcypher commented on June 5, 2024

@PhilippWendler

Why are all these names specific to SV-COMP? As far as I see it, this repository is a benchmark suite for software verification, and SV-COMP is just one of its users. There are even tasks in this repository that are not part of SV-COMP and never will be.

Because the benchmark specification format is tied to and driven by SV-COMP. You may say that the benchmark suite is meant just for software verification, but in its current state it is really tied to SV-COMP (preprocessed files, .prp files, and .set files are all things dictated by SV-COMP).

delcypher commented on June 5, 2024

@PhilippWendler

Regarding the actual proposal, I think it is way too early to merge it. There are a lot of open questions about what we actually want, and these need to be discussed in the community. Afterwards we should talk about a design, and then an implementation could be proposed and eventually merged.

Merging doesn't necessarily mean that the design and implementation are finalised. My intention was that the contributed code would be iterated on once merged.

delcypher commented on June 5, 2024

@PhilippWendler

A prototype like you presented can be helpful for such discussions, so thank you for that, but we can have the discussion without it being merged.

The mailing list is probably a much better place for having general discussions than GitHub, e.g., because threading support is missing here. I won't start a discussion there, but I am going to list the first questions that came to my mind here such that the people who want to start discussing this can see what kind of questions I am referring to:

...

These are good questions to ask. I have thoughts about some of them, but these should be discussed on the mailing list. Notably, though, the list has been pretty silent about this so far, even though I solicited comments on some of these quite a while ago.

delcypher commented on June 5, 2024

I've decided I'm not going to waste my time trying to merge this again.

Development will proceed at https://github.com/delcypher/svcomp-build-mockup and discussion can happen in its issue tracker or on the sv-comp mailing list.
