Comments (7)

jlink commented on July 20, 2024

I chose a different path. Random number generation now uses the whole domain between the allowed min and max values. However, the full domain is divided into several partitions so that lower numbers are generated with a higher probability. The cutoff point is determined by Math.max(tries / 2, 10).
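A minimal sketch of the partitioning idea, for illustration only. It is not jqwik's actual implementation: the class name, the restriction to two partitions over [0, max], and the 50% weight for the lower partition are all assumptions. Only the cutoff formula is taken from the comment above.

```java
import java.util.Random;

public class PartitionedIntSketch {

    // Cutoff as described in the comment: Math.max(tries / 2, 10).
    static int cutoff(int tries) {
        return Math.max(tries / 2, 10);
    }

    // Hypothetical two-partition generator for the range [0, max], max >= 0.
    // Half of the draws (an assumed weight) come from the lower partition
    // [0, cut]; the rest come from the upper partition (cut, max].
    static int next(Random random, int max, int tries) {
        int cut = Math.min(cutoff(tries), max);
        boolean upperIsEmpty = (cut == max);
        if (random.nextBoolean() || upperIsEmpty) {
            return random.nextInt(cut + 1);           // uniform in [0, cut]
        }
        return cut + 1 + random.nextInt(max - cut);   // uniform in [cut + 1, max]
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        for (int i = 0; i < 10; i++) {
            System.out.println(next(random, Integer.MAX_VALUE, 1000));
        }
    }
}
```

With tries = 1000, roughly half of the generated values stay at or below 500, while the other half can still reach any value up to max.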

@rkraneis I tried with your original example and evenNumbersAreEvenAndSmall is now successfully falsified.

from jqwik.

jlink commented on July 20, 2024

Documenting the range in which random values are generated is surely necessary. I'll put that on top of my todo list.

There is one fundamental question, though, for which I haven't found a good answer yet: How do you determine a good domain range for numeric types? There are two opposing forces:

  • Making the range small leads to higher coverage in the (supposedly more common) area of smaller numbers, but misses out on higher numbers.
  • Making the range large leads to lower coverage in smaller numbers.
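The trade-off in these two bullets can be made concrete with a little arithmetic. Assuming uniform sampling over [0, max], the expected number of tries that land in a small window [0, small] is tries * (small + 1) / (max + 1). The class and method names below are made up for illustration.

```java
public class CoverageSketch {

    // Expected number of uniform draws from [0, max] that fall into [0, small].
    // Uses double arithmetic so that max + 1 cannot overflow.
    static double expectedHits(int tries, long small, long max) {
        return tries * ((small + 1.0) / (max + 1.0));
    }

    public static void main(String[] args) {
        // Small range: plenty of hits below 100, but nothing above 999 is ever tried.
        System.out.println(expectedHits(1000, 99, 999));
        // Full int range: values below 100 are practically never generated.
        System.out.println(expectedHits(1000, 99, Integer.MAX_VALUE));
    }
}
```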

The PBT tools whose implementations I've looked at so far all cut off the range at some value. They just determine that value with different formulas. I haven't found a compelling rationale for any of these formulas yet.

Do you have a suggestion for a formula that's objectively better than (tries/2 - 3)? To be frank, I've already forgotten why I added the "-3"...

What I think would be a real improvement is to choose a different probability distribution. But that is complicated enough that I'd think twice about whether the effort is really worth it.

rkraneis commented on July 20, 2024

Yes, I fully agree that this is not an easy topic (which is why I also proposed just documenting it). Naïvely, I would think that logarithmic coverage, plus explicitly including <number>.MIN_VALUE, -1, 0, 1 and <number>.MAX_VALUE, might give better all-purpose coverage. But you are right: the other frameworks I looked at (VavrTest, JunitQuickcheck, QuickTheories and scalacheck) use a linear distribution between <number>.MIN_VALUE and <number>.MAX_VALUE. Only scalacheck also includes -1, 0 and 1.
FWIW, my motivation for logarithmic coverage would be Benford's Law. But that might just as well be opinion ...
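One possible reading of "logarithmic coverage", sketched under assumptions: pick the bit length uniformly first, then fill in the lower bits at random, so that every power-of-two magnitude band is equally likely. None of the frameworks mentioned is claimed to work this way; the class and method names are hypothetical, and only non-negative longs are covered.

```java
import java.util.Random;

public class LogUniformSketch {

    // Draws a non-negative long whose bit length is uniform in 0..63,
    // so each magnitude band [2^(k-1), 2^k - 1] is equally likely.
    static long nextNonNegative(Random random) {
        int bits = random.nextInt(64);               // number of significant bits, 0..63
        if (bits == 0) {
            return 0L;
        }
        long high = 1L << (bits - 1);                // top bit fixes the magnitude band
        long low = random.nextLong() & (high - 1);   // random lower bits
        return high | low;
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        for (int i = 0; i < 10; i++) {
            System.out.println(nextNonNegative(random));
        }
    }
}
```

Spreading values evenly across orders of magnitude is the kind of distribution Benford's Law describes. The cost is that any particular large band receives only about 1/64 of the draws.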

jlink commented on July 20, 2024

FWIW: Integral numbers already include 0, 1, -1, MIN and MAX explicitly. Decimal/floating point numbers also include a few more. You don't see them in your example b/c you filter them out.

A logarithmic coverage wouldn't have succeeded in falsifying your property either, would it? Depending on the log base that is...

rkraneis commented on July 20, 2024

Good catch, I did not follow the code completely through :-). The drop-off of the logarithmic distribution is actually already much too steep (log2(Long.MAX_VALUE) ≈ 63 XD). A linear (uniform) distribution would have caught it, as shown in the initial question.

rkraneis commented on July 20, 2024

And I have to correct myself regarding scalacheck: it actually seems to be biased towards the lower end when given explicit bounds.

jlink commented on July 20, 2024

Change available in version 0.8.5-SNAPSHOT
