Giter Club home page Giter Club logo

triscale's People

Contributors

arurke avatar dependabot[bot] avatar johndoe12312 avatar romain-jacob avatar triscale-anon avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Forkers

arurke sekinat95

triscale's Issues

KPI calculated even if too little data supplied

This might be an issue in TriScale, or me misunderstanding a use-case.

TL;DR: analysis_kpi() returns a valid value when too few data-points supplied - if the "unintuitive" bound is selected (upper for percentile < 50, and vice versa).

Background: The intuitive way to calculate a KPI is to specify a bound which gives us the "worst case" (upper when percentile > 50, and vice versa). This allows us to make the "performance is at least X"-statements. However, I was thinking there was information in the other bound as well. This would show the width of the CI, and we could learn if the given metric varies a lot between runs. The first example coming to mind is industrial scenarios, where not only the maximum latency is interesting, but also its variability.

With this background I was routinely calling analysis_kpi() twice, once with bound set to upper and another with lower. Doing this I noticed I would be getting a valid value when the "unintuitive" bound was selected (upper for percentile < 50, and vice versa), even if I had too little data.

Example with too little data:

import triscale as triscale
import numpy as np

data = np.random.randint(0, 10, size=(5))

settings = {"bound": "lower", "percentile": 99,
            "confidence": 95, "bounds": [min(data), max(data)]}

independent, kpi = triscale.analysis_kpi(
                    data,
                    settings,
                    verbose=False)
print("KPI: " + str(kpi))

With bound set to "upper", the KPI correctly returns NaN. With bound set to "lower", a number is returned.

Division by zero in convergence test if yMax == yMin,

In analysis_metric(), if all elements in "y-axis" data are the same (i.e. min and max are identical), it leads to a division by zero in the convergence test, see https://github.com/romain-jacob/triscale/blob/master/helpers.py#L66 and two lines above.

The issue can easily be reproduced:

x = np.arange(0, 100)
y_same_value = np.full(len(x), 100)
df = pd.DataFrame(
    {'x': x,
     'y': y_same_value})

triscale.analysis_metric( 
    df,
    metric = {'measure': 50},
    convergence = {'expected': True})

>  triscale/helpers.py:66: RuntimeWarning: invalid value encountered in true_divide

An intuitive solution is to simply state that the data is converged in such cases with identical elements, but perhaps I am missing something about the statistics so I'll leave the PR to someone else ๐Ÿ˜…

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.