
Comments (2)

bpkroth commented on May 28, 2024

Some additional notes:

For throughput- or latency-based benchmarks, it's not clear how to detect that a running trial is worse than a previous one, since a trial could theoretically speed up later in its run.

But for raw time-based benchmarks, we could track the worst value seen so far and abort any trial that exceeds it.
To do that, we'd need some additional metadata indicating that the benchmark is in fact seeking to minimize wallclock time.
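A minimal sketch of that tracking logic (all names here are hypothetical illustrations, not actual mlos_bench APIs):

```python
import time
from typing import Optional

class WorstCaseAbortPolicy:
    """Abort a running trial once its elapsed wallclock time exceeds the
    worst completed trial seen so far.  Only sound when the benchmark's
    objective really is to minimize wallclock time (per the metadata above)."""

    def __init__(self) -> None:
        self._worst_seen: Optional[float] = None  # worst completed time so far

    def should_abort(self, trial_start: float) -> bool:
        # Never abort before at least one trial has completed.
        if self._worst_seen is None:
            return False
        # The running trial is already guaranteed to be no better than
        # the worst completed one, so finishing it adds no information.
        return time.time() - trial_start > self._worst_seen

    def record_completed(self, elapsed: float) -> None:
        # Track the maximum (i.e., worst) completed wallclock time.
        if self._worst_seen is None or elapsed > self._worst_seen:
            self._worst_seen = elapsed
```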

What's tricky is how we incorporate metrics from aborted trials.
Imagine, for instance, that you wanted to explain why some params/trials were bad; by aborting those trials, you give up on gathering that data.

Moreover, we can't actually store a real time value for that trial, since we abort it early. Instead, we need to store it in the DB as "ABORTED" or some such, and then fabricate a value for it each time we train the optimizer.
Likely $W + \epsilon$, where $W$ is the worst value seen up until that point (i.e., found by serially examining the historical trial data).
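A sketch of that imputation step (the trial-record shape, field names, and epsilon value are assumptions for illustration):

```python
EPSILON = 1e-3  # small penalty beyond the worst observed value (assumption)

def impute_scores(trials: list[dict]) -> list[float]:
    """Walk historical trials in order, replacing ABORTED entries with
    W + epsilon, where W is the worst real value seen up to that point."""
    worst: float | None = None
    scores: list[float] = []
    for trial in trials:
        if trial["status"] == "ABORTED":
            # No real measurement was stored for this trial; fabricate
            # one that is just worse than anything completed before it.
            assert worst is not None, "an abort implies a prior completed trial"
            scores.append(worst + EPSILON)
        else:
            value = trial["score"]
            worst = value if worst is None else max(worst, value)
            scores.append(value)
    return scores
```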


bpkroth commented on May 28, 2024

Per discussions, we need the following (a rough config sketch follows the list):

  • a new Environment phase: abort
    • the plan is for users to add commands to their Environment configs that tell the system how to cancel and clean up a currently running run phase
    • those commands will be executed asynchronously
  • an additional config option to tell the scheduler when to invoke the early-abort logic for time-based benchmarks
    • specifically, it needs to know which metrics to look at
    • this should probably be per environment
      • for instance, right now we often don't tear down the VM for each trial, so the first trial's VM setup necessarily takes longer, and we shouldn't include that in our elapsed-time metrics
      • it could be that we start tracking the elapsed time of every single phase in each environment for each trial and try to infer things from that, or ...
      • we could also add a status or telemetry phase whose commands asynchronously poll the status of a run phase (or should it support other phases too?) in order to feed in-progress metrics back into the system, and allow specifying which of those metrics drives the abort decision (maybe just an implicit elapsed time, but probably not, since the run phase sometimes needs to reload the DB and other times doesn't, so it may take longer on occasion even when the actual benchmark portion doesn't)
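
A rough sketch of how those pieces might fit together in an Environment config, shown here as the equivalent Python dict. Every key name and script path below is an illustrative assumption, not the actual mlos_bench schema:

```python
# Illustrative shape only -- these keys mirror the bullets above and are
# NOT the real mlos_bench config schema.
environment_config = {
    "name": "benchmark-env",
    "run": ["./run_benchmark.sh"],
    # New phase: commands to cancel and clean up a currently running
    # `run` phase; the scheduler would execute these asynchronously.
    "abort": ["./kill_benchmark.sh", "./cleanup.sh"],
    # New phase: commands polled asynchronously to feed in-progress
    # metrics back to the scheduler while `run` executes.
    "status": ["./poll_progress.sh"],
    # Tell the scheduler which metric drives early-abort decisions,
    # per environment, so that one-time setup costs (e.g., the first
    # VM boot) can be excluded from the comparison.
    "early_abort": {
        "metric": "benchmark_elapsed_time",
        "minimize": True,
    },
}
```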
