
Comments (4)

spodila commented on July 29, 2024

You are right that a TaskScheduler is stateful and that there would be a single instance in a cluster. The size of the state held in Fenzo is proportional to the number of VMs (aka agents/hosts) and the number of tasks assigned. Other state, such as that related to autoscaling and groups, is too small to be a concern.
Although I have no specific data to quote, I used this test program to create 10,000 VMs (agents/hosts), each with 16 cores, filling the 160K cores with 45K tasks (1-, 8-, and 12-CPU tasks). The resident set size was about 750MB.
While this is not meant as a reference for sizing memory at a given scale, the quick hack shows a way to test your likely scale and measure the anticipated memory footprint as well as the performance to expect. Fenzo also makes it easy to test new plugins for constraints and fitness calculators. The LeaseProvider and TaskRequestProvider classes in the test package are useful for this, since they remove the need for actual agent hosts.

from fenzo.

huntc commented on July 29, 2024

Thanks for the reply @spodila.

What are your thoughts on resiliency? For example, if the process containing the TaskScheduler dies, what action do you take?

Thanks again for the dialogue.

(closing as you've addressed my primary question)


spodila commented on July 29, 2024

When the process containing the TaskScheduler starts, we initialize Fenzo with the entire state by calling the TaskScheduler.getTaskAssigner().call(...) method for each task that is already known to be running. Specifically, since we run multiple instances of our framework with ZooKeeper-based leader election, we perform this initialization upon being elected as the leader.
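The replay described above can be sketched roughly as follows. This is a minimal illustration and not Fenzo code: `SchedulerStateRestore` and `onElectedLeader` are hypothetical names, and a plain `BiConsumer<String, String>` (task id, hostname) stands in for whatever callback `TaskScheduler.getTaskAssigner()` actually returns. It also assumes the framework keeps its own durable record of which task runs on which host.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Sketch only: the BiConsumer is a stand-in for the scheduler's task assigner,
// not Fenzo's real type. All names here are hypothetical.
class SchedulerStateRestore {

    // Hypothetical hook, assumed to be invoked once this instance becomes leader.
    // Replays every known running task into the scheduler and returns the count.
    static int onElectedLeader(Map<String, String> runningTaskToHost,
                               BiConsumer<String, String> taskAssigner) {
        int restored = 0;
        for (Map.Entry<String, String> entry : runningTaskToHost.entrySet()) {
            // One call per running task, as described in the comment above.
            taskAssigner.accept(entry.getKey(), entry.getValue());
            restored++;
        }
        return restored;
    }

    public static void main(String[] args) {
        // Assumed to come from the framework's own persistent store.
        Map<String, String> known = new LinkedHashMap<>();
        known.put("task-1", "host-a");
        known.put("task-2", "host-b");

        // Records each replayed assignment instead of calling a real scheduler.
        List<String> replayed = new ArrayList<>();
        int count = onElectedLeader(known, (task, host) -> replayed.add(task + "@" + host));
        System.out.println(count + " assignments replayed: " + replayed);
    }
}
```

Because the loop makes one call per running task, startup time grows linearly with the number of running tasks.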

This does raise a concern about startup latency with a large number of running tasks. However, we haven't reached the point where that is the next big problem to solve. If it does concern you, I'd love to hear your thoughts on it and/or exchange ideas on solving it.


huntc commented on July 29, 2024

Perhaps a pluggable mechanism that a) requests state from other scheduler instances, returning a CompletionStage (aka Scala's Future), and b) provides a callback so that the scheduler can be notified of new state asynchronously.

This then permits multiple schedulers to work together. We could then back the request and the callback with CRDTs (for example).
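To make the shape of such a mechanism concrete, here is a rough sketch. Everything in it is hypothetical: `SchedulerStateSource` and `InMemoryStateSource` are illustrative names that do not exist in Fenzo, state is simplified to a task-id-to-hostname map, and the in-memory class merely simulates peers. A real implementation would put a network transport and a merge strategy (CRDT-based or otherwise) behind the same interface.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical plug-in point; none of these names are part of Fenzo.
interface SchedulerStateSource {
    // (a) Ask other scheduler instances for their view: task id -> hostname.
    CompletionStage<Map<String, String>> requestState();

    // (b) Register a callback that receives state arriving later.
    void onStateUpdate(Consumer<Map<String, String>> listener);
}

// In-memory stand-in for a real transport, for illustration only.
class InMemoryStateSource implements SchedulerStateSource {
    private final Map<String, String> state = new ConcurrentHashMap<>();
    private final List<Consumer<Map<String, String>>> listeners = new CopyOnWriteArrayList<>();

    @Override
    public CompletionStage<Map<String, String>> requestState() {
        // Completes immediately with a snapshot; a remote source would complete later.
        return CompletableFuture.completedFuture(new HashMap<>(state));
    }

    @Override
    public void onStateUpdate(Consumer<Map<String, String>> listener) {
        listeners.add(listener);
    }

    // Simulates a peer publishing one new assignment to all registered listeners.
    void publish(String taskId, String host) {
        state.put(taskId, host);
        Map<String, String> delta = Collections.singletonMap(taskId, host);
        listeners.forEach(listener -> listener.accept(delta));
    }
}
```

A scheduler could call `requestState()` once at startup and then rely on the `onStateUpdate` callback for everything that changes afterwards.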

I think you should also consider reentrancy in your API to support such a mechanism and concurrency in general.

