declarative-cluster-management's Introduction

Declarative Cluster Management

  1. Overview
  2. Download
  3. Pre-requisites for use
  4. Quick Start
  5. Documentation
  6. Contributing
  7. Information for developers
  8. Learn more

Overview

Modern cluster management systems like Kubernetes routinely grapple with hard combinatorial optimization problems: load balancing, placement, scheduling, and configuration. Implementing application-specific algorithms to solve these problems is notoriously hard to do, making it challenging to evolve the system over time and add new features.

DCM is a tool to overcome this challenge. It enables programmers to build schedulers and cluster managers using a high-level declarative language (SQL).

Specifically, developers represent cluster state in an SQL database and write the constraints and policies that should apply to that state using SQL. From the SQL specification, the DCM compiler synthesizes a program that, at runtime, can be invoked to compute policy-compliant cluster management decisions given the latest cluster state. Under the covers, the generated program efficiently encodes the cluster state as an optimization problem that can be solved using off-the-shelf solvers, freeing developers from having to design ad-hoc heuristics.

The high-level architecture is shown in the diagram below.

Download

The DCM project's groupId is com.vmware.dcm and its artifactId is dcm. We make DCM's artifacts available through Maven Central.

To use DCM from a Maven-based project, use the following dependency:

<dependency>
    <groupId>com.vmware.dcm</groupId>
    <artifactId>dcm</artifactId>
    <version>0.15.0</version>
</dependency>

To use within a Gradle-based project:

implementation 'com.vmware.dcm:dcm:0.15.0'

Pre-requisites for use

  1. We test regularly on JDK 11 and 16.

  2. We test regularly on OSX and Ubuntu 20.04.

  3. We currently support two solver backends.

    • Google OR-tools CP-SAT (version 9.1.9490). This is available by default when using the maven dependency.

    • MiniZinc (version 2.3.2). This backend is currently being deprecated. If you still want to use it in your project, or if you want to run all tests in this repository, you will have to install MiniZinc out-of-band.

      To do so, download MiniZinc from https://www.minizinc.org/software.html ... and make sure you are able to invoke the minizinc binary from your command line.

Quick start

Here is a complete program that you can run to get a feel for DCM.

import com.vmware.dcm.Model;
import org.jooq.DSLContext;
import org.jooq.impl.DSL;
import org.junit.jupiter.api.Test;

import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

public class QuickStartTest {

    @Test
    public void quickStart() {
        // Create an in-memory database and get a JOOQ connection to it
        final DSLContext conn = DSL.using("jdbc:h2:mem:");

        // A table representing some machines
        conn.execute("create table machines(id integer)");

        // A table representing tasks, that need to be assigned to machines by DCM.
        // To do so, create a variable column (prefixed by controllable__).
        conn.execute("create table tasks(task_id integer, controllable__worker_id integer, " +
                "foreign key (controllable__worker_id) references machines(id))");

        // Add four machines
        conn.execute("insert into machines values(1)");
        conn.execute("insert into machines values(3)");
        conn.execute("insert into machines values(5)");
        conn.execute("insert into machines values(8)");

        // Add two tasks
        conn.execute("insert into tasks values(1, null)");
        conn.execute("insert into tasks values(2, null)");

        // Time to specify a constraint! Just for fun, let's assign tasks to machines such that
        // the machine IDs sum up to 6.
        final String constraint = "create constraint example_constraint as " +
                "select * from tasks check sum(controllable__worker_id) = 6";

        // Create a DCM model using the database connection and the above constraint
        final Model model = Model.build(conn, List.of(constraint));

        // Solve and return the tasks table. The controllable__worker_id column will either be [1, 5] or [5, 1]
        final List<Integer> column = model.solve("TASKS")
                .map(e -> e.get("CONTROLLABLE__WORKER_ID", Integer.class));
        assertEquals(2, column.size());
        assertTrue(column.contains(1));
        assertTrue(column.contains(5));
    }
}

Documentation

The Model class serves as DCM's public API. It exposes two methods: Model.build() and model.solve().

  • Check out the tutorial to learn how to use DCM by building a simple VM load balancer
  • Check out our research papers for the back story behind DCM
  • The Model API Javadocs

Contributing

We welcome all feedback and contributions! ❤️

Please use GitHub issues for user questions and bug reports.

Check out the contributing guide if you'd like to send us a pull request.

Information for developers

The entire build, including unit tests, can be triggered from the root folder with the following command (make sure to set up both solvers first):

$: ./gradlew build

To avoid documentation drift, code snippets in a documentation file (like the README or tutorial) are embedded directly from source files that are continuously tested. To refresh these documentation files:

$: npx embedme <file>

The Kubernetes scheduler also comes with integration tests that run against a real Kubernetes cluster. It goes without saying that you should not point to a production cluster as these tests repeatedly delete all running pods and deployments. To run these integration-tests, make sure you have a valid KUBECONFIG environment variable that points to a Kubernetes cluster.

We recommend setting up a local multi-node cluster and a corresponding KUBECONFIG using kind. Once you've installed kind, run the following to create a test cluster:

 $: kind create cluster --config k8s-scheduler/src/test/resources/kind-test-cluster-configuration.yaml --name dcm-it

The above step will create a configuration file in your home folder (~/.kube/kind-config-dcm-it). Make sure you initialize a KUBECONFIG environment variable to point to that path.

You can then execute the following command to run integration-tests against the created local cluster:

$: KUBECONFIG=~/.kube/kind-config-dcm-it ./gradlew :k8s-scheduler:integrationTest

To run a specific integration test class (example: SchedulerIT from the k8s-scheduler module):

$: KUBECONFIG=~/.kube/kind-config-dcm-it ./gradlew :k8s-scheduler:integrationTest --tests SchedulerIT

Learn more

To learn more about DCM, we suggest going through the following references:

Talks:

Research papers:

declarative-cluster-management's People

Contributors

ahmedwaleedmalik, ajsangeetha, askiad, faria-kalim, hunhoffe, jchesterpivotal, kexinrong, lalithsuresh, reith

declarative-cluster-management's Issues

capacityConstraint will throw exception on empty domain, should it?

Currently, model.solve() with the or-tools backend throws a runtime exception if a capacityConstraint is evaluated on an empty domain set. Is that necessary? For now, library users can check the domain by querying the table or view that the capacityConstraint applies to, but they must also synchronize the model's data cache with the database query execution until there is an API to query the model's cached data (is there one?).

Also, the k8s-scheduler implementation assumes model.solve() does not throw exceptions (otherwise the scheduler execution loop is terminated). Maybe it's better to make sure operations never throw exceptions, or perhaps only checked exceptions?
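
Until that question is settled, a caller could keep its loop alive by guarding each solve call. The following is a minimal sketch using only the JDK. The names ResilientLoop, trySolve, and runRounds are hypothetical, and each Supplier merely stands in for one model.solve() invocation; none of this is part of DCM or the k8s-scheduler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

public class ResilientLoop {

    // Runs one solve attempt; a runtime exception becomes an empty result
    // instead of propagating and terminating the caller's loop.
    static <T> Optional<T> trySolve(Supplier<T> solve) {
        try {
            return Optional.ofNullable(solve.get());
        } catch (RuntimeException e) {
            System.err.println("solve failed, skipping this round: " + e.getMessage());
            return Optional.empty();
        }
    }

    // Hypothetical scheduling loop: each supplier stands in for one model.solve() call.
    static <T> List<T> runRounds(List<Supplier<T>> rounds) {
        List<T> results = new ArrayList<>();
        for (Supplier<T> round : rounds) {
            trySolve(round).ifPresent(results::add);
        }
        return results;
    }

    public static void main(String[] args) {
        List<Supplier<String>> rounds = List.of(
                () -> "placement-1",
                () -> { throw new IllegalStateException("empty domain"); },
                () -> "placement-3");
        // The failing round is skipped; the other two results are kept.
        System.out.println(runRounds(rounds));
    }
}
```

Catching RuntimeException at the loop boundary is a blunt instrument: it keeps the loop running but also hides genuine bugs, which is one argument for the checked-exception option raised above.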

Remove controllable__ syntax

Either dynamically declare variable columns during model initialization or support annotations in the schema.

Improve documentation

  • Improve Model.java Javadocs
  • Do not export Javadocs for classes that are not part of the public API
  • Add a docs/ folder
  • Separate top-level README into multiple parts, such as factoring out the tutorial into the docs/ folder or the examples/ folder.
  • Add API documentation

Reduce presolve times

Pod placement times especially at low batching sizes (like one pod) are dominated by presolve costs.

For example, the overall time to place a single pod in a 1000 node cluster as per JMH benchmarks is roughly 57ms (from pod arrival via the informer API, up to a binding decision being made):

Benchmark                      (numThreads)  (solverToUse)  Mode  Cnt           Score           Error  Units
EndToEnd.testSinglePodPlacement             1        ORTOOLS  avgt    3  5785447107.000 ± 737202520.836  ns/op
EndToEnd.testSinglePodPlacement             2        ORTOOLS  avgt    3  5785887140.333 ± 324973887.983  ns/op

Profiling these invocations via async-profiler, we find that presolve times are dominating. From a single example solver invocation:

Parameters: max_time_in_seconds: 1 log_search_progress: true num_search_workers: 2 cp_model_probing_level: 0
Optimization model '':
#Variables: 9028 (3 in objective)
 - 1 in [-2147483648,2147483647]
 - 10 in [0,1]
 - 3 in [0,2147483647]
 - 1 in [2,1001]
 - 9013 constants in {0,1,2,3,4,5,6,7,8,9,10,11,12 ... 993,994,995,996,997,998,999,1000,1001,5999,7900}
#kBoolAnd: 4 (#enforced: 4) (#literals: 4)
#kBoolOr: 4 (#enforced: 4) (#literals: 4)
#kCumulative: 6
#kInterval: 1001
#kLinear1: 18 (#enforced: 12)
*** starting model presolve at 0.00s
- 8019 affine relations were detected.
- 8019 variable equivalence relations were detected.
- rule 'bool_and: non-reified.' was applied 2 times.
- rule 'bool_or: always true' was applied 2 times.
- rule 'bool_or: only one literal' was applied 5 times.
- rule 'bool_or: removed enforcement literal' was applied 2 times.
- rule 'cumulative: no intervals' was applied 4 times.
- rule 'cumulative: removed intervals with no demands' was applied 6 times.
- rule 'enforcement literal not used' was applied 1 time.
- rule 'false enforcement literal' was applied 4 times.
- rule 'interval: unused, converted to linear' was applied 996 times.
- rule 'linear: empty' was applied 999 times.
- rule 'linear: fixed or dup variables' was applied 999 times.
- rule 'linear: infeasible' was applied 3 times.
- rule 'linear: size one' was applied 9 times.
- rule 'objective: variable not used elsewhere' was applied 2 times.
- rule 'presolve: iteration' was applied 1 time.
- rule 'true enforcement literal' was applied 9 times.
Optimization model '':
#Variables: 9 (1 in objective)
 - 1 in [0,2147483647]
 - 1 in [1,1000]
 - 1 in [2,1001]
 - 6 constants in {1,2,3,4,5,109}
#kCumulative: 2
#kInterval: 5
*** starting Search at 0.03s with 2 workers and strategies: [ auto, lp_br, helper, rnd_lns_auto, var_lns_auto, cst_lns_auto, rins/rens_lns_auto ]
#Bound   0.03s best:inf   next:[1,2.14748365e+09] auto
#1       0.03s best:2     next:[1,1]      auto num_bool:3
#2       0.03s best:1     next:[1,0]      auto num_bool:4
#Done    0.03s  auto
CpSolverResponse:
status: OPTIMAL
objective: 1
best_bound: 1
booleans: 4
conflicts: 0
branches: 3
propagations: 3
integer_propagations: 11
walltime: 0.0454228
usertime: 0.0454229
deterministic_time: 4.8e-07
primal_integral: 0
19:56:24.110 [computation-thread-0] INFO  org.dcm.Model - Solver has run successfully in 60546538ns. Processing records.

From a total runtime of roughly 45ms spent within the solver, 30ms is spent within the presolve phase.

Add usage documentation and examples

This is a tracking issue for adding usage documentation to DCM.

  • Update README.md with fully runnable examples of how to use DCM.
  • Add an examples folder with fully runnable examples.

Simplify installation

  • Once Google OR-Tools becomes available as a Maven package, we can save users the trouble of manually installing the solver. Dependent on (google/or-tools#202).
  • Make MiniZinc optional for users (but not for developers, if they want to run all the tests)

Bump up to JDK 15

Text blocks are no longer in preview as of JDK 15. Using them will help clean up a significant amount of clutter in the codebase from constructing long SQL strings.

Being worked on on the jdk15 branch.
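
To illustrate the kind of clutter in question, here is a small sketch (requires JDK 15 or later) that builds the constraint string from the quick start both ways. The class and method names are made up for this example.

```java
public class TextBlockExample {

    // Pre-JDK 15 style: a long SQL string assembled by concatenation.
    static String concatenated() {
        return "create constraint example_constraint as " +
               "select * from tasks check sum(controllable__worker_id) = 6";
    }

    // JDK 15+ text block; the trailing backslash joins the two lines,
    // so the resulting string contains no newline.
    static String textBlock() {
        return """
               create constraint example_constraint as \
               select * from tasks check sum(controllable__worker_id) = 6""";
    }

    public static void main(String[] args) {
        // Both forms produce the same string.
        System.out.println(concatenated().equals(textBlock()));
    }
}
```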

k8s-scheduler: Pod.Name probably should not be used as primary key

Since different namespaces can contain Pods with the same name, it seems better to define the pods_info table's primary key as Pod.UUID and base foreign keys on the UUID. I see that UUID is missing from pods_info; if there is a reason for that, a compound key of Pod.Name and Pod.Namespace looks more appropriate.
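
The collision can be shown with plain Java maps standing in for the table. This is only an analogy under made-up data: the class name, the {namespace, name, node} array layout, and the values are assumptions for illustration, not the pods_info schema.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PodKeyExample {

    // Index pods by name only: pods from different namespaces collide.
    static Map<String, String> indexByName(List<String[]> pods) {
        Map<String, String> index = new HashMap<>();
        for (String[] pod : pods) {
            index.put(pod[1], pod[2]);
        }
        return index;
    }

    // Index pods by the compound (namespace, name) key.
    static Map<List<String>, String> indexByNamespaceAndName(List<String[]> pods) {
        Map<List<String>, String> index = new HashMap<>();
        for (String[] pod : pods) {
            index.put(List.of(pod[0], pod[1]), pod[2]);
        }
        return index;
    }

    public static void main(String[] args) {
        // Each entry is {namespace, name, node}; the values are made up.
        List<String[]> pods = List.of(
                new String[] {"team-a", "web", "node-1"},
                new String[] {"team-b", "web", "node-2"});
        System.out.println(indexByName(pods).size());             // one entry survives
        System.out.println(indexByNamespaceAndName(pods).size()); // both entries kept
    }
}
```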

Use consistent syntax for hard and soft constraints

With #82 being fixed, there is little reason to have a different query structure for hard and soft constraints. We can instead structure both constraints as a view that produces a set of records, followed by a CHECK or MAXIMIZE clause, followed by an expression. This would allow a user to easily switch between hard/soft constraints if they want.

API improvements to support ddlog

While H2 has been convenient thus far, it might help to separate out the JOOQ/JDBC-specific API boundary from Model.updateData() behind a more abstract interface that can be used to collect the input data required by the solver. This will allow us to interface with relational engines like ddlog, which JOOQ cannot interface with.

API improvements

Following #73, some API improvements that are in the works:

  • ModelException should exclusively correspond to issues with the model. Ideally, it'll only be thrown during model creation.
  • SolverException should only be thrown when invoking the underlying solver, or the solver API. This corresponds to bugs in the input data + solver-interaction. For now, SolverException.reason() will convey why the model failed.
  • Unsat cores from or-tools. Use a natural API (maybe in the form of tables?) to convey why a model failed.

@reith do add any further requirements from your end here.

Refactor IR internal naming

Much of the IR uses naming around monoid comprehensions, even though our IR has since evolved to only deal with list comprehensions.

Use only half-reified constraints where appropriate

Currently, the or-tools CP-SAT backend fully reifies all constraints when constructing expressions. This is overly conservative. We can reduce the number of the intermediate variables and constraints by only using half-reified constraints when appropriate (for example, logical constraints that have to be true).

Improve intermediate view type inference

For now, type inference assumes that all computed columns are of type IntVar by default. However, this does not work if an intermediate view is consumed in a subsequent Group By.

CapacityConstraint scale and limits

capacityConstraint demands and capacities are Integers and, if not normalized, can easily overflow. In fact, the k8s-scheduler implementation will probably fail to schedule Pods requesting more than 4Gi of memory. It seems ortools can work with Longs; in that case, would changing them have a negative impact on solving performance?
Currently, I scale down values in my controller before storing them in the database, but maybe the implementation can be enhanced. There is also a scale factor that further reduces the overflow limit by a factor of 1000. If capacityConstraint is expected to work only with Integers, it would be good to detect overflow and throw an exception in the generated code's iteration. Better still, model creation could fail if the database schema suggests that overflow is possible.

Also, there are two possible division-by-zero cases: here, when a node has no capacity, and here, when some capacity is zero on all nodes. I was surprised by the second case because, in my scenario, the capacityConstraint was called for a different task that I wasn't solving the model for. The policy that was invoked was for scheduling Pods, but I didn't have any Pods in the database or model, so all demands were zero there and the function could probably have returned sooner.
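
The scale-down workaround with an explicit overflow check might look like the sketch below. It assumes, based only on the description above, that values are later multiplied by a factor of 1000. CapacityScaling, toSafeInt, and SCALE_FACTOR are hypothetical names and do not correspond to DCM code.

```java
public class CapacityScaling {

    // Assumed from the issue text: values are multiplied by 1000 internally.
    static final int SCALE_FACTOR = 1000;

    // Converts a raw quantity (e.g. bytes) to a coarser unit and verifies that
    // the result still fits in an int after SCALE_FACTOR is applied.
    // Integer division truncates, so sub-unit remainders are dropped.
    static int toSafeInt(long rawValue, long unitDivisor) {
        long scaledDown = rawValue / unitDivisor;
        // Throws ArithmeticException if the scaled product does not fit in an int.
        Math.toIntExact(Math.multiplyExact(scaledDown, (long) SCALE_FACTOR));
        return Math.toIntExact(scaledDown);
    }

    public static void main(String[] args) {
        long fourGiB = 4L * 1024 * 1024 * 1024;
        // Expressed in MiB, the value fits comfortably.
        System.out.println(toSafeInt(fourGiB, 1024L * 1024));
        try {
            // Expressed in raw bytes, the value does not fit in an int.
            toSafeInt(fourGiB, 1L);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

Failing fast like this surfaces the problem at the point where data enters the model rather than as a wrong placement decision later.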

Rewrite affinity/anti-affinity views

Users provide labels that describe which nodes pods are affine and anti-affine to. These labels help shortlist nodes that are whitelisted and blacklisted for the pod. Currently, we combine this information to create a final whitelist of nodes for the pod. This can be problematic if the user only provides us with a blacklist; then, we build a view that gives us the set of rows = {all nodes - blacklisted nodes}. This list is large and can be expensive to pull out. Instead, we can try to make use of the fact that the blacklist is usually small and rewrite our constraint to ignore the blacklisted nodes while placing the pod.
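
The difference between the two strategies can be sketched in plain Java, with sets standing in for the SQL views. This is an analogy rather than DCM's view or constraint code, and all names here are hypothetical.

```java
import java.util.Set;
import java.util.stream.Collectors;

public class BlacklistFilter {

    // Current approach described above: materialize {all nodes - blacklisted nodes}.
    // The result grows with the cluster size.
    static Set<String> buildWhitelist(Set<String> allNodes, Set<String> blacklist) {
        return allNodes.stream()
                .filter(node -> !blacklist.contains(node))
                .collect(Collectors.toSet());
    }

    // Alternative: keep only the (usually small) blacklist and test each
    // candidate node against it while placing the pod.
    static boolean isAllowed(String candidate, Set<String> blacklist) {
        return !blacklist.contains(candidate);
    }

    public static void main(String[] args) {
        Set<String> allNodes = Set.of("n1", "n2", "n3", "n4");
        Set<String> blacklist = Set.of("n2");
        System.out.println(buildWhitelist(allNodes, blacklist).size());
        System.out.println(isAllowed("n2", blacklist));
        System.out.println(isAllowed("n3", blacklist));
    }
}
```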

Migrate to the Apache Calcite parser

The presto parser API is not extensible, forcing us to shoehorn DCM's check/maximize syntax on top of views.

Calcite's parser is extensible, allowing us to add our own DDL for constraints, and restrict the subset of SQL we'd like to support more cleanly.

Expression short-circuiting and null friendliness for operations in Ops

An operation of the form CHECK x = 10 OR var1 = col1 currently gets compiled down to a series of ortools operations that construct the full expression without returning early. If x=10 but var1 is null, we still get an NPE when passing an argument to Ops.

The above operation is equivalent to WHERE x != 10 CHECK var1 = col1, which generates an if-expression that only encodes the constraints for rows that pass the x != 10 predicate. Doing so makes this safe for use cases where var1 might be null if x=10 but has a value otherwise.

It's worth considering what the behavior should be in the presence of nulls and whether we can potentially rewrite the IR.
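
The failure mode can be reproduced in plain Java without the solver. The sketch below is only an analogy for the generated code: eagerOr and guardedOr are hypothetical helpers, not methods of Ops.

```java
public class ShortCircuitExample {

    // Eager form: both operands are fully evaluated before they are combined,
    // so unboxing a null var1 throws even when x == 10 already decides the result.
    static boolean eagerOr(int x, Integer var1, int col1) {
        boolean left = (x == 10);
        boolean right = (var1 == col1); // unboxes var1; NullPointerException if null
        return left | right;
    }

    // Guarded form: the comparison is only evaluated for rows where x != 10,
    // mirroring WHERE x != 10 CHECK var1 = col1.
    static boolean guardedOr(int x, Integer var1, int col1) {
        if (x == 10) {
            return true;
        }
        return var1 == col1;
    }

    public static void main(String[] args) {
        System.out.println(guardedOr(10, null, 5));
        try {
            eagerOr(10, null, 5);
        } catch (NullPointerException e) {
            System.out.println("eager evaluation hit a null operand");
        }
    }
}
```

For non-null inputs the two forms agree, so the open question is only what the result should be when an operand is null.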

Improve examples/

Add examples for:

  • Incremental placement vs global re-shuffling
  • use case with A/B testing

Access tuples by index and not field name

Following #85, we have more instances in the generated code where we used field names to reference cells from Jooq Records. This is proving fragile given the three-way interaction between:

  • DCM canonicalizing table/field names (upper case always)
  • JOOQ using a schema file for code generation but then connecting to a...
  • database at runtime with its own rules for case sensitivity

The fix on the DCM side is to always use field indices instead of field names to refer to values within JOOQ Records, without any loss of readability.
