kubervisor

The Kubervisor allows you to control which pods should receive traffic, based on anomaly detection. It is a new kind of health check system.

Unlike a readiness probe, the Kubervisor can be configured to remove pods from endpoints based on a global view of the health of the pod fleet. This guarantees that if all pods (or a majority) fall below the SLA, system stability does not get worse because each pod locally decides to "eliminate" itself.

Unlike a service mesh circuit breaker, the Kubervisor can act as a circuit breaker triggered by the servers' internal KPIs. The anomaly detection can be based on analysis done on an external data source such as Prometheus. This makes it easy to build complex analyses by leveraging the capabilities of the external system, such as PromQL in the case of Prometheus.

Kubervisor comes with its own resource (CRD) to configure the system:

  • define the service to monitor
  • define the anomaly detection mechanism and configure it
  • define the grace period and retry policies
  • activation/deactivation/dryrun operation
  • display the current health check of the service

Architecture

architecture diagram

  • The KubervisorService is the CRD (Kubernetes Custom Resource Definition). It is used to configure the kubervisor for a given Service. It also contains the status for the health of the service.
  • The Controller reads the KubervisorService to configure Breaker and Activator workers for the given service. It monitors service changes to adapt the configuration of the system. It also monitors the pods to build a cache for all the workers and to compute the health status of each service under control of the Kubervisor. The health of the service is persisted inside the status of the associated BreakConfig.
  • The Breaker is in charge of invoking the configured anomaly detection. Without going below the defined threshold or ratio, the Breaker relabels some pods to prevent them from receiving traffic.
  • The Activator is in charge of re-establishing traffic on pods after the defined period of inactivity (equivalent to the open state in a circuit breaker pattern). Depending on the configured policy and the number of retries performed on a pod, the Activator can decide to kill the pod or to pause it (out of traffic forever) for further investigation.
  • The Anomaly detector part (all the blue part in the diagram) is where the data analysis is actually performed. Depending on the KPI that you are working on (discrete value, continuous value) or the type of anomaly (ratio, threshold, trend ...), you can select an integrated implementation or delegate the analysis to an external system that returns the list of pods that are out of policy. The proposed internal implementations use data from Prometheus.

More information is available in the developer documentation page.

System Operations

Admin side

CRD

TODO

Scope

The Kubervisor is an operator that can run either in a dedicated namespace, covering only the resources of that namespace, or as a global operator acting in every namespace to which it has been granted access. The service account it uses determines the scope in which the Kubervisor works.

It must be granted the following permissions (verbs per resource):

  • get: pod, service
  • list: pod, service
  • update: pod, service
  • watch: pod, service
  • delete: pod
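As a minimal sketch, the permissions listed above could be expressed with standard Kubernetes RBAC objects for the cluster-wide case. The object names, the service account name and the namespace are assumptions for illustration, not values defined by the project. The list above does not mention access to the KubervisorService resources themselves, so no rule for them is shown here.

```yaml
# Sketch only: names and namespace are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubervisor                  # assumed name
rules:
- apiGroups: [""]                   # "" is the core API group (pods, services)
  resources: ["pods", "services"]
  verbs: ["get", "list", "update", "watch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["delete"]                 # delete is only needed on pods
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubervisor                  # assumed name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubervisor
subjects:
- kind: ServiceAccount
  name: kubervisor                  # assumed service account
  namespace: kubervisor             # assumed namespace
```

For the single-namespace case, the same rules would go into a Role and RoleBinding created in that namespace instead.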

Deployment

TODO (Helm ?)

User side

To configure the system, a user has to complete the following steps:

  • Create a KubervisorService resource in the namespace of the associated service
    • Select the service (by name)
    • Define the BreakConfiguration to configure the Anomaly Detection mechanism
    • Configure the Activator
  • Once the status of the KubervisorService is Ready, activate the system by adding the following label to the Selector of the service: kubervisor/traffic=yes
    • TODO: Alternatively use the command kubectl .....

To deactivate any effect of the Kubervisor for a given service, simply delete the label with key kubervisor/traffic from the Selector.
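As an illustration, an activated Service could look like the sketch below. The service name, the app label and the ports are hypothetical; only the kubervisor/traffic key and its yes value come from the steps above. The value is quoted because an unquoted yes can be parsed as a boolean by YAML, whereas label and selector values must be strings.

```yaml
# Sketch only: name, app label and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: myapplication
spec:
  selector:
    app: myapplication
    kubervisor/traffic: "yes"   # activates the Kubervisor; delete this line to deactivate
  ports:
  - port: 80
    targetPort: 80
```

With this selector, only pods that carry both labels are listed in the endpoints of the service, which is what lets the Breaker take a pod out of traffic by relabelling it.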

If you know that some pods are going to be under control of the Kubervisor, it is advised to add the label kubervisor/traffic=yes directly inside the pod template. This label must be part of the template only, not the selector!

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapplication
  labels:
    app: myapplication
spec:
  replicas: 3
  selector:
    matchLabels:                       # <-- kubervisor labels must not appear in the selector!
      app: myapplication
  template:
    metadata:
      labels:
        app: myapplication
        kubervisor/traffic: "yes"      # <-- add label for kubervisor (quoted: label values must be strings)
    spec:
      containers:
      - name: myapp
        image: myapp:1.7.9
        ports:
        - containerPort: 80

The controller automatically adds this label to pods that lack it. However, this only happens once the resources are synchronized in the controller (every couple of seconds in theory), and of course only if the controller is running. Having the label already set in the pod template avoids some corner cases when the controller is misbehaving or absent.
