k8s-externalipcontroller's Introduction

External IP Controller

Introduction

One of the possible ways to expose k8s services on bare metal deployments is to use External IPs. Each node runs a kube-proxy process, which programs iptables rules to trap requests to External IPs and redirect them to the correct backends.

So, in order to access a k8s service from the outside, we just need to route public traffic to one of the k8s worker nodes, all of which run kube-proxy and thus have the needed iptables rules for External IPs configured.
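
As a minimal illustration of the mechanism (the service name, selector and address below are placeholders, not taken from this project), a service exposed via External IPs looks like this:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
  - port: 80
  externalIPs:
  - 10.0.0.50

kube-proxy on every node will then trap traffic arriving at 10.0.0.50:80 and redirect it to the service's backends.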

Proposed solution

External IP Controller is a k8s application which is deployed on top of a k8s cluster and configures External IPs on k8s worker node(s) to provide IP connectivity.

Demo

asciicast

How to run tests

Install dependencies and prepare a kubernetes dind cluster. It is assumed that Go 1.7.x is already installed.

make get-deps

Build the necessary images and run the tests.

make test

Use make help to see all the options available.

How to start using this?

Both the controller and the scheduler operate on third party resources and require them to be created. Since kubernetes 1.7 most installations enable RBAC, so we need to grant our application the correct permissions. For a testing environment you can use:

kubectl apply -f examples/auth.yaml
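
For testing, a grant like the following is enough (a minimal sketch, not necessarily the contents of examples/auth.yaml; it simply binds the default service account to cluster-admin):

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: ipcontroller-testing
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: default

Do not use such a broad grant outside of test environments.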

In case you are using a kubeadm dind environment, deploy the claim controller and scheduler like this:

kubectl apply -f examples/claims/

For any other environment you need to ensure that the --iface option in the examples/claims/controller.yaml file is correct. This interface will be used for IP assignment.
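
For illustration, the relevant fragment of the controller spec looks roughly like this (a sketch; the actual file may differ, and eth0 is just an example interface):

      containers:
      - name: ipcontroller
        image: mirantis/k8s-externalipcontroller
        args:
        - --iface=eth0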

If you want to use auto allocation from an IP pool, you need to create at least one such pool. An example is provided in examples/ip-pool.yml; it can be applied with kubectl once the third party resources have been created. Services are not resynced after a pool is created, so please ensure the pool exists before you start requesting IPs.
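
For reference, a pool definition looks like the following (the CIDR and ranges are sample values; adjust them to your network):

apiVersion: ipcontroller.ext/v1
kind: IpClaimPool
metadata:
    name: test-pool
spec:
    cidr: 10.30.118.128/28
    ranges:
        - - 10.30.118.128
          - 10.30.118.129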

We also have a basic example with an nginx service and pods: examples/nginx.yaml. It creates an nginx deployment with a single replica and a service of type LoadBalancer.
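
A minimal sketch of such a service (not necessarily the exact contents of examples/nginx.yaml):

apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80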

For each service that requires an IP, an ipclaim object will be created. You can list all ipclaims with:

kubectl get ipclaims

Notes on CI and end-to-end tests

In the tests we want to verify that IPs are reachable remotely. For this purpose we use the --testlink option in the e2e tests: during the tests we configure that link with an IP from the network used in the tests. This is also the reason why we run the e2e tests with sudo. The requirement here is that all kubernetes nodes must be in the same L2 domain. Our application assigns IPs to a node, and in a dind-based setup those nodes are regular containers. Therefore, to guarantee connectivity in our CI, we need to assign an IP on the bridge used by docker.
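
For example (the address and bridge name are hypothetical, not taken from the CI config):

sudo ip addr add 10.107.10.1/24 dev docker0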

For simplicity we want to limit the number of running ipcontrollers to 2. To make this work with kubeadm-dind-cluster we set the label ipcontroller= on the kube workers, and the tests use this label as the node selector for the daemonset pods.
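
For example (the node names are placeholders):

kubectl label node kube-node-1 ipcontroller=
kubectl label node kube-node-2 ipcontroller=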

k8s-externalipcontroller's People

Contributors

adidenko, alexeykasatkin, dshulyak, pigmej, velp

k8s-externalipcontroller's Issues

Add basic e2e tests

Create a service with multiple externalIPs, verify that each IP is pingable from another node, and verify that the service is reachable from another node as well.

Netmask is forced in claimscheduler

Currently the netmask for an IP is set in the claimscheduler deployment.
Would it make sense to set it in the externalIp definition when assigning it, in case there are different possible IP ranges?

Clean orphaned ipclaims

If the scheduler process is down while the relevant services are removed, we will miss the deletion events, and thus some ipclaims won't be cleared.

More e2e destructive test cases for daemon set version

Currently we cover the partition scenario, e.g. a node that is not responding.

To add:

  • remove the daemon set from a certain node by removing that node from the ds selector, or destroy the controller entirely (delete the existing daemon set); verify that the IPs were reallocated, then bring the daemon set back and verify that the stale IPs were purged

  • IPs evenly distributed between nodes: when a node goes missing, only the relevant IPs are rescheduled (watch events)

  • both nodes are dead: no IPs should be purged

Node fetching error in json

After repeating the actions below, I faced a problem.

  • annotate external-ip:auto on the service
  • delete the service
  • create the service and annotate it again

The external IP is allocated, but the loopback address isn't configured, and
other ipclaims aren't rescheduled.
The controller log says:

controller.go:174] Error fetching node node-002 : json: cannot unmarshal number 1e+06 into Go value of type int64

Is there any way to solve this problem?

Fix Readme

The readme is inconsistent now: it is an unstructured set of information about the application. We need to define what should be there, make it structured, and move some topics into documentation.

Tolerate node failures and balance external traffic load across multiple kube-proxy nodes

In the current implementation we support only a single replica managed by a kubernetes deployment object.
Obviously this won't allow us to utilize multiple nodes with external connectivity.

To make it work we will change the deployment method to daemon sets, and also introduce multiple changes to support load balancing and to tolerate network connectivity failures.

This issue will be updated with a design proposal.

Add queue for claim updates via API.

IP claims are currently updated (via the k8s API) in several different places, so conflicts between updates are possible. Let's add a queue that will hold update requests (add, update, delete) and process them sequentially.
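
A minimal sketch of the idea in Go (the types and names are illustrative, not the project's API): all watchers push operations into a single channel, and one goroutine applies them in arrival order, so concurrent API writes cannot conflict.

package main

import "fmt"

// op describes one pending API write against an IP claim.
type op struct {
	kind string // "add", "update" or "delete"
	name string // claim name
}

// processQueue drains updates in a single goroutine, serializing all writes.
func processQueue(updates <-chan op, apply func(op)) {
	for u := range updates {
		apply(u) // one at a time, in arrival order
	}
}

func main() {
	updates := make(chan op, 16)
	done := make(chan struct{})
	go func() {
		processQueue(updates, func(u op) { fmt.Println(u.kind, u.name) })
		close(done)
	}()
	updates <- op{kind: "add", name: "claim-1"}
	updates <- op{kind: "update", name: "claim-1"}
	close(updates)
	<-done
}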

apiserver configuration for auto allocation

I am following these directions to set up auto allocation. I am unable to create the claim pool:

$ kubectl create -f ext_ip_pool.yml 
error: unable to recognize "ext_ip_pool.yml": no matches for ipcontroller.ext/, Kind=IpClaimPool

$ cat ext_ip_pool.yml
apiVersion: ipcontroller.ext/v1
kind: IpClaimPool
metadata:
    name: test-pool
spec:
    cidr: 10.30.118.128/28
    ranges:
        - - 10.30.118.128
          - 10.30.118.129
        - - 10.30.118.130
          - 10.30.118.131

I do not see the ipcontroller.ext extension enabled on my kube-apiserver. I try to enable the extension using the kube-apiserver --runtime-config=extensions/v1beta1/ipcontroller.ext=true flag.

However, I still do not see the ipcontroller.ext extension enabled:


$ curl 127.0.0.1:8080/apis/extensions/v1beta1
{
  "kind": "APIResourceList",
  "groupVersion": "extensions/v1beta1",
  "resources": [
    {
      "name": "daemonsets",
      "namespaced": true,
      "kind": "DaemonSet"
    },
    {
      "name": "daemonsets/status",
      "namespaced": true,
      "kind": "DaemonSet"
    },
    {
      "name": "deployments",
      "namespaced": true,
      "kind": "Deployment"
    },
    {
      "name": "deployments/rollback",
      "namespaced": true,
      "kind": "DeploymentRollback"
    },
    {
      "name": "deployments/scale",
      "namespaced": true,
      "kind": "Scale"
    },
    {
      "name": "deployments/status",
      "namespaced": true,
      "kind": "Deployment"
    },
    {
      "name": "horizontalpodautoscalers",
      "namespaced": true,
      "kind": "HorizontalPodAutoscaler"
    },
    {
      "name": "horizontalpodautoscalers/status",
      "namespaced": true,
      "kind": "HorizontalPodAutoscaler"
    },
    {
      "name": "ingresses",
      "namespaced": true,
      "kind": "Ingress"
    },
    {
      "name": "ingresses/status",
      "namespaced": true,
      "kind": "Ingress"
    },
    {
      "name": "jobs",
      "namespaced": true,
      "kind": "Job"
    },
    {
      "name": "jobs/status",
      "namespaced": true,
      "kind": "Job"
    },
    {
      "name": "networkpolicies",
      "namespaced": true,
      "kind": "NetworkPolicy"
    },
    {
      "name": "replicasets",
      "namespaced": true,
      "kind": "ReplicaSet"
    },
    {
      "name": "replicasets/scale",
      "namespaced": true,
      "kind": "Scale"
    },
    {
      "name": "replicasets/status",
      "namespaced": true,
      "kind": "ReplicaSet"
    },
    {
      "name": "replicationcontrollers",
      "namespaced": true,
      "kind": "ReplicationControllerDummy"
    },
    {
      "name": "replicationcontrollers/scale",
      "namespaced": true,
      "kind": "Scale"
    },
    {
      "name": "thirdpartyresources",
      "namespaced": false,
      "kind": "ThirdPartyResource"
    }
  ]
}

How do you enable the ipcontroller.ext kube-apiserver extension?

Support Client Certificate Authentication

I am following these directions to set up an externalIP controller. The logs of the extip-controller pod show the following error:

E0319 01:36:03.558680       7 reflector.go:214] github.com/Mirantis/k8s-externalipcontroller/vendor/k8s.io/client-go/1.5/tools/cache/reflector.go:109: Failed to list *v1.Service: Get https://10.3.0.1:443/api/v1/services: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca")

My kube-apiserver has client certificate auth enabled via the client-ca-file config flag. It appears that the extip-controller needs to support k8s client certificate auth.

Implement cleaner for e2e tests

We need a reliable cleaner for e2e tests. One idea is to use a provider-based model and use docker exec to purge all leftovers after the tests.

Implement fair manager based on third-party resources

The proposed version will have certain benefits in comparison to the etcd-based one.

  • It will be easier to understand. All interactions can be described with a simple flow.
  • No 3rd party storage system.
  • Fewer messages (for instance, the etcd version uses 1 message per TTL for each IP address; in the new version it will be possible to use only 1 message per node).
  • Easier testing, because all data will be accessible from kubernetes in a known form.
  • More extensible (will add more here).
  • Integrates nicely with IP auto-allocation (will describe it elsewhere).

The proposed solution includes two 3rd party resources:

  • IPNode (the node where we are running the controller)
  • IPClaim (the claim is the source of truth for the controller to assign an IP)

Processes

Scheduler

  • watches service updates/deletions
  • creates ipclaims based on service events
  • schedules claims onto nodes based on policies (fair round robin as the simplest one)
  • watches IPNode events and enforces a liveness property on them, which means...
  • reschedules an ipclaim if its node doesn't update itself within a certain period of time
  • watches claims, and if a claim has no owner, processes it regularly

Controller

  • updates IPNode objects (using the hostname as a unique key)
  • watches IPClaim events (adds/removes CIDRs to the processing queue based on them)
  • during startup, checks whether an IP is owned by someone else, and queues it for removal if it is

Auto-allocator

  • adds another configuration object into the picture (IPClaimRange)
  • listens to service events
  • listens to ipclaimrange events
  • if a service requires an IP from a range (via annotations), creates an IPClaim and adds the IP to the service
  • if an IPClaimRange is removed, purges all claims and externalips from the services

WHOLE PICTURE

auto-allocator ---> externalip + ipclaim ---> scheduler (allocates claim) ---> controller (assigns ip)

Reduce image size

The Ubuntu base image is ~200 MB and the ipcontroller binary is about ~27 MB, yet for some reason the resulting image is 917 MB, which is strange. We need to debug this and reduce the image size to something normal.
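
A common way to shrink it (a sketch under assumptions, not the project's actual Dockerfile; whether the controller needs iproute2 at runtime is an assumption) is to copy the prebuilt binary into a minimal base image:

FROM alpine:3.6
# iproute2 only in case the controller shells out to "ip" (an assumption)
RUN apk add --no-cache iproute2
COPY ipcontroller /usr/local/bin/ipcontroller
ENTRYPOINT ["/usr/local/bin/ipcontroller"]

With a ~5 MB base and a ~27 MB binary, the resulting image should be in the 30-40 MB range rather than close to a gigabyte.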

claimscheduler replica in crashloopbackoff after a while

After a while, one of the 2 claimscheduler replicas goes into crashloopbackoff and logs:

W0515 12:30:48.936031       1 client_config.go:438] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
F0515 12:30:59.023533       1 scheduler.go:70] URLs for tprs are not registered: json: cannot unmarshal number 1e+06 into Go value of type int64

Version: mirantis/k8s-externalipcontroller:latest on kubernetes 1.5.5 (Coreos/Tectonic).

Workaround: deleting the faulty pod gets a new one scheduled.

Use queue-based processing

Use a queue for processing updates, the way it is done in all kube controller loops; it will help to avoid stale data.

Make TPR registration more reliable

We need to verify that any TPR can be used right after it is registered. From my observation, registration is an async operation and can take some time.

TODO:

Add a client that tries to list resources of the created type right after we create all TPRs.

Reuse update mechanism for calico-controller

We don't need "ip a" assignments for external IPs when using the calico network plugin; we just need to announce a route. So we can reuse our update controller class to provide calico support.

Add a way to assign all ips onto the same node which is alive

The current claim version of ipcontroller tries to split IPs evenly between all available instances of ipcontroller. It might be useful to disable such load balancing and always assign the IPs to the same node, as long as it is currently alive.

All other logic (e.g. rescheduling and removal of stale IPs) shouldn't be affected.

We need to add a flag which will disable

ipnode := s.findFairNode(liveNodes)
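
A minimal sketch of the idea in Go (the flag and helper names are assumptions, not the project's API): when sticky placement is enabled, pick one deterministic live node instead of balancing.

package scheduler

import "sort"

// selectNode returns the target node for a claim. With stickyNode set,
// every claim lands on the same live node (the first by name), so the
// placement is stable; otherwise the existing fair balancing is used.
func selectNode(stickyNode bool, liveNodes []string, findFairNode func([]string) string) string {
	if stickyNode && len(liveNodes) > 0 {
		sorted := append([]string(nil), liveNodes...)
		sort.Strings(sorted) // deterministic choice across scheduler runs
		return sorted[0]
	}
	return findFairNode(liveNodes)
}

Rescheduling still works unchanged: if the sticky node stops being alive, it drops out of liveNodes and the next one by name takes over.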

Add destructive e2e test for replica set version

  1. Create the controller with rc=1.
  2. Assign some IPs and validate them.
  3. Destroy the node where the current replica is running.
  4. Wait until the replica is re-spawned on another node.
  5. Validate that all IPs are reassigned to this node.
  6. Verify that the ARP table for the re-assigned IPs was updated.

Add documentation on application functionality.

The documentation does not cover some pieces of the functionality, including the basics.

  1. Simple/Daemon set versions.
  2. Configuration, parameters.
  3. Failover.
  4. Cleaning of IPs.
  5. Internals: IP claims, scheduler, controller.

Add CI

Try to use travis with kubernetes-dind-cluster

Fix flaky e2e test problem with kubernetes

  Expected
      <*errors.StatusError | 0xc8203f4b80>: {
          ErrStatus: {
              TypeMeta: {Kind: "", APIVersion: ""},
              ListMeta: {SelfLink: "", ResourceVersion: ""},
              Status: "Failure",
              Message: "pods \"externalipcontroller\" is forbidden: service account e2e-tests-ipcontroller-h961p/default was not found, retry after the service account is created",
              Reason: "Forbidden",
              Details: {
                  Name: "externalipcontroller",
                  Group: "",
                  Kind: "pods",
                  Causes: nil,
                  RetryAfterSeconds: 0,
              },
              Code: 403,
          },
      }
  to be nil

Figure out why integration tests fail sometimes

The integration test which creates a controller with a fake source and waits until an external IP is assigned on a real linux box fails very often. It shouldn't be very hard to make it reliable, but if the reason doesn't become clear we can remove it, since it is only useful to shorten the feedback loop during debugging.

latest version gives failure

Hi,

When using the latest version with the example yaml files I'm getting this error:

14/12/2016 16:40:31 Error: the server has asked for the client to provide credentials (post thirdpartyresources.extensions)

I'm running K8S 1.4.6 on top of Rancher 1.2.0
