huang-wei / shared-loadbalancer

Demo for 2018 KubeCon NA

Dockerfile 0.47% Makefile 1.21% Go 98.31%

kubernetes service loadbalancer kubecon


Shared Kubernetes LoadBalancer

Background

We know that in Kubernetes, there are generally 3 ways to expose workloads publicly:

Service (with type NodePort)
Service (with type LoadBalancer)
Ingress

kubectl proxy and similar dev/debug solutions are not counted.

NodePort Service has existed almost since Kubernetes was born. But due to the limited port range (30000-32767 by default), the randomness of the assigned port, and the need to expose (almost) the whole cluster to the public network, NodePort Service is usually not considered a good L4 solution for serious production workloads.

A viable solution today for L4 apps is the LoadBalancer Service. It's implemented differently across Kubernetes offerings, by connecting a Kubernetes Service object with a real/virtual IaaS LoadBalancer, so that traffic going through the LoadBalancer endpoint is routed to the destination pods properly.

However, in reality, L7 (e.g. HTTP) workloads are far more widely used than L4 ones, so the community came up with the Ingress concept. An Ingress object defines how incoming requests are routed to internal Services. Under the hood, an ingress controller (1) handles Ingress objects and sets up mapping rules by leveraging Nginx/Envoy/etc., and (2) is (normally) exposed externally via a LoadBalancer.

There is a misunderstanding that Ingress can also manage L4 workloads. That's not true. Ingress works b/c it can differentiate requests by HTTP headers, but an L4 packet only carries ip + port.
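To make the L7 vs. L4 difference concrete, here is a minimal Go sketch. It is not how any real ingress controller is implemented; the types `l7Rules` and `l4Rules`, the functions `routeL7` and `routeL4`, and the host and service names are all made up for illustration. It only shows which key is available for picking a backend in each case.

```go
package main

import "fmt"

// l7Rules maps an HTTP Host header to a backend Service name (hypothetical).
// Many hosts can share one public ip:port because the header tells them apart.
type l7Rules map[string]string

// l4Rules maps a listening port to a backend Service name (hypothetical).
// A raw TCP/UDP packet carries no host name, so the port is the only key.
type l4Rules map[int]string

// routeL7 picks a backend from the Host header of a request.
func routeL7(rules l7Rules, host string) (string, bool) {
	backend, ok := rules[host]
	return backend, ok
}

// routeL4 picks a backend from the destination port of a connection.
func routeL4(rules l4Rules, port int) (string, bool) {
	backend, ok := rules[port]
	return backend, ok
}

func main() {
	l7 := l7Rules{"shop.example.com": "shop-svc", "blog.example.com": "blog-svc"}
	l4 := l4Rules{3306: "mysql-svc", 6379: "redis-svc"}

	// Both L7 backends can sit behind the same ip:443.
	b, _ := routeL7(l7, "blog.example.com")
	fmt.Println("L7 Host=blog.example.com ->", b)

	// Each L4 backend needs its own port on the endpoint.
	b, _ = routeL4(l4, 6379)
	fmt.Println("L4 port=6379 ->", b)
}
```

With only ip + port to go on, two L4 services can share an endpoint only if they listen on different ports.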

Motivation

Ingress makes it possible to expose multiple internal L7 services through one public endpoint. But it doesn't work for L4 workloads.

From the picture above, you might wonder where the missing piece for L4 services is. This is exactly the problem this project tries to solve, with the following factors in mind:

Cost effective
User friendly
Reusing existing Kubernetes assets
Minimum operation efforts
Consistent with Kubernetes roadmap

How It Works

We introduce a "SharedLoadBalancer Controller" to customize current Kubernetes behavior.

Without a "SharedLoadBalancer Controller", it's N Services (of type LoadBalancer) mapped to N LoadBalancer endpoints:

With a "SharedLoadBalancer Controller", it's N SharedLB CR objects mapped to 1 LoadBalancer endpoint (on different ports):
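The N-to-1 mapping can be pictured as a per-LoadBalancer port table. The Go sketch below is only an illustration under assumptions: `sharedLB`, `claim`, the `capacity` field and the error values are hypothetical names, not the project's actual types or CRD schema.

```go
package main

import (
	"errors"
	"fmt"
)

// sharedLB models one real LoadBalancer endpoint shared by several CRs.
// All names here are hypothetical, not the project's actual types.
type sharedLB struct {
	endpoint string
	capacity int            // max number of ports this LB may expose (assumed limit)
	ports    map[int]string // port -> name of the SharedLB CR that owns it
}

var (
	errPortTaken = errors.New("port already claimed on this LoadBalancer")
	errFull      = errors.New("LoadBalancer is out of capacity")
)

func newSharedLB(endpoint string, capacity int) *sharedLB {
	return &sharedLB{endpoint: endpoint, capacity: capacity, ports: map[int]string{}}
}

// claim reserves port for owner, rejecting port conflicts and full LBs.
func (lb *sharedLB) claim(port int, owner string) error {
	if _, taken := lb.ports[port]; taken {
		return errPortTaken
	}
	if len(lb.ports) >= lb.capacity {
		return errFull
	}
	lb.ports[port] = owner
	return nil
}

func main() {
	lb := newSharedLB("203.0.113.10", 2)
	fmt.Println(lb.claim(3306, "mysql"))  // succeeds
	fmt.Println(lb.claim(6379, "redis"))  // succeeds, same endpoint, different port
	fmt.Println(lb.claim(3306, "mysql2")) // rejected: port conflict
	fmt.Println(lb.claim(5432, "pg"))     // rejected: capacity reached
}
```

When `claim` fails on every existing LoadBalancer, a controller along these lines would have to create another one.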

More Info

Want to get more info on this? Join us at KubeCon + CloudNativeCon North America 2018 in Seattle, December 11-13, where we will be giving a session on this.

shared-loadbalancer's Issues

[EKS] placeholder LoadBalancer service failed on healthcheck

It seems it's mandatory to have the LoadBalancer service working as expected; otherwise the ELB won't function.

avoid unnecessary LoadBalancer creation

LoadBalancer creation is time consuming on some cloud providers (especially GKE and AKS, where it takes ~1 min).

Suppose 2 requests come in a row while we're out of capacity. With the current code, the 2 requests will try to create 2 LoadBalancers at the same time, which is unnecessary. We should come up with a solution that holds the 2nd request until the 1st request finishes.

Regarding the solution, we can't simply check if len(pendingQ) != 0, b/c it could be that the port of the 1st request is already occupied; in this case, the 2nd request is expected to use an existing LB instead of being held "aggressively".
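One possible shape for that decision, sketched in Go. This is an assumption about how the check could look, not the project's code: `lb`, `fits`, `decide` and the three `action` values are hypothetical, and an in-flight LoadBalancer is modeled as already holding the port of the request that triggered it.

```go
package main

import "fmt"

type action string

const (
	useExisting action = "use-existing"
	waitPending action = "wait-for-pending"
	createNew   action = "create-new"
)

// lb is a hypothetical view of one LoadBalancer: the ports in use (or reserved
// by the request that triggered its creation) and how many ports it can hold.
type lb struct {
	ports    map[int]bool
	capacity int
}

// fits reports whether port can be placed on l without a conflict.
func (l lb) fits(port int) bool {
	return !l.ports[port] && len(l.ports) < l.capacity
}

// decide chooses what to do with a request for port:
// prefer a ready LB, then wait for an in-flight one, and only then create.
func decide(ready, pending []lb, port int) action {
	for _, l := range ready {
		if l.fits(port) {
			return useExisting
		}
	}
	for _, l := range pending {
		if l.fits(port) {
			return waitPending
		}
	}
	return createNew
}

func main() {
	full := lb{ports: map[int]bool{80: true, 443: true}, capacity: 2}
	roomy := lb{ports: map[int]bool{80: true}, capacity: 2}
	inFlight := lb{ports: map[int]bool{80: true}, capacity: 2} // being created for a :80 request

	fmt.Println(decide([]lb{full}, nil, 8080))             // create-new
	fmt.Println(decide([]lb{full}, []lb{inFlight}, 8080))  // wait-for-pending
	fmt.Println(decide([]lb{roomy}, []lb{inFlight}, 8080)) // use-existing
	fmt.Println(decide([]lb{full}, []lb{inFlight}, 80))    // create-new
}
```

The third call is the case described above: something is pending, yet the request is served by an existing LB rather than being held. The last call shows the opposite edge: waiting would not help, because the in-flight LB already has the requested port reserved.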

[AKS] oauth2 refresh issue

2018-12-06T13:56:01.188-0800 ERROR providers.aks providers/aks.go:131 cannot query public ip {"pip": "137.135.101.36", "error": "azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/58de4ac8-a2d6-499b-b983-6f1c870d398e/resourceGroups/MC_res-grp-1_wei-aks_eastus/providers/Microsoft.Network/publicIPAddresses/kubernetes-a6d05da86f9a111e8b9676adc8f35aad?api-version=2017-09-01: StatusCode=0 -- Original Error: adal: Failed to execute the refresh request. Error = 'Get http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F: dial tcp 169.254.169.254:80: i/o timeout'"}

[EKS] potential bug if cluster service is deleted

If a cluster service (with type NodePort) is somehow deleted, we should update the listener in the ELB, as the NodePort will change.

apply scheduling design and practices

Matching incoming (Shared)LoadBalancer request to a real LoadBalancer is actually a scheduling/horizontal-auto-scaling problem.

horizontal auto scaling
- if LoadBalancer resources are out of capacity, create one dynamically
- if creating a LoadBalancer service returns an error, you have usually run out of capacity/quota in your account and need to invoke the IaaS SDK to increase the quota; it may be safer for an operator to do this manually
scheduling predicates - filtering LB resources
- ports conflict (#6)
- protocol (TCP/UDP) matching?
- service "requests/limits" (or simply use a "weight" term)
- service "{anti-}affinity"
scheduling priorities - prioritizing LB resources
- least requested (balanced) vs. most requested (packed)
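
The predicate/priority split above could be sketched in Go as follows. This is a simplified illustration under assumptions, not the project's scheduler: `candidate`, `filter`, `pick` and the two scorers are hypothetical names, only the port-conflict and capacity predicates are modeled, and the scores are plain port counts.

```go
package main

import "fmt"

// candidate is a hypothetical view of one existing LoadBalancer.
type candidate struct {
	name     string
	used     map[int]bool // ports already taken
	capacity int          // max ports (assumed limit)
}

// filter applies the predicates: drop LBs with a port conflict or no room left.
func filter(cands []candidate, port int) []candidate {
	var out []candidate
	for _, c := range cands {
		if c.used[port] || len(c.used) >= c.capacity {
			continue
		}
		out = append(out, c)
	}
	return out
}

// scorer ranks a candidate; a higher score wins.
type scorer func(c candidate) int

// leastRequested favors the LB with the most free ports (balanced).
func leastRequested(c candidate) int { return c.capacity - len(c.used) }

// mostRequested favors the LB with the most used ports (packed).
func mostRequested(c candidate) int { return len(c.used) }

// pick filters first, then returns the best-scoring candidate, if any.
func pick(cands []candidate, port int, score scorer) (string, bool) {
	best, bestScore, found := "", 0, false
	for _, c := range filter(cands, port) {
		if s := score(c); !found || s > bestScore {
			best, bestScore, found = c.name, s, true
		}
	}
	return best, found
}

func main() {
	cands := []candidate{
		{name: "lb-a", used: map[int]bool{80: true, 443: true}, capacity: 4},
		{name: "lb-b", used: map[int]bool{80: true}, capacity: 4},
		{name: "lb-c", used: map[int]bool{8080: true}, capacity: 1},
	}
	fmt.Println(pick(cands, 8080, leastRequested)) // lb-b true
	fmt.Println(pick(cands, 8080, mostRequested))  // lb-a true
	_, ok := pick(cands, 80, leastRequested)
	fmt.Println(ok) // false: nothing fits, so a new LB would be needed
}
```

Protocol matching, weights and {anti-}affinity from the list would slot in as further predicates in `filter`; when `pick` finds nothing, the horizontal auto scaling path takes over.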

[General] concurrent CR creations may lead to unexpected results

Creating a LoadBalancer first creates the LoadBalancer Service object, but with status "pending".

If we get 2 incoming CR requests, although they're protected by "pendingQ", it's still problematic that the 2nd CR tries to use a "half-completed" LoadBalancer which is still being created, as cacheMap is already populated once the LoadBalancer Service exists.
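
One way to guard against this, sketched in Go under assumptions: treat a cached LoadBalancer Service as usable only once the cloud provider has published an address for it. In Kubernetes, a LoadBalancer Service that is still being provisioned has an empty `status.loadBalancer.ingress` list. The `service` and `ingressPoint` structs below are simplified stand-ins for the real API types, and `provisioned` and `usable` are hypothetical helpers, not functions from this project.

```go
package main

import (
	"fmt"
	"sort"
)

// ingressPoint is a simplified stand-in for one status.loadBalancer.ingress entry.
type ingressPoint struct {
	IP       string
	Hostname string
}

// service is a simplified stand-in for a LoadBalancer Service held in the cache.
type service struct {
	Name    string
	Ingress []ingressPoint
}

// provisioned reports whether the Service has at least one published address,
// i.e. the real LoadBalancer behind it is no longer "pending".
func provisioned(s service) bool {
	for _, in := range s.Ingress {
		if in.IP != "" || in.Hostname != "" {
			return true
		}
	}
	return false
}

// usable returns the sorted names of cached Services that are safe to hand
// to a new CR, skipping half-completed ones.
func usable(cache map[string]service) []string {
	var names []string
	for name, s := range cache {
		if provisioned(s) {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	cache := map[string]service{
		"lb-1": {Name: "lb-1", Ingress: []ingressPoint{{IP: "203.0.113.10"}}},
		"lb-2": {Name: "lb-2"}, // Service object exists, LB still pending
	}
	fmt.Println(usable(cache)) // only lb-1 is offered to the 2nd CR
}
```

A request that finds no usable entry would then wait for the pending one or trigger a new creation, rather than binding to an endpoint that does not exist yet.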