License: Apache License 2.0
Add vSAN support.
There are quite a few YAML definitions and startup scripts currently captured in code, which makes the development/debug cycle very onerous. It may be fine to keep these in code as defaults, but we should allow overrides via clusterctl or another CLI. The following are the scripts and YAML definitions in the code:
As described above, there are YAML definitions in the code as well as in the input files to clusterctl. It's a bit of spaghetti, and it forces a rebuild and a docker push to a registry for every new round of debugging.
Debugging the Cluster API stack is very time consuming and complex. If the system could log to a remote log server, it would significantly improve development productivity. It would also be helpful for users of the Cluster API. It is not clear whether glog can concurrently write to a remote server or whether we need a sidecar container to stream the logs to the server.
Add cluster-api-provider-vsphere repos to prow and run the unit tests.
We can have a scenario where a single bootstrap cluster is created and never destroyed. In the event that this cluster goes down, there needs to be a persistent source of truth from which we can recreate the bootstrap cluster. Part of this issue is picking the persistent store. Some ideas are a native vSphere store or an etcd cluster outside of Kubernetes.
When deleting an existing target cluster, clusterctl usually takes tens of minutes or simply times out on the last step in the workflow -- deleting cluster objects. The suspicion is that the cluster object has duplicate finalizers, as shown in the snippet below:
apiVersion: cluster.k8s.io/v1alpha1
kind: Cluster
metadata:
  creationTimestamp: 2018-09-10T23:00:58Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2018-09-10T23:01:59Z
  finalizers:
  - cluster.cluster.k8s.io
  - cluster.cluster.k8s.io
  - cluster.cluster.k8s.io
The above is from the cluster object on the bootstrap cluster.
Add some end-to-end tests to validate that the bootstrap and target clusters are correct. These tests can include the following as examples:
Write unit tests for cluster-api-provider-vsphere/cloud/vsphere/namedmachines/namedmachines.go
Move the machine actuator's create operation from today's Terraform code to govmomi.
The "API Endpoint" for the cluster should ideally be handled by the provider-specific cluster controller. Without this, we are dependent on the clusterctl client to drive the workflow, since the CLI currently updates the API endpoint.
We need to add a cluster controller for the vSphere implementation and have it maintain the "API Endpoint" for the cluster.
This topic has been discussed in the following places:
https://github.com/kubernetes-sigs/cluster-api/issues/158
kubernetes-sigs/cluster-api#467 (comment)
Create a Dockerfile to build clusterctl.
This is a mirror of kubernetes-sigs/cluster-api#289. Since it's now up to the vendor-specific providers to build clusterctl, we should look into completing this issue.
Cluster API has migrated to kubebuilder and CRDs (PR kubernetes-sigs/cluster-api#494). We need to sync the vSphere cluster-api provider with the upstream changes.
We need to add a CI job, integrated with Prow, that runs all unit tests.
This bug involves the minikube vSphere PR to use Fusion/Workstation as the VM driver.
When I use minikube directly to create the bootstrap cluster and then call clusterctl, the CLI is able to get past applying the cluster API server onto it.
When I use clusterctl with no existing external cluster, it fails to create an external cluster with minikube.
I0821 10:00:10.499364 73247 minikube.go:46] Running: minikube [start --bootstrapper=kubeadm --vm-driver=vmware]
I0821 10:01:41.570315 73247 minikube.go:50] Ran: minikube [start --bootstrapper=kubeadm --vm-driver=vmware] Output: Starting local Kubernetes v1.9.3 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Loading cached images from config file.
I0821 10:01:41.580665 73247 loader.go:357] Config loaded from file /var/folders/2d/23rfh_dx16s92sy13wzqdvxw0000gr/T/207938425
I0821 10:01:41.586088 73247 clusterdeployer.go:129] Applying Cluster API stack to external cluster
I0821 10:01:41.586242 73247 clusterdeployer.go:312] Applying Cluster API APIServer
I0821 10:01:42.222581 73247 clusterclient.go:381] Waiting for kubectl apply...
I0821 10:01:42.666698 73247 clusterclient.go:409] Waiting for Cluster v1alpha resources to become available...
I0821 10:01:42.683291 73247 round_trippers.go:436] GET https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1 404 Not Found in 14 milliseconds
I0821 10:01:42.684082 73247 request.go:1075] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0821 10:01:52.688428 73247 clusterclient.go:409] Waiting for Cluster v1alpha resources to become available...
I0821 10:01:52.689665 73247 round_trippers.go:436] GET https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1 404 Not Found in 1 milliseconds
I0821 10:01:52.690056 73247 request.go:1075] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0821 10:02:02.687728 73247 clusterclient.go:409] Waiting for Cluster v1alpha resources to become available...
I0821 10:02:02.689234 73247 round_trippers.go:436] GET https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1 404 Not Found in 1 milliseconds
I0821 10:02:02.689548 73247 request.go:1075] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0821 10:02:12.687086 73247 clusterclient.go:409] Waiting for Cluster v1alpha resources to become available...
I0821 10:02:12.688202 73247 round_trippers.go:436] GET https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1 404 Not Found in 1 milliseconds
I0821 10:02:12.688546 73247 request.go:1075] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0821 10:02:22.687213 73247 clusterclient.go:409] Waiting for Cluster v1alpha resources to become available...
I0821 10:02:22.688728 73247 round_trippers.go:436] GET https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1 404 Not Found in 1 milliseconds
I0821 10:02:22.688917 73247 request.go:1075] body was not decodable (unable to check for Status): couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
I0821 10:02:32.682933 73247 clusterclient.go:409] Waiting for Cluster v1alpha resources to become available...
I0821 10:02:32.694161 73247 round_trippers.go:436] GET https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1 200 OK in 11 milliseconds
I0821 10:02:32.697090 73247 clusterclient.go:422] Waiting for Cluster v1alpha resources to be listable...
I0821 10:02:32.730378 73247 round_trippers.go:436] GET https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/clusters 200 OK in 31 milliseconds
I0821 10:02:32.732530 73247 clusterdeployer.go:318] Applying Cluster API Provider Components
I0821 10:02:32.732554 73247 clusterclient.go:381] Waiting for kubectl apply...
I0821 10:02:33.114937 73247 clusterdeployer.go:134] Provisioning internal cluster via external cluster
I0821 10:02:33.114975 73247 clusterdeployer.go:136] Creating cluster object test1 on external cluster
I0821 10:02:33.139273 73247 round_trippers.go:436] POST https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/clusters 201 Created in 23 milliseconds
I0821 10:02:33.140314 73247 clusterdeployer.go:141] Creating master
I0821 10:02:33.155061 73247 round_trippers.go:436] POST https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/machines 201 Created in 14 milliseconds
I0821 10:02:33.156248 73247 clusterclient.go:433] Waiting for Machine vs-master-8x4mt to become ready...
I0821 10:02:33.165717 73247 round_trippers.go:436] GET https://192.168.218.129:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/machines/vs-master-8x4mt 200 OK in 9 milliseconds
Minikube never creates the VM. This fails on both Mac with Fusion and Linux with Workstation.
The current repo has no usage documentation. Add a quick-start guide to the README markdown page.
Now that we are adding a govmomi-based implementation as well, the script templates as they stand today should be used by both the govmomi and Terraform implementations. template.go therefore needs to be refactored so that it is consumable by both.
Based on what I've been told, we will likely kick off clusterctl tests by creating a pod on the existing Kubernetes cluster in CI. We need the pods/containers/scripts that kick off clusterctl tests in CI written. The following are the categories of tests we need:
Note, each of these tests must be able to validate each command-line parameter. This issue may actually be an epic, as it is a large task.
Request to change the default CNI to Weave, since it does not need an additional etcd and it supports much richer network constructs.
The choice of Weave for the CNI is based on the comparison available here:
https://chrislovecnm.com/kubernetes/cni/choosing-a-cni-provider/
Using the clusterctl from this repo fails when using an existing cluster. Using a nested ESXi on Fusion on a Mac, I attempted to deploy a cluster with the following command:
clusterctl create cluster --existing-bootstrap-cluster-kubeconfig ~/.kube/config -m machines.yaml -c cluster.yaml -p provider-components.yaml --provider vsphere -v 6
What I see is a constant loop waiting for the master to come up:
I0821 08:55:10.288137 70954 clusterdeployer.go:136] Creating cluster object test1 on external cluster
I0821 08:55:10.376143 70954 round_trippers.go:436] POST https://192.168.218.131:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/clusters 201 Created in 87 milliseconds
I0821 08:55:10.377509 70954 clusterdeployer.go:141] Creating master
I0821 08:55:10.474704 70954 round_trippers.go:436] POST https://192.168.218.131:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/machines 201 Created in 96 milliseconds
I0821 08:55:10.476449 70954 clusterclient.go:433] Waiting for Machine vs-master-9pmdb to become ready...
I0821 08:55:10.480937 70954 round_trippers.go:436] GET https://192.168.218.131:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/machines/vs-master-9pmdb 200 OK in 4 milliseconds
I0821 08:55:20.480953 70954 clusterclient.go:433] Waiting for Machine vs-master-9pmdb to become ready...
I0821 08:55:20.576983 70954 round_trippers.go:436] GET https://192.168.218.131:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/machines/vs-master-9pmdb 200 OK in 95 milliseconds
I0821 08:55:30.484846 70954 clusterclient.go:433] Waiting for Machine vs-master-9pmdb to become ready...
I0821 08:55:30.573614 70954 round_trippers.go:436] GET https://192.168.218.131:8443/apis/cluster.k8s.io/v1alpha1/namespaces/default/machines/vs-master-9pmdb 200 OK in 88 milliseconds
I0821 08:55:40.484931 70954 clusterclient.go:433] Waiting for Machine vs-master-9pmdb to become ready...
What I see in the machine actuator's logs (via kubectl logs):
ERROR: logging before flag.Parse: I0821 15:55:34.303364 1 queue.go:38] Start NodeWatcher Queue
ERROR: logging before flag.Parse: I0821 15:55:34.305496 1 queue.go:38] Start Machine Queue
ERROR: logging before flag.Parse: I0821 15:55:34.385695 1 controller.go:91] Running reconcile Machine for vs-master-9pmdb
ERROR: logging before flag.Parse: I0821 15:55:34.475709 1 machineactuator.go:175] Attempting to stage tf state for machine vs-master-9pmdb
ERROR: logging before flag.Parse: I0821 15:55:34.475913 1 machineactuator.go:177] machine does not have annotations, state does not exist
ERROR: logging before flag.Parse: I0821 15:55:34.475995 1 machineactuator.go:658] Instance existance checked in directory
ERROR: logging before flag.Parse: I0821 15:55:34.476064 1 controller.go:134] reconciling machine object vs-master-9pmdb triggers idempotent create.
ERROR: logging before flag.Parse: I0821 15:55:34.672349 1 machineactuator.go:201] Cleaning up the staging dir for machine vs-master-9pmdb
ERROR: logging before flag.Parse: I0821 15:55:34.673281 1 machineactuator.go:175] Attempting to stage tf state for machine vs-master-9pmdb
ERROR: logging before flag.Parse: I0821 15:55:34.673299 1 machineactuator.go:177] machine does not have annotations, state does not exist
ERROR: logging before flag.Parse: I0821 15:55:34.673303 1 machineactuator.go:284] Staged for machine create at /tmp/cluster-api/machines/vs-master-9pmdb/
To test clusterctl with a bootstrap cluster using minikube, we need some software installed on machines in CI. Specifically,
It doesn't matter whether these components are installed on static VMs or on the CI/CD Kubernetes cluster nodes, as long as we can run clusterctl on the machine where they are installed.
Note, this requirement is contingent on the minikube bootstrap workflow remaining in clusterctl; there are currently no discussions about removing it.
Much of the workflow for the Cluster API is not documented. We should add a design-doc folder with some markdown pages describing the current workflow. This will help people on-ramp onto the Cluster API work.
Terraform calls govmomi underneath, so using Terraform for the vSphere provider adds yet another layer to the provisioning stack that we need to debug. In addition, Terraform keeps its own state file as its source of truth. We've found from previous projects that letting vSphere be the source of truth is much less problematic in the long run.
From @rsdcastro on April 17, 2018 18:43
From @rsdcastro on April 14, 2018 1:46
VM images provided by users might already have the container runtime. This issue tracks verifying whether the container runtime is pre-installed and matches the version specified by the user.
Copied from original issue: kubernetes-retired/kube-deploy#687
Copied from original issue: kubernetes-sigs/cluster-api#79
To help with the transition from Terraform to govmomi, we should refactor the machine actuator in this provider. It will help make it easier for multiple developers to contribute to this effort.
We've just added zone support in the vSphere cloud provider. We should add the ability to recognize zones when performing CRUD operations on a cluster.
From @karan on May 29, 2018 17:13
This is for security and safety, since the provider config blob can contain sensitive credentials.
/kind feature
/assign @karan
Copied from original issue: kubernetes-sigs/cluster-api#226
The current ClusterStatus object does not contain any high-level field that someone could look at to tell whether the target cluster is ready for consumption. The very definition of ready is subjective, as we could claim readiness at different stages:
This is an open issue to be discussed in the cluster-api SIG at this moment.
Given that, I propose to further enrich the ProviderStatus field in the Cluster object to track part of this status. Since the definition of Ready for the cluster is not formalized, the proposal is to add an APIStatus field in the ProviderStatus representing whether the Kubernetes APIs in the target cluster are ready to be interacted with.
Currently, if users want clusterctl to create a cluster using minikube in the bootstrap workflow, a lot of manual steps are required just to get minikube working. There should be a way to build a docker container that has clusterctl and all the necessary minikube drivers.
This issue differs from #6. That issue is for creating a Dockerfile for a container to build clusterctl. This issue is to actually build a container for clusterctl and drivers.
Using clusterctl, cluster creation always fails. After some debugging, it appears the kubeconfig cannot be pulled. The root cause is still unknown.
From @karan on May 29, 2018 17:12
/kind feature
/assign @karan
Copied from original issue: kubernetes-sigs/cluster-api#225
Write unit tests for the new govmomi provisioner.
This is a feature request for Windows pod support once Windows support has been added to Kubernetes.
We need to run the Kubernetes conformance tests on the target cluster created by the Cluster API. Add this capability to our CI system.
We can use vcsim (the vCenter simulator) to run unit tests for the machine actuator. We need unit tests for the existing machine actuator. There may be some problems running Terraform against vcsim; we need to identify those issues and raise them in the vcsim repo.
Kubernetes will support isolated pods (e.g. pod vm) in the future. There is a proposal for RuntimeClass that will support in-host isolated pods (e.g. Kata and gVisor). This may not be the only method of supporting RuntimeClass. We should support this feature when deploying a cluster to vSphere.
Add support for multiple NICs on the node machines.
This will help set up the cluster in a way that the Kubernetes API network and the data-plane network can be isolated from each other.
Add the ability to create nodes that contain multiple NICs. The YAML definition also needs to be updated to support multiple NICs.
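One way the machine YAML could express this; the networks list and its field names below are hypothetical, for illustration only:

```yaml
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: vs-node-01
spec:
  providerConfig:
    value:
      # Hypothetical multi-NIC definition: one NIC on the Kubernetes API
      # network, one on the data-plane network.
      networks:
      - name: k8s-api-net
      - name: data-plane-net
```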
Support VM nodes that have access to the host's devices (e.g. GPUs). There is currently a proposal for ResourceClass that would allow this support.
The cloud provider configuration for the cluster is today rendered using the machine variables of the master node. This assumption is not correct: the machine variables describe where and what to deploy as a master node; they are not meant to serve as input for the cloud provider config that the target Kubernetes cluster will use.
We need to clearly abstract out the information intended for generating the cloud provider configuration for the cluster.
One possible place where this cloud provider configuration could live is the Cluster object's providerConfig. Currently it only contains the VC endpoint and credentials, so we would need to expand that definition.
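For example, the providerConfig could be expanded along these lines; everything beyond the existing VC endpoint and credentials is a hypothetical illustration of the kind of fields that would feed the cloud provider config:

```yaml
apiVersion: cluster.k8s.io/v1alpha1
kind: Cluster
metadata:
  name: test1
spec:
  providerConfig:
    value:
      vsphereServer: vc.example.com            # existing: VC endpoint
      vsphereUser: administrator@vsphere.local # existing: credentials
      vspherePassword: "..."
      # Hypothetical additions, consumed when rendering the cloud
      # provider config for the target cluster:
      cloudProvider:
        datacenter: dc0
        defaultDatastore: datastore1
        workingDir: /dc0/vm/k8s
```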
The cloud-init scripts are currently hardcoded in the template.go file. We can keep those as the defaults, but we should also allow users to pass in custom user-defined scripts.
Transition the machine actuator's delete operation to govmomi.
For enterprise use cases, we want a central source of truth for information about clusters created with the Cluster API.
Today, the Terraform provisioner keeps information about the cluster in various places, including config maps within Kubernetes and the Terraform state file. The Cluster API has two modes of operation, pivot and non-pivot. In pivot mode, a bootstrap cluster is created and then used to deploy the target cluster; afterwards, the bootstrap cluster is deleted, leaving information about the cluster only in the target cluster itself. If multiple users create clusters using clusterctl, there is no central way to identify all the clusters created on the vCenter.
Get a CI/CD pipeline setup for this provider.
When deploying a MachineSet, we should support specifying in the spec a different VC that overrides the VC specified in the cluster object. This way, from the infrastructure point of view, we can spread the node VMs across VCs. In doing so, the assumption is that the user deploying the cluster will make sure the VMs created across VCs have proper L3 connectivity to each other.
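A sketch of how the override might look in a MachineSet; the vsphereServer field placement is a hypothetical illustration:

```yaml
apiVersion: cluster.k8s.io/v1alpha1
kind: MachineSet
metadata:
  name: vs-workers-vc2
spec:
  replicas: 3
  template:
    spec:
      providerConfig:
        value:
          # Hypothetical override: these nodes are placed on a different
          # VC than the one specified in the Cluster object.
          vsphereServer: vc2.example.com
```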
The main cluster-api repo is migrating to CRDs, and the controllers will change. Re-vendor the repo into this provider and update the vSphere provider accordingly.
There are times when there is a desire to create machine nodes with static IPs.
Give clusterctl the ability to query the list of created clusters and to describe a cluster. Currently, clusterctl can create and delete clusters. A user can create multiple clusters, but there is no way to query existing clusters or describe their configuration from clusterctl.
Define, design, and implement a backup/recovery approach for the cluster config. This is most likely just the etcd database.
From @krousey on June 11, 2018 16:57
A Calico manifest has a hard-coded service IP that is assumed to fall within the cluster's service CIDR. This assumption is made in https://github.com/kubernetes-sigs/cluster-api/blob/7fdecc5cc4b4174ab5c540a027fcfccc7183f66f/cloud/vsphere/templates.go#L372 and https://github.com/kubernetes-sigs/cluster-api/blob/7fdecc5cc4b4174ab5c540a027fcfccc7183f66f/cloud/vsphere/templates.go#L492
This needs to be an address that actually falls within the cluster's service CIDR, because the CIDR can be changed by the end user here
/kind cluster-api-vsphere
/cc @karan
Copied from original issue: kubernetes-sigs/cluster-api#324