Comments (16) on puppetlabs-kubernetes

scotty-c commented on August 12, 2024

@itilitymaarten We have a ticket in the current sprint to update the documentation; the changes will make it into the next release of the module.

itilitymaarten commented on August 12, 2024

Okay, so I've figured out that 10.96.0.1 is the ClusterIP of the Kubernetes API service, so I assume the idea is that other pods reach the API at this IP address and the certificate therefore has to be valid for this IP?
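
(For reference, a quick way to confirm this in-cluster address, assuming a working kubeconfig on a master, is to look at the built-in kubernetes service:)

kubectl get svc kubernetes -n default
# The CLUSTER-IP column is the address pods use to reach the API server;
# with the default service CIDR this is 10.96.0.1.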

I've managed to get my cluster working, but only by using the public IP address for the BOOTSTRAP_CONTROLLER_IP. I end up with the following command for kubetool:

docker run --rm -v "$(pwd)":/mnt \
  -e OS=debian \
  -e VERSION=1.9.2 \
  -e CONTAINER_RUNTIME=docker \
  -e CNI_PROVIDER=weave \
  -e FQDN=icc-kubernetes-masters.westeurope.cloudapp.azure.com \
  -e IP=<PUBLIC IP> \
  -e BOOTSTRAP_CONTROLLER_IP=<PUBLIC IP> \
  -e ETCD_INITIAL_CLUSTER="etcd-kubernetes-01=http://%{::ipaddress}:2380" \
  -e ETCD_IP="%{::ipaddress}" \
  -e KUBE_API_ADVERTISE_ADDRESS="%{::ipaddress}" \
  -e INSTALL_DASHBOARD=false \
  puppet/kubetool

(Note that I've also substituted the hardcoded IP address in the ETCD_INITIAL_CLUSTER parameter with the %{::ipaddress} fact; it makes no sense to me to have to provide a manual IP address there if it has to be the same as ETCD_IP anyway.)

Is this the expected setup?

I find the documentation quite vague on which parameters require which values and what they mean, especially as a newcomer to Kubernetes. However, it doesn't make sense to me that nodes inside the cluster should need to use the public IP address of the cluster. This would require that I always expose the Kubernetes API to the outside world, even if I only want to control it from inside the cluster.

scotty-c commented on August 12, 2024

@itilitymaarten Most of the issues you have listed here are Kubernetes-related and not really issues with the module. For example, ETCD_INITIAL_CLUSTER != ETCD_IP if you have multiple controllers being spun up for the initial cluster. As for the networking inside Kubernetes: all the pods hit the kube API on the internal address 10.96.0.1, while kubectl and outside services hit the node IP. Again, this is the way Kubernetes works, not the module. I am just trying to understand what the issue with the module is in this thread?
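
(To illustrate the distinction, a hypothetical three-controller kubetool invocation might look like the sketch below; the host names, 172.17.10.x addresses, and placeholders are made up. ETCD_INITIAL_CLUSTER lists every etcd member as etcd-<hostname>=<peer URL>, while ETCD_IP stays the per-node address, so the two only coincide in a single-controller setup.)

docker run --rm -v "$(pwd)":/mnt \
  -e OS=debian \
  -e VERSION=1.9.2 \
  -e CONTAINER_RUNTIME=docker \
  -e CNI_PROVIDER=weave \
  -e FQDN=<CLUSTER FQDN> \
  -e IP=<LOAD BALANCER IP> \
  -e BOOTSTRAP_CONTROLLER_IP=<BOOTSTRAP CONTROLLER IP> \
  -e ETCD_INITIAL_CLUSTER="etcd-kube-master-01=http://172.17.10.101:2380,etcd-kube-master-02=http://172.17.10.102:2380,etcd-kube-master-03=http://172.17.10.103:2380" \
  -e ETCD_IP="%{::ipaddress}" \
  -e KUBE_API_ADVERTISE_ADDRESS="%{::ipaddress}" \
  -e INSTALL_DASHBOARD=false \
  puppet/kubetool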

scotty-c commented on August 12, 2024

@ianscrymgeour If you are new to Kubernetes I would start here: https://github.com/puppetlabs/kream
This is a project that lets you play with the module so you can understand all the moving parts.

itilitymaarten commented on August 12, 2024

@scotty-c thank you for your responses. I had seen Kream, but since I don't have a physical machine with Linux (or macOS) available to me, I have to work with VMs hosted in Azure instead.

I think my biggest issue is understanding the documentation and the assumptions made by this module, especially surrounding the arguments for kubetool:

  • IP isn't mentioned at all in the readme, but I think I read in an issue on this repository that it should be the IP address at which the cluster is reachable from the internet.
  • FQDN's description I find a little vague; I assume this is the FQDN at which the cluster is available externally (e.g. an FQDN pointing at a load balancer), but I'd prefer to see this made explicit so I don't have to guess.
  • BOOTSTRAP_CONTROLLER_IP, to me, automatically becomes the IP address of the machine in the cluster that is going to be the bootstrap controller (seeing as IP is external). These two are not the same, because of the load balancer. However, when I set this to my master's internal IP, the kubelet on the master is unable to connect, since the internal IP address doesn't match the ones in the certificate (which seem to be IP and 10.96.0.1).
  • ETCD_INITIAL_CLUSTER could use some clarification on the "etcd name" that is automatically being generated, i.e. etcd-<hostname>. I had to look at the template for etcd.yaml to figure out that this was the name being generated.

Perhaps a small paragraph describing an example HA setup (e.g. 3 master nodes: one bootstrap controller and 2 controllers, plus a load balancer machine facing the outside world) would also help to explain the setup that this module expects/creates.

All in all they're not big points, but especially for someone who isn't that experienced with Kubernetes yet (like me), they would help a great deal in understanding what's happening. And again, my apologies if this is just due to my lack of Kubernetes knowledge.

scotty-c commented on August 12, 2024

@itilitymaarten I will take this on board and create a ticket to work on the docs. In regard to the IP: in kream we use an internal IP, 172.17.10.101, and it works because we add it to the SANs list. There could be two issues here:

  1. The internal IP of the compute resource overlaps with the service IP range in Kubernetes.
  2. The internal IP is not in the SAN on the API server's cert.
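
(One way to check point 2 is to inspect the SANs on the certificate the API server actually presents. This sketch assumes the standard API server port of 6443; substitute the controller's address as appropriate:)

echo | openssl s_client -connect <CONTROLLER IP>:6443 2>/dev/null \
  | openssl x509 -noout -text \
  | grep -A1 'Subject Alternative Name'
# The controller's internal IP (and 10.96.0.1) should appear among the IP Address entries.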

More than happy to help out with the teething issues getting the module up and running

itilitymaarten commented on August 12, 2024

@scotty-c Thanks for your help then :) I'd be happy to review the changes to docs from a novice standpoint, if you like.

Assuming you're talking about the BOOTSTRAP_CONTROLLER_IP: I used an internal IP too, but it didn't get signed into the cert. The only IPs in the cert are the public IP of my masters' load balancer and 10.96.0.1. However, I now believe that's correct: the IP should be the one of the masters' load balancer, since otherwise my requests wouldn't be distributed across the masters.

Is there any preferred/required order to the puppet runs (if so, this would also be good to document)? Or should they all be run simultaneously (for the masters, at least)? The run on my bootstrap controller hangs indefinitely if the other controllers in ETCD_INITIAL_CLUSTER don't come up, due to etcd crash-looping (which takes down the API server with it). Once the other machines come up, everything works fine, though. The last problem is that kube-dns doesn't actually start (the pod is stuck on ContainerCreating, with the message "failed to create pod sandbox"), but I'll look into that further first.

scotty-c commented on August 12, 2024

The etcd cluster needs to be established for the Kubernetes cluster to come up, so if you are creating a cluster of 3 controllers it is best to run Puppet on all of them at about the same time. Once the etcd cluster is ready, the kube API server will be available.
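
(As a rough check once the runs finish, kubectl's component status view, which is still available in this Kubernetes version, shows whether the API server can see a healthy etcd:)

kubectl get componentstatuses
# etcd, the scheduler and the controller-manager should all report Healthy
# before the API server can be expected to behave normally.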

Which CNI networking provider are you using? That will help me work out why kube-dns is not starting.

itilitymaarten commented on August 12, 2024

I'm using Weave with just the default arguments for kubetool (i.e. CNI_PROVIDER=weave), which turns into

kubernetes::cni_network_provider: https://git.io/weave-kube-1.6
kubernetes::cni_cluster_cidr: 
kubernetes::cni_node_cidr:

in my hiera data.

I've found the following in the logs:

(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/onlyif) NAME                   STATUS     ROLES     AGE       VERSION
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/onlyif) kubernetes-master-01   Ready      master    14m       v1.9.2
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/onlyif) kubernetes-master-02   NotReady   master    14m       v1.9.2
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/onlyif) kubernetes-master-03   NotReady   <none>    14m       v1.9.2
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/returns) Exec try 1/5
(Exec[Install cni network provider](provider=posix)) Executing 'kubectl apply -f https://git.io/weave-kube-1.6'
Executing: 'kubectl apply -f https://git.io/weave-kube-1.6'
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/returns) serviceaccount "weave-net" unchanged
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/returns) clusterrole "weave-net" configured
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/returns) clusterrolebinding "weave-net" configured
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/returns) role "weave-net-kube-peer" unchanged
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/returns) rolebinding "weave-net-kube-peer" unchanged
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/returns) daemonset "weave-net" unchanged
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]/returns) executed successfully
(/Stage[main]/Kubernetes::Kube_addons/Exec[Install cni network provider]) The container Class[Kubernetes::Kube_addons] will propagate my refresh event
(Exec[Assign master role to controller](provider=posix)) Executing check 'kubectl describe nodes kubernetes-master-01 | tr -s ' ' | grep 'Roles: master''
Executing: 'kubectl describe nodes kubernetes-master-01 | tr -s ' ' | grep 'Roles: master''
(/Stage[main]/Kubernetes::Kube_addons/Exec[Assign master role to controller]/unless) Roles: master
(Exec[Checking for dns to be deployed](provider=posix)) Executing check 'kubectl get deploy -n kube-system kube-dns -o yaml | tr -s " " | grep "Deployment does not have minimum availability"'
Executing: 'kubectl get deploy -n kube-system kube-dns -o yaml | tr -s " " | grep "Deployment does not have minimum availability"'
(/Stage[main]/Kubernetes::Kube_addons/Exec[Checking for dns to be deployed]/onlyif)  message: Deployment does not have minimum availability.
(/Stage[main]/Kubernetes::Kube_addons/Exec[Checking for dns to be deployed]/returns) Exec try 1/50
(Exec[Checking for dns to be deployed](provider=posix)) Executing 'kubectl get deploy -n kube-system kube-dns -o yaml | tr -s " " | grep "Deployment has minimum availability"'
Executing: 'kubectl get deploy -n kube-system kube-dns -o yaml | tr -s " " | grep "Deployment has minimum availability"'
(/Stage[main]/Kubernetes::Kube_addons/Exec[Checking for dns to be deployed]/returns) Sleeping for 10 seconds between tries
(/Stage[main]/Kubernetes::Kube_addons/Exec[Checking for dns to be deployed]/returns) Exec try 2/50

The "Checking for dns to be deployed" repeats until 50 and then determines it failed, obviously.
The queer thing to me: should the bootstrap controller be showing that it's ready before dns is installed 'n ready? When using kubeadm, it never showed ready until everything was really up and running.

EDIT: events from the kube-dns pod:

Events:
  Type     Reason                  Age               From                           Message
  ----     ------                  ----              ----                           -------
  Normal   Scheduled               2m                default-scheduler              Successfully assigned kube-dns-ccf7b96b9-gsbsk to kubernetes-master-01
  Normal   SuccessfulMountVolume   2m                kubelet, kubernetes-master-01  MountVolume.SetUp succeeded for volume "kube-dns-token-d2ntl"
  Normal   SuccessfulMountVolume   2m                kubelet, kubernetes-master-01  MountVolume.SetUp succeeded for volume "kube-dns-config"
  Normal   SandboxChanged          2m (x11 over 2m)  kubelet, kubernetes-master-01  Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  2m (x12 over 2m)  kubelet, kubernetes-master-01  Failed create pod sandbox.
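
(For what it's worth, the kubelet log on the affected node usually carries the underlying CNI or Docker error behind "Failed create pod sandbox"; assuming a systemd-managed kubelet, something like:)

journalctl -u kubelet --no-pager | grep -i sandbox
# Filters the kubelet log for the sandbox creation failures shown in the events above.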

2nd EDIT:
I've got it to work, I think. I'm going to try with entirely clean VMs and verify. What I did:

  1. I switched from using Weave to using Flannel. This didn't fix the issue, but might be part of the solution. I'll try to switch back to Weave once I've verified that the cluster can now be started successfully.
  2. I disabled the taints on the master nodes. I'd prefer to keep them tainted, but some of the errors that I've seen (I can't show them anymore, sorry) showed that the kube-dns pod wasn't getting scheduled on the master nodes because of the taint. After removing the taint, the Puppet runs on the cluster are much faster, and everything works.
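
(For reference, one standard way to do item 2 by hand is the kubectl command from the kubeadm master-isolation docs, which strips the master NoSchedule taint from every node:)

kubectl taint nodes --all node-role.kubernetes.io/master-
# Allows workloads such as kube-dns to be scheduled onto the controllers.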

scotty-c commented on August 12, 2024

At the time, did you have any nodes? What does the output of kubectl get pods --all-namespaces show you?

itilitymaarten commented on August 12, 2024

I had no nodes in the cluster, just the 3 masters. I don't have the actual output of kubectl get pods --all-namespaces, but it showed all pods running except for the kube-dns pod, which showed 0/3 with ContainerCreating.

scotty-c commented on August 12, 2024

So I think you have no nodes to schedule on. We can test this by adding kubernetes::taint_master: false to hiera or by adding a worker node. Both will give you the same outcome.

itilitymaarten commented on August 12, 2024

This is what I already did; when I add that, everything works. But shouldn't the masters show that they're ready before we add any nodes? That's what I expected... perhaps this is still some Kubernetes knowledge missing on my side :)

scotty-c commented on August 12, 2024

So from Kubernetes' point of view, the cluster is ready; it will take your commands and queue them up until a worker node is available. By default, Kubernetes won't deploy worker tasks to controllers. This is even documented in the kubeadm docs: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#master-isolation

itilitymaarten commented on August 12, 2024

Yeah, I know that, but I thought kube-dns had to run on the masters too. Additionally, I figured that since two of the masters were showing they weren't ready yet, the cluster wouldn't be in a ready state. I will try again with the taint, but this time including a worker node as well. Assuming that works well, I have my cluster running :)

The last thing I'm wondering about: how many of my struggles could be reflected in the docs of this repo, to help other people? For example, a diagram of a minimal cluster setup (say 3 masters and 2 nodes, with the appropriate load balancers), perhaps even with an indication of where the important parameters for kubetool come from (such as IP and BOOTSTRAP_CONTROLLER_IP)?

Again, thanks for your help; if you feel no further additions to the docs are necessary (or that this is not the right issue to put them under), feel free to close this issue :)

davejrt commented on August 12, 2024

All documentation has been updated with the release of v2.0.0.
