Comments (24)
@mysticaltech Could be, I am mostly guessing at this point!
x509: certificate is valid for 127.0.0.1, 88.198.105.71, not 10.2.0.1" node="agent-big-0"
I interpret this error as saying that the metrics-server (on agent-big-0) is trying to contact the API server on 10.2.0.1, but its certificate is only valid for localhost and the external IP, not the internal control plane IP (10.2.0.1). So the SAN sounds suspicious to me.
EDIT: No, that's wrong. It's not about the API server (6443), see port 10250.
from terraform-hcloud-kube-hetzner.
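To make the x509 error concrete, here is a minimal Python sketch of the check the TLS client is effectively performing: is the IP being dialed among the certificate's IP SANs? The helper name `ip_covered_by_sans` is hypothetical, and the SAN list is simply copied from the error message rather than read from a live certificate.

```python
import ipaddress


def ip_covered_by_sans(ip, san_ips):
    """Return True if `ip` appears among the certificate's IP SANs."""
    target = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(san) == target for san in san_ips)


# IP SANs as listed in the error message (assumed to be the full list).
san_ips = ["127.0.0.1", "88.198.105.71"]

print(ip_covered_by_sans("10.2.0.1", san_ips))       # private IP is not covered
print(ip_covered_by_sans("88.198.105.71", san_ips))  # public IP is covered
```

If the first check fails while the second passes, the certificate was issued for the public address only, which matches the symptom reported here.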
Can confirm that the metrics-server seems to be working in my just-deployed cluster (name-suffixes branch, but that shouldn't matter here)
from terraform-hcloud-kube-hetzner.
My bad, all IPs that are not private must be made private.
All should be of the form 10.X.0.X...
As soon as you open the file, you will know!
from terraform-hcloud-kube-hetzner.
The node-ip is wrong in the config.yaml. It should be the private IP of your server, so of the form 10.2.0.X.
from terraform-hcloud-kube-hetzner.
Thank you so much! Now it looks okay :)
from terraform-hcloud-kube-hetzner.
Hi @MartiniMoe,
Looks like your control plane's certificate does not include its internal IP. This looks like a bug, but I don't have time to investigate properly atm. If you do, please try adding the following line to your control plane's k3s config in https://github.com/kube-hetzner/kube-hetzner/blob/master/control_planes.tf#L45
tls-san = module.control_planes[0].private_ipv4_address
and re-provision. This should include the control plane's private IP in the cert, but is currently untested.
(We had this in an earlier version of kube-hetzner, but there's been a lot going on in this repo lately, hope that it's going to stabilize soon ;))
from terraform-hcloud-kube-hetzner.
Thanks, I will try that!
How can I re-provision easily? Or do I have to take everything down and start over?
from terraform-hcloud-kube-hetzner.
Should be sufficient to taint your first control node in this case.
from terraform-hcloud-kube-hetzner.
Thanks. I added the line and let terraform recreate control-node-0, but the problem persists :/
from terraform-hcloud-kube-hetzner.
Does the resulting k3s config look correct? Did you check whether the certificate is generated correctly? Did you try re-creating the cluster?
I sadly can't provide step-by-step instructions, you need to do some of the digging yourself (or wait until someone else does) ;)
from terraform-hcloud-kube-hetzner.
@phaer The tls-san defaults to the node-ip, that's why I removed it while fixing another certificate issue that was just the node agents using their public IP as node-ip. So I believe that tls-san is not the problem, or is it?
from terraform-hcloud-kube-hetzner.
Ahhh.... Yes you are right!
from terraform-hcloud-kube-hetzner.
@MartiniMoe you are probably using master from 48h ago... Just pull the latest changes, this is fixed already!
from terraform-hcloud-kube-hetzner.
@mysticaltech Thanks, I already pulled and did a terraform apply. Can I fix this without recreating the cluster?
from terraform-hcloud-kube-hetzner.
@MartiniMoe Yes, probably. Log in via SSH to each agent (see the readme), then:
systemctl stop k3s-agent
Then edit /etc/rancher/k3s/config.yaml and change the server IP, basically all IPs, to the private IP.
systemctl start k3s-agent
from terraform-hcloud-kube-hetzner.
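The edit step described above can also be scripted. Below is a rough Python sketch that rewrites only the node-ip line of a k3s config held as text. The function name `set_node_ip` is hypothetical, the sample config is shortened for illustration, and 10.2.0.1 is only an example value of the 10.2.0.X form mentioned earlier in the thread. Stopping and starting k3s-agent around the edit is still done by hand.

```python
import re


def set_node_ip(config_text, new_ip):
    """Replace the value of the top-level node-ip key; other lines stay untouched."""
    pattern = re.compile(
        r'^("?node-ip"?:[ \t]*)"?[^"\n]*"?[ \t]*$', re.MULTILINE
    )
    return pattern.sub(lambda m: f'{m.group(1)}"{new_ip}"', config_text)


# Shortened sample config (illustrative only).
sample = '"node-ip": "88.198.105.71"\n"node-name": "agent-big-0"\n'

print(set_node_ip(sample, "10.2.0.1"))
```

The pattern is anchored to the start of a line, so keys that merely contain an IP, such as the server URL, are left alone and can be reviewed separately.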
@mysticaltech Do I change "server": or "node-ip": or both?
from terraform-hcloud-kube-hetzner.
Maybe you need to drain the node before and uncordon it afterwards. I think that's best.
from terraform-hcloud-kube-hetzner.
I'm not sure what happened here. My agent has the private IP "10.1.0.1" in the config file, but actually it has "10.2.0.1" 😕
from terraform-hcloud-kube-hetzner.
What's your network_ipv4_subnets and agent_nodepools?
from terraform-hcloud-kube-hetzner.
network_ipv4_subnets = {
  control_plane = "10.1.0.0/16"
  agent_big     = "10.2.0.0/16"
  # agent_small = "10.3.0.0/16"
}
agent_nodepools = {
  agent-big = {
    server_type = "cpx21",
    count       = 1,
    subnet      = "agent_big",
  }
  # agent-small = {
  #   server_type = "cpx11",
  #   count       = 2,
  #   subnet      = "agent_small",
  # }
}
from terraform-hcloud-kube-hetzner.
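With these subnets, the mismatch between 10.1.0.1 and 10.2.0.1 can be checked mechanically. The following Python sketch, using only the standard `ipaddress` module, reports which configured subnet an address falls into. The helper name `subnet_of` is hypothetical; the CIDR values are copied from the posted variables.

```python
import ipaddress

# Subnets copied from the network_ipv4_subnets block posted in the thread.
subnets = {
    "control_plane": "10.1.0.0/16",
    "agent_big": "10.2.0.0/16",
}


def subnet_of(ip, subnets):
    """Return the name of the first subnet containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    for name, cidr in subnets.items():
        if addr in ipaddress.ip_network(cidr):
            return name
    return None


print(subnet_of("10.1.0.1", subnets))  # address written in the agent's config
print(subnet_of("10.2.0.1", subnets))  # address the agent actually has
```

An agent in the agent-big pool would be expected to carry an address from agent_big, so a node-ip that resolves to control_plane points at a wrong value in the generated config.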
If I change the agent's IP in its config to its actual IP, the node is shown as not ready afterwards in kubectl get nodes.
from terraform-hcloud-kube-hetzner.
Post your config.yaml file here please. Your server IP should be 10.1.0.1. And check the node events with kubectl describe. What does it say? Also, did you drain and cordon it before?
from terraform-hcloud-kube-hetzner.
Any updates on this @MartiniMoe?
from terraform-hcloud-kube-hetzner.
Ah yes, sorry for the late reply, I was a little busy.
This is the config.yaml on agent-big-0:
static:~ # cat /etc/rancher/k3s/config.yaml
"flannel-iface": "eth1"
"kubelet-arg": "cloud-provider=external"
"node-ip": "10.1.0.1"
"node-label":
- "k3s_upgrade=true"
"node-name": "agent-big-0"
"server": "https://10.1.0.1:6443"
"token": "<token>"
There are no events:
~ ❯ kubectl describe node agent-big-0
[...]
Events: <none>
Regarding draining and cordoning, I tried to do it, but it did not really work, because I had too much workload in the cluster. I then deleted some deployments, tried again and made the changes to config.yaml. To be honest, at this point I'm not exactly sure if the node was drained then 😕
from terraform-hcloud-kube-hetzner.
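One thing that stands out in the posted config.yaml is that node-ip and the host part of server are the same address. A small Python sketch can flag this. It assumes the flat `"key": "value"` layout shown in the post and does not use a real YAML parser; the names `parse_flat_config` and `node_ip_equals_server` are hypothetical.

```python
from urllib.parse import urlparse

# Config as posted for agent-big-0 (token left redacted).
CONFIG = '''\
"flannel-iface": "eth1"
"kubelet-arg": "cloud-provider=external"
"node-ip": "10.1.0.1"
"node-label":
- "k3s_upgrade=true"
"node-name": "agent-big-0"
"server": "https://10.1.0.1:6443"
"token": "<token>"
'''


def parse_flat_config(text):
    """Parse top-level "key": "value" lines; list items and empty values are skipped."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip().strip('"')
        if sep and value and not line.startswith("-"):
            result[key.strip().strip('"')] = value
    return result


def node_ip_equals_server(cfg):
    """True if the agent's node-ip is the same host as the server it joins."""
    return cfg["node-ip"] == urlparse(cfg["server"]).hostname


cfg = parse_flat_config(CONFIG)
print(cfg["node-ip"], urlparse(cfg["server"]).hostname)
print(node_ip_equals_server(cfg))
```

For an agent node, a result of True means the agent is advertising the control plane's address as its own, which is consistent with the 10.1.0.1 versus 10.2.0.1 confusion described earlier.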