
Comments (6)

apricote commented on August 17, 2024

Hey @gecube,

the error happens because hccm reports a different set of addresses for the node than the node currently has. From the error message, the API requests (thanks for including them!), and the kubectl output, I think the addresses reported are:

  • Node (through kubelet):
    • InternalIP 10.0.0.6
  • HCCM:
    • ExternalIP 65.108.90.3

This causes a conflict, because the library we use (kubernetes/cloud-provider) expects hccm to return all addresses that are already specified on the node: no removals are allowed.
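To make the conflict concrete, here is a minimal sketch of the rule described above. It is a simplified illustration, not the actual kubernetes/cloud-provider code: `NodeAddress` and `missing_addresses` are hypothetical names, and the `hccm_with_network` list is an assumption about what would be reported once a network is configured.

```python
from typing import List, NamedTuple


class NodeAddress(NamedTuple):
    type: str     # e.g. "InternalIP" or "ExternalIP"
    address: str


def missing_addresses(on_node: List[NodeAddress],
                      from_cloud: List[NodeAddress]) -> List[NodeAddress]:
    """Return addresses already set on the node that the cloud provider did not report."""
    reported = {a.address for a in from_cloud}
    return [a for a in on_node if a.address not in reported]


# Addresses taken from this thread.
node_addrs = [NodeAddress("InternalIP", "10.0.0.6")]
hccm_without_network = [NodeAddress("ExternalIP", "65.108.90.3")]

# Assumption: with a network configured, the private IP is reported as well.
hccm_with_network = hccm_without_network + [NodeAddress("InternalIP", "10.0.0.6")]

print(missing_addresses(node_addrs, hccm_without_network))
print(missing_addresses(node_addrs, hccm_with_network))
```

The first call returns the InternalIP 10.0.0.6, which is the address the node has but hccm does not report. The second call returns an empty list, so there is nothing to conflict on.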

HCCM only returns the ExternalIP because IPs from a network are only returned if you specify the ID or Name of the network in your configuration. We need this, as a server might be in multiple networks, but only one InternalIP makes sense here.

You can do this by setting the HCLOUD_NETWORK environment variable to the ID or Name of the Network your nodes are attached to.
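A rough sketch of why the network has to be named, assuming a server attached to two networks. This is not HCCM's implementation: `reported_addresses`, the network names, and the second private IP are made up for illustration, and the real HCCM accepts either a network ID or a name.

```python
from typing import Dict, List, Optional, Tuple


def reported_addresses(public_ip: str,
                       private_ips_by_network: Dict[str, str],
                       configured_network: Optional[str] = None) -> List[Tuple[str, str]]:
    """Hypothetical model: which addresses get reported for one server."""
    addresses = [("ExternalIP", public_ip)]
    if configured_network is not None:
        # Only the IP from the configured network becomes the InternalIP.
        private_ip = private_ips_by_network.get(configured_network)
        if private_ip is not None:
            addresses.append(("InternalIP", private_ip))
    return addresses


# Hypothetical server in two networks; only "cluster-net" carries cluster traffic.
networks = {"cluster-net": "10.0.0.6", "other-net": "10.1.0.4"}

print(reported_addresses("65.108.90.3", networks))
print(reported_addresses("65.108.90.3", networks, configured_network="cluster-net"))
```

Without a configured network, only the ExternalIP comes back, which matches what this issue shows. With `cluster-net` configured, exactly one InternalIP is added, even though the server has two private IPs.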

If you want to run a cluster without public network access, you will need to do some more configuration, as this means that your nodes will not be able to pull images or access the Hetzner Cloud API. If you only want your intra-cluster communication to go through the private network, that should be enough.

If you also want to use the Routing functionality, you will need to adjust the configuration of your CNI and the HCCM manifests. See https://github.com/hetznercloud/hcloud-cloud-controller-manager/blob/main/docs/deploy_with_networks.md

from hcloud-cloud-controller-manager.

gecube commented on August 17, 2024

So, in short, it looks like when nodes have ONLY internal IPs from the private Hetzner network, HCCM for some reason cannot match them.


gecube commented on August 17, 2024

When running the cluster on nodes with only public addresses, there are no issues:

kubectl get nodes -owide
NAME                    STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP      OS-IMAGE         KERNEL-VERSION   CONTAINER-RUNTIME
talos-control-plane-1   Ready    control-plane   23h   v1.29.1   <none>        37.27.38.153     Talos (v1.6.3)   6.1.74-talos     containerd://1.7.11
talos-control-plane-2   Ready    control-plane   23h   v1.29.1   <none>        168.119.189.58   Talos (v1.6.3)   6.1.74-talos     containerd://1.7.11
talos-control-plane-3   Ready    control-plane   23h   v1.29.1   <none>        94.130.150.142   Talos (v1.6.3)   6.1.74-talos     containerd://1.7.11
talos-worker-1          Ready    <none>          23h   v1.29.1   <none>        65.108.90.3      Talos (v1.6.3)   6.1.74-talos     containerd://1.7.11
talos-worker-2          Ready    <none>          23h   v1.29.1   <none>        65.21.152.91     Talos (v1.6.3)   6.1.74-talos     containerd://1.7.11
kubectl get pods -n kube-system
NAME                                               READY   STATUS    RESTARTS      AGE
cilium-9mmcj                                       1/1     Running   0             23h
cilium-lr87f                                       1/1     Running   0             23h
cilium-nn795                                       1/1     Running   0             23h
cilium-operator-6d6fb6b85f-2n2g6                   1/1     Running   0             23h
cilium-operator-6d6fb6b85f-tt5d2                   1/1     Running   0             23h
cilium-rp9w6                                       1/1     Running   0             23h
cilium-xwt47                                       1/1     Running   0             23h
coredns-85b955d87b-tm47c                           1/1     Running   0             23h
coredns-85b955d87b-vx9zg                           1/1     Running   0             23h
hcloud-cloud-controller-manager-584f6fc4f4-w6zk2   1/1     Running   0             22h
hcloud-csi-controller-68f987547f-cz9cz             5/5     Running   0             22h
hcloud-csi-node-75pps                              3/3     Running   0             22h
hcloud-csi-node-85xlm                              3/3     Running   0             22h
hcloud-csi-node-927pf                              3/3     Running   0             22h
hcloud-csi-node-9w5sz                              3/3     Running   0             22h
hcloud-csi-node-nl94s                              3/3     Running   0             22h
kube-apiserver-talos-control-plane-1               1/1     Running   0             23h
kube-apiserver-talos-control-plane-2               1/1     Running   0             23h
kube-apiserver-talos-control-plane-3               1/1     Running   0             23h
kube-controller-manager-talos-control-plane-1      1/1     Running   2 (23h ago)   23h
kube-controller-manager-talos-control-plane-2      1/1     Running   0             23h
kube-controller-manager-talos-control-plane-3      1/1     Running   1 (23h ago)   23h
kube-scheduler-talos-control-plane-1               1/1     Running   2 (23h ago)   23h
kube-scheduler-talos-control-plane-2               1/1     Running   0             23h
kube-scheduler-talos-control-plane-3               1/1     Running   1 (23h ago)   23h


gecube commented on August 17, 2024

@apricote Hi! Thanks for your thoughts. So the only reason could be that I forgot HCLOUD_NETWORK? That is a little bit weird, as I am sure that I set it up in the secret... and I don't remember any relevant error messages in the logs. I will run one more experiment to check.


apricote commented on August 17, 2024

Not sure how you installed HCCM (YAML manifests, Helm chart, ...), but this is the related excerpt from the README:

If you manage the network yourself it might still be required to let the CCM know about private networks. You can do this by adding the environment variable with the network name/ID in the CCM deployment.

         env:
           - name: HCLOUD_NETWORK
             valueFrom:
               secretKeyRef:
                 name: hcloud
                 key: network

You also need to add the network name/ID to the secret: kubectl -n kube-system create secret generic hcloud --from-literal=token=<hcloud API token> --from-literal=network=<hcloud Network_ID_or_Name>.

As far as I remember, there is no error message, as it's an optional configuration value, and nodes may or may not be attached to a network that should be used for in-cluster communication. The attached network might also be there for another service, a proxy, etc., so adding a log for whenever no network is configured but the Node has a network has the potential to spam the logs.

We could add a log that is only sent once, when no network is configured but a node with a network is processed, and then set some internal variable to "silence" it until the process is restarted.
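A minimal sketch of that "log once" idea, written in Python for brevity (HCCM itself is written in Go). Everything here is hypothetical: `NetworkHintLogger`, `node_processed`, and the message text are illustrative names, not existing HCCM code, and the sketch is not thread-safe.

```python
import logging
from typing import Optional

logger = logging.getLogger("hccm-sketch")


class NetworkHintLogger:
    """Emits the 'no network configured' hint at most once per process."""

    def __init__(self) -> None:
        # The internal "silence" flag; it resets only when the process restarts.
        self._warned = False

    def node_processed(self, node_name: str,
                       configured_network: Optional[str],
                       node_has_network: bool) -> bool:
        """Return True if a warning was logged for this call."""
        if configured_network or not node_has_network or self._warned:
            return False
        self._warned = True
        logger.warning(
            "node %s is attached to a network, but no network is configured; "
            "its private IPs will not be reported",
            node_name,
        )
        return True


hint = NetworkHintLogger()
first = hint.node_processed("talos-worker-1", None, True)
second = hint.node_processed("talos-worker-2", None, True)
print(first, second)
```

Only the first node with an attached network triggers the warning; later nodes are silent until a new `NetworkHintLogger` is created, which stands in for a process restart.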

