terraform-hcloud-talos's Introduction


Terraform - Hcloud - Talos

This repository contains a Terraform module for creating a Kubernetes cluster with Talos in the Hetzner Cloud.

  • Talos is a modern OS for Kubernetes. It is designed to be secure, immutable, and minimal.
  • Hetzner Cloud is a cloud hosting provider with good Terraform support and low prices.

Warning

This module is under active development. Not all features are compatible with each other yet. Known issues are listed in the Known Issues section. If you find a bug or have a feature request, please open an issue.


Goals 🚀

| Goals | Status | Description |
|---|---|---|
| Production ready | ✅ | All recommendations from the Talos Production Clusters guide are implemented, but you need to read it carefully to understand all implications. |
| Use private networks for the internal communication of the cluster | ✅ | |
| Do not expose the Kubernetes and Talos API to the public internet via Load-Balancer | ✅ | Currently, the APIs are exposed to the public internet, but they are secured via the firewall_use_current_ip flag and a firewall rule that allows traffic from only one IP address. |
| Possibility to change all CIDRs of the networks | ⁉️ | Needs to be tested. |
| Configure the cluster as well as possible to run in the Hetzner Cloud | ✅ | This includes manual configuration of the network devices instead of DHCP, provisioning of Floating IPs (VIP), etc. |
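The single-IP restriction described in the table can be pictured with a plain hcloud_firewall resource. This is only a sketch of the idea, not the module's internal code: the resource name api_access and the address 203.0.113.10 (taken from the documentation IP range) are hypothetical, while 6443 and 50000 are the default ports of the Kubernetes API and the Talos API.

```hcl
# Illustrative sketch only, not taken from the module's source.
resource "hcloud_firewall" "api_access" {
  name = "api-access"

  # Kubernetes API: reachable from exactly one address.
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "6443"
    source_ips = ["203.0.113.10/32"] # hypothetical admin IP
  }

  # Talos API: same restriction.
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "50000"
    source_ips = ["203.0.113.10/32"] # hypothetical admin IP
  }
}
```

With firewall_use_current_ip = true, the module determines the address itself, so a hard-coded value like the one above would not be needed.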

Information about the Module

  • A lot of information can be found directly in the descriptions of the variables.
  • You can configure the module to create a cluster with 1, 3 or 5 control planes and n workers or only the control planes.
  • It allows scheduling pods on the control planes if no workers are created.
  • It has Multihoming configuration (etcd and kubelet listen on public and private IP).
  • It uses KubePrism as cluster endpoint.
  • If cluster_api_host is set, then you should create a corresponding DNS record pointing to either one control plane, the load balancer, floating IP, or alias IP. If cluster_api_host is not set, then a record for kube.[cluster_domain] should be created. It totally depends on your setup.
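A control-plane-only cluster might look roughly like the following. It reuses the variable names from the working example in the Usage section. Setting worker_count to 0 is an assumption about how "only the control planes" is expressed, and whether worker_server_type may then be omitted should be checked against the variable descriptions.

```hcl
module "talos" {
  source  = "hcloud-talos/talos/hcloud"
  version = "the-latest-version-of-the-module"

  talos_version = "v1.7.4"
  hcloud_token  = "your-hcloud-token"

  cluster_name     = "dummy.com"
  cluster_domain   = "cluster.dummy.com.local"
  cluster_api_host = "kube.dummy.com"

  firewall_use_current_ip = true
  datacenter_name         = "fsn1-dc14"

  # A single control plane; pods are scheduled on it because no workers exist.
  control_plane_count       = 1
  control_plane_server_type = "cax11"

  # Assumption: 0 workers means "control planes only".
  worker_count = 0
}
```

In this sketch a DNS record for kube.dummy.com would still have to point at the control plane, as described in the last bullet above.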

Additional installed software in the cluster

Cilium

  • Cilium is a modern, efficient, and secure networking and security solution for Kubernetes.
  • Cilium is used as the CNI instead of the default Flannel.
  • It provides features such as Network Policies, Load Balancing, and more.

Hcloud Cloud Controller Manager

  • Updates the Node objects with information about the server from the cloud, such as instance type, location, datacenter, server ID, and IPs.
  • Cleans up stale Node objects when the server is deleted in the API.
  • Routes traffic to the pods through Hetzner Cloud Networks. Removes one layer of indirection.
  • Watches Services with type: LoadBalancer and creates Hetzner Cloud Load Balancers for them, adding Kubernetes Nodes as targets for the Load Balancer.

Prerequisites

Required Software

Recommended Software

Hetzner Cloud

Tip

If you don't have a Hetzner account yet, you are welcome to use this Hetzner Cloud Referral Link to claim 20€ credit and support this project.

  • Create a new project in the Hetzner Cloud Console
  • Create a new API token in the project
  • You can store the token in the environment variable HCLOUD_TOKEN or use it in the following commands/terraform files.
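One way to keep the token out of the Terraform files is to declare it as a sensitive input variable in the root module and supply it from the environment. The variable name hcloud_token below is an illustrative choice that mirrors the module's argument of the same name. Terraform reads any environment variable named TF_VAR_<name> as the value of the input variable <name>.

```hcl
# Root-module variable; the name is a hypothetical choice for this sketch.
variable "hcloud_token" {
  description = "Hetzner Cloud API token for the project"
  type        = string
  sensitive   = true # hides the value in plan/apply output
}
```

The value can then be provided with export TF_VAR_hcloud_token="$HCLOUD_TOKEN" and passed to the module as hcloud_token = var.hcloud_token. Marking the variable as sensitive only hides it in CLI output; it does not keep the value out of the state.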

Usage

Packer

Create the Talos OS images (ARM and x86) with Packer by running create.sh. The script uses the HCLOUD_TOKEN environment variable to authenticate against the Hetzner Cloud API and stores the images in the project the token belongs to. The Talos OS version is defined in the variable talos_version in talos-hcloud.pkr.hcl.

./_packer/create.sh
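For orientation, a Packer HCL2 variable of that name would typically be declared as shown below. This is an assumed shape, not a copy of talos-hcloud.pkr.hcl, and the default value is simply the version used in the Terraform example in this README.

```hcl
# Assumed shape of the declaration in talos-hcloud.pkr.hcl (not copied from the repository).
variable "talos_version" {
  type    = string
  default = "v1.7.4" # Talos release to bake into the images
}
```

Packer variables declared this way can also be overridden on the command line with -var, though whether create.sh forwards such arguments is not stated here.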

Terraform

Use the module as shown in the following working example:

Note

Currently, your current IP address must have access to the nodes while the cluster is being created.

module "talos" {
  source  = "hcloud-talos/talos/hcloud"
  version = "the-latest-version-of-the-module"

  talos_version = "v1.7.4" # The version of talos features to use in generated machine configurations

  hcloud_token = "your-hcloud-token"

  cluster_name     = "dummy.com"
  cluster_domain   = "cluster.dummy.com.local"
  cluster_api_host = "kube.dummy.com"

  # If true, the current IP address will be used as the source for the firewall rules.
  # ATTENTION: to determine the current IP, a request to a public service (https://ipv4.icanhazip.com) is made.
  firewall_use_current_ip = true

  datacenter_name = "fsn1-dc14"

  control_plane_count       = 3
  control_plane_server_type = "cax11"

  worker_count       = 3
  worker_server_type = "cax21"
}

You need to pass the module's outputs through as outputs of your root module:

output "talosconfig" {
  value     = module.talos.talosconfig
  sensitive = true
}

output "kubeconfig" {
  value     = module.talos.kubeconfig
  sensitive = true
}

You can then run the following commands to export the kubeconfig and talosconfig:

terraform output --raw kubeconfig > ./kubeconfig
terraform output --raw talosconfig > ./talosconfig

Move these files to the correct location and use them with kubectl and talosctl.
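As an alternative to redirecting terraform output, the files could be written by Terraform itself. The sketch below assumes the hashicorp/local provider is available in the root module. The resource names and target paths are hypothetical, and this is not something the module does for you.

```hcl
# Writes both configs next to the root module with restrictive permissions.
resource "local_sensitive_file" "kubeconfig" {
  content         = module.talos.kubeconfig
  filename        = "${path.module}/kubeconfig" # hypothetical target path
  file_permission = "0600"
}

resource "local_sensitive_file" "talosconfig" {
  content         = module.talos.talosconfig
  filename        = "${path.module}/talosconfig" # hypothetical target path
  file_permission = "0600"
}
```

Both approaches leave the credentials on disk in plain text, so the files should be excluded from version control either way.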

Additional Configuration Examples

Kubelet Extra Args

kubelet_extra_args = {
  system-reserved            = "cpu=100m,memory=250Mi,ephemeral-storage=1Gi"
  kube-reserved              = "cpu=100m,memory=200Mi,ephemeral-storage=1Gi"
  eviction-hard              = "memory.available<100Mi,nodefs.available<10%"
  eviction-soft              = "memory.available<200Mi,nodefs.available<15%"
  eviction-soft-grace-period = "memory.available=2m30s,nodefs.available=4m"
}

Sysctls Extra Args

sysctls_extra_args = {
  # Fix for https://github.com/cloudflare/cloudflared/issues/1176
  "net.core.rmem_default" = "26214400"
  "net.core.wmem_default" = "26214400"
  "net.core.rmem_max"     = "26214400"
  "net.core.wmem_max"     = "26214400"
}

Activate Kernel Modules

kernel_modules_to_load = [
  {
    name = "binfmt_misc" # Required for QEMU
  }
]
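The three snippets above are arguments of the module block, not standalone resources. A combined sketch with shortened values taken from the examples above could look like this. The required arguments from the working example are left out for brevity and would have to be added.

```hcl
module "talos" {
  source  = "hcloud-talos/talos/hcloud"
  version = "the-latest-version-of-the-module"

  # ... required arguments from the working example go here ...

  # Extra flags passed to the kubelet.
  kubelet_extra_args = {
    system-reserved = "cpu=100m,memory=250Mi,ephemeral-storage=1Gi"
    kube-reserved   = "cpu=100m,memory=200Mi,ephemeral-storage=1Gi"
  }

  # Kernel parameters set on the nodes.
  sysctls_extra_args = {
    "net.core.rmem_max" = "26214400"
    "net.core.wmem_max" = "26214400"
  }

  # Kernel modules loaded on the nodes.
  kernel_modules_to_load = [
    {
      name = "binfmt_misc" # Required for QEMU
    }
  ]
}
```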

Known Limitations

  • Changes in the user_data (e.g. talos_machine_configuration) and image (e.g. version upgrades with packer) will not be applied to existing nodes, because it would force a recreation of the nodes.
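In Terraform, this kind of behaviour is commonly achieved with a lifecycle block that ignores changes to selected attributes. The resource below is a generic illustration with hypothetical names and values, not the module's actual server definition.

```hcl
# Generic illustration; names and values are hypothetical.
resource "hcloud_server" "example" {
  name        = "example-node"
  server_type = "cax11"
  image       = "debian-12"
  location    = "fsn1"
  user_data   = "#cloud-config\n"

  lifecycle {
    # Later changes to these attributes are ignored instead of
    # forcing Terraform to destroy and recreate the server.
    ignore_changes = [user_data, image]
  }
}
```

The trade-off is that new machine configuration or a new image only reaches a node when it is upgraded by other means or deliberately recreated.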

Known Issues

  • enable_alias_ip can lead to error messages during the first bootstrap. More about this here: siderolabs/talos#8493. If these error messages occur, one control plane must be restarted once after initialisation is complete. This should resolve the error.
  • IPv6 dual stack is not supported by Talos yet. You can activate IPv6 with enable_ipv6, but it should not have any effect.
  • enable_kube_span prevents the cluster from reaching the ready state. The cause is not yet clear and still has to be investigated.

Credits

  • kube-hetzner: for the inspiration and the great Terraform module. This module is based on many ideas and code snippets from kube-hetzner.
  • Talos: for the incredible OS.
  • Hetzner Cloud: for the great cloud hosting.

terraform-hcloud-talos's Issues

Feedback: Don't put the hcloud token in a var

It is not up to you how the consumer sets up the module and the hcloud provider. They may source the token through an env var to avoid having it in the state. If it is a var, it will be in the state!

Error: Error making request ipv6.icanhazip.com

First of all, pretty cool module. I wanted to test it, but when running this TF module on WSL2 (Windows) there seems to be no IPv6, and the error occurs even if I set enable_ipv6 = false.

╷
│ Error: Error making request
│
│   with module.talos.data.http.personal_ipv6[0],
│   on .terraform/modules/talos/firewall.tf line 7, in data "http" "personal_ipv6":
│    7: data "http" "personal_ipv6" {
│
│ Error making request: GET https://ipv6.icanhazip.com giving up after 1 attempt(s): Get "https://ipv6.icanhazip.com": dial tcp [2606:4700::6810:b8f1]:443: connect: network is
│ unreachable
╵

ARM or AMD

Hi 👋

When reading the doc it states that packer creates X86 and AMD images, but reading the code it seems that it should be X86 and ARM.

Create the talos os images (AMD and x86) via packer through running the [create.sh](_packer/create.sh).

image_arm = var.image_url_arm != null ? var.image_url_arm : "https://github.com/siderolabs/talos/releases/download/${var.talos_version}/hcloud-arm64.raw.xz"

I would like to use ARM servers, just want to make sure that it is actually a possibility.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

github-actions
.github/workflows/release.yml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • cycjimmy/semantic-release-action v4.0.0@61680d0e9b02ff86f5648ade99e01be17f0260a4
terraform
terraform.tf
  • hcloud >=1.45.0
  • http >=3.4.2
  • talos >=0.4.0
  • hashicorp/terraform 1.7.5

Hetzner Load Balancer

As discussed by email, is it possible to add an option to use Hetzner load balancers?

-> to add load-balancer.hetzner.cloud/disable-private-ingress: "true"

Thanks
Joao
