Talos cluster on Proxmox with Terraform

In a previous post I showed how to set up a Kubernetes cluster on Proxmox using Terraform and Ansible. That setup has worked well, but I wanted to reduce its complexity and make upgrading to new Kubernetes versions easier.

To achieve this, I switched to Talos, an immutable Linux distribution designed to run Kubernetes (and only that). Talos is distributed as a disk image and bootstrapped using a single configuration file, the Talos machine configuration. The developers of Talos provide a Terraform provider that can generate this configuration file. This allowed me to drop Ansible entirely and bring up a fully functional Kubernetes cluster using only Terraform.

In this post, I will show how to bring up a two-node Talos cluster (one control-plane node and one worker node) on Proxmox using Terraform.

Installing and configuring Terraform providers

Start with an empty directory. Create a file called providers.tf with the following content:
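A minimal sketch of such a file, assuming the community bpg/proxmox provider (the one that supplies the proxmox_virtual_environment_* resources used later in this post) and the siderolabs/talos provider. No versions are pinned here; in practice you will probably want to add a version constraint for each provider so that upgrades are deliberate.

```hcl
# providers.tf
terraform {
  required_providers {
    # Assumption: the bpg/proxmox provider, which offers the
    # proxmox_virtual_environment_* resources used in this post.
    proxmox = {
      source = "bpg/proxmox"
    }

    # The Talos provider published by the Talos developers (Sidero Labs).
    talos = {
      source = "siderolabs/talos"
    }
  }
}
```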

Then run the following command to download and install the two providers:

$ terraform init

Next, we need to tell the Proxmox provider which server it should connect to and provide a username/password. Create a file called main.tf containing the following:
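A minimal sketch, assuming the bpg/proxmox provider. The endpoint URL is a placeholder, and the insecure flag is only an assumption about a typical homelab with a self-signed certificate. The username and password are deliberately left out so that the provider reads them from the environment.

```hcl
# main.tf
provider "proxmox" {
  # Placeholder: replace with the URL of your own Proxmox server.
  endpoint = "https://proxmox.example.com:8006/"

  # Assumption: the server uses a self-signed TLS certificate.
  # Remove this line if your server has a trusted certificate.
  insecure = true

  # No username/password here: the provider picks them up from the
  # PROXMOX_VE_USERNAME and PROXMOX_VE_PASSWORD environment variables.
}
```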

To provide the username and password, we will use environment variables. There are several ways to manage these. I use a tool called direnv, which automatically sets the environment variables when I enter the Terraform directory and removes them when I leave. This keeps the username and password out of the environment of other applications on my system. To use direnv, create a file called .envrc and add the following:
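A sketch of what the file might look like, assuming the bpg/proxmox provider, which reads its credentials from PROXMOX_VE_USERNAME and PROXMOX_VE_PASSWORD. Both values below are placeholders.

```shell
# .envrc: loaded by direnv when you enter this directory.
# Placeholder values; replace them with your own Proxmox credentials.
export PROXMOX_VE_USERNAME="root@pam"
export PROXMOX_VE_PASSWORD="replace-with-your-password"
```

direnv refuses to load a new or modified .envrc until you approve it, so run direnv allow in the directory after creating the file.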

Note: Since this file contains sensitive information, you should avoid pushing it to a shared git repository. I recommend either adding it to .gitignore or using a tool like git-crypt to encrypt it.

Generating the Talos machineconfig

Next, we will use Terraform to generate the Talos machineconfig. The machineconfig includes the IP addresses of the nodes in our cluster, so we need to allocate these beforehand. How you do this will depend on your network setup. I used IP addresses that are outside of the DHCP range of my router to avoid collisions. After allocating the IP addresses, create a file called variables.tf with the following content:
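A sketch of such a file. The variable names (cluster_name, default_gateway, talos_cp_01_ip_addr, talos_worker_01_ip_addr) are illustrative choices, and the addresses are examples for a 192.168.1.0/24 network; substitute the values for your own network.

```hcl
# variables.tf
# All names and default values below are illustrative assumptions.

variable "cluster_name" {
  type    = string
  default = "homelab"
}

variable "default_gateway" {
  type        = string
  description = "IP address of the default gateway on your network"
  default     = "192.168.1.1"
}

variable "talos_cp_01_ip_addr" {
  type        = string
  description = "Unused IP address for the control-plane node"
  default     = "192.168.1.200"
}

variable "talos_worker_01_ip_addr" {
  type        = string
  description = "Unused IP address for the worker node"
  default     = "192.168.1.201"
}
```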

Then create a file called cluster.tf and add the following to generate the machineconfig:
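A sketch of what this could look like with the siderolabs/talos provider. It assumes variables named cluster_name, talos_cp_01_ip_addr and talos_worker_01_ip_addr are declared in variables.tf, and that the two virtual machines are declared elsewhere as proxmox_virtual_environment_vm.talos_cp_01 and proxmox_virtual_environment_vm.talos_worker_01; all of these names are hypothetical. Recent provider releases expose talos_cluster_kubeconfig as a resource, as used here, while older releases offer it as a data source instead.

```hcl
# cluster.tf
# Cluster-wide secrets (certificates, tokens) shared by all nodes.
resource "talos_machine_secrets" "machine_secrets" {}

# Client configuration (talosconfig) for the talosctl CLI.
data "talos_client_configuration" "talosconfig" {
  cluster_name         = var.cluster_name
  client_configuration = talos_machine_secrets.machine_secrets.client_configuration
  endpoints            = [var.talos_cp_01_ip_addr]
}

# Machine configuration for the control-plane node.
data "talos_machine_configuration" "machineconfig_cp" {
  cluster_name     = var.cluster_name
  cluster_endpoint = "https://${var.talos_cp_01_ip_addr}:6443"
  machine_type     = "controlplane"
  machine_secrets  = talos_machine_secrets.machine_secrets.machine_secrets
}

# Machine configuration for the worker node.
data "talos_machine_configuration" "machineconfig_worker" {
  cluster_name     = var.cluster_name
  cluster_endpoint = "https://${var.talos_cp_01_ip_addr}:6443"
  machine_type     = "worker"
  machine_secrets  = talos_machine_secrets.machine_secrets.machine_secrets
}

# Push each configuration to its node once the VM exists.
resource "talos_machine_configuration_apply" "cp_config_apply" {
  depends_on                  = [proxmox_virtual_environment_vm.talos_cp_01]
  client_configuration        = talos_machine_secrets.machine_secrets.client_configuration
  machine_configuration_input = data.talos_machine_configuration.machineconfig_cp.machine_configuration
  node                        = var.talos_cp_01_ip_addr
}

resource "talos_machine_configuration_apply" "worker_config_apply" {
  depends_on                  = [proxmox_virtual_environment_vm.talos_worker_01]
  client_configuration        = talos_machine_secrets.machine_secrets.client_configuration
  machine_configuration_input = data.talos_machine_configuration.machineconfig_worker.machine_configuration
  node                        = var.talos_worker_01_ip_addr
}

# Bootstrap the cluster on the control-plane node.
resource "talos_machine_bootstrap" "bootstrap" {
  depends_on           = [talos_machine_configuration_apply.cp_config_apply]
  client_configuration = talos_machine_secrets.machine_secrets.client_configuration
  node                 = var.talos_cp_01_ip_addr
}

# Fetch the kubeconfig after bootstrapping.
resource "talos_cluster_kubeconfig" "kubeconfig" {
  depends_on           = [talos_machine_bootstrap.bootstrap]
  client_configuration = talos_machine_secrets.machine_secrets.client_configuration
  node                 = var.talos_cp_01_ip_addr
}

output "talosconfig" {
  value     = data.talos_client_configuration.talosconfig.talos_config
  sensitive = true
}

output "kubeconfig" {
  value     = talos_cluster_kubeconfig.kubeconfig.kubeconfig_raw
  sensitive = true
}
```

Both outputs are marked sensitive because they contain credentials; terraform output -raw still prints them when asked for by name.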

Upload files to Proxmox

The next step is to upload a Talos disk image to Proxmox, to be used later when creating virtual machines. We will use a proxmox_virtual_environment_download_file resource to make the Proxmox server download the image directly. This way we don't have to download the image to our local machine first.

Create a file called files.tf with the following content. Make sure to change the node_name parameter to match your Proxmox setup.
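A sketch of such a file, assuming the bpg/proxmox provider. The node name, datastore, Talos version and schematic ID are all placeholders: the schematic ID is the identifier that factory.talos.dev assigns to a particular image customization, and you need to substitute the one for your own image. The URL pattern and the gz decompression setting are likewise assumptions; adjust them to match the download link the factory gives you.

```hcl
# files.tf
locals {
  # Placeholders: replace with the Talos release you want and the
  # schematic ID generated for your image on factory.talos.dev.
  talos_version      = "v1.7.0"
  talos_schematic_id = "REPLACE_WITH_YOUR_SCHEMATIC_ID"
}

resource "proxmox_virtual_environment_download_file" "talos_nocloud_image" {
  content_type = "iso"
  datastore_id = "local" # Assumption: default Proxmox datastore for ISO images.
  node_name    = "pve"   # Placeholder: change to your Proxmox node name.

  # Name under which the decompressed image is stored on the Proxmox server.
  file_name = "talos-${local.talos_version}-nocloud-amd64.img"

  # Assumed URL pattern for a gzip-compressed "nocloud" raw disk image.
  url                     = "https://factory.talos.dev/image/${local.talos_schematic_id}/${local.talos_version}/nocloud-amd64.raw.gz"
  decompression_algorithm = "gz"
  overwrite               = false
}
```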

You may notice that we are downloading Talos from a somewhat unusual URL. The reason is that we need a Talos image with qemu-guest-agent support, so that Terraform can detect when the virtual machines have come up and received an IP address. We therefore use the Talos Image Factory (factory.talos.dev) to download a customized image. You can replace this URL with your own if you want to add more features to the image.

Create the VMs

The final step is to create the Talos virtual machines in Proxmox. We do this by declaring two proxmox_virtual_environment_vm resources. We have to configure them to use the Talos disk image and assign them their respective IP addresses.

To do this, create a file called virtual_machines.tf with the following content. Remember to change the node_name parameter to match your Proxmox setup.
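A sketch of the two declarations, assuming the bpg/proxmox provider. It refers to a download resource named proxmox_virtual_environment_download_file.talos_nocloud_image and to variables named talos_cp_01_ip_addr, talos_worker_01_ip_addr and default_gateway, all of which are hypothetical names assumed to be declared in the other files. The node name, datastore, bridge, /24 prefix and sizing values are placeholders too.

```hcl
# virtual_machines.tf
resource "proxmox_virtual_environment_vm" "talos_cp_01" {
  name      = "talos-cp-01"
  node_name = "pve" # Placeholder: change to your Proxmox node name.
  on_boot   = true

  # Relies on the qemu-guest-agent support built into the Talos image.
  agent {
    enabled = true
  }
  cpu {
    cores = 2
    type  = "x86-64-v2-AES" # Assumption: Proxmox 8 or later; "host" is an alternative.
  }
  memory {
    dedicated = 4096 # MB
  }
  network_device {
    bridge = "vmbr0"
  }
  disk {
    datastore_id = "local-lvm"
    file_id      = proxmox_virtual_environment_download_file.talos_nocloud_image.id
    file_format  = "raw"
    interface    = "virtio0"
    size         = 20 # GB
  }
  operating_system {
    type = "l26" # Linux 2.6 or newer kernel
  }
  # Static IP address, handed to the VM through cloud-init ("nocloud") data.
  initialization {
    datastore_id = "local-lvm"
    ip_config {
      ipv4 {
        address = "${var.talos_cp_01_ip_addr}/24"
        gateway = var.default_gateway
      }
    }
  }
}

resource "proxmox_virtual_environment_vm" "talos_worker_01" {
  name      = "talos-worker-01"
  node_name = "pve" # Placeholder: change to your Proxmox node name.
  on_boot   = true

  agent {
    enabled = true
  }
  cpu {
    cores = 2
    type  = "x86-64-v2-AES"
  }
  memory {
    dedicated = 4096 # MB
  }
  network_device {
    bridge = "vmbr0"
  }
  disk {
    datastore_id = "local-lvm"
    file_id      = proxmox_virtual_environment_download_file.talos_nocloud_image.id
    file_format  = "raw"
    interface    = "virtio0"
    size         = 20 # GB
  }
  operating_system {
    type = "l26"
  }
  initialization {
    datastore_id = "local-lvm"
    ip_config {
      ipv4 {
        address = "${var.talos_worker_01_ip_addr}/24"
        gateway = var.default_gateway
      }
    }
  }
}
```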

You can adjust the number of CPU cores, the amount of RAM, and the disk size to suit your needs, but remember to stay within the Talos system requirements.

That's it! You can now run the following command in the directory with the .tf files to bring up your new Talos cluster:

$ terraform apply

Once Terraform has finished running, you can run the following commands to print the kubeconfig and talosconfig for your cluster. Both are written to standard output, so redirect them to files if you want to keep them:

$ terraform output -raw kubeconfig
$ terraform output -raw talosconfig