
Add a Worker Node

Use this guide when you want to add a worker to an existing control plane without rebuilding the cluster.

Step 1: Update the Ansible inventory

Add the new host under k8s_workers in ansible/inventory/hosts.yaml. If it is a CPU-only worker, also add it under cpu_only. If it is a GPU worker, add it under gpu_nvidia or gpu_intel.
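
As a rough sketch, the new entry might look like this in ansible/inventory/hosts.yaml (the hostname worker-03 and its address are hypothetical, and your inventory may nest groups differently):

all:
  children:
    k8s_workers:
      hosts:
        worker-03:
          ansible_host: 192.168.1.53
    cpu_only:
      hosts:
        worker-03: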

Step 2: Confirm node provisioning settings

Check the shared versions and paths in ansible/group_vars/all.yaml, especially:

- k8s_version for the apt repo
- containerd_version and containerd_install_method
- longhorn_storage_path

If you change longhorn_storage_path, also update the Longhorn bootstrap template in bootstrap/templates/longhorn.yaml.
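
For orientation, the relevant keys might look something like the following; the values are purely illustrative, so keep whatever versions and paths your cluster already uses:

k8s_version: "1.29"
containerd_version: "1.7.13"
containerd_install_method: apt
longhorn_storage_path: /var/lib/longhorn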

Step 3: Provision the worker with Ansible

Run the provisioning playbook from the repo root so paths resolve:

ansible-playbook -i ansible/inventory/hosts.yaml \
ansible/playbooks/provision-cpu.yaml \
-e @ansible/group_vars/all.yaml \
--limit <worker-hostname>

For GPU nodes, use the playbook that matches the GPU vendor. For NVIDIA workers:

ansible-playbook -i ansible/inventory/hosts.yaml \
ansible/playbooks/provision-nvidia-gpu.yaml \
-e @ansible/group_vars/all.yaml \
--limit <worker-hostname>

For Intel workers:

ansible-playbook -i ansible/inventory/hosts.yaml \
ansible/playbooks/provision-intel-gpu.yaml \
-e @ansible/group_vars/all.yaml \
--limit <worker-hostname>

The playbook handles system prep, containerd, Kubernetes packages, Longhorn prerequisites, and the Tailscale client install.
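
Before joining the node, a quick ad-hoc check against the same inventory can confirm the container runtime came up (a sketch; the kubelet usually won't settle until the node joins, so containerd is the more useful signal at this stage):

ansible -i ansible/inventory/hosts.yaml <worker-hostname> \
  -m shell -a "systemctl is-active containerd"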

Step 4: (Optional) Enroll the node in Tailscale

The Ansible role installs tailscaled but does not join the tailnet. If you want SSH or other access to the node over the tailnet, run:

sudo tailscale up
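
If you specifically want Tailscale SSH, you can enable it at join time and then confirm the node appears in the tailnet (standard tailscale flags; adjust to your tailnet's access policy):

sudo tailscale up --ssh
tailscale status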

Step 5: (Optional) Enable Longhorn disk creation on the node

If Longhorn is in use, label the node so it gets a default disk:

kubectl label node <worker-node-name> node.longhorn.io/create-default-disk=true --overwrite
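
A quick way to confirm the label took effect:

kubectl get node <worker-node-name> --show-labels | grep create-default-disk

If Longhorn's labeled-node disk creation setting is enabled, the default disk should then appear under the configured longhorn_storage_path once Longhorn reconciles the node.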

Step 6: Join the worker to the cluster

On the control plane, create a join command:

kubeadm token create --print-join-command

Run the generated command on the worker:

sudo kubeadm join <control-plane-ip>:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
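
Bootstrap tokens created this way expire after 24 hours by default, so generate the join command close to when you plan to run it. To see which tokens are still valid on the control plane:

kubeadm token list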

Step 7: Validate the node

kubectl get nodes

If the node stays NotReady, verify from the control plane that Cilium is installed and running, and confirm that the worker can reach the API server on port 6443.
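
A few diagnostics usually narrow it down (this assumes Cilium runs in the kube-system namespace with the standard k8s-app=cilium label, which may differ depending on how it was installed):

kubectl describe node <worker-node-name>
kubectl get pods -n kube-system -o wide | grep cilium
kubectl -n kube-system logs -l k8s-app=cilium --tail=50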