Kubernetes Pod Scheduling and Node Affinity

Control exactly where your pods run in the cluster using Node Affinity, Taints, and Tolerations. This guide explains how to target specific nodes (like an ARM64 worker) while ensuring high availability.

Nodes can be labeled with key-value pairs. Common labels include:

  • kubernetes.io/arch: amd64, arm64
  • kubernetes.io/os: linux
  • node-role.kubernetes.io/worker: worker
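
For example, to see the labels a node already carries (using the same <node-name> placeholder as in the command below):

kubectl get nodes --show-labels
kubectl get node <node-name> -o jsonpath='{.metadata.labels}'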

You can add custom labels:

kubectl label node <node-name> disk=ssd
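
To confirm the label or remove it again later:

kubectl get nodes -L disk              # show the disk label as a column
kubectl label node <node-name> disk-   # the trailing dash removes the label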

Affinity allows you to constrain which nodes your pod is eligible to be scheduled on.

  • Required (requiredDuringSchedulingIgnoredDuringExecution): The pod must run on a matching node. If no match is found, the pod stays Pending (a sketch of this form follows the list).
  • Preferred (preferredDuringSchedulingIgnoredDuringExecution): The scheduler tries to find a matching node. If not found, it schedules the pod anywhere.
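
A minimal sketch of the required form, assuming you want to restrict a pod to arm64 nodes; it goes under the pod’s spec:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
                - arm64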

When moving workloads or updating deployments, ensure zero downtime by configuring the RollingUpdate strategy correctly (see the rollout check after this list).

  • maxUnavailable: 0: Ensure no old pods are killed until new ones are Ready.
  • maxSurge: 1: Allow creating 1 extra pod above the desired count during updates.
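
With these settings in place, you can watch an update roll out; the deployment name here matches the example in the next section:

kubectl rollout status deployment/my-app
kubectl rollout history deployment/my-app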

Example: Preferring a Specific Worker Node


This configuration encourages the pod to run on any node labeled with node-role.kubernetes.io/worker=worker (for example, a dedicated worker), but still allows it to run elsewhere, such as on the control plane, if the worker is down (provided that node is not tainted against regular workloads).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app          # must match the template labels below
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0    # never take the old pod down early
      maxSurge: 1          # allow one extra pod during the update
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        nodeAffinity:
          # prefer, but do not require, the worker node
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/worker
                    operator: In
                    values:
                      - worker
      containers:
        - name: my-app
          image: my-image:latest
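
To try it out, apply the manifest (the filename is just an example) and check which node the pod landed on:

kubectl apply -f my-app.yaml
kubectl get pods -l app=my-app -o wide   # the NODE column shows placement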

Taints repel pods from nodes. A pod must have a matching toleration to be scheduled on a tainted node.

Example Taint:

kubectl taint nodes node1 key=value:NoSchedule
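
To inspect a node’s taints, or remove the taint again (the trailing dash removes it):

kubectl describe node node1 | grep -i taints
kubectl taint nodes node1 key=value:NoSchedule-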

Example Toleration:

tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"
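
As a minimal sketch with placeholder names, here is where the toleration sits in a Pod manifest. Note that a toleration only permits scheduling onto the tainted node; it does not force it there (combine it with node affinity if you need that):

apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo
spec:
  tolerations:
    - key: "key"
      operator: "Equal"
      value: "value"
      effect: "NoSchedule"
  containers:
    - name: demo
      image: busybox
      command: ["sleep", "3600"]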

Avoid pinning every application to a specific node.

Adding affinity or nodeSelector to everything creates rigid constraints that fight against Kubernetes’ ability to self-heal. Typical cases to weigh before pinning:

  • Hardware requirements: The app needs a specific GPU, USB device, or architecture (ARM64 vs AMD64) present only on certain nodes.
  • Performance/cost: Offload non-critical apps to a cheaper/slower node.
  • Data gravity: Kubernetes handles this for you. If a pod uses a PVC on a specific node, Kubernetes naturally schedules the pod there.

The “Filebrowser Lesson” (Anti-Pattern)


Do not force a pod to a node if it shares dependencies with other pods.

  • Scenario: Filebrowser shared a volume with Jellyfin.
  • Constraint 1: Jellyfin must run on node-1 (GPU).
  • Constraint 2: Filebrowser’s volume is therefore locked to node-1.
  • Error: Forcing Filebrowser to node-2 caused a deadlock because the volume could not follow it.
  • Fix: Remove the affinity. Let Kubernetes co-locate them naturally.