[18/24] U is for Upgrades: Managing Cluster Lifecycle


📚 This is Post #18 in the Kubernetes A-to-Z Series


Series Progress: 18/24 complete | Difficulty: Advanced | Time: 30 min | Part 5/6: Operations

Upgrading Kubernetes is notoriously scary. It’s a complex distributed system, and changing the engine while the car is driving down the highway requires precision.

In this post, we’ll cover the “U” of Kubernetes: Upgrades.

The Golden Rule of Upgrades

Never skip a minor version.

Kubernetes versions are expressed as x.y.z (e.g., 1.29.1).

  • x: Major version (1)
  • y: Minor version (29)
  • z: Patch version (1)

You can upgrade from 1.28 to 1.29, but not from 1.28 to 1.30. You must go step-by-step.
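As a rough sketch of this rule, the helper below compares the minor numbers of two version strings before you commit to an upgrade. The function names `minor_of` and `check_upgrade_path` are made up for illustration and are not part of kubeadm or kubectl. The parsing assumes plain `x.y` or `x.y.z` strings with an optional leading `v`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Print the minor number of a version such as "v1.29.1" or "1.29".
minor_of() {
  local v="${1#v}"   # drop an optional leading "v"
  v="${v#*.}"        # drop the major part, e.g. "29.1"
  echo "${v%%.*}"    # keep what precedes the next dot, e.g. "29"
}

# Succeed only if the target is the same minor or exactly one minor ahead.
check_upgrade_path() {
  local from to gap
  from="$(minor_of "$1")"
  to="$(minor_of "$2")"
  gap=$(( to - from ))
  if (( gap < 0 )); then
    echo "target $2 is older than current $1"
    return 1
  elif (( gap > 1 )); then
    echo "skips $(( gap - 1 )) minor version(s): go through the next minor first"
    return 1
  fi
  echo "ok: $1 -> $2"
}
```

For example, `check_upgrade_path 1.28.4 1.29.0` succeeds, while `check_upgrade_path 1.28 1.30` returns a non-zero status.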

Version Skew Policy

Kubernetes components have a specific compatibility matrix.

  • kube-apiserver: The source of truth.
  • kubelet: Must not be newer than apiserver; can be up to 3 minor versions older (older releases allowed only 2).
  • kubectl: Can be +/- 1 minor version of apiserver.

This means you upgrade the Control Plane first, then the Worker Nodes.
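To see where a cluster stands against this policy, something like the following could help. It is a sketch, not an official tool: `check_kubelet_skew` is a hypothetical function that reads "node version" pairs on stdin, and the default skew of 3 is taken from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "NODE KUBELET_VERSION" lines on stdin and flags
# kubelets that are newer than the API server or too many minors behind it.
# Usage: check_kubelet_skew <apiserver-version> [max-skew, default 3]
check_kubelet_skew() {
  local api_minor max_skew="${2:-3}" node ver minor
  api_minor="${1#v}"; api_minor="${api_minor#*.}"; api_minor="${api_minor%%.*}"
  while read -r node ver; do
    [ -n "$node" ] || continue
    minor="${ver#v}"; minor="${minor#*.}"; minor="${minor%%.*}"
    if (( minor > api_minor )); then
      echo "$node: kubelet $ver is newer than the API server"
    elif (( api_minor - minor > max_skew )); then
      echo "$node: kubelet $ver is more than $max_skew minors behind"
    else
      echo "$node: ok"
    fi
  done
}

# Example against a live cluster (not run here):
#   kubectl get nodes --no-headers \
#     -o custom-columns=NAME:.metadata.name,VER:.status.nodeInfo.kubeletVersion \
#     | check_kubelet_skew v1.29.0
```

Running this before and after the control plane upgrade shows which nodes still need attention.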

Upgrade Strategies

1. In-Place Upgrade (Kubeadm)

This is the standard way for self-managed clusters.

Step 1: Upgrade Control Plane

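# On the control plane node
# Package revisions on pkgs.k8s.io look like 1.29.0-1.1, so match the revision with a wildcard
sudo apt-get update && sudo apt-get install -y kubeadm='1.29.0-*'
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.29.0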

Step 2: Upgrade Kubelet & Kubectl

sudo apt-get install -y kubelet='1.29.0-*' kubectl='1.29.0-*'
sudo systemctl daemon-reload
sudo systemctl restart kubelet

Step 3: Upgrade Worker Nodes (One by One)

This is where it gets tricky. You need to move workloads off the node before upgrading it.

2. Node Draining

Before upgrading a node (or rebooting it), you must drain it.

kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data

This command:

  1. Cordons the node (marks it unschedulable).
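  2. Evicts the pods (safely terminates them so they restart elsewhere). DaemonSet pods stay in place because of --ignore-daemonsets, and evictions respect any PodDisruptionBudgets.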

Once the node is empty, you upgrade the kubelet/OS, reboot, and then uncordon it.

kubectl uncordon node-1
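Put together, the drain, upgrade, uncordon cycle for worker nodes might look like the sketch below. Several things here are assumptions rather than part of the original steps: SSH access to each node by name, packages that are not on hold, the `TARGET` version, and the helper names `run`, `upgrade_worker`, and `rolling_upgrade`. It defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: node names, SSH access and the target version are assumptions.
TARGET="${TARGET:-1.29.0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Drain one worker, upgrade kubeadm and the kubelet on it, then uncordon it.
upgrade_worker() {
  local node="$1"
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
  run ssh "$node" "sudo apt-get update && sudo apt-get install -y kubeadm='${TARGET}-*'" || return 1
  run ssh "$node" "sudo kubeadm upgrade node" || return 1
  run ssh "$node" "sudo apt-get install -y kubelet='${TARGET}-*' kubectl='${TARGET}-*'" || return 1
  run ssh "$node" "sudo systemctl daemon-reload && sudo systemctl restart kubelet" || return 1
  run kubectl uncordon "$node"
}

# One node at a time; stop at the first failure so a bad upgrade does not spread.
rolling_upgrade() {
  local node
  for node in "$@"; do
    upgrade_worker "$node" || { echo "stopping: $node failed" >&2; return 1; }
  done
}

# Example (not run here):
#   DRY_RUN=0 rolling_upgrade node-1 node-2
```

Reviewing the dry-run output first is a cheap way to catch a wrong node name or version before anything is evicted.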

3. Blue/Green Clusters (The Cloud Way)

If you are using a managed service (EKS, GKE, AKS) or have good automation, it’s often safer to create a new cluster with the new version and switch traffic over.

  1. Create Cluster B (v1.29).
  2. Deploy apps to Cluster B.
  3. Switch DNS/LoadBalancer to Cluster B.
  4. Delete Cluster A (v1.28).

Pros: Zero risk to existing workloads if upgrade fails.

Cons: Costs double during the transition; stateful apps are hard to move.
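Before switching traffic in step 3, it is worth confirming that Cluster B is actually healthy. A minimal sketch, assuming a kubeconfig context named `cluster-b` (a placeholder) and a hypothetical helper `all_nodes_ready` that parses the output of `kubectl get nodes --no-headers`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and fails if any node is not plainly "Ready" or if no nodes are listed.
all_nodes_ready() {
  local name status rest seen=0 bad=0
  while read -r name status rest; do
    [ -n "$name" ] || continue
    seen=$(( seen + 1 ))
    if [ "$status" != "Ready" ]; then
      echo "not ready: $name ($status)"
      bad=1
    fi
  done
  if [ "$seen" -eq 0 ]; then
    echo "no nodes found"
    return 1
  fi
  return "$bad"
}

# Example (not run here; "cluster-b" is a placeholder context name):
#   kubectl --context cluster-b get nodes --no-headers | all_nodes_ready \
#     && echo "safe to switch DNS"
```

Node readiness is only a first gate. Application-level health checks on Cluster B would still be needed before moving real traffic.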

Summary

  • Plan ahead: Read the release notes for breaking changes (API removals).
  • Back up etcd: Always take a backup before upgrading.
  • Drain nodes: Respect your workloads.
  • One step at a time: Don’t skip versions.
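For the etcd backup, a snapshot taken on a control plane node is the usual approach. The sketch below assumes a kubeadm-built cluster, so the certificate paths are the kubeadm defaults. The `backup_etcd` function name and the `/var/backups` destination are placeholders. It prints the command unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: certificate paths are the kubeadm defaults; adjust for other setups.
# Usage: backup_etcd [snapshot-file]
backup_etcd() {
  local out="${1:-/var/backups/etcd-$(date +%Y%m%d-%H%M%S).db}"
  local cmd=(etcdctl snapshot save "$out"
    --endpoints=https://127.0.0.1:2379
    --cacert=/etc/kubernetes/pki/etcd/ca.crt
    --cert=/etc/kubernetes/pki/etcd/server.crt
    --key=/etc/kubernetes/pki/etcd/server.key)
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: show what would be executed.
    echo "ETCDCTL_API=3 ${cmd[*]}"
  else
    ETCDCTL_API=3 "${cmd[@]}"
  fi
}

# Example (not run here):
#   DRY_RUN=0 backup_etcd /var/backups/etcd-before-1.29.db
```

Copying the snapshot off the node afterwards keeps it available even if the control plane host itself is lost.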

Next Steps

We’re almost at the end of the alphabet! Next up is Y is for YAML, where we’ll master the language that defines it all.

