Security incidents in Kubernetes rarely come from one catastrophic failure. Most of them come from small gaps across identity, networking, workload configuration, and software supply chain. This guide gives you a practical hardening checklist you can apply to production clusters.
1. Identity and Access (IAM + RBAC)
- Disable anonymous API access where possible.
- Use short-lived credentials and avoid static admin tokens.
- Enforce least privilege in RBAC. Avoid broad cluster-admin bindings.
- Separate human access and workload access.
- Require MFA in your IdP for privileged roles.
Example: audit high-risk RBAC bindings.
kubectl get clusterrolebinding -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.roleRef.name}{"\n"}{end}'
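As a counterpart to auditing broad bindings, a least-privilege grant might look like the sketch below: a namespaced Role with read-only access to pods, bound to a single service account. The namespace `payments`, the service account `orders-reader`, and the object names are hypothetical and only serve the illustration.

```yaml
# Hypothetical names: namespace "payments", service account "orders-reader".
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: payments
rules:
  - apiGroups: [""]            # "" is the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]   # read-only, no create/update/delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: payments
subjects:
  - kind: ServiceAccount
    name: orders-reader
    namespace: payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

Because both objects are namespaced, the grant cannot reach resources in other namespaces, unlike a ClusterRoleBinding.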
2. Namespace and Tenant Isolation
- Isolate teams and environments by namespace.
- Add ResourceQuota and LimitRange for each tenant namespace.
- Block cross-namespace secret access via RBAC.
- Use dedicated node pools for sensitive workloads if needed.
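A minimal sketch of the quota and limit objects mentioned above, assuming a tenant namespace called `team-a`. The namespace name and every number are placeholders to adapt, not recommendations.

```yaml
# Hypothetical tenant namespace "team-a"; all values are illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-defaults
  namespace: team-a
spec:
  limits:
    - type: Container
      defaultRequest:        # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
      default:               # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
```

The LimitRange defaults matter once a quota on requests and limits exists, because pods that omit those values would otherwise be rejected by the quota.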
3. Pod Security Standards
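Apply the restricted Pod Security Standard where possible, and set the fields below (readOnlyRootFilesystem is an extra hardening step beyond the restricted profile):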
- runAsNonRoot: true
- allowPrivilegeEscalation: false
- drop Linux capabilities
- readOnlyRootFilesystem: true
- seccompProfile: RuntimeDefault
apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: ghcr.io/example/app:1.0.0
      securityContext:
        runAsNonRoot: true
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
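To have the cluster reject pods that fall short of these settings, one option is to label the namespace for the built-in Pod Security Admission controller. The sketch below assumes that controller is enabled in your cluster; the namespace name `team-a` is hypothetical.

```yaml
# Hypothetical namespace "team-a"; assumes Pod Security Admission is enabled.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    pod-security.kubernetes.io/enforce: restricted   # reject violating pods
    pod-security.kubernetes.io/audit: restricted     # record violations in audit logs
    pod-security.kubernetes.io/warn: restricted      # warn the client on apply
```

Starting with only the `audit` and `warn` labels and adding `enforce` once workloads are compliant is a gentler rollout path.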
4. Network Security
- Start with a default-deny NetworkPolicy per namespace.
- Explicitly allow only required ingress/egress flows.
- Restrict egress to the internet for workloads that do not need it.
- Encrypt service-to-service traffic with mTLS (service mesh or CNI capabilities).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
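Once default deny is in place, each required flow needs its own allow rule. The following is a sketch, not a drop-in policy: the labels `app: api` and `app: frontend`, port 8080, the namespace `team-a`, and the assumption that cluster DNS runs in `kube-system` are all illustrative.

```yaml
# Hypothetical labels, port, and namespace; adjust to your workloads.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Only frontend pods in the same namespace may reach the API pods
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
  egress:
    # DNS lookups only; assumes cluster DNS runs in kube-system
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Forgetting the DNS egress rule is a common reason workloads break right after a default-deny policy is applied.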
5. Secret Management
- Never store plaintext secrets in Git.
- Enable etcd encryption at rest.
- Rotate secrets automatically.
- Prefer external secret managers (Vault, cloud secret managers).
- Scope secret access by service account.
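On self-managed control planes, etcd encryption at rest is typically configured through a file passed to the API server with `--encryption-provider-config`. The sketch below shows the general shape; the key name `key1` and the placeholder secret are assumptions, and managed Kubernetes services usually expose this setting through their own mechanisms instead.

```yaml
# Sketch of an API server encryption config; the key value is a placeholder.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      # The first provider is used to encrypt new writes
      - aescbc:
          keys:
            - name: key1
              secret: <BASE64_ENCODED_32_BYTE_KEY>
      # Fallback so Secrets written before encryption was enabled stay readable
      - identity: {}
```

Existing Secrets are only encrypted when they are next written, so enabling this is usually followed by rewriting all Secrets.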
6. Supply Chain Security
- Only pull images from trusted registries.
- Sign and verify container images.
- Scan images for vulnerabilities in CI.
- Pin production images by immutable digest rather than by mutable tag.
image: ghcr.io/example/api@sha256:2d0c...b81a
7. Admission Control and Policy as Code
- Use OPA Gatekeeper or Kyverno for guardrails.
- Enforce: no privileged containers, required labels, trusted registries.
- Deny workloads with :latest tags.
- Require resource requests/limits for all pods.
Example Kyverno rule idea:
- deny if spec.containers[].image ends with :latest
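The rule idea above could be expressed as a Kyverno ClusterPolicy along these lines. This is a sketch modelled on the common "disallow latest tag" pattern; the policy and rule names are hypothetical, and the placement of the enforcement setting has changed between Kyverno releases, so check it against the version you run.

```yaml
# Sketch only; verify field names against your installed Kyverno version.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce   # block violating pods instead of only reporting
  background: true
  rules:
    - name: validate-image-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must not use the mutable ':latest' tag."
        pattern:
          spec:
            containers:
              # "!" negates the wildcard match on the image string
              - image: "!*:latest"
```

Note that this pattern only covers `spec.containers`; init containers and images with no tag at all (which default to `latest`) would need additional rules.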
8. Runtime Detection and Auditing
- Enable Kubernetes audit logs and centralize them.
- Add runtime detection (Falco or equivalent).
- Alert on suspicious events:
- shell spawned in production container
- privilege escalation attempts
- unexpected outbound connections
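Audit logging is driven by a policy file passed to the API server with `--audit-policy-file`. The sketch below is one possible starting point, not a complete policy: it logs metadata for Secret and ConfigMap access, full bodies for RBAC changes, and metadata for exec and attach sessions. Which resources deserve which level is an assumption to tune per cluster.

```yaml
# Illustrative audit policy; rules are evaluated top to bottom, first match wins.
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - RequestReceived
rules:
  # Who touched Secrets and ConfigMaps, without logging their contents
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Full request and response bodies for RBAC changes
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # Exec and attach sessions, which cover shells opened in running containers
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods/exec", "pods/attach"]
  # Everything else at metadata level
  - level: Metadata
```

Keeping Secrets at the `Metadata` level avoids copying secret values into the audit log itself.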
9. Node and Control Plane Hardening
- Keep Kubernetes and node OS patched.
- Minimize SSH access to worker nodes.
- Use hardened node images.
- Restrict access to the API server endpoint.
- Back up etcd regularly and test restore.
10. Incident Readiness
- Define incident severity levels and owners.
- Create runbooks for credential leak, compromised image, and lateral movement.
- Practice security game days.
- Measure MTTD and MTTR for security incidents.
Production Hardening Baseline
If you need a fast baseline, enforce these first:
- Restricted Pod Security standards.
- Default deny NetworkPolicy.
- Least-privilege RBAC with IdP integration.
- Signed images and vulnerability scanning in CI.
- Audit logging with runtime detection.
This baseline removes the most common privilege escalation and lateral movement paths while staying practical for engineering teams.