[21/24] Z is for Zero-Downtime Deployments: Advanced Deployment Strategies


This is Post #21 in the Kubernetes A-to-Z Series

Reading Order: Previous: Federation | Next: GitOps

Series Progress: 21/24 complete | Difficulty: Advanced | Time: 35-40 min | Part 6/6: Security & Production

Welcome to the final post in our Kubernetes A-to-Z Series! We’ll conclude with Zero-Downtime Deployments - advanced strategies that ensure your applications remain available during updates. These patterns are essential for production-grade Kubernetes operations.

Deployment Strategies Overview

┌─────────────────────────────────────────────────┐
│  Deployment Strategies                          │
│                                                 │
│  Rolling Update    Blue-Green    Canary         │
│  ┌───┐ ┌───┐       ┌───┐ ┌───┐   ┌───┐ ┌───┐    │
│  │v1 │→│v2 │       │v1 │ │v2 │   │v1 │ │v2 │    │
│  │v1 │→│v2 │       │   │ │   │   │v1 │ │   │    │
│  │v1 │→│v2 │       └───┘ └───┘   │v1 │ │   │    │
│  └───┘ └───┘       Switch        └───┘ └───┘    │
│  Gradual           Instant       % Traffic      │
│                                                 │
│  A/B Testing       Shadow        Feature Flags  │
│  ┌───────────┐     ┌───────────┐ ┌───────────┐  │
│  │ Feature A │     │  Mirror   │ │ Toggle On │  │
│  │ Feature B │     │  Traffic  │ │ Toggle Off│  │
│  └───────────┘     └───────────┘ └───────────┘  │
└─────────────────────────────────────────────────┘

Rolling Updates

Optimized Rolling Update

# rolling-update.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
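      maxSurge: 25%        # Extra pods allowed above replicas during the update (rounds up)
      maxUnavailable: 25%  # Pods allowed to be unavailable during the update (rounds down)
  minReadySeconds: 10      # A new pod must stay ready this long before it counts as available
  progressDeadlineSeconds: 600  # Report the rollout as stalled after 10 minutes without progress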
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: myapp:v2.0
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 10
        lifecycle:
          preStop:
            exec:
              # Delay SIGTERM so endpoints and load balancers stop routing to this pod first
              command: ["/bin/sh", "-c", "sleep 10"]
      terminationGracePeriodSeconds: 30
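
To make the percentages in the manifest above concrete, here is a minimal sketch that converts them into pod counts. The function names are hypothetical helpers, not part of kubectl. It relies on the documented rounding rules: a percentage maxSurge rounds up and a percentage maxUnavailable rounds down.

```shell
# Hypothetical helpers: turn rolling-update percentages into pod counts.

# maxSurge percentages round up to a whole pod.
max_surge_pods() {
  replicas=$1; percent=$2
  echo $(( (replicas * percent + 99) / 100 ))
}

# maxUnavailable percentages round down to a whole pod.
max_unavailable_pods() {
  replicas=$1; percent=$2
  echo $(( replicas * percent / 100 ))
}

# Values from rolling-update.yaml: 10 replicas, 25% / 25%.
surge=$(max_surge_pods 10 25)
unavailable=$(max_unavailable_pods 10 25)
echo "Peak pod count during update: $(( 10 + surge ))"
echo "Minimum available pods:       $(( 10 - unavailable ))"
```

With 10 replicas this works out to at most 13 pods running and at least 8 available at any point in the rollout.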

Pod Disruption Budget

# pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: webapp-pdb
spec:
  minAvailable: 80%  # or maxUnavailable: 20%
  selector:
    matchLabels:
      app: webapp
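
As a rough illustration of what the budget above allows, the sketch below computes how many pods could be evicted at once. The helper name and its arguments are assumptions made for this example. It assumes a percentage minAvailable is rounded up to a whole pod, as the Kubernetes documentation describes.

```shell
# Hypothetical helper: evictions a percentage-based minAvailable would allow.
# Arguments: expected pod count, currently healthy pods, minAvailable percent.
pdb_allowed_disruptions() {
  expected=$1; healthy=$2; min_available_pct=$3
  # Percentages round up, so 50% of 7 pods means 4 must stay available.
  required=$(( (expected * min_available_pct + 99) / 100 ))
  allowed=$(( healthy - required ))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

echo "10 healthy of 10, minAvailable 80%: $(pdb_allowed_disruptions 10 10 80) evictions allowed"
echo " 8 healthy of 10, minAvailable 80%: $(pdb_allowed_disruptions 10 8 80) evictions allowed"
```

Once only 8 of the 10 pods are healthy, a node drain has to wait until more pods become ready.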

Blue-Green Deployments

Blue-Green with Services

# blue-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-blue
  labels:
    app: webapp
    version: blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
      version: blue
  template:
    metadata:
      labels:
        app: webapp
        version: blue
    spec:
      containers:
      - name: webapp
        image: myapp:v1.0
        ports:
        - containerPort: 8080
---
# green-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-green
  labels:
    app: webapp
    version: green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
      version: green
  template:
    metadata:
      labels:
        app: webapp
        version: green
    spec:
      containers:
      - name: webapp
        image: myapp:v2.0
        ports:
        - containerPort: 8080
---
# service.yaml - Switch by changing selector
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  selector:
    app: webapp
    version: blue  # Change to 'green' to switch
  ports:
  - port: 80
    targetPort: 8080

Blue-Green Switch Script

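#!/bin/bash
set -euo pipefail

# Read the color the Service currently routes to.
CURRENT=$(kubectl get svc webapp -o jsonpath='{.spec.selector.version}')

if [ "$CURRENT" = "blue" ]; then
  NEW="green"
else
  NEW="blue"
fi

# Refuse to switch until the target Deployment has finished rolling out.
kubectl rollout status "deployment/webapp-$NEW" --timeout=120s

echo "Switching from $CURRENT to $NEW"
kubectl patch svc webapp -p "{\"spec\":{\"selector\":{\"version\":\"$NEW\"}}}"
echo "Switch complete"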

Canary Deployments

Manual Canary with Services

# canary-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-stable
spec:
  replicas: 9
  selector:
    matchLabels:
      app: webapp
      track: stable
  template:
    metadata:
      labels:
        app: webapp
        track: stable
    spec:
      containers:
      - name: webapp
        image: myapp:v1.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-canary
spec:
  replicas: 1  # Roughly 10% of traffic: 1 of the 10 pods behind the Service
  selector:
    matchLabels:
      app: webapp
      track: canary
  template:
    metadata:
      labels:
        app: webapp
        track: canary
    spec:
      containers:
      - name: webapp
        image: myapp:v2.0
---
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  selector:
    app: webapp  # Matches both stable and canary
  ports:
  - port: 80
    targetPort: 8080
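
With a plain Service, the traffic split is only as fine-grained as the replica ratio. The sketch below shows the arithmetic, assuming requests are spread roughly evenly across ready pods. Both function names are hypothetical.

```shell
# Approximate percentage of requests reaching the canary (rounded down).
canary_share_pct() {
  stable=$1; canary=$2
  echo $(( canary * 100 / (stable + canary) ))
}

# Stable replicas needed so that the given canary pods get about pct percent.
stable_replicas_for() {
  canary=$1; pct=$2
  echo $(( canary * (100 - pct) / pct ))
}

echo "9 stable + 1 canary  -> about $(canary_share_pct 9 1)% canary traffic"
echo "1% canary with 1 pod -> needs $(stable_replicas_for 1 1) stable replicas"
```

The second line shows why small canary percentages are impractical with replica counts alone, which is the gap that mesh-based traffic splitting fills.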

Istio Canary with Traffic Splitting

# istio-canary.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: webapp
spec:
  hosts:
  - webapp
  http:
  - match:
    - headers:
        canary:
          exact: "true"
    route:
    - destination:
        host: webapp
        subset: canary
  - route:
    - destination:
        host: webapp
        subset: stable
      weight: 90
    - destination:
        host: webapp
        subset: canary
      weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: webapp
spec:
  host: webapp
  subsets:
  # Subset labels must match pod labels; these match the track labels on the Deployments above
  - name: stable
    labels:
      track: stable
  - name: canary
    labels:
      track: canary

Argo Rollouts

Installing Argo Rollouts

kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

# Install kubectl plugin
brew install argoproj/tap/kubectl-argo-rollouts

Canary Rollout

# argo-canary-rollout.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: webapp
spec:
  replicas: 10
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: myapp:v1.0
        ports:
        - containerPort: 8080
  strategy:
    canary:
      steps:
      - setWeight: 5
      - pause: {duration: 2m}
      - setWeight: 20
      - pause: {duration: 5m}
      - setWeight: 50
      - pause: {duration: 5m}
      - setWeight: 80
      - pause: {duration: 2m}
      canaryService: webapp-canary
      stableService: webapp-stable
      trafficRouting:
        istio:
          virtualService:
            name: webapp
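            routes:
            - primary  # Name of the http route in the VirtualService that Argo Rollouts reweights
      analysis:
        templates:
        - templateName: success-rate
        startingStep: 2  # Start analysis at step index 2 (setWeight: 20); steps are zero-indexed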
        args:
        - name: service-name
          value: webapp-canary
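
The steps above imply a minimum rollout duration, which the following sketch adds up. It is only an illustration with a hypothetical helper: each argument pairs a setWeight value with the pause that follows it, and the result ignores pod startup time and any abort triggered by analysis.

```shell
# Each argument is "weight:pause_minutes", mirroring one setWeight step
# and the pause after it. Prints when each weight takes effect.
canary_timeline() {
  elapsed=0
  for step in "$@"; do
    weight=${step%%:*}
    pause=${step##*:}
    echo "t+${elapsed}m weight=${weight}%"
    elapsed=$(( elapsed + pause ))
  done
  echo "t+${elapsed}m weight=100% (full promotion)"
}

# Steps from argo-canary-rollout.yaml.
canary_timeline 5:2 20:5 50:5 80:2
```

For this Rollout the pauses alone add up to 14 minutes before full promotion.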

Blue-Green Rollout

# argo-bluegreen-rollout.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: webapp
spec:
  replicas: 5
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: myapp:v1.0
        ports:
        - containerPort: 8080
  strategy:
    blueGreen:
      activeService: webapp-active
      previewService: webapp-preview
      autoPromotionEnabled: false
      prePromotionAnalysis:
        templates:
        - templateName: smoke-tests  # Must exist as its own AnalysisTemplate; not defined in this post
      postPromotionAnalysis:
        templates:
        - templateName: success-rate
        args:
        - name: service-name
          value: webapp-active

Analysis Templates

# analysis-template.yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  metrics:
  - name: success-rate
    interval: 1m
    successCondition: result[0] >= 0.95
    failureLimit: 3
    provider:
      prometheus:
        address: http://prometheus:9090
        query: |
          sum(rate(http_requests_total{service="{{args.service-name}}",status=~"2.."}[5m]))
          /
          sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: latency-check
spec:
  metrics:
  - name: latency-p95
    interval: 1m
    successCondition: result[0] < 500
    failureLimit: 3
    provider:
      prometheus:
        address: http://prometheus:9090
        query: |
          histogram_quantile(0.95,
            sum(rate(http_request_duration_ms_bucket{service="webapp"}[5m])) by (le)
          )
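
To show how successCondition and failureLimit interact in the success-rate template, here is a small simulation. It is a sketch rather than Argo Rollouts code: the function names are made up, samples are written as successes/total, and it assumes a metric is marked Failed once the number of failed measurements exceeds failureLimit.

```shell
# Succeeds when successes/total >= 0.95, using integer math instead of floats.
measurement_passes() {
  successes=$1; total=$2
  [ "$total" -gt 0 ] && [ $(( successes * 100 )) -ge $(( total * 95 )) ]
}

# Arguments: failureLimit, then samples such as 990/1000.
metric_verdict() {
  limit=$1; shift
  failures=0
  for sample in "$@"; do
    if ! measurement_passes "${sample%%/*}" "${sample##*/}"; then
      failures=$(( failures + 1 ))
    fi
    if [ "$failures" -gt "$limit" ]; then
      echo "Failed"
      return 0
    fi
  done
  echo "Passing"
}

metric_verdict 3 990/1000 900/1000 900/1000 900/1000   # three failures are tolerated
metric_verdict 3 900/1000 900/1000 900/1000 900/1000   # a fourth failure fails the metric
```

Under that assumption, failureLimit: 3 tolerates three bad one-minute measurements before the rollout is aborted.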

Flagger Progressive Delivery

Flagger Canary

# flagger-canary.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: webapp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  progressDeadlineSeconds: 600
  service:
    port: 80
    targetPort: 8080
    gateways:
    - public-gateway.istio-system.svc.cluster.local
    hosts:
    - webapp.example.com
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 1m
    webhooks:
    - name: load-test
      url: http://flagger-loadtester/
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://webapp-canary/"
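
The analysis settings above determine how long a Flagger rollout takes. The sketch below lists the traffic weights and derives the timings from the Canary spec values, following the formulas in the Flagger documentation: minimum promotion time is interval * (maxWeight / stepWeight) and rollback time is interval * threshold. The helper name is hypothetical.

```shell
# Lists the canary weights Flagger steps through before promotion.
flagger_weights() {
  step=$1; max=$2
  weight=$step
  out=""
  while [ "$weight" -le "$max" ]; do
    out="${out:+$out }$weight"
    weight=$(( weight + step ))
  done
  echo "$out"
}

# Values from flagger-canary.yaml.
interval_min=1; threshold=5; step_weight=10; max_weight=50
echo "Canary weights:          $(flagger_weights "$step_weight" "$max_weight")"
echo "Minimum time to promote: $(( interval_min * max_weight / step_weight ))m"
echo "Time to roll back:       $(( interval_min * threshold ))m"
```

With these values, a healthy canary is promoted after about five minutes, and a failing one is rolled back after five failed checks.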

GitOps with ArgoCD

Application Definition

# argocd-application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: webapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/webapp
    targetRevision: HEAD
    path: deploy/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
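
The retry block above describes an exponential backoff. As a sketch of the schedule it implies, the hypothetical helper below multiplies the delay by the factor after each attempt and caps it at the maximum. Exact timing inside Argo CD may differ, so treat the numbers as an approximation.

```shell
# Arguments: retry limit, initial delay (s), factor, maximum delay (s).
retry_delays() {
  limit=$1; delay=$2; factor=$3; cap=$4
  attempt=1
  out=""
  while [ "$attempt" -le "$limit" ]; do
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    out="${out:+$out }${delay}s"
    delay=$(( delay * factor ))
    attempt=$(( attempt + 1 ))
  done
  echo "$out"
}

# limit: 5, duration: 5s, factor: 2, maxDuration: 3m (180s)
retry_delays 5 5 2 180
```

With a limit of 5 the cap is never reached. It would only matter from the seventh retry onward.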

Sync Waves

# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: webapp
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
---
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: webapp-config
  annotations:
    argocd.argoproj.io/sync-wave: "0"
---
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  annotations:
    argocd.argoproj.io/sync-wave: "1"
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: webapp
  annotations:
    argocd.argoproj.io/sync-wave: "2"
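
Sync waves are applied from the lowest number to the highest. The sketch below sorts the four resources above by their wave annotation to show the resulting order. It is a simplification with a hypothetical helper: Argo CD also orders by phase, kind, and name, which this ignores.

```shell
# Reads "wave resource" lines on stdin and prints them in apply order.
sync_order() {
  sort -n
}

printf '%s\n' \
  "2 Service/webapp" \
  "-1 Namespace/webapp" \
  "1 Deployment/webapp" \
  "0 ConfigMap/webapp-config" | sync_order
```

The Namespace is created first and the Service last. Each wave waits for the previous one to become healthy.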

Feature Flags

ConfigMap-Based Feature Flags

# feature-flags.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
data:
  NEW_CHECKOUT: "true"
  DARK_MODE: "false"
  BETA_FEATURES: "true"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  template:
    spec:
      containers:
      - name: webapp
        image: myapp:v1.0
        # Values are read once at container start; restart pods after editing the ConfigMap
        envFrom:
        - configMapRef:
            name: feature-flags
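
Inside the container, each ConfigMap key arrives as an environment variable. The following is a minimal sketch of how an entrypoint script might read one. The flag_enabled helper is hypothetical, and the two assignments stand in for values that envFrom would inject.

```shell
# Succeeds when the named flag is exactly "true"; unset flags count as off.
# Only call this with trusted flag names, since it uses eval.
flag_enabled() {
  eval "value=\${$1:-false}"
  [ "$value" = "true" ]
}

# Stand-ins for the variables injected from the feature-flags ConfigMap.
NEW_CHECKOUT="true"
DARK_MODE="false"

if flag_enabled NEW_CHECKOUT; then echo "new checkout: on"; else echo "new checkout: off"; fi
if flag_enabled DARK_MODE; then echo "dark mode: on"; else echo "dark mode: off"; fi
```

Treating unset flags as off keeps a missing ConfigMap key from enabling a feature by accident.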

Pre-Deployment Checks

# pre-deployment-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pre-deploy-checks
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  template:
    spec:
      containers:
      - name: checks
        image: curlimages/curl
        command:
        - /bin/sh
        - -c
        - |
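          echo "Running pre-deployment checks..."
          # Ports 5432 and 6379 normally carry the PostgreSQL and Redis protocols, not HTTP,
          # so check TCP reachability instead of requesting an HTTP path
          curl -s --connect-timeout 5 telnet://database:5432 </dev/null || exit 1
          curl -s --connect-timeout 5 telnet://cache:6379 </dev/null || exit 1
          echo "All checks passed!"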
      restartPolicy: Never
  backoffLimit: 3
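
Dependencies are often briefly unavailable, so a check script may want to retry inside the container before the Job fails outright. The sketch below shows one way to do that. The retry helper is hypothetical, and the endpoint in the comment is a placeholder.

```shell
# Runs a command until it succeeds or the attempt limit is reached.
# Arguments: max attempts, delay between attempts (s), command...
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "failed after $n attempts: $*" >&2
      return 1
    fi
    n=$(( n + 1 ))
    sleep "$delay"
  done
}

# Example use inside the Job script (placeholder URL, not executed here):
#   retry 5 2 curl -fsS http://api/healthz || exit 1
retry 3 0 true && echo "check passed"
```

In-script retries complement backoffLimit, which restarts the whole Job pod rather than a single check.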

Graceful Shutdown

# graceful-shutdown.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: webapp
        image: myapp:v1.0
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - |
                echo "Stopping gracefully..."
                sleep 15
                /app/shutdown.sh
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 5
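
The preStop hook and the application's own shutdown share the same grace period, so the sleep eats into the time left for draining. The arithmetic can be sketched as follows. The helper names and the 50-second drain time are assumptions chosen for illustration.

```shell
# Seconds left for the application to drain after the preStop sleep.
shutdown_budget() {
  grace=$1; prestop_sleep=$2
  echo $(( grace - prestop_sleep ))
}

# Succeeds when preStop sleep plus expected drain time fits in the grace period.
fits_grace_period() {
  grace=$1; prestop_sleep=$2; drain=$3
  [ $(( prestop_sleep + drain )) -le "$grace" ]
}

# Values from graceful-shutdown.yaml: 60s grace period, 15s sleep.
echo "Time left after preStop sleep: $(shutdown_budget 60 15)s"
if fits_grace_period 60 15 50; then
  echo "A 50s drain fits"
else
  echo "A 50s drain would be cut off by SIGKILL"
fi
```

If /app/shutdown.sh can take longer than the remaining 45 seconds, raise terminationGracePeriodSeconds rather than shortening the sleep.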

Commands Reference

# Rolling update
kubectl set image deployment/webapp webapp=myapp:v2.0
kubectl rollout status deployment/webapp
kubectl rollout history deployment/webapp
kubectl rollout undo deployment/webapp

# Argo Rollouts
kubectl argo rollouts get rollout webapp
kubectl argo rollouts promote webapp
kubectl argo rollouts abort webapp
kubectl argo rollouts retry rollout webapp

# ArgoCD
argocd app sync webapp
argocd app history webapp
argocd app rollback webapp   # Fails while automated sync is enabled on the application

# Blue-Green switch
kubectl patch svc webapp -p '{"spec":{"selector":{"version":"green"}}}'

Key Takeaways

  • Rolling updates replace pods gradually; they avoid downtime only when readiness probes and graceful shutdown are configured
  • Blue-green enables an instant switchover with easy rollback, at the cost of running two full environments
  • Canary releases expose a new version to a subset of traffic for validation
  • Argo Rollouts automates progressive delivery with metric-based analysis
  • Flagger integrates with service meshes and ingress controllers for traffic management
  • GitOps ensures declarative, auditable deployments
  • Pod Disruption Budgets protect availability during voluntary disruptions such as node drains
  • Graceful shutdown prevents dropped connections
