[9/24] V is for Volumes: Persistent Storage in Kubernetes


This is Post #8 in the Kubernetes A-to-Z Series

Reading Order: Previous: Namespaces | Next: ConfigMaps and Secrets

Series Progress: 9/24 complete | Difficulty: Intermediate | Time: 25 min | Part 3/6: Configuration

Welcome to the eighth post in our Kubernetes A-to-Z Series! Now that you understand cluster organization with Namespaces, let’s explore Volumes, the mechanism Kubernetes uses to give Pods storage. Volumes let your applications store and share data beyond the lifecycle of individual containers.

Why Do We Need Volumes?

Containers are ephemeral by nature: when a container restarts, everything written to its own filesystem is lost. Volumes solve this problem by providing storage that survives container restarts and can be shared between containers in the same pod. Whether the data also outlives the pod depends on the volume type.

Container Storage vs Volumes

Without Volumes (Ephemeral):
┌─────────────────────────────────────┐
│  Pod                                │
│  ┌─────────────────────────────┐    │
│  │  Container                  │    │
│  │  ┌─────────────────────┐    │    │
│  │  │  /app/data (lost!)  │    │    │
│  │  └─────────────────────┘    │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
Data lost on container restart

With Volumes (Persistent):
┌─────────────────────────────────────┐
│  Pod                                │
│  ┌─────────────────────────────┐    │
│  │  Container                  │    │
│  │  ┌─────────────────────┐    │    │
│  │  │  /app/data (mount)  │────┼────┼──► Volume
│  │  └─────────────────────┘    │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
Data persists across restarts

Key Volume Benefits

  • Data Persistence: Survive container restarts and, with persistent volume types, pod restarts
  • Data Sharing: Share data between containers in the same pod
  • External Storage: Connect to cloud disks, NFS, and other network storage
  • Decoupling: Separate storage lifecycle from application lifecycle
  • Portability: Abstract storage details from applications

Volume Types Overview

Kubernetes supports many volume types:

Ephemeral Volumes

  • emptyDir: Temporary storage, deleted with pod
  • configMap: Mount configuration data
  • secret: Mount sensitive data
  • downwardAPI: Expose pod metadata
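
As a rough illustration of two of the ephemeral types listed above, the sketch below mounts a configMap volume and a downwardAPI volume in one Pod. The Pod name, the ConfigMap name (app-config), and the mount paths are hypothetical, and the ConfigMap is assumed to exist already in the same namespace.

```yaml
# ephemeral-volumes-sketch.yaml (illustrative; names are assumptions)
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-volumes-demo
  labels:
    app: demo
spec:
  containers:
  - name: webapp
    image: myapp:v1.0
    volumeMounts:
    - name: config
      mountPath: /etc/app/config
      readOnly: true
    - name: podinfo
      mountPath: /etc/podinfo
      readOnly: true
  volumes:
  - name: config
    configMap:
      name: app-config        # hypothetical ConfigMap; must exist beforehand
  - name: podinfo
    downwardAPI:
      items:
      - path: labels          # the Pod's labels appear in /etc/podinfo/labels
        fieldRef:
          fieldPath: metadata.labels
```

Both volumes are created with the Pod and removed with it, which is what makes them ephemeral.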

Persistent Volumes

  • hostPath: Mount node filesystem (testing only)
  • nfs: Network File System
  • persistentVolumeClaim: Dynamic/static provisioning
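  • awsElasticBlockStore: AWS EBS (deprecated in-tree plugin; use the ebs.csi.aws.com CSI driver)
  • gcePersistentDisk: GCP Persistent Disk (deprecated in-tree plugin; use the pd.csi.storage.gke.io CSI driver)
  • azureDisk: Azure Managed Disk (deprecated in-tree plugin; use the disk.csi.azure.com CSI driver)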

Basic Volume Types

1. emptyDir Volume

Temporary storage that exists for the pod’s lifetime:

# emptydir-volume.yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-with-cache
spec:
  containers:
  - name: webapp
    image: myapp:v1.0
    volumeMounts:
    - name: cache-volume
      mountPath: /app/cache
  - name: cache-warmer
    image: cache-warmer:v1.0
    volumeMounts:
    - name: cache-volume
      mountPath: /cache
  volumes:
  - name: cache-volume
    emptyDir: {}

2. emptyDir with Memory Backing

For high-performance temporary storage:

# emptydir-memory.yaml
apiVersion: v1
kind: Pod
metadata:
  name: fast-cache-pod
spec:
  containers:
  - name: webapp
    image: myapp:v1.0
    volumeMounts:
    - name: memory-cache
      mountPath: /app/cache
  volumes:
  - name: memory-cache
    emptyDir:
      medium: Memory
      sizeLimit: 256Mi
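
Files written to a memory-backed emptyDir count against the memory limit of the container that writes them, so it usually makes sense to size the two together. The variant below is a minimal sketch of that idea; the Pod name and the request and limit values are assumptions chosen only for illustration.

```yaml
# emptydir-memory-with-limits.yaml (illustrative; values are assumptions)
apiVersion: v1
kind: Pod
metadata:
  name: fast-cache-pod-limited
spec:
  containers:
  - name: webapp
    image: myapp:v1.0
    resources:
      requests:
        memory: 512Mi
      limits:
        memory: 768Mi   # should cover the app's own memory plus the 256Mi cache
    volumeMounts:
    - name: memory-cache
      mountPath: /app/cache
  volumes:
  - name: memory-cache
    emptyDir:
      medium: Memory
      sizeLimit: 256Mi
```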

3. hostPath Volume

Mount node filesystem (use with caution):

# hostpath-volume.yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-example
spec:
  containers:
  - name: webapp
    image: myapp:v1.0
    volumeMounts:
    - name: host-logs
      mountPath: /var/log/app
  volumes:
  - name: host-logs
    hostPath:
      path: /var/log/myapp
      type: DirectoryOrCreate
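
One way to apply the caution mentioned above is to mount the host directory read-only and require it to exist already. The sketch below assumes a hypothetical Pod name and container name, and reuses the path from the previous example.

```yaml
# hostpath-readonly.yaml (illustrative; names are assumptions)
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-readonly-example
spec:
  containers:
  - name: log-reader
    image: myapp:v1.0
    volumeMounts:
    - name: host-logs
      mountPath: /var/log/app
      readOnly: true          # the container cannot modify node files
  volumes:
  - name: host-logs
    hostPath:
      path: /var/log/myapp
      type: Directory         # fail instead of creating the path if it is missing
```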

Persistent Volumes (PV) and Claims (PVC)

Understanding the PV/PVC Model

Administrator Creates PV:
┌─────────────────────────────────────┐
│  PersistentVolume (Cluster-wide)    │
│  ┌─────────────────────────────┐    │
│  │  Name: database-pv          │    │
│  │  Capacity: 100Gi            │    │
│  │  AccessMode: ReadWriteOnce  │    │
│  │  StorageClass: fast-ssd     │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘

Developer Creates PVC:
┌─────────────────────────────────────┐
│  PersistentVolumeClaim (Namespaced) │
│  ┌─────────────────────────────┐    │
│  │  Name: database-pvc         │    │
│  │  Request: 50Gi              │    │
│  │  AccessMode: ReadWriteOnce  │    │
│  │  StorageClass: fast-ssd     │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘

Binding:
PV (100Gi) ◄───────► PVC (50Gi Request)
         Bound

Creating a PersistentVolume

# persistent-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: database-pv
  labels:
    type: local
    app: database
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /mnt/data/database
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-files-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain  # Recycle is deprecated
  storageClassName: nfs
  nfs:
    server: nfs-server.example.com
    path: /exports/shared

Creating a PersistentVolumeClaim

# persistent-volume-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
  namespace: production
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: manual
  selector:
    matchLabels:
      type: local
      app: database
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files-pvc
  namespace: production
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
  storageClassName: nfs
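
Besides label selectors, a claim can name the exact PersistentVolume it wants through spec.volumeName. The following is a minimal sketch, assuming the database-pv volume defined earlier is still Available; the claim name is hypothetical.

```yaml
# prebound-pvc.yaml (illustrative; claim name is an assumption)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc-prebound
  namespace: production
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: manual    # must match the PV's storageClassName
  volumeName: database-pv     # bind to this specific PV
```

The requested size and access modes still have to be compatible with the named volume, otherwise the claim stays Pending.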

Using PVC in a Pod

# pod-with-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: database-pod
  namespace: production
spec:
  containers:
  - name: database
    image: postgres:14
    ports:
    - containerPort: 5432
    env:
    - name: PGDATA
      value: /var/lib/postgresql/data/pgdata
    volumeMounts:
    - name: database-storage
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: database-storage
    persistentVolumeClaim:
      claimName: database-pvc

Access Modes

Mode              Abbreviation  Description
ReadWriteOnce     RWO           Single node read-write
ReadOnlyMany      ROX           Multiple nodes read-only
ReadWriteMany     RWX           Multiple nodes read-write
ReadWriteOncePod  RWOP          Single pod read-write

# access-modes-example.yaml
# Database - Single writer
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
---
# Shared content - Multiple readers and writers
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-content-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
---
# Static assets - Read-only access
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-assets-pvc
spec:
  accessModes:
  - ReadOnlyMany
  resources:
    requests:
      storage: 10Gi
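
The examples above do not cover ReadWriteOncePod, which restricts the volume to a single pod rather than a single node. It is supported only for CSI volumes, so the sketch below assumes the referenced StorageClass is backed by a CSI driver; the claim name and size are hypothetical.

```yaml
# access-mode-rwop.yaml (illustrative; names are assumptions)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-pvc
spec:
  accessModes:
  - ReadWriteOncePod
  storageClassName: fast-ssd   # assumed to be CSI-backed
  resources:
    requests:
      storage: 20Gi
```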

Storage Classes

What are Storage Classes?

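Storage Classes enable dynamic provisioning: a matching PV is created automatically for each new PVC. With volumeBindingMode: WaitForFirstConsumer, creation is delayed until a pod that uses the PVC is scheduled.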

# storage-classes.yaml
# Fast SSD storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com  # AWS EBS CSI driver; the in-tree kubernetes.io/aws-ebs plugin is deprecated
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
---
# Standard HDD storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-hdd
provisioner: ebs.csi.aws.com  # AWS EBS CSI driver; the in-tree kubernetes.io/aws-ebs plugin is deprecated
parameters:
  type: st1
reclaimPolicy: Delete
allowVolumeExpansion: true
---
# NFS shared storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-shared
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs-server.example.com
  share: /exports
reclaimPolicy: Retain

Using Storage Classes

# dynamic-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 100Gi

Default Storage Class

# Check default storage class
kubectl get storageclass

# Set default storage class
kubectl patch storageclass fast-ssd -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
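
The same default marker can be set declaratively in the StorageClass manifest. The sketch below assumes the AWS EBS CSI driver is installed; the second document is a hypothetical claim that omits storageClassName and therefore falls back to whichever class is marked as default.

```yaml
# default-storage-class.yaml (illustrative; assumes the EBS CSI driver)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
---
# No storageClassName: the cluster's default class is used
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: uses-default-class-pvc   # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

Only one class should carry the default annotation at a time, so remove it from the previous default when switching.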

Reclaim Policies

Policy   Description
Retain   Keep PV data after PVC deletion
Delete   Delete PV and underlying storage
Recycle  Basic scrub (deprecated)

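# reclaim-policy-examples.yaml
# Note: the volume source (csi, nfs, hostPath, ...) is omitted for brevity;
# a real PersistentVolume must declare exactly one.
# Production database - Retain data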
apiVersion: v1
kind: PersistentVolume
metadata:
  name: production-db-pv
spec:
  capacity:
    storage: 500Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-ssd
---
# Temporary workloads - Delete when done
apiVersion: v1
kind: PersistentVolume
metadata:
  name: temp-workspace-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: standard-hdd

StatefulSets with Volumes

For stateful applications requiring stable storage:

# statefulset-with-storage.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:14
        ports:
        - containerPort: 5432
        env:
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 100Gi
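
The serviceName field above refers to a headless Service that gives each replica a stable network identity. A minimal sketch of that Service could look like the following; the port name is an assumption.

```yaml
# postgres-headless-service.yaml (illustrative)
apiVersion: v1
kind: Service
metadata:
  name: postgres          # must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None         # headless: no virtual IP, DNS resolves to the pods
  selector:
    app: postgres
  ports:
  - name: postgres
    port: 5432
```

Each replica receives its own claim from the template, named after the template, the StatefulSet, and the ordinal (postgres-data-postgres-0, postgres-data-postgres-1, and so on). By default these claims are kept when the StatefulSet is scaled down or deleted.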

Volume Expansion

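Expand existing volumes, without downtime if the storage driver supports online expansion. Volumes can only grow, and the StorageClass must set allowVolumeExpansion: true: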

# Check if storage class supports expansion
kubectl get storageclass fast-ssd -o yaml | grep allowVolumeExpansion

# Edit PVC to expand
kubectl patch pvc database-pvc -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'

# Check expansion status
kubectl get pvc database-pvc
kubectl describe pvc database-pvc
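
If the claim is managed as a manifest rather than patched by hand, the same expansion can be expressed by raising the requested size and re-applying the file. This sketch assumes the dynamically provisioned database-pvc shown earlier.

```yaml
# dynamic-pvc.yaml (expanded; illustrative)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 200Gi   # raised from 100Gi; shrinking is not supported
```

Keeping the size in the manifest avoids a later apply resetting the request to the old value.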

Volume Snapshots

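Create point-in-time snapshots (this requires a CSI driver with snapshot support, plus the snapshot CRDs and snapshot controller installed in the cluster):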

# volume-snapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapclass
driver: ebs.csi.aws.com
deletionPolicy: Delete
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: database-snapshot
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: database-pvc
---
# Restore from snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc-restored
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 100Gi
  dataSource:
    name: database-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io

Volume Troubleshooting

Common Issues and Solutions

# Check PV status
kubectl get pv
kubectl describe pv database-pv

# Check PVC status
kubectl get pvc -n production
kubectl describe pvc database-pvc -n production

# Check if PVC is bound
kubectl get pvc database-pvc -n production -o jsonpath='{.status.phase}'

# Check storage class
kubectl get storageclass
kubectl describe storageclass fast-ssd

# Check events for volume issues
kubectl get events -n production --field-selector involvedObject.kind=PersistentVolumeClaim

# Check pod volume mounts
kubectl describe pod database-pod -n production | grep -A 10 Volumes

Debugging Volume Mount Issues

# Check if volume is mounted in pod
kubectl exec -it database-pod -n production -- df -h

# Check volume permissions
kubectl exec -it database-pod -n production -- ls -la /var/lib/postgresql/data

# Check pod events for mount errors
kubectl describe pod database-pod -n production | grep -A 5 Events
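
When the permission check shows the mounted directory is not writable by the application user, one common remedy is to set fsGroup in the pod's security context. For volume types that support ownership management, the kubelet then makes the volume group-owned by that GID at mount time. The sketch below reuses the earlier database pod; the GID 999 is an assumption about the user the postgres image runs as and should be verified for your image.

```yaml
# pod-with-fsgroup.yaml (illustrative; GID is an assumption)
apiVersion: v1
kind: Pod
metadata:
  name: database-pod
  namespace: production
spec:
  securityContext:
    fsGroup: 999    # assumed GID of the postgres process in this image
  containers:
  - name: database
    image: postgres:14
    env:
    - name: PGDATA
      value: /var/lib/postgresql/data/pgdata
    volumeMounts:
    - name: database-storage
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: database-storage
    persistentVolumeClaim:
      claimName: database-pvc
```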

Volume Commands Reference

# PersistentVolume Management
kubectl get pv
kubectl describe pv database-pv
kubectl delete pv database-pv

# PersistentVolumeClaim Management
kubectl get pvc -n production
kubectl describe pvc database-pvc -n production
kubectl delete pvc database-pvc -n production

# Storage Classes
kubectl get storageclass
kubectl describe storageclass fast-ssd

# Volume Expansion
kubectl patch pvc database-pvc -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'

# Volume Snapshots
kubectl get volumesnapshot
kubectl describe volumesnapshot database-snapshot

Best Practices

1. Use Storage Classes for Dynamic Provisioning

# Always prefer dynamic provisioning
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-pvc
spec:
  storageClassName: fast-ssd
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 50Gi

2. Set Appropriate Reclaim Policies

# Production: Retain
# Development: Delete
persistentVolumeReclaimPolicy: Retain

3. Use Volume Snapshots for Backups

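# Regular snapshots for disaster recovery
# Note: kubectl does not expand shell substitutions such as $(date +%Y%m%d)
# inside a manifest; render the dated name before applying.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: daily-backup-20250101  # placeholder; substitute the real date
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: production-db-pvc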

4. Right-size Your Volumes

# Start with appropriate size, enable expansion
spec:
  resources:
    requests:
      storage: 100Gi

Key Takeaways

  • Volumes give containers storage that survives container restarts; persistent volume types also outlive the pod
  • emptyDir provides temporary pod-level storage
  • PersistentVolumes and PersistentVolumeClaims enable durable storage
  • Storage Classes enable dynamic provisioning
  • Access Modes control how volumes can be mounted
  • Reclaim Policies determine what happens when PVCs are deleted
  • StatefulSets with volumeClaimTemplates manage stateful applications

Command Reference Cheatsheet

# PV/PVC Management
kubectl get pv
kubectl get pvc -A
kubectl describe pv database-pv
kubectl describe pvc database-pvc -n production

# Storage Classes
kubectl get storageclass
kubectl describe storageclass fast-ssd

# Volume Operations
kubectl patch pvc database-pvc -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
kubectl get volumesnapshot
kubectl describe volumesnapshot database-snapshot

# Troubleshooting
kubectl get events --field-selector involvedObject.kind=PersistentVolumeClaim
kubectl exec -it pod-name -- df -h
kubectl describe pod pod-name | grep -A 10 Volumes

Next Steps

Now that you understand Volumes and persistent storage, you’re ready to explore ConfigMaps and Secrets in the next post. We’ll learn how to manage application configuration and sensitive data effectively in Kubernetes.
