Platform Engineering works when it removes real friction for product teams. An Internal Developer Platform (IDP) should hide the repetitive infrastructure work, expose safe self-service paths, and still leave enough detail visible for engineers who need to debug production.
The Kubernetes version of that platform usually starts with Backstage, GitOps, templates, observability, and policy. The hard part is not installing the tools. The hard part is turning them into a product developers actually use.
What is an Internal Developer Platform?
An Internal Developer Platform (IDP) is a curated set of tools, services, and workflows that enable developers to build, deploy, and manage applications without requiring deep infrastructure expertise.
Traditional Approach:
┌──────────────────────────────────────────────┐
│ Developer needs to:                          │
│ - Learn Kubernetes YAML                      │
│ - Understand networking                      │
│ - Configure CI/CD pipelines                  │
│ - Set up monitoring                          │
│ - Manage infrastructure                      │
│ - Handle security policies                   │
│ ↓ Result: Low velocity, high cognitive load  │
└──────────────────────────────────────────────┘
With Internal Developer Platform:
┌──────────────────────────────────────────────┐
│ Developer uses:                              │
│ - Self-service portal                        │
│ - Template-based deployment                  │
│ - Automated CI/CD                            │
│ - Built-in observability                     │
│ - Standardized workflows                     │
│ - Integrated security                        │
│ ↓ Result: High velocity, focus on business   │
└──────────────────────────────────────────────┘
Platform Engineering vs DevOps
Platform Engineering is an evolution of DevOps that treats internal platforms as products:
| Aspect | DevOps | Platform Engineering |
|---|---|---|
| Focus | Process & Culture | Product & Experience |
| Approach | "You build it, you run it" | "We enable you to build it" |
| Responsibility | Developers own everything | Platform team enables developers |
| Abstraction | Low-level tools | High-level self-service |
| Goal | Automation | Developer experience |
| Team Structure | Cross-functional teams | Dedicated platform team |
Core IDP Components
1. Developer Portal (Backstage)
Purpose: One place for developers to find and operate services.
- Service catalog
- Documentation hub
- Software templates
- Kubernetes integration
- Plugins ecosystem
2. Self-Service Infrastructure
Purpose: Provision common infrastructure without ticket handoffs.
- Environment creation
- Database provisioning
- Secret management
- Resource quotas
3. Golden Paths
Purpose: Standardize the happy path for common service types.
- Application templates
- CI/CD pipelines
- Deployment strategies
- Best practices
4. Observability
Purpose: Make service health visible before incidents become archaeology.
- Centralized logging
- Metrics and dashboards
- Distributed tracing
- Cost visibility
5. Security and Governance
Purpose: Put guardrails in the platform instead of relying on tribal knowledge.
- RBAC and authentication
- Network policies
- Compliance checks
- Audit trails
Installing Backstage on Kubernetes
Prerequisites
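# Install Node.js (v18+; check the Backstage docs for the currently supported LTS releases)
# macOS
brew install node@18
# Backstage projects use Yarn; Corepack ships with Node.js and enables it
corepack enable
# The Backstage CLI is added to the generated app as a dependency, so no global install is needed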
Create Backstage Application
# Create new Backstage app
npx @backstage/create-app@latest
# Follow prompts:
# ? Enter a name for the app: my-platform
# ? Select database: PostgreSQL
cd my-platform
# Install dependencies
yarn install
# Run locally (development)
yarn dev
Backstage Configuration
# app-config.yaml
app:
title: My Internal Developer Platform
baseUrl: https://platform.mycompany.com
organization:
name: My Company
backend:
baseUrl: https://platform.mycompany.com
listen:
port: 7007
database:
client: pg
connection:
host: ${POSTGRES_HOST}
port: ${POSTGRES_PORT}
user: ${POSTGRES_USER}
password: ${POSTGRES_PASSWORD}
catalog:
import:
entityFilename: catalog-info.yaml
rules:
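# Template must be allowed here, or software templates cannot be registered
- allow: [Component, System, API, Resource, Location, Template]
locations:
# Static location pointing at a single catalog file (not organization discovery)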
- type: url
target: https://github.com/myorg/backstage-entities/blob/main/catalog-info.yaml
kubernetes:
serviceLocatorMethod:
type: 'multiTenant'
clusterLocatorMethods:
- type: 'config'
clusters:
- url: https://kubernetes.default.svc
name: production
authProvider: 'serviceAccount'
skipTLSVerify: false
skipMetricsLookup: false
auth:
environment: production
providers:
github:
production:
clientId: ${AUTH_GITHUB_CLIENT_ID}
clientSecret: ${AUTH_GITHUB_CLIENT_SECRET}
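The configuration above depends on `${VAR}` placeholders being set in the backend's environment. A small pre-flight check can report which ones are missing before the pod starts. This is a minimal sketch, not part of Backstage: `findMissingEnvVars` is a hypothetical helper, and it only understands the plain `${NAME}` form used in this file.

```typescript
// Hypothetical pre-flight helper: returns the names of ${VAR} placeholders
// in a config text that are unset or empty in the supplied environment.
function findMissingEnvVars(
  configText: string,
  env: Record<string, string | undefined>,
): string[] {
  const placeholder = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g;
  const missing: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = placeholder.exec(configText)) !== null) {
    const name = match[1];
    const value = env[name];
    const unset = value === undefined || value === "";
    if (unset && missing.indexOf(name) === -1) {
      missing.push(name);
    }
  }
  return missing.sort();
}

// Sample input mirroring the database section of app-config.yaml.
const sampleConfig = [
  "host: ${POSTGRES_HOST}",
  "port: ${POSTGRES_PORT}",
  "user: ${POSTGRES_USER}",
  "password: ${POSTGRES_PASSWORD}",
].join("\n");

// Only host and port are provided, so user and password are reported.
const missingVars = findMissingEnvVars(sampleConfig, {
  POSTGRES_HOST: "postgres",
  POSTGRES_PORT: "5432",
});
console.log(missingVars);
```

In a real deployment the second argument would be `process.env`, and a non-empty result would fail the startup script.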
Deploy to Kubernetes
# backstage-deployment.yaml
apiVersion: v1
kind: Namespace
metadata:
name: backstage
---
apiVersion: v1
kind: Secret
metadata:
name: postgres-secrets
namespace: backstage
type: Opaque
stringData:
POSTGRES_USER: backstage
# Placeholder only: source the real value from a secret manager and never commit it
POSTGRES_PASSWORD: changeme
---
apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: backstage
spec:
selector:
app: postgres
ports:
- port: 5432
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres
namespace: backstage
spec:
replicas: 1
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:15
env:
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgres-secrets
key: POSTGRES_USER
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secrets
key: POSTGRES_PASSWORD
- name: POSTGRES_DB
value: backstage
ports:
- containerPort: 5432
volumeMounts:
- name: postgres-storage
mountPath: /var/lib/postgresql/data
volumes:
- name: postgres-storage
persistentVolumeClaim:
claimName: postgres-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-pvc
namespace: backstage
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: backstage
namespace: backstage
spec:
replicas: 2
selector:
matchLabels:
app: backstage
template:
metadata:
labels:
app: backstage
spec:
serviceAccountName: backstage
containers:
- name: backstage
image: myregistry/backstage:latest
ports:
- containerPort: 7007
env:
- name: POSTGRES_HOST
value: postgres
- name: POSTGRES_PORT
value: "5432"
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgres-secrets
key: POSTGRES_USER
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secrets
key: POSTGRES_PASSWORD
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
name: backstage
namespace: backstage
spec:
selector:
app: backstage
ports:
- port: 80
targetPort: 7007
type: LoadBalancer
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: backstage
namespace: backstage
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: backstage-reader
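rules:
  # Read-only access to the workload resources the Kubernetes plugin displays.
  # Secrets are deliberately excluded; a "*" resource rule would expose them.
  - apiGroups: [""]
    resources: [pods, pods/log, services, configmaps, limitranges, resourcequotas]
    verbs: [get, list, watch]
  - apiGroups: [apps, batch, autoscaling, networking.k8s.io]
    resources: [deployments, replicasets, statefulsets, daemonsets, jobs, cronjobs, horizontalpodautoscalers, ingresses]
    verbs: [get, list, watch]
  - apiGroups: [metrics.k8s.io]
    resources: [pods]
    verbs: [get, list]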
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: backstage-reader
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: backstage-reader
subjects:
- kind: ServiceAccount
name: backstage
namespace: backstage
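A read-only role is still sensitive if it can read Secrets, and wildcard rules match them. As a rough sketch, a review script could flag such rules before they are applied. The `PolicyRule` shape mirrors one entry under `rules:` in a ClusterRole; `exposesSecrets` is a hypothetical helper, not a Kubernetes API.

```typescript
// Shape of one entry under `rules:` in a Role or ClusterRole.
interface PolicyRule {
  apiGroups: string[];
  resources: string[];
  verbs: string[];
}

// True when `values` contains `wanted` or the "*" wildcard.
function matches(values: string[], wanted: string): boolean {
  return values.indexOf(wanted) !== -1 || values.indexOf("*") !== -1;
}

// Hypothetical review helper: does this rule allow reading Secrets?
// Secrets live in the core API group, which is written as "".
function exposesSecrets(rule: PolicyRule): boolean {
  const canRead = ["get", "list", "watch"].some(verb => matches(rule.verbs, verb));
  return matches(rule.apiGroups, "") && matches(rule.resources, "secrets") && canRead;
}

const wildcardRule: PolicyRule = {
  apiGroups: ["*"],
  resources: ["*"],
  verbs: ["get", "list", "watch"],
};
const scopedRule: PolicyRule = {
  apiGroups: ["apps"],
  resources: ["deployments"],
  verbs: ["get", "list", "watch"],
};

console.log(exposesSecrets(wildcardRule)); // true
console.log(exposesSecrets(scopedRule)); // false
```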
Build and Push Backstage Image
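# Type-check and build the backend bundle on the host
yarn install --frozen-lockfile
yarn tsc
yarn build:backend
# Create the Dockerfile (create-app already generates a similar one at this path)
cat > packages/backend/Dockerfile <<'EOF'
FROM node:18-bullseye-slim
WORKDIR /app
ENV NODE_ENV=production
# build:backend emits skeleton.tar.gz (package.json files) and bundle.tar.gz (built code)
COPY yarn.lock package.json packages/backend/dist/skeleton.tar.gz ./
RUN tar xzf skeleton.tar.gz && rm skeleton.tar.gz
# Install production dependencies only; native modules may also need python3 and g++
RUN yarn install --frozen-lockfile --production --network-timeout 300000
# Unpack the built backend and copy the configuration
COPY packages/backend/dist/bundle.tar.gz app-config*.yaml ./
RUN tar xzf bundle.tar.gz && rm bundle.tar.gz
CMD ["node", "packages/backend", "--config", "app-config.yaml"]
EOF
# Build and push (the build context is the repository root)
docker build -t myregistry/backstage:latest -f packages/backend/Dockerfile .
docker push myregistry/backstage:latest
# Deploy to Kubernetes
kubectl apply -f backstage-deployment.yaml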
Software Templates (Golden Paths)
Creating a Service Template
# templates/nodejs-service/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
name: nodejs-service
title: Node.js Microservice
description: Create a new Node.js microservice with CI/CD
tags:
- nodejs
- recommended
spec:
owner: platform-team
type: service
parameters:
- title: Service Information
required:
- name
- description
- owner
properties:
name:
title: Name
type: string
description: Unique name for the service
pattern: '^[a-z0-9-]+$'
description:
title: Description
type: string
description: What does this service do?
owner:
title: Owner
type: string
description: Team or person responsible
ui:field: OwnerPicker
ui:options:
allowedKinds:
- Group
- User
- title: Configuration
properties:
database:
title: Database
type: string
enum:
- postgres
- mysql
- mongodb
- none
default: postgres
cache:
title: Cache
type: boolean
description: Enable Redis cache
default: true
steps:
- id: fetch-base
name: Fetch Base
action: fetch:template
input:
url: ./skeleton
values:
name: ${{ parameters.name }}
description: ${{ parameters.description }}
owner: ${{ parameters.owner }}
database: ${{ parameters.database }}
cache: ${{ parameters.cache }}
- id: publish
name: Publish to GitHub
action: publish:github
input:
allowedHosts: ['github.com']
description: ${{ parameters.description }}
repoUrl: github.com?owner=myorg&repo=${{ parameters.name }}
defaultBranch: main
repoVisibility: private
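- id: create-argocd-app
name: Create ArgoCD Application
# Action name and inputs follow the Roadie ArgoCD scaffolder module
# (@roadiehq/scaffolder-backend-argocd); verify them against the version you install.
action: argocd:create-resources
input:
appName: ${{ parameters.name }}
argoInstance: production
namespace: ${{ parameters.name }}
repoUrl: https://github.com/myorg/${{ parameters.name }}
path: kubernetes/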
- id: register
name: Register Component
action: catalog:register
input:
repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
catalogInfoPath: '/catalog-info.yaml'
output:
links:
- title: Repository
url: ${{ steps.publish.output.remoteUrl }}
- title: Open in catalog
icon: catalog
entityRef: ${{ steps.register.output.entityRef }}
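The template validates `name` with a pattern and then reuses it in the `repoUrl` of the publish step. The sketch below shows that relationship outside the scaffolder, which can be handy when testing naming rules. `buildRepoUrl` is a hypothetical helper, and `myorg` is the example organization from the template.

```typescript
// Same pattern as the `name` parameter in template.yaml.
const SERVICE_NAME_PATTERN = /^[a-z0-9-]+$/;

// Builds the repoUrl string in the form the publish step uses.
// Throws when the name would be rejected by the template form.
function buildRepoUrl(name: string, owner: string = "myorg"): string {
  if (!SERVICE_NAME_PATTERN.test(name)) {
    throw new Error(`Invalid service name: ${name}`);
  }
  return `github.com?owner=${owner}&repo=${name}`;
}

// Returns true when the name passes, false when buildRepoUrl rejects it.
function isValidServiceName(name: string): boolean {
  try {
    buildRepoUrl(name);
    return true;
  } catch {
    return false;
  }
}

console.log(buildRepoUrl("checkout-service"));
console.log(isValidServiceName("Checkout_Service")); // false: uppercase and underscore
```

Note that the pattern also accepts names that start or end with a hyphen; a stricter pattern may be worth considering if those names end up as Kubernetes namespaces.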
Template Skeleton Structure
templates/nodejs-service/skeleton/
├── catalog-info.yaml
├── package.json
├── src/
│ └── index.js
├── Dockerfile
├── kubernetes/
│ ├── deployment.yaml
│ ├── service.yaml
│ └── ingress.yaml
├── .github/
│ └── workflows/
│ └── ci.yaml
└── README.md
Catalog Info Template
# templates/nodejs-service/skeleton/catalog-info.yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
name: ${{ values.name }}
description: ${{ values.description }}
annotations:
github.com/project-slug: myorg/${{ values.name }}
backstage.io/kubernetes-id: ${{ values.name }}
argocd/app-name: ${{ values.name }}
tags:
- nodejs
- microservice
links:
- url: https://github.com/myorg/${{ values.name }}
title: Repository
icon: github
spec:
type: service
lifecycle: production
owner: ${{ values.owner }}
system: platform
dependsOn:
{%- if values.database != 'none' %}
- resource:${{ values.database }}
{%- endif %}
{%- if values.cache %}
- resource:redis
{%- endif %}
providesApis:
- ${{ values.name }}-api
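The conditionals in the `dependsOn` section decide which resources the generated component declares. Expressed as a plain function, the same logic looks roughly like this. `SkeletonValues` and `renderDependsOn` are illustrative names, not Backstage APIs.

```typescript
// The two template values that drive the dependsOn section.
interface SkeletonValues {
  database: "postgres" | "mysql" | "mongodb" | "none";
  cache: boolean;
}

// Mirrors the {% if %} blocks in catalog-info.yaml.
function renderDependsOn(values: SkeletonValues): string[] {
  const dependencies: string[] = [];
  if (values.database !== "none") {
    dependencies.push(`resource:${values.database}`);
  }
  if (values.cache) {
    dependencies.push("resource:redis");
  }
  return dependencies;
}

console.log(renderDependsOn({ database: "postgres", cache: true }));
console.log(renderDependsOn({ database: "none", cache: false }));
```

The second call returns an empty list. In the rendered YAML that case leaves a bare `dependsOn:` key, so it may be worth wrapping the whole key in a conditional if the catalog rejects it.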
Kubernetes Plugin Integration
Installing Kubernetes Plugin
# Install plugin
yarn add --cwd packages/app @backstage/plugin-kubernetes
# Install backend plugin
yarn add --cwd packages/backend @backstage/plugin-kubernetes-backend
Backend Configuration
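// packages/backend/src/plugins/kubernetes.ts
// This wiring targets the legacy backend system. On the new backend system the
// plugin is registered with backend.add(import('@backstage/plugin-kubernetes-backend')).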
import { KubernetesBuilder } from '@backstage/plugin-kubernetes-backend';
import { Router } from 'express';
import { PluginEnvironment } from '../types';
import { CatalogClient } from '@backstage/catalog-client';
export default async function createPlugin(
env: PluginEnvironment,
): Promise<Router> {
const catalogApi = new CatalogClient({ discoveryApi: env.discovery });
const { router } = await KubernetesBuilder.createBuilder({
logger: env.logger,
config: env.config,
catalogApi,
permissions: env.permissions,
}).build();
return router;
}
Frontend Integration
// packages/app/src/components/catalog/EntityPage.tsx
import { EntityKubernetesContent } from '@backstage/plugin-kubernetes';
const serviceEntityPage = (
<EntityLayout>
<EntityLayout.Route path="/kubernetes" title="Kubernetes">
<EntityKubernetesContent refreshIntervalMs={30000} />
</EntityLayout.Route>
</EntityLayout>
);
Self-Service Database Provisioning
Database Operator Template
# templates/database/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
name: postgresql-database
title: PostgreSQL Database
description: Provision a new PostgreSQL database
spec:
owner: platform-team
type: resource
parameters:
- title: Database Configuration
required:
- name
- owner
properties:
name:
title: Database Name
type: string
pattern: '^[a-z0-9-]+$'
owner:
title: Owner
type: string
ui:field: OwnerPicker
size:
title: Storage Size
type: string
enum:
- 10Gi
- 50Gi
- 100Gi
default: 10Gi
backup:
title: Enable Backups
type: boolean
default: true
steps:
- id: create-postgres
name: Create PostgreSQL Instance
action: kubernetes:apply
input:
namespaced: true
namespace: databases
manifest: |
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: ${{ parameters.name }}
spec:
instances: 3
storage:
size: ${{ parameters.size }}
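backup:
# CloudNativePG has no `enabled` field under backup; an object store
# (barmanObjectStore) must be configured for the retention policy to apply.
retentionPolicy: "30d"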
- id: create-secret
name: Create Database Credentials Secret
action: kubernetes:apply
input:
manifest: |
apiVersion: v1
kind: Secret
metadata:
name: ${{ parameters.name }}-credentials
namespace: databases
stringData:
database: ${{ parameters.name }}
username: app
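# generatePassword() is not a built-in scaffolder function; it has to be
# registered as a custom template global. CloudNativePG also creates a
# "<cluster name>-app" Secret with generated credentials on its own.
password: ${{ generatePassword() }}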
- id: register
name: Register Resource
action: catalog:register
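input:
# The built-in catalog:register action expects catalogInfoUrl, or repoContentsUrl
# plus catalogInfoPath; inline content as shown here needs a custom action.
catalogInfoContent: |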
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
name: ${{ parameters.name }}
description: PostgreSQL database
annotations:
backstage.io/kubernetes-id: ${{ parameters.name }}
spec:
type: database
owner: ${{ parameters.owner }}
system: platform
CI/CD Integration with ArgoCD
ArgoCD Plugin for Backstage
# Install ArgoCD plugin
yarn add --cwd packages/app @roadiehq/backstage-plugin-argo-cd
yarn add --cwd packages/backend @roadiehq/backstage-plugin-argo-cd-backend
ArgoCD Backend Setup
// packages/backend/src/plugins/argocd.ts
import { createRouter } from '@roadiehq/backstage-plugin-argo-cd-backend';
import { Router } from 'express';
import { PluginEnvironment } from '../types';
export default async function createPlugin(
env: PluginEnvironment,
): Promise<Router> {
return await createRouter({
logger: env.logger,
config: env.config,
});
}
Configuration
# app-config.yaml
argocd:
baseUrl: https://argocd.mycompany.com
username: ${ARGOCD_USERNAME}
password: ${ARGOCD_PASSWORD}
appLocatorMethods:
- type: 'config'
instances:
- name: production
url: https://argocd.mycompany.com
token: ${ARGOCD_TOKEN}
Cost Visibility with Kubecost
Kubecost Integration
# kubecost-values.yaml
global:
prometheus:
enabled: false
fqdn: http://prometheus-server.prometheus.svc:80
kubecostProductConfigs:
clusterName: production
ingress:
enabled: true
hosts:
- cost.mycompany.com
# Install Kubecost
helm install kubecost cost-analyzer \
--repo https://kubecost.github.io/cost-analyzer/ \
--namespace kubecost \
--create-namespace \
-f kubecost-values.yaml
Kubecost Plugin for Backstage
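// Custom plugin for cost visibility
import React from 'react';
import { InfoCard } from '@backstage/core-components';
import { useEntity } from '@backstage/plugin-catalog-react';

export const CostWidget = () => {
  const { entity } = useEntity();
  const namespace =
    entity.metadata.annotations?.['backstage.io/kubernetes-namespace'];
  // Without the annotation there is no namespace to filter on.
  if (!namespace) {
    return <InfoCard title="Monthly Cost">No Kubernetes namespace annotation found.</InfoCard>;
  }
  return (
    <InfoCard title="Monthly Cost">
      <iframe
        title="Kubecost allocation"
        src={`https://cost.mycompany.com/allocation?namespace=${encodeURIComponent(namespace)}`}
        width="100%"
        height="400px"
      />
    </InfoCard>
  );
};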
Developer Portal Features
Service Catalog
# catalog-info.yaml - System
apiVersion: backstage.io/v1alpha1
kind: System
metadata:
name: ecommerce
description: E-commerce platform
spec:
owner: platform-team
---
# catalog-info.yaml - Component
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
name: checkout-service
description: Handles checkout process
annotations:
github.com/project-slug: myorg/checkout-service
backstage.io/kubernetes-id: checkout-service
backstage.io/techdocs-ref: dir:.
spec:
type: service
lifecycle: production
owner: checkout-team
system: ecommerce
dependsOn:
- resource:postgres-checkout
- resource:redis-cache
providesApis:
- checkout-api
---
# catalog-info.yaml - API
apiVersion: backstage.io/v1alpha1
kind: API
metadata:
name: checkout-api
description: Checkout API
spec:
type: openapi
lifecycle: production
owner: checkout-team
system: ecommerce
definition:
$text: https://github.com/myorg/checkout-service/blob/main/openapi.yaml
TechDocs Integration
# Install TechDocs
yarn add --cwd packages/app @backstage/plugin-techdocs
yarn add --cwd packages/backend @backstage/plugin-techdocs-backend
# mkdocs.yml
site_name: 'Checkout Service Documentation'
nav:
- Home: index.md
- Architecture: architecture.md
- API Reference: api.md
- Runbooks: runbooks.md
plugins:
- techdocs-core
Platform Metrics and Monitoring
DORA Metrics Dashboard
# grafana-dashboard-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: platform-dashboard
namespace: monitoring
data:
platform-metrics.json: |
    {
      "dashboard": {
        "title": "Platform Health",
        "panels": [
          {
            "title": "Deployment Frequency (syncs per day)",
            "targets": [{
              "expr": "sum(increase(argocd_app_sync_total[1d]))"
            }]
          },
          {
            "title": "Lead Time for Changes (proxy: sync duration)",
            "targets": [{
              "expr": "argocd_app_sync_duration_seconds"
            }]
          },
          {
            "title": "Mean Time to Recovery (proxy: average pod age)",
            "targets": [{
              "expr": "avg(time() - kube_pod_created)"
            }]
          },
          {
            "title": "Change Failure Rate",
            "targets": [{
              "expr": "sum(increase(argocd_app_sync_total{phase=~\"Error|Failed\"}[1d])) / sum(increase(argocd_app_sync_total[1d]))"
            }]
          }
        ]
      }
    }
DORA Metrics Tracking
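// Custom backend plugin for DORA metrics.
// The client interfaces are placeholders for whatever ArgoCD and
// incident-tracker clients the platform already has.
interface Deployment { status: string; committedAt: Date; deployedAt: Date }
interface Incident { openedAt: Date; resolvedAt: Date }
interface ArgoCDClient { getDeployments(days: number): Promise<Deployment[]> }
interface IncidentClient { getResolved(days: number): Promise<Incident[]> }

const average = (values: number[]): number =>
  values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;

export class DORAMetricsCollector {
  constructor(
    private readonly argocd: ArgoCDClient,
    private readonly incidents: IncidentClient,
  ) {}
  // Deployments per day over the window
  async getDeploymentFrequency(days: number): Promise<number> {
    const deployments = await this.argocd.getDeployments(days);
    return days > 0 ? deployments.length / days : 0;
  }
  // Mean commit-to-production time in milliseconds
  async getLeadTimeForChanges(days: number): Promise<number> {
    const deployments = await this.argocd.getDeployments(days);
    return average(deployments.map(d => d.deployedAt.getTime() - d.committedAt.getTime()));
  }
  // Mean incident duration in milliseconds
  async getMTTR(days: number): Promise<number> {
    const incidents = await this.incidents.getResolved(days);
    return average(incidents.map(i => i.resolvedAt.getTime() - i.openedAt.getTime()));
  }
  // Percentage of deployments that failed (0 when there were none)
  async getChangeFailureRate(days: number): Promise<number> {
    const deployments = await this.argocd.getDeployments(days);
    const failed = deployments.filter(d => d.status === 'Failed');
    return deployments.length === 0 ? 0 : (failed.length / deployments.length) * 100;
  }
}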
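A worked example makes the arithmetic behind these metrics concrete. The sketch below uses four made-up deployment records and plain functions rather than real ArgoCD or GitHub clients; the record shape and function names are assumptions for illustration.

```typescript
// One production deployment, with the timestamps the metrics need.
interface DeploymentRecord {
  committedAt: Date; // when the change was committed
  deployedAt: Date; // when it reached production
  failed: boolean; // whether the deployment failed or was rolled back
}

const HOUR_MS = 60 * 60 * 1000;

// Deployments per day over a window of `days`.
function deploymentFrequency(records: DeploymentRecord[], days: number): number {
  return days > 0 ? records.length / days : 0;
}

// Mean commit-to-production time in hours.
function leadTimeHours(records: DeploymentRecord[]): number {
  if (records.length === 0) return 0;
  const totalMs = records.reduce(
    (sum, r) => sum + (r.deployedAt.getTime() - r.committedAt.getTime()),
    0,
  );
  return totalMs / records.length / HOUR_MS;
}

// Share of deployments that failed, as a percentage.
function changeFailureRate(records: DeploymentRecord[]): number {
  if (records.length === 0) return 0;
  const failedCount = records.filter(r => r.failed).length;
  return (failedCount / records.length) * 100;
}

// Illustrative data: lead times of 2, 4, 3, and 3 hours; one failure.
const sampleDeployments: DeploymentRecord[] = [
  { committedAt: new Date("2024-01-01T08:00:00Z"), deployedAt: new Date("2024-01-01T10:00:00Z"), failed: false },
  { committedAt: new Date("2024-01-01T12:00:00Z"), deployedAt: new Date("2024-01-01T16:00:00Z"), failed: true },
  { committedAt: new Date("2024-01-02T09:00:00Z"), deployedAt: new Date("2024-01-02T12:00:00Z"), failed: false },
  { committedAt: new Date("2024-01-02T13:00:00Z"), deployedAt: new Date("2024-01-02T16:00:00Z"), failed: false },
];

console.log(deploymentFrequency(sampleDeployments, 2)); // 2 deployments per day
console.log(leadTimeHours(sampleDeployments)); // 3 hours
console.log(changeFailureRate(sampleDeployments)); // 25 percent
```

The empty-input guards matter in practice: a new service with no deployments yet should report zero rather than `NaN`.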
Security and Compliance
Policy Enforcement with OPA
# opa-policy.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: platform-policies
namespace: opa
data:
  policy.rego: |
    package platform.admission

    import future.keywords.contains
    import future.keywords.if

    # With the `if` keyword, set-building rules are written `deny contains msg`.

    # Pods must declare a pod-level securityContext.
    deny contains msg if {
      input.request.kind.kind == "Deployment"
      not input.request.object.spec.template.spec.securityContext
      msg := "Deployments must define securityContext"
    }

    # Every container must set resource limits.
    deny contains msg if {
      input.request.kind.kind == "Deployment"
      container := input.request.object.spec.template.spec.containers[_]
      not container.resources.limits
      msg := sprintf("Container %v must define resource limits", [container.name])
    }

    # Images must come from the approved registry.
    deny contains msg if {
      input.request.kind.kind == "Deployment"
      container := input.request.object.spec.template.spec.containers[_]
      container.image
      not startswith(container.image, "myregistry.com/")
      msg := sprintf("Container %v uses unauthorized registry", [container.name])
    }
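To make the intent of the three rules easier to test and discuss, the same checks can be mirrored in a few lines of TypeScript. This is only an illustration of what the policy evaluates, not a replacement for OPA; the `DeploymentManifest` type covers just the fields the rules read, and `policyViolations` is a hypothetical name.

```typescript
// Only the fields the policy inspects.
interface ContainerSpec {
  name: string;
  image?: string;
  resources?: { limits?: Record<string, string> };
}

interface DeploymentManifest {
  kind: string;
  spec: {
    template: {
      spec: { securityContext?: object; containers: ContainerSpec[] };
    };
  };
}

// Returns the same messages the Rego rules would produce for a Deployment.
function policyViolations(
  manifest: DeploymentManifest,
  registryPrefix: string = "myregistry.com/",
): string[] {
  if (manifest.kind !== "Deployment") return [];
  const messages: string[] = [];
  const podSpec = manifest.spec.template.spec;
  if (podSpec.securityContext === undefined) {
    messages.push("Deployments must define securityContext");
  }
  for (const container of podSpec.containers) {
    if (container.resources === undefined || container.resources.limits === undefined) {
      messages.push(`Container ${container.name} must define resource limits`);
    }
    if (container.image !== undefined && container.image.indexOf(registryPrefix) !== 0) {
      messages.push(`Container ${container.name} uses unauthorized registry`);
    }
  }
  return messages;
}

const nonCompliant: DeploymentManifest = {
  kind: "Deployment",
  spec: { template: { spec: { containers: [{ name: "web", image: "docker.io/nginx:1.25" }] } } },
};

console.log(policyViolations(nonCompliant)); // three messages, one per rule
```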
Automated Security Scanning
# .github/workflows/security-scan.yaml
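name: Security Scan
on: [push]
permissions:
  contents: read
  security-events: write  # required to upload SARIF results
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumes an earlier job has already built and pushed this image tag.
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master  # pin to a release tag or commit SHA in real pipelines
        with:
          image-ref: 'myregistry.com/${{ github.repository }}:${{ github.sha }}'
          format: 'sarif'
          output: 'trivy-results.sarif'
      - name: Upload to Security Tab
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'trivy-results.sarif'
      # Assumes conftest is installed on the runner. conftest evaluates raw manifests,
      # so these policies must match on `input.kind`, not the AdmissionReview wrapper.
      - name: Policy Check
        run: |
          conftest test kubernetes/ --policy opa-policies/ --all-namespaces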
Practical Rules for IDP Success
1. Start with User Research
Developer Survey Questions
1. What are your biggest pain points in deploying applications?
2. How much time do you spend on infrastructure tasks?
3. What would make your development workflow better?
4. What documentation is missing or unclear?
5. What repetitive tasks would you like automated?
2. Establish Golden Paths
Golden Path Characteristics:
- Opinionated but flexible
- Well-documented
- Automated end-to-end
- Secure by default
- Observable out-of-the-box
- Cost-optimized
3. Measure Platform Adoption
# Platform metrics to track
metrics:
adoption:
- services_using_templates
- self_service_provisioning_rate
- documentation_usage
efficiency:
- time_to_first_deployment
- deployment_frequency
- pr_to_production_time
satisfaction:
- nps_score
- support_ticket_volume
- developer_survey_results
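Of the satisfaction metrics, `nps_score` is the one with a fixed formula: the percentage of promoters (scores 9 to 10) minus the percentage of detractors (scores 0 to 6). A minimal sketch, with made-up survey responses:

```typescript
// Net Promoter Score from 0-10 survey answers, rounded to a whole number.
// Promoters score 9 or 10, detractors score 0 through 6.
function netPromoterScore(scores: number[]): number {
  if (scores.length === 0) return 0;
  const promoters = scores.filter(score => score >= 9).length;
  const detractors = scores.filter(score => score <= 6).length;
  return Math.round(((promoters - detractors) / scores.length) * 100);
}

// Illustrative responses: 6 promoters, 2 passives, 2 detractors.
const surveyResponses = [10, 9, 9, 8, 7, 6, 10, 9, 3, 10];

console.log(netPromoterScore(surveyResponses)); // 40
```

With small teams a handful of responses can swing the score widely, so the trend across quarters is usually more informative than any single value.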
4. Treat Platform as a Product
- Assign a product manager
- Regular user feedback sessions
- Roadmap planning
- Feature prioritization
- Marketing and communication
- Training and onboarding
Pitfalls to Avoid
Building Everything In-House
Problem: Reinventing the wheel.
Solution: Use existing tools such as Backstage and ArgoCD where they fit.
Too Much Abstraction
Problem: Hiding too much complexity.
Solution: Provide escape hatches for advanced users.
Lack of Documentation
Problem: Low adoption due to unclear usage.
Solution: Keep docs close to templates, ownership, and runbooks.
Ignoring Developer Feedback
Problem: Building features nobody wants.
Solution: Run regular feedback loops and user research.
No Metrics
Problem: Can’t prove platform value.
Solution: Track DORA metrics, adoption, and satisfaction.
Measuring Platform Success
Key Performance Indicators
| Metric | Target | Measurement |
|---|---|---|
| Time to First Deploy | < 1 day | Onboarding to production |
| Deployment Frequency | Multiple/day | GitOps sync rate |
| MTTR | < 1 hour | Incident to resolution |
| Platform Adoption | > 80% | Services on platform |
| Developer Satisfaction | NPS > 50 | Quarterly surveys |
| Self-Service Rate | > 90% | Automated vs manual |
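The numeric targets in the table can be checked mechanically once the measurements exist. The sketch below compares made-up quarterly values against those targets; the `Kpi` shape and helper names are assumptions for illustration, and "Deployment Frequency" is left out because its target is not a single number.

```typescript
// One row of the KPI table: a measured value compared against a target.
interface Kpi {
  name: string;
  value: number;
  target: number;
  comparison: "less-than" | "greater-than";
}

function meetsTarget(kpi: Kpi): boolean {
  return kpi.comparison === "less-than"
    ? kpi.value < kpi.target
    : kpi.value > kpi.target;
}

// Names of the KPIs that miss their target.
function missedKpis(kpis: Kpi[]): string[] {
  return kpis.filter(kpi => !meetsTarget(kpi)).map(kpi => kpi.name);
}

// Made-up measurements; the targets come from the table above.
const quarterlyKpis: Kpi[] = [
  { name: "Time to First Deploy (days)", value: 0.5, target: 1, comparison: "less-than" },
  { name: "MTTR (hours)", value: 1.5, target: 1, comparison: "less-than" },
  { name: "Platform Adoption (%)", value: 85, target: 80, comparison: "greater-than" },
  { name: "Developer Satisfaction (NPS)", value: 42, target: 50, comparison: "greater-than" },
  { name: "Self-Service Rate (%)", value: 93, target: 90, comparison: "greater-than" },
];

// MTTR and Developer Satisfaction miss their targets in this sample.
console.log(missedKpis(quarterlyKpis));
```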
What to Keep
- Platform Engineering treats internal platforms as products
- Backstage provides a foundation for developer portals
- Golden Paths standardize common workflows
- Self-Service reduces toil and increases velocity
- Templates enable consistent, secure deployments
- Observability is built in from day one
- Metrics prove platform value and guide improvements
- Developer Experience is the primary focus, but production clarity still matters
Start with one painful workflow, one team, and one measurable improvement. A useful platform grows from repeated adoption, not from a big-bang portal launch.