Skip to main content
โšก Calmops

Platform Engineering: Building Internal Developer Platforms

Platform engineering is the discipline of building and operating internal platforms that enable developers to deliver software faster and more reliably. It’s about creating a “paved road” that makes the right thing easy.

In this guide, we’ll explore platform engineering principles, components, and implementation.

What is Platform Engineering?

The Problem

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚              Developer Friction Examples                       โ”‚
โ”‚                                                             โ”‚
โ”‚   Before Platform Engineering:                               โ”‚
โ”‚                                                             โ”‚
โ”‚   Developer wants to deploy a service:                      โ”‚
โ”‚                                                             โ”‚
โ”‚   1. Create ticket for infra team (2 days)                 โ”‚
โ”‚   2. Wait for database provisioning (1 day)                โ”‚
โ”‚   3. Configure CI/CD pipeline (1 day)                       โ”‚
โ”‚   4. Set up monitoring (1 day)                              โ”‚
โ”‚   5. Configure alerts (1 day)                                โ”‚
โ”‚   6. Set up secrets (1 day)                                 โ”‚
โ”‚   7. ... (2 weeks total!)                                   โ”‚
โ”‚                                                             โ”‚
โ”‚   After Platform Engineering:                                โ”‚
โ”‚                                                             โ”‚
โ”‚   1. Click "Deploy Service" (่‡ชๅŠฉ)                          โ”‚
โ”‚   2. Service deployed in minutes!                           โ”‚
โ”‚                                                             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Platform Engineering Definition

platform_engineering = {
    "definition": "Building internal platforms that enable developer self-service",
    
    "goals": [
        "Reduce developer friction",
        "Standardize tooling and processes",
        "Improve developer experience",
        "Accelerate time to production",
        "Improve system reliability"
    ],
    
    "vs_traditional": {
        "traditional": "Infra team does everything",
        "platform": "Platform team builds self-service tools"
    }
}

Platform Components

Core Components

# Internal Developer Platform components

components:
  - name: "Self-service provisioning"
    description: "Deploy services, databases, caches with one click"
    tools: ["Terraform", "Crossplane", "Helmfile"]
    
  - name: "CI/CD pipelines"
    description: "Standardized build and deployment"
    tools: ["GitHub Actions", "GitLab CI", "ArgoCD"]
    
  - name: "Service catalog"
    description: "Service ownership and metadata"
    tools: ["Backstage", "Port", "ServiceNow"]
    
  - name: "Observability"
    description: "Logging, metrics, tracing"
    tools: ["Prometheus", "Grafana", "Jaeger"]
    
  - name: "Secret management"
    description: "Secure credential handling"
    tools: ["Vault", "AWS Secrets", "Sealed Secrets"]
    
  - name: "API gateway"
    description: "Traffic management"
    tools: ["Kong", "Ambassador", "Istio"]

Golden Paths

# Golden path = opinionated, supported path

golden_path = {
    "description": "Pre-configured, supported way to do something",
    
    "benefits": [
        "Reduces decision fatigue",
        "Ensures best practices",
        "Faster onboarding",
        "Easier debugging (standardized)"
    ],
    
    "vs_self_service": {
        "self_service": "Developer can do anything (complex)",
        "golden_path": "Developer guided to best path (simpler)"
    }
}

# Example: Deploying a service
golden_path_example = """
1. Use template โ†’ 2. Fill config โ†’ 3. Merge โ†’ 4. Done!
   (vs: manually create K8s, Docker, CI/CD, monitoring...)
"""

Backstage - Service Catalog

Setting Up Backstage

# Backstage installation

# 1. Create Backstage app
npx @backstage/create-app@latest

# 2. Add plugins
# app-config.yaml
app:
  title: Developer Portal

integrations:
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}

proxy:
  '/spotify':
    target: https://api.spotify.com
    changeOrigin: true

# 3. Register a service
# catalog-info.yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  description: Payment processing service
  annotations:
    github.com/project-slug: myorg/payment-service
spec:
  type: service
  lifecycle: production
  owner: platform-team
  system: payments

Custom Plugins

// Custom plugin example
import { createPlugin } from '@backstage/core';

export const myPlugin = createPlugin({
  id: 'my-plugin',
  
  routes: {
    root: '/my-plugin',
  },
  
  // Custom entity card
  entityCard: {
    element: <PaymentStatusCard />,
    if: (entity) => entity.spec?.type === 'service',
  },
});

Self-Service Provisioning

Infrastructure as Code Templates

# Terraform module for standard service

variable "service_name" {
  description = "Name of the service"
  type        = string
}

variable "team" {
  description = "Team owning the service"
  type        = string
}

variable "environment" {
  description = "Environment"
  type        = string
  default     = "production"
}

# Creates everything needed for a service
module "service" {
  source = "./modules/service"
  
  service_name = var.service_name
  team         = var.team
  environment  = var.environment
  
  # Everything pre-configured!
  # - Kubernetes namespace
  # - Database (if needed)
  # - Redis cache (if needed)
  # - S3 bucket (if needed)
  # - CI/CD pipeline
  # - Monitoring
  # - Alerts
  # - Secrets
}

GitOps for Platform

# ArgoCD application for service provisioning

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payment-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/platform-manifests
    path: services/payment-service
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: payment
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Developer Experience

Measuring DevEx

# Developer Experience Metrics

devex_metrics = {
    "deployment_frequency": {
        "description": "How often code deploys to production",
        "good": "Multiple times per day",
        "bad": "Once per month or less"
    },
    
    "lead_time": {
        "description": "Time from commit to production",
        "good": "Minutes to hours",
        "bad": "Weeks to months"
    },
    
    "mttr": {
        "description": "Mean time to recovery",
        "good": "Minutes",
        "bad": "Hours to days"
    },
    
    "change_failure_rate": {
        "description": "Percentage of failed deployments",
        "good": "< 5%",
        "bad": "> 15%"
    },
    
    "developer_satisfaction": {
        "description": "Survey score",
        "good": "> 4/5",
        "bad": "< 3/5"
    }
}

Developer Surveys

# Platform team should regularly survey developers

survey_questions:
  - "How long does it take to deploy a new service?"
  - "How easy is it to get started?"
  - "How satisfied are you with the platform?"
  - "What blocks you most?"
  - "What would you improve?"
  
frequency: "Quarterly"

actions:
  - "Review results in platform team"
  - "Create issues for improvements"
  - "Share progress with stakeholders"

Platform as a Product

Treating Platform as Product

# Platform as a Product principles

platform_as_product = {
    "product_manager": "Dedicated person for platform",
    
    "user_research": {
        "interviews": "Talk to developers regularly",
        "surveys": "Quarterly surveys",
        "analytics": "Track platform usage"
    },
    
    "roadmap": "Based on developer needs, not tech trends",
    
    "slas": "Commit to reliability and support levels",
    
    "documentation": "Treat docs as product - keep updated"
}

Internal Developer Portal

# Example portal features

portal_features:
  - name: "Service catalog"
    description: "Find all services, owners, documentation"
    
  - name: "Environments"
    description: "View environment status, deploy"
    
  - name: "Logs & metrics"
    description: "Quick access to observability"
    
  - name: "Runbooks"
    description: "Operational guidance"
    
  - name: "Cost tracking"
    description: "See cost by team/service"
    
  - name: "Dependencies"
    description: "Service dependency visualization"

Building Your Platform

Starting Simple

# Platform maturity model

maturity_stages = [
    {
        "stage": "1. Manual",
        "description": "Everything manual, ticket-based",
        "focus": "Start tracking"
    },
    
    {
        "stage": "2. Scripts",
        "description": "Team creates scripts for common tasks",
        "focus": "Identify most common requests"
    },
    
    {
        "stage": "3. Self-service",
        "description": "Developers can self-serve common tasks",
        "focus": "Build first golden paths"
    },
    
    {
        "stage": "4. Integrated",
        "description": "Platform integrated with workflows",
        "focus": "Improve developer experience"
    },
    
    {
        "stage": "5. Automated",
        "description": "Platform continuously improves itself",
        "focus": "AIOps, auto-remediation"
    }
]

Quick Wins

# Start with high impact, low effort

quick_wins:
  - name: "Service catalog"
    impact: "High"
    effort: "Medium"
    tool: "Backstage"
    
  - name: "Shared CI/CD templates"
    impact: "High"
    effort: "Low"
    tool: "GitHub Actions reusable workflows"
    
  - name: "Developer documentation portal"
    impact: "Medium"
    effort: "Low"
    tool: "Docusaurus / GitBook"
    
  - name: "Standard Kubernetes manifests"
    impact: "High"
    effort: "Medium"
    tool: "Helm charts / Kustomize"
    
  - name: "On-call rotation tool"
    impact: "Medium"
    effort: "Low"
    tool: "Opsgenie / PagerDuty"

Conclusion

Platform engineering accelerates developer productivity:

  • Self-service: Developers can provision resources
  • Golden paths: Pre-configured, supported workflows
  • Service catalog: Find and manage services
  • Treat as product: User research, roadmap, metrics

Start with quick wins and iterate based on developer feedback.


Comments