Introduction
Kubernetes was designed to automate the deployment and management of containerized applications. However, many complex workloads require domain-specific knowledge that the Kubernetes API alone cannot provide. This is where Operators come in.
Operators are Kubernetes extensions that encode operational knowledge into software, automating the entire lifecycle of complex stateful applications. The pattern suits applications whose upgrades, backups, or failover need steps that the built-in Kubernetes controllers do not know how to perform.
This guide covers the Operator pattern, how to build Operators, and practical examples for automating complex workloads.
Understanding the Operator Pattern
The Operator pattern extends Kubernetes with custom controllers that understand application-specific requirements.
Core Concepts
Custom Resource Definition (CRD): Extends the Kubernetes API with custom resource types.
Controller: Continuously watches resources and reconciles their actual state toward the desired state.
Operator: A combination of CRDs and controllers that encodes domain knowledge.
How Operators Work
┌─────────────────────────────────────────────────────────┐
│                    Operator Pattern                     │
│                                                         │
│  ┌──────────────┐       ┌──────────────────────────┐    │
│  │   Custom     │──────►│   Operator Controller    │    │
│  │   Resource   │       │  (Reconciliation Loop)   │    │
│  │   (YAML)     │       └────────────┬─────────────┘    │
│  └──────────────┘                    │                  │
│                                      ▼                  │
│                                 ┌──────────┐            │
│                                 │  Deploy  │            │
│                                 │  Manage  │            │
│                                 │  Monitor │            │
│                                 └──────────┘            │
└─────────────────────────────────────────────────────────┘
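The flow described above can be made concrete with a small sketch. The Go program below is a simplified, standard-library-only model of a reconciliation loop, not the controller-runtime API: `State` and `Reconcile` are hypothetical names introduced for illustration. It shows the central idea of the pattern, namely that the controller derives its actions from the difference between desired and observed state rather than from the event that triggered it.

```go
package main

import "fmt"

// State is a simplified stand-in for what a controller compares:
// the replica count and image of a workload. (Hypothetical type.)
type State struct {
	Replicas int
	Image    string
}

// Reconcile compares desired and observed state and returns the actions
// needed to converge. The result depends only on the two states, so
// calling it again after convergence yields no actions.
func Reconcile(desired State, observed *State) []string {
	if observed == nil {
		return []string{fmt.Sprintf("create workload with %d replicas of %s",
			desired.Replicas, desired.Image)}
	}
	var actions []string
	if observed.Replicas != desired.Replicas {
		actions = append(actions, fmt.Sprintf("scale from %d to %d replicas",
			observed.Replicas, desired.Replicas))
	}
	if observed.Image != desired.Image {
		actions = append(actions, fmt.Sprintf("roll image from %s to %s",
			observed.Image, desired.Image))
	}
	return actions
}

func main() {
	desired := State{Replicas: 3, Image: "redis:7-alpine"}

	// Nothing exists yet: one create action.
	fmt.Println(Reconcile(desired, nil))

	// The workload drifted: one scale action.
	fmt.Println(Reconcile(desired, &State{Replicas: 1, Image: "redis:7-alpine"}))

	// Already converged: no actions.
	fmt.Println(len(Reconcile(desired, &desired)))
}
```

A real controller would replace the returned strings with API calls, but the comparison step keeps the same shape.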
Common Operators
| Operator | Purpose |
|---|---|
| Prometheus Operator | Monitoring stack |
| cert-manager | TLS certificates |
| ExternalDNS | DNS management |
| Vault Operator | Secret management |
| MySQL Operator | Database management |
| Kafka Operator | Event streaming |
Building an Operator
Using Operator SDK
# Install Operator SDK
brew install operator-sdk
# Initialize operator project
operator-sdk init --domain example.com --project-name my-operator
# Create API (CRD)
operator-sdk create api --group cache --version v1alpha1 --kind Redis
# Generate manifests
make manifests
# Build operator
make docker-build IMG=my-operator:latest
# Deploy to cluster (pass the same image that was built)
make deploy IMG=my-operator:latest
Define Custom Resource
# config/crd/bases/cache.example.com_redis.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: redis.cache.example.com
spec:
  group: cache.example.com
  names:
    kind: Redis
    listKind: RedisList
    plural: redis
    singular: redis
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
      # Enable the /status subresource so the controller can update status on its own
      subresources:
        status: {}
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 10
                image:
                  type: string
                storage:
                  type: string
            status:
              type: object
              properties:
                readyReplicas:
                  type: integer
                conditions:
                  type: array
                  # A structural schema requires an item type for arrays
                  items:
                    type: object
                    x-kubernetes-preserve-unknown-fields: true
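In an Operator SDK project this CRD is normally generated from Go types by `make manifests` rather than written by hand. The sketch below shows roughly what those types could look like, trimmed to the standard library so it runs on its own. It is an assumption about the scaffolded `api/v1alpha1` package, not a copy of it: the real `Redis` type embeds `metav1.TypeMeta` and `metav1.ObjectMeta`, and `conditions` would typically be a `[]metav1.Condition`. `ParseRedis` is a hypothetical helper added only to show the JSON mapping.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RedisSpec mirrors the spec section of the CRD schema. The kubebuilder
// markers are plain comments here; controller-gen reads them when it
// generates the minimum/maximum constraints in the CRD.
type RedisSpec struct {
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:validation:Maximum=10
	Replicas *int32 `json:"replicas,omitempty"`
	Image    string `json:"image,omitempty"`
	Storage  string `json:"storage,omitempty"`
}

// RedisStatus mirrors the status section of the CRD schema
// (conditions are omitted in this trimmed sketch).
type RedisStatus struct {
	ReadyReplicas int32 `json:"readyReplicas,omitempty"`
}

// Redis is trimmed for illustration: only the fields needed to decode
// a manifest are kept.
type Redis struct {
	APIVersion string      `json:"apiVersion"`
	Kind       string      `json:"kind"`
	Spec       RedisSpec   `json:"spec,omitempty"`
	Status     RedisStatus `json:"status,omitempty"`
}

// ParseRedis decodes a JSON manifest into the typed struct.
func ParseRedis(manifest []byte) (*Redis, error) {
	r := &Redis{}
	if err := json.Unmarshal(manifest, r); err != nil {
		return nil, err
	}
	return r, nil
}

func main() {
	manifest := []byte(`{
		"apiVersion": "cache.example.com/v1alpha1",
		"kind": "Redis",
		"spec": {"replicas": 3, "image": "redis:7-alpine", "storage": "10Gi"}
	}`)
	r, err := ParseRedis(manifest)
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Kind, *r.Spec.Replicas, r.Spec.Image, r.Spec.Storage)
}
```

`Replicas` is a pointer so that an omitted field can be told apart from an explicit zero, which is also the form the controller code in this guide dereferences.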
Implement Controller Logic
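// controllers/redis_controller.go
package controllers

import (
	"context"

	"github.com/go-logr/logr"
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/util/intstr"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	cachev1alpha1 "my-operator/api/v1alpha1"
)

// RedisReconciler reconciles a Redis object
type RedisReconciler struct {
	client.Client
	Log    logr.Logger
	Scheme *runtime.Scheme
}

// +kubebuilder:rbac:groups=cache.example.com,resources=redis,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=cache.example.com,resources=redis/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups="",resources=pods;services;configmaps;events,verbs="*"

func (r *RedisReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := r.Log.WithValues("redis", req.NamespacedName)

	// Fetch the Redis instance; NotFound means it was deleted, which is not an error
	redis := &cachev1alpha1.Redis{}
	if err := r.Get(ctx, req.NamespacedName, redis); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Check if being deleted. The Deployment and Service carry an owner
	// reference to this object, so Kubernetes garbage-collects them.
	if !redis.DeletionTimestamp.IsZero() {
		return ctrl.Result{}, nil
	}

	// Reconcile the Redis deployment
	if err := r.reconcileRedis(ctx, redis); err != nil {
		log.Error(err, "Failed to reconcile Redis")
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}

func (r *RedisReconciler) reconcileRedis(ctx context.Context, redis *cachev1alpha1.Redis) error {
	desired := r.newDeployment(redis)

	// Create or update Deployment. CreateOrUpdate makes this step safe to
	// repeat: a plain Create would fail with AlreadyExists on the second pass.
	deployment := &appsv1.Deployment{ObjectMeta: desired.ObjectMeta}
	_, err := controllerutil.CreateOrUpdate(ctx, r.Client, deployment, func() error {
		if deployment.CreationTimestamp.IsZero() {
			// New object: use the full desired spec
			deployment.Spec = desired.Spec
		} else {
			// Existing object: change only the fields this operator manages
			deployment.Spec.Replicas = desired.Spec.Replicas
			deployment.Spec.Template.Spec.Containers[0].Image = redis.Spec.Image
		}
		return ctrl.SetControllerReference(redis, deployment, r.Scheme)
	})
	if err != nil {
		return err
	}

	// Create or update Service
	service := &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: redis.Name, Namespace: redis.Namespace},
	}
	_, err = controllerutil.CreateOrUpdate(ctx, r.Client, service, func() error {
		service.Spec.Selector = map[string]string{"app": redis.Name}
		service.Spec.Ports = []corev1.ServicePort{{
			Port:       6379,
			Protocol:   corev1.ProtocolTCP,
			TargetPort: intstr.FromInt(6379),
		}}
		return ctrl.SetControllerReference(redis, service, r.Scheme)
	})
	if err != nil {
		return err
	}

	// Update status with the replicas that are actually ready,
	// not the number that was requested
	if redis.Status.ReadyReplicas != deployment.Status.ReadyReplicas {
		redis.Status.ReadyReplicas = deployment.Status.ReadyReplicas
		return r.Status().Update(ctx, redis)
	}
	return nil
}

func (r *RedisReconciler) newDeployment(redis *cachev1alpha1.Redis) *appsv1.Deployment {
	labels := map[string]string{"app": redis.Name}
	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      redis.Name,
			Namespace: redis.Namespace,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: redis.Spec.Replicas,
			Selector: &metav1.LabelSelector{
				MatchLabels: labels,
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: labels,
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{
						{
							Name:  "redis",
							Image: redis.Spec.Image,
							Ports: []corev1.ContainerPort{
								{ContainerPort: 6379},
							},
						},
					},
				},
			},
		},
	}
}

func (r *RedisReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&cachev1alpha1.Redis{}).
		// Reconcile again when an owned Deployment or Service changes
		Owns(&appsv1.Deployment{}).
		Owns(&corev1.Service{}).
		Complete(r)
}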
Using Operators
Deploying with Operator
# redis.yaml - Custom Resource
apiVersion: cache.example.com/v1alpha1
kind: Redis
metadata:
  name: my-redis
  namespace: default
spec:
  replicas: 3
  image: redis:7-alpine
  storage: 10Gi
# Apply
kubectl apply -f redis.yaml
# Monitor status
kubectl get redis my-redis
kubectl describe redis my-redis
Operator Lifecycle Management
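# Install Operator Lifecycle Manager (OLM) into the cluster
operator-sdk olm install
# Install the Prometheus Operator from OperatorHub.io
kubectl create -f https://operatorhub.io/install/prometheus.yaml
# Check the installed version (ClusterServiceVersion)
kubectl get csv -n operators
# Update Operator: OLM upgrades it through the Subscription's channel
kubectl get subscription -n operators
# Uninstall Operator: delete the Subscription and its ClusterServiceVersion
kubectl delete subscription <subscription-name> -n operators
kubectl delete csv <csv-name> -n operators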
Best Practices
Resource Definitions
- Use meaningful defaults
- Validate fields with OpenAPI schema constraints, CEL rules, or an admission webhook
- Document fields in CRD
- Version CRDs properly (v1alpha1 → v1beta1 → v1)
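To make the first two points concrete, here is a minimal sketch of the kind of defaulting and validation logic an admission webhook might run. It uses only the standard library and is not the controller-runtime webhook API. `RedisSpec`, `ApplyDefaults`, and `Validate` are hypothetical names, the default values are illustrative, and the cross-field rule (more than one replica requires storage) is an assumption made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// RedisSpec is a trimmed, hypothetical copy of the custom resource spec.
type RedisSpec struct {
	Replicas int32
	Image    string
	Storage  string
}

// ApplyDefaults fills unset fields so users can omit them.
// The chosen defaults are illustrative only.
func ApplyDefaults(s *RedisSpec) {
	if s.Replicas == 0 {
		s.Replicas = 1
	}
	if s.Image == "" {
		s.Image = "redis:7-alpine"
	}
}

// Validate collects every problem instead of stopping at the first,
// so the user can fix the manifest in one pass.
func Validate(s RedisSpec) error {
	var errs []error
	if s.Replicas < 1 || s.Replicas > 10 {
		errs = append(errs, fmt.Errorf("spec.replicas: must be between 1 and 10, got %d", s.Replicas))
	}
	if s.Image == "" {
		errs = append(errs, errors.New("spec.image: must not be empty"))
	}
	// Cross-field rule (assumed for this sketch): replicated setups need storage.
	if s.Replicas > 1 && s.Storage == "" {
		errs = append(errs, errors.New("spec.storage: required when spec.replicas > 1"))
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	spec := RedisSpec{}
	ApplyDefaults(&spec)
	fmt.Println("defaults valid:", Validate(spec) == nil)

	bad := RedisSpec{Replicas: 11, Image: "redis:7"}
	fmt.Println(Validate(bad))
}
```

Single-field bounds such as the replica range are usually better expressed in the CRD schema itself; code like this earns its place for rules that span several fields.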
Controller Development
- Implement idempotent reconciliation
- Handle partial state gracefully
- Add status conditions for clarity
- Implement proper error handling
- Add finalizers for cleanup
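// Example: Idempotent reconciliation
// MyReconciler, MyResource, r.create, and r.update are placeholders.
// apierrors is "k8s.io/apimachinery/pkg/api/errors".
func (r *MyReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	existing := &MyResource{}
	err := r.Get(ctx, req.NamespacedName, existing)
	if apierrors.IsNotFound(err) {
		// Resource doesn't exist - create it
		return ctrl.Result{}, r.create(ctx, req)
	}
	if err != nil {
		return ctrl.Result{}, err
	}
	// Resource exists - update if needed
	return ctrl.Result{}, r.update(ctx, existing)
}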
Testing
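// controllers/suite_test.go
package controllers

import (
	"context"
	"path/filepath"
	"testing"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes/scheme"
	"my-operator/api/v1alpha1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/envtest"
)

func TestRedisReconciler(t *testing.T) {
	// Start a local API server with the generated CRDs installed
	env := &envtest.Environment{
		CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases")},
	}
	cfg, err := env.Start()
	if err != nil {
		t.Fatalf("starting envtest: %v", err)
	}
	defer env.Stop()
	// Register the Redis types so the client can encode them
	if err := v1alpha1.AddToScheme(scheme.Scheme); err != nil {
		t.Fatalf("registering scheme: %v", err)
	}
	k8sClient, err := client.New(cfg, client.Options{Scheme: scheme.Scheme})
	if err != nil {
		t.Fatalf("creating client: %v", err)
	}
	// Create test resource
	replicas := int32(1)
	redis := &v1alpha1.Redis{
		ObjectMeta: metav1.ObjectMeta{Name: "test", Namespace: "default"},
		Spec:       v1alpha1.RedisSpec{Replicas: &replicas, Image: "redis:7"},
	}
	if err := k8sClient.Create(context.Background(), redis); err != nil {
		t.Fatalf("creating Redis: %v", err)
	}
	// Test reconciliation logic here
}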
Common Operator Patterns
Backup Operator
apiVersion: backup.example.com/v1alpha1
kind: Backup
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"
  retention: 30d
  target:
    kind: Database
    name: production-db
  storage:
    type: s3
    bucket: my-backups
Database Operator
apiVersion: database.example.com/v1alpha1
kind: PostgreSQL
metadata:
  name: my-db
spec:
  version: "15"
  replicas: 2
  storage: 100Gi
  backup:
    enabled: true
    schedule: "0 */6 * * *"
Implementation Checklist
Planning Phase
- Identify application requirements
- Define CRD schema
- Plan reconciliation logic
- Design status reporting
Development Phase
- Initialize Operator SDK project
- Create CRD definitions
- Implement controller logic
- Add validation webhooks
- Write tests
Deployment Phase
- Build container image
- Create OLM bundle
- Publish to OperatorHub
- Document usage
Summary
Operators extend Kubernetes to manage complex, stateful applications:
- CRDs define custom resource types for your application
- Controllers implement reconciliation logic
- Operators package CRDs and controllers together
- Best practices include idempotency, validation, and testing
The Operator pattern enables sophisticated automation that would otherwise require manual intervention.