Feature Management: Flags, Rollouts, and Experimentation

Feature flags decouple deployment from release. They let you ship code to production and control who sees what, when. This guide covers feature management strategies, implementation patterns, and building confidence in your releases.

What Are Feature Flags?

Feature flags are runtime controls that toggle features on or off without deploying new code. The simplest form is a boolean check:

# Simple feature flag
def checkout():
    if feature_flags.is_enabled("new_checkout"):
        return render_new_checkout()
    return render_old_checkout()

Types of Feature Flags

| Type | Purpose | Lifetime |
|------|---------|----------|
| Release Flags | Hide unfinished features | Short-term |
| Experiment Flags | A/B testing | Medium-term |
| Operational Flags | Kill switches, config | Long-term |
| Permission Flags | User-specific features | Long-term |

Implementing Feature Flags

Simple In-Code Implementation

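import hashlib
from typing import Optional

class FeatureFlags:
    def __init__(self):
        self._flags = {
            "new_checkout": False,
            "dark_mode": True,
            "ai_recommendations": False,
            "beta_dashboard": ["user-1", "user-2", "user-3"],  # Allow-list of user IDs
            "percentage_rollout": 10,  # 10% rollout
        }

    def is_enabled(self, flag_name: str, user_id: Optional[str] = None) -> bool:
        flag = self._flags.get(flag_name)

        if flag is None:
            return False  # Unknown flags default to off

        # Check bool before int: bool is a subclass of int in Python
        if isinstance(flag, bool):
            return flag

        if isinstance(flag, list):
            return user_id in flag

        if isinstance(flag, int):
            # Percentage rollout. The built-in hash() is salted per process for
            # strings, so use a stable digest to keep each user's bucket fixed.
            if user_id is None:
                return False  # No identity to bucket on
            digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
            return int(digest, 16) % 100 < flag

        return False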

# Usage
flags = FeatureFlags()

def checkout_page(user_id: str):
    # NewCheckoutPage and OldCheckoutPage stand in for your own views
    if flags.is_enabled("new_checkout", user_id=user_id):
        return NewCheckoutPage()
    return OldCheckoutPage()
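One property worth checking in a percentage rollout is that raising the percentage only adds users and never removes any. The sketch below assumes a SHA-256 based bucketing function (the names `bucket` and `enabled_users` are illustrative, not from any library) and checks that everyone enabled at 10% is still enabled at 20%.

```python
import hashlib

def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100

def enabled_users(flag_name: str, user_ids: list, percentage: int) -> set:
    """Users whose bucket falls below the rollout percentage."""
    return {u for u in user_ids if bucket(flag_name, u) < percentage}

# Synthetic user IDs, for illustration only
users = [f"user-{i}" for i in range(1000)]
at_10 = enabled_users("new_checkout", users, 10)
at_20 = enabled_users("new_checkout", users, 20)

# Raising the percentage is monotonic: nobody loses the feature
assert at_10 <= at_20
print(len(at_10), len(at_20))
```

Including the flag name in the hashed key means a user who lands in the first 10% for one flag is not automatically in the first 10% for every other flag.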

Using a Feature Flag Service

import requests

class LaunchDarklyClient:
    def __init__(self, api_key, project_key):
        self.api_key = api_key
        self.project_key = project_key
        self.base_url = "https://app.launchdarkly.com"
    
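    def is_enabled(self, flag_key: str, user_key: str) -> bool:
        # The REST API returns the flag's definition; it is meant for managing
        # flags, so prefer the SDK below for per-request evaluation.
        response = requests.get(
            f"{self.base_url}/api/v2/flags/{self.project_key}/{flag_key}",
            headers={"Authorization": self.api_key},
            timeout=5,
        )
        response.raise_for_status()
        # evaluate_flag is a placeholder for your own targeting-rule evaluation
        return evaluate_flag(response.json(), user_key)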

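# Or use the SDK (pip install launchdarkly-server-sdk)
import ldclient
from ldclient import Context
from ldclient.config import Config

ldclient.set_config(Config("sdk-key"))
ld_client = ldclient.get()

# Placeholder user; the email is an example value
context = Context.builder("user-123").set("email", "user@example.com").build()

show_new_feature = ld_client.variation("new-checkout", context, False)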
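Calling a remote flag service on every request adds latency and a new failure mode. A minimal sketch of one way to handle this, assuming a hypothetical `fetch(flag_key, user_key)` callable that performs the remote lookup: cache results for a short TTL and fall back to the last known value, or a safe default, when the service is unreachable. Vendor SDKs typically handle caching themselves, so this applies mainly to hand-rolled clients.

```python
import time
from typing import Callable, Dict, Tuple

class CachedFlagClient:
    """Wraps a remote flag lookup with a TTL cache and a safe default."""

    def __init__(self, fetch: Callable[[str, str], bool], ttl_seconds: float = 30.0):
        self._fetch = fetch  # hypothetical remote lookup: (flag_key, user_key) -> bool
        self._ttl = ttl_seconds
        self._cache: Dict[Tuple[str, str], Tuple[float, bool]] = {}

    def is_enabled(self, flag_key: str, user_key: str, default: bool = False) -> bool:
        key = (flag_key, user_key)
        now = time.monotonic()
        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        try:
            value = self._fetch(flag_key, user_key)
        except Exception:
            # Service unreachable: prefer the last known value, else the default
            return cached[1] if cached is not None else default
        self._cache[key] = (now, value)
        return value

# Usage with a lookup that always fails, to show the fallback path
def flaky_fetch(flag_key: str, user_key: str) -> bool:
    raise ConnectionError("flag service unreachable")

client = CachedFlagClient(flaky_fetch)
print(client.is_enabled("new-checkout", "user-123"))  # prints False (the default)
```

Defaulting to `False` keeps an unfinished feature hidden during an outage; for a kill switch you may want the opposite default.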

In Configuration

# config/features.yaml
features:
  new_checkout:
    enabled: false
    rollout_percentage: 0
  
  dark_mode:
    enabled: true
    rollout_percentage: 100
  
  ai_recommendations:
    enabled: true
    rollout_percentage: 10
    target:
      - email: "*@company.com"
      - plan: "premium"

  beta_dashboard:
    enabled: true
    target_users:
      - user-1
      - user-2
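The configuration above still needs an evaluator. The sketch below is one possible reading of it, not a fixed specification: a disabled flag is always off, `target_users` acts as an allow-list, any matching `target` rule turns the flag on, and everyone else falls through to the percentage rollout. The dictionary mirrors what loading the YAML file would produce; `is_enabled` and `matches_target` are illustrative names.

```python
import fnmatch
import hashlib

# Same shape that loading config/features.yaml would produce
FEATURES = {
    "new_checkout": {"enabled": False, "rollout_percentage": 0},
    "dark_mode": {"enabled": True, "rollout_percentage": 100},
    "ai_recommendations": {
        "enabled": True,
        "rollout_percentage": 10,
        "target": [{"email": "*@company.com"}, {"plan": "premium"}],
    },
    "beta_dashboard": {"enabled": True, "target_users": ["user-1", "user-2"]},
}

def matches_target(rule: dict, user: dict) -> bool:
    """A rule matches when every attribute glob-matches the user's value."""
    return all(
        fnmatch.fnmatchcase(str(user.get(attr, "")), pattern)
        for attr, pattern in rule.items()
    )

def is_enabled(name: str, user: dict, features: dict = FEATURES) -> bool:
    cfg = features.get(name)
    if not cfg or not cfg.get("enabled", False):
        return False
    if "target_users" in cfg:
        return user.get("id") in cfg["target_users"]
    if any(matches_target(rule, user) for rule in cfg.get("target", [])):
        return True
    # Fall through to a stable percentage rollout
    digest = hashlib.sha256(f"{name}:{user.get('id')}".encode()).hexdigest()
    return int(digest, 16) % 100 < cfg.get("rollout_percentage", 100)

staff = {"id": "user-42", "email": "dev@company.com", "plan": "free"}
print(is_enabled("ai_recommendations", staff))  # True: the email rule matches
print(is_enabled("beta_dashboard", staff))      # False: not in target_users
print(is_enabled("new_checkout", staff))        # False: flag is disabled
```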

Canary Releases

Gradually roll out changes to a small percentage of users:

import hashlib
import random
from typing import Optional

class CanaryRelease:
    def __init__(self, service_name: str):
        self.service_name = service_name
        self.rollout_percentage = 10

    def should_route_to_canary(self, user_id: Optional[str] = None) -> bool:
        if not user_id:
            # No user to hash on: fall back to a random (non-sticky) choice
            return random.random() * 100 < self.rollout_percentage
        
        # Consistent hashing - same user always gets same result
        hash_value = int(hashlib.md5(
            f"{self.service_name}:{user_id}".encode()
        ).hexdigest(), 16)
        
        return (hash_value % 100) < self.rollout_percentage

# Usage in an API gateway (Flask-style sketch; app, get_user_id, and
# proxy_to are assumed to be defined elsewhere)
from flask import request

canary = CanaryRelease("order-service")  # Create once, not on every request

@app.route("/api/<path:endpoint>")
async def proxy(endpoint):
    user_id = get_user_id(request)

    if canary.should_route_to_canary(user_id):
        # Route to canary version
        return await proxy_to("order-service-v2", request)
    # Route to stable version
    return await proxy_to("order-service-v1", request)

Kubernetes Canary

# Canary deployment with Istio
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - route:
        - destination:
            host: order-service-v1
          weight: 90
        - destination:
            host: order-service-v2
          weight: 10

Flagger Canary

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: order-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  service:
    port: 80
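  analysis:
    interval: 1m
    threshold: 10
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: error-rate
        # Not a built-in metric: define a MetricTemplate and reference it here
        templateRef:
          name: error-rate
        thresholdRange:
          max: 5
        interval: 1m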
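To make the `stepWeight`, `maxWeight`, and `threshold` settings concrete, here is a simplified model of a progressive analysis loop. It is a sketch of the general idea only, not Flagger's actual controller logic: `run_canary_analysis` and `check_metrics` are hypothetical names, and each loop iteration stands in for one analysis interval.

```python
from typing import Callable

def run_canary_analysis(
    check_metrics: Callable[[int], bool],
    step_weight: int = 10,
    max_weight: int = 50,
    threshold: int = 10,
) -> str:
    """Simulate a progressive canary analysis.

    check_metrics(weight) returns True when all metric checks pass at the
    given canary traffic weight. Returns "promoted" or "rolled_back".
    """
    weight = 0
    failed_checks = 0
    while True:
        if weight > 0 and not check_metrics(weight):
            failed_checks += 1
            if failed_checks >= threshold:
                return "rolled_back"
            continue  # Hold the current weight and re-check next interval
        if weight >= max_weight:
            return "promoted"
        weight = min(max_weight, weight + step_weight)

print(run_canary_analysis(lambda weight: True))         # promoted
print(run_canary_analysis(lambda weight: weight < 30))  # rolled_back
```

In the second call the canary passes at 10% and 20%, then fails every check at 30% until the failure threshold is reached.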

A/B Testing

Test different variations with user segments:

import hashlib

class Experiment:
    def __init__(self, name: str, variations: dict):
        """
        variations: {
            "control": 50,  # 50% weight
            "variant_a": 25,
            "variant_b": 25
        }
        """
        self.name = name
        self.variations = variations
    
    def get_variant(self, user_id: str = None) -> str:
        if not user_id:
            import random
            r = random.random() * 100
        else:
            # Consistent hashing
            hash_value = int(hashlib.md5(
                f"{self.name}:{user_id}".encode()
            ).hexdigest(), 16)
            r = (hash_value % 10000) / 100
        
        cumulative = 0
        for variant, weight in self.variations.items():
            cumulative += weight
            if r < cumulative:
                return variant
        
        return "control"

# Usage
checkout_experiment = Experiment(
    name="new_checkout",
    variations={"control": 50, "variant_a": 25, "variant_b": 25}
)

def render_checkout(user_id: str):
    variant = checkout_experiment.get_variant(user_id=user_id)

    if variant == "control":
        return render_old_checkout()
    elif variant == "variant_a":
        return render_checkout_with_visa()
    else:
        return render_checkout_with_paypal()

Tracking Results

import analytics

class ExperimentTracker:
    def __init__(self, experiment_name: str):
        self.experiment_name = experiment_name
    
    def track_assignment(self, user_id: str, variant: str):
        analytics.track(
            user_id,
            "Experiment Assigned",
            {
                "experiment_name": self.experiment_name,
                "variant": variant
            }
        )
    
    def track_conversion(self, user_id: str, event_name: str, properties: dict = None):
        analytics.track(
            user_id,
            event_name,
            {
                "experiment_name": self.experiment_name,
                **(properties or {})
            }
        )

# Usage (get_current_user_id is a placeholder for your own session lookup)
tracker = ExperimentTracker("new_checkout")

@app.route("/checkout")
def checkout():
    user_id = get_current_user_id()
    variant = checkout_experiment.get_variant(user_id)
    tracker.track_assignment(user_id, variant)

    if variant == "control":
        return render_checkout_control()
    return render_checkout_variant()

@app.route("/purchase")
def purchase():
    user_id = get_current_user_id()
    # Track conversion, tagged with the variant the user was assigned
    tracker.track_conversion(user_id, "purchase_completed", {
        "total": 99.99,
        "variant": checkout_experiment.get_variant(user_id)
    })
    return {"status": "ok"}
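Once assignments and conversions are tracked, you still need to decide whether a difference between variants is real or noise. One common approach, shown here as a sketch rather than a full analysis pipeline, is a two-proportion z-test on conversion counts. The function name and the counts in the usage lines are made up for illustration.

```python
import math

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # No variation at all: nothing to distinguish
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Illustrative counts only: control converts 200/4000, variant 260/4000
z, p = two_proportion_z_test(200, 4000, 260, 4000)
print(f"z={z:.2f}, p={p:.4f}")
```

Decide the sample size and significance level before the experiment starts; checking the p-value repeatedly and stopping at the first significant result inflates false positives.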

Kill Switches

Quickly disable problematic features:

import logging

logger = logging.getLogger(__name__)

class KillSwitch:
    def __init__(self):
        self._switches = {}

    def enable(self, feature: str):
        self._switches[feature] = True

    def disable(self, feature: str):
        self._switches[feature] = False

    def is_active(self, feature: str) -> bool:
        # Features are active unless explicitly disabled
        return self._switches.get(feature, True)

    async def wrap(self, feature: str, func, *args, **kwargs):
        if not self.is_active(feature):
            return {"error": f"Feature {feature} is disabled"}

        try:
            return await func(*args, **kwargs)
        except Exception as e:
            # Auto-disable on error; stays off until enable() is called again
            logger.error("Feature %s failed, disabling: %s", feature, e)
            self.disable(feature)
            raise

# Usage (ai_service is a placeholder for your own async client)
killswitch = KillSwitch()

async def get_recommendations(user_id):
    return await killswitch.wrap(
        "ai_recommendations", ai_service.get_recommendations, user_id
    )
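Disabling a feature on the first exception can be too aggressive, since a single transient error would take it offline. A possible refinement, sketched here under the assumption that you want a sliding-window threshold (the class name and defaults are illustrative), is to disable only after several failures in a short period.

```python
import time
from collections import deque

class ThresholdKillSwitch:
    """Disables a feature after max_failures errors within window_seconds."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 60.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = {}      # feature name -> deque of failure timestamps
        self._disabled = set()

    def is_active(self, feature: str) -> bool:
        return feature not in self._disabled

    def record_failure(self, feature: str, now=None) -> None:
        now = time.monotonic() if now is None else now
        window = self._failures.setdefault(feature, deque())
        window.append(now)
        # Drop failures that fell out of the sliding window
        while window and now - window[0] > self.window_seconds:
            window.popleft()
        if len(window) >= self.max_failures:
            self._disabled.add(feature)

switch = ThresholdKillSwitch(max_failures=3, window_seconds=60)
for t in (0, 10, 20):
    switch.record_failure("ai_recommendations", now=t)
print(switch.is_active("ai_recommendations"))  # False: 3 failures within 60s
```

The explicit `now` parameter exists only to make the example deterministic; in production the default monotonic clock is used.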

Gradual Rollout

Increase rollout based on metrics:

import logging

logger = logging.getLogger(__name__)

class GradualRollout:
    def __init__(self, feature_name: str, initial_percentage: int = 1):
        self.feature_name = feature_name
        self.current_percentage = initial_percentage
        self._reset_metrics()

    def _reset_metrics(self):
        self.metrics = {"errors": 0, "success": 0, "latencies": []}

    def record_success(self, latency_ms: int):
        self.metrics["success"] += 1
        self.metrics["latencies"].append(latency_ms)

    def record_error(self):
        self.metrics["errors"] += 1

    def should_increase(self) -> bool:
        total = self.metrics["success"] + self.metrics["errors"]
        latencies = sorted(self.metrics["latencies"])
        # Need enough traffic, and at least one success to measure latency
        if total < 1000 or not latencies:
            return False

        error_rate = self.metrics["errors"] / total
        p99_latency = latencies[min(len(latencies) - 1, int(len(latencies) * 0.99))]

        # Increase if error rate < 1% and p99 < 500ms
        return error_rate < 0.01 and p99_latency < 500

    def promote(self):
        if self.should_increase() and self.current_percentage < 100:
            self.current_percentage = min(100, self.current_percentage * 2)
            logger.info("Promoted %s to %d%%", self.feature_name, self.current_percentage)

            # Reset metrics for next phase
            self._reset_metrics()
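Because each successful promotion doubles the percentage and caps it at 100, the rollout passes through a predictable sequence of phases. The helper below (an illustrative name, not part of the class above) lists them, which helps when estimating how long a full rollout will take.

```python
def rollout_schedule(initial_percentage: int = 1) -> list:
    """Percentages visited when each successful promotion doubles the rollout."""
    if initial_percentage < 1:
        raise ValueError("initial_percentage must be at least 1")
    steps = [min(100, initial_percentage)]
    while steps[-1] < 100:
        steps.append(min(100, steps[-1] * 2))
    return steps

print(rollout_schedule())  # [1, 2, 4, 8, 16, 32, 64, 100]
```

Starting from 1%, full rollout takes seven promotions, each of which needs at least 1,000 recorded requests at the thresholds used above.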

Feature Flag Management UI

Build a simple admin interface:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional, List

app = FastAPI()

class FeatureFlag(BaseModel):
    name: str
    enabled: bool
    rollout_percentage: int = 100
    target_users: Optional[List[str]] = None

flags_db = {}

@app.get("/flags")
def list_flags():
    return flags_db

@app.post("/flags")
def create_flag(flag: FeatureFlag):
    flags_db[flag.name] = flag
    return flag

@app.put("/flags/{flag_name}")
def update_flag(flag_name: str, flag: FeatureFlag):
    if flag_name not in flags_db:
        raise HTTPException(status_code=404, detail="Flag not found")
    flags_db[flag_name] = flag
    return flag

@app.delete("/flags/{flag_name}")
def delete_flag(flag_name: str):
    if flag_name not in flags_db:
        raise HTTPException(status_code=404, detail="Flag not found")
    del flags_db[flag_name]
    return {"status": "deleted"}

Tools Comparison

| Tool | Type | Features | Pricing |
|------|------|----------|---------|
| LaunchDarkly | Managed | Full-featured | $$$ |
| Split.io | Managed | A/B testing | $$$ |
| Unleash | Open-source | Self-hosted option | Free/$ |
| FeatureFlags.io | Open-source | Simple | Free |
| Cloudflare Workers | Built-in | Edge | $ |
| Custom | Build your own | Flexible | Free |

Best Practices

Do

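  • Keep release and experiment flags short-lived
  • Remove a flag and its dead code path once the rollout completes
  • Monitor error rates and latency for each flag variation
  • Have a rollback plan
  • Use meaningful names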

Don’t

  • Don’t wrap every small change in a flag
  • Don’t nest flags deeply
  • Don’t let stale flags accumulate
  • Don’t ignore failure modes, such as the flag service being unreachable
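Cleanup is easier to enforce when it is automated. As a sketch, assuming you keep a small registry that records each flag's type and creation date (the `FlagRecord` structure, the age limits, and the sample records are all hypothetical), a scheduled job can report flags that have outlived their expected lifetime.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class FlagRecord:
    name: str
    flag_type: str  # "release", "experiment", "operational", or "permission"
    created: date

# Assumed maximum ages for short-lived flag types; tune to your release cadence.
# Operational and permission flags are long-lived, so they have no limit here.
MAX_AGE = {"release": timedelta(days=30), "experiment": timedelta(days=90)}

def stale_flags(records, today: date) -> list:
    """Names of flags that have outlived the maximum age for their type."""
    return [
        r.name
        for r in records
        if r.flag_type in MAX_AGE and today - r.created > MAX_AGE[r.flag_type]
    ]

# Sample registry, for illustration only
registry = [
    FlagRecord("new_checkout", "release", date(2024, 1, 10)),
    FlagRecord("checkout_experiment", "experiment", date(2024, 3, 1)),
    FlagRecord("payments_kill_switch", "operational", date(2022, 6, 1)),
]
print(stale_flags(registry, today=date(2024, 4, 1)))  # ['new_checkout']
```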

Conclusion

Feature flags transform how you ship software:

  • Separate deployment from release
  • Enable canary and gradual rollouts
  • Run A/B experiments safely
  • Provide kill switches for quick response
  • Work with a homegrown system or a managed service

Start with simple flags, add experimentation, then build full progressive delivery.
