Load Balancing Strategies: Complete Guide
Load balancing is essential for distributing traffic across multiple servers. This guide covers algorithms, implementations, and best practices for building resilient systems.
Why Load Balancing?
Without Load Balancing:            With Load Balancing:
┌──────────────┐                   ┌───────────────────────┐
│              │                   │     Load Balancer     │
│   Requests   │                   └───────────┬───────────┘
│   ───────▶   │                               │
│      │       │                   ┌───────────┼───────────┐
│      ▼       │                   ▼           ▼           ▼
│   Server 1   │               ┌───────┐   ┌───────┐   ┌───────┐
│  (overload)  │               │Server1│   │Server2│   │Server3│
└──────────────┘               └───────┘   └───────┘   └───────┘

Results:                           Results:
- Server crashes                   - Distributed load
- No fault tolerance               - Automatic failover
- Poor user experience             - Better performance
Load Balancing Algorithms
1. Round Robin
class RoundRobin:
    def __init__(self, servers):
        self.servers = servers
        self.index = 0

    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

# All servers get equal requests
# Good for: Homogeneous servers, stateless services
2. Least Connections
class LeastConnections:
    def __init__(self, servers):
        self.servers = {s: 0 for s in servers}

    def get_server(self):
        # Select server with fewest active connections
        server = min(self.servers, key=self.servers.get)
        self.servers[server] += 1
        return server

    def release(self, server):
        self.servers[server] -= 1

# Dynamic - adapts to current load
# Good for: Long-lived connections, variable request times
3. Weighted Round Robin
class WeightedRoundRobin:
    def __init__(self, servers):
        # servers = [("server1", 3), ("server2", 1)]
        self.servers = []
        for server, weight in servers:
            self.servers.extend([server] * weight)
        self.index = 0

    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

# Server1: 75% of traffic (weight 3)
# Server2: 25% of traffic (weight 1)
# Good for: Heterogeneous server capacities
4. IP Hash
import hashlib

class IPHash:
    def __init__(self, servers):
        self.servers = servers

    def get_server(self, client_ip):
        # Use a stable hash: Python's built-in hash() is randomized
        # per process for strings, which would break stickiness
        hash_value = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
        index = hash_value % len(self.servers)
        return self.servers[index]

# Same IP always goes to same server (while the server list is unchanged)
# Good for: Sticky sessions, cache locality
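Note that the modulo mapping above reshuffles almost every client when a server is added or removed. True consistent hashing keeps most assignments stable; below is a minimal hash-ring sketch (the class name, server labels, and the 100-virtual-node default are our own illustrative choices, not part of any library):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Hash ring with virtual nodes; only ~1/N of keys move when a server changes."""

    def __init__(self, servers, vnodes=100):
        self.ring = []  # sorted list of (hash, server)
        for server in servers:
            for i in range(vnodes):
                h = self._hash(f"{server}#{i}")
                self.ring.append((h, server))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        # Stable across processes, unlike Python's built-in hash()
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_server(self, client_ip):
        # Walk clockwise to the first virtual node at or after the key's hash
        h = self._hash(client_ip)
        idx = bisect.bisect_left(self.ring, (h,)) % len(self.ring)
        return self.ring[idx][1]
```

With this scheme, adding a fourth server only steals the keys that now land on its virtual nodes; every other client keeps its old server.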
5. Least Response Time
import time
class LeastResponseTime:
def __init__(self, servers):
self.servers = {s: {"active": 0, "avg_time": 0} for s in servers}
def get_server(self):
# Select server with lowest (active + avg_response_time)
best = min(
self.servers.items(),
key=lambda x: x[1]["active"] + x[1]["avg_time"]
)
self.servers[best[0]]["active"] += 1
return best[0]
def record_response_time(self, server, duration):
# Update rolling average
current = self.servers[server]["avg_time"]
self.servers[server]["avg_time"] = (current * 0.7 + duration * 0.3)
self.servers[server]["active"] -= 1
# Adapts to real performance
# Good for: Performance-critical applications
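The 0.7/0.3 update above is an exponentially weighted moving average: each new sample gets 30% weight and older history decays geometrically, so the estimate tracks a server's real latency within a couple dozen requests. A quick numeric check (the 100 ms figure is arbitrary):

```python
def ema(current, sample, alpha=0.3):
    # Same update as record_response_time: new = 0.7*old + 0.3*sample
    return current * (1 - alpha) + sample * alpha

avg = 0.0
for _ in range(20):
    avg = ema(avg, 100.0)  # server consistently answers in 100 ms
# avg is now within a fraction of a millisecond of 100
```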
Health Checks
Types of Health Checks
import socket

import requests

class HealthChecker:
    def __init__(self, servers):
        self.servers = servers
        self.status = {s: True for s in servers}

    # 1. TCP Check - Port open
    def tcp_check(self, server, port=80):
        sock = socket.socket()
        sock.settimeout(2)
        try:
            sock.connect((server, port))
            return True
        except OSError:
            return False
        finally:
            sock.close()

    # 2. HTTP Check - GET endpoint
    def http_check(self, server):
        try:
            response = requests.get(f"http://{server}/health", timeout=2)
            return response.status_code == 200
        except requests.RequestException:
            return False

    # 3. Deep Check - Verify dependencies
    def deep_check(self, server):
        try:
            # Check main service, database, and cache health endpoints
            r1 = requests.get(f"http://{server}/health", timeout=2)
            r2 = requests.get(f"http://{server}/db/health", timeout=2)
            r3 = requests.get(f"http://{server}/cache/health", timeout=2)
            return all(r.status_code == 200 for r in [r1, r2, r3])
        except requests.RequestException:
            return False

    def check_all(self):
        for server in self.servers:
            self.status[server] = self.http_check(server)
        return self.status
Health Check Configuration
# Example Kubernetes health check config
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
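The failureThreshold setting exists to avoid flapping: one dropped probe should not eject a server, and one lucky probe should not restore it. A minimal sketch of that debouncing logic (the class name is ours; fall/rise mirror the thresholds used in this guide's configs):

```python
class ProbeTracker:
    """Mark a target unhealthy only after `fall` consecutive failures,
    and healthy again only after `rise` consecutive successes."""

    def __init__(self, fall=3, rise=2):
        self.fall, self.rise = fall, rise
        self.healthy = True
        self._streak = 0  # consecutive results contradicting current state

    def record(self, success):
        if success == self.healthy:
            self._streak = 0  # current state confirmed; reset counter
            return self.healthy
        self._streak += 1
        threshold = self.rise if not self.healthy else self.fall
        if self._streak >= threshold:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy
```

HAProxy's `rise`/`fall` server options (used later in this guide) implement the same idea on the balancer side.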
Sticky Sessions
Why Sticky Sessions?
Without Sticky Sessions:           With Sticky Sessions:
      ┌─────────┐                       ┌─────────┐
      │   LB    │                       │   LB    │
      └────┬────┘                       └────┬────┘
           │                                 │
       ┌───┴───┐                         ┌───┴───┐
       ▼       ▼                         ▼       ▼
   Server1  Server2                  Server1  Server2
       ▲       ▲                         ▲       ▲
       │       │                         │       │
    User A  User A                    User A  User B
 (session not saved)                (session saved)
Implementation
# Cookie-based sticky sessions
import itertools
import uuid

class CookieStickyLB:
    def __init__(self, servers):
        self.servers = servers
        self._cycle = itertools.cycle(servers)

    def get_server(self, request):
        # Check for existing session cookie
        session_id = request.cookies.get('session_id')
        if session_id:
            # Look up which server serves this session
            server = session_store.get(session_id)
            if server and server.decode() in self.servers:
                return server.decode()
        # New session - select the next server round-robin
        server = next(self._cycle)
        # Store the mapping; the response handler must also set the
        # session_id cookie so the client presents it next time
        session_id = session_id or str(uuid.uuid4())
        session_store.set(session_id, server)
        return server

# Redis session store (values come back as bytes, hence .decode() above)
import redis
session_store = redis.Redis(host='localhost', db=1)
Geographic Load Balancing
DNS-Based Geo-Routing
┌──────────────────────────────────────────────────────────┐
│                      DNS Resolution                      │
├──────────────────────────────────────────────────────────┤
│                                                          │
│  User in US   ──▶ dns.google.com ──▶ 104.1.1.1 (US Edge) │
│                                                          │
│  User in EU   ──▶ dns.google.com ──▶ 104.2.2.2 (EU Edge) │
│                                                          │
│  User in ASIA ──▶ dns.google.com ──▶ 104.3.3.3 (Asia)    │
│                                                          │
└──────────────────────────────────────────────────────────┘
Implementation with GeoDNS
# Route53 geolocation routing
import uuid

import boto3

route53 = boto3.client('route53')

# Create a health check for the regional endpoint
response = route53.create_health_check(
    CallerReference=str(uuid.uuid4()),
    HealthCheckConfig={
        'Type': 'HTTPS',
        'FullyQualifiedDomainName': 'us-east.example.com',
        'Port': 443,
        'ResourcePath': '/health'
    }
)

# Associate the health check with geolocation record sets:
# US users   -> US server pool
# EU users   -> EU server pool
# Asia users -> Asia server pool
Anycast DNS
Traditional DNS:                 Anycast DNS:
┌─────────┐                      ┌─────────┐
│  User   │                      │  User   │
└────┬────┘                      └────┬────┘
     │ DNS lookup                     │
     ▼                                ▼
┌─────────┐                      ┌─────────┐
│  Root   │                      │ Anycast │──────────────┐
│  DNS    │                      │   IP    │              │
└────┬────┘                      └─────────┘              │
     │ IP for us-east                 │                   │
     ▼                                ▼                   ▼
┌─────────┐                      ┌─────────┐         ┌─────────┐
│ us-east │                      │Edge DNS │         │Edge DNS │
│ server  │                      │(any IP) │         │(any IP) │
└─────────┘                      └─────────┘         └─────────┘
Layer 4 vs Layer 7 Load Balancing
Layer 4 (TCP/UDP)
# Nginx stream block - Layer 4
stream {
    upstream backend {
        server 10.0.1.1:80;
        server 10.0.1.2:80;
    }

    server {
        listen 80;
        proxy_pass backend;
    }
}

# Pros:
# - Lower latency (no SSL termination)
# - Less resource intensive
# - Better for high traffic

# Cons:
# - No content-based routing
# - No URL rewriting
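Because a Layer 4 balancer never parses HTTP, the only routing key it has is the connection 4-tuple: source IP, source port, destination IP, destination port. A small sketch of tuple-based backend selection (addresses are placeholders):

```python
import hashlib

BACKENDS = ["10.0.1.1:80", "10.0.1.2:80"]

def pick_backend(src_ip, src_port, dst_ip, dst_port):
    # A Layer 4 balancer sees only these four values - no path, no headers
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    h = int(hashlib.md5(key).hexdigest(), 16)
    return BACKENDS[h % len(BACKENDS)]
```

All packets of one connection share the same tuple, so they all reach the same backend, which is exactly what makes TCP passthrough work without any application awareness.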
Layer 7 (HTTP/HTTPS)
# Nginx http block - Layer 7
http {
    upstream api_backend {
        server 10.0.1.1:80;
        server 10.0.1.2:80;
    }

    upstream web_backend {
        server 10.0.2.1:80;
        server 10.0.2.2:80;
    }

    server {
        listen 80;

        # Route based on path
        location /api/ {
            proxy_pass http://api_backend;
        }

        location / {
            proxy_pass http://web_backend;
        }
    }
}

# Pros:
# - Content-based routing
# - Can modify requests/responses
# - SSL termination

# Cons:
# - Higher latency
# - More resource usage
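The location rules above are what Layer 4 cannot do: inspect the request path before choosing a pool. The same decision, sketched in a few lines (pool addresses are placeholders):

```python
API_POOL = ["10.0.1.1:80", "10.0.1.2:80"]
WEB_POOL = ["10.0.2.1:80", "10.0.2.2:80"]

def route(path):
    # Prefix match on the request path, like nginx "location" blocks
    if path.startswith("/api/"):
        return API_POOL
    return WEB_POOL
```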
High Availability Load Balancer Setup
                ┌─────────────────┐
                │   Global DNS    │
                │   (Failover)    │
                └────────┬────────┘
                         │
          ┌──────────────┼──────────────┐
          │              │              │
          ▼              ▼              ▼
   ┌───────────┐  ┌───────────┐  ┌───────────┐
   │   LB 1    │  │   LB 2    │  │   LB 3    │
   │ (Active)  │  │ (Standby) │  │ (Standby) │
   └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
         │              │              │
         └──────────────┼──────────────┘
                        │
               ┌────────┴───────┐
               │                │
               ▼                ▼
        ┌───────────┐    ┌───────────┐
        │ App Server│    │ App Server│
        │   Pool    │    │   Pool    │
        └───────────┘    └───────────┘
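In an active/standby layout like this, the DNS tier only has to answer with the highest-priority balancer that is still alive. A sketch of that failover decision (function and balancer names are illustrative):

```python
def pick_active(balancers, is_healthy):
    """Return the first healthy balancer in priority order, else None.

    `balancers` is ordered active-first; `is_healthy` is a probe callback.
    """
    for lb in balancers:
        if is_healthy(lb):
            return lb
    return None
```

Standbys only receive traffic once every balancer ahead of them has failed its probe, which matches the Active/Standby roles in the diagram.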
Best Practices
Configuration Example
# HAProxy configuration
global
    log /dev/log local0
    maxconn 4000
    user haproxy
    group haproxy
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option redispatch
    retries 3
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend http-in
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/server.pem
    # Health check endpoint
    monitor-uri /haproxy_status
    default_backend app-servers

backend app-servers
    # Load balancing algorithm
    balance roundrobin
    # Health check
    option httpchk GET /health
    http-check expect status 200
    # Session persistence via an inserted cookie
    # (the old "appsession" directive was removed in HAProxy 1.6)
    cookie SERVERID insert indirect nocache
    # Servers
    server app1 10.0.1.1:80 check inter 2000 rise 2 fall 3 cookie app1
    server app2 10.0.1.2:80 check inter 2000 rise 2 fall 3 cookie app2
    server app3 10.0.1.3:80 check inter 2000 rise 2 fall 3 backup