Load balancing distributes incoming traffic across multiple servers so that no single machine becomes a bottleneck or a single point of failure. This guide covers algorithms, implementations, and best practices for building resilient systems.
Why Load Balancing?
Without Load Balancing:        With Load Balancing:
┌──────────────┐               ┌──────────────────────┐
│              │               │    Load Balancer     │
│   Requests   │               └──────────┬───────────┘
│  ────────▶   │                          │
│              │              ┌───────────┼───────────┐
│              │              ▼           ▼           ▼
│   Server 1   │          ┌───────┐   ┌───────┐   ┌───────┐
│  (overload)  │          │Server1│   │Server2│   │Server3│
└──────────────┘          └───────┘   └───────┘   └───────┘
Results without load balancing:
- Server crashes
- No fault tolerance
- Poor user experience
Results with load balancing:
- Distributed load
- Automatic failover
- Better performance
Load Balancing Algorithms
1. Round Robin
class RoundRobin:
    def __init__(self, servers):
        self.servers = servers
        self.index = 0

    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

# All servers get equal requests
# Good for: Homogeneous servers, stateless services
2. Least Connections
class LeastConnections:
    def __init__(self, servers):
        self.servers = {s: 0 for s in servers}

    def get_server(self):
        # Select server with fewest active connections
        server = min(self.servers, key=self.servers.get)
        self.servers[server] += 1
        return server

    def release(self, server):
        self.servers[server] -= 1

# Dynamic - adapts to current load
# Good for: Long-lived connections, variable request times
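The connection counts only stay accurate if every selected connection is eventually released, including requests that fail partway through. One way to guarantee that, sketched here under the assumption that requests are handled in-process, is to pair selection and release in a context manager. `LeastConnectionsPool` and its `connection` helper are hypothetical names used for illustration.

```python
from contextlib import contextmanager


class LeastConnectionsPool:
    """Least-connections selection with automatic release (illustrative)."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    @contextmanager
    def connection(self):
        # Pick the server with the fewest active connections
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        try:
            yield server
        finally:
            # Runs on success and on error, so the count never leaks
            self.active[server] -= 1


pool = LeastConnectionsPool(["server1", "server2"])
with pool.connection() as server:
    print(f"handling request on {server}")
```

Because the decrement sits in a `finally` clause, an exception raised while handling the request still returns the slot to the pool.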
3. Weighted Round Robin
class WeightedRoundRobin:
    def __init__(self, servers):
        # servers = [("server1", 3), ("server2", 1)]
        self.servers = []
        for server, weight in servers:
            self.servers.extend([server] * weight)
        self.index = 0

    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

# Server1: 75% of traffic (weight 3)
# Server2: 25% of traffic (weight 1)
# Good for: Heterogeneous server capacities
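Expanding the list by weight keeps the proportions right but sends each server its share in a burst (three requests in a row to server1, then one to server2). A different technique, usually called smooth weighted round robin and used by nginx among others, keeps the same proportions while interleaving the picks. The sketch below is a minimal illustration of that variant, not a drop-in replacement for the class above; the class name and the example weights are assumptions.

```python
class SmoothWeightedRoundRobin:
    def __init__(self, servers):
        # servers = [("a", 5), ("b", 1), ("c", 1)]
        self.weights = dict(servers)
        self.current = {name: 0 for name in self.weights}
        self.total = sum(self.weights.values())

    def get_server(self):
        # Raise every server's running score by its weight
        for name, weight in self.weights.items():
            self.current[name] += weight
        # Pick the highest score (ties go to the earliest server)
        chosen = max(self.current, key=self.current.get)
        # Push the winner back by the total weight so others catch up
        self.current[chosen] -= self.total
        return chosen


lb = SmoothWeightedRoundRobin([("a", 5), ("b", 1), ("c", 1)])
print([lb.get_server() for _ in range(7)])
# → ['a', 'a', 'b', 'a', 'c', 'a', 'a']
```

Over one full cycle of seven requests, server `a` still receives five of them, but the lighter servers are served in between rather than at the end.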
4. IP Hash
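import hashlib

class IPHash:
    def __init__(self, servers):
        self.servers = servers

    def get_server(self, client_ip):
        # Simple modulo hashing (this is not consistent hashing): adding or
        # removing a server remaps most clients. A stable digest is used
        # because Python's built-in hash() for strings differs between processes.
        digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
        hash_value = int.from_bytes(digest[:8], "big")
        index = hash_value % len(self.servers)
        return self.servers[index]

# Same IP always goes to same server while the server list is unchanged
# Good for: Sticky sessions, cache locality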
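Hashing modulo the number of servers reshuffles most clients whenever a server is added or removed. Consistent hashing limits that churn: servers are placed at many points on a hash ring, and each client goes to the first server point at or after its own hash. The following is a minimal sketch of the idea; the class name, the `vnodes` count of 100, and the use of SHA-256 are illustrative assumptions rather than a reference implementation.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """64-bit hash that is identical across processes and restarts."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, server)
        for server in servers:
            self.add_server(server)

    def add_server(self, server):
        # Each server owns several points ("virtual nodes") for an even spread
        for i in range(self.vnodes):
            bisect.insort(self._ring, (stable_hash(f"{server}#{i}"), server))

    def remove_server(self, server):
        self._ring = [(p, s) for p, s in self._ring if s != server]

    def get_server(self, client_ip):
        if not self._ring:
            raise LookupError("no servers available")
        # First point at or after the client's hash, wrapping around the ring
        idx = bisect.bisect_left(self._ring, (stable_hash(client_ip), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]


ring = ConsistentHashRing(["server1", "server2", "server3"])
print(ring.get_server("203.0.113.7"))
```

When a server is removed, only the clients that were mapped to it move; every other client keeps its existing server, which is what preserves cache locality.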
5. Least Response Time
class LeastResponseTime:
    def __init__(self, servers):
        self.servers = {s: {"active": 0, "avg_time": 0.0} for s in servers}

    def get_server(self):
        # Select server with lowest (active + avg_response_time)
        best = min(
            self.servers.items(),
            key=lambda x: x[1]["active"] + x[1]["avg_time"]
        )
        self.servers[best[0]]["active"] += 1
        return best[0]

    def record_response_time(self, server, duration):
        # Update the exponentially weighted moving average
        # (30% weight on the newest sample, 70% on history)
        current = self.servers[server]["avg_time"]
        self.servers[server]["avg_time"] = (current * 0.7 + duration * 0.3)
        self.servers[server]["active"] -= 1

# Adapts to real performance
# Good for: Performance-critical applications
Health Checks
Types of Health Checks
import socket

import requests

class HealthChecker:
    def __init__(self, servers):
        self.servers = servers
        self.status = {s: True for s in servers}

    # 1. TCP Check - Port open
    def tcp_check(self, server, port=80):
        try:
            # The with-block closes the socket once the connection succeeds
            with socket.create_connection((server, port), timeout=2):
                return True
        except OSError:
            return False

    # 2. HTTP Check - GET endpoint
    def http_check(self, server):
        try:
            response = requests.get(f"http://{server}/health", timeout=2)
            return response.status_code == 200
        except requests.RequestException:
            return False

    # 3. Deep Check - Verify dependencies
    def deep_check(self, server):
        try:
            # Check main service
            r1 = requests.get(f"http://{server}/health", timeout=2)
            # Check database
            r2 = requests.get(f"http://{server}/db/health", timeout=2)
            # Check cache
            r3 = requests.get(f"http://{server}/cache/health", timeout=2)
            return all(r.status_code == 200 for r in [r1, r2, r3])
        except requests.RequestException:
            return False

    def check_all(self):
        for server in self.servers:
            is_healthy = self.http_check(server)
            self.status[server] = is_healthy
        return self.status
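Health status is only useful if the balancer consults it when picking a server. As a rough sketch, assuming a status dictionary shaped like the one `check_all` returns (server mapped to `True` or `False`), round robin can skip servers that are currently marked unhealthy. The class name `HealthAwareRoundRobin` is hypothetical.

```python
class HealthAwareRoundRobin:
    def __init__(self, servers):
        self.servers = list(servers)
        self.index = 0

    def get_server(self, status):
        """Return the next healthy server; status maps server -> bool."""
        # Try each server at most once per call
        for _ in range(len(self.servers)):
            server = self.servers[self.index]
            self.index = (self.index + 1) % len(self.servers)
            if status.get(server, False):
                return server
        raise RuntimeError("no healthy servers available")


lb = HealthAwareRoundRobin(["app1", "app2", "app3"])
status = {"app1": True, "app2": False, "app3": True}
print([lb.get_server(status) for _ in range(4)])
# → ['app1', 'app3', 'app1', 'app3']
```

Servers missing from the status dictionary are treated as unhealthy here, which is the conservative choice for a server that has never been checked.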
Health Check Configuration
# Example Kubernetes health check config
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
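A failure threshold of 3 means one failed probe does not change a server's state; only a run of consecutive failures does. The same idea can be sketched in Python for a custom checker. `ThresholdTracker`, `rise`, and `fall` are illustrative names chosen for this example (they are not Kubernetes fields): `fall` is the number of consecutive failures needed to mark a server down, and `rise` the number of consecutive successes needed to bring it back.

```python
class ThresholdTracker:
    def __init__(self, rise=2, fall=3):
        self.rise = rise
        self.fall = fall
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed):
        """Feed one probe result and return the resulting health state."""
        if check_passed == self.healthy:
            # Result agrees with the current state: reset the streak
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.fall if self.healthy else self.rise
            if self._streak >= threshold:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy


tracker = ThresholdTracker(rise=2, fall=3)
probes = [False, False, True, False, False, False, True, True]
print([tracker.record(p) for p in probes])
# → [True, True, True, True, True, False, False, True]
```

The two early failures are absorbed because a success interrupts the streak; the server is only marked down after three failures in a row, and it needs two successes to return.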
Sticky Sessions
Why Sticky Sessions?
Without Sticky Sessions: With Sticky Sessions:
┌─────────┐ ┌─────────┐
│ LB │ │ LB │
└────┬────┘ └────┬────┘
│ │
┌──┴──┐ ┌───┴───┐
▼ ▼ ▼ ▼
Server1 Server2 Server1 Server2
│ │
▼ ▼
User A User A User A User B
(not saved) (session saved)
Implementation
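import itertools
import uuid

import redis

# Redis session store (decode_responses=True returns str instead of bytes,
# so stored server names compare equal to the entries in self.servers)
session_store = redis.Redis(host='localhost', db=1, decode_responses=True)

# Cookie-based sticky sessions
class CookieStickyLB:
    def __init__(self, servers):
        self.servers = servers
        self._cycle = itertools.cycle(servers)

    def round_robin(self):
        return next(self._cycle)

    def get_server(self, request):
        # Check for existing session cookie
        session_id = request.cookies.get('session_id')
        if session_id:
            # Look up which server serves this session
            server = session_store.get(session_id)
            if server and server in self.servers:
                return server, session_id
        else:
            # New client: issue a session ID; the caller must set it as a cookie
            session_id = uuid.uuid4().hex
        # New or orphaned session - select server
        server = self.round_robin()
        # Store session mapping
        session_store.set(session_id, server)
        return server, session_id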
Geographic Load Balancing
DNS-Based Geo-Routing
┌─────────────────────────────────────────────────────────┐
│                     DNS Resolution                      │
├─────────────────────────────────────────────────────────┤
│                                                         │
│ User in US   ──▶ dns.google.com ──▶ 104.1.1.1 (US Edge) │
│                                                         │
│ User in EU   ──▶ dns.google.com ──▶ 104.2.2.2 (EU Edge) │
│                                                         │
│ User in Asia ──▶ dns.google.com ──▶ 104.3.3.3 (Asia)    │
│                                                         │
└─────────────────────────────────────────────────────────┘
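The resolution step in the diagram amounts to a lookup from the client's region to an edge address. A minimal sketch of that lookup follows, reusing the placeholder addresses from the diagram. The function name, the `healthy_regions` parameter, and the choice of the US edge as the fallback are assumptions made for illustration.

```python
EDGE_BY_REGION = {
    "US": "104.1.1.1",
    "EU": "104.2.2.2",
    "ASIA": "104.3.3.3",
}


def resolve_edge(region, healthy_regions=None, default_region="US"):
    """Return the edge IP for a client's region, falling back to a default."""
    if healthy_regions is None:
        healthy_regions = set(EDGE_BY_REGION)
    # Serve from the local edge when it exists and is healthy
    if region in EDGE_BY_REGION and region in healthy_regions:
        return EDGE_BY_REGION[region]
    # Unknown region or unhealthy edge: send the client to the default edge
    return EDGE_BY_REGION[default_region]


print(resolve_edge("EU"))                                 # → 104.2.2.2
print(resolve_edge("EU", healthy_regions={"US", "ASIA"}))  # → 104.1.1.1
```

A production resolver would typically choose the nearest healthy edge rather than a fixed default, but the fallback shows why geo-routing and health checks are usually configured together.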
Implementation with GeoDNS
# Route53 health check for a geolocation-routed endpoint
import uuid

import boto3

route53 = boto3.client('route53')

# Create a health check for the US endpoint
response = route53.create_health_check(
    CallerReference=str(uuid.uuid4()),  # required; unique per request
    HealthCheckConfig={
        'Type': 'HTTPS',
        'FullyQualifiedDomainName': 'us-east.example.com',
        'Port': 443,
        'ResourcePath': '/health',
    },
)
health_check_id = response['HealthCheck']['Id']

# Associate the health check with the endpoint's geolocation record
# US users → US server pool
# EU users → EU server pool
# Asia users → Asia server pool
Anycast DNS
Traditional DNS: Anycast DNS:
┌─────────┐ ┌─────────┐
│ User │ │ User │
└────┬────┘ └────┬────┘
│ DNS lookup │
▼ ▼
┌─────────┐ ┌─────────┐
│ Root │ │ Anycast│
│ DNS │ │ IP │──────────┐
└────┬────┘ └─────────┘ │
│ IP for us-east │
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ us-east │ │Edge DNS │ │Edge DNS │
│ server │ │ (any IP)│ │(any IP) │
└─────────┘ └─────────┘ └─────────┘
Layer 4 vs Layer 7 Load Balancing
Layer 4 (TCP/UDP)
# Nginx stream block - Layer 4
stream {
    upstream backend {
        server 10.0.1.1:80;
        server 10.0.1.2:80;
    }
    server {
        listen 80;
        proxy_pass backend;
    }
}
# Pros:
# - Lower latency (no SSL termination)
# - Less resource intensive
# - Better for high traffic
# Cons:
# - No content-based routing
# - No URL rewriting
Layer 7 (HTTP/HTTPS)
# Nginx http block - Layer 7
http {
    upstream api_backend {
        server 10.0.1.1:80;
        server 10.0.1.2:80;
    }
    upstream web_backend {
        server 10.0.2.1:80;
        server 10.0.2.2:80;
    }
    server {
        listen 80;
        # Route based on path
        location /api/ {
            proxy_pass http://api_backend;
        }
        location / {
            proxy_pass http://web_backend;
        }
    }
}
# Pros:
# - Content-based routing
# - Can modify requests/responses
# - SSL termination
# Cons:
# - Higher latency
# - More resource usage
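To make the content-based routing concrete, here is a small Python sketch of the same decision the configuration above expresses: choose a backend pool by the longest matching path prefix, then rotate within that pool. It illustrates the routing logic only, not how nginx is implemented, and `PathRouter` is a hypothetical name.

```python
import itertools


class PathRouter:
    def __init__(self, routes):
        # routes = {"/api/": [...servers...], "/": [...servers...]}
        self._pools = {
            prefix: itertools.cycle(servers) for prefix, servers in routes.items()
        }
        # Longest prefixes first so "/api/" wins over "/"
        self._prefixes = sorted(routes, key=len, reverse=True)

    def route(self, path):
        for prefix in self._prefixes:
            if path.startswith(prefix):
                # Round robin within the matching pool
                return next(self._pools[prefix])
        raise LookupError(f"no route for {path}")


router = PathRouter({
    "/api/": ["10.0.1.1:80", "10.0.1.2:80"],
    "/": ["10.0.2.1:80", "10.0.2.2:80"],
})
print(router.route("/api/users"))   # → 10.0.1.1:80
print(router.route("/index.html"))  # → 10.0.2.1:80
print(router.route("/api/orders"))  # → 10.0.1.2:80
```

A layer 4 balancer cannot make this decision because it never parses the HTTP request, which is the trade-off listed in the pros and cons above.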
High Availability Load Balancer Setup
            ┌─────────────────┐
            │   Global DNS    │
            │   (Failover)    │
            └────────┬────────┘
                     │
      ┌──────────────┼──────────────┐
      │              │              │
      ▼              ▼              ▼
┌───────────┐  ┌───────────┐  ┌───────────┐
│   LB 1    │  │   LB 2    │  │   LB 3    │
│ (Active)  │  │ (Standby) │  │ (Standby) │
└─────┬─────┘  └─────┬─────┘  └─────┬─────┘
      │              │              │
      └──────────────┼──────────────┘
                     │
             ┌───────┴───────┐
             │               │
             ▼               ▼
       ┌───────────┐   ┌───────────┐
       │ App Server│   │ App Server│
       │   Pool    │   │   Pool    │
       └───────────┘   └───────────┘
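The failover decision in this setup can be sketched as a priority list: traffic goes to the first load balancer that passes its health check, so a standby takes over only while every balancer ahead of it is down. The function and the names `lb1` to `lb3` are illustrative assumptions; real deployments usually delegate this to DNS failover or a protocol such as VRRP.

```python
def elect_active(load_balancers, is_healthy):
    """Return the first healthy load balancer in priority order, or None."""
    for lb in load_balancers:
        if is_healthy(lb):
            return lb
    return None


priority = ["lb1", "lb2", "lb3"]
down = {"lb1"}
print(elect_active(priority, lambda lb: lb not in down))  # → lb2
```

Once `lb1` recovers and passes its checks again, the same call returns it, which corresponds to failing back to the original active node.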
Best Practices
Configuration Example
# HAProxy configuration
global
    log /dev/log local0
    maxconn 4000
    user haproxy
    group haproxy
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option redispatch
    retries 3
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend http-in
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/server.pem
    # Health check endpoint
    monitor-uri /haproxy_status
    default_backend app-servers

backend app-servers
    # Load balancing algorithm
    balance roundrobin
    # Health check
    option httpchk GET /health
    http-check expect status 200
    # Enable session persistence with an inserted cookie
    # (the older appsession directive was removed in HAProxy 1.6)
    cookie SERVERID insert indirect nocache
    # Servers
    server app1 10.0.1.1:80 check inter 2000 rise 2 fall 3 cookie app1
    server app2 10.0.1.2:80 check inter 2000 rise 2 fall 3 cookie app2
    server app3 10.0.1.3:80 check inter 2000 rise 2 fall 3 cookie app3 backup
Conclusion
Load balancing is fundamental to scalable, reliable systems. Choose your algorithm based on workload characteristics: round-robin for uniform requests, least connections for variable-length processing, consistent hashing for cache-friendly routing. Health checks and graceful connection draining are non-negotiable in production. Layer 4 balancing is simpler and faster; layer 7 balancing provides richer routing and inspection capabilities.