Load Balancing Strategies: Complete Guide
Load balancing is essential for distributing traffic across multiple servers. This guide covers algorithms, implementations, and best practices for building resilient systems.
Why Load Balancing?
Without Load Balancing:            With Load Balancing:
┌──────────────┐                   ┌───────────────────────┐
│              │                   │     Load Balancer     │
│   Requests   │                   └───────────┬───────────┘
│   ───────▶   │                               │
│      │       │                   ┌───────────┼───────────┐
│      ▼       │                   ▼           ▼           ▼
│   Server 1   │               ┌───────┐   ┌───────┐   ┌───────┐
│  (overload)  │               │Server1│   │Server2│   │Server3│
└──────────────┘               └───────┘   └───────┘   └───────┘

Results:                           Results:
- Server crashes                   - Distributed load
- No fault tolerance               - Automatic failover
- Poor user experience             - Better performance
Load Balancing Algorithms
1. Round Robin
class RoundRobin:
    def __init__(self, servers):
        self.servers = servers
        self.index = 0

    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

# All servers get equal requests
# Good for: Homogeneous servers, stateless services
2. Least Connections
class LeastConnections:
    def __init__(self, servers):
        self.servers = {s: 0 for s in servers}

    def get_server(self):
        # Select server with fewest active connections
        server = min(self.servers, key=self.servers.get)
        self.servers[server] += 1
        return server

    def release(self, server):
        self.servers[server] -= 1

# Dynamic - adapts to current load
# Good for: Long-lived connections, variable request times
3. Weighted Round Robin
class WeightedRoundRobin:
    def __init__(self, servers):
        # servers = [("server1", 3), ("server2", 1)]
        self.servers = []
        for server, weight in servers:
            self.servers.extend([server] * weight)
        self.index = 0

    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

# Server1: 75% of traffic (weight 3)
# Server2: 25% of traffic (weight 1)
# Good for: Heterogeneous server capacities
4. IP Hash
import hashlib

class IPHash:
    def __init__(self, servers):
        self.servers = servers

    def get_server(self, client_ip):
        # Use a stable hash: Python's built-in hash() is randomized
        # per process for strings, which would break stickiness
        hash_value = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
        index = hash_value % len(self.servers)
        return self.servers[index]

# Same IP always goes to same server (while the server list is unchanged)
# Good for: Sticky sessions, cache locality
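Note that the modulo mapping above reshuffles almost every client when a server is added or removed. True consistent hashing keeps most assignments stable; below is a minimal hash-ring sketch (the class name, server labels, and the 100-virtual-node default are our own illustrative choices, not part of any library):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Hash ring with virtual nodes; only ~1/N of keys move when a server changes."""

    def __init__(self, servers, vnodes=100):
        self.ring = []  # sorted list of (hash, server)
        for server in servers:
            for i in range(vnodes):
                h = self._hash(f"{server}#{i}")
                self.ring.append((h, server))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        # Stable across processes, unlike Python's built-in hash()
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_server(self, client_ip):
        # Walk clockwise to the first virtual node at or after the key's hash
        h = self._hash(client_ip)
        idx = bisect.bisect_left(self.ring, (h,)) % len(self.ring)
        return self.ring[idx][1]
```

With this scheme, adding a fourth server only steals the keys that now land on its virtual nodes; every other client keeps its old server.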
5. Least Response Time
import time
class LeastResponseTime:
def __init__(self, servers):
self.servers = {s: {"active": 0, "avg_time": 0} for s in servers}
def get_server(self):
# Select server with lowest (active + avg_response_time)
best = min(
self.servers.items(),
key=lambda x: x[1]["active"] + x[1]["avg_time"]
)
self.servers[best[0]]["active"] += 1
return best[0]
def record_response_time(self, server, duration):
# Update rolling average
current = self.servers[server]["avg_time"]
self.servers[server]["avg_time"] = (current * 0.7 + duration * 0.3)
self.servers[server]["active"] -= 1
# Adapts to real performance
# Good for: Performance-critical applications
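The 0.7/0.3 update above is an exponentially weighted moving average: each new sample gets 30% weight and older history decays geometrically, so the estimate tracks a server's real latency within a couple dozen requests. A quick numeric check (the 100 ms figure is arbitrary):

```python
def ema(current, sample, alpha=0.3):
    # Same update as record_response_time: new = 0.7*old + 0.3*sample
    return current * (1 - alpha) + sample * alpha

avg = 0.0
for _ in range(20):
    avg = ema(avg, 100.0)  # server consistently answers in 100 ms
# avg is now within a fraction of a millisecond of 100
```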
Health Checks
Types of Health Checks
import socket

import requests

class HealthChecker:
    def __init__(self, servers):
        self.servers = servers
        self.status = {s: True for s in servers}

    # 1. TCP Check - Port open
    def tcp_check(self, server, port=80):
        sock = socket.socket()
        sock.settimeout(2)
        try:
            sock.connect((server, port))
            return True
        except OSError:
            return False
        finally:
            sock.close()

    # 2. HTTP Check - GET endpoint
    def http_check(self, server):
        try:
            response = requests.get(f"http://{server}/health", timeout=2)
            return response.status_code == 200
        except requests.RequestException:
            return False

    # 3. Deep Check - Verify dependencies
    def deep_check(self, server):
        try:
            # Check main service, database, and cache health endpoints
            r1 = requests.get(f"http://{server}/health", timeout=2)
            r2 = requests.get(f"http://{server}/db/health", timeout=2)
            r3 = requests.get(f"http://{server}/cache/health", timeout=2)
            return all(r.status_code == 200 for r in [r1, r2, r3])
        except requests.RequestException:
            return False

    def check_all(self):
        for server in self.servers:
            self.status[server] = self.http_check(server)
        return self.status
Health Check Configuration
# Example Kubernetes health check config
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
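The failureThreshold setting exists to avoid flapping: one dropped probe should not eject a server, and one lucky probe should not restore it. A minimal sketch of that debouncing logic (the class name is ours; fall/rise mirror the thresholds used in this guide's configs):

```python
class ProbeTracker:
    """Mark a target unhealthy only after `fall` consecutive failures,
    and healthy again only after `rise` consecutive successes."""

    def __init__(self, fall=3, rise=2):
        self.fall, self.rise = fall, rise
        self.healthy = True
        self._streak = 0  # consecutive results contradicting current state

    def record(self, success):
        if success == self.healthy:
            self._streak = 0  # current state confirmed; reset counter
            return self.healthy
        self._streak += 1
        threshold = self.rise if not self.healthy else self.fall
        if self._streak >= threshold:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy
```

HAProxy's `rise`/`fall` server options (used later in this guide) implement the same idea on the balancer side.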
Sticky Sessions
Why Sticky Sessions?
Without Sticky Sessions:           With Sticky Sessions:
      ┌─────────┐                       ┌─────────┐
      │   LB    │                       │   LB    │
      └────┬────┘                       └────┬────┘
           │                                 │
       ┌───┴───┐                         ┌───┴───┐
       ▼       ▼                         ▼       ▼
   Server1  Server2                  Server1  Server2
       ▲       ▲                         ▲       ▲
       │       │                         │       │
    User A  User A                    User A  User B
 (session not saved)                (session saved)
Implementation
# Cookie-based sticky sessions
import itertools
import uuid

class CookieStickyLB:
    def __init__(self, servers):
        self.servers = servers
        self._cycle = itertools.cycle(servers)

    def get_server(self, request):
        # Check for existing session cookie
        session_id = request.cookies.get('session_id')
        if session_id:
            # Look up which server serves this session
            server = session_store.get(session_id)
            if server and server.decode() in self.servers:
                return server.decode()
        # New session - select the next server round-robin
        server = next(self._cycle)
        # Store the mapping; the response handler must also set the
        # session_id cookie so the client presents it next time
        session_id = session_id or str(uuid.uuid4())
        session_store.set(session_id, server)
        return server

# Redis session store (values come back as bytes, hence .decode() above)
import redis
session_store = redis.Redis(host='localhost', db=1)
Geographic Load Balancing
DNS-Based Geo-Routing
┌──────────────────────────────────────────────────────────┐
│                      DNS Resolution                      │
├──────────────────────────────────────────────────────────┤
│                                                          │
│  User in US   ──▶ dns.google.com ──▶ 104.1.1.1 (US Edge) │
│                                                          │
│  User in EU   ──▶ dns.google.com ──▶ 104.2.2.2 (EU Edge) │
│                                                          │
│  User in ASIA ──▶ dns.google.com ──▶ 104.3.3.3 (Asia)    │
│                                                          │
└──────────────────────────────────────────────────────────┘
Implementation with GeoDNS
# Route53 geolocation routing
import uuid

import boto3

route53 = boto3.client('route53')

# Create a health check for the regional endpoint
response = route53.create_health_check(
    CallerReference=str(uuid.uuid4()),
    HealthCheckConfig={
        'Type': 'HTTPS',
        'FullyQualifiedDomainName': 'us-east.example.com',
        'Port': 443,
        'ResourcePath': '/health'
    }
)

# Associate the health check with geolocation record sets:
# US users   -> US server pool
# EU users   -> EU server pool
# Asia users -> Asia server pool
Anycast DNS
Traditional DNS:                 Anycast DNS:
┌─────────┐                      ┌─────────┐
│  User   │                      │  User   │
└────┬────┘                      └────┬────┘
     │ DNS lookup                     │
     ▼                                ▼
┌─────────┐                      ┌─────────┐
│  Root   │                      │ Anycast │──────────────┐
│  DNS    │                      │   IP    │              │
└────┬────┘                      └─────────┘              │
     │ IP for us-east                 │                   │
     ▼                                ▼                   ▼
┌─────────┐                      ┌─────────┐         ┌─────────┐
│ us-east │                      │Edge DNS │         │Edge DNS │
│ server  │                      │(any IP) │         │(any IP) │
└─────────┘                      └─────────┘         └─────────┘
Layer 4 vs Layer 7 Load Balancing
Layer 4 (TCP/UDP)
# Nginx stream block - Layer 4
stream {
    upstream backend {
        server 10.0.1.1:80;
        server 10.0.1.2:80;
    }

    server {
        listen 80;
        proxy_pass backend;
    }
}

# Pros:
# - Lower latency (no SSL termination)
# - Less resource intensive
# - Better for high traffic

# Cons:
# - No content-based routing
# - No URL rewriting
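Because a Layer 4 balancer never parses HTTP, the only routing key it has is the connection 4-tuple: source IP, source port, destination IP, destination port. A small sketch of tuple-based backend selection (addresses are placeholders):

```python
import hashlib

BACKENDS = ["10.0.1.1:80", "10.0.1.2:80"]

def pick_backend(src_ip, src_port, dst_ip, dst_port):
    # A Layer 4 balancer sees only these four values - no path, no headers
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    h = int(hashlib.md5(key).hexdigest(), 16)
    return BACKENDS[h % len(BACKENDS)]
```

All packets of one connection share the same tuple, so they all reach the same backend, which is exactly what makes TCP passthrough work without any application awareness.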
Layer 7 (HTTP/HTTPS)
# Nginx http block - Layer 7
http {
    upstream api_backend {
        server 10.0.1.1:80;
        server 10.0.1.2:80;
    }

    upstream web_backend {
        server 10.0.2.1:80;
        server 10.0.2.2:80;
    }

    server {
        listen 80;

        # Route based on path
        location /api/ {
            proxy_pass http://api_backend;
        }

        location / {
            proxy_pass http://web_backend;
        }
    }
}

# Pros:
# - Content-based routing
# - Can modify requests/responses
# - SSL termination

# Cons:
# - Higher latency
# - More resource usage
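The location rules above are what Layer 4 cannot do: inspect the request path before choosing a pool. The same decision, sketched in a few lines (pool addresses are placeholders):

```python
API_POOL = ["10.0.1.1:80", "10.0.1.2:80"]
WEB_POOL = ["10.0.2.1:80", "10.0.2.2:80"]

def route(path):
    # Prefix match on the request path, like nginx "location" blocks
    if path.startswith("/api/"):
        return API_POOL
    return WEB_POOL
```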
High Availability Load Balancer Setup
                ┌─────────────────┐
                │   Global DNS    │
                │   (Failover)    │
                └────────┬────────┘
                         │
          ┌──────────────┼──────────────┐
          │              │              │
          ▼              ▼              ▼
   ┌───────────┐  ┌───────────┐  ┌───────────┐
   │   LB 1    │  │   LB 2    │  │   LB 3    │
   │ (Active)  │  │ (Standby) │  │ (Standby) │
   └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
         │              │              │
         └──────────────┼──────────────┘
                        │
               ┌────────┴───────┐
               │                │
               ▼                ▼
        ┌───────────┐    ┌───────────┐
        │ App Server│    │ App Server│
        │   Pool    │    │   Pool    │
        └───────────┘    └───────────┘
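In an active/standby layout like this, the DNS tier only has to answer with the highest-priority balancer that is still alive. A sketch of that failover decision (function and balancer names are illustrative):

```python
def pick_active(balancers, is_healthy):
    """Return the first healthy balancer in priority order, else None.

    `balancers` is ordered active-first; `is_healthy` is a probe callback.
    """
    for lb in balancers:
        if is_healthy(lb):
            return lb
    return None
```

Standbys only receive traffic once every balancer ahead of them has failed its probe, which matches the Active/Standby roles in the diagram.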
Best Practices
Configuration Example
# HAProxy configuration
global
    log /dev/log local0
    maxconn 4000
    user haproxy
    group haproxy
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option redispatch
    retries 3
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend http-in
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/server.pem
    # Health check endpoint
    monitor-uri /haproxy_status
    default_backend app-servers

backend app-servers
    # Load balancing algorithm
    balance roundrobin
    # Health check
    option httpchk GET /health
    http-check expect status 200
    # Session persistence via an inserted cookie
    # (the old "appsession" directive was removed in HAProxy 1.6)
    cookie SERVERID insert indirect nocache
    # Servers
    server app1 10.0.1.1:80 check inter 2000 rise 2 fall 3 cookie app1
    server app2 10.0.1.2:80 check inter 2000 rise 2 fall 3 cookie app2
    server app3 10.0.1.3:80 check inter 2000 rise 2 fall 3 backup