Load balancing distributes traffic across multiple servers for availability, scalability, and performance. This guide covers algorithms, types, and implementation.
Load Balancing Algorithms
algorithms:
- name: "Round Robin"
description: "Sequentially distribute requests"
- name: "Weighted Round Robin"
description: "Distribute based on capacity"
- name: "Least Connections"
description: "Route to server with fewest active"
- name: "Least Response Time"
description: "Route to fastest responding"
- name: "IP Hash"
description: "Consistent hashing by client IP"
- name: "Random"
description: "Random distribution"
Implementation
import hashlib
import random

class LoadBalancer:
    def __init__(self, servers, algorithm="round_robin", weights=None):
        # servers: list of server identifiers (e.g., hostnames)
        # weights: optional {server: weight} map used by the weighted algorithm
        self.servers = servers
        self.algorithm = algorithm
        self.weights = weights or {s: 1 for s in servers}
        self.current_index = 0
        self.connections = {s: 0 for s in servers}

    def get_server(self, client_ip=None):
        if self.algorithm == "round_robin":
            return self._round_robin()
        elif self.algorithm == "least_connections":
            return self._least_connections()
        elif self.algorithm == "ip_hash":
            return self._ip_hash(client_ip)
        elif self.algorithm == "weighted":
            return self._weighted()
        return self.servers[0]

    def _round_robin(self):
        server = self.servers[self.current_index]
        self.current_index = (self.current_index + 1) % len(self.servers)
        return server

    def _least_connections(self):
        return min(self.servers, key=lambda s: self.connections[s])

    def _ip_hash(self, client_ip):
        # Stable hash: the same client IP always maps to the same server
        digest = hashlib.md5(client_ip.encode()).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]

    def _weighted(self):
        # Weighted random selection: a server with weight 3 is picked
        # three times as often as a server with weight 1
        total_weight = sum(self.weights[s] for s in self.servers)
        random_point = random.randint(1, total_weight)
        cumulative = 0
        for server in self.servers:
            cumulative += self.weights[server]
            if random_point <= cumulative:
                return server
        return self.servers[0]

    def record_request(self, server):
        self.connections[server] += 1

    def record_response(self, server):
        self.connections[server] -= 1
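For a feel of how the rotation behaves on its own, here is a minimal standalone round-robin demo; `itertools.cycle` does the index bookkeeping that `current_index` does in the class above (the IP addresses are placeholders):

```python
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
rotation = cycle(servers)

# Six requests wrap around the three-server pool twice
picks = [next(rotation) for _ in range(6)]
print(picks)
# -> ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1', '10.0.0.2', '10.0.0.3']
```

Note that a round-robin counter is shared mutable state: under multi-threaded use, server selection should be guarded with a lock (e.g., `threading.Lock`) to avoid two requests reading the same index.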
L4 vs L7 Load Balancing
layer_4:
name: "Transport Layer"
examples: "HAProxy TCP, AWS NLB"
pros:
- "Lower latency"
- "Simpler"
cons:
- "No content awareness"
- "Limited routing"
layer_7:
name: "Application Layer"
examples: "HAProxy HTTP, AWS ALB"
pros:
- "Content-based routing"
- "Can modify requests"
cons:
- "Higher latency"
- "More resource intensive"
Health Checks
import socket
import time

import requests

class HealthChecker:
    def __init__(self, servers):
        # servers: list of dicts, e.g. {"host": "10.0.0.1", "port": 8080, "http_path": "/health"}
        self.servers = servers
        # Dicts are not hashable, so key the status map by (host, port)
        self.status = {(s['host'], s['port']): True for s in servers}

    def check_health(self, server):
        try:
            # TCP check: can we open a connection at all?
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.settimeout(2)
            result = sock.connect_ex((server['host'], server['port']))
            sock.close()
            if result != 0:
                return False
            # HTTP check if a health-check path is configured
            if server.get('http_path'):
                url = f"http://{server['host']}:{server['port']}{server['http_path']}"
                response = requests.get(url, timeout=2)
                return response.status_code == 200
            return True
        except Exception:
            return False

    def get_healthy_servers(self):
        return [s for s in self.servers
                if self.status[(s['host'], s['port'])]]

    def run_checks_periodically(self, interval=10):
        # Intended to run in a background thread so checks don't block traffic
        while True:
            for server in self.servers:
                self.status[(server['host'], server['port'])] = self.check_health(server)
            time.sleep(interval)
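To tie health checking back to balancing, the pool handed to the balancer should be re-filtered from the latest status map on each selection, so a server marked down stops receiving traffic immediately. A minimal sketch (the addresses and status values are placeholders):

```python
def healthy_pool(servers, status):
    """Keep only servers whose most recent health check passed."""
    return [s for s in servers if status.get(s, False)]

status = {"10.0.0.1": True, "10.0.0.2": False, "10.0.0.3": True}
pool = healthy_pool(["10.0.0.1", "10.0.0.2", "10.0.0.3"], status)
print(pool)  # -> ['10.0.0.1', '10.0.0.3']
```

Defaulting unknown servers to unhealthy (`status.get(s, False)`) is a deliberate fail-safe choice: a server never sees traffic until it has passed at least one check.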
HAProxy Configuration
# HAProxy configuration
global
log /dev/log local0
maxconn 4096
user haproxy
group haproxy
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
frontend http_front
bind *:80
bind *:443 ssl crt /etc/ssl/certs/server.pem
default_backend app_backend
backend app_backend
balance roundrobin
# Health check
option httpchk GET /health
server app1 10.0.0.1:8080 check inter 2000 rise 2 fall 3
server app2 10.0.0.2:8080 check inter 2000 rise 2 fall 3
server app3 10.0.0.3:8080 check inter 2000 rise 2 fall 3 backup
Kubernetes Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: myapp-ingress
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
rules:
- host: myapp.example.com
http:
paths:
- path: /api
pathType: Prefix
backend:
service:
name: api-service
port:
number: 80
- path: /
pathType: Prefix
backend:
service:
name: web-service
port:
number: 80
Best Practices
# Load balancing best practices
health_checks:
- "Check every 5-30 seconds"
- "Use HTTP for app health"
- "Multiple failure thresholds"
resilience:
- "At least 2 load balancers"
- "Distribute across AZs"
- "Use DNS failover"
monitoring:
- "Track request latency"
- "Monitor server health"
- "Alert on failures"
Conclusion
Load balancing is essential for distributed systems:
- Algorithms: Round robin, least connections, IP hash
- L4 vs L7: Choose based on needs
- Health checks: Critical for reliability
- Kubernetes: Use Ingress for HTTP routing into the cluster
Use managed services (AWS ALB, Cloudflare) for production.