Load balancing distributes traffic across multiple servers for availability, scalability, and performance. This guide covers algorithms, types, and implementation.
Load Balancing Algorithms
algorithms:
- name: "Round Robin"
description: "Sequentially distribute requests"
- name: "Weighted Round Robin"
description: "Distribute based on capacity"
- name: "Least Connections"
description: "Route to server with fewest active"
- name: "Least Response Time"
description: "Route to fastest responding"
- name: "IP Hash"
description: "Consistent hashing by client IP"
- name: "Random"
description: "Random distribution"
Implementation
import hashlib
import random

class LoadBalancer:
    def __init__(self, servers, algorithm="round_robin", weights=None):
        # servers: list of server identifiers (e.g., hostnames)
        # weights: optional {server: weight} map used by the weighted algorithm
        self.servers = servers
        self.algorithm = algorithm
        self.weights = weights or {s: 1 for s in servers}
        self.current_index = 0
        self.connections = {s: 0 for s in servers}

    def get_server(self, client_ip=None):
        if self.algorithm == "round_robin":
            return self._round_robin()
        elif self.algorithm == "least_connections":
            return self._least_connections()
        elif self.algorithm == "ip_hash":
            return self._ip_hash(client_ip)
        elif self.algorithm == "weighted":
            return self._weighted()
        return self.servers[0]

    def _round_robin(self):
        server = self.servers[self.current_index]
        self.current_index = (self.current_index + 1) % len(self.servers)
        return server

    def _least_connections(self):
        return min(self.servers, key=lambda s: self.connections[s])

    def _ip_hash(self, client_ip):
        # Stable hash: the same client IP always maps to the same server
        digest = hashlib.md5(client_ip.encode()).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]

    def _weighted(self):
        # Weighted random selection: a server with weight 3 is picked
        # three times as often as a server with weight 1
        total_weight = sum(self.weights[s] for s in self.servers)
        random_point = random.randint(1, total_weight)
        cumulative = 0
        for server in self.servers:
            cumulative += self.weights[server]
            if random_point <= cumulative:
                return server
        return self.servers[0]

    def record_request(self, server):
        self.connections[server] += 1

    def record_response(self, server):
        self.connections[server] -= 1
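For a feel of how the rotation behaves on its own, here is a minimal standalone round-robin demo; `itertools.cycle` does the index bookkeeping that `current_index` does in the class above (the IP addresses are placeholders):

```python
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
rotation = cycle(servers)

# Six requests wrap around the three-server pool twice
picks = [next(rotation) for _ in range(6)]
print(picks)
# -> ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1', '10.0.0.2', '10.0.0.3']
```

Note that a round-robin counter is shared mutable state: under multi-threaded use, server selection should be guarded with a lock (e.g., `threading.Lock`) to avoid two requests reading the same index.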
L4 vs L7 Load Balancing
layer_4:
name: "Transport Layer"
examples: "HAProxy TCP, AWS NLB"
pros:
- "Lower latency"
- "Simpler"
cons:
- "No content awareness"
- "Limited routing"
layer_7:
name: "Application Layer"
examples: "HAProxy HTTP, AWS ALB"
pros:
- "Content-based routing"
- "Can modify requests"
cons:
- "Higher latency"
- "More resource intensive"
Health Checks
import socket
import time

import requests

class HealthChecker:
    def __init__(self, servers):
        # servers: list of dicts, e.g. {"host": "10.0.0.1", "port": 8080, "http_path": "/health"}
        self.servers = servers
        # Dicts are not hashable, so key the status map by (host, port)
        self.status = {(s['host'], s['port']): True for s in servers}

    def check_health(self, server):
        try:
            # TCP check: can we open a connection at all?
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.settimeout(2)
            result = sock.connect_ex((server['host'], server['port']))
            sock.close()
            if result != 0:
                return False
            # HTTP check if a health-check path is configured
            if server.get('http_path'):
                url = f"http://{server['host']}:{server['port']}{server['http_path']}"
                response = requests.get(url, timeout=2)
                return response.status_code == 200
            return True
        except Exception:
            return False

    def get_healthy_servers(self):
        return [s for s in self.servers
                if self.status[(s['host'], s['port'])]]

    def run_checks_periodically(self, interval=10):
        # Intended to run in a background thread so checks don't block traffic
        while True:
            for server in self.servers:
                self.status[(server['host'], server['port'])] = self.check_health(server)
            time.sleep(interval)
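To tie health checking back to balancing, the pool handed to the balancer should be re-filtered from the latest status map on each selection, so a server marked down stops receiving traffic immediately. A minimal sketch (the addresses and status values are placeholders):

```python
def healthy_pool(servers, status):
    """Keep only servers whose most recent health check passed."""
    return [s for s in servers if status.get(s, False)]

status = {"10.0.0.1": True, "10.0.0.2": False, "10.0.0.3": True}
pool = healthy_pool(["10.0.0.1", "10.0.0.2", "10.0.0.3"], status)
print(pool)  # -> ['10.0.0.1', '10.0.0.3']
```

Defaulting unknown servers to unhealthy (`status.get(s, False)`) is a deliberate fail-safe choice: a server never sees traffic until it has passed at least one check.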
HAProxy Configuration
# HAProxy configuration
global
log /dev/log local0
maxconn 4096
user haproxy
group haproxy
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
frontend http_front
bind *:80
bind *:443 ssl crt /etc/ssl/certs/server.pem
default_backend app_backend
backend app_backend
balance roundrobin
# Health check
option httpchk GET /health
server app1 10.0.0.1:8080 check inter 2000 rise 2 fall 3
server app2 10.0.0.2:8080 check inter 2000 rise 2 fall 3
server app3 10.0.0.3:8080 check inter 2000 rise 2 fall 3 backup
Kubernetes Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: myapp-ingress
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
rules:
- host: myapp.example.com
http:
paths:
- path: /api
pathType: Prefix
backend:
service:
name: api-service
port:
number: 80
- path: /
pathType: Prefix
backend:
service:
name: web-service
port:
number: 80
Best Practices
# Load balancing best practices
health_checks:
- "Check every 5-30 seconds"
- "Use HTTP for app health"
- "Multiple failure thresholds"
resilience:
- "At least 2 load balancers"
- "Distribute across AZs"
- "Use DNS failover"
monitoring:
- "Track request latency"
- "Monitor server health"
- "Alert on failures"
Conclusion
Load balancing is essential for distributed systems:
- Algorithms: Round robin, least connections, IP hash
- L4 vs L7: Choose based on needs
- Health checks: Critical for reliability
- Kubernetes: Use Ingress for HTTP routing into the cluster
Use managed services (AWS ALB, Cloudflare) for production.