Load Balancing Strategies: Complete Guide

Created: February 26, 2026 Larry Qu 6 min read

Load balancing is essential for distributing traffic across multiple servers. This guide covers algorithms, implementations, and best practices for building resilient systems.

Why Load Balancing?

Without Load Balancing:           With Load Balancing:
┌──────────────┐                 ┌──────────────────────┐
│              │                 │    Load Balancer     │
│   Requests   │                 └──────────┬───────────┘
│    ────────▶ │                            │
│              │                 ┌───────────┼───────────┐
│              │                 ▼           ▼           ▼
│   Server 1   │              ┌───────┐ ┌───────┐ ┌───────┐
│   (overload) │              │Server1│ │Server2│ │Server3│
└──────────────┘              └───────┘ └───────┘ └───────┘

Results without load balancing:
- Server crashes
- No fault tolerance
- Poor user experience

Results with load balancing:
- Distributed load
- Automatic failover
- Better performance

Load Balancing Algorithms

1. Round Robin

class RoundRobin:
    def __init__(self, servers):
        self.servers = servers
        self.index = 0
    
    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

## All servers get equal requests
## Good for: Homogeneous servers, stateless services
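
The class above updates `self.index` in two steps, which is not safe if several threads call `get_server` at once. A minimal sketch of one way to guard it, assuming a threaded balancer process; the class name `ThreadSafeRoundRobin` is hypothetical:

```python
import itertools
import threading


class ThreadSafeRoundRobin:
    def __init__(self, servers):
        # cycle() repeats the server list endlessly in order
        self._cycle = itertools.cycle(list(servers))
        self._lock = threading.Lock()

    def get_server(self):
        # The lock ensures only one thread advances the cycle at a time
        with self._lock:
            return next(self._cycle)


lb = ThreadSafeRoundRobin(["server1", "server2", "server3"])
picks = [lb.get_server() for _ in range(4)]
```

After three picks the rotation wraps around, so the fourth pick is `server1` again.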

2. Least Connections

class LeastConnections:
    def __init__(self, servers):
        self.servers = {s: 0 for s in servers}
    
    def get_server(self):
        # Select server with fewest active connections
        server = min(self.servers, key=self.servers.get)
        self.servers[server] += 1
        return server
    
    def release(self, server):
        self.servers[server] -= 1

## Dynamic - adapts to current load
## Good for: Long-lived connections, variable request times

3. Weighted Round Robin

class WeightedRoundRobin:
    def __init__(self, servers):
        # servers = [("server1", 3), ("server2", 1)]
        self.servers = []
        for server, weight in servers:
            self.servers.extend([server] * weight)
        self.index = 0
    
    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

## Server1: 75% of traffic (weight 3)
## Server2: 25% of traffic (weight 1)
## Good for: Heterogeneous server capacities
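
Expanding the list by weight sends consecutive requests to the heaviest server in a burst. A smoother variant, sketched below, is the "smooth weighted round robin" approach used by Nginx: each server accumulates its weight on every pick, the highest accumulated value wins, and the winner is reduced by the total weight. The class name is hypothetical and this is an illustration, not Nginx's actual code.

```python
class SmoothWeightedRoundRobin:
    def __init__(self, servers):
        # servers = [("server1", 3), ("server2", 1)]
        self.weights = dict(servers)
        self.current = {name: 0 for name in self.weights}
        self.total = sum(self.weights.values())

    def get_server(self):
        # Every server gains its own weight on each pick
        for name, weight in self.weights.items():
            self.current[name] += weight
        # The server with the highest accumulated value wins (ties: first listed)
        chosen = max(self.current, key=self.current.get)
        # The winner pays back the total weight, so the others can catch up
        self.current[chosen] -= self.total
        return chosen


lb = SmoothWeightedRoundRobin([("a", 5), ("b", 1), ("c", 1)])
sequence = [lb.get_server() for _ in range(7)]
```

Over one full cycle each server is still picked in proportion to its weight, but the picks for the lighter servers are interleaved rather than grouped at the end.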

4. IP Hash

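import hashlib

class IPHash:
    def __init__(self, servers):
        self.servers = servers
    
    def get_server(self, client_ip):
        # Plain modulo hashing (this is not consistent hashing).
        # Use a stable digest: Python's built-in hash() of a str is
        # randomized per process, so it would change across restarts.
        digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
        hash_value = int.from_bytes(digest[:8], "big")
        index = hash_value % len(self.servers)
        return self.servers[index]

## Same IP goes to the same server while the server list is unchanged
## Adding or removing a server remaps most clients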
## Good for: Sticky sessions, cache locality
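
Mapping a hash to a server with a modulo remaps most clients whenever the server list changes. Consistent hashing limits that: servers are placed at many points on a ring, and a client goes to the first server point at or after its own hash. The sketch below assumes SHA-256 as the hash and 100 points per server; the names `stable_hash` and `ConsistentHashRing` are hypothetical.

```python
import bisect
import hashlib


def stable_hash(key):
    # Stable across processes, unlike the built-in hash() for strings
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    def __init__(self, servers, replicas=100):
        self.replicas = replicas
        self.ring = []  # sorted list of (point_hash, server)
        for server in servers:
            self.add_server(server)

    def add_server(self, server):
        # Each server owns several points so load spreads more evenly
        for i in range(self.replicas):
            bisect.insort(self.ring, (stable_hash(f"{server}#{i}"), server))

    def remove_server(self, server):
        self.ring = [(h, s) for h, s in self.ring if s != server]

    def get_server(self, client_ip):
        if not self.ring:
            raise ValueError("no servers in the ring")
        # First point clockwise from the client's hash, wrapping at the end
        idx = bisect.bisect_left(self.ring, (stable_hash(client_ip), ""))
        return self.ring[idx % len(self.ring)][1]


ring = ConsistentHashRing(["server1", "server2", "server3"])
clients = [f"10.0.0.{i}" for i in range(50)]
before = {ip: ring.get_server(ip) for ip in clients}
ring.remove_server("server3")
after = {ip: ring.get_server(ip) for ip in clients}
```

Only clients that were on the removed server move; every other client keeps its original server.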

5. Least Response Time

import time

class LeastResponseTime:
    def __init__(self, servers):
        self.servers = {s: {"active": 0, "avg_time": 0} for s in servers}
    
    def get_server(self):
        # Select server with lowest score (active connections + average response time);
        # the two terms have different units, so treat this as a rough heuristic
        best = min(
            self.servers.items(),
            key=lambda x: x[1]["active"] + x[1]["avg_time"]
        )
        self.servers[best[0]]["active"] += 1
        return best[0]
    
    def record_response_time(self, server, duration):
        # Exponentially weighted moving average: 70% history, 30% latest sample
        current = self.servers[server]["avg_time"]
        self.servers[server]["avg_time"] = (current * 0.7 + duration * 0.3)
        self.servers[server]["active"] -= 1

## Adapts to real performance
## Good for: Performance-critical applications

Health Checks

Types of Health Checks

import socket

import requests

class HealthChecker:
    def __init__(self, servers):
        self.servers = servers
        self.status = {s: True for s in servers}
    
    # 1. TCP Check - Port open
    def tcp_check(self, server, port=80):
        try:
            # The with-block closes the socket once the connection succeeds
            with socket.create_connection((server, port), timeout=2):
                return True
        except OSError:
            return False
    
    # 2. HTTP Check - GET endpoint
    def http_check(self, server):
        try:
            response = requests.get(f"http://{server}/health", timeout=2)
            return response.status_code == 200
        except requests.RequestException:
            return False
    
    # 3. Deep Check - Verify dependencies
    def deep_check(self, server):
        # Main service, database, and cache endpoints
        paths = ["/health", "/db/health", "/cache/health"]
        try:
            # A timeout on every request keeps a hung dependency from blocking the checker
            return all(
                requests.get(f"http://{server}{path}", timeout=2).status_code == 200
                for path in paths
            )
        except requests.RequestException:
            return False
    
    def check_all(self):
        for server in self.servers:
            self.status[server] = self.http_check(server)
        return self.status

Health Check Configuration

## Example Kubernetes health check config
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
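
With `failureThreshold: 3`, a single failed probe does not change a target's state; only three consecutive failures do. A minimal sketch of that counting logic, assuming a success threshold of 1; the `ProbeState` class is hypothetical and not part of Kubernetes or any library:

```python
class ProbeState:
    """Tracks consecutive probe results for one target (illustrative helper)."""

    def __init__(self, failure_threshold=3, success_threshold=1):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.healthy = True
        self.failures = 0
        self.successes = 0

    def record(self, ok):
        if ok:
            # A success resets the failure streak
            self.successes += 1
            self.failures = 0
            if not self.healthy and self.successes >= self.success_threshold:
                self.healthy = True
        else:
            # A failure resets the success streak
            self.failures += 1
            self.successes = 0
            if self.healthy and self.failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy


state = ProbeState(failure_threshold=3)
results = [state.record(ok) for ok in (False, False, False, True)]
```

The target stays healthy through the first two failures, flips to unhealthy on the third, and recovers on the next success.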

Sticky Sessions

Why Sticky Sessions?

Without Sticky Sessions:          With Sticky Sessions:
┌─────────┐                       ┌─────────┐
│   LB    │                       │   LB    │
└────┬────┘                       └────┬────┘
     │                                  │
  ┌──┴──┐                          ┌───┴───┐
  ▼     ▼                          ▼       ▼
Server1 Server2                  Server1  Server2
  │                               │
  ▼                               ▼
User A  User A                   User A   User B
(not saved)                      (session saved)

Implementation

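## Cookie-based sticky sessions
import uuid

import redis

## Redis session store (decode_responses=True returns str instead of bytes)
session_store = redis.Redis(host='localhost', db=1, decode_responses=True)

class CookieStickyLB:
    def __init__(self, servers):
        self.servers = servers
        self.index = 0
    
    def round_robin(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server
    
    def get_server(self, request):
        # Check for existing session cookie
        session_id = request.cookies.get('session_id')
        
        if session_id:
            # Look up which server serves this session
            server = session_store.get(session_id)
            if server and server in self.servers:
                return server, session_id
        
        # New or unknown session - select a server and create an ID if needed
        server = self.round_robin()
        session_id = session_id or str(uuid.uuid4())
        
        # Store session mapping
        session_store.set(session_id, server)
        
        # The caller must set the session_id cookie on the response
        return server, session_id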

Geographic Load Balancing

DNS-Based Geo-Routing

┌─────────────────────────────────────────────────────────┐
│                   DNS Resolution                         │
├─────────────────────────────────────────────────────────┤
│                                                          │
│   User in US ──▶ dns.google.com ──▶ 104.1.1.1 (US Edge) │
│                                                          │
│   User in EU ──▶ dns.google.com ──▶ 104.2.2.2 (EU Edge) │
│                                                          │
│   User in ASIA─▶ dns.google.com ──▶ 104.3.3.3 (Asia)    │
│                                                          │
└─────────────────────────────────────────────────────────┘

Implementation with GeoDNS

## Route53 health check for geolocation routing
import uuid

import boto3

route53 = boto3.client('route53')

## Create a health check for the US endpoint
response = route53.create_health_check(
    CallerReference=str(uuid.uuid4()),  # unique string required for each request
    HealthCheckConfig={
        'Type': 'HTTPS',
        'FullyQualifiedDomainName': 'us-east.example.com',
        'Port': 443,
        'ResourcePath': '/health'
    }
)

## Associate the health check with each region's record set
## US users → US server pool
## EU users → EU server pool
## Asia users → Asia server pool
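
One way to express this region mapping is a geolocation record set per region. The sketch below only builds the `ChangeBatch` dictionary in the shape that boto3's `change_resource_record_sets` accepts; it makes no AWS calls. The record name `app.example.com.`, the helper `geo_record_change`, and the reuse of the IP addresses from the diagram are assumptions for illustration.

```python
def geo_record_change(name, set_id, geo, ip, health_check_id=None, ttl=60):
    """Build one UPSERT change for a geolocation A record (hypothetical helper)."""
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": set_id,  # must be unique among records with this name
        "GeoLocation": geo,
        "TTL": ttl,
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id:
        # Lets Route 53 skip this record while its health check is failing
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


change_batch = {
    "Changes": [
        geo_record_change("app.example.com.", "us", {"CountryCode": "US"}, "104.1.1.1"),
        geo_record_change("app.example.com.", "eu", {"ContinentCode": "EU"}, "104.2.2.2"),
        geo_record_change("app.example.com.", "asia", {"ContinentCode": "AS"}, "104.3.3.3"),
        # Default record for locations that match no other rule
        geo_record_change("app.example.com.", "default", {"CountryCode": "*"}, "104.1.1.1"),
    ]
}

# With credentials and a hosted zone in place, this could be applied with:
# route53.change_resource_record_sets(HostedZoneId="<your zone id>", ChangeBatch=change_batch)
```

Passing the `Id` returned by `create_health_check` as `health_check_id` ties a record to its health check.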

Anycast DNS

Traditional DNS:          Anycast DNS:
                          
┌─────────┐              ┌─────────┐
│  User   │              │  User   │
└────┬────┘              └────┬────┘
     │ DNS lookup              │
     ▼                         ▼
┌─────────┐              ┌─────────┐
│  Root   │              │  Anycast│
│  DNS    │              │  IP     │──────────┐
└────┬────┘              └─────────┘          │
     │ IP for us-east                             │
     ▼                         ▼            ▼
┌─────────┐              ┌─────────┐      ┌─────────┐
│ us-east │              │Edge DNS │      │Edge DNS │
│  server │              │ (any IP)│      │(any IP) │
└─────────┘              └─────────┘      └─────────┘

Layer 4 vs Layer 7 Load Balancing

Layer 4 (TCP/UDP)

## Nginx stream block - Layer 4
stream {
    upstream backend {
        server 10.0.1.1:80;
        server 10.0.1.2:80;
    }
    
    server {
        listen 80;
        proxy_pass backend;
    }
}

## Pros:
## - Lower latency (no SSL termination)
## - Less resource intensive
## - Better for high traffic

## Cons:
## - No content-based routing
## - No URL rewriting

Layer 7 (HTTP/HTTPS)

## Nginx http block - Layer 7
http {
    upstream api_backend {
        server 10.0.1.1:80;
        server 10.0.1.2:80;
    }
    
    upstream web_backend {
        server 10.0.2.1:80;
        server 10.0.2.2:80;
    }
    
    server {
        listen 80;
        
        # Route based on path
        location /api/ {
            proxy_pass http://api_backend;
        }
        
        location / {
            proxy_pass http://web_backend;
        }
    }
}

## Pros:
## - Content-based routing
## - Can modify requests/responses
## - SSL termination

## Cons:
## - Higher latency
## - More resource usage
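
The routing decision that Layer 7 adds can be sketched in a few lines of Python. This is only an illustration of path-based pool selection with the longest matching prefix winning, not a model of Nginx internals; the `PathRouter` class is hypothetical and the addresses mirror the config above.

```python
import itertools


class PathRouter:
    def __init__(self, routes):
        # routes: list of (path_prefix, [servers]); longer prefixes are checked first
        self.routes = sorted(
            ((prefix, itertools.cycle(servers)) for prefix, servers in routes),
            key=lambda route: len(route[0]),
            reverse=True,
        )

    def get_server(self, path):
        for prefix, pool in self.routes:
            if path.startswith(prefix):
                # Round robin within the matched pool
                return next(pool)
        raise LookupError(f"no route for {path}")


router = PathRouter([
    ("/api/", ["10.0.1.1:80", "10.0.1.2:80"]),
    ("/", ["10.0.2.1:80", "10.0.2.2:80"]),
])
first_api = router.get_server("/api/users")
second_api = router.get_server("/api/orders")
web = router.get_server("/index.html")
```

A Layer 4 balancer never sees the request path, so it cannot make this choice; it can only pick a backend per connection.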

High Availability Load Balancer Setup

                    ┌─────────────────┐
                    │   Global DNS    │
                    │  (Failover)    │
                    └────────┬────────┘
              ┌──────────────┼──────────────┐
              │              │              │
              ▼              ▼              ▼
       ┌───────────┐  ┌───────────┐  ┌───────────┐
       │    LB 1   │  │    LB 2   │  │    LB 3   │
       │  (Active) │  │ (Standby) │  │(Standby)  │
       └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
             │              │              │
             └──────────────┼──────────────┘
                    ┌───────┴───────┐
                    │               │
                    ▼               ▼
              ┌───────────┐   ┌───────────┐
              │ App Server│   │ App Server│
              │   Pool    │   │   Pool    │
              └───────────┘   └───────────┘

Best Practices

Configuration Example

## HAProxy configuration
global
    log /dev/log local0
    maxconn 4000
    user haproxy
    group haproxy
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option redispatch
    retries 3
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend http-in
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/server.pem
    
    # Health check endpoint
    monitor-uri /haproxy_status
    
    default_backend app-servers

backend app-servers
    # Load balancing algorithm
    balance roundrobin
    
    # Health check
    option httpchk GET /health
    http-check expect status 200
    
    # Session persistence: HAProxy inserts a SERVERID cookie naming the server
    # (the appsession directive was removed in HAProxy 1.6)
    cookie SERVERID insert indirect nocache
    
    # Servers
    server app1 10.0.1.1:80 check inter 2000 rise 2 fall 3 cookie app1
    server app2 10.0.1.2:80 check inter 2000 rise 2 fall 3 cookie app2
    server app3 10.0.1.3:80 check inter 2000 rise 2 fall 3 backup cookie app3

Conclusion

Load balancing is fundamental to scalable, reliable systems. Choose your algorithm based on workload characteristics: round-robin for uniform requests, least connections for variable-length processing, consistent hashing for cache-friendly routing. Health checks and graceful connection draining are non-negotiable in production. Layer 4 balancing is simpler and faster; layer 7 balancing provides richer routing and inspection capabilities.
