Caching Strategies: Redis, Memcached, and Distributed Caching Patterns
Every database has a breaking point. As your application grows, database queries become the bottleneck. A single query that takes 10ms might seem fast, but at 10,000 requests per second that adds up to 100 seconds of database time every second, far more than any single server can provide.
Caching is the solution. By storing frequently accessed data in memory, you can serve requests in microseconds instead of milliseconds. The difference between a system that handles 100 requests per second and one that handles 10,000 is often just a well-implemented caching layer.
In this guide, we’ll explore caching fundamentals, compare Redis and Memcached, and dive into distributed caching patterns that power the world’s largest applications.
Caching Fundamentals
Core Concepts
- Cache Hit: Request is served from cache (fast, ~1-5ms)
- Cache Miss: Data not in cache, fetched from source (slow, ~10-100ms)
- Hit Rate: Percentage of requests served from cache (target: 80-95%)
- TTL (Time To Live): How long data stays in cache before expiring
Why Caching Matters
Without caching:
User Request → Database Query (10ms) → Response (10ms)
1000 req/s = 10,000ms database time/s = 10 database servers needed
With caching (90% hit rate):
User Request → Cache Hit (1ms), 90% of the time
User Request → Database Query (10ms), 10% of the time
Average: 0.9 × 1ms + 0.1 × 10ms = 1.9ms per request
1000 req/s = 1,000ms database query time/s (only the 10% of misses) = 1 database server needed
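The arithmetic above is worth internalizing, since hit rate drives the whole sizing exercise. A small sketch in plain Python (no cache server needed):

```python
def average_latency(hit_rate, hit_ms, miss_ms):
    """Expected per-request latency for a cache with the given hit rate."""
    return hit_rate * hit_ms + (1 - hit_rate) * miss_ms

def db_time_per_second(req_per_s, hit_rate, miss_ms):
    """Database time consumed per second: only cache misses reach the database."""
    return req_per_s * (1 - hit_rate) * miss_ms

print(average_latency(0.9, 1, 10))        # ~1.9ms average per request
print(db_time_per_second(1000, 0.9, 10))  # ~1000ms of database time per second
```

Plugging in different hit rates makes the sensitivity obvious: dropping from 90% to 80% doubles database load.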
Eviction Policies
When cache is full, old data must be removed:
- LRU (Least Recently Used): Remove least recently accessed data
- LFU (Least Frequently Used): Remove least frequently accessed data
- FIFO (First In First Out): Remove oldest data
- TTL-based: Remove expired data
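To make the most common policy concrete, here is a toy LRU cache built on `OrderedDict`. This is an illustration of the eviction rule only; Redis and Memcached use their own (in Redis's case, approximate) LRU implementations internally:

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache: evicts the least recently used key when capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # Mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # Evict the least recently used entry

cache = LRUCache(2)
cache.set('a', 1)
cache.set('b', 2)
cache.get('a')     # 'a' becomes most recently used
cache.set('c', 3)  # Evicts 'b', the least recently used key
```

After this sequence, `'b'` is gone while `'a'` and `'c'` remain, which is exactly the access-recency behavior LRU promises.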
Redis: The Powerhouse
Redis is an in-memory data structure store that supports strings, lists, sets, hashes, and more. It’s the most popular caching solution for good reason.
Redis Features
Data Structures: Beyond simple key-value storage
import redis
r = redis.Redis(host='localhost', port=6379)
# Strings
r.set('user:1:name', 'Alice')
r.get('user:1:name') # b'Alice'
# Lists (queues, stacks)
r.lpush('notifications', 'message1', 'message2')
r.lrange('notifications', 0, -1)
# Sets (unique items)
r.sadd('user:1:tags', 'python', 'redis', 'caching')
r.smembers('user:1:tags')
# Hashes (objects)
r.hset('user:1', mapping={'name': 'Alice', 'email': '[email protected]'})
r.hgetall('user:1')
# Sorted Sets (leaderboards, rankings)
r.zadd('leaderboard', {'alice': 100, 'bob': 95, 'charlie': 90})
r.zrange('leaderboard', 0, -1, withscores=True)
Expiration and TTL:
# Set with expiration
r.setex('session:123', 3600, 'session_data') # Expires in 1 hour
# Set TTL on existing key
r.expire('user:1:name', 3600)
# Get remaining TTL
r.ttl('user:1:name') # Returns seconds until expiration
Atomic Operations:
# Increment counter atomically
r.incr('page:views') # Thread-safe counter
# Set-if-not-exists (the basis for simple locks)
r.set('lock', 'locked', nx=True, ex=10) # Set only if key does not exist
# Transactions
pipe = r.pipeline()
pipe.incr('counter')
pipe.expire('counter', 3600)
pipe.execute()
Pub/Sub for Real-time Updates:
# Publisher
r.publish('notifications', 'New message for user 1')
# Subscriber
pubsub = r.pubsub()
pubsub.subscribe('notifications')
for message in pubsub.listen():
    print(message)
When to Use Redis
- Complex data structures needed
- Real-time features (leaderboards, notifications)
- Session storage
- Rate limiting
- Job queues
- Pub/Sub messaging
- Distributed locks
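Rate limiting, one of the use cases above, maps naturally onto Redis's atomic counters (`INCR` plus `EXPIRE`). Here is a minimal fixed-window sketch; a plain dict stands in for Redis so the logic is self-contained, with comments noting the equivalent Redis calls:

```python
import time

# In Redis: count = r.incr(key); if count == 1: r.expire(key, window)
_windows = {}  # key -> (window_start, count); dict stands in for Redis here

def allow_request(user_id, limit=10, window=60):
    """Fixed-window rate limiter: allow at most `limit` requests per `window` seconds."""
    now = time.time()
    key = f'ratelimit:{user_id}'
    start, count = _windows.get(key, (now, 0))
    if now - start >= window:  # Window expired (Redis handles this via EXPIRE)
        start, count = now, 0
    count += 1                 # Redis equivalent: INCR (atomic server-side)
    _windows[key] = (start, count)
    return count <= limit

for _ in range(10):
    assert allow_request('alice', limit=10)
print(allow_request('alice', limit=10))  # 11th request in the window is rejected
```

Note that fixed windows allow bursts at window boundaries; sliding-window variants (e.g. using Redis sorted sets) smooth this out at the cost of more state.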
Memcached: The Lightweight Alternative
Memcached is a simpler, lightweight caching solution focused purely on key-value storage.
Memcached Features
Simple Key-Value Store:
import memcache
mc = memcache.Client(['127.0.0.1:11211'])
# Basic operations
mc.set('user:1:name', 'Alice', time=3600)
mc.get('user:1:name') # 'Alice' (python-memcached returns the stored object, not bytes)
# Increment
mc.incr('page:views')
# Delete
mc.delete('user:1:name')
# Batch operations
mc.set_multi({
    'user:1:name': 'Alice',
    'user:2:name': 'Bob'
}, time=3600)
mc.get_multi(['user:1:name', 'user:2:name'])
Consistent Hashing for Distribution:
# The client distributes keys across servers; memcached servers are
# unaware of each other
mc = memcache.Client([
    '127.0.0.1:11211',
    '127.0.0.1:11212',
    '127.0.0.1:11213'
])
# Keys are hashed client-side to pick a server, so the same key always
# goes to the same server. Note: python-memcached uses modulo hashing by
# default; consistent hashing (e.g. ketama) minimizes key remapping when
# servers are added or removed
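To see why consistent hashing matters for distribution, here is a minimal hash ring. It is illustrative only: production clients such as ketama add many virtual nodes per server for better balance:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: a key maps to the next server clockwise."""
    def __init__(self, servers):
        self.ring = sorted((self._hash(s), s) for s in servers)

    @staticmethod
    def _hash(value):
        # Hash to a point on the ring (md5 is conventional for ketama-style rings)
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_server(self, key):
        h = self._hash(key)
        points = [p for p, _ in self.ring]
        i = bisect.bisect(points, h) % len(self.ring)  # Wrap around the ring
        return self.ring[i][1]

ring = HashRing(['127.0.0.1:11211', '127.0.0.1:11212', '127.0.0.1:11213'])
# The same key deterministically maps to the same server
assert ring.get_server('user:1') == ring.get_server('user:1')
```

The payoff: with naive modulo hashing, adding one server remaps nearly every key (a mass cache-miss event); with a ring, only the keys between the new server and its predecessor move.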
When to Use Memcached
- Simple key-value caching
- High throughput, low latency needed
- Horizontal scaling required
- Memory efficiency important
- No complex data structures needed
Redis vs. Memcached: Direct Comparison
| Feature | Redis | Memcached |
|---|---|---|
| Data Types | Strings, Lists, Sets, Hashes, Sorted Sets | Strings only |
| Persistence | Yes (RDB, AOF) | No |
| Replication | Yes (primary-replica) | No |
| Pub/Sub | Yes | No |
| Transactions | Yes | No |
| Lua Scripting | Yes | No |
| Memory Efficiency | Good | Excellent |
| Throughput | High | Very High |
| Complexity | Moderate | Low |
| Use Case | Complex, feature-rich | Simple, high-throughput |
Choose Redis if: You need complex data structures, persistence, or advanced features.
Choose Memcached if: You need maximum simplicity and throughput for basic caching.
Distributed Caching Patterns
Pattern 1: Cache-Aside (Lazy Loading)
Application checks cache first, fetches from database if miss.
def get_user(user_id):
    # Check cache
    cache_key = f'user:{user_id}'
    user = cache.get(cache_key)
    if user is None:
        # Cache miss: fetch from database (parameterized to avoid SQL injection)
        user = database.query('SELECT * FROM users WHERE id = %s', user_id)
        # Store in cache
        cache.set(cache_key, user, ttl=3600)
    return user
Pros: Simple, works with existing databases
Cons: Cache misses cause database hits, stale data possible
Pattern 2: Write-Through
Data written to cache and database simultaneously.
def update_user(user_id, data):
    # Write to cache
    cache_key = f'user:{user_id}'
    cache.set(cache_key, data, ttl=3600)
    # Write to database
    database.update(f'UPDATE users SET ... WHERE id = {user_id}', data)
    return data
Pros: Cache always consistent with database
Cons: Slower writes, extra latency
Pattern 3: Write-Behind (Write-Back)
Data written to cache immediately, database updated asynchronously.
def update_user(user_id, data):
    # Write to cache immediately
    cache_key = f'user:{user_id}'
    cache.set(cache_key, data, ttl=3600)
    # Queue database update for later
    queue.enqueue('update_user_in_db', user_id, data)
    return data

# Background job
def update_user_in_db(user_id, data):
    database.update(f'UPDATE users SET ... WHERE id = {user_id}', data)
Pros: Fast writes, reduced database load
Cons: Risk of data loss if cache fails before database update
Pattern 4: Read-Through
Cache loader automatically fetches from database on miss.
class CacheLoader:
    def __init__(self, cache, database):
        self.cache = cache
        self.database = database

    def get(self, key):
        # Check cache
        value = self.cache.get(key)
        if value is None:
            # Automatic load from database
            value = self.load_from_database(key)
            self.cache.set(key, value, ttl=3600)
        return value

    def load_from_database(self, key):
        # Parse key to get query parameters
        user_id = key.split(':')[1]
        return self.database.query('SELECT * FROM users WHERE id = %s', user_id)

# Usage
loader = CacheLoader(cache, database)
user = loader.get('user:123')
Pros: Transparent to application, automatic loading
Cons: Requires cache loader implementation
Cache Invalidation Strategies
Time-Based Expiration (TTL)
# Set expiration time
cache.set('user:1', user_data, ttl=3600) # Expires in 1 hour
# Refresh TTL
cache.expire('user:1', 3600)
Pros: Simple, automatic cleanup
Cons: Stale data until expiration
Event-Based Invalidation
def update_user(user_id, data):
    # Update database
    database.update(user_id, data)
    # Invalidate cache
    cache.delete(f'user:{user_id}')
    cache.delete(f'user:{user_id}:posts')  # Related data
    cache.delete('users:list')  # Aggregated data
Pros: Immediate consistency
Cons: Must remember all related keys
Tag-Based Invalidation
# Store with tags
cache.set('user:1', user_data, tags=['user', 'user:1'])
cache.set('user:1:posts', posts, tags=['user', 'user:1', 'posts'])
# Invalidate by tag
cache.invalidate_by_tag('user:1') # Invalidates all related data
Pros: Invalidate related data easily
Cons: Requires cache system support
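Few caches support tags natively (the `tags=` argument shown above is a hypothetical API). The underlying mechanism is a reverse index from tag to keys; with Redis you would keep one set per tag (`SADD` on write, `SMEMBERS` plus `DEL` on invalidation). A minimal in-memory sketch of the idea:

```python
class TaggedCache:
    """Toy cache with tag-based invalidation via a tag -> keys reverse index."""
    def __init__(self):
        self.data = {}
        self.tag_index = {}  # tag -> set of keys (in Redis: one set per tag via SADD)

    def set(self, key, value, tags=()):
        self.data[key] = value
        for tag in tags:
            self.tag_index.setdefault(tag, set()).add(key)

    def get(self, key):
        return self.data.get(key)

    def invalidate_by_tag(self, tag):
        # In Redis: SMEMBERS on the tag set, then DEL each member key
        for key in self.tag_index.pop(tag, set()):
            self.data.pop(key, None)

tc = TaggedCache()
tc.set('user:1', {'name': 'Alice'}, tags=['user', 'user:1'])
tc.set('user:1:posts', ['post'], tags=['user', 'user:1', 'posts'])
tc.invalidate_by_tag('user:1')  # Removes both entries in one call
```

The trade-off is write amplification: every `set` also updates one index entry per tag.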
Probabilistic Early Expiration (XFetch)
import random
import time
def get_with_xfetch(key, ttl=3600):
    value = cache.get(key)
    if value is None:
        # Cache miss: fetch and repopulate
        value = fetch_from_database(key)
        cache.set(key, value, ttl=ttl)
        return value
    # Check if near expiration
    remaining_ttl = cache.ttl(key)
    # Probabilistically refresh before expiration
    if remaining_ttl < ttl * 0.25:  # Last 25% of TTL
        if random.random() < 0.1:  # 10% chance
            # Refresh in background
            queue.enqueue('refresh_cache', key)
    return value
Pros: Prevents thundering herd on expiration
Cons: More complex implementation
Common Pitfalls
Pitfall 1: Cache Stampede
Multiple requests hit database simultaneously when cache expires.
# Bad: Multiple threads fetch simultaneously
def get_user(user_id):
    user = cache.get(f'user:{user_id}')
    if user is None:
        user = database.query(user_id)  # All threads do this!
        cache.set(f'user:{user_id}', user)
    return user

# Good: Use lock to prevent stampede
import time

def get_user(user_id):
    cache_key = f'user:{user_id}'
    user = cache.get(cache_key)
    if user is None:
        # Try to acquire lock
        lock_key = f'{cache_key}:lock'
        if cache.set(lock_key, '1', nx=True, ex=10):
            try:
                user = database.query(user_id)
                cache.set(cache_key, user, ttl=3600)
            finally:
                cache.delete(lock_key)
        else:
            # Wait for the lock holder to populate the cache
            time.sleep(0.1)
            user = cache.get(cache_key)
    return user
Pitfall 2: Cache Invalidation Complexity
Forgetting to invalidate related cache entries.
# Bad: Only invalidates one key
def update_user(user_id, data):
    database.update(user_id, data)
    cache.delete(f'user:{user_id}')  # Misses related data!

# Good: Invalidate all related data
def update_user(user_id, data):
    database.update(user_id, data)
    # Invalidate all related keys
    cache.delete(f'user:{user_id}')
    cache.delete(f'user:{user_id}:profile')
    cache.delete(f'user:{user_id}:settings')
    cache.delete('users:list')
    cache.delete(f'users:by_email:{data["email"]}')
Pitfall 3: Caching Mutable Objects
With an in-process cache, modifying an object returned from the cache also mutates the cached copy.
# Bad: Mutable object
user = cache.get('user:1')
user['name'] = 'Bob' # Modifies cache!
# Good: Serialize/deserialize
import json
user_json = cache.get('user:1')
user = json.loads(user_json)
user['name'] = 'Bob'
cache.set('user:1', json.dumps(user))
Pitfall 4: Insufficient Monitoring
# Monitor cache health
def monitor_cache():
    stats = cache.get_stats()  # Fetch stats once, not per metric
    hit_rate = stats['hits'] / (stats['hits'] + stats['misses'])
    if hit_rate < 0.8:
        alert('Low cache hit rate')
    memory_usage = stats['bytes']
    if memory_usage > cache.max_memory * 0.9:
        alert('Cache near capacity')
Performance Considerations
Benchmarking
import time

def benchmark_cache(cache, iterations=10000):
    # Write performance
    start = time.time()
    for i in range(iterations):
        cache.set(f'key:{i}', f'value:{i}')
    write_time = time.time() - start
    # Read performance
    start = time.time()
    for i in range(iterations):
        cache.get(f'key:{i}')
    read_time = time.time() - start
    print(f'Write: {write_time/iterations*1000:.3f}ms per operation')
    print(f'Read: {read_time/iterations*1000:.3f}ms per operation')
# Typical results:
# Redis: ~0.1ms per operation
# Memcached: ~0.05ms per operation
Sizing Cache
Cache Size = (Average Object Size) × (Number of Objects) × (Replication Factor)
Example:
- Average user object: 1KB
- Number of users: 1,000,000
- Replication factor: 2
- Required cache: 1KB × 1,000,000 × 2 = 2GB
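The same back-of-the-envelope formula as a helper function. The `overhead` factor is an assumption I'm adding: key names, serialization, and per-entry metadata commonly add 20-50% on top of raw object sizes:

```python
def required_cache_bytes(avg_object_bytes, num_objects, replication=1, overhead=1.0):
    """Back-of-the-envelope cache sizing; `overhead` covers keys and metadata."""
    return int(avg_object_bytes * num_objects * replication * overhead)

size = required_cache_bytes(1024, 1_000_000, replication=2)
print(f'{size / 1e9:.1f} GB')  # 1KB x 1M objects x 2 replicas ~ 2.0 GB
```

Rerun with `overhead=1.3` to see why the article's 2GB estimate should be treated as a floor, not a target.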
Network Considerations
# Batch operations to reduce network round-trips
# Bad: Multiple round-trips
user1 = cache.get('user:1')
user2 = cache.get('user:2')
user3 = cache.get('user:3')
# Good: Single round-trip
users = cache.get_multi(['user:1', 'user:2', 'user:3'])
Conclusion
Caching is essential for building scalable applications. The right caching strategy can reduce database load by 90% and improve response times dramatically.
Key takeaways:
- Choose the right tool: Redis for complex needs, Memcached for simplicity
- Pick appropriate patterns: Cache-aside for simplicity, write-through for consistency
- Invalidate carefully: Plan invalidation strategy from the start
- Avoid common pitfalls: Cache stampede, stale data, mutable objects
- Monitor continuously: Track hit rates and memory usage
- Benchmark your setup: Understand your cache performance characteristics
Start with cache-aside pattern and simple TTL-based expiration. As your system grows, evolve to more sophisticated patterns. The investment in a well-designed caching layer pays dividends in scalability and performance.
Happy caching!