⚡ Calmops

Caching in Python: Master functools.lru_cache and Custom Cache Implementations

Your function is called 10,000 times with the same arguments. Each call performs expensive computation: database queries, API calls, complex calculations. What if you could return the result instantly on the second call? That’s caching.

Caching is one of the most effective performance optimization techniques. By storing computation results and reusing them, you can dramatically reduce execution time and resource usage. Python provides built-in caching through functools.lru_cache, but understanding when and how to use it, and when to build custom solutions, is crucial for writing efficient applications.

This guide explores caching strategies in Python, from simple decorators to sophisticated custom implementations.

What Is Caching and Why It Matters

Caching stores the results of expensive operations so you can reuse them instead of recomputing. Think of it like remembering a phone number instead of looking it up every time.

The Performance Impact

import time

def expensive_operation(n):
    """Simulate an expensive computation"""
    time.sleep(1)  # Simulate 1 second of work
    return n * n

# Without caching: 10 calls = 10 seconds
start = time.time()
for i in range(10):
    expensive_operation(5)
print(f"Without cache: {time.time() - start:.2f}s")  # ~10 seconds

# With caching: 10 calls = 1 second (after first call)
from functools import lru_cache

@lru_cache(maxsize=128)
def cached_operation(n):
    time.sleep(1)
    return n * n

start = time.time()
for i in range(10):
    cached_operation(5)
print(f"With cache: {time.time() - start:.2f}s")  # ~1 second

Result: 10x performance improvement with a single decorator.

functools.lru_cache: The Built-in Solution

What Is LRU Cache?

LRU (Least Recently Used) cache stores function results and evicts the least recently used item when the cache is full. It’s perfect for functions with a limited set of inputs.
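
A tiny demo of the eviction order: with maxsize=2, a third distinct input pushes out whichever entry was used least recently.

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def square(n):
    return n * n

square(1)  # miss - cache holds {1}
square(2)  # miss - cache holds {1, 2}
square(1)  # hit  - 1 is now the most recently used
square(3)  # miss - cache is full, evicts 2 (least recently used)
square(2)  # miss - 2 was evicted, so it is recomputed
print(square.cache_info())  # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
```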

Basic Usage

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    """Calculate Fibonacci number with caching"""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# First call: computes result
print(fibonacci(30))  # Takes a moment

# Second call: returns cached result instantly
print(fibonacci(30))  # Instant

# Check cache statistics
print(fibonacci.cache_info())
# CacheInfo(hits=29, misses=31, maxsize=128, currsize=31)

Key Parameters

maxsize

Maximum number of cached results. Set to None for unlimited cache.

# Limited cache (128 results)
@lru_cache(maxsize=128)
def limited_cache(x):
    return x ** 2

# Unlimited cache (use with caution!)
@lru_cache(maxsize=None)
def unlimited_cache(x):
    return x ** 2

# No caching (maxsize=0)
@lru_cache(maxsize=0)
def no_cache(x):
    return x ** 2
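
Since Python 3.9 there is also functools.cache, a thin alias for lru_cache(maxsize=None):

```python
from functools import cache  # Python 3.9+

@cache  # equivalent to @lru_cache(maxsize=None)
def factorial(n):
    return n * factorial(n - 1) if n else 1

print(factorial(10))  # 3628800
```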

typed

When True, arguments of different types are cached separately.

@lru_cache(maxsize=128, typed=False)
def add_untyped(a, b):
    return a + b

print(add_untyped(1, 2))      # Returns 3
print(add_untyped(1.0, 2.0))  # Returns 3, not 3.0: 1 == 1.0, so the cached int result is reused

@lru_cache(maxsize=128, typed=True)
def add_typed(a, b):
    return a + b

print(add_typed(1, 2))        # Returns 3
print(add_typed(1.0, 2.0))    # Returns 3.0 (different cache entry)

Cache Management

from functools import lru_cache

@lru_cache(maxsize=128)
def expensive_function(x):
    return x ** 2

# Get cache statistics
stats = expensive_function.cache_info()
print(f"Hits: {stats.hits}, Misses: {stats.misses}")

# Clear the cache
expensive_function.cache_clear()

# Check if cache is empty
print(expensive_function.cache_info())
# CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)

When functools.lru_cache Shines

Scenario 1: Recursive Functions

from functools import lru_cache

# Without cache: Exponential time complexity O(2^n)
def fib_slow(n):
    if n < 2:
        return n
    return fib_slow(n - 1) + fib_slow(n - 2)

# With cache: Linear time complexity O(n)
@lru_cache(maxsize=None)
def fib_fast(n):
    if n < 2:
        return n
    return fib_fast(n - 1) + fib_fast(n - 2)

# Performance comparison
import time

start = time.time()
fib_slow(35)
print(f"Slow: {time.time() - start:.2f}s")  # ~3 seconds

start = time.time()
fib_fast(35)
print(f"Fast: {time.time() - start:.2f}s")  # ~0.001 seconds

Scenario 2: Pure Functions with Limited Inputs

from functools import lru_cache

@lru_cache(maxsize=32)
def get_user_by_id(user_id):
    """Fetch user from database"""
    # Simulate database query
    return {'id': user_id, 'name': f'User {user_id}'}

# Repeated calls with same ID return cached result
user1 = get_user_by_id(1)
user1_again = get_user_by_id(1)  # From cache

print(get_user_by_id.cache_info())
# CacheInfo(hits=1, misses=1, maxsize=32, currsize=1)

Scenario 3: Expensive Computations

from functools import lru_cache
import math

@lru_cache(maxsize=128)
def is_prime(n):
    """Check if number is prime"""
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

# First call: computes
is_prime(1000000007)

# Second call: instant
is_prime(1000000007)

Limitations of functools.lru_cache

Limitation 1: No Time-Based Expiration

Cache entries never expire. Stale data can be a problem.

# โŒ PROBLEM: Cache never expires
from functools import lru_cache
import time

@lru_cache(maxsize=128)
def get_stock_price(symbol):
    """Get current stock price"""
    # Simulates API call
    return 100.0

price = get_stock_price('AAPL')
# After 1 hour, price is still cached (stale!)
time.sleep(3600)
price_again = get_stock_price('AAPL')  # Returns old price
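
One lightweight workaround, before reaching for a custom cache, is the well-known time-bucket trick: pass the current time window as an extra argument so stale entries stop matching once the window rolls over. The helper names here are illustrative, not a standard API:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=128)
def _fetch_price(symbol, time_bucket):
    # time_bucket exists only to partition the cache by time window
    return 100.0  # simulate the API call

def get_stock_price(symbol, ttl=300):
    # A new bucket every `ttl` seconds forces a fresh lookup
    return _fetch_price(symbol, int(time.time() // ttl))

print(get_stock_price('AAPL'))  # 100.0
```

Note that old buckets still occupy cache slots until LRU eviction removes them, so keep maxsize bounded.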

Limitation 2: Concurrent Calls Can Duplicate Work

In CPython, lru_cache's internal bookkeeping is thread-safe, so the cache itself stays coherent. But the wrapped function is not locked: if a second thread calls with the same arguments before the first call has finished and been cached, both threads execute the function body.

# ❌ PROBLEM: concurrent first calls both run the function
from functools import lru_cache
import threading
import time

call_count = 0

@lru_cache(maxsize=128)
def compute(x):
    global call_count
    call_count += 1
    time.sleep(0.1)  # simulate expensive work
    return x ** 2

# Both threads miss the cache and compute the same result
thread1 = threading.Thread(target=compute, args=(5,))
thread2 = threading.Thread(target=compute, args=(5,))
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print(call_count)  # typically 2, not 1

Limitation 3: Arguments Must Be Hashable

Can’t cache functions with unhashable arguments (lists, dicts).

from functools import lru_cache

@lru_cache(maxsize=128)
def process_data(data):
    return sum(data)

# ✅ Works: tuple is hashable
process_data((1, 2, 3))

# ❌ Fails: list is not hashable
process_data([1, 2, 3])  # TypeError: unhashable type: 'list'

Custom Cache Implementations

When functools.lru_cache isn’t enough, build custom solutions.

Time-Based Expiration Cache

import time
from typing import Any, Callable, Dict, Optional

class TTLCache:
    """Cache with time-to-live (TTL) expiration"""
    
    def __init__(self, ttl_seconds: int):
        self.ttl = ttl_seconds
        self.cache: Dict[Any, tuple] = {}
    
    def get(self, key: Any) -> Optional[Any]:
        """Get value from cache if not expired"""
        if key not in self.cache:
            return None
        
        value, timestamp = self.cache[key]
        
        # Check if expired
        if time.time() - timestamp > self.ttl:
            del self.cache[key]
            return None
        
        return value
    
    def set(self, key: Any, value: Any) -> None:
        """Store value in cache with current timestamp"""
        self.cache[key] = (value, time.time())
    
    def clear(self) -> None:
        """Clear all cache entries"""
        self.cache.clear()

# Usage
cache = TTLCache(ttl_seconds=60)

cache.set('user:1', {'name': 'Alice'})
print(cache.get('user:1'))  # {'name': 'Alice'}

time.sleep(61)
print(cache.get('user:1'))  # None (expired)
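
One caveat of this sketch: expired entries are only deleted when they are read, so keys that are never looked up again linger in memory. A small purge helper, written against the same (value, timestamp) layout, can sweep them out periodically:

```python
import time

def purge_expired(entries: dict, ttl: float) -> int:
    """Delete every (value, timestamp) entry older than ttl; return the count."""
    now = time.time()
    stale = [key for key, (_, ts) in entries.items() if now - ts > ttl]
    for key in stale:
        del entries[key]
    return len(stale)

# With the TTLCache above you would call: purge_expired(cache.cache, cache.ttl)
entries = {'old': (1, time.time() - 120), 'fresh': (2, time.time())}
print(purge_expired(entries, ttl=60))  # 1, only 'old' was stale
```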

Decorator with TTL

import time
from functools import wraps
from typing import Any, Callable

def ttl_cache(ttl_seconds: int):
    """Decorator for caching with TTL"""
    def decorator(func: Callable) -> Callable:
        cache = {}
        
        @wraps(func)
        def wrapper(*args, **kwargs) -> Any:
            # Create cache key from arguments
            key = (args, tuple(sorted(kwargs.items())))
            
            # Check if in cache and not expired
            if key in cache:
                value, timestamp = cache[key]
                if time.time() - timestamp < ttl_seconds:
                    return value
                else:
                    del cache[key]
            
            # Compute and cache result
            result = func(*args, **kwargs)
            cache[key] = (result, time.time())
            return result
        
        return wrapper
    return decorator

@ttl_cache(ttl_seconds=60)
def get_weather(city):
    """Get weather for city (cached for 60 seconds)"""
    # Simulate API call
    return f"Weather in {city}: 72°F"

print(get_weather('New York'))  # Computes
print(get_weather('New York'))  # From cache
time.sleep(61)
print(get_weather('New York'))  # Recomputes (expired)

Thread-Safe Cache

import threading
from typing import Any, Callable, Dict, Optional

class ThreadSafeCache:
    """Thread-safe cache with lock"""
    
    def __init__(self, maxsize: int = 128):
        self.maxsize = maxsize
        self.cache: Dict[Any, Any] = {}
        self.lock = threading.RLock()
    
    def get(self, key: Any) -> Optional[Any]:
        """Get value from cache (thread-safe)"""
        with self.lock:
            return self.cache.get(key)
    
    def set(self, key: Any, value: Any) -> None:
        """Store value in cache (thread-safe)"""
        with self.lock:
            # FIFO eviction: dicts keep insertion order, so the first key is the oldest
            if len(self.cache) >= self.maxsize:
                oldest_key = next(iter(self.cache))
                del self.cache[oldest_key]
            
            self.cache[key] = value
    
    def clear(self) -> None:
        """Clear cache (thread-safe)"""
        with self.lock:
            self.cache.clear()

# Usage
cache = ThreadSafeCache(maxsize=100)

def worker():
    for i in range(1000):
        cache.set(f'key:{i}', f'value:{i}')
        cache.get(f'key:{i}')

# Multiple threads can safely use cache
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Cache operations completed safely")
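
The eviction above is FIFO rather than true LRU, because get never refreshes a key's position. A sketch of genuine LRU eviction under the same locking scheme, using collections.OrderedDict:

```python
import threading
from collections import OrderedDict

class ThreadSafeLRUCache:
    """Sketch: true LRU eviction under a lock, via OrderedDict."""

    def __init__(self, maxsize: int = 128):
        self.maxsize = maxsize
        self.cache: OrderedDict = OrderedDict()
        self.lock = threading.RLock()

    def get(self, key):
        with self.lock:
            if key not in self.cache:
                return None
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]

    def set(self, key, value):
        with self.lock:
            if key in self.cache:
                self.cache.move_to_end(key)
            self.cache[key] = value
            if len(self.cache) > self.maxsize:
                self.cache.popitem(last=False)  # evict the least recently used

cache = ThreadSafeLRUCache(maxsize=2)
cache.set('a', 1)
cache.set('b', 2)
cache.get('a')         # 'a' becomes most recently used
cache.set('c', 3)      # evicts 'b', not 'a'
print(cache.get('b'))  # None
```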

Comparison: functools.lru_cache vs Custom Solutions

| Feature | functools.lru_cache | Custom TTL | Custom Thread-Safe |
|---|---|---|---|
| Time-based expiration | ❌ | ✅ | ✅ |
| Thread-safe | ⚠️ (calls may run twice) | ❌ | ✅ |
| Hashable args only | ✅ | ✅ | ✅ |
| Memory efficient | ✅ | ⚠️ | ⚠️ |
| Easy to use | ✅ | ❌ | ❌ |
| Production-ready | ✅ | ⚠️ | ⚠️ |

Caching Libraries

For complex scenarios, consider established libraries:

cachetools

from cachetools import TTLCache, cached
import time

# TTL cache with 60-second expiration
@cached(cache=TTLCache(maxsize=128, ttl=60))
def get_data(key):
    return f"Data for {key}"

print(get_data('key1'))  # Computes
print(get_data('key1'))  # From cache

Redis

import redis
import json

redis_client = redis.Redis(host='localhost', port=6379)

def get_user(user_id):
    # Check Redis cache
    cached = redis_client.get(f'user:{user_id}')
    if cached:
        return json.loads(cached)
    
    # Fetch from database
    user = {'id': user_id, 'name': f'User {user_id}'}
    
    # Store in Redis for 1 hour
    redis_client.setex(
        f'user:{user_id}',
        3600,
        json.dumps(user)
    )
    
    return user

Best Practices

1. Cache Only Pure Functions

# ✅ GOOD: Pure function (same input = same output)
@lru_cache(maxsize=128)
def calculate_tax(amount):
    return amount * 0.1

# ❌ BAD: Impure function (depends on external state)
from datetime import datetime

@lru_cache(maxsize=128)
def get_current_time():
    return datetime.now()  # Should change every call, but the first result is cached forever

2. Choose Appropriate Cache Size

# โŒ TOO SMALL: Cache misses frequently
@lru_cache(maxsize=2)
def process_user(user_id):
    pass

# โœ… APPROPRIATE: Based on expected usage
@lru_cache(maxsize=1000)
def process_user(user_id):
    pass

# โŒ TOO LARGE: Wastes memory
@lru_cache(maxsize=1000000)
def process_user(user_id):
    pass

3. Monitor Cache Performance

from functools import lru_cache

@lru_cache(maxsize=128)
def expensive_function(x):
    return x ** 2

# Use cache statistics to optimize
def print_cache_stats():
    stats = expensive_function.cache_info()
    hit_rate = stats.hits / (stats.hits + stats.misses) if (stats.hits + stats.misses) > 0 else 0
    print(f"Hit rate: {hit_rate:.1%}")
    print(f"Cache size: {stats.currsize}/{stats.maxsize}")

# Call function multiple times
for i in range(100):
    expensive_function(i % 10)

print_cache_stats()

4. Handle Cache Invalidation

from functools import lru_cache

@lru_cache(maxsize=128)
def get_user_data(user_id):
    return {'id': user_id, 'name': 'Alice'}

def update_user(user_id, name):
    # Update database
    # ...
    
    # Invalidate cache
    get_user_data.cache_clear()

# Or clear specific entry (requires custom implementation)
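
A sketch of such per-entry invalidation: a dict-backed cache whose wrapper exposes an invalidate attribute (an illustrative convention here, not a standard API):

```python
from functools import wraps

def invalidating_cache(func):
    """Dict-backed cache that supports dropping a single entry."""
    cache = {}

    @wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    def invalidate(*args):
        cache.pop(args, None)  # drop one entry, leave the rest cached

    wrapper.invalidate = invalidate
    return wrapper

@invalidating_cache
def get_user_data(user_id):
    return {'id': user_id, 'name': 'Alice'}

get_user_data(1)
get_user_data(2)
get_user_data.invalidate(1)  # only user 1's entry is dropped
```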

Common Pitfalls

Pitfall 1: Caching Mutable Objects

# โŒ PROBLEM: Mutable default argument
@lru_cache(maxsize=128)
def append_to_list(item, lst=[]):
    lst.append(item)
    return lst

# All calls share the same list, and the cache stores it too!
print(append_to_list(1))  # [1]
print(append_to_list(2))  # [1, 2]
print(append_to_list(1))  # [1, 2] (cache hit returns the mutated list)
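
One defensive fix when cached values are mutable: keep the cached function private and hand out copies. Note dict() makes a shallow copy, so nested structures would still be shared:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def _load_config(name):
    return {'name': name, 'debug': False}

def load_config(name):
    # Shallow copy: callers can't rebind keys in the cached dict
    return dict(_load_config(name))

cfg = load_config('app')
cfg['debug'] = True        # mutates only the copy
print(load_config('app'))  # {'name': 'app', 'debug': False}
```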

Pitfall 2: Ignoring Memory Usage

# โŒ PROBLEM: Unlimited cache grows indefinitely
@lru_cache(maxsize=None)
def process_unique_ids(id):
    return id ** 2

# If called with millions of unique IDs, memory explodes
for i in range(10000000):
    process_unique_ids(i)

Pitfall 3: Caching with Side Effects

# โŒ PROBLEM: Function has side effects
@lru_cache(maxsize=128)
def log_and_process(data):
    print(f"Processing {data}")  # Side effect
    return data ** 2

log_and_process(5)  # Prints "Processing 5"
log_and_process(5)  # Doesn't print (cached)

Conclusion

Caching is a powerful optimization technique. Key takeaways:

  • Use functools.lru_cache for simple, pure functions with limited inputs
  • Build custom caches when you need time-based expiration or thread safety
  • Monitor cache performance using cache_info() to verify effectiveness
  • Cache only pure functions to avoid subtle bugs
  • Choose appropriate cache sizes based on your use case
  • Consider libraries like cachetools or Redis for complex scenarios
  • Handle cache invalidation when underlying data changes

Start with functools.lru_cache for simplicity. As your application grows and requirements become more complex, graduate to custom implementations or established libraries.

Remember: caching is a trade-off between speed and memory. Use it wisely, measure its impact, and optimize based on data.

Happy caching!
