Introduction
Caching is one of the most impactful optimizations you can make to improve API performance. A well-designed caching strategy can cut response times from hundreds of milliseconds to single-digit milliseconds, reduce database load by 90% or more, and dramatically improve user experience.
However, caching introduces complexity. Cache invalidation, famously one of computer science's two hardest problems, can lead to stale data if not handled correctly. This guide covers everything from HTTP caching headers to distributed caching with Redis, CDN integration, and battle-tested patterns used by high-scale systems.
The Caching Hierarchy
Modern applications use multiple layers of caching:
┌───────────────────────────────┐
│         Client Cache          │
│     (Browser, Mobile App)     │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│           CDN Cache           │
│  (Edge Locations Worldwide)   │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│       API Gateway Cache       │
│      (In-Memory, L1/L2)       │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│       Application Cache       │
│      (Redis, Memcached)       │
└───────────────┬───────────────┘
                │
┌───────────────▼───────────────┐
│     Database Query Cache      │
└───────────────────────────────┘
Each layer serves different purposes with different trade-offs.
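To make the fall-through behavior concrete, here is a minimal two-tier read sketch: an in-process Map stands in for a fast local layer and an injected async store stands in for a shared layer such as Redis. All names here are illustrative, not part of any library.

```javascript
// Two-tier read-through: check a fast local cache first, then a shared store,
// and only hit the origin (via `fetcher`) when both layers miss.
// `sharedStore` is any object with async get/set (e.g. a thin Redis wrapper).
function createTieredCache(sharedStore) {
  const local = new Map(); // L1: per-process, fastest, smallest

  return {
    async get(key, fetcher) {
      if (local.has(key)) return local.get(key);  // L1 hit
      let value = await sharedStore.get(key);     // L2 lookup
      if (value === undefined || value === null) {
        value = await fetcher();                  // miss everywhere: go to origin
        await sharedStore.set(key, value);        // fill L2
      }
      local.set(key, value);                      // fill L1
      return value;
    }
  };
}
```

In a real deployment each layer would also need its own TTL and invalidation path; this sketch only shows the read-side fall-through.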
HTTP Caching
Cache-Control Headers
The foundation of web caching is the HTTP Cache-Control header:
Cache-Control: public, max-age=3600, s-maxage=86400
Key Directives:
| Directive | Description |
|---|---|
| `public` | Can be cached by any cache (CDN, proxy) |
| `private` | Only cached by the browser, not by shared caches |
| `no-cache` | Must revalidate with the server before use |
| `no-store` | Never cache; always fetch fresh |
| `max-age=seconds` | Time until the cached response is stale |
| `s-maxage=seconds` | Overrides `max-age` for shared caches (CDN) |
| `must-revalidate` | Must revalidate once stale; stale copies may not be served |
Practical HTTP Caching Examples
// Express.js: Setting cache headers
app.get('/api/products', (req, res) => {
// Cache in browser for 5 minutes, CDN for 1 hour
res.set('Cache-Control', 'public, max-age=300, s-maxage=3600');
res.json(products);
});
app.get('/api/user/profile', (req, res) => {
// Private data - only browser cache
res.set('Cache-Control', 'private, max-age=60');
res.json(userProfile);
});
app.post('/api/cart', (req, res) => {
// Never cache mutations
res.set('Cache-Control', 'no-store');
res.json({ success: true });
});
ETag and Last-Modified
ETags provide conditional caching using content hashing:
// Generate an ETag from the response body
const crypto = require('crypto');
function generateETag(content) {
  // MD5 is acceptable here: ETags need cheap uniqueness, not cryptographic strength
  return crypto.createHash('md5').update(content).digest('hex');
}
// Express middleware for ETag
app.get('/api/products', (req, res) => {
const products = getProducts();
const json = JSON.stringify(products);
const etag = generateETag(json);
// Check if client has cached version
if (req.headers['if-none-match'] === etag) {
return res.status(304).end();
}
res.set('ETag', etag);
res.set('Cache-Control', 'public, max-age=300');
res.json(products);
});
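The 304 handshake above boils down to a single comparison. Factored out as a pure helper (the function and field names here are hypothetical, for illustration only), it becomes easy to unit-test separately from Express:

```javascript
// Decide between a full 200 response and an empty 304 revalidation.
// `clientETag` is the value of the If-None-Match request header.
function conditionalResponse(clientETag, currentETag, body) {
  if (clientETag === currentETag) {
    // Client's copy is still valid: send headers only, no body
    return { status: 304, headers: { ETag: currentETag }, body: null };
  }
  // Client has nothing, or a stale copy: send the full payload
  return { status: 200, headers: { ETag: currentETag }, body };
}
```

A 304 saves bandwidth, not server work: the handler still builds the response and hashes it before deciding the client's copy is current.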
Vary Header
Different requests may need different cached responses:
Vary: Accept-Language, Accept-Encoding
// Cache different versions for different languages
app.get('/api/products', (req, res) => {
const lang = req.headers['accept-language'];
const products = getProducts(lang);
res.set('Vary', 'Accept-Language');
res.set('Cache-Control', 'public, max-age=3600');
res.json(products);
});
Application-Level Caching
Redis Cache-Aside Pattern
The cache-aside pattern is the most common strategy:
const redis = require('redis');
const client = redis.createClient();
async function getUser(userId) {
const cacheKey = `user:${userId}`;
// Step 1: Check cache first
const cached = await client.get(cacheKey);
if (cached) {
return JSON.parse(cached);
}
// Step 2: Cache miss - fetch from database
const user = await database.users.findById(userId);
if (user) {
// Step 3: Store in cache with TTL
await client.setEx(cacheKey, 3600, JSON.stringify(user));
}
return user;
}
Write-Through Cache
Write data to cache and database simultaneously:
async function createUser(userData) {
const user = await database.users.create(userData);
// Write to cache immediately
const cacheKey = `user:${user.id}`;
await client.setEx(cacheKey, 3600, JSON.stringify(user));
return user;
}
async function updateUser(userId, updates) {
const user = await database.users.update(userId, updates);
// Update cache on write
const cacheKey = `user:${userId}`;
await client.setEx(cacheKey, 3600, JSON.stringify(user));
return user;
}
Write-Behind Cache
Async write to database after cache update:
async function updateUser(userId, updates) {
const cacheKey = `user:${userId}`;
// Update cache immediately (fast)
await client.setEx(cacheKey, 3600, JSON.stringify(updates));
// Queue database write (async, eventual consistency)
await queue.publish('user-updates', { userId, updates });
}
// Background worker processes queue
queue.consume('user-updates', async (message) => {
await database.users.update(message.userId, message.updates);
});
Cache Warming
Pre-populate cache on startup:
async function warmCache() {
console.log('Warming cache...');
// Load popular products
const products = await database.products.findPopular(100);
const pipeline = client.multi();
for (const product of products) {
pipeline.setEx(`product:${product.id}`, 3600, JSON.stringify(product));
}
await pipeline.exec();
console.log(`Cached ${products.length} products`);
}
// Run warming once the server starts listening
// (Express has no 'startup' event, so call it from the listen callback)
app.listen(3000, () => { warmCache(); });
Redis Data Structures for Caching
Hashes for Objects
// Store user object as hash
async function cacheUser(user) {
await client.hSet(`user:${user.id}`, {
name: user.name,
email: user.email,
createdAt: user.createdAt
});
await client.expire(`user:${user.id}`, 3600);
}
async function getUser(userId) {
return await client.hGetAll(`user:${userId}`);
}
Sorted Sets for Rankings
// Leaderboard with sorted sets
async function recordScore(userId, score) {
  // Sorted-set members must be strings
  await client.zAdd('leaderboard', { score, value: String(userId) });
}
async function getTopScores(limit = 10) {
return await client.zRangeWithScores('leaderboard', 0, limit - 1, { REV: true });
}
Streams for Event Caching
// Cache recent events
async function addEvent(event) {
await client.xAdd('events', '*', event);
}
async function getRecentEvents(count = 100) {
  // XREVRANGE walks newest-to-oldest, so COUNT returns the most recent entries
  return await client.xRevRange('events', '+', '-', { COUNT: count });
}
Cache Invalidation Strategies
Time-Based Expiration (TTL)
// Different TTLs for different data types
const TTL = {
user: 3600, // 1 hour
product: 300, // 5 minutes
inventory: 60, // 1 minute
settings: 86400 // 24 hours
};
await client.setEx(`product:${id}`, TTL.product, JSON.stringify(product));
Event-Based Invalidation
// Invalidate cache when data changes
async function invalidateUserCache(userId) {
const keys = [
`user:${userId}`,
`user:${userId}:profile`,
`user:${userId}:settings`
];
await client.del(keys);
}
// Subscribe to database changes
database.on('user:updated', (user) => {
invalidateUserCache(user.id);
});
Pattern-Based Invalidation
// Use Redis SCAN for pattern matching (KEYS would block the server)
async function invalidateProductCache(categoryId) {
  let cursor = 0;
  do {
    // node-redis v4 returns { cursor, keys }
    const reply = await client.scan(cursor, {
      MATCH: `product:${categoryId}:*`,
      COUNT: 100
    });
    cursor = reply.cursor;
    if (reply.keys.length > 0) {
      await client.del(reply.keys);
    }
  } while (cursor !== 0);
}
Versioned Cache Keys
// Version-based invalidation
let VERSION = 2;
async function getProducts(categoryId) {
  const cacheKey = `products:v${VERSION}:${categoryId}`;
  // ... caching logic
}
function invalidateProductsCache() {
  // Bump the version to orphan every existing products key;
  // old entries simply age out via their TTLs
  VERSION += 1;
}
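A version kept in a module-level variable only invalidates one process. In a multi-instance deployment the version itself usually lives in Redis so every instance sees the bump. A sketch of that idea — the key name `cache-version:*` is an illustrative convention, and `client` is any object exposing async `get`/`incr`:

```javascript
// The namespace version lives in the cache itself, so all app instances
// agree on it. A missing version reads as 0, matching Redis INCR, which
// returns 1 the first time it runs on a missing key.
async function versionedKey(client, namespace, id) {
  const version = (await client.get(`cache-version:${namespace}`)) || 0;
  return `${namespace}:v${version}:${id}`;
}

async function bumpVersion(client, namespace) {
  // INCR is atomic, so concurrent invalidations are safe
  return client.incr(`cache-version:${namespace}`);
}
```

The trade-off is one extra round trip per read to fetch the version; in practice that lookup is itself cached locally for a few seconds.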
Distributed Caching Patterns
Redis Cluster for High Availability
const { createCluster } = require('redis');
const cluster = createCluster({
  rootNodes: [
    { url: 'redis://redis-1:6379' },
    { url: 'redis://redis-2:6379' },
    { url: 'redis://redis-3:6379' }
  ]
});
await cluster.connect();
// Keys are distributed across nodes automatically by hash slot
const value = await cluster.get(`user:${userId}`);
Read-Through Cache
// Read-through: the cache itself knows how to load data on a miss
function readThrough(loader, ttl = 3600) {
  return async (key) => {
    const cached = await client.get(key);
    if (cached) return JSON.parse(cached);
    // This runs only on a cache miss
    const data = await loader(key);
    await client.setEx(key, ttl, JSON.stringify(data));
    return data;
  };
}
// Single interface - the wrapper handles the cache automatically
const getCachedUser = readThrough((key) => database.query(key));
const user = await getCachedUser(`user:${userId}`);
Write-Around with Async Cache Fill
async function writeData(key, data) {
// Write directly to database
await database.set(key, data);
// Invalidate cache (async, don't wait)
setImmediate(() => cache.invalidate(key));
}
CDN Caching Strategies
Cloudflare Configuration
// Workers script for custom cache logic
addEventListener('fetch', event => {
event.respondWith(handleRequest(event.request));
});
async function handleRequest(request) {
const url = new URL(request.url);
// Static assets: long cache
if (url.pathname.match(/\.(js|css|png|jpg)$/)) {
const response = await fetch(request);
const headers = new Headers(response.headers);
headers.set('Cache-Control', 'public, max-age=31536000, immutable');
return new Response(response.body, { status: response.status, headers });
}
// API: short cache with revalidation
if (url.pathname.startsWith('/api/')) {
const response = await fetch(request);
const headers = new Headers(response.headers);
headers.set('Cache-Control', 'public, max-age=60, stale-while-revalidate=300');
return new Response(response.body, { status: response.status, headers });
}
return fetch(request);
}
AWS CloudFront Functions
// CloudFront Function attached to the viewer-response event
function handler(event) {
  const request = event.request;
  const response = event.response;
  // Add cache headers to product API responses
  if (request.uri.startsWith('/api/products')) {
    response.headers['cache-control'] = {
      value: 'public, max-age=300, s-maxage=3600'
    };
  }
  return response;
}
Stale-While-Revalidate
Allow serving stale content while fetching fresh:
Cache-Control: public, max-age=60, stale-while-revalidate=300
// Implement SWR in your API
app.get('/api/products', async (req, res) => {
  const cached = await redis.get('products');
  if (cached) {
    // Return the cached copy immediately
    res.set('X-Cache', 'HIT');
    // Trigger a background refresh when the entry is close to expiry
    // (TTL returns remaining seconds; negative means missing or no expiry)
    const remaining = await redis.ttl('products');
    if (remaining >= 0 && remaining < 60) {
      backgroundRefresh('products'); // refreshes the key out of band
    }
    return res.json(JSON.parse(cached));
  }
  const products = await database.products.findAll();
  await redis.setEx('products', 300, JSON.stringify(products));
  res.json(products);
});
Caching Best Practices
Key Naming Conventions
// Consistent, hierarchical key naming
const KEY_PREFIX = 'app:';
function makeKey(...parts) {
return [KEY_PREFIX, ...parts].join(':');
}
// Usage
makeKey('user', userId); // app:user:123
makeKey('product', category, id); // app:product:electronics:456
makeKey('list', 'products', 'page', 1); // app:list:products:page:1
Serialization Strategies
// JSON (simple, universal)
await client.set('key', JSON.stringify(data));
const value = JSON.parse(await client.get('key'));
// MessagePack (smaller, faster) - note: packed values are Buffers, so the
// Redis client must be configured to return Buffers rather than strings
const msgpack = require('msgpack');
await client.set('key', msgpack.pack(data));
const value = msgpack.unpack(await client.get('key'));
// Compression for large values
const zlib = require('zlib');
async function cacheWithCompression(key, data) {
const compressed = zlib.deflateSync(JSON.stringify(data));
await client.set(key, compressed);
}
Error Handling
async function getWithFallback(key, fetcher, ttl = 3600) {
try {
const cached = await client.get(key);
if (cached) return JSON.parse(cached);
} catch (error) {
// Log but don't crash - fall through to database
console.error('Cache error:', error.message);
}
// Cache miss or error - fetch from source
const data = await fetcher();
if (data) {
try {
await client.setEx(key, ttl, JSON.stringify(data));
} catch (error) {
console.error('Cache write error:', error.message);
}
}
return data;
}
Metrics and Monitoring
const metrics = {
hits: 0,
misses: 0,
errors: 0
};
async function getWithMetrics(key) {
try {
const result = await client.get(key);
if (result) {
metrics.hits++;
return JSON.parse(result);
}
metrics.misses++;
return null;
} catch (error) {
metrics.errors++;
return null;
}
}
// Expose metrics
app.get('/metrics/cache', (req, res) => {
  // Guard against division by zero before any traffic arrives
  const total = metrics.hits + metrics.misses;
  const hitRate = total > 0 ? (metrics.hits / total) * 100 : 0;
  res.json({
    hits: metrics.hits,
    misses: metrics.misses,
    errors: metrics.errors,
    hitRate: hitRate.toFixed(2) + '%'
  });
});
Cache Patterns by Use Case
User Sessions
// Session storage in Redis
async function createSession(userId) {
const sessionId = crypto.randomUUID();
const session = { userId, createdAt: Date.now() };
await client.setEx(
`session:${sessionId}`,
86400, // 24 hours
JSON.stringify(session)
);
return sessionId;
}
async function getSession(sessionId) {
const session = await client.get(`session:${sessionId}`);
return session ? JSON.parse(session) : null;
}
Rate Limiting
// Sliding window rate limiter
async function isRateLimited(key, limit, window) {
const now = Date.now();
const windowStart = now - window;
// Remove old entries
await client.zRemRangeByScore(key, 0, windowStart);
// Count requests in window
const count = await client.zCard(key);
if (count >= limit) {
return true;
}
// Add current request
await client.zAdd(key, { score: now, value: `${now}` });
// window is in milliseconds, but EXPIRE takes whole seconds
await client.expire(key, Math.ceil(window / 1000));
return false;
}
// Usage
app.use(async (req, res, next) => {
const key = `ratelimit:${req.ip}`;
if (await isRateLimited(key, 100, 60000)) {
return res.status(429).json({ error: 'Too many requests' });
}
next();
});
API Response Caching
// Generic API cache middleware
function cacheApi(ttl = 300) {
return async (req, res, next) => {
if (req.method !== 'GET') return next();
const key = makeKey('api', req.path, JSON.stringify(req.query));
try {
const cached = await client.get(key);
if (cached) {
return res.json(JSON.parse(cached));
}
// Capture original json method
const originalJson = res.json.bind(res);
// Override to cache the response (fire-and-forget so the reply isn't delayed)
res.json = (data) => {
  client.setEx(key, ttl, JSON.stringify(data)).catch(() => {});
  return originalJson(data);
};
} catch (error) {
console.error('Cache middleware error:', error);
}
next();
};
}
app.use('/api', cacheApi(300));
Common Pitfalls and Solutions
Pitfall 1: Cache Stampede
// Problem: Multiple requests hit database simultaneously on cache miss
// Solution: Distributed lock
async function getWithLock(key, fetcher, ttl = 3600) {
  // Serve from cache when possible - waiters re-enter here after the
  // lock holder has filled the cache
  const cached = await client.get(key);
  if (cached) return JSON.parse(cached);
  const lockKey = `lock:${key}`;
  // Try to acquire the lock
  const acquired = await client.set(lockKey, '1', {
    NX: true,
    EX: 10 // 10 second lock timeout
  });
  if (acquired) {
    try {
      const data = await fetcher();
      await client.setEx(key, ttl, JSON.stringify(data));
      return data;
    } finally {
      await client.del(lockKey);
    }
  }
  // Another caller holds the lock: wait, then retry (hitting the cache check above)
  await new Promise(r => setTimeout(r, 100));
  return getWithLock(key, fetcher, ttl);
}
Pitfall 2: Large Value Serialization
// Problem: Storing massive objects
// Solution: Store only what's needed, use compression
async function cacheUserSummary(user) {
const summary = {
id: user.id,
name: user.name,
avatar: user.avatar
// Don't include all user data
};
await client.setEx(`user:summary:${user.id}`, 3600, JSON.stringify(summary));
}
Pitfall 3: TTL Chaos
// Problem: Random TTLs causing cache thrashing
// Solution: Use consistent TTL buckets
const TTL_BUCKETS = [60, 300, 900, 3600, 86400];
function normalizeTTL(desiredTTL) {
return TTL_BUCKETS.reduce((prev, curr) =>
Math.abs(curr - desiredTTL) < Math.abs(prev - desiredTTL) ? curr : prev
);
}
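A related failure mode is synchronized expiry: thousands of keys written in the same burst all lapse in the same second and stampede the database together. A common complement to fixed buckets is adding a small random jitter to each TTL (the ±10% spread here is an arbitrary choice, not from the original text):

```javascript
// Spread expirations by up to ±10% so keys cached in the same burst
// don't all expire in the same second
function jitteredTTL(baseTTL, spread = 0.1) {
  const delta = baseTTL * spread;
  return Math.round(baseTTL + (Math.random() * 2 - 1) * delta);
}
```

For example, `jitteredTTL(3600)` yields a value between 3240 and 3960 seconds, smearing a burst of writes across a twelve-minute expiry window.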
Performance Comparison
| Caching Layer | Latency | Capacity | Use Case |
|---|---|---|---|
| Browser Cache | <1ms | Limited | Static assets |
| CDN Edge | 5-50ms | Global | Public API responses |
| Redis | 1-5ms | GBs | User data, sessions |
| Database Cache | 5-20ms | GBs | Query results |
Conclusion
Caching is essential for building high-performance APIs. Start with HTTP caching for public resources, add Redis for dynamic data, and leverage CDNs for global distribution.
Remember these key principles: cache reads are fast, but cache invalidation is hard, so lean on TTLs and versioned keys to keep it simple. Monitor your cache hit rates and tune TTLs based on data freshness requirements. With proper caching, the same infrastructure can often absorb an order of magnitude more traffic at a fraction of the latency.