⚡ Calmops

Backend Performance Optimization: Caching, Connection Pooling, and Profiling

Introduction

Backend performance problems fall into a few categories: slow database queries (covered in Database Query Optimization), inefficient caching, blocking I/O, and resource exhaustion. This guide covers the non-database techniques that have the biggest impact.

Measure first. Profile before optimizing: you’ll almost always be surprised by where the actual bottleneck is.

Profiling Node.js

Built-in Profiler

# Generate a CPU profile
node --prof server.js

# Run load test while profiling
ab -n 1000 -c 10 http://localhost:3000/api/users

# Process the profile
node --prof-process isolate-*.log > profile.txt
cat profile.txt | head -50
Clinic.js

# Install the Clinic.js diagnostic suite
npm install -g clinic

# CPU profiling
clinic doctor -- node server.js

# Flame graph
clinic flame -- node server.js

# I/O bottleneck detection
clinic bubbleprof -- node server.js

Manual Timing

// Measure specific operations
async function getUsers(filters) {
    const start = performance.now();

    const users = await db.query('SELECT * FROM users WHERE ...', filters);

    const elapsed = performance.now() - start;
    if (elapsed > 100) {
        logger.warn('Slow getUsers', { elapsed, filters });
    }

    return users;
}

// Express middleware: log slow requests
app.use((req, res, next) => {
    const start = Date.now();
    res.on('finish', () => {
        const elapsed = Date.now() - start;
        if (elapsed > 500) {
            logger.warn('Slow request', {
                method: req.method,
                path: req.path,
                status: res.statusCode,
                elapsed,
            });
        }
    });
    next();
});

Caching with Redis

Cache-Aside Pattern

The most common caching pattern: check cache first, fall back to database:

import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

async function getUser(id) {
    const cacheKey = `user:${id}`;

    // 1. Check cache
    const cached = await redis.get(cacheKey);
    if (cached) {
        return JSON.parse(cached);
    }

    // 2. Cache miss: fetch from database
    const user = await db.query('SELECT * FROM users WHERE id = $1', [id]);
    if (!user) return null;

    // 3. Store in cache with TTL
    await redis.setEx(cacheKey, 3600, JSON.stringify(user));  // 1 hour TTL

    return user;
}

// Invalidate on update
async function updateUser(id, updates) {
    const user = await db.query(
        'UPDATE users SET ... WHERE id = $1 RETURNING *',
        [id, ...Object.values(updates)]
    );

    // Delete cache entry
    await redis.del(`user:${id}`);

    return user;
}

Cache Stampede Prevention

When a popular cache key expires, many requests hit the database simultaneously:

// Mutex lock prevents stampede
async function getUserWithLock(id) {
    const cacheKey = `user:${id}`;
    const lockKey  = `lock:user:${id}`;

    // Check cache
    const cached = await redis.get(cacheKey);
    if (cached) return JSON.parse(cached);

    // Try to acquire lock (NX = only set if not exists)
    const locked = await redis.set(lockKey, '1', { NX: true, EX: 10 });

    if (locked) {
        // We have the lock: fetch and cache
        try {
            const user = await db.query('SELECT * FROM users WHERE id = $1', [id]);
            await redis.setEx(cacheKey, 3600, JSON.stringify(user));
            return user;
        } finally {
            await redis.del(lockKey);
        }
    } else {
        // Another request is fetching: wait and retry
        await new Promise(r => setTimeout(r, 50));
        return getUserWithLock(id);
    }
}
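A complementary tactic, often used alongside the lock, is to randomize TTLs so that hot keys cached at the same moment don't all expire at the same moment. A minimal sketch (the function name and ±10% spread are illustrative choices, not from the original):

```javascript
// Spread expirations out by randomizing each TTL within ±spread of the base.
// Keys written in the same burst then expire at slightly different times,
// so a wave of simultaneous misses never forms in the first place.
function jitteredTtl(baseSeconds, spread = 0.1) {
    const delta = baseSeconds * spread;
    // Uniform random offset in [-delta, +delta]
    return Math.round(baseSeconds + (Math.random() * 2 - 1) * delta);
}
```

Usage in the cache-aside helpers above would be `redis.setEx(cacheKey, jitteredTtl(3600), JSON.stringify(user))` in place of the fixed 3600.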

Batch Caching

// Fetch multiple users efficiently
async function getUsers(ids) {
    // Check cache for all IDs at once
    const keys = ids.map(id => `user:${id}`);
    const cached = await redis.mGet(keys);

    const results = {};
    const missingIds = [];

    cached.forEach((value, i) => {
        if (value) {
            results[ids[i]] = JSON.parse(value);
        } else {
            missingIds.push(ids[i]);
        }
    });

    if (missingIds.length > 0) {
        // Fetch missing from database in one query
        const users = await db.query(
            'SELECT * FROM users WHERE id = ANY($1)',
            [missingIds]
        );

        // Cache the results
        const pipeline = redis.multi();
        users.forEach(user => {
            results[user.id] = user;
            pipeline.setEx(`user:${user.id}`, 3600, JSON.stringify(user));
        });
        await pipeline.exec();
    }

    return ids.map(id => results[id]);
}

Cache Patterns by Use Case

Pattern        | When to Use                                | TTL
---------------|--------------------------------------------|----------------------
Cache-aside    | General purpose                            | Minutes to hours
Write-through  | Data that must be consistent               | Same as DB
Write-behind   | High write volume, eventual consistency OK | N/A
Read-through   | Transparent caching layer                  | Minutes to hours
Refresh-ahead  | Predictable access patterns                | Refresh before expiry
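To make the write-through row concrete, here is a minimal sketch of how it differs from cache-aside: the write path updates cache and store together, so reads never observe stale data. Two `Map`s stand in for Redis and the database; `writeThrough`/`readThrough` are illustrative names, not an API from this guide:

```javascript
// Write-through sketch: every write hits the source of truth AND the cache
// in the same operation, so the cache can never lag behind the store.
const cache = new Map();
const store = new Map();  // stand-in for the database

async function writeThrough(key, value) {
    store.set(key, value);   // 1. write to the source of truth
    cache.set(key, value);   // 2. update the cache in the same step
}

async function readThrough(key) {
    if (cache.has(key)) return cache.get(key);   // hit
    const value = store.get(key) ?? null;        // miss: load from store
    if (value !== null) cache.set(key, value);   // populate for next time
    return value;
}
```

Contrast with cache-aside above, where the cache entry is deleted on update and repopulated lazily on the next read.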

Connection Pooling

Creating a new database connection for every request is expensive. Use a pool:

// PostgreSQL with pg
import { Pool } from 'pg';

const pool = new Pool({
    connectionString: process.env.DATABASE_URL,
    max: 20,                    // max connections in pool
    min: 5,                     // min idle connections
    idleTimeoutMillis: 30000,   // close idle connections after 30s
    connectionTimeoutMillis: 2000,  // fail if can't connect in 2s
    maxUses: 7500,              // recycle connections after 7500 uses
});

// Monitor pool health
pool.on('error', (err) => {
    logger.error('Unexpected pool error', err);
});

// Use the pool
async function query(sql, params) {
    const client = await pool.connect();
    try {
        return await client.query(sql, params);
    } finally {
        client.release();  // always release back to pool
    }
}

// Check pool stats
setInterval(() => {
    logger.info('Pool stats', {
        total: pool.totalCount,
        idle: pool.idleCount,
        waiting: pool.waitingCount,
    });
}, 60000);

// MongoDB with Mongoose
import mongoose from 'mongoose';

await mongoose.connect(process.env.MONGODB_URI, {
    maxPoolSize: 10,        // max connections
    minPoolSize: 2,         // min idle connections
    serverSelectionTimeoutMS: 5000,
    socketTimeoutMS: 45000,
});

Async Optimization

Parallel vs Sequential

// BAD: sequential (total time = sum of all times)
async function getUserDashboard(userId) {
    const user    = await getUser(userId);        // 50ms
    const posts   = await getUserPosts(userId);   // 80ms
    const friends = await getUserFriends(userId); // 60ms
    // Total: ~190ms
    return { user, posts, friends };
}

// GOOD: parallel (total time = max of all times)
async function getUserDashboard(userId) {
    const [user, posts, friends] = await Promise.all([
        getUser(userId),        // 50ms ─┐
        getUserPosts(userId),   // 80ms  ├─ run simultaneously
        getUserFriends(userId), // 60ms ─┘
    ]);
    // Total: ~80ms
    return { user, posts, friends };
}
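One caveat with `Promise.all`: if any call rejects, the whole dashboard fails. When some sections are optional, `Promise.allSettled` lets them degrade independently. A sketch with injected fetchers (the `fetchers` parameter is a testing convenience, not part of the original example):

```javascript
// Partial-failure variant: Promise.allSettled never rejects; it reports
// every outcome, so optional sections can fall back to null while the
// required user record still fails loudly.
async function getDashboardSettled(userId, fetchers) {
    const [user, posts, friends] = await Promise.allSettled([
        fetchers.getUser(userId),
        fetchers.getUserPosts(userId),
        fetchers.getUserFriends(userId),
    ]);

    if (user.status === 'rejected') throw user.reason;  // required
    return {
        user: user.value,
        posts: posts.status === 'fulfilled' ? posts.value : null,
        friends: friends.status === 'fulfilled' ? friends.value : null,
    };
}
```

The calls still run in parallel; only the failure handling changes.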

Streaming Large Responses

// BAD: loads entire dataset into memory
app.get('/export', async (req, res) => {
    const allUsers = await db.query('SELECT * FROM users');  // could be millions
    res.json(allUsers);
});

// GOOD: stream results

app.get('/export', async (req, res) => {
    res.setHeader('Content-Type', 'application/json');
    res.write('[');

    let first = true;
    // .cursor() assumes a cursor-capable client (e.g. pg-cursor / pg-query-stream)
    const cursor = db.query('SELECT * FROM users').cursor();

    for await (const row of cursor) {
        if (!first) res.write(',');
        res.write(JSON.stringify(row));
        first = false;
    }

    res.write(']');
    res.end();
});
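The loop above ignores backpressure: `res.write()` returns `false` when the outgoing buffer is full, and continuing to write anyway buffers rows in memory, which defeats the point of streaming. A backpressure-aware sketch, written against a generic `Writable` so it runs without a server (an Express `res` is also a Writable, so the same function works there):

```javascript
import { Writable } from 'stream';

// Stream rows as a JSON array, pausing on 'drain' whenever write() reports
// a full buffer, so memory stays flat no matter how slow the client is.
async function streamJsonArray(rows, dest) {
    dest.write('[');
    let first = true;
    for await (const row of rows) {
        const chunk = (first ? '' : ',') + JSON.stringify(row);
        first = false;
        if (!dest.write(chunk)) {
            // Buffer full: wait for the destination to catch up
            await new Promise(resolve => dest.once('drain', resolve));
        }
    }
    dest.write(']');
    dest.end();
}
```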

Worker Threads for CPU Work

// main.js: offload CPU-intensive work to a worker thread
import { Worker } from 'worker_threads';

function runWorker(data) {
    return new Promise((resolve, reject) => {
        const worker = new Worker('./workers/processor.js', {
            workerData: data
        });
        worker.on('message', resolve);
        worker.on('error', reject);
    });
}

app.post('/process', async (req, res) => {
    // Don't block the event loop with heavy computation
    const result = await runWorker(req.body);
    res.json(result);
});

// workers/processor.js
import { workerData, parentPort } from 'worker_threads';

// CPU-intensive work runs in separate thread
const result = heavyComputation(workerData);
parentPort.postMessage(result);
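One gap in the `runWorker` helper above: if the worker dies before posting a message, the promise never settles. Listening for `'exit'` closes that hole. The sketch below is self-contained by passing the worker source inline with `eval: true` (a real option in `worker_threads`, used here purely so the demo runs without a separate file); the sum-of-squares loop stands in for real CPU-heavy work:

```javascript
import { Worker } from 'worker_threads';

// Inline worker source; eval'd workers use CommonJS, hence require().
const workerSource = `
    const { workerData, parentPort } = require('worker_threads');
    let sum = 0;
    for (let i = 0; i < workerData.n; i++) sum += i * i;  // stand-in CPU work
    parentPort.postMessage(sum);
`;

function runInlineWorker(data) {
    return new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, { eval: true, workerData: data });
        worker.on('message', resolve);
        worker.on('error', reject);
        // Without this, a worker that crashes before messaging hangs forever
        worker.on('exit', code => {
            if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
        });
    });
}
```

The same `'exit'` guard is worth adding to the file-based `runWorker` above.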

HTTP Response Optimization

Compression

import compression from 'compression';

app.use(compression({
    level: 6,           // compression level (1-9, 6 is good balance)
    threshold: 1024,    // only compress responses > 1KB
    filter: (req, res) => {
        // Honor an explicit client opt-out, then fall back to the default
        // filter (which skips already-compressed content types)
        if (req.headers['x-no-compression']) return false;
        return compression.filter(req, res);
    },
}));

Response Caching Headers

// Cache static API responses
app.get('/api/config', (req, res) => {
    res.set('Cache-Control', 'public, max-age=3600');  // 1 hour
    res.json(config);
});

// No cache for user-specific data
app.get('/api/me', authenticate, (req, res) => {
    res.set('Cache-Control', 'private, no-cache');
    res.json(req.user);
});

// ETag for conditional requests
app.get('/api/posts/:id', async (req, res) => {
    const post = await getPost(req.params.id);
    const etag = `"${post.updatedAt.getTime()}"`;

    if (req.headers['if-none-match'] === etag) {
        return res.status(304).end();  // Not Modified
    }

    res.set('ETag', etag);
    res.json(post);
});
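The ETag above relies on an `updatedAt` timestamp. When a resource has no version field, a strong ETag can be derived by hashing the serialized body instead. A sketch using Node's built-in `crypto` (assumes a Node version where `digest('base64url')` is available, roughly 15.7+; the function name is illustrative):

```javascript
import { createHash } from 'crypto';

// Hash-based strong ETag: identical bodies always produce the identical
// tag, so a client holding a matching If-None-Match gets a 304 without
// the payload being re-sent.
function bodyEtag(body) {
    const hash = createHash('sha1')
        .update(JSON.stringify(body))
        .digest('base64url');
    return `"${hash}"`;  // ETags are quoted strings per the HTTP spec
}
```

Usage mirrors the handler above: compute `bodyEtag(post)`, compare against `req.headers['if-none-match']`, and return 304 on a match. Note the hash costs CPU per request, so the timestamp variant is cheaper when a version field exists.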

Rate Limiting

Protect your backend from abuse and ensure fair resource distribution:

import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';

// General API rate limit
const apiLimiter = rateLimit({
    windowMs: 15 * 60 * 1000,  // 15 minutes
    max: 100,                   // 100 requests per window
    standardHeaders: true,
    legacyHeaders: false,
    store: new RedisStore({
        sendCommand: (...args) => redis.sendCommand(args),
    }),
});

// Stricter limit for auth endpoints
const authLimiter = rateLimit({
    windowMs: 15 * 60 * 1000,
    max: 10,
    message: { error: 'Too many login attempts, try again in 15 minutes' },
});

app.use('/api/', apiLimiter);
app.use('/api/auth/', authLimiter);

Performance Checklist

  • Profile before optimizing (use clinic.js or --prof)
  • Add Redis caching for expensive, frequently-read data
  • Use connection pooling for all databases
  • Run independent async operations in parallel with Promise.all
  • Stream large responses instead of loading into memory
  • Enable gzip/brotli compression
  • Set appropriate Cache-Control headers
  • Use worker threads for CPU-intensive operations
  • Implement rate limiting
  • Monitor p95/p99 latency, not just averages
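On the last point: averages hide tail latency, since a handful of 2-second responses barely move the mean but dominate p99. A minimal nearest-rank percentile over raw samples, for illustration only (production systems use a histogram in a metrics backend rather than sorting samples):

```javascript
// Nearest-rank percentile over a sample of latencies (ms).
function percentile(samples, p) {
    if (samples.length === 0) return NaN;
    const sorted = [...samples].sort((a, b) => a - b);
    // Smallest value such that at least p% of samples are <= it
    const rank = Math.ceil((p / 100) * sorted.length) - 1;
    return sorted[Math.max(0, rank)];
}
```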
