Introduction
Rate limiting protects your API from abuse, ensures fair usage, and prevents cascading failures. This guide covers the main algorithms and implementations for production-ready rate limiting.
Why Rate Limiting?
┌─────────────────────────────────────────────────────────────┐
│ Rate Limiting Purposes │
├─────────────────────────────────────────────────────────────┤
│ │
│ 1. Prevent Abuse │
│ • Stop malicious users │
│ • Block scrapers and bots │
│ • Prevent DoS attacks │
│ │
│ 2. Ensure Fairness │
│ • One user doesn't monopolize resources │
│ • Premium users get priority │
│ • Free tier limits respected │
│ │
│ 3. Cost Control │
│ • Prevent unexpected bills │
│ • Match capacity to pricing │
│ • Graceful degradation │
│ │
│ 4. Stability │
│ • Prevent cascading failures │
│ • Maintain SLA for all users │
│ │
└─────────────────────────────────────────────────────────────┘
Rate Limiting Algorithms
1. Fixed Window
┌─────────────────────────────────────────────────────────────┐
│ Fixed Window Algorithm │
├─────────────────────────────────────────────────────────────┤
│ │
│ Window: 1 minute │
│ Limit: 100 requests │
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Minute 1: ████████████████████ 80/100 │ │
│ └─────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Minute 2: ████████████████████ 80/100 │ │
│ └─────────────────────────────────────────────┘ │
│ │ │
│ ⚠️ Burst problem: 150 requests at minute boundary │
│ 80 late in minute 1 + 70 early in minute 2 = 150 in seconds │
│ │
└─────────────────────────────────────────────────────────────┘
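// Fixed Window Implementation
class FixedWindowRateLimiter {
  // Per-key counter for the current window. Entries are never evicted here,
  // so a long-running process should prune stale keys periodically.
  private windows = new Map<string, { count: number; windowStart: number }>();

  constructor(private limit: number, private windowMs: number) {}

  async isAllowed(key: string): Promise<boolean> {
    const now = Date.now();
    // Align the window to a multiple of windowMs (e.g. the top of each minute)
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    let window = this.windows.get(key);
    // Start a fresh counter when the key is new or the window has rolled over
    if (!window || window.windowStart !== windowStart) {
      window = { count: 0, windowStart };
      this.windows.set(key, window);
    }
    if (window.count >= this.limit) {
      return false;
    }
    window.count++;
    return true;
  }
}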
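To make the boundary burst concrete, here is a minimal sketch of the same fixed-window logic with the clock passed in as a parameter. The `makeFixedWindow` and `countAllowed` names and the explicit `now` argument are illustrative additions, not part of the class above. With a limit of 100 per minute, a client that sends 100 requests in the last second of one window and 100 more in the first second of the next has all 200 accepted.

```typescript
type WindowState = { count: number; windowStart: number };

// Same fixed-window check as above, but `now` is supplied by the caller
// (an assumption made here so the timeline can be replayed deterministically).
function makeFixedWindow(limit: number, windowMs: number) {
  const windows = new Map<string, WindowState>();
  return function isAllowed(key: string, now: number): boolean {
    const windowStart = Math.floor(now / windowMs) * windowMs;
    let state = windows.get(key);
    if (!state || state.windowStart !== windowStart) {
      state = { count: 0, windowStart };
      windows.set(key, state);
    }
    if (state.count >= limit) return false;
    state.count++;
    return true;
  };
}

// Hypothetical helper: send `n` requests at time `now`, return how many pass.
function countAllowed(
  check: (key: string, now: number) => boolean,
  n: number,
  now: number
): number {
  let accepted = 0;
  for (let i = 0; i < n; i++) {
    if (check('client-1', now)) accepted++;
  }
  return accepted;
}

const fixedWindowCheck = makeFixedWindow(100, 60000);
const lateInMinute1 = countAllowed(fixedWindowCheck, 100, 59000);  // last second of window 1
const earlyInMinute2 = countAllowed(fixedWindowCheck, 100, 60500); // first second of window 2
console.log(lateInMinute1 + earlyInMinute2); // 200 accepted within 1.5 seconds
```

Neither window exceeds its own limit, which is why the limiter accepts everything. The short-term rate the backend sees is still twice the configured one.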
2. Sliding Window
┌─────────────────────────────────────────────────────────────┐
│ Sliding Window Algorithm │
├─────────────────────────────────────────────────────────────┤
│ │
│ Window: 1 minute (sliding) │
│ Limit: 100 requests │
│ │
│ Current time: 45s │
│ ┌─────────────────────────────────────────────┐ │
│ │ 60s ago ────────────────────────── now │ │
│ │ ████████████████░░░░░░░░░░░░░░░░░░ 60 req │ │
│ └─────────────────────────────────────────────┘ │
│ │ │
│ ✓ Smoother than fixed window │
│ ✓ No burst at boundaries │
│ │
└─────────────────────────────────────────────────────────────┘
// Sliding Window Implementation (log of request timestamps per key)
class SlidingWindowRateLimiter {
  // Stores up to `limit` timestamps per key, so memory grows with the limit
  private requests = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  async isAllowed(key: string): Promise<boolean> {
    const now = Date.now();
    const windowStart = now - this.windowMs;
    // Get existing requests and keep only those still inside the window
    let times = this.requests.get(key) || [];
    times = times.filter(t => t > windowStart);
    if (times.length >= this.limit) {
      return false;
    }
    times.push(now);
    this.requests.set(key, times);
    return true;
  }
}
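The timestamp log above keeps one entry per accepted request. A commonly used alternative is the sliding window counter, which is a different technique from the class above. It keeps only two counters per key and weights the previous fixed window by how much of it the sliding window still overlaps. The sketch below is an approximation under that assumption, and the class name and injectable `now` parameter are illustrative.

```typescript
type Counters = { currentStart: number; current: number; previous: number };

class SlidingWindowCounter {
  private counters = new Map<string, Counters>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `now` defaults to the wall clock but can be supplied for testing
  isAllowed(key: string, now: number = Date.now()): boolean {
    const currentStart = Math.floor(now / this.windowMs) * this.windowMs;
    let c = this.counters.get(key);
    if (!c) {
      c = { currentStart, current: 0, previous: 0 };
      this.counters.set(key, c);
    } else if (c.currentStart !== currentStart) {
      // Roll over: the old count only matters if that window is adjacent
      const adjacent = currentStart - c.currentStart === this.windowMs;
      c.previous = adjacent ? c.current : 0;
      c.current = 0;
      c.currentStart = currentStart;
    }
    // Fraction of the previous fixed window still covered by the sliding window
    const overlap = 1 - (now - currentStart) / this.windowMs;
    const estimated = c.previous * overlap + c.current;
    if (estimated >= this.limit) return false;
    c.current++;
    return true;
  }
}
```

With a limit of 100 per minute, a client that used its full quota at the end of one minute gets almost nothing in the first second of the next, because the previous window is still weighted at about 99%. The estimate assumes requests were spread evenly across the previous window, so it can be slightly off in either direction.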
3. Token Bucket
┌─────────────────────────────────────────────────────────────┐
│ Token Bucket Algorithm │
├─────────────────────────────────────────────────────────────┤
│ │
│ Bucket capacity: 100 tokens │
│ Refill rate: 10 tokens/second │
│ │
│ Request arrives: │
│ ┌─────────────────────────────────────────────┐ │
│ │ Token Bucket │ │
│ │ ┌──────────────────────────────────────┐ │ │
│ │ │ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ (50 tokens) │ │ │
│ │ └──────────────────────────────────────┘ │ │
│ │ │ │
│ │ Request needs: 5 tokens │ │
│ │ ✓ Available, consume tokens │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ ✓ Allows bursts up to bucket capacity │
│ ✓ Smooth rate over time │
│ │
└─────────────────────────────────────────────────────────────┘
// Token Bucket Implementation
class TokenBucketRateLimiter {
  private buckets = new Map<string, { tokens: number; lastRefill: number }>();

  constructor(
    private capacity: number,
    private refillRate: number // tokens per ms (10 tokens/second = 0.01)
  ) {}

  async isAllowed(key: string, cost: number = 1): Promise<boolean> {
    const now = Date.now();
    let bucket = this.buckets.get(key);
    // New keys start with a full bucket
    if (!bucket) {
      bucket = { tokens: this.capacity, lastRefill: now };
      this.buckets.set(key, bucket);
    }
    // Refill tokens in proportion to elapsed time, capped at capacity
    const timePassed = now - bucket.lastRefill;
    const tokensToAdd = timePassed * this.refillRate;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + tokensToAdd);
    bucket.lastRefill = now;
    if (bucket.tokens >= cost) {
      bucket.tokens -= cost;
      return true;
    }
    return false;
  }
}
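As a worked example of the numbers in the diagram (capacity 100, refill 10 tokens per second), the sketch below replays a burst and also computes how long a rejected caller would need to wait. The `makeTokenBucket` function, the `retryAfterMs` field, and the explicit `now` argument are assumptions added for illustration, and the original class returns only a boolean.

```typescript
type Bucket = { tokens: number; lastRefill: number };
type Decision = { allowed: boolean; retryAfterMs: number };

function makeTokenBucket(capacity: number, refillPerMs: number) {
  const buckets = new Map<string, Bucket>();
  return function take(key: string, cost: number, now: number): Decision {
    let bucket = buckets.get(key);
    if (!bucket) {
      bucket = { tokens: capacity, lastRefill: now };
      buckets.set(key, bucket);
    }
    // Refill in proportion to elapsed time, capped at capacity
    const refilled = bucket.tokens + (now - bucket.lastRefill) * refillPerMs;
    bucket.tokens = Math.min(capacity, refilled);
    bucket.lastRefill = now;
    if (bucket.tokens >= cost) {
      bucket.tokens -= cost;
      return { allowed: true, retryAfterMs: 0 };
    }
    // Time needed to refill the shortfall at the configured rate
    const retryAfterMs = Math.ceil((cost - bucket.tokens) / refillPerMs);
    return { allowed: false, retryAfterMs };
  };
}

const take = makeTokenBucket(100, 10 / 1000); // 10 tokens per second
const burst = take('client-1', 100, 0);  // a full bucket absorbs a burst of 100
const denied = take('client-1', 5, 0);   // bucket is empty, so 5 tokens are not available
const later = take('client-1', 5, 600);  // 600 ms later about 6 tokens have been refilled
console.log(burst.allowed, denied.allowed, later.allowed); // true false true
```

The wait time is the shortfall divided by the refill rate: 5 missing tokens at 0.01 tokens per millisecond is roughly 500 ms. A value like this maps naturally onto a `Retry-After` header.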
Implementation Examples
Express Middleware
// Express rate limiting middleware
import express from 'express';
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import Redis from 'ioredis';

const app = express();
const redis = new Redis(process.env.REDIS_URL);

// Basic rate limiter
export const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  limit: 100, // Limit each IP to 100 requests per window
  standardHeaders: 'draft-7',
  legacyHeaders: false,
  // Shared Redis store so every app instance counts against the same window
  store: new RedisStore({
    sendCommand: (command: string, ...args: string[]) => redis.call(command, ...args),
  }),
  message: {
    success: false,
    message: 'Too many requests, please try again later',
  },
});

// Stricter limiter for auth endpoints
// (no store is set, so these counts are kept in memory per process)
export const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  limit: 5, // Only 5 failed attempts; successful logins are not counted
  skipSuccessfulRequests: true,
  message: {
    success: false,
    message: 'Too many login attempts',
  },
});

// Apply to routes
app.use('/api/', apiLimiter);
app.use('/api/auth/login', authLimiter);
Custom Redis Implementation
// Redis-based sliding window (sorted set of request timestamps per key)
import Redis from 'ioredis';

class RedisRateLimiter {
  constructor(private redis: Redis) {}

  async slidingWindow(
    key: string,
    limit: number,
    windowMs: number
  ): Promise<{ allowed: boolean; remaining: number }> {
    const now = Date.now();
    const windowStart = now - windowMs;
    const member = `${now}-${Math.random()}`;

    // Use Redis transaction
    const multi = this.redis.multi();
    // Remove old entries
    multi.zremrangebyscore(key, 0, windowStart);
    // Count current requests (runs before the add, so it excludes this one)
    multi.zcard(key);
    // Add current request
    multi.zadd(key, now, member);
    // Set expiry
    multi.pexpire(key, windowMs);
    const results = await multi.exec();

    const currentCount = (results?.[1]?.[1] as number) ?? 0;
    if (currentCount >= limit) {
      // Undo the add so rejected requests do not keep the window full
      await this.redis.zrem(key, member);
      return { allowed: false, remaining: 0 };
    }
    return { allowed: true, remaining: limit - currentCount - 1 };
  }
}
Response Headers
// Rate limit headers, set by hand for illustration
app.use((req, res, next) => {
  // Placeholder values: in practice these come from the limiter's state
  const limit = 100;
  const remaining = 99;
  const secondsToReset = 60; // seconds until the current window resets

  // Add standard headers
  res.set('RateLimit-Limit', String(limit));
  res.set('RateLimit-Remaining', String(remaining));
  res.set('RateLimit-Reset', String(secondsToReset));

  // When limit exceeded
  if (remaining <= 0) {
    res.set('Retry-After', String(secondsToReset));
  }
  next();
});
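One way to keep header logic out of the middleware is a small pure function that turns limiter state into header values. The sketch below is hypothetical: `LimitState` and `buildRateLimitHeaders` are names invented here, and it assumes `RateLimit-Reset` carries the number of seconds until the window resets, as in the IETF RateLimit header drafts, with `Retry-After` added only for rejected requests.

```typescript
type LimitState = {
  allowed: boolean;   // whether the current request was accepted
  limit: number;      // requests permitted per window
  remaining: number;  // requests left in the current window
  resetAtMs: number;  // epoch milliseconds at which the window resets
};

// Hypothetical helper: build response headers from limiter state.
function buildRateLimitHeaders(
  state: LimitState,
  nowMs: number
): Record<string, string> {
  // Round up so clients never retry slightly too early
  const secondsToReset = Math.max(0, Math.ceil((state.resetAtMs - nowMs) / 1000));
  const headers: Record<string, string> = {
    'RateLimit-Limit': String(state.limit),
    'RateLimit-Remaining': String(Math.max(0, state.remaining)),
    'RateLimit-Reset': String(secondsToReset),
  };
  if (!state.allowed) {
    // Only rejected (429) responses tell the client how long to wait
    headers['Retry-After'] = String(secondsToReset);
  }
  return headers;
}

const okHeaders = buildRateLimitHeaders(
  { allowed: true, limit: 100, remaining: 99, resetAtMs: 60000 },
  0
);
console.log(okHeaders['RateLimit-Reset']); // "60"
```

In an Express handler the result could be applied with `res.set(headers)` before sending either the normal response or a 429.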
Tiered Rate Limiting
# Different limits for different plans
plans:
  free:
    requests_per_minute: 10
    requests_per_day: 1000
  basic:
    requests_per_minute: 60
    requests_per_day: 10000
  pro:
    requests_per_minute: 300
    requests_per_day: 100000
  enterprise:
    requests_per_minute: "custom"
    requests_per_day: "unlimited"
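// Plan-based rate limiting
function getRateLimit(plan: string) {
  const limits: Record<string, { windowMs: number; limit: number }> = {
    free: { windowMs: 60000, limit: 10 },
    basic: { windowMs: 60000, limit: 60 },
    pro: { windowMs: 60000, limit: 300 },
    enterprise: { windowMs: 60000, limit: 10000 }, // stand-in for a custom limit
  };
  return limits[plan] || limits.free;
}

// Create one limiter per plan at startup. A limiter built inside the request
// handler gets its own empty in-memory store, so its count would never build up.
const planLimiters = new Map(
  ['free', 'basic', 'pro', 'enterprise'].map(
    (plan) => [plan, rateLimit(getRateLimit(plan))] as const
  )
);

app.use('/api/', (req, res, next) => {
  const plan = req.user?.plan || 'free';
  // Apply appropriate limiter
  const limiter = planLimiters.get(plan) ?? planLimiters.get('free')!;
  limiter(req, res, next);
});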
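The plan table lists both a per-minute and a per-day quota, while the middleware above enforces only the per-minute one. A minimal in-memory sketch of checking both together might look like the following. `PLAN_LIMITS`, `windowKey`, and `isAllowedForPlan` are hypothetical names, fixed windows are used for simplicity, and the enterprise plan is left out because its limits are custom.

```typescript
type PlanLimits = { perMinute: number; perDay: number };

// Values copied from the plan table above
const FREE_PLAN: PlanLimits = { perMinute: 10, perDay: 1000 };
const PLAN_LIMITS: Record<string, PlanLimits> = {
  free: FREE_PLAN,
  basic: { perMinute: 60, perDay: 10000 },
  pro: { perMinute: 300, perDay: 100000 },
};

const MINUTE_MS = 60 * 1000;
const DAY_MS = 24 * 60 * MINUTE_MS;

// In-memory counters; old window keys are never evicted in this sketch
const counts = new Map<string, number>();

// Hypothetical helper: counter key for one user and one fixed window
function windowKey(userId: string, windowMs: number, now: number): string {
  return `${userId}:${windowMs}:${Math.floor(now / windowMs)}`;
}

function isAllowedForPlan(userId: string, plan: string, now: number): boolean {
  // Unknown plans fall back to the free tier
  const limits = PLAN_LIMITS[plan] ?? FREE_PLAN;
  const minuteKey = windowKey(userId, MINUTE_MS, now);
  const dayKey = windowKey(userId, DAY_MS, now);
  const usedMinute = counts.get(minuteKey) ?? 0;
  const usedDay = counts.get(dayKey) ?? 0;
  // Check both quotas before incrementing, so a rejected request consumes neither
  if (usedMinute >= limits.perMinute || usedDay >= limits.perDay) {
    return false;
  }
  counts.set(minuteKey, usedMinute + 1);
  counts.set(dayKey, usedDay + 1);
  return true;
}
```

A free-tier user who stays at exactly 10 requests per minute would exhaust the daily quota of 1000 after 100 minutes, at which point requests are rejected until the next day window begins.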
Key Takeaways
- Fixed Window - Simple, but allows bursts at boundaries
- Sliding Window - No boundary bursts, but stores a timestamp per request
- Token Bucket - Allows bursts up to capacity while holding a steady average rate
- Use Redis - Distributed rate limiting across instances
- Return proper headers - Help clients respect limits