API Rate Limiting Strategies: Implementation Guide

Introduction

Rate limiting protects your API from abuse, ensures fair usage, and prevents cascading failures. This guide covers the main algorithms and implementations for production-ready rate limiting.

Why Rate Limiting?

┌─────────────────────────────────────────────────────────────┐
│              Rate Limiting Purposes                            │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Prevent Abuse                                           │
│     • Stop malicious users                                   │
│     • Block scrapers and bots                               │
│     • Prevent DoS attacks                                   │
│                                                             │
│  2. Ensure Fairness                                         │
│     • One user doesn't monopolize resources                 │
│     • Premium users get priority                            │
│     • Free tier limits respected                           │
│                                                             │
│  3. Cost Control                                             │
│     • Prevent unexpected bills                              │
│     • Match capacity to pricing                             │
│     • Graceful degradation                                  │
│                                                             │
│  4. Stability                                                │
│     • Prevent cascading failures                            │
│     • Maintain SLA for all users                           │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Rate Limiting Algorithms

1. Fixed Window

┌─────────────────────────────────────────────────────────────┐
│                  Fixed Window Algorithm                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Window: 1 minute                                          │
│  Limit: 100 requests                                       │
│                                                             │
│  ┌─────────────────────────────────────────────┐            │
│  │ Minute 1: ████████████████████ 80/100     │            │
│  └─────────────────────────────────────────────┘            │
│                         │                                    │
│  ┌─────────────────────────────────────────────┐            │
│  │ Minute 2: ████████████████████ 80/100     │            │
│  └─────────────────────────────────────────────┘            │
│                         │                                    │
│  ⚠️ Burst problem: 150 requests around the minute boundary  │
│     80 at end of minute 1 + 70 at start of minute 2 = 150   │
│     (over the 100/min limit within a few seconds)           │
│                                                             │
└─────────────────────────────────────────────────────────────┘

// Fixed Window Implementation
class FixedWindowRateLimiter {
  private windows = new Map<string, { count: number; windowStart: number }>();
  
  constructor(private limit: number, private windowMs: number) {}
  
  async isAllowed(key: string): Promise<boolean> {
    const now = Date.now();
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    
    let window = this.windows.get(key);
    
    if (!window || window.windowStart !== windowStart) {
      window = { count: 0, windowStart };
      this.windows.set(key, window);
    }
    
    if (window.count >= this.limit) {
      return false;
    }
    
    window.count++;
    return true;
  }
}
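
The boundary burst shown in the diagram can be reproduced with a small sketch. It is not the class above: `makeFixedWindow` is a hypothetical helper that applies the same counting logic to a single key and takes the timestamp as an argument, so the run is deterministic.

```typescript
// Minimal fixed-window counter with an explicit clock (illustration only).
// `now` is passed in so the boundary behaviour can be replayed exactly.
function makeFixedWindow(limit: number, windowMs: number) {
  let windowStart = -1;
  let count = 0;
  return (now: number): boolean => {
    const start = Math.floor(now / windowMs) * windowMs;
    if (start !== windowStart) {
      // A new window has begun: reset the counter
      windowStart = start;
      count = 0;
    }
    if (count >= limit) return false;
    count++;
    return true;
  };
}

// Replay 100 requests just before the minute boundary and 100 just after it.
const allowFixed = makeFixedWindow(100, 60_000);
let acceptedAtBoundary = 0;
for (let i = 0; i < 100; i++) if (allowFixed(59_000 + i)) acceptedAtBoundary++; // 59.000s to 59.099s
for (let i = 0; i < 100; i++) if (allowFixed(60_000 + i)) acceptedAtBoundary++; // 60.000s to 60.099s

console.log(acceptedAtBoundary); // 200: twice the limit in about one second
```

In the worst case a fixed window admits up to twice the configured limit across a boundary, which is the main reason to consider the next two algorithms.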

2. Sliding Window

┌─────────────────────────────────────────────────────────────┐
│                Sliding Window Algorithm                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Window: 1 minute (sliding)                                 │
│  Limit: 100 requests                                        │
│                                                             │
│  Current time: 45s                                          │
│  ┌─────────────────────────────────────────────┐            │
│  │ 60s ago ────────────────────────── now     │            │
│  │ ████████████████░░░░░░░░░░░░░░░░░░ 60 req  │            │
│  └─────────────────────────────────────────────┘            │
│                         │                                    │
│  ✓ Smoother than fixed window                              │
│  ✓ No burst at boundaries                                  │
│                                                             │
└─────────────────────────────────────────────────────────────┘

// Sliding Window Implementation
class SlidingWindowRateLimiter {
  private requests = new Map<string, number[]>();
  
  constructor(private limit: number, private windowMs: number) {}
  
  async isAllowed(key: string): Promise<boolean> {
    const now = Date.now();
    const windowStart = now - this.windowMs;
    
    // Keep only the timestamps that still fall inside the window
    const times = (this.requests.get(key) || []).filter(t => t > windowStart);
    // Store the pruned list even when rejecting, so stale entries don't pile up
    this.requests.set(key, times);
    
    if (times.length >= this.limit) {
      return false;
    }
    
    times.push(now);
    return true;
  }
}
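
To see why the sliding window has no burst at boundaries, the sketch below replays 200 requests straddling a minute boundary. `makeSlidingWindow` is a hypothetical single-key helper with an explicit clock, written only for this illustration.

```typescript
// Minimal sliding-window log with an explicit clock (illustration only).
function makeSlidingWindow(limit: number, windowMs: number) {
  let times: number[] = [];
  return (now: number): boolean => {
    // Drop timestamps that have slid out of the window
    times = times.filter(t => t > now - windowMs);
    if (times.length >= limit) return false;
    times.push(now);
    return true;
  };
}

const allowSliding = makeSlidingWindow(100, 60_000);
let acceptedSliding = 0;
for (let i = 0; i < 100; i++) if (allowSliding(59_000 + i)) acceptedSliding++; // 59.000s to 59.099s
for (let i = 0; i < 100; i++) if (allowSliding(60_000 + i)) acceptedSliding++; // 60.000s to 60.099s

console.log(acceptedSliding); // 100: the second batch is rejected

// Capacity returns gradually as old timestamps leave the window.
console.log(allowSliding(119_001)); // true: the two oldest requests have expired
```

The trade-off is memory: this approach keeps one timestamp per accepted request for each key.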

3. Token Bucket

┌─────────────────────────────────────────────────────────────┐
│                   Token Bucket Algorithm                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Bucket capacity: 100 tokens                               │
│  Refill rate: 10 tokens/second                             │
│                                                             │
│  Request arrives:                                          │
│  ┌─────────────────────────────────────────────┐            │
│  │           Token Bucket                        │            │
│  │  ┌──────────────────────────────────────┐  │            │
│  │  │ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○  (50 tokens)    │  │            │
│  │  └──────────────────────────────────────┘  │            │
│  │                                            │            │
│  │  Request needs: 5 tokens                  │            │
│  │  ✓ Available, consume tokens              │            │
│  └─────────────────────────────────────────────┘            │
│                                                             │
│  ✓ Allows bursts up to bucket capacity                   │
│  ✓ Smooth rate over time                                  │
│                                                             │
└─────────────────────────────────────────────────────────────┘

// Token Bucket Implementation
class TokenBucketRateLimiter {
  private buckets = new Map<string, { tokens: number; lastRefill: number }>();
  
  constructor(
    private capacity: number,
    private refillRate: number // tokens per ms (e.g. 10 tokens/second = 0.01)
  ) {}
  
  async isAllowed(key: string, cost: number = 1): Promise<boolean> {
    const now = Date.now();
    let bucket = this.buckets.get(key);
    
    if (!bucket) {
      bucket = { tokens: this.capacity, lastRefill: now };
      this.buckets.set(key, bucket);
    }
    
    // Refill tokens
    const timePassed = now - bucket.lastRefill;
    const tokensToAdd = timePassed * this.refillRate;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + tokensToAdd);
    bucket.lastRefill = now;
    
    if (bucket.tokens >= cost) {
      bucket.tokens -= cost;
      return true;
    }
    
    return false;
  }
}
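
A rejected caller usually wants to know how long to wait. The sketch below works through that arithmetic for the diagram's numbers (capacity 100, refill 10 tokens/second). `msUntilTokens` is a hypothetical helper, not part of the class above, and it takes the rate in tokens per second to keep the arithmetic in whole numbers.

```typescript
// Time in ms until `cost` tokens are available, given the current token count.
function msUntilTokens(tokens: number, cost: number, tokensPerSecond: number): number {
  if (tokens >= cost) return 0;
  const missing = cost - tokens;
  return Math.ceil((missing * 1000) / tokensPerSecond);
}

const bucketCapacity = 100;
const tokensPerSecond = 10;

// The class above expects tokens per millisecond, so convert before constructing it.
const refillRatePerMs = tokensPerSecond / 1000;
console.log(refillRatePerMs); // 0.01

console.log(msUntilTokens(50, 5, tokensPerSecond)); // 0: enough tokens, as in the diagram
console.log(msUntilTokens(2, 5, tokensPerSecond)); // 300: 3 missing tokens at 10 tokens/second
console.log(msUntilTokens(0, bucketCapacity, tokensPerSecond)); // 10000: an empty bucket refills in 10 s
```

A request whose cost exceeds the bucket capacity can never succeed, so callers may want to reject such costs up front rather than report a wait time.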

Implementation Examples

Express Middleware

// Express rate limiting middleware
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

// Basic rate limiter
export const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  limit: 100, // Limit each IP to 100 requests per window
  standardHeaders: 'draft-7',
  legacyHeaders: false,
  store: new RedisStore({
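    // Pass the command name separately: spreading a plain string[] into call() does not type-check
    sendCommand: (command: string, ...args: string[]) => redis.call(command, ...args),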
  }),
  message: {
    success: false,
    message: 'Too many requests, please try again later',
  },
});

// Stricter limiter for auth endpoints
export const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  limit: 5, // Only 5 attempts
  skipSuccessfulRequests: true,
  message: {
    success: false,
    message: 'Too many login attempts',
  },
});

// Apply to routes
app.use('/api/', apiLimiter);
app.use('/api/auth/login', authLimiter);

Custom Redis Implementation

// Redis-based sliding window
import Redis from 'ioredis';

class RedisRateLimiter {
  constructor(private redis: Redis) {}
  
  async slidingWindow(
    key: string,
    limit: number,
    windowMs: number
  ): Promise<{ allowed: boolean; remaining: number }> {
    const now = Date.now();
    const windowStart = now - windowMs;
    
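    // Unique member so requests in the same millisecond don't overwrite each other
    const member = `${now}-${Math.random()}`;
    
    // Use Redis transaction
    const multi = this.redis.multi();
    
    // Remove old entries
    multi.zremrangebyscore(key, 0, windowStart);
    
    // Count requests already in the window (excludes the current one)
    multi.zcard(key);
    
    // Add current request
    multi.zadd(key, now, member);
    
    // Set expiry so idle keys are cleaned up
    multi.pexpire(key, windowMs);
    
    const results = await multi.exec();
    const currentCount = (results?.[1]?.[1] as number) || 0;
    
    // currentCount excludes this request, so reject once it reaches the limit
    if (currentCount >= limit) {
      // Undo the insert so rejected requests don't count against the client
      await this.redis.zrem(key, member);
      return { allowed: false, remaining: 0 };
    }
    
    return { allowed: true, remaining: limit - currentCount - 1 };
  }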
}

Response Headers

// Rate limit headers
// The values below are illustrative; in practice they come from your limiter.
app.use((req, res, next) => {
  const limit = 100;
  const remaining = 99;
  const resetSeconds = 60; // seconds until the current window resets
  
  // Add standard headers (RateLimit-Reset is a delay in seconds, not a timestamp)
  res.set('RateLimit-Limit', String(limit));
  res.set('RateLimit-Remaining', String(remaining));
  res.set('RateLimit-Reset', String(resetSeconds));
  
  // When limit exceeded
  if (remaining <= 0) {
    res.set('Retry-After', String(resetSeconds));
  }
  
  next();
});
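
One way to keep header logic separate from the framework is a pure function that turns the limiter's decision into a header map. This is a sketch under assumptions: `RateLimitState` and `rateLimitHeaders` are hypothetical names, and `resetMs` is assumed to be the time remaining until the window resets.

```typescript
// Hypothetical shape of a limiter's decision for one request.
interface RateLimitState {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetMs: number; // milliseconds until the current window resets
}

// Build the response headers for a given limiter decision.
function rateLimitHeaders(state: RateLimitState): Record<string, string> {
  const resetSeconds = Math.max(0, Math.ceil(state.resetMs / 1000));
  const headers: Record<string, string> = {
    'RateLimit-Limit': String(state.limit),
    'RateLimit-Remaining': String(Math.max(0, state.remaining)),
    'RateLimit-Reset': String(resetSeconds),
  };
  // Only rejected requests need Retry-After
  if (!state.allowed) {
    headers['Retry-After'] = String(resetSeconds);
  }
  return headers;
}

const rejectedHeaders = rateLimitHeaders({ allowed: false, limit: 100, remaining: 0, resetMs: 12_300 });
console.log(rejectedHeaders['RateLimit-Reset']); // "13"
console.log(rejectedHeaders['Retry-After']); // "13"
```

In an Express handler the result could be applied with `res.set(headers)`, followed by a 429 status when `allowed` is false.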

Tiered Rate Limiting

# Different limits for different plans
plans:
  free:
    requests_per_minute: 10
    requests_per_day: 1000
    
  basic:
    requests_per_minute: 60
    requests_per_day: 10000
    
  pro:
    requests_per_minute: 300
    requests_per_day: 100000
    
  enterprise:
    requests_per_minute: "custom"
    requests_per_day: "unlimited"

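// Plan-based rate limiting
function getRateLimit(plan: string) {
  const limits: Record<string, { windowMs: number; limit: number }> = {
    free: { windowMs: 60000, limit: 10 },
    basic: { windowMs: 60000, limit: 60 },
    pro: { windowMs: 60000, limit: 300 },
    enterprise: { windowMs: 60000, limit: 10000 },
  };
  return limits[plan] || limits.free;
}

// Build one limiter per plan at startup. Creating a limiter inside the
// request handler would give every request a fresh counter, so the limit
// would never be reached.
const planLimiters = Object.fromEntries(
  ['free', 'basic', 'pro', 'enterprise'].map(plan => [plan, rateLimit(getRateLimit(plan))])
);

app.use('/api/', (req, res, next) => {
  const plan = req.user?.plan || 'free';
  
  // Apply appropriate limiter
  const limiter = planLimiters[plan] || planLimiters.free;
  limiter(req, res, next);
});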
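
The plan configuration lists both per-minute and per-day limits, while the middleware enforces only the per-minute one. A minimal sketch of checking both is shown here. `PLAN_LIMITS` and `checkPlan` are hypothetical names, the usage counters are assumed to come from your own store, and the enterprise per-minute value of 10000 is taken from the code above rather than from the "custom" entry in the configuration.

```typescript
type PlanLimits = { perMinute: number; perDay: number };

// Limits mirror the plan configuration; "unlimited" is modelled as Infinity.
const PLAN_LIMITS = new Map<string, PlanLimits>([
  ['free', { perMinute: 10, perDay: 1_000 }],
  ['basic', { perMinute: 60, perDay: 10_000 }],
  ['pro', { perMinute: 300, perDay: 100_000 }],
  ['enterprise', { perMinute: 10_000, perDay: Infinity }],
]);
const FREE_LIMITS: PlanLimits = { perMinute: 10, perDay: 1_000 };

type PlanDecision = { allowed: true } | { allowed: false; exceeded: 'minute' | 'day' };

// Decide whether one more request fits within both limits of the plan.
function checkPlan(plan: string, usedThisMinute: number, usedToday: number): PlanDecision {
  const limits = PLAN_LIMITS.get(plan) ?? FREE_LIMITS; // unknown plans fall back to free
  if (usedThisMinute >= limits.perMinute) return { allowed: false, exceeded: 'minute' };
  if (usedToday >= limits.perDay) return { allowed: false, exceeded: 'day' };
  return { allowed: true };
}

console.log(checkPlan('free', 3, 1_000)); // { allowed: false, exceeded: 'day' }
console.log(checkPlan('pro', 299, 50_000)); // { allowed: true }
```

Reporting which limit was exceeded lets the caller pick a sensible `Retry-After` value: seconds for the minute limit, and the time until the daily reset for the day limit.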

Key Takeaways

Fixed Window - Simple, but allows bursts at boundaries
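Sliding Window - Smoother and more accurate, at the cost of storing a timestamp per request
Token Bucket - Allows bursts up to the bucket capacity while holding a steady average rate
Use Redis - Shares counters across instances for distributed rate limiting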
Return proper headers - Help clients respect limits
