Skip to main content

System Design Interview: Building a URL Shortener

Published: March 9, 2026 Updated: May 24, 2026 Larry Qu 8 min read

Introduction

URL shorteners like bit.ly are classic system design interview questions. This guide walks through designing one, covering everything from basic functionality to scaling considerations.

Requirements

Functional Requirements

  • Shorten long URLs
  • Redirect to original URL
  • Custom aliases (optional)
  • Analytics (optional)

Non-Functional Requirements

  • High availability
  • Low latency
  • Scalability
  • Durability

High-Level Design

Components

User → Load Balancer → API Server
                        Cache (Redis)
                        Database

API Design

Endpoint Method Description
/shorten POST Create short URL
/{short_code} GET Redirect to long URL
/short/{short_code} GET Get info (optional)

Database Design

Schema

CREATE TABLE urls (
    id BIGINT PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count INT DEFAULT 0
);

CREATE INDEX idx_short_code ON urls(short_code);

Storage

  • Relational: PostgreSQL, MySQL
  • NoSQL: DynamoDB, Cassandra
  • Primary key: auto-increment or UUID

Short Code Generation

Methods

1. Hash-Based

import hashlib
import base64

def shorten_url(url):
    hash = hashlib.md5(url.encode()).digest()
    code = base64.urlsafe_encode(hash)[:6]
    return code
  • Collision possible
  • Not sequential
  • Security issue (predictable)

2. Random Generation

import random
import string

def generate_code(length=6):
    chars = string.ascii_letters + string.digits
    return ''.join(random.choice(chars) for _ in range(length))
  • Check uniqueness
  • Need retry on collision

3. Counter-Based

Current count: 1,000,000
Base62: "15wd" (fast)
  • Sequential (not guessable)
  • Single point of failure
  • Need distributed counter

Caching

Redis Implementation

def get_long_url(short_code):
    # Check cache first
    long_url = redis.get(f"url:{short_code}")
    if long_url:
        return long_url
    
    # Fetch from database
    long_url = db.query(short_code)
    
    # Cache for future
    redis.setex(f"url:{short_code}", 3600, long_url)
    return long_url

Cache Strategy

  • LRU eviction
  • TTL: 1-24 hours
  • Cache popular URLs

Scaling

Read-Heavy Optimization

  • Cache heavily
  • CDN for static assets
  • Read replicas

Write-Heavy Optimization

  • Batch inserts
  • Async processing
  • Queue-based

Sharding

By short_code:

  • Hash-based sharding
  • Consistent hashing

Deep Dive Questions

How would you handle 10x traffic?

  • Horizontal scaling
  • More cache
  • Read replicas
  • Rate limiting

How to prevent abuse?

  • Rate limiting per IP
  • Require auth for custom URLs
  • Detect spam patterns

Analytics implementation?

  • Async event logging
  • Batch processing
  • Time-series DB

Custom aliases?

  • User authentication
  • Rate limits per user
  • Conflict resolution

Implementation Example

from flask import Flask, redirect, request
import redis
import uuid

app = Flask(__name__)
db = redis.Redis()

@app.route('/shorten', methods=['POST'])
def shorten():
    long_url = request.json['url']
    short_code = str(uuid.uuid4())[:6]
    
    # Store in database
    db.set(f"url:{short_code}", long_url)
    db.set(f"rev:{long_url}", short_code)
    
    return {'short_url': f'https://short.io/{short_code}'}

@app.route('/<short_code>')
def redirect_url(short_code):
    long_url = db.get(f"url:{short_code}")
    if long_url:
        return redirect(long_url)
    return 'Not found', 404

Detailed Database Design

Schema with Indexes

CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    user_id BIGINT REFERENCES users(id),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    expires_at TIMESTAMP WITH TIME ZONE,
    click_count BIGINT DEFAULT 0,
    last_accessed_at TIMESTAMP WITH TIME ZONE,
    is_active BOOLEAN DEFAULT TRUE
);

CREATE UNIQUE INDEX idx_short_code ON urls(short_code);
CREATE INDEX idx_user_id ON urls(user_id);
CREATE INDEX idx_created_at ON urls(created_at DESC);
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;
CREATE INDEX idx_last_accessed ON urls(last_accessed_at) WHERE is_active = TRUE;

-- Analytics table for tracking
CREATE TABLE url_analytics (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) NOT NULL REFERENCES urls(short_code),
    accessed_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    ip_address INET,
    user_agent TEXT,
    referer TEXT,
    country_code VARCHAR(2),
    device_type VARCHAR(20)  -- desktop, mobile, tablet
);

CREATE INDEX idx_analytics_code ON url_analytics(short_code, accessed_at);
CREATE INDEX idx_analytics_date ON url_analytics(accessed_at);

Partitioning Strategy

For billions of rows, use time-based partitioning:

-- Partition by month for analytics
CREATE TABLE url_analytics_2026_01 PARTITION OF url_analytics
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE url_analytics_2026_02 PARTITION OF url_analytics
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

Storage Estimation

Metric Daily Volume Annual Volume
New URLs 1 million 365 million
Redirects 100 million 36.5 billion
Analytics events 100 million 36.5 billion
Storage (URLs) 100 MB 36.5 GB
Storage (analytics) 10 GB 3.65 TB

Short Code Generation in Depth

Base62 Encoding

Use Base62 (0-9, a-z, A-Z) for maximum compression with minimum length:

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(num: int) -> str:
    """Convert a number to Base62 string."""
    if num == 0:
        return BASE62[0]
    
    chars = []
    while num > 0:
        num, remainder = divmod(num, 62)
        chars.append(BASE62[remainder])
    
    return ''.join(reversed(chars))

def decode_base62(code: str) -> int:
    """Convert Base62 string back to number."""
    num = 0
    for char in code:
        num = num * 62 + BASE62.index(char)
    return num

# Generate short code from a distributed ID
short_code = encode_base62(snowflake_id)

Capacity of Different Code Lengths

Length Combinations Max URLs Notes
4 62^4 = 14.7M 14.7 million Too small for production
5 62^5 = 916M 916 million Good for most applications
6 62^6 = 56.8B 56.8 billion Excellent for large scale
7 62^7 = 3.5T 3.5 trillion Overkill for most use cases

Distributed ID Generation (Snowflake)

Avoid collisions and sequential guessing with a Snowflake-like ID:

import time
import threading

class SnowflakeGenerator:
    """Distributed unique ID generator."""
    
    def __init__(self, worker_id: int, datacenter_id: int = 0):
        self.worker_id = worker_id
        self.datacenter_id = datacenter_id
        self.sequence = 0
        self.last_timestamp = -1
        self.lock = threading.Lock()
        
        # Bit layout: 1 sign + 41 timestamp + 5 datacenter + 5 worker + 12 sequence
        self.epoch = 1700000000000  # Custom epoch
    
    def next_id(self) -> int:
        with self.lock:
            timestamp = int(time.time() * 1000)
            
            if timestamp < self.last_timestamp:
                raise Exception("Clock moved backwards")
            
            if timestamp == self.last_timestamp:
                self.sequence = (self.sequence + 1) & 4095
                if self.sequence == 0:
                    timestamp = self.wait_next_ms()
            else:
                self.sequence = 0
            
            self.last_timestamp = timestamp
            
            return ((timestamp - self.epoch) << 22) | \
                   (self.datacenter_id << 17) | \
                   (self.worker_id << 12) | \
                   self.sequence
    
    def wait_next_ms(self) -> int:
        timestamp = int(time.time() * 1000)
        while timestamp <= self.last_timestamp:
            timestamp = int(time.time() * 1000)
        return timestamp

Advanced Caching Strategy

Multi-Layer Cache

import redis
import aiocache
from typing import Optional

class MultiLayerCache:
    """L1: Local memory, L2: Redis, L3: Database."""

    def __init__(self, redis_client: redis.Redis):
        self.local_cache = {}
        self.redis = redis_client
        self.local_ttl = 60       # 1 minute for L1
        self.redis_ttl = 3600     # 1 hour for L2
        self.max_local_size = 10000

    async def get(self, short_code: str) -> Optional[str]:
        # L1: Local memory (fastest)
        if short_code in self.local_cache:
            entry = self.local_cache[short_code]
            if entry['expires_at'] > time.time():
                return entry['url']
            del self.local_cache[short_code]

        # L2: Redis (fast)
        long_url = await self.redis.get(f"url:{short_code}")
        if long_url:
            # Promote to L1
            if len(self.local_cache) < self.max_local_size:
                self.local_cache[short_code] = {
                    'url': long_url,
                    'expires_at': time.time() + self.local_ttl
                }
            return long_url

        return None

    async def set(self, short_code: str, long_url: str):
        await self.redis.setex(f"url:{short_code}", self.redis_ttl, long_url)
        # Pre-populate L1
        self.local_cache[short_code] = {
            'url': long_url,
            'expires_at': time.time() + self.local_ttl
        }

    async def invalidate(self, short_code: str):
        self.local_cache.pop(short_code, None)
        await self.redis.delete(f"url:{short_code}")

Cache-Aside Pattern with Write-Through

async def redirect_url(short_code: str) -> Optional[str]:
    # Try cache
    long_url = await cache.get(short_code)
    if long_url:
        # Async analytics — don't block the redirect
        asyncio.create_task(track_click(short_code))
        return long_url

    # Cache miss — query database
    row = await db.fetchrow(
        "SELECT long_url FROM urls WHERE short_code = $1 AND is_active = true",
        short_code
    )
    if not row:
        return None

    # Populate cache asynchronously
    asyncio.create_task(cache.set(short_code, row['long_url']))
    asyncio.create_task(track_click(short_code))
    return row['long_url']

Rate Limiting and Abuse Prevention

Token Bucket Implementation

import time
from collections import defaultdict

class TokenBucket:
    """Rate limiter using token bucket algorithm."""

    def __init__(self, rate: int, capacity: int):
        self.rate = rate           # Tokens per second
        self.capacity = capacity   # Max burst
        self.tokens = defaultdict(float)
        self.last_refill = defaultdict(float)

    def allow(self, key: str, tokens: int = 1) -> bool:
        now = time.time()
        # Refill tokens
        elapsed = now - self.last_refill[key]
        self.tokens[key] = min(
            self.capacity,
            self.tokens[key] + elapsed * self.rate
        )
        self.last_refill[key] = now

        if self.tokens[key] >= tokens:
            self.tokens[key] -= tokens
            return True
        return False

# Usage
limiter = TokenBucket(rate=10, capacity=20)  # 10 requests/sec, burst 20

@app.get("/{short_code}")
async def redirect(short_code: str, request: Request):
    client_ip = request.client.host

    if not limiter.allow(f"ip:{client_ip}"):
        raise HTTPException(status_code=429, detail="Too many requests")

    # Process redirect...

Tiered Rate Limits

Tier Rate Limit Features
Anonymous 10 req/min Basic redirect
Free API 100 req/min Analytics dashboard
Pro API 1000 req/min Custom domains, advanced analytics
Enterprise Custom SLA guarantees, dedicated infrastructure

CAP Theorem and Consistency Considerations

A URL shortener is typically an AP system (Availability + Partition tolerance):

  • Availability: Redirects must always work — a 503 on a redirect breaks user experience
  • Partition tolerance: Must handle network partitions gracefully
  • Consistency trade-off: Eventual consistency for analytics is acceptable. However, short code creation must be strongly consistent to prevent duplicates

Use a strongly consistent database (PostgreSQL, CockroachDB) for URL storage and eventually consistent stores (Redis, Cassandra) for analytics and caching.

Monitoring and Observability

Key Metrics to Track

# Prometheus metrics for URL shortener
METRICS = {
    "redirect_requests_total": "Counter of all redirect requests",
    "redirect_latency_seconds": "Histogram of redirect latency",
    "cache_hit_ratio": "Ratio of cache hits to total lookups",
    "shorten_requests_total": "Counter of URL shortening requests",
    "shorten_latency_seconds": "Histogram of shortening latency",
    "active_urls_count": "Gauge of active shortened URLs",
    "error_rate_by_type": "Counter of errors by type (not_found, rate_limited, etc.)"
}

# Alert thresholds
ALERTS = {
    "high_error_rate": "Error rate > 1% over 5 minutes",
    "high_latency_p99": "p99 latency > 100ms over 5 minutes",
    "low_cache_hit_ratio": "Cache hit ratio < 80% over 10 minutes",
    "high_rate_limiting": "Rate limited requests > 5% over 5 minutes"
}

Distributed Tracing

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

async def handle_redirect(short_code: str):
    with tracer.start_as_current_span("handle_redirect") as span:
        span.set_attribute("short_code", short_code)

        # Cache lookup
        with tracer.start_as_current_span("cache_lookup") as cache_span:
            long_url = await cache.get(short_code)
            cache_span.set_attribute("cache_hit", long_url is not None)

        if not long_url:
            # Database query
            with tracer.start_as_current_span("db_lookup") as db_span:
                long_url = await db.fetch_long_url(short_code)
                db_span.set_attribute("found", long_url is not None)

        if not long_url:
            span.set_attribute("error", "not_found")
            return None

        # Track analytics asynchronously
        with tracer.start_as_current_span("track_analytics") as analytics_span:
            asyncio.create_task(track_visit(short_code))

        span.set_attribute("long_url", long_url)
        return long_url

Interview Tips

  1. Clarify requirements - Ask questions
  2. High-level first - Then deep dive
  3. Think trade-offs - Nothing is perfect
  4. Scale gradually - Start simple
  5. Show you care - Discuss monitoring

Conclusion

URL shortener design tests many skills: API design, database, caching, and scaling. Focus on core functionality first, then address optimizations and edge cases.


Resources

Comments

👍 Was this article helpful?