Introduction
URL shorteners like bit.ly are classic system design interview questions. This guide walks through designing one, covering everything from basic functionality to scaling considerations.
Requirements
Functional Requirements
- Shorten long URLs
- Redirect to original URL
- Custom aliases (optional)
- Analytics (optional)
Non-Functional Requirements
- High availability
- Low latency
- Scalability
- Durability
High-Level Design
Components
User → Load Balancer → API Server
↓
Cache (Redis)
↓
Database
API Design
| Endpoint | Method | Description |
|---|---|---|
| /shorten | POST | Create short URL |
| /{short_code} | GET | Redirect to long URL |
| /short/{short_code} | GET | Get info (optional) |
Database Design
Schema
CREATE TABLE urls (
id BIGINT PRIMARY KEY,
short_code VARCHAR(10) UNIQUE NOT NULL,
long_url TEXT NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
expires_at TIMESTAMP,
click_count INT DEFAULT 0
);
CREATE INDEX idx_short_code ON urls(short_code);
Storage
- Relational: PostgreSQL, MySQL
- NoSQL: DynamoDB, Cassandra
- Primary key: auto-increment or UUID
Short Code Generation
Methods
1. Hash-Based
import hashlib
import base64
def shorten_url(url):
hash = hashlib.md5(url.encode()).digest()
code = base64.urlsafe_encode(hash)[:6]
return code
- Collision possible
- Not sequential
- Security issue (predictable)
2. Random Generation
import random
import string
def generate_code(length=6):
chars = string.ascii_letters + string.digits
return ''.join(random.choice(chars) for _ in range(length))
- Check uniqueness
- Need retry on collision
3. Counter-Based
Current count: 1,000,000
Base62: "15wd" (fast)
- Sequential (not guessable)
- Single point of failure
- Need distributed counter
Caching
Redis Implementation
def get_long_url(short_code):
# Check cache first
long_url = redis.get(f"url:{short_code}")
if long_url:
return long_url
# Fetch from database
long_url = db.query(short_code)
# Cache for future
redis.setex(f"url:{short_code}", 3600, long_url)
return long_url
Cache Strategy
- LRU eviction
- TTL: 1-24 hours
- Cache popular URLs
Scaling
Read-Heavy Optimization
- Cache heavily
- CDN for static assets
- Read replicas
Write-Heavy Optimization
- Batch inserts
- Async processing
- Queue-based
Sharding
By short_code:
- Hash-based sharding
- Consistent hashing
Deep Dive Questions
How would you handle 10x traffic?
- Horizontal scaling
- More cache
- Read replicas
- Rate limiting
How to prevent abuse?
- Rate limiting per IP
- Require auth for custom URLs
- Detect spam patterns
Analytics implementation?
- Async event logging
- Batch processing
- Time-series DB
Custom aliases?
- User authentication
- Rate limits per user
- Conflict resolution
Implementation Example
from flask import Flask, redirect, request
import redis
import uuid
app = Flask(__name__)
db = redis.Redis()
@app.route('/shorten', methods=['POST'])
def shorten():
long_url = request.json['url']
short_code = str(uuid.uuid4())[:6]
# Store in database
db.set(f"url:{short_code}", long_url)
db.set(f"rev:{long_url}", short_code)
return {'short_url': f'https://short.io/{short_code}'}
@app.route('/<short_code>')
def redirect_url(short_code):
long_url = db.get(f"url:{short_code}")
if long_url:
return redirect(long_url)
return 'Not found', 404
Detailed Database Design
Schema with Indexes
CREATE TABLE urls (
id BIGSERIAL PRIMARY KEY,
short_code VARCHAR(10) UNIQUE NOT NULL,
long_url TEXT NOT NULL,
user_id BIGINT REFERENCES users(id),
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
expires_at TIMESTAMP WITH TIME ZONE,
click_count BIGINT DEFAULT 0,
last_accessed_at TIMESTAMP WITH TIME ZONE,
is_active BOOLEAN DEFAULT TRUE
);
CREATE UNIQUE INDEX idx_short_code ON urls(short_code);
CREATE INDEX idx_user_id ON urls(user_id);
CREATE INDEX idx_created_at ON urls(created_at DESC);
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;
CREATE INDEX idx_last_accessed ON urls(last_accessed_at) WHERE is_active = TRUE;
-- Analytics table for tracking
CREATE TABLE url_analytics (
id BIGSERIAL PRIMARY KEY,
short_code VARCHAR(10) NOT NULL REFERENCES urls(short_code),
accessed_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
ip_address INET,
user_agent TEXT,
referer TEXT,
country_code VARCHAR(2),
device_type VARCHAR(20) -- desktop, mobile, tablet
);
CREATE INDEX idx_analytics_code ON url_analytics(short_code, accessed_at);
CREATE INDEX idx_analytics_date ON url_analytics(accessed_at);
Partitioning Strategy
For billions of rows, use time-based partitioning:
-- Partition by month for analytics
CREATE TABLE url_analytics_2026_01 PARTITION OF url_analytics
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE url_analytics_2026_02 PARTITION OF url_analytics
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
Storage Estimation
| Metric | Daily Volume | Annual Volume |
|---|---|---|
| New URLs | 1 million | 365 million |
| Redirects | 100 million | 36.5 billion |
| Analytics events | 100 million | 36.5 billion |
| Storage (URLs) | 100 MB | 36.5 GB |
| Storage (analytics) | 10 GB | 3.65 TB |
Short Code Generation in Depth
Base62 Encoding
Use Base62 (0-9, a-z, A-Z) for maximum compression with minimum length:
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
def encode_base62(num: int) -> str:
"""Convert a number to Base62 string."""
if num == 0:
return BASE62[0]
chars = []
while num > 0:
num, remainder = divmod(num, 62)
chars.append(BASE62[remainder])
return ''.join(reversed(chars))
def decode_base62(code: str) -> int:
"""Convert Base62 string back to number."""
num = 0
for char in code:
num = num * 62 + BASE62.index(char)
return num
# Generate short code from a distributed ID
short_code = encode_base62(snowflake_id)
Capacity of Different Code Lengths
| Length | Combinations | Max URLs | Notes |
|---|---|---|---|
| 4 | 62^4 = 14.7M | 14.7 million | Too small for production |
| 5 | 62^5 = 916M | 916 million | Good for most applications |
| 6 | 62^6 = 56.8B | 56.8 billion | Excellent for large scale |
| 7 | 62^7 = 3.5T | 3.5 trillion | Overkill for most use cases |
Distributed ID Generation (Snowflake)
Avoid collisions and sequential guessing with a Snowflake-like ID:
import time
import threading
class SnowflakeGenerator:
"""Distributed unique ID generator."""
def __init__(self, worker_id: int, datacenter_id: int = 0):
self.worker_id = worker_id
self.datacenter_id = datacenter_id
self.sequence = 0
self.last_timestamp = -1
self.lock = threading.Lock()
# Bit layout: 1 sign + 41 timestamp + 5 datacenter + 5 worker + 12 sequence
self.epoch = 1700000000000 # Custom epoch
def next_id(self) -> int:
with self.lock:
timestamp = int(time.time() * 1000)
if timestamp < self.last_timestamp:
raise Exception("Clock moved backwards")
if timestamp == self.last_timestamp:
self.sequence = (self.sequence + 1) & 4095
if self.sequence == 0:
timestamp = self.wait_next_ms()
else:
self.sequence = 0
self.last_timestamp = timestamp
return ((timestamp - self.epoch) << 22) | \
(self.datacenter_id << 17) | \
(self.worker_id << 12) | \
self.sequence
def wait_next_ms(self) -> int:
timestamp = int(time.time() * 1000)
while timestamp <= self.last_timestamp:
timestamp = int(time.time() * 1000)
return timestamp
Advanced Caching Strategy
Multi-Layer Cache
import redis
import aiocache
from typing import Optional
class MultiLayerCache:
"""L1: Local memory, L2: Redis, L3: Database."""
def __init__(self, redis_client: redis.Redis):
self.local_cache = {}
self.redis = redis_client
self.local_ttl = 60 # 1 minute for L1
self.redis_ttl = 3600 # 1 hour for L2
self.max_local_size = 10000
async def get(self, short_code: str) -> Optional[str]:
# L1: Local memory (fastest)
if short_code in self.local_cache:
entry = self.local_cache[short_code]
if entry['expires_at'] > time.time():
return entry['url']
del self.local_cache[short_code]
# L2: Redis (fast)
long_url = await self.redis.get(f"url:{short_code}")
if long_url:
# Promote to L1
if len(self.local_cache) < self.max_local_size:
self.local_cache[short_code] = {
'url': long_url,
'expires_at': time.time() + self.local_ttl
}
return long_url
return None
async def set(self, short_code: str, long_url: str):
await self.redis.setex(f"url:{short_code}", self.redis_ttl, long_url)
# Pre-populate L1
self.local_cache[short_code] = {
'url': long_url,
'expires_at': time.time() + self.local_ttl
}
async def invalidate(self, short_code: str):
self.local_cache.pop(short_code, None)
await self.redis.delete(f"url:{short_code}")
Cache-Aside Pattern with Write-Through
async def redirect_url(short_code: str) -> Optional[str]:
# Try cache
long_url = await cache.get(short_code)
if long_url:
# Async analytics — don't block the redirect
asyncio.create_task(track_click(short_code))
return long_url
# Cache miss — query database
row = await db.fetchrow(
"SELECT long_url FROM urls WHERE short_code = $1 AND is_active = true",
short_code
)
if not row:
return None
# Populate cache asynchronously
asyncio.create_task(cache.set(short_code, row['long_url']))
asyncio.create_task(track_click(short_code))
return row['long_url']
Rate Limiting and Abuse Prevention
Token Bucket Implementation
import time
from collections import defaultdict
class TokenBucket:
"""Rate limiter using token bucket algorithm."""
def __init__(self, rate: int, capacity: int):
self.rate = rate # Tokens per second
self.capacity = capacity # Max burst
self.tokens = defaultdict(float)
self.last_refill = defaultdict(float)
def allow(self, key: str, tokens: int = 1) -> bool:
now = time.time()
# Refill tokens
elapsed = now - self.last_refill[key]
self.tokens[key] = min(
self.capacity,
self.tokens[key] + elapsed * self.rate
)
self.last_refill[key] = now
if self.tokens[key] >= tokens:
self.tokens[key] -= tokens
return True
return False
# Usage
limiter = TokenBucket(rate=10, capacity=20) # 10 requests/sec, burst 20
@app.get("/{short_code}")
async def redirect(short_code: str, request: Request):
client_ip = request.client.host
if not limiter.allow(f"ip:{client_ip}"):
raise HTTPException(status_code=429, detail="Too many requests")
# Process redirect...
Tiered Rate Limits
| Tier | Rate Limit | Features |
|---|---|---|
| Anonymous | 10 req/min | Basic redirect |
| Free API | 100 req/min | Analytics dashboard |
| Pro API | 1000 req/min | Custom domains, advanced analytics |
| Enterprise | Custom | SLA guarantees, dedicated infrastructure |
CAP Theorem and Consistency Considerations
A URL shortener is typically an AP system (Availability + Partition tolerance):
- Availability: Redirects must always work — a 503 on a redirect breaks user experience
- Partition tolerance: Must handle network partitions gracefully
- Consistency trade-off: Eventual consistency for analytics is acceptable. However, short code creation must be strongly consistent to prevent duplicates
Use a strongly consistent database (PostgreSQL, CockroachDB) for URL storage and eventually consistent stores (Redis, Cassandra) for analytics and caching.
Monitoring and Observability
Key Metrics to Track
# Prometheus metrics for URL shortener
METRICS = {
"redirect_requests_total": "Counter of all redirect requests",
"redirect_latency_seconds": "Histogram of redirect latency",
"cache_hit_ratio": "Ratio of cache hits to total lookups",
"shorten_requests_total": "Counter of URL shortening requests",
"shorten_latency_seconds": "Histogram of shortening latency",
"active_urls_count": "Gauge of active shortened URLs",
"error_rate_by_type": "Counter of errors by type (not_found, rate_limited, etc.)"
}
# Alert thresholds
ALERTS = {
"high_error_rate": "Error rate > 1% over 5 minutes",
"high_latency_p99": "p99 latency > 100ms over 5 minutes",
"low_cache_hit_ratio": "Cache hit ratio < 80% over 10 minutes",
"high_rate_limiting": "Rate limited requests > 5% over 5 minutes"
}
Distributed Tracing
from opentelemetry import trace
tracer = trace.get_tracer(__name__)
async def handle_redirect(short_code: str):
with tracer.start_as_current_span("handle_redirect") as span:
span.set_attribute("short_code", short_code)
# Cache lookup
with tracer.start_as_current_span("cache_lookup") as cache_span:
long_url = await cache.get(short_code)
cache_span.set_attribute("cache_hit", long_url is not None)
if not long_url:
# Database query
with tracer.start_as_current_span("db_lookup") as db_span:
long_url = await db.fetch_long_url(short_code)
db_span.set_attribute("found", long_url is not None)
if not long_url:
span.set_attribute("error", "not_found")
return None
# Track analytics asynchronously
with tracer.start_as_current_span("track_analytics") as analytics_span:
asyncio.create_task(track_visit(short_code))
span.set_attribute("long_url", long_url)
return long_url
Interview Tips
- Clarify requirements - Ask questions
- High-level first - Then deep dive
- Think trade-offs - Nothing is perfect
- Scale gradually - Start simple
- Show you care - Discuss monitoring
Conclusion
URL shortener design tests many skills: API design, database, caching, and scaling. Focus on core functionality first, then address optimizations and edge cases.
Comments