Introduction
Designing a URL shortener like bit.ly is a classic system design interview question. This guide walks through one such design, from basic functionality to scaling considerations.
Requirements
Functional Requirements
- Shorten long URLs
- Redirect to original URL
- Custom aliases (optional)
- Analytics (optional)
Non-Functional Requirements
- High availability
- Low latency
- Scalability
- Durability
High-Level Design
Components
User → Load Balancer → API Server
                           ↓
                     Cache (Redis)
                           ↓
                        Database
API Design
| Endpoint | Method | Description |
|---|---|---|
| /shorten | POST | Create short URL |
| /{short_code} | GET | Redirect to long URL |
| /short/{short_code} | GET | Get info (optional) |
Database Design
Schema
CREATE TABLE urls (
    id BIGINT PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count INT DEFAULT 0
);
CREATE INDEX idx_short_code ON urls(short_code);
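The schema can be exercised with a short sketch. It assumes an in-memory SQLite database standing in for PostgreSQL or MySQL, with simplified column types; `insert_url` and `lookup_url` are hypothetical helper names, not part of any library.

```python
import sqlite3

# SQLite stands in for PostgreSQL/MySQL here (assumption); types are simplified.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE urls (
        id INTEGER PRIMARY KEY,
        short_code TEXT UNIQUE NOT NULL,
        long_url TEXT NOT NULL,
        click_count INTEGER DEFAULT 0
    )
    """
)


def insert_url(short_code, long_url):
    """Insert a mapping; return False if the short code is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO urls (short_code, long_url) VALUES (?, ?)",
                (short_code, long_url),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint on short_code rejected a duplicate.
        return False


def lookup_url(short_code):
    """Return the long URL for a short code, or None if it does not exist."""
    row = conn.execute(
        "SELECT long_url FROM urls WHERE short_code = ?", (short_code,)
    ).fetchone()
    return row[0] if row else None
```

Letting the UNIQUE constraint reject duplicates avoids a separate check-then-insert round trip, which would be racy under concurrent writes.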
Storage
- Relational: PostgreSQL, MySQL
- NoSQL: DynamoDB, Cassandra
- Primary key: auto-increment or UUID
Short Code Generation
Methods
1. Hash-Based
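import base64
import hashlib

def shorten_url(url, length=6):
    # MD5 serves only as a fast fingerprint here, not as a security measure
    digest = hashlib.md5(url.encode()).digest()
    # urlsafe_b64encode returns bytes, so decode before slicing
    return base64.urlsafe_b64encode(digest).decode()[:length]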
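- Collisions possible: truncating to 6 characters discards most of the hash
- Not sequential
- Predictable: the same URL always maps to the same code, so anyone can compute it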
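One way to handle the collisions mentioned above is to re-hash with a salt until a free code turns up. The following is a minimal sketch under assumptions: `store` is a plain dict standing in for the database, and `hash_code` and `shorten_with_retry` are hypothetical names.

```python
import base64
import hashlib


def hash_code(url, attempt=0, length=6):
    """Derive a short code from the URL; `attempt` salts the hash on retries."""
    digest = hashlib.md5(f"{attempt}:{url}".encode()).digest()
    return base64.urlsafe_b64encode(digest).decode()[:length]


def shorten_with_retry(url, store, max_attempts=5):
    """Store url under a hash-derived code, re-hashing with a salt on collision."""
    for attempt in range(max_attempts):
        code = hash_code(url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = url
            return code
        if existing == url:
            return code  # same URL was shortened before; reuse its code
        # A different URL owns this code: try again with the next salt.
    raise RuntimeError("could not find a free short code")
```

Because the hash is deterministic, shortening the same URL twice returns the same code, which also removes the need for a reverse lookup table.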
2. Random Generation
import secrets
import string

def generate_code(length=6):
    chars = string.ascii_letters + string.digits
    # secrets draws from the OS CSPRNG, so codes are harder to predict than with random
    return ''.join(secrets.choice(chars) for _ in range(length))
- Check uniqueness
- Need retry on collision
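The uniqueness check and retry could look roughly like this. It is a sketch, assuming `store` is a dict standing in for the database; `create_unique_code` and its `generate` parameter are hypothetical, the latter added so the retry path can be tested.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_code(length=6):
    """Return a random code drawn from 62 alphanumeric characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_unique_code(store, long_url, max_attempts=5, generate=generate_code):
    """Generate codes until one is unused, then record the mapping."""
    for _ in range(max_attempts):
        code = generate()
        if code not in store:
            store[code] = long_url
            return code
    raise RuntimeError("too many collisions; consider a longer code")
```

With a real database the check and the write should be a single atomic step (for example, an insert guarded by a unique constraint), since a separate check-then-write can race between concurrent requests.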
3. Counter-Based
Current count: 1,000,000
Base62 (alphabet 0-9, a-z, A-Z): "4c92" (fast to compute)
- Sequential (easy to guess and enumerate)
- Single point of failure
- Need distributed counter
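Base62 encoding of a counter value can be sketched as follows. The alphabet order (digits, then lowercase, then uppercase) is an assumption; other orderings are equally valid but produce different codes.

```python
# Assumed alphabet order: digits, lowercase, uppercase.
BASE62_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_encode(n):
    """Convert a non-negative integer to its Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return BASE62_ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(BASE62_ALPHABET[rem])
    # Remainders come out least significant first, so reverse them.
    return "".join(reversed(digits))


def base62_decode(code):
    """Convert a Base62 string back to the integer it encodes."""
    n = 0
    for ch in code:
        n = n * 62 + BASE62_ALPHABET.index(ch)
    return n
```

With this alphabet, `base62_encode(1_000_000)` returns `"4c92"`, and six characters cover 62^6 (about 56.8 billion) distinct counter values.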
Caching
Redis Implementation
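# redis and db are assumed to be already-initialized client objects
def get_long_url(short_code):
    key = f"url:{short_code}"
    # Check cache first
    long_url = redis.get(key)
    if long_url:
        return long_url
    # Cache miss: fetch from database
    long_url = db.query(short_code)
    if long_url is None:
        return None  # unknown code; nothing to cache
    # Cache for future requests (TTL: 1 hour)
    redis.setex(key, 3600, long_url)
    return long_url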
Cache Strategy
- LRU eviction
- TTL: 1-24 hours
- Cache popular URLs
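To make the LRU-plus-TTL behavior concrete, here is a small in-process sketch. Redis provides both natively (per-key expiry and an LRU `maxmemory-policy`), so this class is purely illustrative; `LRUTTLCache` and its injectable `clock` parameter are hypothetical.

```python
import time
from collections import OrderedDict


class LRUTTLCache:
    """In-memory cache with a capacity limit (LRU eviction) and per-entry TTL."""

    def __init__(self, capacity=1000, ttl_seconds=3600, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = OrderedDict()  # key -> (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used
```

Popular URLs stay cached because every hit moves them to the recent end, while rarely used codes drift toward eviction or expire.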
Scaling
Read-Heavy Optimization
- Cache heavily
- CDN for static assets
- Read replicas
Write-Heavy Optimization
- Batch inserts
- Async processing
- Queue-based
Sharding
By short_code:
- Hash-based sharding
- Consistent hashing
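A consistent hashing ring for picking a shard by short_code might be sketched like this. The shard names, the use of SHA-256, and the count of 100 virtual nodes per shard are all illustrative assumptions.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to nodes so that adding or removing a node moves few keys."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # First 8 bytes of SHA-256 as an integer position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add_node(self, node):
        # Virtual nodes spread each shard around the ring for better balance.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First point clockwise from the key's position, wrapping around.
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

Compared with plain `hash(short_code) % num_shards`, removing a shard only reassigns the keys that lived on it; keys on the other shards stay where they are.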
Deep Dive Questions
How would you handle 10x traffic?
- Horizontal scaling
- More cache
- Read replicas
- Rate limiting
How to prevent abuse?
- Rate limiting per IP
- Require auth for custom URLs
- Detect spam patterns
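Per-IP rate limiting can be sketched with a sliding window. This version keeps state in process memory, which is an assumption for illustration; a multi-server deployment would need shared state, for example in Redis. The class name and limits are hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allows at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests=10, window_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow(client_ip)` before creating a short URL and respond with HTTP 429 when it returns False.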
Analytics implementation?
- Async event logging
- Batch processing
- Time-series DB
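The async logging and batching idea can be sketched as a queue that the redirect path writes to and a separate step that drains it. Here `ClickLogger` is a hypothetical class and a `Counter` stands in for the time-series database; in a real service `flush` would run in a background worker.

```python
import queue
from collections import Counter


class ClickLogger:
    """Buffers click events in a queue and applies them in batches."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self._events = queue.Queue()
        self.counts = Counter()  # stand-in for a time-series DB (assumption)

    def record(self, short_code):
        """Called on the redirect path; only enqueues, never touches storage."""
        self._events.put(short_code)

    def flush(self):
        """Drain up to batch_size events and apply them as one aggregated write."""
        batch = []
        while len(batch) < self.batch_size:
            try:
                batch.append(self._events.get_nowait())
            except queue.Empty:
                break
        self.counts.update(batch)
        return len(batch)
```

Keeping the write off the redirect path means a slow analytics store adds no latency to redirects, at the cost of click counts lagging slightly behind.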
Custom aliases?
- User authentication
- Rate limits per user
- Conflict resolution
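Alias validation and conflict handling might look like the sketch below. The allowed character set, the length limits, and the reserved words (taken from the `/shorten` and `/short/...` routes in the API table) are assumptions, and `store` is a dict standing in for the database.

```python
import re

# Assumed policy: 3-30 characters, letters, digits, underscore, hyphen.
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9_-]{3,30}")
# Aliases that would shadow API routes.
RESERVED = {"shorten", "short"}


def claim_alias(store, alias, long_url):
    """Try to register a custom alias; return a status string."""
    if not ALIAS_PATTERN.fullmatch(alias):
        return "invalid"
    if alias.lower() in RESERVED:
        return "reserved"
    existing = store.get(alias)
    if existing is not None and existing != long_url:
        return "conflict"  # alias already points somewhere else
    store[alias] = long_url
    return "ok"
```

On a conflict the API could return HTTP 409 and, optionally, suggest a free variant of the requested alias.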
Implementation Example
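from flask import Flask, redirect, request
import redis
import uuid

app = Flask(__name__)
# decode_responses=True makes Redis return str instead of bytes
db = redis.Redis(decode_responses=True)

@app.route('/shorten', methods=['POST'])
def shorten():
    long_url = request.json['url']
    # Reuse the existing code if this URL was shortened before
    short_code = db.get(f"rev:{long_url}")
    if short_code is None:
        short_code = uuid.uuid4().hex[:6]
        # nx=True refuses to overwrite an existing key, so retry on collision
        while not db.set(f"url:{short_code}", long_url, nx=True):
            short_code = uuid.uuid4().hex[:6]
        db.set(f"rev:{long_url}", short_code)
    return {'short_url': f'https://short.io/{short_code}'}

@app.route('/<short_code>')
def redirect_url(short_code):
    long_url = db.get(f"url:{short_code}")
    if long_url:
        return redirect(long_url)
    return 'Not found', 404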
Interview Tips
- Clarify requirements - Ask questions
- High-level first - Then deep dive
- Think trade-offs - Nothing is perfect
- Scale gradually - Start simple
- Show you care - Discuss monitoring
Conclusion
URL shortener design tests many skills: API design, database design, caching, and scaling. Focus on core functionality first, then address optimizations and edge cases.