System design questions are a crucial part of technical interviews for senior engineering roles. This guide covers fundamental patterns, common approaches, and building blocks for designing large-scale systems.
What is System Design?
System design involves making architectural decisions about how software systems should be built. It encompasses:
- Functional requirements: What the system should do
- Non-functional requirements: Performance, scalability, reliability
- Technical constraints: Budget, timeline, existing infrastructure
Common System Design Concepts
CAP Theorem
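The CAP theorem states that a distributed system can provide at most two of the following three guarantees at once. Since network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts: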
┌─────────────────────────────────────┐
│                                     │
│             Consistency             │
│              ⚡     ⚡              │
│            ↙           ↘            │
│          ↙               ↘          │
│  Availability ⚡     ⚡             │
│          ↘               ↙          │
│            ↘           ↙            │
│              Partition              │
│              Tolerance              │
│                                     │
└─────────────────────────────────────┘
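- Consistency (C): Every read sees the most recent write, so all nodes return the same data
- Availability (A): Every request to a working node gets a non-error response, even if the data may be stale
- Partition Tolerance (P): The system keeps operating when the network drops or delays messages between nodes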
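To make the choice concrete, here is a toy sketch of two replicas that behave differently once the link between them is cut. The `Node` class, the `"CP"`/`"AP"` mode labels, and the `Unavailable` exception are hypothetical names for illustration; real systems use far more involved replication protocols.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request to protect consistency."""


class Node:
    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP" (illustrative labels)
        self.value = "v1"
        self.peer = None
        self.partitioned = False    # True while the link to the peer is down

    def write(self, value):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("peer unreachable, write rejected")
        self.value = value
        if not self.partitioned:
            self.peer.value = value  # synchronous replication while connected

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("peer unreachable, read rejected")
        return self.value


def make_pair(mode):
    """Create two connected replicas that use the same mode."""
    a, b = Node(mode), Node(mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b):
    """Simulate a network partition between the two replicas."""
    a.partitioned = b.partitioned = True
```

In AP mode both replicas keep answering during the partition but may disagree; in CP mode they stay in agreement by refusing requests.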
ACID vs BASE
| ACID | BASE |
|---|---|
| Atomicity | Basically Available |
| Consistency | Soft state |
| Isolation | Eventual consistency |
| Durability | |
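As a small illustration of the "A" in ACID, the sketch below uses Python's built-in `sqlite3` module to show a transfer that either applies both updates or neither. The `accounts` table, the `transfer` function, and the balances are made up for this example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {account_id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()
```

A failed transfer leaves no trace of the first `UPDATE`, which is the behaviour a BASE-style store typically does not promise across multiple keys.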
Scalability Basics
Vertical vs Horizontal Scaling
Vertical Scaling (Scale Up)      Horizontal Scaling (Scale Out)
┌─────────────────────┐          ┌───────┐ ┌───────┐ ┌───────┐
│                     │          │       │ │       │ │       │
│     ┌─────────┐     │          │  App  │ │  App  │ │  App  │
│     │ Server  │     │          │       │ │       │ │       │
│     └─────────┘     │          └───────┘ └───────┘ └───────┘
│          ↑          │              ↑         ↑         ↑
│    More CPU/RAM     │                  Load Balancer
│                     │
└─────────────────────┘
Database Scaling
Read Replicas                   Sharding
      Reads/Writes
            │
            ▼
       ┌──────────┐             ┌─────────────┐  ┌─────────────┐
       │   Main   │             │   Shard 1   │  │   Shard 2   │
       │ Database │             │ (Users A-M) │  │ (Users N-Z) │
       └────┬─────┘             └─────────────┘  └─────────────┘
            │ Replication
     ┌──────┴──────┐
     │             │
     ▼             ▼
┌──────────┐  ┌──────────┐
│   Read   │  │   Read   │
│ Replica  │  │ Replica  │
└──────────┘  └──────────┘
Common Building Blocks
1. Load Balancer
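# Load balancing strategies
import itertools
import zlib

class LoadBalancer:
    def __init__(self, servers):
        # Servers are assumed to expose `connections` and, optionally, `weight`.
        self.servers = servers
        self.index = 0
        # Repeat each server `weight` times so heavier servers are picked more often.
        self._weighted = itertools.cycle(
            [s for s in servers for _ in range(getattr(s, "weight", 1))])

    # Round Robin
    def round_robin(self):
        server = self.servers[self.index % len(self.servers)]
        self.index += 1  # advance so the next call returns the next server
        return server

    # Least Connections
    def least_connections(self):
        return min(self.servers, key=lambda s: s.connections)

    # Weighted Round Robin
    def weighted_rr(self):
        # Servers with higher weight get more requests
        return next(self._weighted)

    # IP Hash (sticky sessions)
    def ip_hash(self, client_ip):
        # crc32 is stable across processes; built-in hash() of a str is salted per run
        return self.servers[zlib.crc32(client_ip.encode()) % len(self.servers)]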
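As a sketch of how weighted round robin can spread picks evenly instead of sending bursts to the heaviest server, here is the "smooth" variant, similar to the scheme nginx uses. The `SmoothWeightedRR` class and the server names are hypothetical.

```python
class SmoothWeightedRR:
    """Smooth weighted round robin over a {server_name: weight} mapping."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.total = sum(self.weights.values())
        self.current = {name: 0 for name in self.weights}

    def next_server(self):
        # Every server gains its own weight on each round...
        for name, weight in self.weights.items():
            self.current[name] += weight
        # ...the server with the highest running score is chosen...
        chosen = max(self.current, key=self.current.get)
        # ...and pays back the total weight so the others can catch up.
        self.current[chosen] -= self.total
        return chosen


def pick_sequence(weights, n):
    """Return the first n picks for the given weights."""
    balancer = SmoothWeightedRR(weights)
    return [balancer.next_server() for _ in range(n)]
```

With weights `{"a": 5, "b": 1, "c": 1}`, seven picks select `a` five times, with `b` and `c` interleaved rather than pushed to the end of the cycle.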
2. Caching
┌─────────────────────────────────────────────────────┐
│                   Cache Hierarchy                   │
├─────────────────────────────────────────────────────┤
│                                                     │
│  Browser ──▶ CDN ──▶ Load Balancer ──▶ App Server   │
│     │         │                            │        │
│     ▼         ▼                            ▼        │
│   Cache   Edge Cache                Redis/Memcached │
│                                            │        │
│                                            ▼        │
│                                        Database     │
└─────────────────────────────────────────────────────┘
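The application-level step in this hierarchy (check Redis/Memcached, fall back to the database) is commonly implemented as cache-aside. Below is a minimal in-memory sketch; the `CacheAside` class, its TTL default, and the injectable `clock` are assumptions for illustration, with a plain dict standing in for Redis.

```python
import time


class CacheAside:
    """Read-through helper: serve from cache, load from the database on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # callable: key -> value
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            self.hits += 1
            return entry[0]
        # Miss or expired entry: go to the database and repopulate the cache.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._entries.pop(key, None)
```

The TTL bounds how stale a cached value can get; calling `invalidate` on writes tightens that further at the cost of extra misses.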
3. Database Patterns
-- Read Replica
SELECT * FROM users WHERE id = 1; -- Read from replica
-- Sharding by User ID
-- Users 1-1,000,000 -> Shard 1
-- Users 1,000,001-2,000,000 -> Shard 2
-- Vertical Partitioning
-- Users table: id, name, email
-- UserProfiles table: user_id, avatar, bio, settings
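The routing implied by these patterns can be sketched in a few lines. The shard size matches the ranges above; the `ReadWriteRouter` class and the connection names are hypothetical, and the `SELECT` prefix check is a simplification that real routers would replace with proper statement classification.

```python
import itertools

SHARD_SIZE = 1_000_000  # users per shard, matching the ranges above


def shard_for_user(user_id):
    """Range-based sharding: IDs 1..1,000,000 -> shard 1, and so on."""
    if user_id < 1:
        raise ValueError("user_id must be positive")
    return (user_id - 1) // SHARD_SIZE + 1


class ReadWriteRouter:
    """Send SELECTs to replicas (round robin) and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def target_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary
```

Range-based shards are easy to reason about but can become uneven if newer users are more active, which is one reason hash-based sharding is often discussed as an alternative.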
Design Approaches
Step 1: Requirements Clarification
Ask questions like:
- What are the key features?
- How many users?
- What are the read/write ratios?
- What are the latency requirements?
- Any geographic considerations?
Step 2: High-Level Design
┌─────────────────────────────────────────────────────────┐
│                     User Interface                      │
└─────────────────────┬───────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────┐
│                      Load Balancer                      │
└───────┬───────────────┬───────────────────┬─────────────┘
        │               │                   │
        ▼               ▼                   ▼
 ┌─────────────┐ ┌─────────────┐     ┌─────────────┐
 │  Service A  │ │  Service B  │     │  Service C  │
 └──────┬──────┘ └──────┬──────┘     └──────┬──────┘
        │               │                   │
        ▼               ▼                   ▼
 ┌─────────────┐ ┌─────────────┐     ┌─────────────┐
 │    Cache    │ │    Cache    │     │    Cache    │
 └──────┬──────┘ └──────┬──────┘     └──────┬──────┘
        │               │                   │
        └───────────────┴───────────────────┘
                        │
                        ▼
               ┌─────────────────┐
               │    Database     │
               │   (Primary +    │
               │    Replicas)    │
               └─────────────────┘
Step 3: Component Design
Design each component:
1. API Server
- REST/GraphQL
- Authentication (JWT, OAuth)
2. Data Storage
- SQL vs NoSQL
- Caching strategy
3. Message Queue
- Async processing
- Event-driven
4. Search
- Full-text search (Elasticsearch)
- Suggestions
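To illustrate the message-queue component, here is a minimal in-process sketch of async processing using the standard library's `queue.Queue` and a worker thread. The function names and the upper-casing "work" are placeholders; a production system would use a broker such as Kafka or RabbitMQ rather than an in-memory queue.

```python
import queue
import threading

_STOP = None  # sentinel telling the worker to shut down


def run_worker(jobs, results):
    """Consume jobs until the sentinel arrives."""
    while True:
        job = jobs.get()
        if job is _STOP:
            jobs.task_done()
            break
        results.append(job.upper())  # placeholder for real work
        jobs.task_done()


def process_async(messages):
    """Enqueue messages without blocking on the work, then wait for completion."""
    jobs = queue.Queue()
    results = []
    worker = threading.Thread(target=run_worker, args=(jobs, results))
    worker.start()
    for message in messages:
        jobs.put(message)   # the producer returns immediately
    jobs.put(_STOP)
    jobs.join()             # block until every queued item is marked done
    worker.join()
    return results
```

The producer only pays the cost of `put`, which is the property that lets an API server respond quickly while slower work happens in the background.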
Step 4: Trade-offs
Common Trade-offs:
1. Consistency vs Performance
- Strong consistency = slower
- Eventual consistency = faster
2. Latency vs Cost
- More caching = lower latency, higher cost
- Less caching = higher latency, lower cost
3. Availability vs Consistency
- Always available = eventual consistency
- Consistent = may be unavailable during partitions
Common System Designs
1. URL Shortener
Requirements:
- Shorten long URLs
- Redirect to original URL
- Track click analytics
Design:
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│    Client    │────▶│  API Server  │────▶│   Database   │
│              │◀────│   (Hash +    │◀────│    (URL +    │
│              │     │  Redirect)   │     │  Short ID)   │
└──────────────┘     └──────────────┘     └──────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │    Cache     │
                     │   (Redis)    │
                     └──────────────┘
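One common way to produce the short ID is to base62-encode a numeric key. The sketch below assumes a simple in-memory counter and dict in place of the database; it is an alternative to hashing the URL, not necessarily the approach the diagram intends, and `UrlShortener` is a hypothetical name.

```python
import string

# 62 symbols: 0-9, a-z, A-Z
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n):
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(text):
    """Inverse of encode_base62."""
    n = 0
    for ch in text:
        n = n * 62 + ALPHABET.index(ch)
    return n


class UrlShortener:
    def __init__(self):
        self._urls = {}      # numeric ID -> long URL (stands in for the database)
        self._next_id = 1

    def shorten(self, long_url):
        short_id = encode_base62(self._next_id)
        self._urls[self._next_id] = long_url
        self._next_id += 1
        return short_id

    def resolve(self, short_id):
        """Return the original URL, or None if the ID is unknown."""
        return self._urls.get(decode_base62(short_id))
```

Counter-based IDs never collide but are guessable; hashing the URL avoids that at the cost of collision handling.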
2. Twitter/News Feed
Requirements:
- User follows others
- See chronological feed
- High read throughput
Design:
┌──────────────┐     ┌──────────────┐
│    Client    │────▶│  API Server  │
└──────────────┘     └──────┬───────┘
                            │
        ┌───────────────────┼───────────────────┐
        ▼                   ▼                   ▼
 ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
 │   Fan-out   │     │   Search    │     │  Analytics  │
 │   Service   │     │   Service   │     │   Service   │
 └──────┬──────┘     └─────────────┘     └─────────────┘
        │
        ▼
 ┌─────────────┐     ┌─────────────┐
 │   Message   │────▶│    Cache    │
 │    Queue    │     │   (Redis)   │
 └─────────────┘     └─────────────┘
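The fan-out service in this design can be sketched as "fan-out on write": each new post is pushed into a precomputed feed for every follower, so reads are a single lookup. The `FeedService` class and the feed size are assumptions; in-memory deques stand in for the Redis cache, and the message queue is omitted for brevity.

```python
from collections import defaultdict, deque


class FeedService:
    """Fan-out on write: precompute each follower's feed when a post is made."""

    def __init__(self, feed_size=100):
        self.followers = defaultdict(set)   # author -> set of followers
        # Each feed keeps only the newest `feed_size` posts.
        self.feeds = defaultdict(lambda: deque(maxlen=feed_size))

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, text):
        # Write cost grows with follower count; reads stay cheap.
        for follower in self.followers[author]:
            self.feeds[follower].appendleft((author, text))

    def feed(self, user):
        """Newest posts first."""
        return list(self.feeds[user])
```

This favours the high read throughput named in the requirements; accounts with very large follower counts are the usual reason to mix in fan-out on read.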
3. Real-time Chat
Requirements:
- 1:1 messaging
- Group chats
- Online status
- Message history
Design:
┌──────────────┐     ┌──────────────┐
│  WebSocket   │────▶│   Chat API   │
│  Connection  │     │    Server    │
└──────────────┘     └──────┬───────┘
                            │
        ┌───────────────────┼───────────────────┐
        ▼                   ▼                   ▼
 ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
 │  Presence   │     │   Message   │     │ Notification│
 │   Service   │     │   Service   │     │   Service   │
 └─────────────┘     └──────┬──────┘     └─────────────┘
                            │
                            ▼
                     ┌─────────────┐
                     │  Database   │
                     │ (Cassandra) │
                     └─────────────┘
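One simple way a presence service can track online status is with heartbeats and a timeout. The sketch below is an assumption about how such a service might work, not a description of any particular product; `PresenceService`, the 30-second default, and the injectable `clock` are illustrative.

```python
import time


class PresenceService:
    """A user counts as online if a heartbeat arrived within the timeout."""

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._last_seen = {}         # user_id -> time of last heartbeat

    def heartbeat(self, user_id):
        """Called periodically by each connected client."""
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id):
        last = self._last_seen.get(user_id)
        return last is not None and self._clock() - last <= self._timeout
```

A dropped WebSocket connection then needs no explicit "offline" event: the user simply ages out once heartbeats stop.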
Estimation Techniques
Traffic Estimation
Assumptions:
- 1 million DAU
- 10 requests per user per day
- Peak = 3x average
Calculation:
- QPS = 1M × 10 / 86,400 ≈ 116 QPS (average)
- Peak QPS = 3 × 115.7 ≈ 347 QPS
Storage Estimation
Assumptions:
- 100 million users
- 100 posts per user
- 1KB per post
Calculation:
- Storage = 100M × 100 × 1KB = 10 TB
- This covers text only; adding images (for example, 3 years of uploads) could push the total past 100 TB
Bandwidth
Assumptions:
- 100 QPS
- 10KB per response
Calculation:
- Bandwidth = 100 × 10 KB = 1,000 KB/s = 1 MB/s
- Peak (assuming 3x average) = 3 MB/s
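The three estimates above can be reproduced with a few helper functions. This sketch assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes); the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400
KB = 1_000
MB = 1_000_000
TB = 1_000_000_000_000


def average_qps(daily_active_users, requests_per_user_per_day):
    """Average queries per second over a full day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps, peak_factor=3):
    """Peak load as a multiple of the average."""
    return avg_qps * peak_factor


def storage_bytes(users, posts_per_user, bytes_per_post):
    """Total storage for all posts, ignoring replication and indexes."""
    return users * posts_per_user * bytes_per_post


def bandwidth_bytes_per_second(qps, bytes_per_response):
    """Outbound bandwidth at a given request rate."""
    return qps * bytes_per_response


avg = average_qps(1_000_000, 10)
print(f"average QPS: {avg:.0f}, peak QPS: {peak_qps(avg):.0f}")
print(f"storage: {storage_bytes(100_000_000, 100, 1 * KB) / TB:.0f} TB")
print(f"bandwidth: {bandwidth_bytes_per_second(100, 10 * KB) / MB:.0f} MB/s")
```

Keeping the arithmetic in named functions makes it easy to rerun the numbers when the interviewer changes an assumption.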
Interview Tips
Do’s
- Clarify requirements - Ask questions before designing
- Think out loud - Show your reasoning
- Make trade-offs - Explain pros/cons
- Start simple - MVP first, then scale
- Know the numbers - Estimate capacity
Don’ts
- Don’t jump straight to code
- Don’t ignore non-functional requirements
- Don’t over-engineer the solution
- Don’t forget about failure scenarios
- Don’t ignore monitoring/operations
Conclusion
System design interviews evaluate your ability to make reasoned architectural tradeoffs under uncertainty. Structure your approach: clarify requirements, estimate scale, design the data model, outline the core flow, then discuss tradeoffs and bottlenecks. Practice communicating your reasoning clearly—the thought process matters more than the final design.