System Design Interview Guide: Complete Patterns and Approaches

Created: February 26, 2026 · Larry Qu · 6 min read

System design questions are a crucial part of technical interviews for senior engineering roles. This guide covers fundamental patterns, common approaches, and building blocks for designing large-scale systems.

What is System Design?

System design involves making architectural decisions about how software systems should be built. It encompasses:

  • Functional requirements: What the system should do
  • Non-functional requirements: Performance, scalability, reliability
  • Technical constraints: Budget, timeline, existing infrastructure

Common System Design Concepts

CAP Theorem

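The CAP theorem states that a distributed system can provide at most two of three guarantees at the same time. Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts: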

    ┌─────────────────────────────────────┐
    │                                     │
    │         Consistency                 │
    │            ⚡ ⚡                    │
    │          ↙       ↘                 │
    │       ↙             ↘              │
    │   Availability ⚡ ⚡                  │
    │       ↘             ↙              │
    │         ↘         ↙                │
    │           Partition                 │
    │           Tolerance                 │
    │                                     │
    └─────────────────────────────────────┘

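- Consistency (C): Every read sees the most recent write, so all nodes return the same data
- Availability (A): Every request to a working node gets a response, though not necessarily the latest data
- Partition Tolerance (P): The system keeps working when network failures drop or delay messages between nodes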
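
To make the trade-off concrete, here is a minimal sketch of how a single replica might behave once it is cut off from the primary. The `Replica` class, its `mode` flag, and the `partitioned` attribute are hypothetical names made up for illustration, not the API of any real database.

```python
class Unavailable(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


class Replica:
    """Toy replica showing CP vs AP behavior (hypothetical, not a real database)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.data = {}            # local copy of the data
        self.partitioned = False  # True when the primary is unreachable

    def replicate(self, key, value):
        # Updates from the primary only arrive while the network is healthy.
        if not self.partitioned:
            self.data[key] = value

    def read(self, key):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk returning stale data.
            raise Unavailable("cannot confirm latest value during partition")
        # Availability first (or healthy network): answer from the local copy.
        return self.data.get(key)


def demo():
    cp, ap = Replica("CP"), Replica("AP")
    for replica in (cp, ap):
        replica.replicate("balance", 100)
        replica.partitioned = True
        replica.replicate("balance", 50)  # lost: the replica is cut off
    stale = ap.read("balance")            # AP still answers, with the old value
    try:
        cp.read("balance")
        cp_answered = True
    except Unavailable:
        cp_answered = False
    return stale, cp_answered
```

The AP replica keeps serving the old balance, while the CP replica gives up availability to avoid a stale answer.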

ACID vs BASE

ACID            BASE
-----------     --------------------
Atomicity       Basically Available
Consistency     Soft state
Isolation       Eventual consistency
Durability
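
As a rough illustration of the "Atomicity" entry, the sketch below uses Python's built-in `sqlite3` module to run a transfer that either fully succeeds or is rolled back. The `accounts` table and the `transfer` helper are assumptions made up for this example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between accounts atomically; roll back on any failure."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

A failed transfer leaves both balances untouched because the debit is undone together with everything else in the transaction.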

Scalability Basics

Vertical vs Horizontal Scaling

Vertical Scaling (Scale Up)          Horizontal Scaling (Scale Out)
┌─────────────────────┐             ┌───────┐ ┌───────┐ ┌───────┐
│                     │             │       │ │       │ │       │
│    ┌─────────┐      │             │  App  │ │  App  │ │  App  │
│    │  Server │      │             │       │ │       │ │       │
│    └─────────┘      │             └───────┘ └───────┘ └───────┘
│        ↑            │                 ↑         ↑         ↑
│    More CPU/RAM     │                     Load Balancer
│                     │
└─────────────────────┘

Database Scaling

Read Replicas                    Sharding
┌──────────┐                    ┌────────────┐ ┌────────────┐
│   Main   │ ◀─ Reads/Writes    │  Shard 1   │ │  Shard 2   │
│ Database │                    │ (Users A-M)│ │ (Users N-Z)│
└────┬─────┘                    └────────────┘ └────────────┘
     │ Replication
┌────┴───────┐
│            │
▼            ▼
┌──────────┐ ┌──────────┐
│  Read    │ │  Read    │
│  Replica │ │  Replica │
└──────────┘ └──────────┘

Common Building Blocks

1. Load Balancer

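# Load balancing strategies
import hashlib
import itertools

class LoadBalancer:
    def __init__(self, servers):
        # Assumption: each server object exposes .connections and (optionally) .weight
        self.servers = servers
        self.index = 0  # position of the next round-robin pick
        # Repeat each server `weight` times, then cycle through the expanded list
        self._weighted = itertools.cycle(
            [s for s in servers for _ in range(getattr(s, "weight", 1))]
        )

    # Round Robin
    def round_robin(self):
        server = self.servers[self.index % len(self.servers)]
        self.index += 1
        return server

    # Least Connections
    def least_connections(self):
        return min(self.servers, key=lambda s: s.connections)

    # Weighted Round Robin
    def weighted_rr(self):
        # Servers with higher weight get more requests
        return next(self._weighted)

    # IP Hash (sticky sessions)
    def ip_hash(self, client_ip):
        # Use a stable digest: Python's built-in hash() of a str changes between processes
        digest = hashlib.sha256(client_ip.encode()).digest()
        return self.servers[int.from_bytes(digest[:8], "big") % len(self.servers)]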

2. Caching

┌─────────────────────────────────────────────────────┐
│                   Cache Hierarchy                     │
├─────────────────────────────────────────────────────┤
│                                                      │
│  Browser ──▶ CDN ──▶ Load Balancer ──▶ App Server │
│     │         │                         │            │
│     ▼         ▼                         ▼            │
│  Cache    Edge Cache              Redis/Memcached  │
│                                               │    │
│                                               ▼    │
│                                         Database   │
└─────────────────────────────────────────────────────┘
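
The application-level cache in the diagram is often used in a cache-aside style: check the cache, fall back to the database on a miss, then populate the cache. Below is a minimal sketch of that flow. `TTLCache` is a tiny in-process stand-in for Redis/Memcached, and `get_user` and `load_from_db` are hypothetical names chosen for this example.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for Redis/Memcached (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


def get_user(user_id, cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = load_from_db(user_id)  # slow path, only taken on a miss
        if user is not None:
            cache.set(key, user)
    return user
```

The TTL bounds how stale a cached entry can get, which is the usual price paid for the lower latency.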

3. Database Patterns

-- Read Replica
SELECT * FROM users WHERE id = 1;  -- Read from replica

-- Sharding by User ID
-- Users 1-1,000,000 -> Shard 1
-- Users 1,000,001-2,000,000 -> Shard 2

-- Vertical Partitioning
-- Users table: id, name, email
-- UserProfiles table: user_id, avatar, bio, settings
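
The sharding ranges and the read-replica split above can be combined into a small routing layer. This is only a sketch: `shard_for_user`, `Router`, and the node names it returns are hypothetical, and real systems usually need a plan for resharding as well.

```python
import itertools

SHARD_SIZE = 1_000_000  # users per shard, matching the ranges listed above


def shard_for_user(user_id):
    """Range-based sharding: IDs 1..1,000,000 -> shard 1, the next million -> shard 2."""
    if user_id < 1:
        raise ValueError("user IDs are assumed to start at 1")
    return (user_id - 1) // SHARD_SIZE + 1


class Router:
    """Send writes to a shard's primary and rotate reads across its replicas."""

    def __init__(self, replica_names):
        self._replicas = itertools.cycle(replica_names)

    def target(self, user_id, is_write):
        shard = shard_for_user(user_id)
        node = "primary" if is_write else next(self._replicas)
        return f"shard{shard}-{node}"
```

For example, `Router(["replica-a", "replica-b"]).target(42, is_write=True)` picks the primary of shard 1, while reads alternate between the two replicas.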

Design Approaches

Step 1: Requirements Clarification

Ask questions like:
- What are the key features?
- How many users?
- What are the read/write ratios?
- What are the latency requirements?
- Any geographic considerations?

Step 2: High-Level Design

┌─────────────────────────────────────────────────────────┐
│                    User Interface                       │
└─────────────────────┬───────────────────────────────────┘
┌─────────────────────▼───────────────────────────────────┐
│                   Load Balancer                        │
└───────┬───────────────┬───────────────────┬─────────────┘
        │               │                   │
        ▼               ▼                   ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│  Service A  │ │  Service B  │ │  Service C  │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
       │               │                   │
       ▼               ▼                   ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│    Cache    │ │    Cache    │ │    Cache    │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
       │               │                   │
       └───────────────┴───────────────────┘
            ┌─────────────────┐
            │   Database      │
            │ (Primary +      │
            │  Replicas)      │
            └─────────────────┘

Step 3: Component Design

Design each component:

1. API Server
   - REST/GraphQL
   - Authentication (JWT, OAuth)
   
2. Data Storage
   - SQL vs NoSQL
   - Caching strategy
   
3. Message Queue
   - Async processing
   - Event-driven
   
4. Search
   - Full-text search (Elasticsearch)
   - Suggestions
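
To illustrate the async processing role of the message queue, here is a minimal in-process sketch using Python's standard `queue` and `threading` modules. A production design would use a real broker; `handle_signup`, `start_worker`, and the welcome-email job are hypothetical examples.

```python
import queue
import threading


def start_worker(jobs, results):
    """Start a background thread that consumes jobs until a None sentinel arrives."""

    def run():
        while True:
            job = jobs.get()
            if job is None:
                jobs.task_done()
                break
            # Stand-in for slow work such as sending an email.
            results.append(f"sent welcome email to {job['email']}")
            jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def handle_signup(email, jobs):
    """API handler: enqueue the slow work and return immediately."""
    jobs.put({"email": email})
    return {"status": "accepted"}
```

The handler returns as soon as the job is queued, so request latency no longer depends on how long the background work takes.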

Step 4: Trade-offs

Common Trade-offs:

1. Consistency vs Performance
   - Strong consistency = higher latency (writes wait for replicas to agree)
   - Eventual consistency = lower latency, but reads may return stale data
   
2. Latency vs Cost
   - More caching = lower latency, higher cost
   - Less caching = higher latency, lower cost
   
3. Availability vs Consistency
   - Always available = eventual consistency
   - Consistent = may be unavailable during partitions

Common System Designs

1. URL Shortener

Requirements:
- Shorten long URLs
- Redirect to original URL
- Track click analytics

Design:
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Client     │────▶│  API Server  │────▶│  Database   │
│              │◀────│  (Hash +    │◀────│  (URL +     │
│              │     │   Redirect)  │     │   Short ID) │
└──────────────┘     └──────────────┘     └──────────────┘
                     ┌──────────────┐
                     │   Cache      │
                     │  (Redis)     │
                     └──────────────┘
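
One way to generate the short ID is sketched below. Note that the diagram labels the API server "Hash + Redirect"; this sketch instead encodes an incrementing counter in base62, a common alternative that avoids hash collisions. The `Shortener` class and its in-memory dict (standing in for the database) are assumptions for illustration.

```python
import string

# 0-9, a-z, A-Z: 62 URL-safe characters
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n):
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(s):
    """Inverse of encode_base62."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


class Shortener:
    def __init__(self):
        self._next_id = 1
        self._urls = {}  # short ID -> long URL (stands in for the database)

    def shorten(self, long_url):
        short_id = encode_base62(self._next_id)
        self._urls[short_id] = long_url
        self._next_id += 1
        return short_id

    def resolve(self, short_id):
        # Returns None for unknown IDs; a real server would answer 404.
        return self._urls.get(short_id)
```

Seven base62 characters cover 62^7 (about 3.5 trillion) IDs, which is why short codes of that length are usually enough.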

2. Twitter/News Feed

Requirements:
- User follows others
- See chronological feed
- High read throughput

Design:
┌──────────────┐     ┌──────────────┐
│   Client     │────▶│  API Server  │
└──────────────┘     └──────┬───────┘
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Fan-out   │     │   Search    │     │  Analytics  │
│  Service   │     │   Service   │     │   Service   │
└──────┬──────┘     └─────────────┘     └─────────────┘
┌─────────────┐     ┌─────────────┐
│  Message    │────▶│   Cache     │
│  Queue      │     │  (Redis)    │
└─────────────┘     └─────────────┘
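
The fan-out service in this design can be sketched as "fan-out on write": when a user posts, the post is pushed into each follower's precomputed feed so that reads stay cheap. The `FeedService` class and `FEED_LIMIT` below are hypothetical, and the in-memory dicts stand in for the queue and Redis cache in the diagram.

```python
from collections import defaultdict, deque

FEED_LIMIT = 100  # assumed cap on cached posts per user


class FeedService:
    def __init__(self):
        self.followers = defaultdict(set)  # author -> set of followers
        # user -> newest-first posts; the oldest entries fall off past FEED_LIMIT
        self.feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, text):
        # Write path: one feed insert per follower.
        for follower in self.followers[author]:
            self.feeds[follower].appendleft((author, text))

    def feed(self, user, limit=10):
        # Read path: no joins, just slice the precomputed list.
        return list(self.feeds[user])[:limit]
```

The cost moves to the write path, which is why accounts with very large follower counts are often handled separately by merging their posts at read time.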

3. Real-time Chat

Requirements:
- 1:1 messaging
- Group chats
- Online status
- Message history

Design:
┌──────────────┐     ┌──────────────┐
│   WebSocket  │────▶│  Chat API    │
│  Connection  │     │   Server     │
└──────────────┘     └──────┬───────┘
        ┌───────────────────┼───────────────────┐
        ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Presence   │     │  Message    │     │  Notification│
│  Service    │     │  Service    │     │   Service   │
└─────────────┘     └──────┬──────┘     └─────────────┘
                   ┌─────────────┐
                   │  Database   │
                   │(Cassandra) │
                   └─────────────┘
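
For the online-status requirement, a presence service is commonly built on heartbeats: each connected client pings periodically, and a user counts as online until the pings stop for longer than a timeout. The sketch below assumes a 30-second timeout; `PresenceService` and its methods are hypothetical names.

```python
import time


class PresenceService:
    """Heartbeat-based presence tracking (illustrative sketch)."""

    def __init__(self, timeout_seconds=30, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self._last_seen = {}  # user_id -> time of last heartbeat

    def heartbeat(self, user_id):
        # Called whenever the client's WebSocket connection sends a ping.
        self._last_seen[user_id] = self.clock()

    def is_online(self, user_id):
        seen = self._last_seen.get(user_id)
        return seen is not None and self.clock() - seen < self.timeout
```

In a multi-server deployment the `_last_seen` map would live in a shared store with key expiry rather than in process memory.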

Estimation Techniques

Traffic Estimation

Assumptions:
- 1 million DAU
- 10 requests per user per day
- Peak = 3x average

Calculation:
- QPS = 1M × 10 / 86,400 ≈ 116 QPS (average)
- Peak QPS = 116 × 3 ≈ 347 QPS

Storage Estimation

Assumptions:
- 100 million users
- 100 posts per user
- 1KB per post

Calculation:
- Storage = 100M × 100 × 1KB = 10 TB
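- Images add far more: a rough allowance for 3 years of images is 100+ TB (this depends on image count and size, which are not listed above)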

Bandwidth

Assumptions:
- 100 QPS
- 10KB per response

Calculation:
- Bandwidth = 100 × 10KB = 1,000 KB/s = 1 MB/s
- Peak (3x average) = 3 MB/s
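
The three estimates above are simple enough to script, which makes it easy to rerun them with different assumptions. The helper names below are made up for this sketch, and the units are decimal (1 KB = 1,000 bytes).

```python
SECONDS_PER_DAY = 86_400
KB, MB, TB = 10**3, 10**6, 10**12  # decimal units


def average_qps(dau, requests_per_user_per_day):
    """Average queries per second, spread evenly over a day."""
    return dau * requests_per_user_per_day / SECONDS_PER_DAY


def storage_bytes(users, posts_per_user, bytes_per_post):
    """Total bytes needed for post text (no images, no replication)."""
    return users * posts_per_user * bytes_per_post


def bandwidth_bytes_per_sec(qps, bytes_per_response):
    """Outbound bytes per second at a given request rate."""
    return qps * bytes_per_response


PEAK_FACTOR = 3  # peak = 3x average, as assumed above

avg = average_qps(1_000_000, 10)
print(f"average QPS: {avg:.0f}, peak QPS: {avg * PEAK_FACTOR:.0f}")
print(f"storage: {storage_bytes(100_000_000, 100, 1 * KB) / TB:.0f} TB")
print(f"bandwidth: {bandwidth_bytes_per_sec(100, 10 * KB) / MB:.0f} MB/s")
```

In an interview, rounding to one or two significant figures is usually enough; the goal is the order of magnitude.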

Interview Tips

Do’s

  1. Clarify requirements - Ask questions before designing
  2. Think out loud - Show your reasoning
  3. Make trade-offs - Explain pros/cons
  4. Start simple - MVP first, then scale
  5. Know the numbers - Estimate capacity

Don’ts

  1. Don’t jump straight to code
  2. Don’t ignore non-functional requirements
  3. Don’t over-engineer the solution
  4. Don’t forget about failure scenarios
  5. Don’t ignore monitoring/operations

Conclusion

System design interviews evaluate your ability to make reasoned architectural trade-offs under uncertainty. Structure your approach: clarify requirements, estimate scale, design the data model, outline the core flow, then discuss trade-offs and bottlenecks. Practice communicating your reasoning clearly: the thought process matters more than the final design.
