Understanding System Design: Scalable Applications

Introduction

System design is the process of defining the architecture, components, and data flow of a system. Whether you’re building a startup MVP or an enterprise platform, understanding system design principles helps you create scalable, maintainable applications. This guide covers the fundamental concepts: scaling, load balancing, caching, databases, microservices, and reliability.

Scalability Fundamentals

Horizontal vs Vertical Scaling

Vertical Scaling (Scale Up)

  • Add more resources (CPU, RAM, disk) to an existing machine
  • Simple to implement
  • Bounded by hardware limits
  • The machine remains a single point of failure

Horizontal Scaling (Scale Out)

  • Add more machines
  • More complex to operate (requires load balancing and coordination between nodes)
  • Better fault tolerance
  • Preferred for large systems

Load Balancing

A load balancer distributes incoming traffic across servers:

# Simple round-robin
class LoadBalancer:
    def __init__(self, servers):
        self.servers = servers
        self.current = 0
    
    def get_server(self):
        server = self.servers[self.current]
        self.current = (self.current + 1) % len(self.servers)
        return server

Load Balancing Algorithms

  • Round Robin: Sequential distribution
  • Least Connections: Fewest active requests
  • IP Hash: Same client to same server
  • Weighted: Performance-based distribution
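
As a complement to the round-robin example, here is a minimal sketch of the least-connections algorithm from the list above. The class name, method names, and server labels are hypothetical, and the sketch assumes the caller reports when each request finishes; it is not thread-safe.

```python
class LeastConnectionsBalancer:
    """Picks the server with the fewest in-flight requests (illustrative sketch)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # Number of active requests per server
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first server with the lowest count,
        # so ties are broken by the order the servers were given in
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when the request to `server` has finished
        self.active[server] -= 1


lb = LeastConnectionsBalancer(["app-1", "app-2", "app-3"])
first = lb.acquire()    # app-1
second = lb.acquire()   # app-2
lb.release(first)       # app-1 is idle again
third = lb.acquire()    # app-1
```

Unlike round robin, this approach adapts when some requests take much longer than others, at the cost of tracking state per server.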

Caching

Store frequently accessed data for fast retrieval:

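# Cache-aside pattern
# Assumes `redis` is a connected Redis client instance and `db` / `User`
# are a SQLAlchemy session and model defined elsewhere.
import json

def get_user(user_id):
    # Check cache first
    cache_key = f"user:{user_id}"
    cached = redis.get(cache_key)
    if cached is not None:
        return json.loads(cached)

    # Cache miss: fetch from database
    user = db.query(User).get(user_id)
    if user is None:
        return None

    # Serialize to a plain dict: ORM objects are not JSON-serializable.
    # The fields chosen here are an assumption for illustration.
    data = {"id": user.id, "name": user.name}

    # Store in cache for one hour
    redis.setex(cache_key, 3600, json.dumps(data))
    return data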

Cache Strategies

Strategy | Description | Use Case
Cache-aside | App manages cache | Read-heavy
Write-through | Write to cache and DB | Read-write
Write-behind | Async DB writes | High write
TTL | Time-based expiration | Any
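
To contrast with cache-aside, the write-through row can be sketched as follows. This is a minimal illustration that uses plain dictionaries as stand-ins for both the cache and the database; the class and method names are hypothetical.

```python
class WriteThroughCache:
    """Every write goes to the backing store and the cache together (sketch)."""

    def __init__(self, store):
        self.store = store   # stand-in for the database (a dict here)
        self.cache = {}

    def write(self, key, value):
        # Write the store first so the cache never holds data the store lacks
        self.store[key] = value
        self.cache[key] = value

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        # Fall back to the store for keys written before the cache existed
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value
        return value


database = {"user:0": {"name": "Grace"}}
users = WriteThroughCache(database)
users.write("user:1", {"name": "Ada"})
```

After the write, both `database` and `users.cache` hold `user:1`, so subsequent reads are served from the cache without touching the store.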

Database Design

SQL vs NoSQL

SQL | NoSQL
Structured data | Flexible schema
ACID compliance | Eventual consistency
Complex queries | Simple queries
Vertical scaling | Horizontal scaling

When to Use SQL

  • Transactional data
  • Complex relationships
  • Structured data
  • ACID requirements

When to Use NoSQL

  • Unstructured data
  • High write throughput
  • Horizontal scaling needed
  • Schema flexibility

Database Patterns

Read Replicas

-- Read replica routing: reads go to a replica, writes go to the primary
SELECT * FROM users;  -- Goes to replica
INSERT INTO users...  -- Goes to primary
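
The routing decision itself usually lives in the application or a proxy. The following is a minimal sketch of that idea, assuming connections are represented by simple labels; `ReplicaRouter` and its methods are hypothetical names, and classifying statements by their first keyword is a simplification.

```python
import itertools


class ReplicaRouter:
    """Sends reads to replicas in rotation and writes to the primary (sketch)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Simplification: statements starting with SELECT are treated as reads
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


router = ReplicaRouter("primary", ["replica-1", "replica-2"])
read_target = router.route("SELECT * FROM users")
write_target = router.route("INSERT INTO users (name) VALUES ('Ada')")
```

With asynchronous replication, a read issued right after a write may not see that write yet, so reads that must be current are often sent to the primary.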

Sharding

Split data across databases:

  • Range-based: Users A-M, N-Z
  • Hash-based: user_id % n
  • Directory-based: Lookup service
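
The hash-based option can be sketched in a few lines. This example assumes keys may be strings as well as integers, so it uses a stable hash instead of a bare `user_id % n`; the function name `shard_for` is hypothetical.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Use a stable hash: Python's built-in hash() for strings
    # can differ between processes, which would break routing
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = shard_for("user-12345", 4)
```

One trade-off of plain modulo hashing is that changing the number of shards remaps most keys, which is why resharding usually needs a migration plan.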

Microservices

Monolith vs Microservices

Monolith | Microservices
Single deployable | Independent services
Simple initially | Complex but scalable
Shared database | Each service owns data
Hard to scale | Scale individually

Communication Patterns

Synchronous (REST/gRPC)

# REST call
def get_user_with_orders(user_id):
    user = user_service.get(user_id)
    orders = order_service.get_by_user(user_id)
    return {"user": user, "orders": orders}

Asynchronous (Message Queue)

# Publish event
def create_order(order_data):
    order = save_order(order_data)
    # Notify other services without waiting for them to process the event
    message_queue.publish("order_created", {
        "order_id": order.id,
        "user_id": order.user_id
    })
    return order
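
To show what the consuming side of such an event might look like, here is a minimal in-process sketch. It is not a real message broker: `InMemoryBroker` and its methods are hypothetical, and delivery is triggered manually to imitate a worker that runs separately from the publisher.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy publish/subscribe broker that decouples publishing from delivery."""

    def __init__(self):
        self.handlers = defaultdict(list)   # topic -> list of callbacks
        self.pending = deque()              # messages waiting for delivery

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher returns immediately; nothing is processed yet
        self.pending.append((topic, message))

    def deliver_pending(self):
        # A real broker would run this in separate worker processes
        while self.pending:
            topic, message = self.pending.popleft()
            for handler in self.handlers[topic]:
                handler(message)


notified = []
broker = InMemoryBroker()
broker.subscribe("order_created", lambda event: notified.append(event["order_id"]))
broker.publish("order_created", {"order_id": 42, "user_id": 7})
before_delivery = list(notified)   # still empty: the subscriber has not run
broker.deliver_pending()
```

The publisher never calls the subscriber directly, so a slow or failing subscriber does not block order creation.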

Service Discovery

# Kubernetes service
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service
  ports:
  - port: 80
    targetPort: 8080

Reliability

Fault Tolerance

  • Circuit breakers
  • Retries with backoff
  • Timeouts
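  • Fallbacks

# Circuit breaker
import time

class CircuitBreaker:
    def __init__(self, threshold=5, reset_timeout=30.0):
        self.failures = 0
        self.threshold = threshold
        # Seconds to stay open before allowing a trial call (assumed default)
        self.reset_timeout = reset_timeout
        self.opened_at = None
        self.state = "closed"

    def call(self, func):
        if self.state == "open":
            # After the timeout, let one trial call through (half-open)
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("Circuit open")
            self.state = "half-open"

        try:
            result = func()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, opens the circuit
            if self.state == "half-open" or self.failures >= self.threshold:
                self.state = "open"
                self.opened_at = time.monotonic()
            raise

        # Success: reset the failure count and close the circuit
        self.failures = 0
        self.state = "closed"
        return result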
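
Retries with backoff, the second item in the list above, can be sketched in a similar way. This is one possible approach, using exponential backoff with random jitter; the function name, parameter names, and default values are assumptions for illustration.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=0.1, max_delay=2.0,
                       sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Out of attempts: let the last error propagate to the caller
            if attempt == max_attempts - 1:
                raise
            # The delay ceiling doubles each attempt, capped at max_delay
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists so tests can inject a fake and avoid real waiting. Retries are only safe for idempotent operations, and they combine well with a circuit breaker so that a failing dependency is not flooded with repeated calls.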

Redundancy

  • Multiple availability zones
  • Data replication
  • Failover mechanisms

Monitoring

  • Metrics collection
  • Logging aggregation
  • Distributed tracing
  • Alerting

Common System Designs

URL Shortener

User → API → Database
      ↓
   Cache
      ↓
   Redirect

Components:

  • Hash function for short codes
  • Database for mapping
  • Cache for popular URLs
  • 301/302 redirects
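
The first component, a hash function for short codes, might look like the sketch below. It assumes a SHA-256 digest encoded in base 62 and truncated to seven characters; the function names and the code length are hypothetical choices, not a prescribed scheme.

```python
import hashlib
import string

# 0-9, a-z, A-Z: 62 URL-safe characters
ALPHABET = string.digits + string.ascii_letters


def base62(n):
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def short_code(url, length=7):
    """Derive a fixed-length short code from a URL."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    number = int.from_bytes(digest[:8], "big")
    # Pad on the left so the code always has `length` characters
    return base62(number).rjust(length, ALPHABET[0])[:length]


code = short_code("https://example.com/some/long/path")
```

Because the code is truncated, two different URLs can map to the same code, so the service still needs to check the database for collisions before saving a mapping.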

Twitter Feed

User → Load Balancer → API Servers
         ↓
      Cache (user feeds)
         ↓
    Database + Follower graph

Real-time Chat

User → WebSocket → API Server
           ↓
       Redis (presence)
           ↓
      Message Queue
           ↓
    Database + Push notifications

Design Process

1. Requirements Clarification

  • What features are needed?
  • What’s the scale?
  • What’s the timeline?

2. High-Level Design

  • Components and services
  • Data flow
  • APIs

3. Deep Dive

  • Database schema
  • Caching strategy
  • Scaling approach

4. Bottlenecks

  • Where are potential failures?
  • How to handle them?

5. Wrap Up

  • Summary of design
  • Trade-offs discussed

Key Concepts Summary

Concept | Purpose
Load balancing | Distribute traffic
Caching | Speed up reads
Database sharding | Horizontal scale
Message queues | Async communication
Circuit breakers | Fault tolerance
CDN | Static content delivery

Conclusion

System design skills improve with practice. Start with fundamentals, study real-world systems, and apply concepts in your projects. Understanding these patterns helps you make better architecture decisions.

