
Algorithm Design Principles: A Practical Guide

Introduction

Algorithm design is the art of creating solutions that are not just correct, but efficient. The same problem can be solved in ways that differ by orders of magnitude in performance. This guide covers the core principles and strategies that separate good algorithms from great ones.

Core Principle 1: Minimize Unnecessary Work

The primary goal is to reduce the total number of operations. Before writing any code, ask: “What do I already know that can narrow the search space?”

# Bad: linear search on every call — O(n) per lookup
def find_user(users, user_id):
    for user in users:
        if user['id'] == user_id:
            return user
    return None

# Good: build a lookup dict once — O(1) per lookup
def build_index(users):
    return {user['id']: user for user in users}

users = [{'id': 1, 'name': 'Ada'}, {'id': 2, 'name': 'Grace'}]  # example data
user_index = build_index(users)
user = user_index.get(2)  # O(1)

Key questions before coding:

  • Can I sort the data first to enable binary search?
  • Can I precompute something to avoid repeated work?
  • Can I use a different data structure for faster access?
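The first two questions above can be combined using the standard-library `bisect` module: sort once up front, then answer each membership query in O(log n). A minimal sketch:

```python
import bisect

data = [37, 5, 19, 42, 8, 23]
data.sort()  # O(n log n), paid once

def contains(sorted_list, x):
    # bisect_left finds the insertion point in O(log n)
    i = bisect.bisect_left(sorted_list, x)
    return i < len(sorted_list) and sorted_list[i] == x

print(contains(data, 19))  # True
print(contains(data, 20))  # False
```

For pure membership tests a set is simpler; sorted-plus-bisect pays off when you also need range queries or ordered traversal.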

Core Principle 2: Choose the Right Data Structure

The data structure choice often determines the algorithm’s complexity:

Operation         Array           Linked List   Hash Map   BST        Heap
Access by index   O(1)            O(n)          n/a        n/a        n/a
Search            O(n)            O(n)          O(1) avg   O(log n)   O(n)
Insert (end)      O(1) amortized  O(1)          O(1) avg   O(log n)   O(log n)
Insert (middle)   O(n)            O(1)*         n/a        O(log n)   n/a
Delete            O(n)            O(1)*         O(1) avg   O(log n)   O(log n)
Min/Max           O(n)            O(n)          O(n)       O(log n)   O(1)

(BST columns assume a balanced tree; * assumes you already hold a reference to the node.)

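As one illustration of the heap column, Python's `heapq` module keeps the smallest element at index 0: reading the minimum is O(1), while push and pop are O(log n). A minimal sketch:

```python
import heapq

nums = [7, 2, 9, 4]
heap = []
for x in nums:
    heapq.heappush(heap, x)   # O(log n) per insert

print(heap[0])                # O(1) peek at the minimum -> 2
print(heapq.heappop(heap))    # O(log n) removal -> 2
print(heap[0])                # new minimum -> 4
```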
# Problem: find all duplicates in a list

# Bad: O(n²) — nested loops
def find_duplicates_slow(lst):
    duplicates = []
    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if lst[i] == lst[j] and lst[i] not in duplicates:
                duplicates.append(lst[i])
    return duplicates

# Good: O(n) — use a set for O(1) membership testing
def find_duplicates_fast(lst):
    seen = set()
    duplicates = set()
    for item in lst:
        if item in seen:
            duplicates.add(item)
        seen.add(item)
    return list(duplicates)

Core Principle 3: Memoization and Caching

Avoid recomputing the same result. Store computed values and retrieve them on subsequent calls:

# Without memoization: O(2^n) — exponential
def fib_slow(n):
    if n <= 1:
        return n
    return fib_slow(n - 1) + fib_slow(n - 2)

# With memoization: O(n) — each subproblem solved once
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n):
    if n <= 1:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

# Manual memoization (the mutable default dict persists across calls —
# intentional as a cache here, but a well-known Python pitfall)
def fib_manual(n, memo={}):
    if n in memo:
        return memo[n]
    if n <= 1:
        return n
    memo[n] = fib_manual(n - 1, memo) + fib_manual(n - 2, memo)
    return memo[n]

print(fib_slow(35))   # takes seconds
print(fib_memo(35))   # instant
print(fib_memo(1000)) # still instant

Recursion vs Iteration

Recursion is elegant but adds per-call overhead and is bounded by the call stack (CPython's default recursion limit is about 1000 frames). Iteration is usually faster and cannot overflow the stack.

# Recursive factorial — limited by recursion depth
def factorial_recursive(n):
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)

# Iterative factorial — no depth limit
def factorial_iterative(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# When recursion is natural: tree traversal
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

# Recursive DFS — clean and natural
def dfs_recursive(node):
    if node is None:
        return []
    return [node.val] + dfs_recursive(node.left) + dfs_recursive(node.right)

# Iterative DFS — handles deep trees without stack overflow
def dfs_iterative(root):
    if not root:
        return []
    result, stack = [], [root]
    while stack:
        node = stack.pop()
        result.append(node.val)
        if node.right: stack.append(node.right)
        if node.left:  stack.append(node.left)
    return result

Rule of thumb: Use recursion when the problem is naturally recursive (trees, graphs, divide-and-conquer). Convert to iteration when depth could be large or performance is critical.

Strategy 1: Divide and Conquer

Split the problem into independent subproblems, solve each, combine results.

Time complexity: Often O(n log n) — the log factor comes from repeatedly halving the input.

# Merge Sort — classic divide and conquer
def merge_sort(arr):
    if len(arr) <= 1:
        return arr

    mid = len(arr) // 2
    left  = merge_sort(arr[:mid])   # divide
    right = merge_sort(arr[mid:])   # divide

    return merge(left, right)       # conquer + combine

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    return result + left[i:] + right[j:]

# Binary Search — divide and conquer on a sorted array
def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

Other examples: QuickSort, Fast Fourier Transform, Strassen’s matrix multiplication.
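QuickSort, mentioned above, follows the same shape with the work done in the partition step. A simple (not in-place) sketch:

```python
def quick_sort(arr):
    # Divide: partition around a pivot; conquer: recurse on each side
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    less    = [x for x in arr if x < pivot]
    equal   = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return quick_sort(less) + equal + quick_sort(greater)

print(quick_sort([3, 6, 1, 8, 2, 9, 4]))  # => [1, 2, 3, 4, 6, 8, 9]
```

Production implementations partition in place to avoid the extra lists, but the recursive structure is the same.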

Strategy 2: Greedy Algorithms

Make the locally optimal choice at each step. Fast and simple, but doesn’t always find the global optimum.

# Coin change (greedy is optimal for canonical coin systems like US denominations)
def make_change_greedy(amount, coins=[25, 10, 5, 1]):
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            result.append(coin)
            amount -= coin
    return result

print(make_change_greedy(48))
# => [25, 10, 10, 1, 1, 1]

# Activity selection: maximize number of non-overlapping activities
def activity_selection(activities):
    if not activities:
        return []
    # Sort by end time (greedy choice: pick the earliest-ending activity)
    sorted_acts = sorted(activities, key=lambda x: x[1])
    selected = [sorted_acts[0]]

    for start, end in sorted_acts[1:]:
        if start >= selected[-1][1]:  # no overlap
            selected.append((start, end))

    return selected

activities = [(1,4), (3,5), (0,6), (5,7), (3,9), (5,9), (6,10), (8,11)]
print(activity_selection(activities))
# => [(1, 4), (5, 7), (8, 11)]

When greedy works: Activity selection, Huffman coding, Dijkstra’s shortest path, minimum spanning tree (Kruskal’s, Prim’s).

When greedy fails: Coin change with non-standard denominations, 0/1 knapsack problem.
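The failure is easy to see with coins {1, 3, 4} and amount 6: greedy takes 4 + 1 + 1 (three coins) while the optimum is 3 + 3 (two). A bottom-up DP sketch that always finds the true minimum:

```python
def min_coins(amount, coins):
    # dp[a] = fewest coins summing to a (inf if unreachable)
    INF = float('inf')
    dp = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a:
                dp[a] = min(dp[a], dp[a - c] + 1)
    return dp[amount]

print(min_coins(6, [1, 3, 4]))  # => 2  (3 + 3; greedy would use 3 coins)
```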

Strategy 3: Dynamic Programming

Break the problem into overlapping subproblems, solve each once, and store the results. The solution can be built bottom-up (tabulation) or top-down (memoization).

# Fibonacci — bottom-up DP
def fib_dp(n):
    if n <= 1:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]

# Space-optimized (only need last two values)
def fib_optimized(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b

# Longest Common Subsequence
def lcs(s1, s2):
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])

    return dp[m][n]

print(lcs("ABCBDAB", "BDCAB"))  # => 4 (BCAB or BDAB)

# 0/1 Knapsack
def knapsack(weights, values, capacity):
    n = len(weights)
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for w in range(capacity + 1):
            dp[i][w] = dp[i-1][w]  # don't take item i
            if weights[i-1] <= w:
                dp[i][w] = max(dp[i][w],
                               dp[i-1][w - weights[i-1]] + values[i-1])

    return dp[n][capacity]
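A common refinement: since each row of the table depends only on the previous row, it can be collapsed to a single array, iterating capacities downward so each item is counted at most once. A sketch with made-up example weights and values:

```python
def knapsack_1d(weights, values, capacity):
    # dp[w] = best value achievable with capacity w
    dp = [0] * (capacity + 1)
    for wt, val in zip(weights, values):
        # Iterate downward so each item is used at most once (0/1, not unbounded)
        for w in range(capacity, wt - 1, -1):
            dp[w] = max(dp[w], dp[w - wt] + val)
    return dp[capacity]

print(knapsack_1d([2, 3, 4, 5], [3, 4, 5, 6], 5))  # => 7  (weights 2 and 3)
```

This cuts the space from O(n × capacity) to O(capacity) without changing the answer.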

Strategy 4: Two Pointers

Use two indices moving through the data to solve problems in O(n) that would otherwise be O(n²):

# Two Sum (sorted array)
def two_sum_sorted(arr, target):
    left, right = 0, len(arr) - 1
    while left < right:
        s = arr[left] + arr[right]
        if s == target:
            return (left, right)
        elif s < target:
            left += 1
        else:
            right -= 1
    return None

# Remove duplicates from sorted array in-place (returns the new length)
def remove_duplicates(arr):
    if not arr:
        return 0
    write = 1
    for read in range(1, len(arr)):
        if arr[read] != arr[read - 1]:
            arr[write] = arr[read]
            write += 1
    return write

Strategy 5: Sliding Window

Maintain a window over a sequence to find optimal subarrays in O(n):

# Maximum sum subarray of size k
def max_sum_subarray(arr, k):
    window_sum = sum(arr[:k])
    max_sum = window_sum

    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]
        max_sum = max(max_sum, window_sum)

    return max_sum

# Longest substring without repeating characters
def length_of_longest_substring(s):
    char_index = {}
    max_len = start = 0

    for end, char in enumerate(s):
        if char in char_index and char_index[char] >= start:
            start = char_index[char] + 1
        char_index[char] = end
        max_len = max(max_len, end - start + 1)

    return max_len

print(length_of_longest_substring("abcabcbb"))  # => 3 ("abc")

Complexity Analysis

Always analyze your algorithm before coding:

O(1)       — constant: hash lookup, array access
O(log n)   — logarithmic: binary search, balanced BST
O(n)       — linear: single pass, linear search
O(n log n) — linearithmic: merge sort, heap sort
O(n²)      — quadratic: nested loops, bubble sort
O(2^n)     — exponential: naive recursion, subset enumeration
O(n!)      — factorial: permutation generation

Practical rule: For n = 10^6, O(n log n) is fine; O(nยฒ) is too slow.
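The back-of-envelope arithmetic behind that rule, assuming very roughly 10^8 simple operations per second:

```python
import math

n = 10**6
ops_nlogn = n * math.log2(n)  # ~2e7 operations: well under a second
ops_n2 = n ** 2               # 1e12 operations: hours at the same rate

print(f"{ops_nlogn:.1e}")  # ~2.0e7
print(f"{ops_n2:.1e}")     # 1.0e12
```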
