⚡ Calmops

Threading vs Multiprocessing: Master Concurrent Programming in Python

Modern applications need to do multiple things at once. Download files while responding to user input. Process data while serving web requests. Handle multiple client connections simultaneously. This is concurrent programming, and it’s essential for building responsive, scalable applications.

Python offers two main approaches: threading and multiprocessing. Both allow your program to do multiple things concurrently, but they work differently and suit different problems. Choosing the wrong approach can leave performance on the table or introduce subtle bugs. This guide explains both, their trade-offs, and when to use each.

Why Concurrent Programming Matters

Without concurrency, your program does one thing at a time. If a task takes 10 seconds (like waiting for a network response), your entire program waits. With concurrency, other tasks can run while one waits.

Sequential (No Concurrency):
Task A: ████████████ (10s)
Task B:             ████████████ (10s)
Task C:                         ████████████ (10s)
Total: 30 seconds

Concurrent:
Task A: ████████████
Task B:     ████████████
Task C:         ████████████
Total: ~12 seconds (mostly overlapped)

This is why concurrent programming is crucial for modern applications.

Threading Explained

What Are Threads?

Threads are lightweight execution units within a single process. Multiple threads share the same memory space and can access the same variables.

Single Process with Multiple Threads:
┌─────────────────────────────────────┐
│         Single Process              │
├─────────────────────────────────────┤
│  Thread 1  │  Thread 2  │  Thread 3 │
│            │            │           │
│  Shared Memory Space                │
│  (Global variables, heap)           │
└─────────────────────────────────────┘

Basic Threading Example

import threading
import time

def worker(name, duration):
    """Function to run in a thread"""
    print(f"{name} starting")
    time.sleep(duration)
    print(f"{name} finished")

# Create threads
thread1 = threading.Thread(target=worker, args=("Thread 1", 2))
thread2 = threading.Thread(target=worker, args=("Thread 2", 3))

# Start threads
thread1.start()
thread2.start()

# Wait for threads to complete
thread1.join()
thread2.join()

print("All threads completed")

Output:

Thread 1 starting
Thread 2 starting
Thread 1 finished
Thread 2 finished
All threads completed

Shared Memory and Race Conditions

Threads share memory, which is convenient but dangerous. Multiple threads accessing the same variable can cause race conditions.

import threading

counter = 0

def increment():
    """Increment counter 100,000 times"""
    global counter
    for _ in range(100000):
        counter += 1

# Create threads
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=increment)

# Run threads
thread1.start()
thread2.start()
thread1.join()
thread2.join()

print(f"Counter: {counter}")
# Expected: 200,000
# Actual: often less than 200,000 (the exact value varies from run to run)

The problem: both threads read, modify, and write counter simultaneously, causing lost updates.
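You can see why the update is not atomic by disassembling the function: counter += 1 compiles to several separate bytecode instructions, and a thread switch can land between any two of them.

```python
import dis

counter = 0

def increment():
    global counter
    counter += 1

# The += becomes separate load, add, and store instructions;
# a context switch between the load and the store loses an update.
dis.dis(increment)
```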

Thread Synchronization

Use locks to prevent race conditions:

import threading

counter = 0
lock = threading.Lock()

def increment():
    """Increment counter safely"""
    global counter
    for _ in range(100000):
        with lock:  # Acquire lock
            counter += 1
            # Lock automatically released

# Create and run threads
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=increment)

thread1.start()
thread2.start()
thread1.join()
thread2.join()

print(f"Counter: {counter}")  # Correctly prints 200,000
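Acquiring a lock has a cost of its own, so a common refinement is to accumulate privately and take the lock once per thread instead of once per increment. A sketch of the same example with coarser locking:

```python
import threading

counter = 0
lock = threading.Lock()

def increment():
    """Accumulate locally, then publish the total under the lock"""
    global counter
    local = 0
    for _ in range(100000):
        local += 1
    with lock:  # one acquisition per thread instead of 100,000
        counter += local

threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Counter: {counter}")  # Correctly prints 200,000
```

The trade-off: other threads see the counter update later, which is fine here but matters if they read intermediate values.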

Multiprocessing Explained

What Is Multiprocessing?

Multiprocessing creates separate processes, each with its own Python interpreter and memory space. Processes don’t share memory directly.

Multiple Processes:
┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│   Process 1      │  │   Process 2      │  │   Process 3      │
├──────────────────┤  ├──────────────────┤  ├──────────────────┤
│  Memory Space 1  │  │  Memory Space 2  │  │  Memory Space 3  │
│  (Isolated)      │  │  (Isolated)      │  │  (Isolated)      │
└──────────────────┘  └──────────────────┘  └──────────────────┘

Basic Multiprocessing Example

import multiprocessing
import time

def worker(name, duration):
    """Function to run in a process"""
    print(f"{name} starting")
    time.sleep(duration)
    print(f"{name} finished")

if __name__ == '__main__':
    # Create processes
    process1 = multiprocessing.Process(target=worker, args=("Process 1", 2))
    process2 = multiprocessing.Process(target=worker, args=("Process 2", 3))
    
    # Start processes
    process1.start()
    process2.start()
    
    # Wait for processes to complete
    process1.join()
    process2.join()
    
    print("All processes completed")

Note: The if __name__ == '__main__': guard is required on Windows and on macOS (both start child processes with the spawn method) and recommended on all platforms.
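The reason for the guard is how spawned children start: each one re-imports your main module, so unguarded process-creation code would run again in every child, recursively. You can check which start method your platform uses:

```python
import multiprocessing

# Returns the method used to start child processes
print(multiprocessing.get_start_method())
# typically 'fork' on Linux, 'spawn' on Windows and macOS
```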

Process Pools

For managing multiple processes efficiently, use process pools:

import multiprocessing

def square(x):
    """Calculate square of x"""
    return x ** 2

if __name__ == '__main__':
    # Create pool with 4 worker processes
    with multiprocessing.Pool(processes=4) as pool:
        # Map function across data
        results = pool.map(square, range(10))
    
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
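The concurrent.futures module offers the same pattern with a slightly higher-level interface; ProcessPoolExecutor.map mirrors Pool.map and returns results in input order:

```python
from concurrent.futures import ProcessPoolExecutor

def square(x):
    """Calculate square of x"""
    return x ** 2

if __name__ == '__main__':
    # Same pool-of-workers pattern via concurrent.futures
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(square, range(10)))

    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```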

Inter-Process Communication

Processes need explicit communication mechanisms:

import multiprocessing

def producer(queue):
    """Put items in queue"""
    for i in range(5):
        queue.put(f"Item {i}")
    queue.put(None)  # Signal end

def consumer(queue):
    """Get items from queue"""
    while True:
        item = queue.get()
        if item is None:
            break
        print(f"Received: {item}")

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    
    # Create producer and consumer processes
    p1 = multiprocessing.Process(target=producer, args=(queue,))
    p2 = multiprocessing.Process(target=consumer, args=(queue,))
    
    p1.start()
    p2.start()
    
    p1.join()
    p2.join()
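A Queue suits many producers and consumers; for a simple one-to-one link, multiprocessing.Pipe is a lighter alternative. A minimal sketch:

```python
import multiprocessing

def sender(conn):
    """Send a few messages through one end of the pipe"""
    for i in range(3):
        conn.send(f"Message {i}")
    conn.close()

if __name__ == '__main__':
    # Pipe() returns two connected endpoints
    parent_conn, child_conn = multiprocessing.Pipe()

    p = multiprocessing.Process(target=sender, args=(child_conn,))
    p.start()

    for _ in range(3):
        print(parent_conn.recv())  # Message 0, Message 1, Message 2

    p.join()
```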

Key Differences

Memory Usage

import threading
import multiprocessing

# Threading: lightweight — threads share the process's memory,
# adding only a stack and a small amount of interpreter state
thread = threading.Thread(target=lambda: None)

# Multiprocessing: heavy — each process carries its own Python
# interpreter and memory space (tens of MB per process is typical)
process = multiprocessing.Process(target=lambda: None)

CPU Utilization

Threading: Limited by Python’s Global Interpreter Lock (GIL). Only one thread executes Python bytecode at a time, even on multi-core systems.

Multiprocessing: True parallelism. Each process has its own GIL, so multiple processes can execute Python code simultaneously on multiple cores.

Communication Overhead

Threading: Fast (shared memory, just use locks)

Multiprocessing: Slow (requires serialization/deserialization of data)
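That serialization cost is pickling: every object sent through a multiprocessing.Queue or Pipe is pickled on one side and unpickled on the other. You can observe the cost directly:

```python
import pickle
import time

data = list(range(1_000_000))

start = time.time()
blob = pickle.dumps(data)      # what happens on every queue.put()
restored = pickle.loads(blob)  # what happens on every queue.get()
elapsed = time.time() - start

print(f"Round-trip for 1M integers: {elapsed:.3f}s, {len(blob) / 1e6:.1f} MB")
```

Threads skip this entirely: they pass references to the same objects in shared memory.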

Comparison Table

Aspect            Threading                          Multiprocessing
----------------  ---------------------------------  -------------------------------
Memory per unit   Low (shares process memory)        High (tens of MB per process)
Creation time     Fast (milliseconds)                Slow (hundreds of milliseconds)
Communication     Fast (shared memory)               Slow (serialization)
CPU parallelism   Limited (GIL)                      True parallelism
Synchronization   Complex (locks, race conditions)   Simpler (isolated memory)
Debugging         Difficult (shared state)           Easier (isolated state)

Use Cases: When to Use Each

Threading: I/O-Bound Tasks

Use threading when your program waits for I/O (network, disk, database).

import threading
import requests
import time

def fetch_url(url):
    """Fetch URL and print response time"""
    start = time.time()
    response = requests.get(url)
    elapsed = time.time() - start
    print(f"{url}: {elapsed:.2f}s")

# Sequential: ~6 seconds (3 URLs × 2 seconds each)
start = time.time()
for url in ['http://example.com', 'http://example.com', 'http://example.com']:
    fetch_url(url)
print(f"Sequential: {time.time() - start:.2f}s")

# Threading: ~2 seconds (all requests in parallel)
start = time.time()
threads = []
for url in ['http://example.com', 'http://example.com', 'http://example.com']:
    t = threading.Thread(target=fetch_url, args=(url,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()
print(f"Threading: {time.time() - start:.2f}s")
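The same pattern is tidier with a thread pool. In this sketch, time.sleep stands in for the network wait so the example runs without requests or a live connection:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fake_fetch(url):
    """Stand-in for a network call: waits, then returns a result"""
    time.sleep(0.5)
    return f"fetched {url}"

urls = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]

start = time.time()
with ThreadPoolExecutor(max_workers=3) as executor:
    # map submits all tasks and yields results in input order
    results = list(executor.map(fake_fetch, urls))
elapsed = time.time() - start

print(f"Thread pool: {elapsed:.2f}s")  # ~0.5s, not ~1.5s
```

The pool also caps concurrency: with thousands of URLs, max_workers limits how many requests are in flight at once.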

Multiprocessing: CPU-Bound Tasks

Use multiprocessing when your program performs heavy computation.

import multiprocessing
import time

def cpu_intensive(n):
    """Perform CPU-intensive calculation"""
    result = 0
    for i in range(n):
        result += i ** 2
    return result

if __name__ == '__main__':
    # Sequential: ~4 seconds
    start = time.time()
    for _ in range(4):
        cpu_intensive(50000000)
    print(f"Sequential: {time.time() - start:.2f}s")
    
    # Multiprocessing: ~1 second (on 4-core system)
    start = time.time()
    with multiprocessing.Pool(processes=4) as pool:
        pool.map(cpu_intensive, [50000000] * 4)
    print(f"Multiprocessing: {time.time() - start:.2f}s")
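Hard-coding processes=4 only helps on a machine with at least four cores. Pool() with no argument sizes itself to the core count, which you can also query explicitly:

```python
import multiprocessing

def double(x):
    return x * 2

if __name__ == '__main__':
    print(f"Cores available: {multiprocessing.cpu_count()}")

    # Pool() with no argument starts cpu_count() workers
    with multiprocessing.Pool() as pool:
        print(pool.map(double, [1, 2, 3]))  # [2, 4, 6]
```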

Advantages and Disadvantages

Threading

Advantages:

  • Lightweight (low memory overhead)
  • Fast creation and context switching
  • Efficient for I/O-bound tasks
  • Easy data sharing (shared memory)
  • Good for responsive UIs

Disadvantages:

  • Limited by GIL (no true parallelism for CPU-bound tasks)
  • Complex synchronization (race conditions, deadlocks)
  • Difficult to debug (shared state)
  • One thread crash can crash entire process

Multiprocessing

Advantages:

  • True parallelism (bypasses GIL)
  • Excellent for CPU-bound tasks
  • Isolated memory (fewer synchronization issues)
  • One process crash doesn’t affect others
  • Easier to reason about (no shared state)

Disadvantages:

  • Heavy memory overhead
  • Slow creation and context switching
  • Expensive inter-process communication
  • Data serialization overhead
  • More complex to implement

Practical Considerations

The Global Interpreter Lock (GIL)

Python’s GIL prevents multiple threads from executing Python bytecode simultaneously. This means threading doesn’t provide true parallelism for CPU-bound tasks.

import threading
import time

def cpu_work():
    """CPU-intensive work"""
    total = 0
    for i in range(100000000):
        total += i

# Single thread: ~5 seconds
start = time.time()
cpu_work()
print(f"Single thread: {time.time() - start:.2f}s")

# Two threads: ~5 seconds (not faster due to GIL)
start = time.time()
t1 = threading.Thread(target=cpu_work)
t2 = threading.Thread(target=cpu_work)
t1.start()
t2.start()
t1.join()
t2.join()
print(f"Two threads: {time.time() - start:.2f}s")
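The GIL is released while a thread blocks on I/O (time.sleep behaves the same way), which is exactly why threading still pays off for I/O-bound work even though it doesn't for CPU-bound work:

```python
import threading
import time

def io_wait():
    """Blocking call — the GIL is released while sleeping"""
    time.sleep(0.5)

# Four blocking waits overlap: ~0.5s total, not ~2s
start = time.time()
threads = [threading.Thread(target=io_wait) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

print(f"Four sleeping threads: {elapsed:.2f}s")
```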

Deadlocks

Deadlocks occur when threads wait for each other indefinitely.

import threading
import time

lock1 = threading.Lock()
lock2 = threading.Lock()

def thread1_func():
    with lock1:
        time.sleep(0.1)
        with lock2:  # Waits for lock2
            pass

def thread2_func():
    with lock2:
        time.sleep(0.1)
        with lock1:  # Waits for lock1 (deadlock!)
            pass

# This will deadlock
t1 = threading.Thread(target=thread1_func)
t2 = threading.Thread(target=thread2_func)
t1.start()
t2.start()

Prevention: Always acquire locks in the same order.
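Applying that rule to the example above: if both functions take lock1 before lock2, neither thread can hold the lock the other is waiting for, and the circular wait disappears.

```python
import threading
import time

lock1 = threading.Lock()
lock2 = threading.Lock()

def safe_func(name):
    """Both threads acquire lock1 first, then lock2 — no deadlock"""
    with lock1:
        time.sleep(0.1)
        with lock2:
            print(f"{name} done")

t1 = threading.Thread(target=safe_func, args=("thread 1",))
t2 = threading.Thread(target=safe_func, args=("thread 2",))
t1.start()
t2.start()
t1.join()
t2.join()
```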

Best Practices

# ✅ GOOD: Use context managers for locks
with lock:
    # Critical section
    pass

# ✅ GOOD: Use thread-safe data structures
from queue import Queue
queue = Queue()  # Thread-safe

# ✅ GOOD: Use thread pools for many tasks
from concurrent.futures import ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=4) as executor:
    executor.map(function, data)

# ❌ BAD: Manual lock management (easy to miss a release path)
lock.acquire()
try:
    pass  # Critical section
finally:
    lock.release()

# ❌ BAD: Sharing mutable objects without synchronization
shared_list = []  # Dangerous!
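The queue.Queue from the list above ties these practices together: it handles its own locking, so a producer/consumer pipeline needs no explicit lock. A minimal single-worker sketch:

```python
import threading
from queue import Queue

tasks = Queue()
results = []

def worker():
    """Pull items until the None sentinel arrives"""
    while True:
        item = tasks.get()
        if item is None:
            break
        results.append(item * item)
        tasks.task_done()

t = threading.Thread(target=worker)
t.start()

for i in range(5):
    tasks.put(i)
tasks.put(None)  # Signal the worker to stop

t.join()
print(results)  # [0, 1, 4, 9, 16]
```

With a single worker the queue's FIFO order makes the result order deterministic; with several workers you would collect results through a second queue instead of a shared list.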

Choosing Between Threading and Multiprocessing

Is your task I/O-bound?
├─ Yes → Use Threading
│        (Network, disk, database operations)
│
└─ No (CPU-bound)
   └─ Use Multiprocessing
      (Heavy computation, data processing)

Decision Tree

# 1. Identify task type
if task_is_cpu_intensive:  # Computation, data processing
    use_multiprocessing()
elif task_involves_io:  # Network, disk, database
    use_threading()  # or use_asyncio() when juggling very many I/O operations

# 2. Consider constraints
if memory_limited:
    prefer_threading()
elif need_true_parallelism:
    prefer_multiprocessing()

# 3. Measure performance
profile_both_approaches()
choose_faster_one()

Conclusion

Threading and multiprocessing are both valuable tools for concurrent programming, but they solve different problems:

  • Use threading for I/O-bound tasks (network requests, file operations, database queries). It’s lightweight, fast, and efficient for waiting.

  • Use multiprocessing for CPU-bound tasks (heavy computation, data processing). It provides true parallelism and bypasses Python’s GIL.

Key takeaways:

  1. Understand the GIL: Threading doesn’t provide true parallelism for CPU-bound tasks in Python
  2. Match the tool to the problem: I/O-bound → threading, CPU-bound → multiprocessing
  3. Synchronize carefully: Use locks, queues, and thread-safe data structures
  4. Measure performance: Profile both approaches and choose the faster one
  5. Start simple: Use thread/process pools before building complex synchronization logic
  6. Consider alternatives: For many I/O operations, asyncio might be better than threading

Concurrent programming is powerful but complex. Start with the simplest approach that solves your problem, measure its performance, and optimize based on data.

Happy concurrent coding!
