Threading vs Multiprocessing: Master Concurrent Programming in Python
Modern applications need to do multiple things at once. Download files while responding to user input. Process data while serving web requests. Handle multiple client connections simultaneously. This is concurrent programming, and it’s essential for building responsive, scalable applications.
Python offers two main approaches: threading and multiprocessing. Both allow your program to do multiple things concurrently, but they work differently and suit different problems. Choosing the wrong approach can leave performance on the table or introduce subtle bugs. This guide explains both, their trade-offs, and when to use each.
Why Concurrent Programming Matters
Without concurrency, your program does one thing at a time. If a task takes 10 seconds (like waiting for a network response), your entire program waits. With concurrency, other tasks can run while one waits.
Sequential (No Concurrency):
Task A: ████████████ (10s)
Task B:             ████████████ (10s)
Task C:                         ████████████ (10s)
Total: 30 seconds

Concurrent:
Task A: ████████████
Task B:  ████████████
Task C:   ████████████
Total: ~12 seconds (mostly overlapped)
This is why concurrent programming is crucial for modern applications.
Threading Explained
What Are Threads?
Threads are lightweight execution units within a single process. Multiple threads share the same memory space and can access the same variables.
Single Process with Multiple Threads:
┌─────────────────────────────────────┐
│            Single Process           │
├─────────────────────────────────────┤
│  Thread 1  │  Thread 2  │  Thread 3 │
│                                     │
│         Shared Memory Space         │
│      (Global variables, heap)       │
└─────────────────────────────────────┘
Basic Threading Example
import threading
import time

def worker(name, duration):
    """Function to run in a thread"""
    print(f"{name} starting")
    time.sleep(duration)
    print(f"{name} finished")

# Create threads
thread1 = threading.Thread(target=worker, args=("Thread 1", 2))
thread2 = threading.Thread(target=worker, args=("Thread 2", 3))

# Start threads
thread1.start()
thread2.start()

# Wait for threads to complete
thread1.join()
thread2.join()

print("All threads completed")
Output (the exact interleaving may vary between runs):
Thread 1 starting
Thread 2 starting
Thread 1 finished
Thread 2 finished
All threads completed
Shared Memory and Race Conditions
Threads share memory, which is convenient but dangerous. Multiple threads accessing the same variable can cause race conditions.
import threading

counter = 0

def increment():
    """Increment counter 100,000 times"""
    global counter
    for _ in range(100000):
        counter += 1

# Create threads
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=increment)

# Run threads
thread1.start()
thread2.start()
thread1.join()
thread2.join()

print(f"Counter: {counter}")
# Expected: 200,000
# Actual: often less than 200,000 (varies each run due to the race condition)
The problem: both threads read, modify, and write counter simultaneously, causing lost updates.
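One way to see the window where a thread switch can hurt is to disassemble the increment with the standard `dis` module: `counter += 1` compiles to separate load, add, and store instructions, and preemption can occur between any of them (a quick sketch; the exact opcode names vary across Python versions):

```python
import dis

counter = 0

def increment_once():
    global counter
    counter += 1

# List the bytecode ops: the load, the add, and the store are
# separate instructions, so a thread can be preempted mid-increment.
ops = [instr.opname for instr in dis.Bytecode(increment_once)]
print(ops)
```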
Thread Synchronization
Use locks to prevent race conditions:
import threading

counter = 0
lock = threading.Lock()

def increment():
    """Increment counter safely"""
    global counter
    for _ in range(100000):
        with lock:  # Acquire lock
            counter += 1
        # Lock automatically released

# Create and run threads
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=increment)
thread1.start()
thread2.start()
thread1.join()
thread2.join()

print(f"Counter: {counter}")  # Correctly prints 200,000
Multiprocessing Explained
What Is Multiprocessing?
Multiprocessing creates separate processes, each with its own Python interpreter and memory space. Processes don’t share memory directly.
Multiple Processes:
┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│    Process 1     │  │    Process 2     │  │    Process 3     │
├──────────────────┤  ├──────────────────┤  ├──────────────────┤
│  Memory Space 1  │  │  Memory Space 2  │  │  Memory Space 3  │
│    (Isolated)    │  │    (Isolated)    │  │    (Isolated)    │
└──────────────────┘  └──────────────────┘  └──────────────────┘
Basic Multiprocessing Example
import multiprocessing
import time

def worker(name, duration):
    """Function to run in a process"""
    print(f"{name} starting")
    time.sleep(duration)
    print(f"{name} finished")

if __name__ == '__main__':
    # Create processes
    process1 = multiprocessing.Process(target=worker, args=("Process 1", 2))
    process2 = multiprocessing.Process(target=worker, args=("Process 2", 3))

    # Start processes
    process1.start()
    process2.start()

    # Wait for processes to complete
    process1.join()
    process2.join()

    print("All processes completed")
Note: The if __name__ == '__main__': guard is required on platforms that use the spawn start method (Windows and, since Python 3.8, macOS) and recommended on all platforms.
Process Pools
For managing multiple processes efficiently, use process pools:
import multiprocessing

def square(x):
    """Calculate square of x"""
    return x ** 2

if __name__ == '__main__':
    # Create pool with 4 worker processes
    with multiprocessing.Pool(processes=4) as pool:
        # Map function across data
        results = pool.map(square, range(10))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Inter-Process Communication
Processes need explicit communication mechanisms:
import multiprocessing

def producer(queue):
    """Put items in queue"""
    for i in range(5):
        queue.put(f"Item {i}")
    queue.put(None)  # Signal end

def consumer(queue):
    """Get items from queue"""
    while True:
        item = queue.get()
        if item is None:
            break
        print(f"Received: {item}")

if __name__ == '__main__':
    queue = multiprocessing.Queue()

    # Create producer and consumer processes
    p1 = multiprocessing.Process(target=producer, args=(queue,))
    p2 = multiprocessing.Process(target=consumer, args=(queue,))

    p1.start()
    p2.start()
    p1.join()
    p2.join()
Key Differences
Memory Usage
import threading
import multiprocessing

# Threading: lightweight
thread = threading.Thread(target=lambda: None)
# ~8 KB per thread

# Multiprocessing: heavy
process = multiprocessing.Process(target=lambda: None)
# ~30-50 MB per process (full Python interpreter)
CPU Utilization
Threading: Limited by Python’s Global Interpreter Lock (GIL). Only one thread executes Python bytecode at a time, even on multi-core systems.
Multiprocessing: True parallelism. Each process has its own GIL, so multiple processes can execute Python code simultaneously on multiple cores.
Communication Overhead
Threading: Fast (shared memory, just use locks)
Multiprocessing: Slow (requires serialization/deserialization of data)
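That overhead is easy to demonstrate with the standard `pickle` module, which is what multiprocessing uses under the hood to ship objects between processes (a rough sketch; exact timings depend on the machine):

```python
import pickle
import time

# A moderately large payload to "send" to another process
data = list(range(1_000_000))

start = time.perf_counter()
payload = pickle.dumps(data)      # serialize (done automatically by Queue/Pool)
restored = pickle.loads(payload)  # deserialize on the receiving side
elapsed = time.perf_counter() - start

print(f"Round trip: {elapsed:.3f}s, {len(payload):,} bytes")
```

A thread sharing the same list pays none of this cost; it just reads the object in place.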
Comparison Table
| Aspect | Threading | Multiprocessing |
|---|---|---|
| Memory per unit | ~8 KB | ~30-50 MB |
| Creation time | Fast (ms) | Slow (100s of ms) |
| Communication | Fast (shared memory) | Slow (serialization) |
| CPU parallelism | Limited (GIL) | True parallelism |
| Synchronization | Complex (locks, race conditions) | Simple (isolated memory) |
| Debugging | Difficult (shared state) | Easier (isolated state) |
Use Cases: When to Use Each
Threading: I/O-Bound Tasks
Use threading when your program waits for I/O (network, disk, database).
import threading
import requests
import time

def fetch_url(url):
    """Fetch URL and print response time"""
    start = time.time()
    response = requests.get(url)
    elapsed = time.time() - start
    print(f"{url}: {elapsed:.2f}s")

# Sequential: ~6 seconds (3 URLs × 2 seconds each)
start = time.time()
for url in ['http://example.com', 'http://example.com', 'http://example.com']:
    fetch_url(url)
print(f"Sequential: {time.time() - start:.2f}s")

# Threading: ~2 seconds (all requests overlap)
start = time.time()
threads = []
for url in ['http://example.com', 'http://example.com', 'http://example.com']:
    t = threading.Thread(target=fetch_url, args=(url,))
    threads.append(t)
    t.start()
for t in threads:
    t.join()
print(f"Threading: {time.time() - start:.2f}s")
Multiprocessing: CPU-Bound Tasks
Use multiprocessing when your program performs heavy computation.
import multiprocessing
import time

def cpu_intensive(n):
    """Perform CPU-intensive calculation"""
    result = 0
    for i in range(n):
        result += i ** 2
    return result

if __name__ == '__main__':
    # Sequential: ~4 seconds
    start = time.time()
    for _ in range(4):
        cpu_intensive(50000000)
    print(f"Sequential: {time.time() - start:.2f}s")

    # Multiprocessing: ~1 second (on a 4-core system)
    start = time.time()
    with multiprocessing.Pool(processes=4) as pool:
        pool.map(cpu_intensive, [50000000] * 4)
    print(f"Multiprocessing: {time.time() - start:.2f}s")
Advantages and Disadvantages
Threading
Advantages:
- Lightweight (low memory overhead)
- Fast creation and context switching
- Efficient for I/O-bound tasks
- Easy data sharing (shared memory)
- Good for responsive UIs
Disadvantages:
- Limited by GIL (no true parallelism for CPU-bound tasks)
- Complex synchronization (race conditions, deadlocks)
- Difficult to debug (shared state)
- One thread crash can crash entire process
Multiprocessing
Advantages:
- True parallelism (bypasses GIL)
- Excellent for CPU-bound tasks
- Isolated memory (fewer synchronization issues)
- One process crash doesn’t affect others
- Easier to reason about (no shared state)
Disadvantages:
- Heavy memory overhead
- Slow creation and context switching
- Expensive inter-process communication
- Data serialization overhead
- More complex to implement
Practical Considerations
The Global Interpreter Lock (GIL)
Python’s GIL prevents multiple threads from executing Python bytecode simultaneously. This means threading doesn’t provide true parallelism for CPU-bound tasks.
import threading
import time

def cpu_work():
    """CPU-intensive work"""
    total = 0
    for i in range(100000000):
        total += i

# Single thread: ~5 seconds
start = time.time()
cpu_work()
print(f"Single thread: {time.time() - start:.2f}s")

# Two threads: ~10 seconds -- twice the work, but no parallel
# speedup from the second core, because the GIL serializes them
start = time.time()
t1 = threading.Thread(target=cpu_work)
t2 = threading.Thread(target=cpu_work)
t1.start()
t2.start()
t1.join()
t2.join()
print(f"Two threads: {time.time() - start:.2f}s")
Deadlocks
Deadlocks occur when threads wait for each other indefinitely.
import threading
import time

lock1 = threading.Lock()
lock2 = threading.Lock()

def thread1_func():
    with lock1:
        time.sleep(0.1)
        with lock2:  # Waits for lock2
            pass

def thread2_func():
    with lock2:
        time.sleep(0.1)
        with lock1:  # Waits for lock1 (deadlock!)
            pass

# This will deadlock
t1 = threading.Thread(target=thread1_func)
t2 = threading.Thread(target=thread2_func)
t1.start()
t2.start()
Prevention: Always acquire locks in the same order.
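A sketch of that rule in practice: a small helper that sorts locks into one fixed global order before acquiring them (ordering by `id()` is one convention among several; any stable ordering works):

```python
import threading

lock1 = threading.Lock()
lock2 = threading.Lock()
done = []

def acquire_in_order(*locks):
    """Sort locks into one global order (here: by id) before acquiring."""
    return sorted(locks, key=id)

def worker():
    # Every thread acquires the two locks in the same order,
    # so no thread can hold one while waiting on the other.
    a, b = acquire_in_order(lock1, lock2)
    with a:
        with b:
            done.append(True)  # critical section

t1 = threading.Thread(target=worker)
t2 = threading.Thread(target=worker)
t1.start(); t2.start()
t1.join(); t2.join()
print(f"Completed: {len(done)} workers, no deadlock")
```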
Best Practices
# ✅ GOOD: Use context managers for locks
with lock:
    ...  # Critical section

# ✅ GOOD: Use thread-safe data structures
from queue import Queue
queue = Queue()  # Thread-safe

# ✅ GOOD: Use thread pools for many tasks
from concurrent.futures import ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=4) as executor:
    executor.map(function, data)

# ❌ BAD: Manual lock management
lock.acquire()
try:
    ...  # Critical section
finally:
    lock.release()

# ❌ BAD: Sharing mutable objects without synchronization
shared_list = []  # Dangerous!
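Tying the pool advice together, here is a minimal self-contained ThreadPoolExecutor example (`simulated_io` is a stand-in for a real network or disk call):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def simulated_io(task_id):
    """Stand-in for a network or disk operation."""
    time.sleep(0.1)  # the GIL is released while waiting
    return task_id * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as executor:
    # map preserves input order and runs up to 4 tasks at a time
    results = list(executor.map(simulated_io, range(8)))
elapsed = time.perf_counter() - start

print(results)            # [0, 2, 4, 6, 8, 10, 12, 14]
print(f"{elapsed:.2f}s")  # ~0.2s: 8 tasks, 4 at a time
```

The executor handles thread creation, reuse, and cleanup, so there is no manual `start()`/`join()` bookkeeping at all.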
Choosing Between Threading and Multiprocessing
Is your task I/O-bound?
Is your task I/O-bound?
├── Yes → Use Threading
│         (Network, disk, database operations)
│
└── No (CPU-bound)
    └── Use Multiprocessing
        (Heavy computation, data processing)
Decision Tree
# 1. Identify task type
if task_involves_io:  # Network, disk, database
    use_threading()
elif task_is_cpu_intensive:  # Computation, data processing
    use_multiprocessing()
else:
    use_asyncio()  # For very many concurrent I/O operations

# 2. Consider constraints
if memory_limited:
    prefer_threading()
elif need_true_parallelism:
    prefer_multiprocessing()

# 3. Measure performance
profile_both_approaches()
choose_faster_one()
Conclusion
Threading and multiprocessing are both valuable tools for concurrent programming, but they solve different problems:
- Use threading for I/O-bound tasks (network requests, file operations, database queries). It's lightweight, fast, and efficient for waiting.
- Use multiprocessing for CPU-bound tasks (heavy computation, data processing). It provides true parallelism and bypasses Python's GIL.
Key takeaways:
- Understand the GIL: Threading doesn’t provide true parallelism for CPU-bound tasks in Python
- Match the tool to the problem: I/O-bound → threading, CPU-bound → multiprocessing
- Synchronize carefully: Use locks, queues, and thread-safe data structures
- Measure performance: Profile both approaches and choose the faster one
- Start simple: Use thread/process pools before building complex synchronization logic
- Consider alternatives: For many concurrent I/O operations, asyncio might be better than threading
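As a taste of that alternative, here is the same overlapped-waiting idea in asyncio, with `asyncio.sleep` standing in for a real network call:

```python
import asyncio
import time

async def fake_fetch(name):
    """asyncio.sleep stands in for awaiting a network response."""
    await asyncio.sleep(0.1)
    return name

async def main():
    # All three "requests" wait concurrently in a single thread
    return await asyncio.gather(*(fake_fetch(f"url{i}") for i in range(3)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(results)            # ['url0', 'url1', 'url2']
print(f"{elapsed:.2f}s")  # ~0.1s, not 0.3s
```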
Concurrent programming is powerful but complex. Start with the simplest approach that solves your problem, measure its performance, and optimize based on data.
Happy concurrent coding!