Python Debugging: Mastering pdb and Print Statements
Introduction
You’ve written what you think is perfect code. You run it, and something goes wrong. The error message is cryptic. You have no idea where the problem is. This is where debugging comes in.
Debugging is the process of finding and fixing errors in your code. It’s one of the most important skills in programming, yet many developers rely on guesswork and random changes rather than systematic debugging techniques.
Python offers two primary debugging approaches: print statements and the pdb debugger. Print statements are simple and immediate: you add a line of code and see what’s happening. The pdb debugger is more powerful: it lets you pause execution, inspect variables, and step through code line by line.
Neither approach is universally better. The best debugger is the one you use effectively. In this guide, we’ll explore both techniques in depth, show you when to use each, and teach you how to combine them for maximum debugging power. By the end, you’ll be able to tackle even the trickiest bugs with confidence.
Part 1: Understanding Debugging
Why Debugging Matters
Debugging isn’t just about fixing bugs; it’s about understanding your code. When you debug, you:
- Understand program flow: See exactly what your code is doing
- Identify assumptions: Discover where your mental model of the code differs from reality
- Learn patterns: Recognize common mistakes and how to avoid them
- Build confidence: Know your code works because you’ve verified it
The Debugging Mindset
Effective debugging requires a systematic approach:
- Reproduce the bug: Make sure you can consistently trigger the error
- Form a hypothesis: What do you think is causing the problem?
- Test the hypothesis: Use debugging to verify or refute it
- Isolate the problem: Narrow down where the bug occurs
- Fix and verify: Apply a fix and confirm it works
Part 2: Print Statement Debugging
The Basics
Print statement debugging is the simplest approach: add print() statements to see what’s happening.
def calculate_average(numbers):
    total = sum(numbers)
    print(f"Total: {total}")  # Debug print
    count = len(numbers)
    print(f"Count: {count}")  # Debug print
    average = total / count
    print(f"Average: {average}")  # Debug print
    return average

result = calculate_average([10, 20, 30])
print(f"Result: {result}")
# Output:
# Total: 60
# Count: 3
# Average: 20.0
# Result: 20.0
Strategic Print Placement
Place prints at key points to understand program flow:
def process_user_data(users):
    print(f"Processing {len(users)} users")  # Entry point
    valid_users = []
    for user in users:
        print(f"  Checking user: {user['name']}")  # Loop iteration
        if user['age'] < 18:
            print(f"    Skipping {user['name']} (too young)")  # Conditional
            continue
        valid_users.append(user)
        print(f"    Added {user['name']}")  # Success case
    print(f"Valid users: {len(valid_users)}")  # Summary
    return valid_users

users = [
    {"name": "Alice", "age": 25},
    {"name": "Bob", "age": 16},
    {"name": "Charlie", "age": 30}
]
result = process_user_data(users)
Using f-strings for Clear Output
# Good: Clear, informative output
x = 42
y = 10
print(f"x={x}, y={y}, x+y={x+y}")
# Less clear: Hard to parse
print(x, y, x+y)
# Verbose: Too much output
print("The value of x is", x, "and the value of y is", y)
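On Python 3.8 and later, f-strings also accept a trailing `=` inside the braces, which prints the expression text together with its value. A minimal sketch, reusing the `x` and `y` from above (the `line` variable is only there for illustration):

```python
# Requires Python 3.8+ for the "=" specifier in f-strings.
x = 42
y = 10

# Each {expr=} expands to "expr=<repr of value>"
line = f"{x=}, {y=}, {x+y=}"
print(line)  # x=42, y=10, x+y=52
```

This saves retyping the variable name in the label, so the label cannot drift out of sync with the expression being printed.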
Debugging Collections
def process_data(data):
    print(f"Input data: {data}")
    print(f"Data type: {type(data)}")
    print(f"Data length: {len(data)}")
    # For dictionaries
    if isinstance(data, dict):
        print(f"Keys: {list(data.keys())}")
        for key, value in data.items():
            print(f"  {key}: {value} (type: {type(value).__name__})")
    # For lists
    elif isinstance(data, list):
        print(f"First item: {data[0] if data else 'empty'}")
        print(f"Last item: {data[-1] if data else 'empty'}")
        for i, item in enumerate(data):
            print(f"  [{i}]: {item}")

data = {"name": "Alice", "scores": [85, 90, 88]}
process_data(data)
Pretty Printing Complex Data
import json
from pprint import pprint
data = {
"users": [
{"id": 1, "name": "Alice", "email": "[email protected]"},
{"id": 2, "name": "Bob", "email": "[email protected]"}
],
"metadata": {"total": 2, "page": 1}
}
# Using pprint for better formatting
print("Using pprint:")
pprint(data)
# Using json for even better formatting
print("\nUsing json:")
print(json.dumps(data, indent=2))
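One caveat: `json.dumps` raises `TypeError` for values it cannot serialize, such as dates or sets. A possible workaround, sketched here with a hypothetical `dump` helper, is to pass `default=str` and fall back to `pprint.pformat` when JSON still cannot represent the object:

```python
import json
from datetime import date
from pprint import pformat

def dump(obj):
    """Return a readable string for obj (hypothetical helper for this guide)."""
    try:
        # default=str converts otherwise unserializable values via str()
        return json.dumps(obj, indent=2, sort_keys=True, default=str)
    except (TypeError, ValueError):
        # e.g. tuple dictionary keys or circular references
        return pformat(obj)

record = {"name": "Alice", "joined": date(2024, 1, 15), "tags": {"admin"}}
print(dump(record))
print(dump({(1, 2): "tuple keys are not valid JSON keys"}))
```

The output is meant for human eyes only; values converted with `str()` will not round-trip back to their original types.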
Conditional Debugging
DEBUG = True  # Set to False to disable debug output

def calculate(x, y):
    if DEBUG:
        print(f"calculate({x}, {y})")
    result = x + y
    if DEBUG:
        print(f"  result = {result}")
    return result

# Or use a debug function
def debug_print(*args, **kwargs):
    if DEBUG:
        print(*args, **kwargs)

def process(data):
    debug_print(f"Processing: {data}")
    # ... rest of code
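Editing a `DEBUG` constant still means touching the source. One variation, sketched below, reads the switch from an environment variable and sends debug output to stderr so it stays separate from the program's real output. The variable name `APP_DEBUG` and the `debug_enabled` helper are assumptions made for this example:

```python
import os
import sys

def debug_enabled():
    # APP_DEBUG is a hypothetical variable name chosen for this sketch
    return os.environ.get("APP_DEBUG") == "1"

def debug_print(*args, **kwargs):
    """Print to stderr, and only when APP_DEBUG=1 is set."""
    if debug_enabled():
        kwargs.setdefault("file", sys.stderr)
        print("[debug]", *args, **kwargs)

def calculate(x, y):
    debug_print(f"calculate({x}, {y})")
    result = x + y
    debug_print(f"  result = {result}")
    return result

print(calculate(2, 3))
```

Running the script as `APP_DEBUG=1 python script.py` would then turn the debug lines on without any code change.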
Logging Instead of Print
For production code, use logging instead of print:
import logging

# Configure logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def calculate_average(numbers):
    logger.debug(f"Calculating average of {numbers}")
    total = sum(numbers)
    logger.debug(f"Total: {total}")
    count = len(numbers)
    logger.debug(f"Count: {count}")
    average = total / count
    logger.debug(f"Average: {average}")
    return average

result = calculate_average([10, 20, 30])
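The practical gain over print is that verbosity becomes a setting rather than an edit. The sketch below captures log output in an in-memory stream so the effect of changing the level is visible. The logger name `averages` and the format string are arbitrary choices for this example:

```python
import io
import logging

logger = logging.getLogger("averages")  # hypothetical logger name

def calculate_average(numbers):
    # %s-style arguments are only formatted if the record is actually emitted
    logger.debug("Calculating average of %s", numbers)
    average = sum(numbers) / len(numbers)
    logger.debug("Average: %s", average)
    return average

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

logger.setLevel(logging.DEBUG)    # verbose while investigating
calculate_average([10, 20, 30])

logger.setLevel(logging.WARNING)  # quiet again, with no code removed
calculate_average([1, 2, 3])

print(stream.getvalue())
```

Only the first call leaves records in the stream; the second call runs the same code but its debug messages are filtered out by the level.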
Limitations of Print Debugging
# Problem 1: Cluttered output
def complex_function(data):
    print(f"Start: {data}")
    for item in data:
        print(f"Processing: {item}")
        result = item * 2
        print(f"Result: {result}")
    print("Done")
# Output is hard to follow

# Problem 2: Can't inspect state at specific points
def buggy_function(x):
    y = x * 2
    # Can't easily check what y is without adding print
    z = y + 10
    return z

# Problem 3: Requires code changes
# You must modify your code to add prints, then remove them later

# Problem 4: Can't easily change debugging level
# You have to manually add/remove print statements
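One way to soften Problem 3 is to keep the prints out of the function body altogether. The `trace` decorator below is a hypothetical helper, not part of the standard library. It reports each call's arguments and return value, and removing it is a one-line change:

```python
import functools

def trace(func):
    """Print each call's arguments and return value (hypothetical helper)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        shown = [repr(a) for a in args] + [f"{k}={v!r}" for k, v in kwargs.items()]
        print(f"-> {func.__name__}({', '.join(shown)})")
        result = func(*args, **kwargs)
        print(f"<- {func.__name__} returned {result!r}")
        return result
    return wrapper

@trace
def buggy_function(x):
    y = x * 2
    z = y + 10
    return z

buggy_function(5)
```

This only shows what goes in and what comes out. Intermediate values such as `y` still need a print or a debugger.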
Part 3: PDB Debugger
Getting Started with PDB
The Python Debugger (pdb) lets you pause execution and inspect your code interactively.
def calculate_average(numbers):
    total = sum(numbers)
    count = len(numbers)
    average = total / count
    return average

# Add breakpoint
import pdb; pdb.set_trace()
result = calculate_average([10, 20, 30])
print(result)
When you run this (saved here as script.py), execution pauses at the breakpoint, and you get an interactive prompt similar to the following. The exact path and line number depend on your file and Python version:
> script.py(9)<module>()
-> result = calculate_average([10, 20, 30])
(Pdb)
Python 3.7+ breakpoint() Function
def calculate_average(numbers):
    total = sum(numbers)
    count = len(numbers)
    breakpoint()  # Cleaner than pdb.set_trace()
    average = total / count
    return average

result = calculate_average([10, 20, 30])
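Under the hood, `breakpoint()` calls `sys.breakpointhook()`, whose default implementation opens pdb and honours the `PYTHONBREAKPOINT` environment variable (setting it to `0` turns every `breakpoint()` call into a no-op). The sketch below swaps in a stand-in hook, named `recording_hook` for this example, to show that dispatch without opening an interactive prompt:

```python
import sys

calls = []

def recording_hook(*args, **kwargs):
    """Stand-in hook that records the call instead of opening pdb."""
    calls.append("breakpoint() reached")

def calculate_average(numbers):
    total = sum(numbers)
    count = len(numbers)
    breakpoint()  # dispatches to sys.breakpointhook
    return total / count

original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    result = calculate_average([10, 20, 30])
finally:
    sys.breakpointhook = original_hook  # restore the previous hook

print(result, calls)
```

In day-to-day use you would rarely replace the hook yourself. The environment variable is usually enough, for example `PYTHONBREAKPOINT=0 python script.py` to run without pausing.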
Essential PDB Commands
l (list) - Show Code
(Pdb) l
  1     def calculate_average(numbers):
  2         total = sum(numbers)
  3         count = len(numbers)
  4  ->     average = total / count
  5         return average
n (next) - Execute Next Line
(Pdb) n
> script.py(5)calculate_average()
-> return average
s (step) - Step Into Function
(Pdb) s
# Steps into function calls, unlike 'n' which steps over them
c (continue) - Resume Execution
(Pdb) c
# Resumes execution until next breakpoint or program end
p (print) - Print Variable
(Pdb) p numbers
[10, 20, 30]
(Pdb) p total
60
(Pdb) p average
20.0
pp (pretty print) - Pretty Print
(Pdb) pp {"name": "Alice", "scores": [85, 90, 88]}
{'name': 'Alice', 'scores': [85, 90, 88]}
b (break) - Set Breakpoint
(Pdb) b 10 # Set breakpoint at line 10
(Pdb) b function_name # Set breakpoint at function
(Pdb) b # List all breakpoints
(Pdb) cl 1 # Clear breakpoint 1
w (where) - Show Stack Trace
(Pdb) w
  script.py(10)<module>()
-> result = calculate_average([10, 20, 30])
> script.py(5)calculate_average()
-> average = total / count
u (up) / d (down) - Navigate Stack
(Pdb) u # Go up one level in stack
(Pdb) d # Go down one level in stack
h (help) - Get Help
(Pdb) h
# Shows list of commands
(Pdb) h n
# Shows help for 'n' command
Practical PDB Workflow
def find_user(users, user_id):
    """Find user by ID"""
    for user in users:
        if user['id'] == user_id:
            return user
    return None

def process_users(users):
    """Process user data"""
    results = []
    for user in users:
        found = find_user(users, user['id'])
        if found:
            results.append(found)
    return results

users = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Charlie"}
]
breakpoint()  # Start debugging here
result = process_users(users)
print(result)
Debugging session:
(Pdb) l # See code
(Pdb) n # Step to next line
(Pdb) p users # Check users variable
(Pdb) s # Step into process_users
(Pdb) n # Step through loop
(Pdb) p user # Check current user
(Pdb) s # Step into find_user
(Pdb) p user_id # Check parameter
(Pdb) c # Continue to next breakpoint
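To try pdb commands without typing them, a session can also be scripted. The sketch below is an illustration rather than a recommended workflow. It builds a `pdb.Pdb` instance that reads its commands from a string and writes to an in-memory buffer. The names `commands`, `output`, and `debugger` are chosen for this example:

```python
import io
import pdb

# Commands pdb will "type" for us: print two variables, then continue
commands = io.StringIO("p total\np count\nc\n")
output = io.StringIO()
# readrc=False skips any ~/.pdbrc; nosigint=True avoids installing a SIGINT handler
debugger = pdb.Pdb(stdin=commands, stdout=output, readrc=False, nosigint=True)

def calculate_average(numbers):
    total = sum(numbers)
    count = len(numbers)
    debugger.set_trace()  # pause here and run the scripted commands
    average = total / count
    return average

result = calculate_average([10, 20, 30])
print(output.getvalue())  # the transcript pdb would have shown on screen
print(result)
```

The captured transcript contains the same prompts and values you would see interactively, which makes this a low-risk way to experiment with commands such as `p`, `n`, and `w`.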
Conditional Breakpoints
def process_items(items):
    for i, item in enumerate(items):
        breakpoint()  # Breaks every iteration
        print(f"Processing {item}")

# Better: Break only on specific condition
def process_items(items):
    for i, item in enumerate(items):
        if item > 50:  # Only break for large items
            breakpoint()
        print(f"Processing {item}")
Post-Mortem Debugging
Debug after a crash:
import pdb
import traceback

def buggy_function():
    x = 10
    y = 0
    return x / y  # Will crash

try:
    buggy_function()
except Exception:
    traceback.print_exc()
    pdb.post_mortem()  # Start debugger at crash point
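Post-mortem debugging works because the exception's traceback keeps a reference to every frame that was active at the crash. The sketch below walks that traceback by hand to show the state pdb would let you inspect. The `crash_locals` helper is hypothetical and exists only for this illustration:

```python
def buggy_function():
    x = 10
    y = 0
    return x / y  # raises ZeroDivisionError

def crash_locals(exc):
    """Return (function name, locals) of the frame where exc was raised."""
    tb = exc.__traceback__
    while tb.tb_next is not None:  # walk to the innermost frame
        tb = tb.tb_next
    return tb.tb_frame.f_code.co_name, dict(tb.tb_frame.f_locals)

try:
    buggy_function()
except ZeroDivisionError as exc:
    name, local_vars = crash_locals(exc)

print(name, local_vars)  # buggy_function {'x': 10, 'y': 0}
```

In a post-mortem pdb session, `p x` and `p y` read from exactly this frame, and `u` moves toward the outer frames of the same traceback.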
Advanced: Debugging in Loops
def find_bug(data):
    for i, item in enumerate(data):
        if item < 0:
            breakpoint()  # Break when condition is true
            print(f"Found negative at index {i}: {item}")

find_bug([1, 2, -3, 4, -5])
Advantages of PDB
# Advantage 1: Minimal code changes
# Add a single breakpoint() line, or run python -m pdb script.py with no changes at all
# Advantage 2: Inspect any variable at any time
(Pdb) p some_variable
(Pdb) p len(some_list)
(Pdb) p some_dict.keys()
# Advantage 3: Execute arbitrary Python code
(Pdb) p [x*2 for x in range(5)]
[0, 2, 4, 6, 8]
# Advantage 4: Step through code line by line
(Pdb) n # Next line
(Pdb) s # Step into function
(Pdb) c # Continue
# Advantage 5: Set breakpoints dynamically
(Pdb) b 42 # Break at line 42
(Pdb) b function_name # Break at function
Part 4: Comparing Print Statements and PDB
When to Use Print Statements
Use print statements when:
- Quick, simple debugging
  # Just want to see if a value is what you expect
  print(f"x = {x}")
- Logging program flow
  # Track what your program is doing
  print("Starting process...")
  print("Processing item 1...")
  print("Done!")
- Debugging in production
  # Can't use interactive debugger in production
  # Use logging instead of print
  logger.debug(f"Processing user {user_id}")
- Simple scripts
  # For small scripts, print is often sufficient
- Distributed systems
  # When debugging across multiple processes/machines
  # Print to logs that can be aggregated
When to Use PDB
Use pdb when:
- Complex bugs
  # Need to understand program state at specific points
  # Need to step through code line by line
- Unexpected behavior
  # Program runs but produces wrong results
  # Need to inspect variables at each step
- Debugging functions
  # Need to step into function calls
  # Need to see what parameters are passed
- Conditional bugs
  # Bug only happens under certain conditions
  # Need to inspect state when condition is true
- Execution-flow surprises
  # Need to see which code paths actually run, and how often
  # (pdb does not measure time; use a profiler such as cProfile for timing)
Comparison Table
| Aspect | Print Statements | PDB |
|---|---|---|
| Setup | Add print() calls | Add breakpoint() |
| Interaction | Non-interactive | Interactive |
| Code Changes | Requires modification | Minimal changes |
| Learning Curve | Very easy | Moderate |
| Flexibility | Limited | Very flexible |
| Performance | Minimal overhead | Pauses execution |
| Production Use | Yes (with logging) | No |
| Complex Inspection | Difficult | Easy |
| Stepping Through Code | Not possible | Yes |
| Setting Breakpoints | Manual | Dynamic |
Part 5: Practical Debugging Scenarios
Scenario 1: Finding a Logic Error
Problem: Function returns wrong result
def calculate_discount(price, discount_percent):
    """Calculate discounted price"""
    discount_amount = price * discount_percent
    final_price = price - discount_amount
    return final_price

# Test
result = calculate_discount(100, 10)
print(result)  # Expected: 90, Got: -900 (WRONG!)
Using Print Debugging:
def calculate_discount(price, discount_percent):
    print(f"Input: price={price}, discount_percent={discount_percent}")
    discount_amount = price * discount_percent
    print(f"Discount amount: {discount_amount}")
    final_price = price - discount_amount
    print(f"Final price: {final_price}")
    return final_price

result = calculate_discount(100, 0.1)
# Output shows discount_amount = 10.0, final_price = 90.0
# Wait, that's correct! Let me check the failing call...
# Oh! The function expects a fraction: discount_percent should be 0.1, not 10
Using PDB:
def calculate_discount(price, discount_percent):
    breakpoint()
    discount_amount = price * discount_percent
    final_price = price - discount_amount
    return final_price

# In pdb:
# (Pdb) p price
# 100
# (Pdb) p discount_percent
# 0.1
# (Pdb) n
# (Pdb) p discount_amount
# 10.0
# (Pdb) n
# (Pdb) p final_price
# 90.0
Scenario 2: Debugging a Loop
Problem: Loop produces unexpected results
def sum_positive_numbers(numbers):
    """Sum only positive numbers"""
    total = 0
    for num in numbers:
        if num > 0:
            total = total + num
    return total

result = sum_positive_numbers([1, -2, 3, -4, 5])
print(result)  # Expected: 9, Got: 9 (correct, but let's verify)
Using Print Debugging:
def sum_positive_numbers(numbers):
    total = 0
    for num in numbers:
        print(f"Checking {num}")
        if num > 0:
            print(f"  Adding {num}")
            total = total + num
        else:
            print(f"  Skipping {num}")
    print(f"Final total: {total}")
    return total

result = sum_positive_numbers([1, -2, 3, -4, 5])
Using PDB:
def sum_positive_numbers(numbers):
    total = 0
    for num in numbers:
        breakpoint()  # Break each iteration
        if num > 0:
            total = total + num
    return total

# In pdb:
# (Pdb) p num
# 1
# (Pdb) n
# (Pdb) p total
# 1
# (Pdb) c  # Continue to next iteration
Scenario 3: Debugging Function Calls
Problem: Function returns None unexpectedly
def find_user(users, user_id):
    """Find user by ID"""
    for user in users:
        if user['id'] == user_id:
            return user
    # No match: the function falls off the end and implicitly returns None

def get_user_name(users, user_id):
    """Get user name"""
    user = find_user(users, user_id)
    return user['name']  # Crashes if user is None

users = [{"id": 1, "name": "Alice"}]
name = get_user_name(users, 999)  # User doesn't exist: TypeError: 'NoneType' object is not subscriptable
print(name)  # Never reached
Using PDB:
def find_user(users, user_id):
    for user in users:
        if user['id'] == user_id:
            return user

def get_user_name(users, user_id):
    user = find_user(users, user_id)
    breakpoint()  # Pause here
    return user['name']

# In pdb:
# (Pdb) p user
# None
# (Pdb) p user_id
# 999
# (Pdb) p users
# [{'id': 1, 'name': 'Alice'}]
# Now I see the problem!
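Once the session shows that `user` is `None`, one possible fix is to make the "not found" case explicit in both functions. This is a sketch of one option among several; the `default` parameter is an addition for this example and not part of the original code:

```python
def find_user(users, user_id):
    """Return the matching user dict, or None when no user has that id."""
    for user in users:
        if user["id"] == user_id:
            return user
    return None  # explicit, so the "not found" case is visible to readers

def get_user_name(users, user_id, default=None):
    """Return the user's name, or `default` when the user is missing."""
    user = find_user(users, user_id)
    if user is None:
        return default
    return user["name"]

users = [{"id": 1, "name": "Alice"}]
print(get_user_name(users, 1))           # Alice
print(get_user_name(users, 999, "n/a"))  # n/a
```

Raising a `LookupError` instead of returning a default would be equally reasonable when a missing user should be treated as an error by the caller.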
Scenario 4: Debugging Data Structures
Problem: Dictionary has unexpected structure
def process_config(config):
    """Process configuration"""
    database_host = config['database']['host']
    database_port = config['database']['port']
    return f"{database_host}:{database_port}"

config = {
    'database': {
        'host': 'localhost'
        # Missing 'port'!
    }
}
result = process_config(config)  # KeyError: 'port'
Using Print Debugging:
def process_config(config):
    print(f"Config: {config}")
    print(f"Database config: {config.get('database')}")
    print(f"Keys in database: {config['database'].keys()}")
    database_host = config['database']['host']
    database_port = config['database']['port']
    return f"{database_host}:{database_port}"

# Output shows 'port' is missing
Using PDB:
def process_config(config):
    breakpoint()
    database_host = config['database']['host']
    database_port = config['database']['port']
    return f"{database_host}:{database_port}"

# In pdb:
# (Pdb) pp config
# {'database': {'host': 'localhost'}}
# (Pdb) p config['database'].keys()
# dict_keys(['host'])
# Missing 'port'!
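With the missing key identified, the function can be made to fail with a clearer message or to fall back to a default. The sketch below does both. The `default_port` parameter and its value `5432` are assumptions made for illustration, not something the original configuration specifies:

```python
def process_config(config, default_port=5432):
    """Build "host:port"; default_port is a hypothetical fallback value."""
    database = config.get("database")
    if database is None or "host" not in database:
        # Fail early with a message that names the missing setting
        raise ValueError(f"config is missing database.host: {config!r}")
    port = database.get("port", default_port)
    return f"{database['host']}:{port}"

config = {"database": {"host": "localhost"}}
print(process_config(config))  # localhost:5432
```

Whether a silent default is appropriate depends on the setting. For something like a database port, an explicit error may be the safer choice.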
Part 6: Best Practices and Tips
Combining Print and PDB
# Use print for high-level flow
print("Starting data processing...")
# Use pdb for detailed inspection
def process_data(data):
    breakpoint()  # Pause here for detailed inspection
    # ... process data

print("Data processing complete")
Debugging Strategies
1. Narrow Down the Problem
# Start with a broad breakpoint
def complex_function(a, b, c):
    breakpoint()  # Pause at start
    result1 = step1(a, b)
    result2 = step2(result1, c)
    result3 = step3(result2)
    return result3

# Once you know where the problem is, move the breakpoint
def complex_function(a, b, c):
    result1 = step1(a, b)
    breakpoint()  # Pause here instead
    result2 = step2(result1, c)
    result3 = step3(result2)
    return result3
2. Use Assertions
def calculate_average(numbers):
    assert numbers, "Numbers list cannot be empty"
    assert all(isinstance(n, (int, float)) for n in numbers), "All items must be numbers"
    total = sum(numbers)
    average = total / len(numbers)
    return average

# Assertions help catch bugs early
calculate_average([])  # AssertionError: Numbers list cannot be empty
3. Create Minimal Reproducible Examples
# Instead of debugging the entire application
# Create a small script that reproduces the bug
# buggy_code.py
def buggy_function(x):
    return x * 2 + 1

result = buggy_function(5)
print(result)  # Check if this is correct
# Now debug just this small example
4. Use Type Hints and Type Checking
def calculate_average(numbers: list[float]) -> float:
    """Calculate average of numbers"""
    if not numbers:
        raise ValueError("Numbers list cannot be empty")
    return sum(numbers) / len(numbers)

# Type hints help catch bugs before runtime
# Use mypy for static type checking
Common Pitfalls
Pitfall 1: Forgetting to Remove Debug Code
# Bad: Debug code left in production
def process_data(data):
    breakpoint()  # Oops! This will pause in production
    return data

# Good: Use conditional debugging
DEBUG = False

def process_data(data):
    if DEBUG:
        breakpoint()
    return data
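A flag still relies on someone remembering to flip it. A complementary safeguard is an automated check that fails when debugger calls are left in the source. The sketch below uses the standard `ast` module. The function name `find_debug_calls` is hypothetical, and it only recognises the two spellings shown in this guide:

```python
import ast

def find_debug_calls(source):
    """Return line numbers of breakpoint() and pdb.set_trace() calls in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id == "breakpoint":
            hits.append(node.lineno)
        elif (isinstance(func, ast.Attribute) and func.attr == "set_trace"
              and isinstance(func.value, ast.Name) and func.value.id == "pdb"):
            hits.append(node.lineno)
    return sorted(hits)

sample = '''
def process_data(data):
    breakpoint()
    import pdb; pdb.set_trace()
    return data
'''
print(find_debug_calls(sample))  # [3, 4]
```

A check like this could run in a test suite or a pre-commit step. Because it parses the code rather than searching text, a `breakpoint()` mentioned inside a comment or string is not flagged.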
Pitfall 2: Print Statements Interfering with Output
# Bad: Debug prints mixed with actual output
def get_data():
    print("Getting data...")  # Debug print
    data = fetch_from_api()
    print(f"Data: {data}")  # Debug print
    return data

# Good: Use logging instead
import logging
logger = logging.getLogger(__name__)

def get_data():
    logger.debug("Getting data...")
    data = fetch_from_api()
    logger.debug(f"Data: {data}")
    return data
Pitfall 3: Not Understanding Variable Scope
# Confusing: Variable scope issues
x = 10

def modify_x():
    x = 20  # Creates local variable, doesn't modify global
    print(x)  # Prints 20

modify_x()
print(x)  # Prints 10 (global x unchanged)

# Use pdb to understand scope
def modify_x():
    x = 20
    breakpoint()  # Pause after the assignment so the local x exists
    # (Pdb) p x               # 20 (the local x)
    # (Pdb) p globals()['x']  # 10 (the global x is unchanged)
Pitfall 4: Debugging the Wrong Thing
# Problem: You think the bug is in function A
# But it's actually in function B
# Solution: Use pdb to trace execution
def function_a():
    result = function_b()
    breakpoint()  # Check result here
    return result

def function_b():
    # Bug is actually here
    return wrong_value
Part 7: Advanced Debugging Techniques
Debugging Async Code
import asyncio

async def fetch_data():
    breakpoint()  # Works with async too
    data = await get_from_api()
    return data

# Start the coroutine with asyncio.run(fetch_data()) and run the file as usual: python script.py
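For a version that runs end to end, the sketch below replaces the API call with a stand-in coroutine and enables asyncio's debug mode, which logs issues such as slow callbacks. The body of `get_from_api` is invented for this example:

```python
import asyncio

async def get_from_api():
    """Hypothetical stand-in for a real network call."""
    await asyncio.sleep(0.01)
    return {"status": "ok"}

async def fetch_data():
    # A breakpoint() placed here would block the whole event loop while paused
    data = await get_from_api()
    return data

# debug=True turns on asyncio's debug mode for this run
result = asyncio.run(fetch_data(), debug=True)
print(result)  # {'status': 'ok'}
```

Keep in mind that while pdb is paused inside a coroutine, no other task on the same event loop makes progress, so timeouts elsewhere in the program may fire.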
Debugging Multithreaded Code
import threading
import pdb

def worker():
    breakpoint()  # Pause in thread
    # ... do work

thread = threading.Thread(target=worker)
thread.start()
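Pausing one thread in pdb does not stop the others, and several threads hitting a breakpoint at once end up sharing one terminal. For that reason, logging with the thread name in the format is often the more practical first step. A minimal sketch, with the logger name `workers` and the thread names chosen for illustration:

```python
import io
import logging
import threading

stream = io.StringIO()
handler = logging.StreamHandler(stream)
# %(threadName)s tags every record with the thread that emitted it
handler.setFormatter(logging.Formatter("%(threadName)s: %(message)s"))
logger = logging.getLogger("workers")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

def worker(n):
    logger.debug("processing item %s", n)

threads = [
    threading.Thread(target=worker, args=(i,), name=f"worker-{i}")
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stream.getvalue())
```

The order of the lines can differ between runs, which is itself useful information when chasing a race condition.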
Using IDE Debuggers
Most IDEs (PyCharm, VS Code) have built-in debuggers that are more user-friendly than pdb:
# In VS Code or PyCharm, you can:
# 1. Click to set breakpoints
# 2. Hover over variables to see values
# 3. Use GUI to step through code
# 4. Watch expressions
# 5. Evaluate code in console
# This is often easier than using pdb directly
Conclusion
Debugging is a skill that improves with practice. Both print statements and pdb are valuable tools, and the best debugger is the one you use effectively.
Key takeaways:
- Print statements are simple, immediate, and good for quick debugging and logging
- PDB is powerful, interactive, and essential for complex bugs
- Use print statements for simple cases, logging, and production code
- Use pdb for complex bugs, stepping through code, and understanding program flow
- Combine both techniques for maximum effectiveness
- Use assertions to catch bugs early
- Create minimal reproducible examples to isolate problems
- Use type hints and static analysis to prevent bugs
- Learn your tools: master pdb commands and logging configuration
The more you practice debugging, the faster you’ll find and fix bugs. Start with print statements for simple cases, graduate to pdb for complex problems, and eventually you’ll develop an intuition for where bugs hide. Happy debugging!