Dynamic Code Generation in Python: Creating and Executing Code at Runtime
Dynamic code generation is one of Python’s most powerful, and most dangerous, capabilities. It allows programs to create and execute code at runtime, enabling sophisticated patterns like domain-specific languages, configuration-driven behavior, and advanced metaprogramming.
Yet with great power comes great responsibility. Dynamic code generation can introduce security vulnerabilities, performance issues, and code that’s difficult to understand and maintain. This guide explores dynamic code generation comprehensively, showing you how to use it effectively while avoiding common pitfalls.
What is Dynamic Code Generation?
Dynamic code generation is the process of creating executable code at runtime, rather than writing it statically before execution. Instead of hardcoding behavior, you generate it based on runtime conditions, configuration, or user input.
Static vs Dynamic Code
# Static code: Written before execution
def greet(name):
    return f"Hello, {name}!"
result = greet("Alice")
print(result) # Hello, Alice!
# Dynamic code: Generated at runtime
code = 'def greet(name): return f"Hello, {name}!"'
exec(code)
result = greet("Alice")
print(result) # Hello, Alice!
Both produce the same result, but the dynamic version creates the function at runtime.
Core Techniques for Dynamic Code Generation
1. eval(): Evaluating Expressions
eval() evaluates a Python expression and returns the result:
# Simple expressions
result = eval("2 + 2")
print(result) # 4
# String expressions
expression = "len('hello')"
result = eval(expression)
print(result) # 5
# With variables
x = 10
y = 20
result = eval("x + y")
print(result) # 30
# With custom namespace
namespace = {'x': 5, 'y': 3}
result = eval("x * y", namespace)
print(result) # 15
Important: eval() only works with expressions, not statements.
# This works (expression)
eval("2 + 2")
# This fails (statement)
try:
    eval("x = 5")
except SyntaxError as e:
    print(f"Error: {e}")  # invalid syntax
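When the string is only expected to hold a Python literal (a number, string, list, dict, and so on), the standard library's ast.literal_eval() is a safer stand-in for eval(), because it refuses anything that is not a literal. A minimal sketch; the helper name parse_literal is hypothetical:

```python
import ast

def parse_literal(text: str):
    """Parse a Python literal without executing arbitrary code."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"Not a literal: {text!r}") from exc

print(parse_literal("[1, 2, {'a': 3}]"))  # [1, 2, {'a': 3}]

# A function call is not a literal, so it is rejected rather than executed
try:
    parse_literal("__import__('os')")
except ValueError as exc:
    print(f"Rejected: {exc}")
```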
2. exec(): Executing Statements
exec() executes Python code (statements and expressions):
# Execute a simple statement
code = "x = 10"
exec(code)
print(x) # 10
# Execute multiple statements
code = """
def add(a, b):
    return a + b
result = add(5, 3)
"""
exec(code)
print(result) # 8
# Execute with namespace
namespace = {}
code = """
def multiply(a, b):
    return a * b
result = multiply(4, 5)
"""
exec(code, namespace)
print(namespace['result']) # 20
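One caveat worth illustrating: the bare exec(code) form only creates usable variables at module level. Inside a function, names assigned by exec() do not become local variables, so passing an explicit namespace dictionary is the more dependable pattern. A small sketch, with the hypothetical helper names run_snippet and broken:

```python
def run_snippet(source: str) -> dict:
    """Execute source in a fresh namespace and return the names it defined."""
    namespace = {}
    exec(source, namespace)
    # exec() adds a '__builtins__' entry to the globals dict; drop it
    namespace.pop('__builtins__', None)
    return namespace

def broken():
    exec("total = 1 + 2")
    return total  # NameError: exec() did not create a local variable

print(run_snippet("total = 1 + 2"))  # {'total': 3}

try:
    broken()
except NameError as exc:
    print(f"NameError: {exc}")
```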
3. compile(): Compiling Code
compile() compiles source code into a code object (bytecode), which can then be executed multiple times without being parsed again:
# Compile code
code_string = "x = 10; y = 20; z = x + y"
compiled_code = compile(code_string, '<string>', 'exec')
# Execute compiled code multiple times
namespace1 = {}
exec(compiled_code, namespace1)
print(namespace1['z']) # 30
namespace2 = {}
exec(compiled_code, namespace2)
print(namespace2['z']) # 30
# Compile an expression
expression = "2 ** 10"
compiled_expr = compile(expression, '<string>', 'eval')
result = eval(compiled_expr)
print(result) # 1024
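Because compile() parses the source up front, it can also serve as a syntax check before anything runs. The sketch below assumes a hypothetical helper named check_syntax; the filename '&lt;user-code&gt;' is an arbitrary label that would appear in tracebacks:

```python
def check_syntax(source: str, mode: str = 'exec'):
    """Return a code object, or None if the source does not compile."""
    try:
        return compile(source, '<user-code>', mode)
    except SyntaxError as exc:
        print(f"Syntax error on line {exc.lineno}: {exc.msg}")
        return None

good = check_syntax("x + y", mode='eval')
print(good.co_names)                   # ('x', 'y')
print(eval(good, {'x': 2, 'y': 3}))    # 5

bad = check_syntax("def broken(:")     # reports the error and returns None
```

The co_names attribute lists the names the code object looks up, which can be handy for checking what an expression depends on before evaluating it.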
4. The ast Module: Safe Code Analysis
The ast module parses Python code into an Abstract Syntax Tree (AST), allowing safe analysis without execution:
import ast
# Parse code into AST
code = """
def greet(name):
    return f"Hello, {name}!"
"""
tree = ast.parse(code)
# Analyze the AST
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print(f"Function: {node.name}")
        print(f"Arguments: {[arg.arg for arg in node.args.args]}")
# Output:
# Function: greet
# Arguments: ['name']
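The same tree can also be modified and compiled, which gives a structured way to generate code without string concatenation. The following is an illustrative sketch only: the AddToMult transformer is a made-up example that rewrites additions into multiplications, and ast.unparse() requires Python 3.9 or later.

```python
import ast

class AddToMult(ast.NodeTransformer):
    """Rewrite every addition in the tree into a multiplication."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested operations first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse("a + b", mode='eval')
tree = ast.fix_missing_locations(AddToMult().visit(tree))

# compile() accepts an AST as well as a source string
code = compile(tree, '<ast>', 'eval')
print(eval(code, {'a': 3, 'b': 4}))  # 12
print(ast.unparse(tree))             # a * b
```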
Practical Examples
Example 1: Expression Evaluator
import re
class ExpressionEvaluator:
    """Evaluate simple mathematical expressions with a restricted namespace"""

    def __init__(self):
        # Allowed functions and variables
        self.safe_dict = {
            'abs': abs,
            'round': round,
            'min': min,
            'max': max,
            'sum': sum,
            '__builtins__': {},  # Hide the default built-in functions
        }

    def evaluate(self, expression: str) -> float:
        """Evaluate a mathematical expression"""
        # Allow digits, lowercase names (for the functions above), commas,
        # and arithmetic symbols. Underscores and quotes are rejected, which
        # blocks dunder tricks such as __import__.
        # Note: this does not limit cost; 9**9**9 would still be accepted.
        if not re.match(r'^[0-9a-z+\-*/(),. ]+$', expression):
            raise ValueError("Invalid characters in expression")
        try:
            return eval(expression, self.safe_dict)
        except Exception as e:
            raise ValueError(f"Invalid expression: {e}")

# Usage
evaluator = ExpressionEvaluator()
print(evaluator.evaluate("2 + 2"))  # 4
print(evaluator.evaluate("(10 + 5) * 2"))  # 30
print(evaluator.evaluate("max(1, 5, 3)"))  # 5
# This raises an error (rejected by the character check)
try:
    evaluator.evaluate("__import__('os').system('ls')")
except ValueError as e:
    print(f"Error: {e}")
Example 2: Configuration-Driven Behavior
import json
class ConfigurableProcessor:
    """Process data based on configuration"""

    def __init__(self, config: dict):
        self.config = config
        self.namespace = {
            'abs': abs,
            'len': len,
            'str': str,
            'int': int,
            'float': float,
            '__builtins__': {},
        }

    def process(self, data: dict) -> dict:
        """Process data according to configuration"""
        result = {}
        for field_name, field_config in self.config.items():
            if 'transform' in field_config:
                # Apply transformation
                transform = field_config['transform']
                namespace = {**self.namespace, 'value': data.get(field_name)}
                result[field_name] = eval(transform, namespace)
            else:
                result[field_name] = data.get(field_name)
        return result
# Configuration
config = {
'name': {}, # No transformation
'age': {'transform': 'int(value)'},
'email': {'transform': 'str(value).lower()'},
'score': {'transform': 'float(value) * 1.1'}, # Add 10% bonus
}
processor = ConfigurableProcessor(config)
# Process data
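data = {
    'name': 'Alice',
    'age': '30',
    'email': 'Alice@Example.com',  # placeholder address
    'score': '85',
}
result = processor.process(data)
print(result)
# {'name': 'Alice', 'age': 30, 'email': 'alice@example.com', 'score': 93.50000000000001}
# (the score shows floating-point rounding: 85 * 1.1 is not exactly 93.5)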
Example 3: Domain-Specific Language (DSL)
class QueryBuilder:
    """Build SQL queries dynamically"""

    def __init__(self):
        self.table = None
        self.conditions = []
        self.columns = ['*']

    def from_table(self, table: str) -> 'QueryBuilder':
        """Set the table"""
        self.table = table
        return self

    def select(self, *columns) -> 'QueryBuilder':
        """Select specific columns"""
        self.columns = list(columns)
        return self

    def where(self, condition: str) -> 'QueryBuilder':
        """Add a WHERE condition"""
        self.conditions.append(condition)
        return self

    def build(self) -> str:
        """Build the SQL query"""
        columns = ', '.join(self.columns)
        query = f"SELECT {columns} FROM {self.table}"
        if self.conditions:
            where_clause = ' AND '.join(self.conditions)
            query += f" WHERE {where_clause}"
        return query

# Usage
query = (QueryBuilder()
         .from_table('users')
         .select('id', 'name', 'email')
         .where('age > 18')
         .where('status = "active"')
         .build())
print(query)
# SELECT id, name, email FROM users WHERE age > 18 AND status = "active"
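A builder like this pastes conditions straight into the SQL text, so it is only appropriate when the conditions come from trusted code. If values come from users, one possible variation keeps them separate from the query and lets the database driver bind them. The ParamQuery class below is a hypothetical sketch, not part of the example above, and uses sqlite3's '?' placeholder style:

```python
import sqlite3

class ParamQuery:
    """Hypothetical variant that keeps values separate from the SQL text."""

    def __init__(self, table: str):
        self.table = table
        self.clauses = []
        self.params = []

    def where(self, clause: str, *values) -> 'ParamQuery':
        self.clauses.append(clause)  # clause uses '?' placeholders
        self.params.extend(values)
        return self

    def build(self):
        sql = f"SELECT * FROM {self.table}"
        if self.clauses:
            sql += " WHERE " + " AND ".join(self.clauses)
        return sql, tuple(self.params)

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [('Alice', 30), ('Bob', 15)])

sql, params = ParamQuery('users').where('age > ?', 18).build()
print(sql)                                   # SELECT * FROM users WHERE age > ?
print(conn.execute(sql, params).fetchall())  # [('Alice', 30)]
```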
Example 4: Dynamic Function Generation
def create_validator(rules: dict):
    """Create a validator function based on rules"""
    def validator(data: dict) -> tuple[bool, list]:
        """Validate data against rules"""
        errors = []
        for field, rule in rules.items():
            value = data.get(field)
            # Check required
            if rule.get('required') and not value:
                errors.append(f"{field} is required")
                continue
            # Check type
            if value and 'type' in rule:
                expected_type = rule['type']
                if not isinstance(value, expected_type):
                    errors.append(
                        f"{field} must be {expected_type.__name__}, "
                        f"got {type(value).__name__}"
                    )
            # Check custom validation
            if value and 'validate' in rule:
                validation_code = rule['validate']
                # Built-ins are hidden, so expose only the helpers the rules need
                allowed_globals = {'__builtins__': {}, 'len': len}
                namespace = {'value': value}
                try:
                    is_valid = eval(validation_code, allowed_globals, namespace)
                    if not is_valid:
                        errors.append(f"{field} validation failed")
                except Exception as e:
                    errors.append(f"{field} validation error: {e}")
        return len(errors) == 0, errors
    return validator
# Define validation rules
rules = {
    'email': {
        'required': True,
        'type': str,
        'validate': "'@' in value and '.' in value",
    },
    'age': {
        'required': True,
        'type': int,
        'validate': '18 <= value <= 120',
    },
    'username': {
        'required': True,
        'type': str,
        'validate': 'len(value) >= 3',
    },
}
# Create validator
validator = create_validator(rules)
# Validate data
data1 = {'email': 'alice@example.com', 'age': 30, 'username': 'alice'}  # placeholder address
is_valid, errors = validator(data1)
print(f"Valid: {is_valid}, Errors: {errors}") # Valid: True, Errors: []
data2 = {'email': 'invalid', 'age': 15, 'username': 'ab'}
is_valid, errors = validator(data2)
print(f"Valid: {is_valid}, Errors: {errors}")
# Valid: False, Errors: ['email validation failed', 'age validation failed', 'username validation failed']
Example 5: Template-Based Code Generation
class ClassGenerator:
    """Generate classes dynamically"""

    @staticmethod
    def create_dataclass(name: str, fields: dict) -> type:
        """Create a simple data-holding class dynamically"""
        # Field names are pasted into source code, so reject anything
        # that is not a plain identifier
        for field_name in fields:
            if not field_name.isidentifier():
                raise ValueError(f"Invalid field name: {field_name!r}")
        # Generate __init__ method
        init_params = ', '.join(fields)
        init_body = '\n    '.join(
            f"self.{field_name} = {field_name}"
            for field_name in fields
        )
        init_code = f"def __init__(self, {init_params}):\n    {init_body}\n"
        # Generate __repr__ method
        repr_items = ', '.join(
            f"{field_name}={{self.{field_name}!r}}"
            for field_name in fields
        )
        repr_code = f'def __repr__(self):\n    return f"{name}({repr_items})"\n'
        # Create namespace and execute code
        namespace = {}
        exec(init_code, namespace)
        exec(repr_code, namespace)
        # Create class
        cls = type(name, (object,), {
            '__init__': namespace['__init__'],
            '__repr__': namespace['__repr__'],
        })
        return cls
# Usage
Person = ClassGenerator.create_dataclass('Person', {
    'name': str,
    'age': int,
    'email': str,
})
person = Person('Alice', 30, 'alice@example.com')  # placeholder address
print(person)  # Person(name='Alice', age=30, email='alice@example.com')
print(person.name)  # Alice
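For this particular job the standard library already offers dataclasses.make_dataclass(), which builds a class from a list of field definitions without hand-written exec() templates. A short sketch for comparison:

```python
from dataclasses import make_dataclass

# Each field is a (name, type) pair; __init__, __repr__ and __eq__ are generated
Person = make_dataclass('Person', [('name', str), ('age', int)])

person = Person('Alice', 30)
print(person)                         # Person(name='Alice', age=30)
print(person == Person('Alice', 30))  # True
```

Reaching for an existing tool like this is usually preferable when it covers the need; templated exec() is better kept for cases no library handles.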
Security Considerations
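Dynamic code generation introduces significant security risks, because any string that reaches eval() or exec() runs with the full privileges of your program. Treat every such string as untrusted unless you control exactly where it comes from.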
Risk 1: Code Injection
# DANGEROUS: User input directly in eval
user_input = input("Enter an expression: ")
result = eval(user_input) # User could enter: __import__('os').system('rm -rf /')
# SAFER: Validate and restrict input
import re
def safe_eval(expression: str) -> float:
    """Evaluate arithmetic expressions made of digits and operators only"""
    # Only allow numbers, operators, and parentheses
    # (this blocks names and calls, but not costly input such as 9**9**9)
    if not re.match(r'^[0-9+\-*/(). ]+$', expression):
        raise ValueError("Invalid expression")
    # Use restricted namespace
    safe_dict = {
        '__builtins__': {},
        'abs': abs,
        'round': round,
    }
    return eval(expression, safe_dict)
Risk 2: Arbitrary Code Execution
# DANGEROUS: Executing untrusted code
untrusted_code = """
import os
os.system('malicious command')
"""
exec(untrusted_code) # Executes malicious code!
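# SAFER, BUT NOT SAFE: Use restricted namespace
safe_namespace = {
    '__builtins__': {},  # Hide built-in functions, including __import__
}
exec(untrusted_code, safe_namespace)  # Raises ImportError: __import__ not found
# Caution: this stops the snippet above, but it is not a sandbox. Attribute
# access on literals can still reach other objects, so never rely on an
# empty __builtins__ alone when the code is untrusted.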
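To see why an empty __builtins__ should not be mistaken for a sandbox, consider this harmless demonstration. It uses no names at all, yet still reaches every class loaded in the interpreter by walking attributes from an empty tuple:

```python
restricted = {'__builtins__': {}}

# () -> tuple -> object -> all subclasses of object currently loaded
escape = "().__class__.__base__.__subclasses__()"
classes = eval(escape, restricted)

print(len(classes) > 0)  # True
```

From a list like that, a determined attacker can typically find a path back to file or process access, which is why untrusted code calls for process-level or operating-system-level isolation rather than namespace tricks.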
Risk 3: Resource Exhaustion
# DANGEROUS: Infinite loop in dynamic code
code = "while True: pass"
exec(code) # Hangs forever!
# SAFER: Use a timeout
# (signal.SIGALRM is available on Unix only and works only in the main thread)
import signal
def timeout_handler(signum, frame):
    raise TimeoutError("Code execution timeout")
signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(5)  # 5 second timeout
try:
    exec("while True: pass")
except TimeoutError:
    print("Code execution timed out")
finally:
    signal.alarm(0)  # Cancel alarm
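Where signal.SIGALRM is unavailable (for example on Windows, or outside the main thread), one portable alternative is to run the code in a separate interpreter process and let subprocess enforce the limit. This is a sketch under that assumption; run_with_timeout is a hypothetical helper, and a child process on its own limits time, not what the code is allowed to do:

```python
import subprocess
import sys

def run_with_timeout(source: str, seconds: float) -> str:
    """Run source in a separate interpreter; kill it after `seconds`."""
    try:
        completed = subprocess.run(
            [sys.executable, '-c', source],
            capture_output=True, text=True, timeout=seconds,
        )
        return completed.stdout
    except subprocess.TimeoutExpired:
        return "timed out"

print(run_with_timeout("print(2 + 2)", seconds=10).strip())  # 4
print(run_with_timeout("while True: pass", seconds=1))       # timed out
```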
Performance Considerations
Dynamic code generation has performance implications: every eval() or exec() call on a string must parse and compile that string before running it.
Compilation Overhead
import timeit
# Direct execution
def direct():
    return 2 + 2

# Using eval
def using_eval():
    return eval("2 + 2")

# Using exec
def using_exec():
    namespace = {}
    exec("result = 2 + 2", namespace)
    return namespace['result']

# Using compile
compiled = compile("2 + 2", '<string>', 'eval')
def using_compile():
    return eval(compiled)

# Benchmark
print("Direct:", timeit.timeit(direct, number=1000000))
print("eval:", timeit.timeit(using_eval, number=1000000))
print("exec:", timeit.timeit(using_exec, number=1000000))
print("compile:", timeit.timeit(using_compile, number=1000000))
# Results vary by machine and Python version. Expect eval() and exec() on a
# string to be the slowest by a wide margin, because the source is parsed and
# compiled on every call; the precompiled version is much closer to direct.
Optimization: Cache Compiled Code
class CachedEvaluator:
    """Cache compiled expressions for performance"""

    def __init__(self):
        self.cache = {}

    def evaluate(self, expression: str, **variables) -> float:
        """Evaluate expression with caching"""
        # Check cache
        if expression not in self.cache:
            self.cache[expression] = compile(expression, '<string>', 'eval')
        compiled = self.cache[expression]
        return eval(compiled, {'__builtins__': {}}, variables)
# Usage
evaluator = CachedEvaluator()
# First call: compiles
result1 = evaluator.evaluate("x + y", x=5, y=3)
print(result1) # 8
# Second call: uses cached compiled code
result2 = evaluator.evaluate("x + y", x=10, y=20)
print(result2) # 30
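A dictionary cache grows without bound if expressions keep changing. One possible variation, sketched here with hypothetical function names, wraps compile() in functools.lru_cache so that rarely used expressions are evicted:

```python
from functools import lru_cache

@lru_cache(maxsize=256)
def compile_expression(expression: str):
    """Compile once per distinct expression; evict the least recently used."""
    return compile(expression, '<expr>', 'eval')

def evaluate(expression: str, **variables):
    """Evaluate a cached, compiled expression against the given variables."""
    return eval(compile_expression(expression), {'__builtins__': {}}, variables)

print(evaluate("x + y", x=5, y=3))            # 8
print(evaluate("x + y", x=10, y=20))          # 30
print(compile_expression.cache_info().hits)   # 1
```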
When to Use Dynamic Code Generation
Good Use Cases
- Configuration-Driven Behavior: Apply transformations based on configuration
- Domain-Specific Languages: Create custom query or expression languages
- Template Engines: Generate code from templates
- Testing Frameworks: Generate test cases dynamically
- Code Generators: Generate boilerplate code
Bad Use Cases
- Simple Conditionals: Use if/else instead of eval
- Function Dispatch: Use dictionaries or function registries instead of eval
- Data Transformation: Use built-in functions or libraries instead of eval
- User Input Processing: Use parsers and validators instead of eval
# Bad: Using eval for simple dispatch
def bad_dispatch(action, data):
    return eval(f"{action}(data)")

# Good: Using a dictionary
def good_dispatch(action, data):
    handlers = {
        'process': process_data,
        'validate': validate_data,
        'transform': transform_data,
    }
    return handlers[action](data)
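The dictionary approach can be fleshed out into a runnable sketch. The handler functions here are hypothetical stand-ins for the ones named above, and unknown actions produce a clear error instead of executing anything:

```python
def process_data(data):
    """Hypothetical handler: double every item."""
    return [item * 2 for item in data]

def validate_data(data):
    """Hypothetical handler: check that every item is an int."""
    return all(isinstance(item, int) for item in data)

HANDLERS = {'process': process_data, 'validate': validate_data}

def dispatch(action: str, data):
    """Look up the handler by name; only registered actions can run."""
    try:
        handler = HANDLERS[action]
    except KeyError:
        raise ValueError(f"Unknown action: {action!r}") from None
    return handler(data)

print(dispatch('process', [1, 2, 3]))   # [2, 4, 6]
print(dispatch('validate', [1, 2, 3]))  # True
```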
Best Practices
1. Avoid eval() When Possible
# Bad: Using eval
result = eval(user_input)
# Good: Use safer alternatives
import ast
import operator
# Supported binary operators; anything else is rejected
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def safe_eval(expression: str) -> float:
    """Evaluate arithmetic by walking the AST instead of calling eval()"""
    try:
        tree = ast.parse(expression, mode='eval')
        return eval_ast(tree.body)
    except (SyntaxError, ValueError, ZeroDivisionError) as e:
        raise ValueError(f"Invalid expression: {e}") from e

def eval_ast(node):
    """Evaluate only numeric constants and the four basic operators"""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        left = eval_ast(node.left)
        right = eval_ast(node.right)
        return OPERATORS[type(node.op)](left, right)
    else:
        raise ValueError("Unsupported operation")
2. Use Restricted Namespaces
# Always restrict the namespace
safe_namespace = {
'__builtins__': {}, # Disable built-ins
'abs': abs,
'len': len,
'max': max,
'min': min,
}
result = eval("max(1, 5, 3)", safe_namespace)
print(result) # 5
3. Validate Input
import re
def validate_expression(expression: str) -> bool:
    """Validate expression before evaluation"""
    # Only allow safe characters
    if not re.match(r'^[a-zA-Z0-9+\-*/(). ]+$', expression):
        return False
    # Check for dangerous patterns (a blocklist is a backstop, not a guarantee)
    dangerous_patterns = ['import', '__', 'eval', 'exec', 'open']
    for pattern in dangerous_patterns:
        if pattern in expression:
            return False
    return True

expression = "2 + 2"
if validate_expression(expression):
    # Still evaluate with a restricted namespace after validation
    result = eval(expression, {'__builtins__': {}})
    print(result)
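Character filters and blocklists inspect the text rather than its structure. A stricter option, sketched here under the assumption that only plain arithmetic should be accepted, is to parse the expression and allow it only if every node in the tree is on a whitelist. The names ALLOWED_NODES and is_arithmetic are hypothetical:

```python
import ast

ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub, ast.UAdd,
)

def is_arithmetic(expression: str) -> bool:
    """True if the expression parses and uses only whitelisted node types."""
    try:
        tree = ast.parse(expression, mode='eval')
    except (SyntaxError, ValueError):
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False
        # Constants must be numbers, not strings or other literals
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            return False
    return True

print(is_arithmetic("2 + 3 * -4"))        # True
print(is_arithmetic("__import__('os')"))  # False
print(is_arithmetic("2 ** 100"))          # False (Pow is not whitelisted)
```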
4. Document Dynamic Code
def apply_transformation(data: dict, transform_code: str) -> dict:
    """
    Apply a transformation to data.

    WARNING: This function uses dynamic code execution. Only use with
    trusted transformation code. Never pass user input directly.

    Args:
        data: Dictionary to transform
        transform_code: Python expression that builds the transformed data
            Available variables: 'data'
    Returns:
        Transformed data
    Example:
        >>> data = {'price': 100}
        >>> transform_code = "{'price': data['price'] * 1.1}"
        >>> apply_transformation(data, transform_code)
        {'price': 110.00000000000001}
    """
    namespace = {
        '__builtins__': {},
        'data': data,
    }
    return eval(transform_code, namespace)
5. Use Type Hints
from typing import Any, Callable, Dict
def create_function(code: str) -> Callable:
    """
    Create a function from code.

    Args:
        code: Python function definition as string
    Returns:
        Callable function
    """
    namespace: Dict[str, Any] = {}
    exec(code, namespace)
    # Return the first function defined
    for value in namespace.values():
        if callable(value) and not value.__name__.startswith('_'):
            return value
    raise ValueError("No function found in code")
Conclusion
Dynamic code generation is a powerful tool that enables sophisticated programming patterns. However, it comes with significant risks and performance costs.
Key takeaways:
- eval() evaluates expressions and returns results
- exec() executes statements and doesn’t return values
- compile() compiles code into bytecode for reuse
- ast module safely analyzes code without execution
- Security is critical: always validate and restrict input
- Performance overhead is significant: cache when possible
- Avoid dynamic code when simpler alternatives exist
- Document thoroughly when using dynamic code
- Test extensively to catch errors early
Use dynamic code generation judiciously. When you do use it, prioritize security, performance, and maintainability. The power of dynamic code generation is best reserved for frameworks, configuration systems, and specialized tools where the benefits clearly outweigh the costs.