
Logging Best Practices: Complete Guide to Structured Logging for Production

Introduction

Good logs are invaluable when debugging production issues. Bad logs are noise that obscures the signal. In modern distributed systems, logs are often the primary way to understand what’s happening when things go wrong.

This comprehensive guide covers logging best practices that make debugging easier and systems more observable. You’ll learn about log levels, structured logging, correlation IDs, log management systems, and practical implementation patterns.

Whether you’re building a small web application or a large distributed system, effective logging is essential for understanding behavior, debugging issues, and maintaining reliable systems.

Understanding Log Levels

The Right Level for Every Message

Log levels help categorize the importance and severity of log messages. Using them consistently makes it easy to filter and find relevant logs.

┌──────────────────────────────────────────────────┐
│               Log Level Hierarchy                │
├──────────────────────────────────────────────────┤
│                                                  │
│   CRITICAL (50) - System cannot function         │
│        │                                         │
│   ERROR (40) - Operation failed                  │
│        │                                         │
│   WARNING (30) - Unexpected but handled          │
│        │                                         │
│   INFO (20) - Normal operations                  │
│        │                                         │
│   DEBUG (10) - Detailed debugging information    │
│                                                  │
└──────────────────────────────────────────────────┘

When to Use Each Level

| Level    | Use For                  | Examples                                         |
| -------- | ------------------------ | ------------------------------------------------ |
| DEBUG    | Detailed diagnostic info | Variable values, loop iterations, function calls |
| INFO     | Normal operations        | User actions, API requests, business events      |
| WARNING  | Recoverable issues       | Retries, deprecations, rate limits               |
| ERROR    | Failures                 | Exceptions, timeouts, validation errors          |
| CRITICAL | System failures          | Database down, cannot start                      |

Implementation Examples

import logging
import sys


def setup_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Set up a logger with a console handler."""
    
    logger = logging.getLogger(name)
    logger.setLevel(level)
    
    # Guard against adding duplicate handlers if called more than once
    if not logger.handlers:
        # Console handler
        handler = logging.StreamHandler(sys.stdout)
        handler.setLevel(level)
        
        # Formatter
        formatter = logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        )
        handler.setFormatter(formatter)
        
        logger.addHandler(handler)
    
    return logger


# Example usage
logger = setup_logger(__name__)


# DEBUG - Detailed debugging info
def process_items(items: list):
    logger.debug(f"Starting to process {len(items)} items")
    
    for i, item in enumerate(items):
        logger.debug(f"Processing item {i+1}/{len(items)}: {item.id}")
        
        # Process item
        result = process_item(item)
        
        logger.debug(f"Item {item.id} processed: {result}")
    
    logger.debug(f"Completed processing {len(items)} items")


# INFO - Normal operations
def user_login(user_id: str, ip_address: str):
    logger.info(f"User {user_id} logged in from {ip_address}")


def place_order(order_id: str, user_id: str):
    logger.info(f"Order {order_id} placed by user {user_id}")


# WARNING - Recoverable issues
def api_request_with_retry(url: str, max_retries: int = 3):
    for attempt in range(max_retries):
        try:
            response = make_request(url)
            return response
        except ConnectionError as e:
            if attempt < max_retries - 1:
                logger.warning(
                    f"Request to {url} failed (attempt {attempt+1}/{max_retries}): {e}"
                )
            else:
                raise


# ERROR - Failures
def process_order(order_id: str):
    try:
        order = get_order(order_id)
        validate_order(order)
        charge_customer(order)
        fulfill_order(order)
    except ValidationError as e:
        logger.error(f"Order {order_id} validation failed: {e}")
    except PaymentError as e:
        logger.error(f"Payment failed for order {order_id}: {e}")
    except Exception as e:
        logger.error(f"Unexpected error processing order {order_id}: {e}")
        raise


# CRITICAL - System failures
def connect_to_database():
    try:
        db.connect()
    except DatabaseUnavailableError:
        logger.critical("Database unavailable - cannot serve requests")
        sys.exit(1)

What NOT to Log

# โŒ NEVER log these:

# Sensitive data
logger.info(f"User password: {user.password}")  # SECURITY RISK!
logger.info(f"Credit card: {credit_card_number}")

# PII that isn't necessary
logger.info(f"User SSN: {ssn}")
logger.info(f"Medical records: {patient_data}")

# Secrets and keys
logger.debug(f"API key: {api_key}")
logger.debug(f"Token: {access_token}")

# Excessive debug info
for item in huge_list:
    logger.debug(f"Item: {item}")  # Will flood logs!

# ✅ DO log these:

# Action being performed
logger.info(f"Processing order {order_id}")

# Context (not sensitive parts)
logger.info(f"User {user_id} action failed")

# Error details (without sensitive data)
logger.error(f"Payment failed: {error_code}")
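One way to enforce these rules mechanically is a `logging.Filter` that masks sensitive-looking values before a record is emitted. A minimal sketch — the key names and pattern are assumptions to adapt to your own field names:

```python
import logging
import re

# Hypothetical redaction pattern: the key names below are assumptions --
# extend them to match the field names used in your own codebase.
SENSITIVE = re.compile(
    r'(password|token|api_key|secret|ssn|credit_card)\s*[=:]\s*\S+',
    re.IGNORECASE,
)


class RedactingFilter(logging.Filter):
    """Mask sensitive-looking values before a record is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = SENSITIVE.sub(r'\1=[REDACTED]', str(record.msg))
        return True  # never drop the record, only rewrite it
```

Attach it with `logger.addFilter(RedactingFilter())`. Treat it as a backstop, not a substitute for never passing secrets to the logger in the first place.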

Structured Logging

Why Structured Logs Matter

Structured logs use a consistent format (typically JSON) that makes them easy to parse, search, and analyze:

import json
import logging
from datetime import datetime, timezone


class StructuredFormatter(logging.Formatter):
    """Format logs as JSON."""
    
    def format(self, record: logging.LogRecord) -> str:
        log_data = {
            # Use the record's own creation time as an ISO-8601 UTC timestamp
            'timestamp': datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            'level': record.levelname,
            'logger': record.name,
            'message': record.getMessage(),
            'module': record.module,
            'function': record.funcName,
            'line': record.lineno
        }
        
        # Add exception info if present
        if record.exc_info:
            log_data['exception'] = self.formatException(record.exc_info)
        
        # Add extra fields
        if hasattr(record, 'user_id'):
            log_data['user_id'] = record.user_id
        
        if hasattr(record, 'request_id'):
            log_data['request_id'] = record.request_id
        
        return json.dumps(log_data)


# Usage
logger = logging.getLogger('structured')
handler = logging.StreamHandler()
handler.setFormatter(StructuredFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)


# Now logs will be JSON:
# {"timestamp": "2026-03-13T10:00:00+00:00", "level": "INFO", "logger": "structured", ...}

Structured Log Examples

import json
import logging
from dataclasses import dataclass
from typing import Optional, Dict, Any


@dataclass
class LogContext:
    """Context for structured logging."""
    user_id: Optional[str] = None
    request_id: Optional[str] = None
    session_id: Optional[str] = None
    correlation_id: Optional[str] = None
    extra: Optional[Dict[str, Any]] = None


class StructuredLogger:
    """Logger with structured logging support."""
    
    def __init__(self, name: str):
        self.logger = logging.getLogger(name)
    
    def log(self, level: int, message: str, context: Optional[LogContext] = None, **kwargs):
        """Log with structured context."""
        
        extra = {}
        
        if context:
            if context.user_id:
                extra['user_id'] = context.user_id
            if context.request_id:
                extra['request_id'] = context.request_id
            if context.correlation_id:
                extra['correlation_id'] = context.correlation_id
            if context.extra:
                extra.update(context.extra)
        
        # Add any additional fields
        extra.update(kwargs)
        
        self.logger.log(level, message, extra=extra)
    
    def info(self, message: str, context: Optional[LogContext] = None, **kwargs):
        self.log(logging.INFO, message, context, **kwargs)
    
    def error(self, message: str, context: Optional[LogContext] = None, **kwargs):
        self.log(logging.ERROR, message, context, **kwargs)


# Usage
logger = StructuredLogger('myapp')

# Structured log with context
logger.info(
    "Order placed",
    context=LogContext(
        user_id="user_123",
        request_id="req_abc",
        correlation_id="corr_xyz"
    ),
    order_id="order_456",
    amount=99.99,
    currency="USD"
)

# Output (with the StructuredFormatter above attached to the handler):
# {"message": "Order placed", "user_id": "user_123", "request_id": "req_abc",
#  "order_id": "order_456", "amount": 99.99, ...}
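The payoff of this structure is that log lines become data. A toy example of filtering a batch of JSON log lines by level and pulling out a field, no regex required (the sample lines are made up):

```python
import json

# Three JSON log lines, as they might appear in a log file
log_lines = [
    '{"level": "INFO", "message": "Order placed", "order_id": "order_456"}',
    '{"level": "ERROR", "message": "Payment failed", "order_id": "order_789"}',
    '{"level": "INFO", "message": "Order placed", "order_id": "order_999"}',
]

# Parse every line, then keep only the errors
records = [json.loads(line) for line in log_lines]
errors = [r for r in records if r["level"] == "ERROR"]

print([e["order_id"] for e in errors])  # → ['order_789']
```

The same query against unstructured text would need a fragile regex; against JSON it is one list comprehension, and log systems like Elasticsearch index these fields automatically.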

Correlation IDs

Tracking Requests Across Systems

Correlation IDs help trace requests across services in distributed systems:

import logging
import uuid
from contextvars import ContextVar
from functools import wraps


# Store correlation ID in context
correlation_id_var: ContextVar[str] = ContextVar('correlation_id', default='')


def generate_correlation_id() -> str:
    """Generate a new correlation ID."""
    return str(uuid.uuid4())


def get_correlation_id() -> str:
    """Get current correlation ID or generate new one."""
    corr_id = correlation_id_var.get()
    if not corr_id:
        corr_id = generate_correlation_id()
        correlation_id_var.set(corr_id)
    return corr_id


def set_correlation_id(correlation_id: str):
    """Set correlation ID for current context."""
    correlation_id_var.set(correlation_id)


# Middleware for web frameworks
def correlation_middleware(app):
    """Middleware to track correlation IDs."""
    
    @wraps(app)
    async def wrapper(request, *args, **kwargs):
        # Get from header or generate new
        correlation_id = request.headers.get('X-Correlation-ID')
        if not correlation_id:
            correlation_id = generate_correlation_id()
        
        # Set in context
        set_correlation_id(correlation_id)
        
        # Add to response header
        response = await app(request, *args, **kwargs)
        response.headers['X-Correlation-ID'] = correlation_id
        
        return response
    
    return wrapper


# Logging with correlation ID
def log_with_correlation(logger: logging.Logger, level: int, message: str):
    """Log with correlation ID."""
    corr_id = get_correlation_id()
    logger.log(level, f"[{corr_id}] {message}")
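Rather than prefixing every message by hand, a `logging.Filter` can pull the ID from the ContextVar and attach it to each record automatically — a sketch along the lines of the code above (the logger name and format string are illustrative):

```python
import logging
import uuid
from contextvars import ContextVar

correlation_id_var: ContextVar[str] = ContextVar('correlation_id', default='')


class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id_var.get() or 'none'
        return True  # always keep the record; we only annotate it


logger = logging.getLogger('corr_demo')
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter('[%(correlation_id)s] %(levelname)s %(message)s')
)
handler.addFilter(CorrelationFilter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

correlation_id_var.set(str(uuid.uuid4()))
logger.info("Order placed")  # logged as "[<uuid>] INFO Order placed"
```

Because the filter runs on every record, application code never has to mention the correlation ID at all.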

Log Management Pipeline

ELK Stack Overview

┌──────────────────────────────────────────────────┐
│             Log Management Pipeline              │
├──────────────────────────────────────────────────┤
│                                                  │
│   Application                                    │
│       │                                          │
│       ▼                                          │
│   Filebeat / Fluentd        (log shipping)       │
│       │                                          │
│       ▼                                          │
│   Logstash / Kafka          (processing/buffer)  │
│       │                                          │
│       ▼                                          │
│   Elasticsearch             (storage/search)     │
│       │                                          │
│       ▼                                          │
│   Kibana Dashboard          (visualization)      │
│                                                  │
└──────────────────────────────────────────────────┘

Python Logging to ELK

import json
import logging
import socket


class ELKHandler(logging.Handler):
    """Handler that sends logs to Logstash/ELK."""
    
    def __init__(self, host: str, port: int):
        super().__init__()
        self.host = host
        self.port = port
    
    def emit(self, record: logging.LogRecord):
        """Send log to Logstash."""
        try:
            log_data = {
                'timestamp': record.created,  # epoch seconds; Logstash can convert
                'level': record.levelname,
                'logger': record.name,
                'message': record.getMessage(),
                'module': record.module,
                'function': record.funcName,
                'line': record.lineno
            }
            
            if record.exc_info:
                # formatException lives on Formatter, not Handler
                log_data['exception'] = logging.Formatter().formatException(
                    record.exc_info
                )
            
            # Add extra fields
            if hasattr(record, 'user_id'):
                log_data['user_id'] = record.user_id
            
            if hasattr(record, 'correlation_id'):
                log_data['correlation_id'] = record.correlation_id
            
            # Send to Logstash (one JSON document per line)
            self._send(json.dumps(log_data))
            
        except Exception:
            self.handleError(record)
    
    def _send(self, message: str):
        """Send message to Logstash over TCP."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.connect((self.host, self.port))
            sock.sendall(message.encode() + b'\n')
        finally:
            sock.close()


# Setup logging with ELK
def setup_elk_logging(host: str = 'localhost', port: int = 5044):
    """Set up logging to ELK stack."""
    
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    
    # Add ELK handler
    elk_handler = ELKHandler(host, port)
    logger.addHandler(elk_handler)
    
    # Also add console for local development
    console = logging.StreamHandler()
    console.setFormatter(
        logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
    )
    logger.addHandler(console)
    
    return logger
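The central system should not hold the only copy of your logs. A rotating local file is a cheap safety net if Logstash is unreachable — a sketch using the standard library's `RotatingFileHandler` (the path and size limits are illustrative):

```python
import logging
from logging.handlers import RotatingFileHandler


def add_rotating_file_handler(logger: logging.Logger,
                              path: str = 'app.log') -> RotatingFileHandler:
    """Keep up to 5 rotated files of ~10 MB each as a local fallback."""
    handler = RotatingFileHandler(
        path, maxBytes=10 * 1024 * 1024, backupCount=5
    )
    handler.setFormatter(
        logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
    )
    logger.addHandler(handler)
    return handler
```

Pair this with the ELK handler so a network outage degrades to "logs are local for now" rather than "logs are gone".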

Best Practices Summary

| Practice                    | Implementation                           |
| --------------------------- | ---------------------------------------- |
| Use appropriate levels      | DEBUG for debugging, INFO for operations |
| Structure your logs         | JSON format for easy parsing             |
| Include correlation IDs     | Track requests across services           |
| Don't log sensitive data    | Passwords, tokens, PII                   |
| Log context                 | User ID, request ID, relevant data       |
| Use appropriate granularity | Not too verbose, not too sparse          |
| Ship to central system      | ELK, CloudWatch, Datadog                 |

Conclusion

Good logging practices are essential for debugging production issues and maintaining observable systems. By using appropriate log levels, structured logging, correlation IDs, and proper log management, you’ll be able to diagnose and fix issues quickly.

Key takeaways:

  1. Use log levels correctly - DEBUG, INFO, WARNING, ERROR, CRITICAL
  2. Structure your logs - JSON format enables easy parsing
  3. Include correlation IDs - Track requests across services
  4. Never log sensitive data - Security first
  5. Ship to central system - ELK, CloudWatch, etc.
  6. Include context - User ID, request ID, relevant data
  7. Be consistent - Same format across all services

By implementing these logging practices, you’ll have the visibility needed to maintain reliable production systems.
