⚡ Calmops

Logging and Distributed Tracing in Rust Microservices


TL;DR: This guide covers implementing production-grade logging and distributed tracing in Rust microservices using the tracing crate, OpenTelemetry, and Jaeger. You’ll learn structured logging, span-based tracing, and how to debug issues across service boundaries.


Introduction

In a microservices architecture, a single user request often traverses multiple services. When something goes wrong, finding the root cause becomes challenging without proper observability. This article explores how to implement logging and distributed tracing in Rust microservices to achieve full visibility into your system’s behavior.

We’ll cover:

  • Structured logging with the tracing crate
  • Distributed tracing with OpenTelemetry
  • Jaeger integration for visualizing traces
  • Best practices for correlation IDs and span management

Structured Logging with the Tracing Crate

The tracing crate is Rust’s primary solution for structured logging. Unlike traditional logging, tracing supports span-based logging that captures context about what operation is being performed.

Installation

Add these dependencies to your Cargo.toml:

[dependencies]
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
tracing-appender = "0.2"

Basic Structured Logging

use tracing::{error, info, warn};
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

fn main() {
    // Initialize tracing subscriber
    tracing_subscriber::registry()
        .with(tracing_subscriber::fmt::layer())
        .init();

    info!("Application starting up");
    
    if let Err(e) = process_request("user-123") {
        error!("Failed to process request: {}", e);
    }
    
    info!("Application shutting down");
}

fn process_request(user_id: &str) -> Result<String, &'static str> {
    info!(user_id = user_id, "Processing user request");
    
    // Simulate some processing
    validate_user(user_id)?;
    
    info!(user_id = user_id, "Request processed successfully");
    Ok("Success".to_string())
}

fn validate_user(user_id: &str) -> Result<(), &'static str> {
    if user_id.is_empty() {
        warn!("Empty user ID provided");
        return Err("Invalid user ID");
    }
    Ok(())
}
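
With the subscriber configured above, this program prints human-readable lines shaped roughly like the following (exact layout varies with the tracing-subscriber version and configuration):

```
2026-02-17T10:30:00.123456Z  INFO myapp: Application starting up
2026-02-17T10:30:00.123501Z  INFO myapp: Processing user request user_id="user-123"
2026-02-17T10:30:00.123540Z  INFO myapp: Request processed successfully user_id="user-123"
2026-02-17T10:30:00.123562Z  INFO myapp: Application shutting down
```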

Logging with Context (Spans)

Spans provide context about what’s happening during an operation:

use tracing::{info, Instrument};

// Placeholder types so the example compiles
#[derive(Debug, Clone)]
pub struct Order { pub id: String, pub status: String }
#[derive(Debug)]
pub struct Payment { pub order_id: String, pub amount: f64 }
#[derive(Debug)]
pub struct Error;

#[derive(Clone)]
pub struct OrderService;

impl OrderService {
    pub async fn process_order(&self, order_id: &str) -> Result<Order, Error> {
        // Create a span for this operation
        let span = tracing::info_span!(
            "process_order", 
            order_id = order_id,
            service = "order-service"
        );
        
        async move {
            info!("Starting order processing");
            
            // Validate order
            let order = self.validate_order(order_id).await?;
            
            // Process payment
            let payment = self.process_payment(&order).await?;
            
            // Update inventory
            self.update_inventory(&order).await?;
            
            // Ship order
            self.ship_order(&order).await?;
            
            info!("Order processed successfully");
            Ok(order)
        }
        .instrument(span)
        .await
    }
    
    async fn validate_order(&self, order_id: &str) -> Result<Order, Error> {
        info!(step = "validation", "Validating order");
        // Validation logic
        Ok(Order { id: order_id.to_string(), status: "valid".into() })
    }
    
    async fn process_payment(&self, order: &Order) -> Result<Payment, Error> {
        info!(step = "payment", "Processing payment");
        // Payment logic
        Ok(Payment { order_id: order.id.clone(), amount: 99.99 })
    }
    
    async fn update_inventory(&self, order: &Order) -> Result<(), Error> {
        info!(step = "inventory", "Updating inventory");
        // Inventory logic
        Ok(())
    }
    
    async fn ship_order(&self, order: &Order) -> Result<(), Error> {
        info!(step = "shipping", "Shipping order");
        // Shipping logic
        Ok(())
    }
}

File Output and Log Rotation

For production systems, you need file-based logging with rotation:

use tracing_appender::rolling::{RollingFileAppender, Rotation};
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt, EnvFilter};

fn setup_file_logging() {
    let file_appender = RollingFileAppender::new(
        Rotation::DAILY,
        "/var/log/myapp",
        "app.log",
    );
    
    let (non_blocking, _guard) = tracing_appender::non_blocking(file_appender);
    
    // Keep the guard alive for the duration of the program
    std::mem::forget(_guard);
    
    tracing_subscriber::registry()
        .with(EnvFilter::new("info"))
        .with(
            tracing_subscriber::fmt::layer()
                .with_writer(non_blocking)
                .with_ansi(false)
                .with_target(true)
                .with_thread_ids(true)
                .with_file(true)
                .with_line_number(true)
        )
        .init();
}

JSON Logging for Log Aggregation

For integration with log aggregation systems like ELK stack:

use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt, EnvFilter};

fn setup_json_logging() {
    let filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| EnvFilter::new("info"));
    
    tracing_subscriber::registry()
        .with(filter)
        .with(
            tracing_subscriber::fmt::layer()
                .json()
                .with_target(true)
                .with_thread_ids(true)
                .with_file(true)
                .with_line_number(true)
        )
        .init();
}

This produces JSON output like:

{
  "timestamp": "2026-02-17T10:30:00.123Z",
  "level": "INFO",
  "target": "myapp::order",
  "threadId": "ThreadId(1)",
  "filename": "main.rs",
  "line_number": 42,
  "fields": {
    "message": "Processing order",
    "order_id": "ord-123"
  }
}

Distributed Tracing with OpenTelemetry

Distributed tracing tracks requests across service boundaries. We’ll use OpenTelemetry, a vendor-neutral standard.

Installation

[dependencies]
opentelemetry = "0.21"
opentelemetry-sdk = "0.21"
opentelemetry-otlp = "0.14"
opentelemetry-http = "0.10"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
tracing-opentelemetry = "0.22"

Setting Up OpenTelemetry

use opentelemetry::{global, KeyValue};
use opentelemetry_sdk::propagation::TraceContextPropagator;
use opentelemetry_sdk::trace::{self, RandomIdGenerator, Sampler};
use opentelemetry_sdk::Resource;
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

fn init_tracing() {
    // Set up global propagator for W3C Trace Context
    global::set_text_map_propagator(TraceContextPropagator::new());
    
    // Configure the OTLP export pipeline (endpoint defaults to http://localhost:4317)
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(opentelemetry_otlp::new_exporter().tonic())
        .with_trace_config(
            trace::config()
                .with_sampler(Sampler::AlwaysOn)
                .with_id_generator(RandomIdGenerator::default())
                .with_resource(Resource::new(vec![KeyValue::new(
                    "service.name",
                    "order-service",
                )])),
        )
        .install_batch(opentelemetry_sdk::runtime::Tokio)
        .expect("Failed to initialize OTLP tracer");
    
    tracing_subscriber::registry()
        .with(tracing_subscriber::fmt::layer())
        .with(tracing_opentelemetry::layer().with_tracer(tracer))
        .init();
}

// Call opentelemetry::global::shutdown_tracer_provider() before exit
// to flush any pending spans.
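
The TraceContextPropagator registered above carries trace identity between services in the W3C `traceparent` header, formatted `00-<32-hex trace id>-<16-hex span id>-<2-hex flags>`. OpenTelemetry parses and emits this header for you; the std-only helper below is purely illustrative, to make the wire format concrete:

```rust
/// Parse a W3C `traceparent` header value into (trace_id, span_id, sampled).
/// Hypothetical helper for illustration; OpenTelemetry's propagator does this internally.
fn parse_traceparent(header: &str) -> Option<(String, String, bool)> {
    let mut parts = header.split('-');
    let version = parts.next()?;
    let trace_id = parts.next()?;
    let span_id = parts.next()?;
    let flags = parts.next()?;
    // Version 00 is the only published Trace Context version
    if version != "00" || trace_id.len() != 32 || span_id.len() != 16 || flags.len() != 2 {
        return None;
    }
    // The lowest flag bit is the "sampled" flag
    let sampled = u8::from_str_radix(flags, 16).ok()? & 0x01 == 1;
    Some((trace_id.to_string(), span_id.to_string(), sampled))
}

fn main() {
    let tp = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
    let (trace_id, span_id, sampled) = parse_traceparent(tp).expect("valid traceparent");
    println!("trace_id={trace_id} span_id={span_id} sampled={sampled}");
}
```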

Using Spans in Microservices

use opentelemetry::global;
use opentelemetry_http::HeaderInjector;
use reqwest::Client;
use serde_json::json;
use tracing::{info, instrument};
use tracing_opentelemetry::OpenTelemetrySpanExt;

pub struct PaymentService {
    client: Client,
    endpoint: String,
}

// Placeholder types for the example
pub struct PaymentResult { pub success: bool }
#[derive(Debug)]
pub struct Error;

impl PaymentService {
    #[instrument(skip(self), fields(service = "payment-service"))]
    pub async fn charge(&self, order_id: &str, amount: f64) -> Result<PaymentResult, Error> {
        info!(order_id, amount, "Processing payment");
        
        // Inject the current trace context into the outgoing request headers
        // so the downstream service can continue the same trace.
        let cx = tracing::Span::current().context();
        let mut headers = reqwest::header::HeaderMap::new();
        global::get_text_map_propagator(|propagator| {
            propagator.inject_context(&cx, &mut HeaderInjector(&mut headers))
        });
        
        let response = self.client
            .post(&self.endpoint)
            .headers(headers)
            .json(&json!({
                "order_id": order_id,
                "amount": amount
            }))
            .send()
            .await
            .map_err(|_| Error)?;
        
        info!(status = %response.status(), "Payment processed");
        Ok(PaymentResult { success: true })
    }
}

// On the receiving side, extract the parent context from incoming headers with
// opentelemetry_http::HeaderExtractor and attach it to the handler's span via
// tracing_opentelemetry's set_parent().

Jaeger Integration

Jaeger provides a UI for visualizing distributed traces.

Docker Compose Setup

# docker-compose.yml
version: '3.8'
services:
  jaeger:
    image: jaegertracing/all-in-one:1.50
    ports:
      - "6831:6831/udp"  # legacy agent (thrift compact)
      - "4317:4317"      # OTLP gRPC
      - "4318:4318"      # OTLP HTTP
      - "16686:16686"    # Jaeger UI
    environment:
      - COLLECTOR_OTLP_ENABLED=true
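
With that file in place, a typical workflow (assuming Docker Compose v2) is:

```shell
docker compose up -d     # start Jaeger in the background
# Browse traces in the UI at http://localhost:16686
```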

Rust Jaeger Exporter

[dependencies]
opentelemetry-jaeger = "0.21"

use opentelemetry::global;
use opentelemetry_sdk::propagation::TraceContextPropagator;
use opentelemetry_sdk::trace::{self, Sampler};
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

fn init_jaeger() -> Result<(), Box<dyn std::error::Error>> {
    global::set_text_map_propagator(TraceContextPropagator::new());
    
    // Exports spans to the Jaeger agent on localhost:6831/udp by default
    let tracer = opentelemetry_jaeger::new_agent_pipeline()
        .with_service_name("order-service")
        .with_trace_config(
            trace::config()
                .with_sampler(Sampler::AlwaysOn)
        )
        .install_batch(opentelemetry_sdk::runtime::Tokio)?;
    
    tracing_subscriber::registry()
        .with(tracing_subscriber::fmt::layer())
        .with(tracing_opentelemetry::layer().with_tracer(tracer))
        .init();
    
    Ok(())
}

Correlation IDs and Request Context

Passing correlation IDs across service boundaries is essential for tracing:

Creating and Propagating Correlation IDs

use axum::extract::{Request, State};
use axum::middleware::Next;
use axum::response::Response;
use std::sync::Arc;
use tracing::info;

/// Middleware that adds a correlation ID to each request
/// (axum 0.7 signatures assumed)
pub async fn run_middleware(mut request: Request, next: Next) -> Response {
    // Get correlation ID from header or generate a new one
    let correlation_id = request
        .headers()
        .get("X-Correlation-ID")
        .and_then(|v| v.to_str().ok())
        .map(|s| s.to_string())
        .unwrap_or_else(|| uuid::Uuid::new_v4().to_string());
    
    // Add to request extensions for handlers to access
    request.extensions_mut().insert(CorrelationId(correlation_id.clone()));
    
    // Add to response headers
    let mut response = next.run(request).await;
    response.headers_mut().insert(
        "X-Correlation-ID",
        correlation_id.parse().expect("UUID is a valid header value"),
    );
    
    response
}

#[derive(Clone, Debug)]
pub struct CorrelationId(pub String);

/// Extract correlation ID in handler
pub fn get_correlation_id<B>(req: &http::Request<B>) -> Option<String> {
    req.extensions().get::<CorrelationId>().map(|c| c.0.clone())
}

// Usage in a handler (AppState and Error are application-defined types)
async fn get_order(
    State(state): State<Arc<AppState>>,
    req: Request,
) -> Result<Response, Error> {
    let correlation_id = get_correlation_id(&req)
        .unwrap_or_else(|| "unknown".to_string());
    
    let span = tracing::info_span!(
        "get_order",
        correlation_id = %correlation_id
    );
    
    // Note: don't hold an entered guard across .await points;
    // prefer .instrument(span) for async code.
    let _enter = span.enter();
    
    info!("Fetching order from database");
    // Handler logic
    todo!()
}

Error Handling and Logging

Proper error handling with contextual logging:

use thiserror::Error;
use tracing::{error, warn};

#[derive(Error, Debug)]
pub enum OrderError {
    #[error("Order not found: {order_id}")]
    NotFound { order_id: String },
    
    #[error("Payment failed: {reason}")]
    PaymentFailed { reason: String },
    
    #[error("Inventory unavailable: {item}")]
    InventoryUnavailable { item: String },
    
    #[error("Database error: {source}")]
    Database {
        #[from]
        source: sqlx::Error,
    },
}

impl OrderError {
    pub fn log(&self) {
        match self {
            OrderError::NotFound { order_id } => {
                warn!(order_id = order_id, "Order not found");
            }
            OrderError::PaymentFailed { reason } => {
                error!(reason = reason, "Payment processing failed");
            }
            OrderError::InventoryUnavailable { item } => {
                warn!(item = item, "Item out of stock");
            }
            OrderError::Database { source } => {
                error!(error = %source, "Database operation failed");
            }
        }
    }
}

// Usage
async fn process_order(order_id: &str) -> Result<Order, OrderError> {
    let order = find_order(order_id)
        .await?
        .ok_or_else(|| OrderError::NotFound { order_id: order_id.to_string() })?;
    
    Ok(order)
}

// At the boundary (e.g. the HTTP handler), call err.log() before mapping the
// error into a response; conversion via #[from] does not log by itself.

Best Practices

1. Use Appropriate Log Levels

Level When to Use
ERROR Failures that need immediate attention
WARN Unexpected but recoverable situations
INFO Key business events and state changes
DEBUG Detailed debugging information
TRACE Very fine-grained execution flow
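
In practice these levels are selected at runtime through tracing-subscriber's EnvFilter, usually via the RUST_LOG environment variable (`myapp` and the binary name below are placeholders):

```shell
# One global level
RUST_LOG=info ./myapp

# Per-target overrides: debug for our crate, warn for chatty dependencies
RUST_LOG=myapp=debug,hyper=warn,tower=warn ./myapp
```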

2. Include Sufficient Context

// โŒ Bad - lacks context
error!("Payment failed");

// ✅ Good - includes relevant context
error!(
    order_id = %order_id,
    user_id = %user_id,
    amount = %amount,
    error = %error,
    "Payment processing failed"
);

3. Don’t Log Sensitive Data

// โŒ Bad - logs sensitive data
info!(password = %password, "User login attempted");

// ✅ Good - logs relevant but not sensitive data
info!(
    user_id = %user_id,
    ip_address = %ip_address,
    "User login attempted"
);

4. Use Span Hierarchy

#[instrument(skip_all, fields(order_id))]
async fn process_order(order: Order) -> Result<(), Error> {
    validate_order(&order).await?;  // Creates child span
    charge_payment(&order).await?;   // Creates child span
    ship_order(&order).await?;       // Creates child span
    Ok(())
}
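
When each step is itself annotated with #[instrument] (or opens its own span), the fmt layer renders the hierarchy in every log line, roughly shaped like:

```
INFO process_order{order_id="ord-123"}:validate_order: myapp: Validating order
INFO process_order{order_id="ord-123"}:charge_payment: myapp: Charging payment
INFO process_order{order_id="ord-123"}:ship_order: myapp: Shipping order
```

In a tracing backend such as Jaeger, the same hierarchy appears as parent and child spans on a single trace timeline.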

Benchmark: Logging Performance

Here’s a quick benchmark comparing logging implementations:

use std::hint::black_box;
use std::time::Instant;

fn benchmark_logging() {
    // Baseline: empty loop (black_box keeps it from being optimized away)
    let start = Instant::now();
    for i in 0..100_000u32 {
        black_box(i);
    }
    println!("No logging: {:?}", start.elapsed());
    
    // tracing::info! (a subscriber must be installed first,
    // otherwise the events are near-free no-ops)
    let start = Instant::now();
    for _ in 0..100_000 {
        tracing::info!("Test message");
    }
    println!("tracing info: {:?}", start.elapsed());
}

Typical results:

  • No logging: ~1ms
  • tracing::info!: ~5-10ms
  • tracing with file output: ~50-100ms

Conclusion

Implementing proper logging and distributed tracing is essential for operating Rust microservices in production. The tracing ecosystem provides excellent tools for:

  • Structured logging with contextual information
  • Span-based logging for understanding execution flow
  • OpenTelemetry integration for vendor-neutral tracing
  • Jaeger for visualizing distributed traces

Start with basic structured logging, then add distributed tracing as your system grows. The key is consistency: ensure all services use the same correlation ID format and log levels.

