Event-Driven Architecture: Patterns, Kafka, and CQRS

Introduction

Event-driven architecture (EDA) organizes systems around events — things that happened — rather than direct service calls. Instead of Service A calling Service B’s API, A publishes an event and B (and any other interested service) reacts to it. This decoupling is the core benefit: producers and consumers evolve independently.

EDA is no longer niche. By 2025, 37% of organizations had already adopted event-driven architectures, with another 26% planning adoption. The EDA software market is projected to reach $16.5 billion by 2027. Industry data shows organizations using EDA see a median 62% reduction in time-to-market for new features built on existing event streams and 47% shorter integration timelines compared to point-to-point API approaches.

When EDA makes sense:

Multiple services need to react to the same business event
You need audit trails of everything that happened
Services have different scaling requirements
You want to add new consumers without changing producers

When it doesn’t:

Simple CRUD applications — you’ll pay complexity costs for no benefit
When synchronous responses are required
Small teams where operational overhead isn’t justified
Systems with fewer than 3-4 services that need to coordinate

Core Concepts

Event:    Something that happened ("OrderPlaced", "PaymentProcessed")
Producer: The service that emits events
Consumer: The service that reacts to events
Broker:   The infrastructure that routes events (Kafka, RabbitMQ, SQS)
Topic:    A named stream of events (like a table in a database)

Events vs Commands:

Command: "ProcessPayment" — a request to do something (can be rejected)
Event:   "PaymentProcessed" — a fact that already happened (immutable)

Kafka: The Standard Event Broker

Apache Kafka remains the most widely used event streaming platform. Its durable, partitioned log is the default baseline that every alternative measures against. Kafka 4.0, released in 2025, marks a major milestone — the ZooKeeper dependency is finally removed. KRaft (Kafka Raft) is now the only metadata mode.

Kafka 4.0 with KRaft (Docker Compose)

Kafka 4.0 eliminates ZooKeeper entirely, simplifying operations and reducing infrastructure footprint:

## docker-compose.yml (Kafka 4.0 with KRaft)
services:
  kafka:
    image: apache/kafka:4.0.0
    ports:
      - "9092:9092"
    environment:
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_NODE_ID: 1
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@localhost:9093
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: true

  kafka-ui:
    image: provectuslabs/kafka-ui:latest
    ports:
      - "8080:8080"
    environment:
      KAFKA_CLUSTERS_0_NAME: local
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092

Start with a single command and access the UI at http://localhost:8080:

docker compose up -d

Producing Events (Node.js)

npm install kafkajs

// producer.js
import { Kafka } from 'kafkajs';

const kafka = new Kafka({
    clientId: 'order-service',
    brokers: ['localhost:9092'],
});

const producer = kafka.producer();

async function publishOrderPlaced(order) {
    await producer.connect();

    await producer.send({
        topic: 'orders',
        messages: [
            {
                key: order.id,
                value: JSON.stringify({
                    eventType: 'OrderPlaced',
                    eventId: crypto.randomUUID(),
                    timestamp: new Date().toISOString(),
                    version: 1,
                    data: {
                        orderId: order.id,
                        userId: order.userId,
                        items: order.items,
                        total: order.total,
                    },
                }),
                headers: {
                    'content-type': 'application/json',
                    'source': 'order-service',
                },
            },
        ],
    });
}

// In your order creation handler
app.post('/api/orders', async (req, res) => {
    const order = await db.createOrder(req.body);
    await publishOrderPlaced(order);
    res.status(201).json(order);
});

Consuming Events

// consumer.js
import { Kafka } from 'kafkajs';

const kafka = new Kafka({ brokers: ['localhost:9092'] });

const consumer = kafka.consumer({
    groupId: 'notification-service',
});

async function startConsumer() {
    await consumer.connect();
    await consumer.subscribe({ topic: 'orders', fromBeginning: false });

    await consumer.run({
        eachMessage: async ({ topic, partition, message }) => {
            const event = JSON.parse(message.value.toString());

            console.log(`Processing ${event.eventType} for order ${event.data.orderId}`);

            try {
                switch (event.eventType) {
                    case 'OrderPlaced':
                        await sendOrderConfirmationEmail(event.data);
                        break;
                    case 'OrderShipped':
                        await sendShippingNotification(event.data);
                        break;
                    default:
                        console.log('Unknown event type:', event.eventType);
                }
            } catch (err) {
                console.error('Failed to process event:', err);
            }
        },
    });
}

Consumer Groups

Multiple consumer groups read the same topic independently — each group maintains its own offset:

Topic: orders
├── Consumer Group: notification-service  → sends emails
├── Consumer Group: inventory-service     → updates stock
└── Consumer Group: analytics-service    → updates dashboards

Choosing an Event Broker in 2026

Kafka is the default, but it is no longer the only option. The 2024-2026 period has produced serious contenders, each optimizing for different trade-offs. IBM’s $11 billion acquisition of Confluent (announced December 2025) confirms streaming is now infrastructure — not a feature.

Broker	Type	Key Strength	Best For
Apache Kafka 4.0	Log-based streaming	Mature ecosystem, highest throughput, KRaft (no ZooKeeper)	High-volume pipelines, event sourcing, source-of-truth
Redpanda	Kafka-compatible (C++)	No JVM/ZooKeeper, lower latency, simpler ops	Latency-sensitive workloads, teams wanting Kafka without overhead
WarpStream	S3-backed Kafka	80-85% lower cost than self-hosted Kafka	Logging, analytics, cost-sensitive workloads
Apache Pulsar	Native multi-tenant	Separate serving/storage layers, geo-replication	Multi-team platforms, cloud-native deployments
AutoMQ	Auto-scaling Kafka fork	Elastic scaling on cloud storage	Variable-throughput workloads on AWS/GCP

Redpanda rewrites the Kafka protocol in C++ — no JVM heap tuning, no ZooKeeper. Single binary, lower latency, 220+ connectors via Redpanda Connect. If you’re starting fresh and want Kafka compatibility without Kafka operations, Redpanda is the most mature alternative.

WarpStream (acquired by Confluent in 2024) takes a radical approach: stateless agents backed by object storage (S3). No local disks, no inter-AZ replication, no broker state. Cost is 4x lower (~6 instances vs ~24 for equivalent Kafka), but p99 write latency is ~400-600ms on S3 Standard. Suitable for logging and analytics, not real-time fraud detection.

Apache Pulsar separates serving and storage layers so each scales independently. Native multi-tenancy and built-in geo-replication make it attractive for organizations running a single streaming platform across multiple teams.

When Kafka is still the right choice: you need the reference ecosystem — the widest set of clients, connectors, and processing tools (Kafka Streams, ksqlDB, Flink). For most organizations, Kafka remains the safest default.

Event Schema Design

Good event schemas are self-describing and versioned:

// Good event structure
{
    "eventType": "OrderPlaced",
    "eventId": "550e8400-e29b-41d4-a716-446655440000",
    "timestamp": "2026-03-30T10:00:00Z",
    "version": 1,
    "source": "order-service",
    "correlationId": "req-abc123",  // trace requests across services
    "data": {
        "orderId": "ord-123",
        "userId": "user-456",
        "items": [{"productId": "prod-789", "quantity": 2, "price": 29.99}],
        "total": 59.98
    }
}

Schema evolution rules:

Adding new fields: safe (consumers ignore unknown fields)
Removing fields: breaking change — use versioning
Renaming fields: breaking change — add new field, deprecate old

Schemas should be registered in a Schema Registry (Confluent Schema Registry, Redpanda Schema Registry, or Apicurio). Without schema governance, event producers and consumers silently diverge, causing runtime failures that are hard to debug.

Event Sourcing

Instead of storing current state, store the sequence of events that led to it:

// Traditional: store current state
// users table: { id, name, email, balance: 150 }

// Event sourcing: store events
// events table:
// { type: "AccountOpened",  data: { userId: 1, initialBalance: 100 } }
// { type: "MoneyDeposited", data: { userId: 1, amount: 100 } }
// { type: "MoneyWithdrawn", data: { userId: 1, amount: 50 } }
// Current balance = 100 + 100 - 50 = 150

// Rebuild state by replaying events
class BankAccount {
    constructor() {
        this.balance = 0;
        this.events = [];
    }

    apply(event) {
        switch (event.type) {
            case 'AccountOpened':
                this.balance = event.data.initialBalance;
                break;
            case 'MoneyDeposited':
                this.balance += event.data.amount;
                break;
            case 'MoneyWithdrawn':
                if (event.data.amount > this.balance) {
                    throw new Error('Insufficient funds');
                }
                this.balance -= event.data.amount;
                break;
        }
        this.events.push(event);
    }

    static fromEvents(events) {
        const account = new BankAccount();
        events.forEach(e => account.apply(e));
        return account;
    }
}

// Load account from event store
const events = await eventStore.getEvents('account-123');
const account = BankAccount.fromEvents(events);
console.log(account.balance);  // current balance

Benefits of event sourcing:

Complete audit trail — you know exactly what happened and when
Time travel — reconstruct state at any point in time
Event replay — rebuild read models, fix bugs by replaying

Costs:

More complex than CRUD
Eventual consistency between write and read models
Event schema evolution is harder
Snapshot strategies are required at scale (replaying 10M events is slow)

Production Event Stores

When implementing event sourcing in production, use a purpose-built event store rather than a general-purpose database:

Solution	Type	Notes
Apache Kafka	Durable log	Most common choice; event sourcing via topics with compaction
Kurrent (formerly EventStoreDB)	Purpose-built event store	gRPC client API, projections, rebranded from EventStoreDB in 2025
Marten	.NET library on PostgreSQL	Good for .NET shops that want event sourcing without a separate store

CQRS: Command Query Responsibility Segregation

Separate the write model (commands) from the read model (queries):

Write side:                    Read side:
POST /orders → OrderService → EventStore → Projector → ReadDB
                                                           ↓
GET /orders/{id} ──────────────────────────────────→ ReadDB

// Write side: command handler
async function handlePlaceOrder(command) {
    if (command.items.length === 0) {
        throw new Error('Order must have at least one item');
    }

    const event = {
        type: 'OrderPlaced',
        orderId: generateId(),
        userId: command.userId,
        items: command.items,
        total: calculateTotal(command.items),
        timestamp: new Date().toISOString(),
    };

    await eventStore.append('orders', event.orderId, event);
    await kafka.publish('orders', event);

    return event.orderId;
}

// Read side: projection (builds read model from events)
async function projectOrderSummary(event) {
    if (event.type === 'OrderPlaced') {
        await readDb.upsert('order_summaries', {
            id: event.orderId,
            userId: event.userId,
            status: 'pending',
            total: event.total,
            itemCount: event.items.length,
            createdAt: event.timestamp,
        });
    }

    if (event.type === 'OrderShipped') {
        await readDb.update('order_summaries',
            { id: event.orderId },
            { status: 'shipped', shippedAt: event.timestamp }
        );
    }
}

// Read side: query handler (fast, optimized for reads)
async function getOrderSummary(orderId) {
    return readDb.findOne('order_summaries', { id: orderId });
}

CQRS and Event Sourcing are independent patterns, but in production they are almost always used together. Event Sourcing provides the write model; projections create optimized read models that handle the query side.

Saga Pattern: Distributed Transactions

When a business process spans multiple services, use sagas to coordinate:

// Choreography-based saga (services react to events)
// 1. OrderService publishes OrderPlaced
// 2. PaymentService reacts, publishes PaymentProcessed or PaymentFailed
// 3. InventoryService reacts to PaymentProcessed, publishes InventoryReserved
// 4. ShippingService reacts to InventoryReserved, publishes OrderShipped

// Compensation: if any step fails, publish compensating events
// PaymentFailed → OrderService cancels order
// InventoryFailed → PaymentService refunds payment

Dead Letter Queue

Handle events that fail processing:

const dlqProducer = kafka.producer();

async function processWithDLQ(event) {
    try {
        await processEvent(event);
    } catch (err) {
        console.error('Failed to process event, sending to DLQ:', err);

        await dlqProducer.send({
            topic: 'orders.dlq',
            messages: [{
                key: event.orderId,
                value: JSON.stringify({
                    originalEvent: event,
                    error: err.message,
                    failedAt: new Date().toISOString(),
                    retryCount: (event.retryCount || 0) + 1,
                }),
            }],
        });
    }
}

EDA and AI/ML Integration

Event-driven architecture is becoming the backbone for real-time machine learning. In 2025-2026, several developments connect EDA directly to AI workloads:

Streaming feature stores — Databricks introduced streaming feature stores in 2024, letting ML systems consume events directly to update models in near real time. Instead of batch-training on stale data, models receive feature updates as events arrive.

Flink ML — Flink 2.0 added native ML inference APIs in streaming jobs. A model can score each event as it passes through a pipeline, enabling online recommendation engines, fraud detection, and personalization at stream processing speed.

Streamhouse pattern — The convergence of streaming (Kafka) and lakehouse (Iceberg, Delta Lake) creates a single architecture where events serve both real-time and batch workloads. Events land in Kafka for immediate consumption and are materialized into Iceberg tables for historical analysis.

// Streaming ML inference pattern
// 1. Raw events arrive on Kafka topic "user-actions"
// 2. Flink job deserializes, enriches with feature store
// 3. ML model scores each enriched event
// 4. Predictions published to "predictions" topic
// 5. Downstream services react to predictions in real time

Event Portals and Governance

As EDA adoption grows, organizations need tooling to design, discover, and govern events. An event portal provides a catalog of all events, their schemas, producers, and consumers — analogous to an API gateway for events.

Capabilities:

Event discovery: Browse available events by domain, type, or producer
Schema governance: Version management, compatibility checks, deprecation
Dependency mapping: See which services produce and consume each event
Impact analysis: Predict the blast radius of schema changes

Solace PubSub+, Confluent Cloud, and Redpanda all offer event portal capabilities. For open-source options, Apicurio Registry and the AsyncAPI specification provide schema governance and event documentation.

Decision Framework: Should You Go Event-Driven?

The mistake teams make most often is adopting EDA everywhere because it is “modern.” If your system is 15 CRUD endpoints and a dashboard, a PostgreSQL database and REST APIs are the right choice. EDA solves real problems at scale, but the complexity costs are real.

Scenario	Recommendation
1-3 services, simple CRUD	REST/GraphQL — don’t overcomplicate
Multiple services need same data	Event streaming (Kafka or alternative)
Audit/compliance requirements	Event sourcing + CQRS
Real-time ML/AI inference	Streaming feature store + Flink
Multi-team platform	Pulsar (native multi-tenancy)
Cost-sensitive, analytics workload	WarpStream (S3-backed)

Conclusion

Event-driven architecture decouples producers from consumers, enabling scalable, resilient systems. In 2026, the ecosystem has matured significantly — Kafka 4.0 eliminates ZooKeeper, serious alternatives exist for specific workloads, and streaming is becoming the backbone for AI inference. Start with a single event stream and clear schema governance before expanding. Choose your event broker based on throughput needs, latency requirements, and operational capacity. And remember: EDA is a tool, not a goal. Use it where it solves real problems.

Event-Driven Architecture: Patterns, Kafka, and CQRS

Introduction

Core Concepts

Kafka: The Standard Event Broker

Kafka 4.0 with KRaft (Docker Compose)

Producing Events (Node.js)

Consuming Events

Consumer Groups

Choosing an Event Broker in 2026

Event Schema Design

Event Sourcing

Production Event Stores

CQRS: Command Query Responsibility Segregation

Saga Pattern: Distributed Transactions

Dead Letter Queue

EDA and AI/ML Integration

Event Portals and Governance

Decision Framework: Should You Go Event-Driven?

Conclusion

Resources

Comments

Share this article

👍 Was this article helpful?