Distributed transactions ensure data consistency across multiple databases or services. This guide covers protocols and approaches for managing them.
The Problem
# Without distributed transactions
scenario:
user: "Transfer $100 between accounts"
services:
- "Account Service (Database A)"
- "Account Service (Database B)"
steps:
- "Deduct from Account A"
- "Add to Account B"
risks:
- "Deduct succeeds, add fails โ Lost money!"
- "Partial completion possible"
Two-Phase Commit (2PC)
Protocol
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Two-Phase Commit โ
โ โ
โ Phase 1: Prepare (Voting) โ
โ โ
โ Coordinator โโโบ All Participants โ
โ "Can you commit?" โ
โ โ
โ Participant1 โโโบ "Yes, ready!" โ
โ Participant2 โโโบ "Yes, ready!" โ
โ Participant3 โโโบ "Yes, ready!" โ
โ โ
โ Phase 2: Commit โ
โ โ
โ Coordinator โโโบ All Participants โ
โ "COMMIT!" โ
โ โ
โ All โโโบ "Committed!" โ
โ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Implementation
class TwoPhaseCommit:
def __init__(self, participants):
self.participants = participants # List of databases
self.state = "INIT"
def execute(self, operations):
# Phase 1: Prepare
self.state = "PREPARING"
votes = []
for participant in self.participants:
try:
vote = participant.prepare()
votes.append(vote)
except Exception as e:
votes.append("NO")
# Check all voted yes
if all(v == "YES" for v in votes):
# Phase 2: Commit
self.state = "COMMITTING"
for participant in self.participants:
try:
participant.commit()
except Exception as e:
# Need recovery
self.handle_failure(participant)
self.state = "COMMITTED"
return True
else:
# Rollback
self.state = "ROLLING_BACK"
for participant in self.participants:
try:
participant.rollback()
except:
pass
self.state = "ROLLED_BACK"
return False
def handle_failure(self, participant):
"""Recovery handler for failures"""
# Log for manual intervention
# Or retry commit
pass
Three-Phase Commit (3PC)
Protocol
# 3PC adds a "pre-commit" phase
phases:
- name: "CanCommit?"
description: "Coordinator asks if can prepare"
- name: "PreCommit"
description: "Coordinator says prepare to commit"
- name: "DoCommit"
description: "Actually commit"
advantages:
- "No blocking on coordinator failure"
- "Can recover if participants don't hear final phase"
Modern Approaches
Saga Pattern
# Already covered in earlier article
# See: saga-pattern-distributed-transactions.md
Outbox Pattern
# Outbox pattern for reliable events
class OutboxPattern:
"""Write to outbox table, then publish"""
def __init__(self, db, event_publisher):
self.db = db
self.publisher = event_publisher
def transfer(self, from_id, to_id, amount):
# Single transaction: update + outbox
self.db.execute("""
BEGIN;
-- Update accounts
UPDATE accounts SET balance = balance - %s WHERE id = %s;
UPDATE accounts SET balance = balance + %s WHERE id = %s;
-- Write to outbox
INSERT INTO outbox (event_type, payload)
VALUES ('transfer', '{"from": %s, "to": %s, "amount": %s}');
COMMIT;
""", (amount, from_id, amount, to_id, from_id, to_id, amount))
# Separate process publishes outbox
self.publish_outbox()
def publish_outbox(self):
while True:
events = self.db.execute("""
SELECT * FROM outbox
WHERE published = false
ORDER BY created_at
LIMIT 10
""")
for event in events:
try:
self.publisher.publish(event)
self.db.execute(
"UPDATE outbox SET published = true WHERE id = %s",
(event.id,)
)
except:
break
Try-Confirm/Cancel (TCC)
TCC is a popular pattern that provides stronger guarantees than Saga:
class TCCTransaction:
"""Try-Confirm/Cancel pattern implementation"""
def __init__(self, participants):
self.participants = participants
self.try_results = {}
def execute(self, operations):
# Phase 1: Try
self._try_phase(operations)
# Phase 2: Confirm or Cancel
if self._all_confirmed():
self._confirm_phase()
else:
self._cancel_phase()
def _try_phase(self, operations):
"""Reserve resources, don't commit yet"""
for participant, operation in operations.items():
try:
result = participant.try_reserve(operation)
self.try_results[participant] = {"status": "reserved", "data": result}
except Exception as e:
self.try_results[participant] = {"status": "failed", "error": e}
def _all_confirmed(self):
return all(
r.get("status") == "reserved"
for r in self.try_results.values()
)
def _confirm_phase(self):
"""Commit the reservation"""
for participant, result in self.try_results.items():
if result["status"] == "reserved":
participant.confirm(result["data"])
def _cancel_phase(self):
"""Release the reservation"""
for participant, result in self.try_results.items():
if result["status"] == "reserved":
participant.cancel(result["data"])
TCC Example: Payment Service
class PaymentTCC:
"""TCC implementation for payment processing"""
def __init__(self, account_db, order_db):
self.accounts = account_db
self.orders = order_db
def try_reserve(self, order_id, amount):
# Check balance and reserve
account = self.accounts.get(order_id.user_id)
if account.balance < amount:
raise InsufficientFundsError()
# Freeze the amount
self.accounts.freeze(order_id.user_id, amount)
# Update order status
self.orders.update(order_id, {"status": "RESERVING"})
return {"user_id": order_id.user_id, "amount": amount}
def confirm(self, reservation_data):
# Capture the reserved amount
self.accounts.capture(reservation_data["user_id"], reservation_data["amount"])
# Update order to confirmed
self.orders.update(order_id, {"status": "CONFIRMED"})
def cancel(self, reservation_data):
# Release the reserved amount
self.accounts.release(reservation_data["user_id"], reservation_data["amount"])
# Update order to cancelled
self.orders.update(order_id, {"status": "CANCELLED"})
Seata Framework
Seata is an open-source distributed transaction solution:
# Seata configuration
seata:
application-id: payment-service
tx-service-group: my_tx_group
config:
type: nacos
nacos:
server-addr: localhost:8848
namespace: seata
registry:
type: nacos
nacos:
server-addr: localhost:8848
application: seata-server
// Seata AT mode (automatic transaction)
@GlobalTransactional(timeoutMills = 30000, name = "transfer-money")
public void transfer(String fromId, String toId, BigDecimal amount) {
// These operations are automatically managed by Seata
accountMapper.decrease(fromId, amount);
accountMapper.increase(toId, amount);
}
// Seata TCC mode
@LocalTCC
public interface TccAction {
@TwoPhaseBusinessAction(
name = "deduct",
commitMethod = "confirm",
rollbackMethod = "cancel"
)
boolean tryDeduct(@BusinessActionContextParameter(paramName = "amount") BigDecimal amount);
boolean confirm(BusinessActionContext context);
boolean cancel(BusinessActionContext context);
}
Event Sourcing for Transactions
class EventSourcedTransaction:
"""Using event sourcing for distributed transactions"""
def __init__(self, event_store):
self.event_store = event_store
def transfer(self, from_id, to_id, amount):
# Create events
events = [
MoneyDebited(from_id, amount),
MoneyCredited(to_id, amount)
]
# Validate and append
self.event_store.append_events(
stream_id=f"account-{from_id}",
events=events
)
def replay(self, account_id):
"""Reconstruct account state from events"""
events = self.event_store.get_events(f"account-{account_id}")
balance = 0
for event in events:
if isinstance(event, MoneyDebited):
balance -= event.amount
elif isinstance(event, MoneyCredited):
balance += event.amount
return balance
Handling Failures
# Idempotency for distributed transactions
class IdempotentTransaction:
def __init__(self, transaction_id, db):
self.transaction_id = transaction_id
self.db = db
def execute(self, action):
# Check if already processed
existing = self.db.execute(
"SELECT status FROM transactions WHERE id = %s",
(self.transaction_id,)
)
if existing:
return existing.status
# Process and record
result = action()
self.db.execute(
"INSERT INTO transactions (id, status, result) VALUES (%s, %s, %s)",
(self.transaction_id, "COMPLETED", result)
)
return result
Retry with Exponential Backoff
import time
from functools import wraps
def retry_with_backoff(max_retries=3, base_delay=1):
def decorator(func):
@wraps(func)
def wrapper(*args, **kwargs):
for attempt in range(max_retries):
try:
return func(*args, **kwargs)
except Exception as e:
if attempt == max_retries - 1:
raise
delay = base_delay * (2 ** attempt)
time.sleep(delay)
return None
return wrapper
return decorator
@retry_with_backoff(max_retries=5, base_delay=2)
def commit_with_retry(transaction):
return transaction.commit()
Choosing the Right Pattern
# Decision matrix
pattern_selection:
2PC:
use_when:
- "Strong consistency required"
- "Small number of participants"
- "Low latency acceptable"
avoid_when:
- "High latency requirements"
- "Many participants"
- "Network partitions likely"
3PC:
use_when:
- "Need better failure handling than 2PC"
- "Moderate number of participants"
avoid_when:
- "Network latency is high"
- "Simple solution suffices"
Saga:
use_when:
- "Microservices architecture"
- "Eventual consistency acceptable"
- "Long-running transactions"
avoid_when:
- "Strict ACID required"
- "Small, simple system"
TCC:
use_when:
- "Need strong guarantees"
- "Can implement try/confirm/cancel"
- "Higher complexity acceptable"
avoid_when:
- "Simple use case"
- "Can't modify participant code"
Outbox:
use_when:
- "Need reliable event publishing"
- "Can tolerate eventual consistency"
- "Event-driven architecture"
avoid_when:
- "Synchronous response needed"
Best Practices
# Distributed transaction best practices
design:
- "Avoid if possible"
- "Use eventual consistency"
- "Design for failure"
- "Make operations idempotent"
- "Implement proper monitoring"
- "Use distributed tracing"
when_needed:
- "Financial transactions"
- "Inventory management"
- "Order processing"
- "Multi-service updates"
alternatives:
- "Saga pattern"
- "Event sourcing"
- "Outbox pattern"
- "TCC"
monitoring:
- "Track transaction duration"
- "Alert on failures"
- "Monitor compensation events"
- "Log all rollback attempts"
testing:
- "Test failure scenarios"
- "Test network partitions"
- "Test concurrent transactions"
- "Test idempotency"
Conclusion
Distributed transactions are complex:
- 2PC: Classic but has coordinator failure issues
- 3PC: Better failure handling, more complex
- TCC: Stronger guarantees than Saga, requires participant support
- Saga: Better for microservices, eventual consistency
- Outbox: Reliable event publishing
- Seata: Enterprise-grade solution with multiple modes
Choose the simplest approach that meets your consistency requirements.
For 2026, trends include:
- Seata becoming the standard for Java/Go microservices
- Event-driven architectures making Saga and Outbox more popular
- Cloud-native transaction managers simplifying deployment
- Combining patterns (TCC + Saga) for complex scenarios
Comments