Skip to main content
โšก Calmops

Distributed Transactions: Two-Phase Commit, Three-Phase Commit, and Beyond

Distributed transactions ensure data consistency across multiple databases or services. This guide covers protocols and approaches for managing them.

The Problem

# Without distributed transactions

scenario:
  user: "Transfer $100 between accounts"
  services:
    - "Account Service (Database A)"
    - "Account Service (Database B)"
  
  steps:
    - "Deduct from Account A"
    - "Add to Account B"
  
  risks:
    - "Deduct succeeds, add fails โ†’ Lost money!"
    - "Partial completion possible"

Two-Phase Commit (2PC)

Protocol

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                  Two-Phase Commit                             โ”‚
โ”‚                                                             โ”‚
โ”‚   Phase 1: Prepare (Voting)                                 โ”‚
โ”‚                                                             โ”‚
โ”‚   Coordinator โ”€โ”€โ–บ All Participants                          โ”‚
โ”‚   "Can you commit?"                                        โ”‚
โ”‚                                                             โ”‚
โ”‚   Participant1 โ”€โ”€โ–บ "Yes, ready!"                          โ”‚
โ”‚   Participant2 โ”€โ”€โ–บ "Yes, ready!"                          โ”‚
โ”‚   Participant3 โ”€โ”€โ–บ "Yes, ready!"                          โ”‚
โ”‚                                                             โ”‚
โ”‚   Phase 2: Commit                                          โ”‚
โ”‚                                                             โ”‚
โ”‚   Coordinator โ”€โ”€โ–บ All Participants                         โ”‚
โ”‚   "COMMIT!"                                                โ”‚
โ”‚                                                             โ”‚
โ”‚   All โ”€โ”€โ–บ "Committed!"                                     โ”‚
โ”‚                                                             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Implementation

class TwoPhaseCommit:
    def __init__(self, participants):
        self.participants = participants  # List of databases
        self.state = "INIT"
    
    def execute(self, operations):
        # Phase 1: Prepare
        self.state = "PREPARING"
        
        votes = []
        for participant in self.participants:
            try:
                vote = participant.prepare()
                votes.append(vote)
            except Exception as e:
                votes.append("NO")
        
        # Check all voted yes
        if all(v == "YES" for v in votes):
            # Phase 2: Commit
            self.state = "COMMITTING"
            
            for participant in self.participants:
                try:
                    participant.commit()
                except Exception as e:
                    # Need recovery
                    self.handle_failure(participant)
            
            self.state = "COMMITTED"
            return True
        else:
            # Rollback
            self.state = "ROLLING_BACK"
            
            for participant in self.participants:
                try:
                    participant.rollback()
                except:
                    pass
            
            self.state = "ROLLED_BACK"
            return False
    
    def handle_failure(self, participant):
        """Recovery handler for failures"""
        # Log for manual intervention
        # Or retry commit
        pass

Three-Phase Commit (3PC)

Protocol

# 3PC adds a "pre-commit" phase

phases:
  - name: "CanCommit?"
    description: "Coordinator asks if can prepare"
    
  - name: "PreCommit"
    description: "Coordinator says prepare to commit"
    
  - name: "DoCommit"
    description: "Actually commit"

advantages:
  - "No blocking on coordinator failure"
  - "Can recover if participants don't hear final phase"

Modern Approaches

Saga Pattern

# Already covered in earlier article
# See: saga-pattern-distributed-transactions.md

Outbox Pattern

# Outbox pattern for reliable events

class OutboxPattern:
    """Write to outbox table, then publish"""
    
    def __init__(self, db, event_publisher):
        self.db = db
        self.publisher = event_publisher
    
    def transfer(self, from_id, to_id, amount):
        # Single transaction: update + outbox
        self.db.execute("""
            BEGIN;
            
            -- Update accounts
            UPDATE accounts SET balance = balance - %s WHERE id = %s;
            UPDATE accounts SET balance = balance + %s WHERE id = %s;
            
            -- Write to outbox
            INSERT INTO outbox (event_type, payload)
            VALUES ('transfer', '{"from": %s, "to": %s, "amount": %s}');
            
            COMMIT;
        """, (amount, from_id, amount, to_id, from_id, to_id, amount))
        
        # Separate process publishes outbox
        self.publish_outbox()
    
    def publish_outbox(self):
        while True:
            events = self.db.execute("""
                SELECT * FROM outbox 
                WHERE published = false 
                ORDER BY created_at 
                LIMIT 10
            """)
            
            for event in events:
                try:
                    self.publisher.publish(event)
                    self.db.execute(
                        "UPDATE outbox SET published = true WHERE id = %s",
                        (event.id,)
                    )
                except:
                    break

Try-Confirm/Cancel (TCC)

TCC is a popular pattern that provides stronger guarantees than Saga:

class TCCTransaction:
    """Try-Confirm/Cancel pattern implementation"""
    
    def __init__(self, participants):
        self.participants = participants
        self.try_results = {}
    
    def execute(self, operations):
        # Phase 1: Try
        self._try_phase(operations)
        
        # Phase 2: Confirm or Cancel
        if self._all_confirmed():
            self._confirm_phase()
        else:
            self._cancel_phase()
    
    def _try_phase(self, operations):
        """Reserve resources, don't commit yet"""
        for participant, operation in operations.items():
            try:
                result = participant.try_reserve(operation)
                self.try_results[participant] = {"status": "reserved", "data": result}
            except Exception as e:
                self.try_results[participant] = {"status": "failed", "error": e}
    
    def _all_confirmed(self):
        return all(
            r.get("status") == "reserved" 
            for r in self.try_results.values()
        )
    
    def _confirm_phase(self):
        """Commit the reservation"""
        for participant, result in self.try_results.items():
            if result["status"] == "reserved":
                participant.confirm(result["data"])
    
    def _cancel_phase(self):
        """Release the reservation"""
        for participant, result in self.try_results.items():
            if result["status"] == "reserved":
                participant.cancel(result["data"])

TCC Example: Payment Service

class PaymentTCC:
    """TCC implementation for payment processing"""
    
    def __init__(self, account_db, order_db):
        self.accounts = account_db
        self.orders = order_db
    
    def try_reserve(self, order_id, amount):
        # Check balance and reserve
        account = self.accounts.get(order_id.user_id)
        if account.balance < amount:
            raise InsufficientFundsError()
        
        # Freeze the amount
        self.accounts.freeze(order_id.user_id, amount)
        
        # Update order status
        self.orders.update(order_id, {"status": "RESERVING"})
        
        return {"user_id": order_id.user_id, "amount": amount}
    
    def confirm(self, reservation_data):
        # Capture the reserved amount
        self.accounts.capture(reservation_data["user_id"], reservation_data["amount"])
        # Update order to confirmed
        self.orders.update(order_id, {"status": "CONFIRMED"})
    
    def cancel(self, reservation_data):
        # Release the reserved amount
        self.accounts.release(reservation_data["user_id"], reservation_data["amount"])
        # Update order to cancelled
        self.orders.update(order_id, {"status": "CANCELLED"})

Seata Framework

Seata is an open-source distributed transaction solution:

# Seata configuration
seata:
  application-id: payment-service
  tx-service-group: my_tx_group
  config:
    type: nacos
    nacos:
      server-addr: localhost:8848
      namespace: seata
  registry:
    type: nacos
    nacos:
      server-addr: localhost:8848
      application: seata-server
// Seata AT mode (automatic transaction)
@GlobalTransactional(timeoutMills = 30000, name = "transfer-money")
public void transfer(String fromId, String toId, BigDecimal amount) {
    // These operations are automatically managed by Seata
    accountMapper.decrease(fromId, amount);
    accountMapper.increase(toId, amount);
}
// Seata TCC mode
@LocalTCC
public interface TccAction {
    @TwoPhaseBusinessAction(
        name = "deduct",
        commitMethod = "confirm",
        rollbackMethod = "cancel"
    )
    boolean tryDeduct(@BusinessActionContextParameter(paramName = "amount") BigDecimal amount);
    
    boolean confirm(BusinessActionContext context);
    
    boolean cancel(BusinessActionContext context);
}

Event Sourcing for Transactions

class EventSourcedTransaction:
    """Using event sourcing for distributed transactions"""
    
    def __init__(self, event_store):
        self.event_store = event_store
    
    def transfer(self, from_id, to_id, amount):
        # Create events
        events = [
            MoneyDebited(from_id, amount),
            MoneyCredited(to_id, amount)
        ]
        
        # Validate and append
        self.event_store.append_events(
            stream_id=f"account-{from_id}",
            events=events
        )
    
    def replay(self, account_id):
        """Reconstruct account state from events"""
        events = self.event_store.get_events(f"account-{account_id}")
        
        balance = 0
        for event in events:
            if isinstance(event, MoneyDebited):
                balance -= event.amount
            elif isinstance(event, MoneyCredited):
                balance += event.amount
        
        return balance

Handling Failures

# Idempotency for distributed transactions

class IdempotentTransaction:
    def __init__(self, transaction_id, db):
        self.transaction_id = transaction_id
        self.db = db
    
    def execute(self, action):
        # Check if already processed
        existing = self.db.execute(
            "SELECT status FROM transactions WHERE id = %s",
            (self.transaction_id,)
        )
        
        if existing:
            return existing.status
        
        # Process and record
        result = action()
        
        self.db.execute(
            "INSERT INTO transactions (id, status, result) VALUES (%s, %s, %s)",
            (self.transaction_id, "COMPLETED", result)
        )
        
        return result

Retry with Exponential Backoff

import time
from functools import wraps

def retry_with_backoff(max_retries=3, base_delay=1):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_retries - 1:
                        raise
                    delay = base_delay * (2 ** attempt)
                    time.sleep(delay)
            return None
        return wrapper
    return decorator

@retry_with_backoff(max_retries=5, base_delay=2)
def commit_with_retry(transaction):
    return transaction.commit()

Choosing the Right Pattern

# Decision matrix

pattern_selection:
  2PC:
    use_when:
      - "Strong consistency required"
      - "Small number of participants"
      - "Low latency acceptable"
    avoid_when:
      - "High latency requirements"
      - "Many participants"
      - "Network partitions likely"

  3PC:
    use_when:
      - "Need better failure handling than 2PC"
      - "Moderate number of participants"
    avoid_when:
      - "Network latency is high"
      - "Simple solution suffices"

  Saga:
    use_when:
      - "Microservices architecture"
      - "Eventual consistency acceptable"
      - "Long-running transactions"
    avoid_when:
      - "Strict ACID required"
      - "Small, simple system"

  TCC:
    use_when:
      - "Need strong guarantees"
      - "Can implement try/confirm/cancel"
      - "Higher complexity acceptable"
    avoid_when:
      - "Simple use case"
      - "Can't modify participant code"

  Outbox:
    use_when:
      - "Need reliable event publishing"
      - "Can tolerate eventual consistency"
      - "Event-driven architecture"
    avoid_when:
      - "Synchronous response needed"

Best Practices

# Distributed transaction best practices

design:
  - "Avoid if possible"
  - "Use eventual consistency"
  - "Design for failure"
  - "Make operations idempotent"
  - "Implement proper monitoring"
  - "Use distributed tracing"

when_needed:
  - "Financial transactions"
  - "Inventory management"
  - "Order processing"
  - "Multi-service updates"

alternatives:
  - "Saga pattern"
  - "Event sourcing"
  - "Outbox pattern"
  - "TCC"

monitoring:
  - "Track transaction duration"
  - "Alert on failures"
  - "Monitor compensation events"
  - "Log all rollback attempts"

testing:
  - "Test failure scenarios"
  - "Test network partitions"
  - "Test concurrent transactions"
  - "Test idempotency"

Conclusion

Distributed transactions are complex:

  • 2PC: Classic but has coordinator failure issues
  • 3PC: Better failure handling, more complex
  • TCC: Stronger guarantees than Saga, requires participant support
  • Saga: Better for microservices, eventual consistency
  • Outbox: Reliable event publishing
  • Seata: Enterprise-grade solution with multiple modes

Choose the simplest approach that meets your consistency requirements.

For 2026, trends include:

  • Seata becoming the standard for Java/Go microservices
  • Event-driven architectures making Saga and Outbox more popular
  • Cloud-native transaction managers simplifying deployment
  • Combining patterns (TCC + Saga) for complex scenarios

Comments