When choosing a database, one of the most important things to understand is the consistency guarantees it provides. ACID and BASE represent two fundamentally different approaches to data consistency, each with its own trade-offs.
In this guide, we’ll explore ACID, BASE, and the CAP theorem, and help you choose the right model for your application.
Understanding ACID
The ACID Properties
┌─────────────────────────────────────────────────────────────┐
│ ACID Properties │
│ │
│ A - Atomicity │
│ "All or nothing" │
│ Transaction either completes fully or not at all │
│ │
│ C - Consistency │
│ "Valid state to valid state" │
│ Transaction moves database from one valid state to │
│ another │
│ │
│ I - Isolation │
│ "Concurrent transactions appear serial" │
│ Uncommitted changes are hidden from other transactions │
│ │
│ D - Durability │
│ "Once committed, data survives failures" │
│ Committed data is permanently stored │
│ │
└─────────────────────────────────────────────────────────────┘
Atomicity Example
# ACID Atomicity - Money transfer
# (pseudocode: withdraw, deposit, and transaction stand in for your database layer)

# WITHOUT atomicity (BAD)
def transfer_bad(from_account, to_account, amount):
    # Step 1: Withdraw
    withdraw(from_account, amount)
    # If the deposit below fails, the withdrawn money disappears!
    # Step 2: Deposit
    deposit(to_account, amount)

# WITH atomicity (GOOD)
def transfer_atomic(from_account, to_account, amount):
    with transaction:  # All or nothing
        withdraw(from_account, amount)
        deposit(to_account, amount)
    # Either both succeed or both fail
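To make the all-or-nothing behavior concrete, here is a minimal runnable sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` and `balances` helpers, and the insufficient-funds check are assumptions made for illustration; they are not part of the pseudocode above.

```python
import sqlite3

def transfer(conn, from_id, to_id, amount):
    """Move `amount` between accounts atomically; roll back on any error."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, from_id),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (from_id,)
        ).fetchone()
        if balance < 0:
            # Fails after the withdrawal but before the deposit
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, to_id),
        )

def balances(conn):
    """Return {account_id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

transfer(conn, 1, 2, 30)       # succeeds: both updates are committed
after_success = balances(conn)

try:
    transfer(conn, 1, 2, 500)  # fails midway: the withdrawal is rolled back
except ValueError:
    pass
after_failure = balances(conn)
```

The failed transfer leaves the balances exactly as they were after the successful one, even though its withdrawal statement had already run.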
Isolation Levels
# Standard SQL isolation levels, weakest to strongest
isolation_levels = {
    "READ_UNCOMMITTED": {
        "description": "Can read uncommitted data",
        "problems": "Dirty reads, non-repeatable reads, phantoms",
        "use_when": "Rarely appropriate"
    },
    "READ_COMMITTED": {
        "description": "Only reads committed data",
        "problems": "Non-repeatable reads, phantoms",
        "use_when": "Default in many databases (e.g. PostgreSQL, Oracle, SQL Server)"
    },
    "REPEATABLE_READ": {
        "description": "Rows already read return the same values if read again",
        "problems": "Phantoms (per the SQL standard)",
        "use_when": "Financial transactions (default in MySQL InnoDB)"
    },
    "SERIALIZABLE": {
        "description": "Outcome matches some serial order of the transactions",
        "problems": "None of the above, but lower throughput and possible retries",
        "use_when": "Critical data"
    }
}

# SQLAlchemy isolation levels
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@localhost/db",
    isolation_level="SERIALIZABLE"  # Choose level
)
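The difference between READ COMMITTED and REPEATABLE READ is easier to see in a toy model. The sketch below is not how any real database is implemented; `ToyStore`, `ReadCommittedTxn`, and `SnapshotTxn` are hypothetical classes, and the snapshot approach is just one way an engine can provide repeatable reads.

```python
class ToyStore:
    """Holds committed data only (one value per key)."""
    def __init__(self, data):
        self.committed = dict(data)

    def commit(self, key, value):
        self.committed[key] = value

class ReadCommittedTxn:
    """Every read sees the latest committed value."""
    def __init__(self, store):
        self.store = store

    def read(self, key):
        return self.store.committed[key]

class SnapshotTxn:
    """Reads come from a snapshot taken when the transaction starts."""
    def __init__(self, store):
        self.snapshot = dict(store.committed)

    def read(self, key):
        return self.snapshot[key]

store = ToyStore({"balance": 100})
rc, snap = ReadCommittedTxn(store), SnapshotTxn(store)

first_rc, first_snap = rc.read("balance"), snap.read("balance")
store.commit("balance", 40)  # another transaction commits in between
second_rc, second_snap = rc.read("balance"), snap.read("balance")
```

The read-committed transaction sees two different values for the same key (a non-repeatable read), while the snapshot transaction sees the same value both times.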
Understanding BASE
The BASE Properties
┌─────────────────────────────────────────────────────────────┐
│ BASE Properties │
│ │
│ BA - Basically Available │
│ System stays available, even during failures │
│ But responses may not be consistent │
│ │
│ S - Soft State │
│ State may change over time │
│ Even without input (due to replication lag) │
│ │
│ E - Eventually Consistent │
│ System will become consistent once updates stop │
│ But may take time │
│ │
│ "Available under partition" │
│ "Soft state" │
│ "Eventually consistent" │
└─────────────────────────────────────────────────────────────┘
Eventual Consistency Example
# BASE - Eventual consistency
from collections import deque

class EventuallyConsistentDB:
    """
    Write returns immediately;
    reads may temporarily return stale data
    """
    def __init__(self):
        self.primary = {}  # Writer
        self.replicas = []  # Readers (one dict per replica)
        self.pending_writes = deque()  # Async replication queue

    def write(self, key, value):
        # Write to primary immediately
        self.primary[key] = value
        # Queue the write for async replication
        self.pending_writes.append({
            "key": key,
            "value": value
        })
        return "written"  # Returns before replicas are updated!

    def read(self, key):
        # Can return stale data from replicas
        if self.replicas:
            # Read from nearest replica; fall back to primary if the key is missing
            return self.replicas[0].get(key, self.primary.get(key))
        return self.primary.get(key)

    def replicate(self):
        """Background replication"""
        while self.pending_writes:
            # Apply the oldest write first so newer values are not overwritten
            write = self.pending_writes.popleft()
            for replica in self.replicas:
                replica[write["key"]] = write["value"]
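The following self-contained sketch shows the stale-then-converged behavior in action. `LaggingReplicaStore` is a hypothetical, trimmed-down store written for this demonstration; it reads only from replicas so the staleness is visible, and it drains its queue in FIFO order so the newest write wins.

```python
from collections import deque

class LaggingReplicaStore:
    """Primary plus replicas that only catch up when replicate() runs."""
    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # writes not yet applied to replicas

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def read_replica(self, key, index=0):
        return self.replicas[index].get(key)

    def replicate(self):
        # Drain in FIFO order so later writes overwrite earlier ones
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value

store = LaggingReplicaStore()
store.write("cart", ["book"])
store.write("cart", ["book", "pen"])

stale = store.read_replica("cart")  # replicas have not caught up yet
store.replicate()
fresh = store.read_replica("cart")  # replicas have converged
```

Before `replicate()` runs, the replica has no value for the key at all; afterwards every replica holds the most recent write.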
The CAP Theorem
Understanding CAP
┌─────────────────────────────────────────────────────────────┐
│ CAP Theorem │
│ │
│ You can have at most TWO of three: │
│ │
│ C - Consistency │
│ A - Availability │
│ P - Partition Tolerance │
│ │
│ Partitions cannot be ruled out in a distributed system, │
│ so the practical choice during a partition is C or A │
│ │
│ CP: Consistency + Partition Tolerance │
│ AP: Availability + Partition Tolerance │
│ CA: Consistency + Availability (impossible with P) │
└─────────────────────────────────────────────────────────────┘
CP vs AP Systems
# CP Systems (Consistency over Availability)
# Example: ZooKeeper, etcd, HBase
cp_systems = {
    "description": "Some nodes become unavailable during a partition",
    "behavior": "Refuse requests that cannot be served consistently",
    "use_when": "Financial systems, inventory, locking"
}

# AP Systems (Availability over Consistency)
# Example: Cassandra, DynamoDB, CouchDB
ap_systems = {
    "description": "System remains available during a partition",
    "behavior": "May return stale data; conflicts are reconciled later",
    "use_when": "Shopping carts, likes, counters"
}

# CA Systems (not achievable in a distributed system)
# Example: Single-node databases
# Only possible when network partitions cannot occur
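The behavioral difference can be sketched with a toy replica that has lost contact with the rest of the cluster. The `Replica` class and its `mode` flag are hypothetical; real systems decide this through quorum and leader-election protocols rather than a single switch.

```python
class Replica:
    """A replica that may lose contact with the rest of the cluster."""
    def __init__(self, mode, data):
        self.mode = mode          # "CP" or "AP"
        self.data = dict(data)
        self.partitioned = False

    def read(self, key):
        if self.partitioned and self.mode == "CP":
            # Cannot confirm the value is current, so refuse to answer
            raise ConnectionError("unavailable: cannot reach quorum")
        # AP mode (or a healthy cluster): answer from local data
        return self.data.get(key)

cp = Replica("CP", {"stock": 5})
ap = Replica("AP", {"stock": 5})
cp.partitioned = ap.partitioned = True

ap_answer = ap.read("stock")  # answers, but the value may be stale
try:
    cp_answer = cp.read("stock")
except ConnectionError as exc:
    cp_answer = str(exc)      # refuses rather than risk a stale answer
```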
PACELC Model
# PACELC extends CAP
pacelc = """
If there is a Partition (P), choose between:
A - Availability
C - Consistency
Else (E), when there is NO partition, choose between:
L - Latency
C - Consistency
So it's really: "CAP + Else"
"""
ACID vs BASE Comparison
| Aspect | ACID | BASE |
|---|---|---|
| Consistency | Strong | Eventual |
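| Availability | May be sacrificed during failures | Prioritized |
| Transactions | Full multi-statement transactions | Limited, often single-row or single-partition |
| Latency | Higher | Lower |
| Scale | Harder to scale horizontally | Highly scalable |
| Complexity | Lower for the application | Higher (application handles stale reads and conflicts) |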
| Examples | PostgreSQL, MySQL | Cassandra, DynamoDB |
| Use When | Financial, inventory | Social media, analytics |
Choosing the Right Model
Decision Framework
def choose_consistency_model(requirements):
    """Choose between ACID and BASE"""
    # Strong consistency needed?
    if requirements.get("strong_consistency"):
        return "ACID"
    # Financial/transactional?
    if requirements.get("is_financial"):
        return "ACID"
    # High availability needed?
    if requirements.get("availability") == "critical":
        return "BASE"
    # Can tolerate eventual consistency?
    if requirements.get("tolerates_stale_data"):
        return "BASE"
    # Scale requirements?
    if requirements.get("scale") == "massive":
        return "BASE"
    # Default to ACID if unsure
    return "ACID"
Common Use Cases
# ACID use cases:
- Banking and payments
- Inventory management
- Order processing
- User authentication
- Anything requiring transactions
# BASE use cases:
- Social media feeds
- Analytics and metrics
- Caching layers
- Shopping carts
- Session stores
- IoT data ingestion
Database Examples
ACID Databases
# PostgreSQL - Strong ACID
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@localhost/db")
conn.set_session(isolation_level='SERIALIZABLE')
try:
    # psycopg2 opens a transaction implicitly on the first statement
    with conn.cursor() as cur:
        cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
        cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
    conn.commit()
except Exception:
    conn.rollback()  # Undo both updates
    raise

# MySQL with InnoDB
import pymysql

conn = pymysql.connect(...)
try:
    with conn.cursor() as cur:
        # Applies to the next transaction only
        cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
        cur.execute("START TRANSACTION")
        # operations
    conn.commit()
except Exception:
    conn.rollback()
    raise
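Under SERIALIZABLE isolation, a database may abort a transaction that conflicts with a concurrent one (PostgreSQL reports this with SQLSTATE 40001), and the application is expected to retry the whole transaction. A minimal sketch of such a retry loop follows; `SerializationFailure` is a hypothetical stand-in for whatever exception your driver raises, and `run_with_retry` and `flaky_transfer` are illustrative names.

```python
import time

class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver's serialization error."""

def run_with_retry(txn_fn, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call txn_fn until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            # Exponential backoff before retrying the whole transaction
            sleep(base_delay * 2 ** (attempt - 1))

calls = []

def flaky_transfer():
    """Fails twice with a conflict, then commits."""
    calls.append(1)
    if len(calls) < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"

result = run_with_retry(flaky_transfer, sleep=lambda _: None)
```

The function passed in should contain the entire transaction, including the rollback of the failed attempt, so that each retry starts from a clean state.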
BASE Databases
# Cassandra - AP (tunable consistency, eventual at low levels)
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')

# Write (consistency level is set per statement)
insert = SimpleStatement(
    "INSERT INTO users (id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.ONE
)
session.execute(insert, [1, "John"])  # Returns once ONE replica acknowledges

# Read (may get stale data at low consistency levels)
result = session.execute("SELECT * FROM users WHERE id = 1")

# DynamoDB - Reads are eventually consistent by default
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Users')

# Write
table.put_item(Item={'id': '1', 'name': 'John'})

# Eventually consistent read
response = table.get_item(Key={'id': '1'})

# Strongly consistent read
response = table.get_item(
    Key={'id': '1'},
    ConsistentRead=True
)
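Tunable consistency in quorum-replicated stores usually comes down to one inequality: with N replicas, a read of R replicas is guaranteed to overlap a write acknowledged by W replicas when R + W > N. The helper below is a small sketch of that check; `quorum_overlaps` is a hypothetical name, and the rule ignores complications such as sloppy quorums and clock skew.

```python
def quorum_overlaps(n, w, r):
    """True when every read quorum intersects every write quorum (R + W > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n

n = 3  # replication factor
one_one = quorum_overlaps(n, w=1, r=1)        # fast, but reads can be stale
quorum_quorum = quorum_overlaps(n, w=2, r=2)  # reads overlap the latest write
all_one = quorum_overlaps(n, w=3, r=1)        # slow writes, fast consistent reads
```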
Hybrid Approaches
# NewSQL - ACID + Scalability
# Examples: CockroachDB, Spanner, YugabyteDB

# CockroachDB - Distributed ACID
# CockroachDB speaks the PostgreSQL wire protocol, so psycopg2 works
import psycopg2

conn = psycopg2.connect("postgresql://user@host:26257/db")
# Full ACID transactions across nodes!
with conn:  # Commits on success, rolls back on exception
    with conn.cursor() as cur:
        cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
        cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
# Atomic across nodes!

# Event Store - Event sourcing with ACID
# Illustrative pseudocode: EventStore, get_stream, and the event classes
# are hypothetical names, not a specific library's API
from eventstore import EventStore

store = EventStore()
# Append events atomically
stream = store.get_stream("orders")
stream.append([
    OrderCreatedEvent(order_id=1),
    PaymentReceivedEvent(amount=100)
])
# All or nothing!
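To show what an all-or-nothing append can look like, here is a minimal in-memory sketch. `InMemoryStream`, `ConcurrencyError`, and the `expected_version` parameter are assumptions for illustration; the version check is a common optimistic-concurrency technique in event stores, not the API of any particular product.

```python
class ConcurrencyError(Exception):
    """Raised when the stream changed since the caller last read it."""

class InMemoryStream:
    """Hypothetical append-only event stream with all-or-nothing appends."""
    def __init__(self):
        self._events = []

    @property
    def version(self):
        return len(self._events)

    def append(self, events, expected_version):
        # Validate everything first so a bad batch leaves the stream untouched
        if expected_version != self.version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {self.version}"
            )
        batch = list(events)
        if not all(isinstance(e, dict) and "type" in e for e in batch):
            raise ValueError("every event needs a 'type' field")
        self._events.extend(batch)  # single step: all events or none
        return self.version

stream = InMemoryStream()
stream.append(
    [{"type": "OrderCreated", "order_id": 1},
     {"type": "PaymentReceived", "amount": 100}],
    expected_version=0,
)

try:
    # Second event is malformed, so neither event is appended
    stream.append([{"type": "OrderShipped"}, {"oops": True}], expected_version=2)
except ValueError:
    pass
final_version = stream.version
```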
Consistency Patterns
Read Your Own Writes
# Ensure you read what you just wrote
class ReadYourOwnWrites:
    """Pattern for BASE systems"""
    def __init__(self, db):
        self.db = db

    def write_and_read(self, key, value):
        # Write
        self.db.write(key, value)
        # Wait for replication (wait_for_replication is a hypothetical
        # method; your client must provide an equivalent)
        self.db.wait_for_replication(key, value)
        # Now the read will return our write
        return self.db.read(key)
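Blocking until replication finishes is not the only option. Another common approach is to serve a session's reads from the primary for any key that session wrote recently. The sketch below assumes a known upper bound on replication lag; `SessionRouter`, `lag_window`, and the injectable `clock` are hypothetical names, and plain dicts stand in for the primary and a replica.

```python
import time

class SessionRouter:
    """Route a session's reads to the primary for keys it wrote recently."""
    def __init__(self, primary, replica, lag_window=1.0, clock=time.monotonic):
        self.primary = primary        # dict standing in for the primary
        self.replica = replica        # dict standing in for a lagging replica
        self.lag_window = lag_window  # assumed upper bound on replication lag (s)
        self.clock = clock
        self.last_write = {}          # key -> time of this session's last write

    def write(self, key, value):
        self.primary[key] = value
        self.last_write[key] = self.clock()

    def read(self, key):
        written_at = self.last_write.get(key)
        if written_at is not None and self.clock() - written_at < self.lag_window:
            return self.primary.get(key)  # replica may not have it yet
        return self.replica.get(key)

now = [0.0]  # fake clock so the example is deterministic
router = SessionRouter({}, {}, lag_window=1.0, clock=lambda: now[0])
router.write("name", "Ada")
fresh = router.read("name")  # within the window: served by the primary

now[0] = 5.0
later = router.read("name")  # window elapsed: served by the replica
```

In this toy setup nothing ever copies data to the replica, so the second read returns nothing; in a real deployment replication would have delivered the value by the time the window closes.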
Eventual to Strong Consistency
# Common consistency levels between eventual and strong
# (latency values are relative rules of thumb)
consistency_levels = {
    "eventual": {
        "description": "May return stale data",
        "latency": "Lowest"
    },
    "causal": {
        "description": "Causally related writes are seen in order",
        "latency": "Low"
    },
    "prefix": {
        "description": "Reads never see writes out of order",
        "latency": "Medium"
    },
    "strong": {
        "description": "Always sees the latest write",
        "latency": "Highest"
    }
}

# Choose based on use case
def get_consistency_level(operation):
    if operation == "user_profile":
        return "strong"  # User expects to see their changes
    elif operation == "like_count":
        return "eventual"  # Can be slightly stale
    elif operation == "comment_thread":
        return "causal"  # Replies must appear after the comments they answer
    return "strong"  # Unknown operation: default to the safest level
Conclusion
Choosing between ACID and BASE depends on your requirements:
- ACID: When strong consistency is critical (financial, inventory)
- BASE: When availability and scale matter more (social, analytics)
- CAP: During a network partition, you can’t have both consistency and availability
Modern databases often provide tunable consistency or hybrid models (NewSQL), so you can pick the guarantee each workload needs instead of committing to one extreme.