Consistency guarantees are among the most important things to understand when choosing a database. ACID and BASE represent two fundamentally different approaches to data consistency, each with its own trade-offs.
In this guide, we'll explore ACID, BASE, and the CAP theorem, then help you choose the right model for your application.
Understanding ACID
The ACID Properties
ACID Properties

A - Atomicity
    "All or nothing"
    Transaction either completes fully or not at all

C - Consistency
    "Valid state to valid state"
    Transaction moves database from one valid state to another

I - Isolation
    "Concurrent transactions appear serial"
    Effects of concurrent transactions are hidden

D - Durability
    "Once committed, data survives failures"
    Committed data is permanently stored
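To make the "valid state to valid state" rule concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the `CHECK (balance >= 0)` rule, and the helper names `try_overdraw` and `balance` are assumptions made up for this illustration; the point is only that the database itself refuses a change that would break a declared invariant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # the invariant
)
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 50)")
conn.commit()

def try_overdraw(conn, account_id, amount):
    """Return True if the withdrawal was applied, False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the database stays in its old valid state
        return False

def balance(conn, account_id):
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]
```

Withdrawing 80 from an account holding 50 is rejected and the balance is left untouched, while withdrawing 30 succeeds.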
Atomicity Example
# ACID Atomicity - Money transfer
# (withdraw, deposit, and transaction are placeholders for your data layer)

# WITHOUT atomicity (BAD)
def transfer_bad(from_account, to_account, amount):
    # Step 1: Withdraw
    withdraw(from_account, amount)
    # If the process fails here, the money disappears!
    # Step 2: Deposit
    deposit(to_account, amount)

# WITH atomicity (GOOD)
def transfer_atomic(from_account, to_account, amount):
    with transaction:  # All or nothing
        withdraw(from_account, amount)
        deposit(to_account, amount)
    # Either both succeed or both fail
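The transfer above uses placeholder functions. As a runnable sketch of the same idea, the version below uses Python's built-in `sqlite3` module, whose connection object acts as a transaction context manager. The table layout and the helper names `make_bank`, `transfer`, and `balances` are assumptions chosen for this example.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two accounts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
    conn.commit()
    return conn

def transfer(conn, from_id, to_id, amount):
    """Move money atomically: both updates commit, or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, from_id),
        )
        cur = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, to_id),
        )
        if cur.rowcount != 1:
            # The withdrawal above is undone by the rollback
            raise ValueError(f"unknown account {to_id}")

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

A transfer to a missing account raises `ValueError`, and because the exception leaves the `with conn:` block, the withdrawal is rolled back and no money disappears.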
Isolation Levels
# Different isolation levels (anomalies listed per the SQL standard)
isolation_levels = {
    "READ_UNCOMMITTED": {
        "description": "Can read uncommitted data",
        "problems": "Dirty reads, non-repeatable reads, phantoms",
        "use_when": "Never recommended"
    },
    "READ_COMMITTED": {
        "description": "Only read committed data",
        "problems": "Non-repeatable reads, phantoms",
        "use_when": "Default in PostgreSQL, Oracle, and SQL Server"
    },
    "REPEATABLE_READ": {
        "description": "Rows read once return the same values if read again",
        "problems": "Phantoms",
        "use_when": "Financial transactions (default in MySQL InnoDB)"
    },
    "SERIALIZABLE": {
        "description": "Transactions appear serial",
        "problems": "None (but slower; transactions may be aborted and retried)",
        "use_when": "Critical data"
    }
}
# SQLAlchemy isolation levels
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@localhost/db",
    isolation_level="SERIALIZABLE"  # Choose level
)
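The difference between the two weakest levels is easier to see in a toy model than in a real database. The sketch below is a deliberately simplified in-memory store, not how any real engine implements isolation; `ToyStore` and its methods are hypothetical names. It shows a dirty read under READ_UNCOMMITTED and how READ_COMMITTED avoids it.

```python
class ToyStore:
    """In-memory store that keeps uncommitted writes separate per transaction."""

    def __init__(self):
        self.committed = {}
        self.uncommitted = {}  # txn_id -> {key: value}

    def write(self, txn_id, key, value):
        # Buffer the write; it is not yet part of the committed state
        self.uncommitted.setdefault(txn_id, {})[key] = value

    def commit(self, txn_id):
        self.committed.update(self.uncommitted.pop(txn_id, {}))

    def rollback(self, txn_id):
        self.uncommitted.pop(txn_id, None)

    def read(self, key, level="READ_COMMITTED"):
        if level == "READ_UNCOMMITTED":
            # Dirty read: a write from any in-flight transaction is visible
            for pending in self.uncommitted.values():
                if key in pending:
                    return pending[key]
        # READ_COMMITTED: only committed data is visible
        return self.committed.get(key)
```

If transaction `t1` writes a balance of 0 and later rolls back, a READ_UNCOMMITTED reader may have acted on a value that never officially existed, while a READ_COMMITTED reader only ever saw the committed balance.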
Understanding BASE
The BASE Properties
BASE Properties

BA - Basically Available
    The system stays available, even under partition
    But responses may not reflect the latest write

S - Soft State
    State may change over time
    Even without input (due to replication lag)

E - Eventually Consistent
    Replicas converge once new writes stop arriving
    But convergence may take time
Eventual Consistency Example
# BASE - Eventual consistency
from collections import deque

class EventuallyConsistentDB:
    """
    Write returns immediately,
    reads may return stale data temporarily
    """
    def __init__(self, replica_count=2):
        self.primary = {}                                   # Writer
        self.replicas = [{} for _ in range(replica_count)]  # Readers
        self.pending_writes = deque()                       # Async replication queue

    def write(self, key, value):
        # Write to primary immediately
        self.primary[key] = value
        # Queue the write for asynchronous replication
        self.pending_writes.append({"key": key, "value": value})
        return "written"  # Returns before replicas are updated!

    def read(self, key):
        # Can return stale data from replicas
        if self.replicas:
            # Read from nearest replica; fall back to the primary only
            # when the replica has never seen the key
            return self.replicas[0].get(key, self.primary.get(key))
        return self.primary.get(key)

    def replicate(self):
        """Background replication: apply queued writes oldest-first"""
        while self.pending_writes:
            write = self.pending_writes.popleft()  # FIFO, so the newest write wins
            for replica in self.replicas:
                replica[write["key"]] = write["value"]
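One detail worth checking in any replication loop like the one above is the order in which queued writes are applied. The short sketch below, with a hypothetical helper `apply_writes`, drains the same queue oldest-first and newest-first to show that only the first order leaves the replica holding the latest value.

```python
from collections import deque

def apply_writes(replica, pending, fifo=True):
    """Drain `pending` into `replica`, oldest-first (FIFO) or newest-first (LIFO)."""
    queue = deque(pending)
    while queue:
        key, value = queue.popleft() if fifo else queue.pop()
        replica[key] = value
    return replica

# Two writes to the same key; the second one is newer
writes = [("cart", ["book"]), ("cart", ["book", "pen"])]

in_order = apply_writes({}, writes, fifo=True)
reversed_order = apply_writes({}, writes, fifo=False)
```

With FIFO the replica converges to the newest cart. With LIFO the older write is applied last and overwrites the newer one, so the replica never becomes consistent with the primary.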
The CAP Theorem
Understanding CAP
CAP Theorem

A distributed system can guarantee at most TWO of three:

    C - Consistency: every read sees the most recent write
    A - Availability: every request receives a response
    P - Partition Tolerance: the system keeps working when the network splits

CP: Consistency + Partition Tolerance
AP: Availability + Partition Tolerance
CA: Consistency + Availability (only possible when no partitions occur)
CP vs AP Systems
# CP Systems (Consistency over Availability)
# Example: ZooKeeper, etcd, HBase
cp_systems = {
    "description": "System becomes unavailable during partition",
    "behavior": "Refuse requests until consistency is restored",
    "use_when": "Financial systems, inventory, locking"
}

# AP Systems (Availability over Consistency)
# Example: Cassandra, DynamoDB, CouchDB
ap_systems = {
    "description": "System remains available during partition",
    "behavior": "May return stale data",
    "use_when": "Shopping carts, likes, counters"
}

# CA Systems (impossible in distributed systems)
# Example: Single-node databases
# Only possible without network partitions
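The behavioral difference between CP and AP can be sketched with a toy replica that is told whether it is cut off from the rest of the cluster. The `Replica` class, its `mode` flag, and the `Unavailable` exception are hypothetical names for this illustration and do not model how any of the systems listed above are implemented.

```python
class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it has the latest data."""

class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.data = {}
        self.partitioned = False  # True when cut off from the rest of the cluster

    def read(self, key):
        if self.partitioned and self.mode == "CP":
            # CP: refuse to answer rather than risk returning stale data
            raise Unavailable(f"cannot serve {key!r} during a partition")
        # AP (or no partition): answer from local, possibly stale, state
        return self.data.get(key)
```

During a partition the AP replica keeps answering with whatever it last knew, while the CP replica raises `Unavailable` until the partition heals.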
PACELC Model
# PACELC extends CAP
pacelc = """
If there is a Partition (P):
    trade off Availability (A) against Consistency (C)
Else (E), when the system is running normally:
    trade off Latency (L) against Consistency (C)
So it's really: "CAP + Else"
"""
ACID vs BASE Comparison
| Aspect | ACID | BASE |
|---|---|---|
| Consistency | Strong | Eventual |
| Availability | May be sacrificed | Prioritized |
| Transactions | Full ACID | Limited |
| Latency | Higher | Lower |
| Scale | Harder to scale horizontally | Highly scalable |
| Application complexity | Lower | Higher (must handle stale data) |
| Examples | PostgreSQL, MySQL | Cassandra, DynamoDB |
| Use When | Financial, inventory | Social media, analytics |
Choosing the Right Model
Decision Framework
def choose_consistency_model(requirements):
    """Choose between ACID and BASE"""
    # Strong consistency needed?
    if requirements.get("strong_consistency"):
        return "ACID"
    # Financial/transactional?
    if requirements.get("is_financial"):
        return "ACID"
    # High availability needed?
    if requirements.get("availability") == "critical":
        return "BASE"
    # Can tolerate eventual consistency?
    if requirements.get("tolerates_stale_data"):
        return "BASE"
    # Scale requirements?
    if requirements.get("scale") == "massive":
        return "BASE"
    # Default to ACID if unsure
    return "ACID"
Common Use Cases
ACID use cases:
- Banking and payments
- Inventory management
- Order processing
- User authentication
- Anything requiring transactions

BASE use cases:
- Social media feeds
- Analytics and metrics
- Caching layers
- Shopping carts
- Session stores
- IoT data ingestion
Database Examples
ACID Databases
# PostgreSQL - Strong ACID
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@localhost/db")
conn.set_session(isolation_level='SERIALIZABLE')
try:
    with conn.cursor() as cur:
        # psycopg2 opens a transaction implicitly on the first statement
        cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
        cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
    conn.commit()
except psycopg2.Error:
    # Undo both updates, then let the caller see the error
    conn.rollback()
    raise
# MySQL with InnoDB
import pymysql

conn = pymysql.connect(...)
with conn.cursor() as cur:
    # Applies to the next transaction started on this connection
    cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
    cur.execute("START TRANSACTION")
    # operations
    cur.execute("COMMIT")
BASE Databases
# Cassandra - AP (Eventual consistency)
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')

# Write (consistency level can be set per statement)
session.execute(
    "INSERT INTO users (id, name) VALUES (%s, %s)",
    [1, "John"]
)  # Returns once enough replicas acknowledge, not after all replicas have the data

# Read (may get stale data at low consistency levels)
result = session.execute("SELECT * FROM users WHERE id = 1")
# DynamoDB - Reads are eventually consistent by default
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Users')

# Write
table.put_item(Item={'id': '1', 'name': 'John'})

# Eventually consistent read
response = table.get_item(Key={'id': '1'})

# Strongly consistent read
response = table.get_item(
    Key={'id': '1'},
    ConsistentRead=True
)
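Both examples above expose a consistency knob. In quorum-replicated stores such as Cassandra, the usual rule of thumb is that a read is guaranteed to overlap the latest acknowledged write when R + W > N, where N is the replication factor and R and W are the numbers of replicas that must answer a read or acknowledge a write. With the Cassandra Python driver, the level is typically chosen per statement, for example `SimpleStatement(query, consistency_level=ConsistencyLevel.QUORUM)`. The helper names below are hypothetical and only encode the arithmetic.

```python
def quorum_overlaps(n, r, w):
    """True if every read quorum intersects every write quorum (R + W > N)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n

def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1
```

With N = 3, reading and writing at ONE (R = 1, W = 1) can miss the latest write, whereas QUORUM for both (R = 2, W = 2) always overlaps at the cost of higher latency.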
Hybrid Approaches
# NewSQL - ACID + Scalability
# Examples: CockroachDB, Spanner, YugabyteDB
# CockroachDB - Distributed ACID
# CockroachDB speaks the PostgreSQL wire protocol, so psycopg2 can connect to it
import psycopg2

conn = psycopg2.connect("postgresql://user@host:26257/db")
# Full ACID transactions across nodes!
with conn:  # Commits on success, rolls back on exception
    with conn.cursor() as cur:
        cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
        cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
# Atomic across nodes!
# Event Store - Event sourcing with ACID
# Illustrative pseudocode: EventStore, get_stream, and the event classes
# stand in for whatever event-store client you use
from eventstore import EventStore

store = EventStore()
# Append events atomically
stream = store.get_stream("orders")
stream.append([
    OrderCreatedEvent(order_id=1),
    PaymentReceivedEvent(amount=100)
])
# All or nothing!
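Distributed ACID transactions like the CockroachDB example usually come with one extra obligation: under serializable isolation the database may abort a conflicting transaction (SQLSTATE 40001), and the client is expected to retry it. Below is a minimal, driver-agnostic sketch of such a retry loop. `SerializationConflict` is a hypothetical stand-in for whatever error your driver raises, and `run_with_retry` is an assumed helper name.

```python
import time

class SerializationConflict(Exception):
    """Hypothetical stand-in for a driver's serialization-failure error."""

def run_with_retry(txn_fn, max_attempts=3, base_delay=0.0):
    """Call txn_fn until it succeeds or max_attempts conflicts have occurred."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationConflict:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            # Exponential backoff before the next attempt
            time.sleep(base_delay * 2 ** (attempt - 1))
```

The function passed as `txn_fn` should contain the whole transaction, so that each retry starts from a fresh read of the data.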
Consistency Patterns
Read Your Own Writes
# Ensure you read what you just wrote
class ReadYourOwnWrites:
    """Pattern for BASE systems.

    Assumes `db` exposes write(), read(), and a blocking
    wait_for_replication(); these are placeholder names.
    """
    def __init__(self, db):
        self.db = db

    def write_and_read(self, key, value):
        # Write
        self.db.write(key, value)
        # Wait for replication
        self.db.wait_for_replication(key, value)
        # Now read will return our write
        return self.db.read(key)
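Blocking until replication finishes is one way to get read-your-own-writes. Another common approach, sketched here under simplified assumptions, is to have each client session remember the version of its latest write and only accept a replica's answer if the replica has caught up to that version. `VersionedStore` and `Session` are hypothetical names for this toy model.

```python
class VersionedStore:
    """Toy primary/replica pair; every write bumps a version counter."""

    def __init__(self):
        self.primary = {}   # key -> (version, value)
        self.replica = {}   # lags behind until sync() is called
        self.version = 0

    def write(self, key, value):
        self.version += 1
        self.primary[key] = (self.version, value)
        return self.version

    def sync(self):
        """Simulate replication catching up."""
        self.replica = dict(self.primary)


class Session:
    """Remembers the newest version this client wrote for each key."""

    def __init__(self, store):
        self.store = store
        self.last_written = {}  # key -> version

    def write(self, key, value):
        self.last_written[key] = self.store.write(key, value)

    def read(self, key):
        needed = self.last_written.get(key, 0)
        version, value = self.store.replica.get(key, (0, None))
        if version >= needed:
            return value  # replica is fresh enough for this session
        # Replica is behind this session's own write: read from the primary
        return self.store.primary[key][1]
```

The writer always sees its own update, while other sessions may still read the older replica state until replication catches up. Writes never block on replication in this design.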
Eventual to Strong Consistency
# Gradually strengthen consistency (ordered roughly from weakest to strongest)
consistency_levels = {
    "eventual": {
        "description": "May return stale data",
        "latency": "Lowest"
    },
    "prefix": {
        "description": "Reads never see writes out of order",
        "latency": "Low"
    },
    "causal": {
        "description": "Respects causality",
        "latency": "Medium"
    },
    "strong": {
        "description": "Always sees latest",
        "latency": "Highest"
    }
}

# Choose based on use case
def get_consistency_level(operation):
    if operation == "user_profile":
        return "strong"  # User expects to see their changes
    elif operation == "like_count":
        return "eventual"  # Can be slightly stale
    elif operation == "comment_thread":
        return "causal"  # Order matters
    return "strong"  # Default to the safest level if unsure
Conclusion
Choosing between ACID and BASE depends on your requirements:
- ACID: When strong consistency is critical (financial, inventory)
- BASE: When availability and scale matter more (social, analytics)
- CAP: During a network partition, you must choose between consistency and availability
Modern databases often provide tunable consistency or hybrid models (NewSQL) that let you balance the two per workload.