Introduction
Understanding Redis internals helps you make better architectural decisions, optimize performance, and debug issues effectively. This article explores the core data structures, algorithms, and design patterns that make Redis one of the fastest databases available.
Core Data Structures
1. Simple Dynamic String (SDS)
Redis doesn’t use C strings directly. Instead, it implements Simple Dynamic Strings (SDS) for better performance and safety.
SDS Structure
// Redis SDS structure
struct sdshdr8 {
uint8_t len; // Used bytes
uint8_t alloc; // Allocated bytes (excluding header + null)
unsigned char flags; // Type indicator
char buf[]; // Actual string data
};
Advantages Over C Strings
// C string issues:
// - O(n) for length calculation
// - Not binary safe
// - Buffer overflow risks
// SDS benefits:
// - O(1) length retrieval
// - Binary safe (stores raw bytes)
// - Prevents buffer overflow
// - Memory pre-allocation
Memory Pre-allocation
// When growing strings, Redis pre-allocates extra space
// Less than 1MB: double the size
// More than 1MB: add 1MB
// Example:
// "hello" (5 bytes) -> append " world" (6 bytes)
// New allocation: 5 + 6 = 11, but actually allocates 22 bytes
// This reduces reallocation overhead
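The pre-allocation rule above can be modeled in a few lines of Python. This is a sketch of the policy described in sds.c, not the real allocator; `sds_new_alloc` is an invented name for illustration:

```python
SDS_MAX_PREALLOC = 1024 * 1024  # 1 MB threshold used by Redis

def sds_new_alloc(needed_len: int) -> int:
    """Greedy growth rule: double small strings, add 1 MB to large ones."""
    if needed_len < SDS_MAX_PREALLOC:
        return needed_len * 2
    return needed_len + SDS_MAX_PREALLOC

# "hello" (5) + " world" (6) = 11 bytes needed -> 22 bytes allocated
print(sds_new_alloc(11))
```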
2. Redis Object Encoding
Redis uses an object abstraction layer that maps high-level types to internal encodings.
// Object structure
typedef struct redisObject {
unsigned type:4; // REDIS_STRING, REDIS_LIST, etc.
unsigned encoding:4; // Actual storage format
unsigned lru:24; // LRU for eviction
int refcount; // Reference counting
void *ptr; // Pointer to actual data
} robj;
Encoding Types
| Type | Encoding | Internal Structure |
|---|---|---|
| String | int | Integer value |
| String | embstr | Embedded string (small strings) |
| String | raw | SDS allocation |
| List | ziplist | Compressed list (small lists) |
| List | quicklist | Linked list of ziplists |
| Hash | ziplist | Compressed hash (small hashes) |
| Hash | hashtable | Hash table |
| Set | intset | Integer set (numeric sets) |
| Set | hashtable | Hash table |
| ZSet | ziplist | Compressed list (small sorted sets) |
| ZSet | skiplist | Skip list + hash table |
# Check encoding in Redis
redis-cli HSET myhash field value
redis-cli OBJECT ENCODING myhash
# Example output for a small hash:
# "ziplist"   (Redis 7+ reports "listpack" instead)
3. Hash Table (Dict)
Redis uses hash tables for most data structures. The implementation includes collision handling via chaining and incremental rehashing.
Dict Structure
// Main dict structure
typedef struct dict {
dictType *type; // Type-specific functions
void *privdata; // Private data
dictht ht[2]; // Two hash tables (rehashing)
long rehashidx; // -1 = not rehashing
unsigned long iterators; // Active iterators
} dict;
// Hash table entry
typedef struct dictEntry {
void *key;
union {
void *val;
uint64_t u64;
int64_t s64;
double d;
} v;
struct dictEntry *next; // Chaining for collision
} dictEntry;
Hash Function
// Redis uses SipHash for string hashing
// Secure against hash collision attacks
uint64_t dictGenHashFunction(const void *key, int len) {
return siphash(key, len, dict_hash_function_seed);
}
Incremental Rehashing
# Redis performs rehashing incrementally
# During rehash:
# - Read operations: check both tables
# - Write operations: go to new table
# - Each operation: migrate 1 bucket
# This prevents latency spikes during growth
# Checking rehash status:
redis-cli INFO stats | grep rehash
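The two-table scheme can be sketched as a toy Python class. `IncrementalDict` is an invented name, and one key stands in for one bucket; the real dict migrates hash buckets and also rehashes in the background via a server cron job:

```python
class IncrementalDict:
    """Toy model of Redis's incremental rehash across two tables."""
    def __init__(self):
        self.ht = [{}, None]      # ht[0] = old table, ht[1] = new table
        self.rehashidx = -1       # -1 means not rehashing
        self._pending = []        # keys still waiting to migrate

    def start_rehash(self):
        self.ht[1] = {}
        self._pending = list(self.ht[0])
        self.rehashidx = 0

    def _rehash_step(self):
        """Migrate one key per operation (stands in for one bucket)."""
        if self.rehashidx == -1:
            return
        if self._pending:
            k = self._pending.pop()
            self.ht[1][k] = self.ht[0].pop(k)
        if not self.ht[0]:        # done: swap tables, stop rehashing
            self.ht[0], self.ht[1] = self.ht[1], None
            self.rehashidx = -1

    def set(self, k, v):
        self._rehash_step()
        if self.rehashidx != -1:  # writes go to the new table
            if k in self.ht[0]:
                del self.ht[0][k]
                self._pending.remove(k)
            self.ht[1][k] = v
        else:
            self.ht[0][k] = v

    def get(self, k):
        self._rehash_step()
        for t in self.ht:         # reads check both tables
            if t is not None and k in t:
                return t[k]
        return None
```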
4. SkipList
Sorted sets use SkipLists for ordered data with O(log N) operations.
SkipList Structure
// SkipList node
typedef struct zskiplistNode {
sds ele; // Element (string)
double score; // Score for sorting
struct zskiplistNode *backward; // Back pointer
struct zskiplistLevel { // Forward pointers
struct zskiplistNode *forward;
unsigned int span; // Elements between nodes
} level[];
} zskiplistNode;
// SkipList structure
typedef struct zskiplist {
struct zskiplistNode *header, *tail;
unsigned long length;
int level;
} zskiplist;
Why SkipList Over B-Tree?
# SkipList advantages:
# - Simpler to implement and debug than a B-tree
# - Easier to implement lock-free versions
# - Natural fit for range queries (ZRANGE, ZRANGEBYSCORE)
# - O(log N) operations on average
# Time complexity:
# - Search: O(log N)
# - Insert: O(log N)
# - Delete: O(log N)
# Redis also maintains a hash table alongside SkipList
# for O(1) score lookup by member
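A classic Pugh-style skip list can be sketched in Python. This is a simplified model: Redis's zskiplist also stores scores, backward pointers, and per-level spans (for rank queries), all omitted here, and the promotion probability `P = 0.25` matches ZSKIPLIST_P in the Redis source:

```python
import random

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # one forward pointer per level

class SkipList:
    MAX_LEVEL = 16
    P = 0.25  # promotion probability, as in Redis

    def __init__(self):
        self.header = Node(None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        lvl = 1
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, key):
        # Record the rightmost node before the insert point at each level
        update = [self.header] * self.MAX_LEVEL
        x = self.header
        for i in range(self.level - 1, -1, -1):
            while x.forward[i] and x.forward[i].key < key:
                x = x.forward[i]
            update[i] = x
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        node = Node(key, lvl)
        for i in range(lvl):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def search(self, key):
        # Descend from the top level, moving right while keys are smaller
        x = self.header
        for i in range(self.level - 1, -1, -1):
            while x.forward[i] and x.forward[i].key < key:
                x = x.forward[i]
        x = x.forward[0]
        return x is not None and x.key == key
```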
5. QuickList
Lists are implemented as QuickLists, a hybrid of ziplists and linked lists.
QuickList Structure
// QuickList
typedef struct quicklist {
quicklistNode *head;
quicklistNode *tail;
unsigned long count; // Total elements
unsigned long len; // Number of nodes
int fill : QL_FILL_BITS; // Ziplist size limit
int compress : QL_COMPRESS_BITS;
int bookmark_count : QL_BM_BITS;
quicklistBookmark bookmarks[];
} quicklist;
// Each node is a ziplist
typedef struct quicklistNode {
quicklistNode *prev;
quicklistNode *next;
unsigned char *zl; // Compressed ziplist
unsigned int sz; // Ziplist size
unsigned int count : 16;
unsigned int encoding : 2;
unsigned int container : 2;
unsigned int recompress : 1;
} quicklistNode;
Compression
# QuickList compression configuration
# compress depth: number of nodes left uncompressed at each end
# Example: list-compress-depth 1
# [head: plain] [compressed] ... [compressed] [tail: plain]
# Configuration:
# list-compress-depth 0 # No compression (default)
# list-compress-depth 1 # All but head and tail compressed
# list-compress-depth 3 # 3 plain nodes kept at each end
Event Loop and I/O
Reactor Pattern
Redis uses a single-threaded event loop based on the reactor pattern.
// Simplified event loop structure
typedef struct aeEventLoop {
int maxfd;
int setsize;
aeFileEvent *events; // Registered events
aeFiredEvent *fired; // Triggered events
aeTimeEvent *timeEventHead;
void *apidata; // Platform-specific data (epoll, kqueue)
aeBeforeSleepProc *beforesleep;
} aeEventLoop;
Event Types
// File events
AE_READABLE // Socket ready for reading
AE_WRITABLE // Socket ready for writing
// Time events
AE_NOMORE // Handler return value: do not reschedule this time event
Event Loop Flow
# Simplified event loop pseudocode:
def event_loop():
while running:
# 1. Process time events (expire keys, etc.)
process_time_events()
# 2. Poll for I/O events
# (epoll_wait / kqueue / select)
fds = poll(max_wait)
# 3. Process I/O events
for fd in fds:
if readable:
read_query(client)
if writable:
write_response(client)
# 4. Call beforeSleep hook
before_sleep()
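The pseudocode above can be made concrete with Python's selectors module, which wraps epoll/kqueue/select just like Redis's ae layer does. This is a toy single-iteration reactor over a local socketpair, not Redis's actual loop; `on_readable` is an invented handler name:

```python
import selectors
import socket

sel = selectors.DefaultSelector()  # epoll/kqueue/select, platform-dependent
client, server = socket.socketpair()  # stand-in for a real TCP connection
client.setblocking(False)
server.setblocking(False)

def on_readable(conn):
    # AE_READABLE handler: read the query, write the response
    data = conn.recv(1024)
    if data:
        conn.sendall(b"+echo:" + data)

sel.register(server, selectors.EVENT_READ, on_readable)
client.sendall(b"PING")

# One iteration of the loop: poll for events, then dispatch handlers
for key, events in sel.select(timeout=1):
    key.data(key.fileobj)

client.settimeout(1.0)
reply = client.recv(1024)
print(reply)  # b'+echo:PING'
sel.close()
```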
Multi-Threaded I/O (Redis 6+)
// Redis 6+ uses I/O threads for network processing
// Main thread: executes commands
// I/O threads: read/write to network
// Configuration
io-threads 4 // 4 threads total (the main thread counts as one)
io-threads-do-reads yes
# Performance impact:
# Single-threaded Redis: ~100K ops/sec
# Multi-threaded I/O: ~200-400K ops/sec
#
# Commands still execute on main thread
# I/O threads handle network buffering only
Persistence Mechanisms
RDB (Redis Database)
RDB creates point-in-time snapshots.
Save Points
# redis.conf
save 900 1 # 1 change in 15 minutes
save 300 10 # 10 changes in 5 minutes
save 60 10000 # 10000 changes in 1 minute
Fork and Copy
# RDB process:
# 1. Parent forks child (COW copy)
# 2. Child writes snapshot to temp file
# 3. Child replaces old RDB file
# 4. Parent continues serving requests
# Memory: up to ~2x in the worst case (copy-on-write duplicates
# pages the parent modifies during the snapshot)
# Track fork cost via latest_fork_usec:
redis-cli INFO stats | grep fork
RDB Structure
+--------+--------+--------+-----+--------+-----+
| HEADER | DB 0   | DB 1   | ... | DB n   | EOF |
+--------+--------+--------+-----+--------+-----+
  HEADER: "REDIS" magic + 4-digit version (e.g. 0009)
  DB n:   SELECT opcode + key type/value pairs
  EOF:    0xFF opcode + checksum
AOF (Append Only File)
AOF logs every write operation for recovery.
fsync Policies
# Performance vs Safety
appendfsync always # Every write (slowest, safest)
appendfsync everysec # Every second (default)
appendfsync no # Let OS decide (fastest, risky)
AOF Rewrite
# Background rewrite:
# 1. Fork child process
# 2. Child writes minimal AOF (only final state)
# 3. Parent continues logging to old AOF
# 4. Atomic switch to new AOF
# Triggered by:
# - auto-aof-rewrite-percentage 100
# - auto-aof-rewrite-min-size 64mb
# Manual:
redis-cli BGREWRITEAOF
AOF Recovery
# On restart:
# 1. Load RDB if exists (faster)
# 2. Load AOF and replay (complete)
# 3. Redis prefers AOF if both exist
# (aof-use-rdb-preamble yes)
# Checking integrity:
redis-check-rdb /var/redis/dump.rdb
redis-check-aof /var/redis/appendonly.aof
Memory Management
Memory Allocation
// Redis uses jemalloc for memory management
// Advantages over glibc malloc:
// - Less fragmentation
// - Better memory utilization
// - Faster allocations
// Can switch to tcmalloc or system malloc
// But jemalloc is recommended
Memory Optimization
# Tips for memory optimization:
# 1. Use appropriate data structures
# Bad: SET user:1:skills "python,javascript,redis"
# Good: SADD user:1:skills python javascript redis
# 2. Use integer IDs instead of strings
# Bad: SET user:alice:score 100
# Good: SET user:100:score 100 (with user mapping)
# 3. Use hash for related fields
# Bad: SET user:1:name "Alice"
# SET user:1:email "[email protected]"
# Good: HSET user:1 name "Alice" email "[email protected]"
# 4. Enable compression for lists
# CONFIG SET list-compress-depth 1
Memory Eviction
# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
# Eviction policies:
# - noeviction: Return error on writes (default)
# - allkeys-lru: Approximate LRU across all keys
# - allkeys-random: Random eviction across all keys
# - volatile-lru: Approximate LRU among keys with a TTL
# - volatile-ttl: Evict keys with the nearest expiry
# - volatile-random: Random among keys with a TTL
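The allkeys-lru policy can be modeled with an ordered dict. Note this is an idealized sketch: real Redis does not keep a full LRU list, it samples `maxmemory-samples` keys per eviction and evicts the least recently used of the sample; `LRUCache` is an invented name:

```python
from collections import OrderedDict

class LRUCache:
    """Idealized model of allkeys-lru, keyed by entry count not bytes."""
    def __init__(self, maxkeys):
        self.maxkeys = maxkeys
        self.data = OrderedDict()  # insertion order == recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.maxkeys:
            self.data.popitem(last=False)  # evict least recently used
```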
Replication
Master-Slave Replication
// Replication process:
// 1. Slave connects to master
// 2. Slave sends PSYNC with offset
// 3. Master sends full sync (RDB) if needed
// 4. Master streams new commands
// 5. Slave replays commands
Partial Resync
# PSYNC (Partial Sync):
# - If slave has valid replication ID + offset
# - Master can send missing commands only
# - Much faster than full sync
# Replication ID: identifies replication epoch
# Offset: position in replication stream
# Checking replication status:
redis-cli INFO replication
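The master-side decision between partial and full resync can be sketched as a circular backlog buffer. This is a simplified model (it omits replication IDs and uses a plain byte string instead of a true ring buffer); `ReplicationBacklog` is an invented name:

```python
class ReplicationBacklog:
    """Toy model of the master's replication backlog used by PSYNC."""
    def __init__(self, size):
        self.size = size
        self.buf = b""
        self.offset = 0  # replication offset of the first byte in buf

    def feed(self, data):
        # Append the command stream; discard the oldest bytes past `size`
        self.buf += data
        if len(self.buf) > self.size:
            drop = len(self.buf) - self.size
            self.buf = self.buf[drop:]
            self.offset += drop

    def psync(self, replica_offset):
        # Partial resync only if the replica's offset is still in the backlog
        if self.offset <= replica_offset <= self.offset + len(self.buf):
            return ("CONTINUE", self.buf[replica_offset - self.offset:])
        return ("FULLRESYNC", None)
```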
Redis Sentinel
// Sentinel responsibilities:
// - Monitor master/slave health
// - Automatic failover
// - Provide configuration to clients
// Leader election:
// - Sentinels vote to elect leader
// - Majority needed for failover
// - New master chosen from healthiest slave
Cluster
Hash Slots
# 16384 hash slots (0-16383)
# Key slot = CRC16(key) % 16384
# Example:
# Key "user:100" -> slot 1234
# Mapped to node handling 1234
# MOVED response:
# Redirects client to correct node
# Client caches mapping for future requests
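The slot mapping can be computed client-side. Redis uses CRC16-CCITT (the XMODEM variant: polynomial 0x1021, initial value 0), and only the substring inside the first `{...}` hash tag is hashed, so related keys can be forced onto the same slot. A pure-Python sketch:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), the variant Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag only
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

Keys sharing a hash tag, e.g. `{user:100}:profile` and `{user:100}:sessions`, land on the same slot, which is what makes multi-key operations on them possible in cluster mode.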
Failover in Cluster
// Failover process:
// 1. Slave detects master failure
// 2. Slave requests votes from other nodes
// 3. If majority, slave becomes master
// 4. Other slaves redirect to new master
// 5. Old master restored as slave when available
Client-Side Caching
Implementation
# Redis 6+ supports client-side caching
# 1. Client subscribes to the invalidation channel (RESP2 mode)
SUBSCRIBE __redis__:invalidate
# 2. Server tracks keys the client reads
#    (prefix filters require BCAST mode)
CLIENT TRACKING ON BCAST PREFIX user: PREFIX session:
# 3. When a key changes, server notifies the client
# 4. Client invalidates its local cache
Tracking Modes
// NOLOOP: Don't receive invalidations caused by your own writes
// BCAST: Broadcast invalidations for given prefixes to all tracking clients
// OPTIN: Only track keys explicitly flagged with CLIENT CACHING yes
Lua Scripting
Script Execution
// Redis uses Lua 5.1
// Scripts are executed atomically
// No other commands run during script
// Script cache:
// - Scripts are cached by SHA
// - EVALSHA reuses cached script
// - SCRIPT FLUSH clears cache
# Example (Python, redis-py)
import redis

r = redis.Redis()
script = """
local key = KEYS[1]
local current = redis.call('GET', key)
if current then
return tonumber(current) + 1
else
return 1
end
"""
# Execute
r.eval(script, 1, 'counter')
# Or use SHA for caching
sha = r.script_load(script)
r.evalsha(sha, 1, 'counter')
Transactions
Implementation
# Redis transactions use MULTI/EXEC (Python, redis-py)
import redis

r = redis.Redis()
amount = 25  # amount to withdraw

# Basic transaction (MULTI/EXEC via pipeline)
pipe = r.pipeline()
pipe.set('a', 1)
pipe.incr('a')
pipe.execute()

# With WATCH for optimistic locking
while True:
    try:
        pipe = r.pipeline()
        pipe.watch('account:balance')
        # Read through the watched pipeline connection
        balance = int(pipe.get('account:balance'))
        if balance >= amount:
            pipe.multi()
            pipe.decrby('account:balance', amount)
            pipe.execute()
            break
        else:
            pipe.unwatch()
            break
    except redis.WatchError:
        continue  # Another client changed the key; retry
Design Patterns Summary
| Pattern | Implementation | Use Case |
|---|---|---|
| Cache-Aside | Application logic | Read caching |
| Write-Through | Application logic | Write caching |
| Publish/Subscribe | Redis PUB/SUB | Real-time messaging |
| Distributed Lock | SET NX + Lua | Coordination |
| Rate Limiter | Sliding Window | API limits |
| Leaderboard | Sorted Set | Rankings |
| Time Series | Sorted Set + Rollups | Metrics |
| Unique Counter | HyperLogLog | Approximate distinct counts |
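The sliding-window rate limiter from the table can be sketched in Python. Redis-backed versions typically keep per-client timestamps in a sorted set and trim with ZREMRANGEBYSCORE; this local model uses a deque instead, and `SlidingWindowLimiter` is an invented name:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` hits in any trailing `window_seconds` window."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.hits = deque()  # timestamps of accepted hits, oldest first

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop hits that have slid out of the window
        while self.hits and self.hits[0] <= now - self.window:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False
```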
Conclusion
Redis’s performance stems from carefully designed data structures optimized for specific use cases. Understanding SDS for strings, SkipLists for sorted data, QuickLists for lists, and hash tables for key-value access helps you design better Redis implementations.
Combined with its event-driven architecture, efficient persistence mechanisms, and smart memory management, Redis provides a robust foundation for high-performance applications.
In the next article, we’ll explore how Redis powers AI applications through vector search and semantic caching.