
Redis Internals: Data Structures, Algorithms, and Design Patterns

Introduction

Understanding Redis internals helps you make better architectural decisions, optimize performance, and debug issues effectively. This article explores the core data structures, algorithms, and design patterns that make Redis one of the fastest databases available.


Core Data Structures

1. Simple Dynamic String (SDS)

Redis doesn’t use C strings directly. Instead, it implements Simple Dynamic Strings (SDS) for better performance and safety.

SDS Structure

// Redis SDS structure
struct sdshdr8 {
    uint8_t len;        // Used bytes
    uint8_t alloc;      // Allocated bytes (excluding header + null)
    unsigned char flags; // Type indicator
    char buf[];         // Actual string data
};

Advantages Over C Strings

// C string issues:
// - O(n) for length calculation
// - Not binary safe
// - Buffer overflow risks

// SDS benefits:
// - O(1) length retrieval
// - Binary safe (stores raw bytes)
// - Prevents buffer overflow
// - Memory pre-allocation

Memory Pre-allocation

// When growing strings, Redis pre-allocates extra space
// New length under 1MB: double it
// 1MB or more: add 1MB

// Example:
// "hello" (5 bytes) -> append " world" (6 bytes)
// New allocation: 5 + 6 = 11, but actually allocates 22 bytes
// This reduces reallocation overhead
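The growth policy above can be sketched in Python. This is a simplified model of the allocation rule (the function name is made up for illustration; the real logic lives in sdsMakeRoomFor in sds.c):

```python
SDS_MAX_PREALLOC = 1024 * 1024  # 1 MB, as defined in sds.h

def sds_new_alloc(needed_len: int) -> int:
    """Return the buffer size Redis would allocate for a string
    that needs `needed_len` bytes after an append."""
    if needed_len < SDS_MAX_PREALLOC:
        return needed_len * 2                  # small strings: double
    return needed_len + SDS_MAX_PREALLOC       # large strings: +1 MB

# "hello" (5) + " world" (6) needs 11 bytes -> 22 allocated
print(sds_new_alloc(11))               # -> 22
print(sds_new_alloc(2 * 1024 * 1024))  # -> 2 MB + 1 MB
```

The doubling amortizes repeated appends to O(1) per byte, while the 1MB cap bounds waste on large values.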

2. Redis Object Encoding

Redis uses an object abstraction layer that maps high-level types to internal encodings.

// Object structure
typedef struct redisObject {
    unsigned type:4;        // REDIS_STRING, REDIS_LIST, etc.
    unsigned encoding:4;    // Actual storage format
    unsigned lru:24;        // LRU for eviction
    int refcount;          // Reference counting
    void *ptr;             // Pointer to actual data
} robj;

Encoding Types

Type    Encoding    Internal Structure
String  int         Integer value
String  embstr      Embedded string (small strings)
String  raw         SDS allocation
List    ziplist     Compressed list (small lists)
List    quicklist   Linked list of ziplists
Hash    ziplist     Compressed hash (small hashes)
Hash    hashtable   Hash table
Set     intset      Integer set (numeric sets)
Set     hashtable   Hash table
ZSet    ziplist     Compressed list (small sorted sets)
ZSet    skiplist    Skip list + hash table

# Check encoding in Redis
redis-cli HSET myhash field value
redis-cli OBJECT ENCODING myhash

# Example output for a small hash:
# "ziplist"   (Redis 7+ reports "listpack")

3. Hash Table (Dict)

Redis uses hash tables for most data structures. The implementation includes collision handling via chaining and incremental rehashing.

Dict Structure

// Main dict structure
typedef struct dict {
    dictType *type;      // Type-specific functions
    void *privdata;      // Private data
    dictht ht[2];        // Two hash tables (rehashing)
    long rehashidx;      // -1 = not rehashing
    unsigned long iterators; // Active iterators
} dict;

// Hash table entry
typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;  // Chaining for collision
} dictEntry;

Hash Function

// Redis uses SipHash for string hashing
// Secure against hash collision attacks

uint64_t dictGenHashFunction(const void *key, int len) {
    return siphash(key, len, dict_hash_function_seed);
}

Incremental Rehashing

# Redis performs rehashing incrementally
# During rehash:
# - Read operations: check both tables
# - Write operations: go to new table
# - Each operation: migrate 1 bucket

# This prevents latency spikes during growth

# Rehashing is transparent to clients; serverCron also
# performs background rehash steps while Redis is idle
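The two-table scheme can be modeled in a toy Python class (illustrative only; names and sizing thresholds are simplified, not the real dict.c logic):

```python
class IncrementalDict:
    """Toy model of Redis's dict: two hash tables, with one
    bucket migrated from ht[0] to ht[1] per operation."""

    def __init__(self, size=4):
        self.ht = [[[] for _ in range(size)], None]  # ht[0], ht[1]
        self.rehashidx = -1                          # -1 = not rehashing
        self.used = 0

    def _bucket(self, table, key):
        return self.ht[table][hash(key) % len(self.ht[table])]

    def _rehash_step(self):
        if self.rehashidx == -1:
            return
        # Migrate one bucket from the old table to the new one
        for k, v in self.ht[0][self.rehashidx]:
            self._bucket(1, k).append((k, v))
        self.ht[0][self.rehashidx] = []
        self.rehashidx += 1
        if self.rehashidx == len(self.ht[0]):        # migration complete
            self.ht[0], self.ht[1] = self.ht[1], None
            self.rehashidx = -1

    def set(self, key, val):
        self._rehash_step()
        if self.rehashidx == -1 and self.used >= len(self.ht[0]):
            # Load factor >= 1: start rehashing into a 2x table
            self.ht[1] = [[] for _ in range(len(self.ht[0]) * 2)]
            self.rehashidx = 0
        tables = [0, 1] if self.rehashidx != -1 else [0]
        for t in tables:                 # update in place if present
            bucket = self._bucket(t, key)
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, val)
                    return
        write_table = 1 if self.rehashidx != -1 else 0  # new keys -> ht[1]
        self._bucket(write_table, key).append((key, val))
        self.used += 1

    def get(self, key):
        self._rehash_step()
        tables = [0, 1] if self.rehashidx != -1 else [0]
        for t in tables:                 # reads check both tables
            for k, v in self._bucket(t, key):
                if k == key:
                    return v
        return None

d = IncrementalDict(size=2)
for i in range(10):
    d.set(f"key{i}", i)
print(d.get("key7"))   # -> 7, found even if lookup happens mid-rehash
```

Because each operation pays for at most one bucket migration, the cost of growing the table is spread evenly instead of causing one large latency spike.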

4. SkipList

Sorted sets use SkipLists for ordered data with O(log N) operations.

SkipList Structure

// SkipList node
typedef struct zskiplistNode {
    sds ele;                 // Element (string)
    double score;            // Score for sorting
    struct zskiplistNode *backward;  // Back pointer
    struct zskiplistLevel {         // Forward pointers
        struct zskiplistNode *forward;
        unsigned int span;    // Elements between nodes
    } level[];
} zskiplistNode;

// SkipList structure
typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;
    int level;
} zskiplist;

Why SkipList Over B-Tree?

# SkipList advantages (as argued by Redis's author):
# - Simpler to implement and debug than balanced trees
# - Range queries (ZRANGE, ZRANGEBYSCORE) are a natural
#   linked-list walk at level 0
# - Memory overhead is tunable via the level probability

# Average time complexity:
# - Search: O(log N)
# - Insert: O(log N)
# - Delete: O(log N)

# Redis also maintains a hash table alongside the SkipList
# for O(1) lookup of a member's score (ZSCORE)
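A minimal Python skiplist illustrates the idea. This sketch uses random levels with p = 0.25 as Redis does, but omits spans and backward pointers for brevity:

```python
import random

MAX_LEVEL = 16
P = 0.25  # probability of promoting a node one level (Redis uses 1/4)

class Node:
    def __init__(self, score, ele, level):
        self.score, self.ele = score, ele
        self.forward = [None] * level   # one forward pointer per level

class SkipList:
    def __init__(self):
        self.header = Node(float('-inf'), None, MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        lvl = 1
        while random.random() < P and lvl < MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, score, ele):
        update = [self.header] * MAX_LEVEL  # rightmost node per level
        x = self.header
        for i in range(self.level - 1, -1, -1):
            while x.forward[i] and x.forward[i].score < score:
                x = x.forward[i]
            update[i] = x
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        node = Node(score, ele, lvl)
        for i in range(lvl):            # splice in at each level
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def search(self, score):
        x = self.header
        for i in range(self.level - 1, -1, -1):
            while x.forward[i] and x.forward[i].score < score:
                x = x.forward[i]
        x = x.forward[0]
        return x.ele if x and x.score == score else None

random.seed(7)  # deterministic demo
sl = SkipList()
for score, ele in [(3, 'c'), (1, 'a'), (2, 'b')]:
    sl.insert(score, ele)
print(sl.search(2))   # -> b
```

Searches descend from the highest level, skipping many nodes per step, which is what yields the expected O(log N) behavior.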

5. QuickList

Lists are implemented as QuickLists - a hybrid of ziplists and linked lists.

QuickList Structure

// QuickList
typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;        // Total elements
    unsigned long len;          // Number of nodes
    int fill : QL_FILL_BITS;    // Ziplist size limit
    int compress : QL_COMPRESS_BITS;
    int bookmark_count : QL_BM_BITS;
    quicklistBookmark bookmarks[];
} quicklist;

// Each node is a ziplist
typedef struct quicklistNode {
    quicklistNode *prev;
    quicklistNode *next;
    unsigned char *zl;          // Compressed ziplist
    unsigned int sz;            // Ziplist size
    unsigned int count : 16;
    unsigned int encoding : 2;
    unsigned int container : 2;
    unsigned int recompress : 1;
} quicklistNode;

Compression

# QuickList compression configuration
# compress depth: number of nodes to NOT compress from each end

# Example: compress depth 1, five nodes:
# [head] [compressed] [compressed] [compressed] [tail]
# Only the head and tail nodes stay uncompressed,
# since pushes and pops happen at the ends

# Configuration:
# list-compress-depth 1  # Default
# list-compress-depth 0  # No compression
# list-compress-depth 3  # More compression
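Which nodes get compressed for a given depth can be sketched with a small helper (a hypothetical function for illustration, not a Redis API):

```python
def compressed_node_indexes(num_nodes: int, depth: int) -> list:
    """Indexes of quicklist nodes that would be LZF-compressed:
    `depth` nodes at each end stay raw; depth 0 disables compression."""
    if depth == 0:
        return []
    return [i for i in range(num_nodes)
            if depth <= i < num_nodes - depth]

print(compressed_node_indexes(5, 1))  # -> [1, 2, 3]
print(compressed_node_indexes(5, 3))  # -> [] (all nodes near an end)
```

The trade-off: deeper compression saves memory for long lists but adds CPU cost whenever interior nodes must be decompressed for access.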

Event Loop and I/O

Reactor Pattern

Redis uses a single-threaded event loop based on the reactor pattern.

// Simplified event loop structure
typedef struct aeEventLoop {
    int maxfd;
    int setsize;
    aeFileEvent *events;     // Registered events
    aeFiredEvent *fired;     // Triggered events
    aeTimeEvent *timeEventHead;
    void *apidata;           // Platform-specific data (epoll, kqueue)
    aeBeforeSleepProc *beforesleep;
} aeEventLoop;

Event Types

// File events
AE_READABLE  // Socket ready for reading
AE_WRITABLE  // Socket ready for writing

// Time events
AE_NOMORE    // Handler return value: do not reschedule this timer

Event Loop Flow

# Simplified event loop pseudocode:

def event_loop():
    while running:
        # 1. Call beforeSleep hook (flush client output
        #    buffers, handle pending writes, etc.)
        before_sleep()

        # 2. Poll for I/O events, waiting at most until the
        #    nearest time event (epoll_wait / kqueue / select)
        fds = poll(time_until_next_timer())

        # 3. Process I/O events
        for fd in fds:
            if readable:
                read_query(client)
            if writable:
                write_response(client)

        # 4. Process time events (serverCron: expire keys,
        #    background rehashing, replication housekeeping)
        process_time_events()

Multi-Threaded I/O (Redis 6+)

// Redis 6+ uses I/O threads for network processing
// Main thread: executes commands
// I/O threads: read/write to network

// Configuration (redis.conf)
io-threads 4            // total thread count; the main thread is one of them
io-threads-do-reads yes // let I/O threads also parse reads
# Rough performance impact (hardware and payload dependent):
# Single-threaded Redis: ~100K ops/sec
# Multi-threaded I/O:    ~200-400K ops/sec
#
# Commands still execute on the main thread;
# I/O threads only handle network reads and writes

Persistence Mechanisms

RDB (Redis Database)

RDB creates point-in-time snapshots.

Save Points

# redis.conf
save 900 1      # 1 change in 15 minutes
save 300 10     # 10 changes in 5 minutes
save 60 10000   # 10000 changes in 1 minute
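The check Redis performs in serverCron against these rules can be modeled as a simplified sketch (function name is made up for illustration):

```python
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]  # (seconds, min changes)

def should_snapshot(dirty: int, elapsed: int) -> bool:
    """True if any save rule fires: at least `changes` writes
    have accumulated AND at least `seconds` have passed since
    the last successful save."""
    return any(dirty >= changes and elapsed >= seconds
               for seconds, changes in SAVE_RULES)

print(should_snapshot(dirty=1, elapsed=901))     # True  (rule: save 900 1)
print(should_snapshot(dirty=5, elapsed=100))     # False (no rule satisfied)
print(should_snapshot(dirty=10000, elapsed=61))  # True  (rule: save 60 10000)
```

Multiple rules let you snapshot quickly under heavy write load while still capturing rare writes eventually.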

Fork and Copy

# RDB process:
# 1. Parent forks child (COW copy)
# 2. Child writes snapshot to temp file
# 3. Child replaces old RDB file
# 4. Parent continues serving requests

# Memory: up to ~2x in the worst case (copy-on-write only
# duplicates the pages the parent modifies during the dump)
# Track the cost of the last fork:
redis-cli INFO stats | grep latest_fork_usec

RDB Structure

+-------+---------+--------+-----+--------+------+-------+
| REDIS | "0009"  | DB 0   | ... | DB n   | 0xFF | CRC64 |
| magic | version | SELECT |     | SELECT | EOF  | check |
|       |         | + keys |     | + keys |      | sum   |
+-------+---------+--------+-----+--------+------+-------+

# Each key entry: optional expire, type byte, key, value

AOF (Append Only File)

AOF logs every write operation for recovery.

fsync Policies

# Performance vs Safety
appendfsync always    # Every write (slowest, safest)
appendfsync everysec  # Every second (default)
appendfsync no        # Let OS decide (fastest, risky)

AOF Rewrite

# Background rewrite:
# 1. Fork child process
# 2. Child writes minimal AOF (only final state)
# 3. Parent continues logging to old AOF
# 4. Atomic switch to new AOF

# Triggered by:
# - auto-aof-rewrite-percentage 100
# - auto-aof-rewrite-min-size 64mb

# Manual:
redis-cli BGREWRITEAOF

AOF Recovery

# On restart:
# 1. If AOF is enabled, Redis replays the AOF
#    (the most complete record of writes)
# 2. Otherwise it loads the RDB snapshot (faster)
# 3. With aof-use-rdb-preamble yes, the AOF itself starts
#    with an RDB preamble for faster loading

# Checking integrity (standalone tools shipped with Redis):
redis-check-rdb /var/redis/dump.rdb
redis-check-aof /var/redis/appendonly.aof

Memory Management

Memory Allocation

// Redis uses jemalloc for memory management
// Advantages over glibc malloc:
// - Less fragmentation
// - Better memory utilization
// - Faster allocations

// Can switch to tcmalloc or system malloc
// But jemalloc is recommended

Memory Optimization

# Tips for memory optimization:

# 1. Use appropriate data structures
# Bad: SET user:1:skills "python,javascript,redis"
# Good: SADD user:1:skills python javascript redis

# 2. Use integer IDs instead of strings
# Bad: SET user:alice:score 100
# Good: SET user:100:score 100 (with user mapping)

# 3. Use hash for related fields
# Bad: SET user:1:name "Alice"
#      SET user:1:email "[email protected]"
# Good: HSET user:1 name "Alice" email "[email protected]"

# 4. Enable compression for lists
# CONFIG SET list-compress-depth 1

Memory Eviction

# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru

# Eviction policies:
# - noeviction: Return error on writes (default)
# - allkeys-lru: LRU across all keys
# - allkeys-random: Random across all keys
# - volatile-lru: LRU among keys with a TTL set
# - volatile-ttl: Evict keys with the shortest TTL
# - volatile-random: Random among keys with a TTL set
# (Redis 4+ also adds allkeys-lfu / volatile-lfu)
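Redis's LRU is approximated, not exact: instead of a global LRU list, it samples maxmemory-samples keys and evicts the one idle longest. A sketch of that selection (hypothetical helper, not the real evict.c code):

```python
import random

def pick_eviction_victim(last_access: dict, samples: int = 5) -> str:
    """Approximate LRU: sample up to `samples` keys and return the
    one with the oldest access time (cf. maxmemory-samples)."""
    pool = random.sample(list(last_access), min(samples, len(last_access)))
    return min(pool, key=lambda k: last_access[k])

# With fewer keys than the sample size, the choice is exact LRU:
print(pick_eviction_victim({'a': 30, 'b': 10, 'c': 20}))  # -> 'b'
```

Sampling avoids the memory and CPU cost of maintaining a perfectly ordered LRU structure, at the price of occasionally evicting a slightly suboptimal key.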

Replication

Master-Slave Replication

// Replication process:
// 1. Slave connects to master
// 2. Slave sends PSYNC with offset
// 3. Master sends full sync (RDB) if needed
// 4. Master streams new commands
// 5. Slave replays commands

Partial Resync

# PSYNC (Partial Sync):
# - If slave has valid replication ID + offset
# - Master can send missing commands only
# - Much faster than full sync

# Replication ID: identifies replication epoch
# Offset: position in replication stream

# Checking replication status:
redis-cli INFO replication
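The master's choice between a partial and a full resync can be sketched as follows (simplified; real Redis also checks a secondary replication ID to survive chained failovers):

```python
def psync_decision(master_replid: str, backlog_off: int, backlog: bytes,
                   slave_replid: str, slave_offset: int):
    """Model the master's reply to PSYNC <replid> <offset>.
    `backlog_off` is the stream offset of backlog[0]."""
    backlog_end = backlog_off + len(backlog)
    if (slave_replid == master_replid
            and backlog_off <= slave_offset <= backlog_end):
        missing = backlog[slave_offset - backlog_off:]
        return ('+CONTINUE', missing)   # send only the missing bytes
    return ('+FULLRESYNC', None)        # full RDB transfer needed

reply, data = psync_decision('abc', 100, b'0123456789', 'abc', 104)
print(reply, data)   # -> +CONTINUE b'456789'
reply, _ = psync_decision('abc', 100, b'0123456789', 'xyz', 104)
print(reply)         # -> +FULLRESYNC
```

If the slave's offset has fallen out of the backlog (too long a disconnect), the same logic falls through to a full resync.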

Redis Sentinel

// Sentinel responsibilities:
// - Monitor master/slave health
// - Automatic failover
// - Provide the current master address/configuration to clients

// Leader election:
// - Sentinels vote to elect leader
// - Majority needed for failover
// - New master chosen from healthiest slave

Cluster

Hash Slots

# 16384 hash slots (0-16383)
# Key slot = CRC16(key) % 16384

# Example (illustrative slot number):
# Key "user:100" -> some slot, e.g. 1234
# Routed to the node that owns that slot

# MOVED response:
# Redirects client to correct node
# Client caches mapping for future requests
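Redis Cluster's slot function is CRC16 (the XMODEM variant), and keys containing a {hash tag} hash only the tag so related keys land on one node. A self-contained sketch:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): poly 0x1021, init 0 -- the variant
    Redis Cluster uses for key slotting."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = (crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1
            crc &= 0xFFFF
    return crc

def key_hash_slot(key: bytes) -> int:
    """Slot for a key; only the {hash tag} is hashed if present."""
    start = key.find(b'{')
    if start != -1:
        end = key.find(b'}', start + 1)
        if end > start + 1:              # non-empty tag only
            key = key[start + 1:end]
    return crc16(key) % 16384

print(crc16(b'123456789') == 0x31C3)  # True: standard XMODEM check value
# Keys sharing a hash tag land in the same slot:
print(key_hash_slot(b'{user:100}.following') ==
      key_hash_slot(b'{user:100}.followers'))   # -> True
```

Hash tags are what make multi-key commands (and MULTI/EXEC) possible in cluster mode, since all keys in one command must live in the same slot.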

Failover in Cluster

// Failover process:
// 1. Slave detects master failure
// 2. Slave requests votes from other nodes
// 3. If majority, slave becomes master
// 4. Other slaves redirect to new master
// 5. Old master restored as slave when available

Client-Side Caching

Implementation

# Redis 6+ supports client-side caching

# 1. Client subscribes to the invalidation channel (RESP2 mode)
SUBSCRIBE __redis__:invalidate

# 2. Client enables tracking (broadcast mode with key prefixes)
CLIENT TRACKING ON BCAST PREFIX user: PREFIX session:

# 3. When key changes, server notifies client
# 4. Client invalidates local cache
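The invalidation flow above can be modeled in-process with a toy sketch (real tracking uses RESP3 push messages over the client connection; the class below is purely illustrative):

```python
class TrackingServer:
    """Toy model of CLIENT TRACKING: the server remembers which
    client-side caches hold each key and invalidates them on writes."""

    def __init__(self):
        self.data = {}
        self.tracked = {}   # key -> list of client-side caches

    def read(self, cache: dict, key: str):
        if key in cache:
            return cache[key]                      # local hit, no round trip
        value = self.data.get(key)
        self.tracked.setdefault(key, []).append(cache)  # track this client
        cache[key] = value                         # populate local cache
        return value

    def write(self, key: str, value):
        self.data[key] = value
        for cache in self.tracked.pop(key, []):
            cache.pop(key, None)                   # invalidation "push"

server = TrackingServer()
server.data['user:1'] = 'Alice'
local = {}
print(server.read(local, 'user:1'))   # -> Alice (now cached locally)
server.write('user:1', 'Bob')         # server invalidates stale caches
print(server.read(local, 'user:1'))   # -> Bob (refetched from server)
```

The key point: the server, not the client, decides when a cached value is stale, so clients can serve reads from local memory without polling.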

Tracking Modes

// NOLOOP: Don't receive invalidations for your own writes
// BCAST: Broadcast invalidations for the given prefixes to all
//        tracking clients (no per-key bookkeeping on the server)
// OPTIN/OPTOUT (+ CLIENT CACHING): track only selected keys

Lua Scripting

Script Execution

// Redis uses Lua 5.1
// Scripts are executed atomically
// No other commands run during script

// Script cache:
// - Scripts are cached by SHA
// - EVALSHA reuses cached script
// - SCRIPT FLUSH clears cache

# Example script (Python, redis-py)
script = """
local key = KEYS[1]
local current = redis.call('GET', key)
if current then
    return tonumber(current) + 1
else
    return 1
end
"""

# Execute
r.eval(script, 1, 'counter')

# Or use SHA for caching
sha = r.script_load(script)
r.evalsha(sha, 1, 'counter')

Transactions

Implementation

# Redis transactions use MULTI/EXEC

# Basic transaction
pipe = r.pipeline()
pipe.set('a', 1)
pipe.incr('a')
pipe.execute()

# With WATCH for optimistic locking (redis-py)
amount = 50  # amount to withdraw

with r.pipeline() as pipe:
    while True:
        try:
            pipe.watch('account:balance')
            # After watch() the pipeline runs in immediate mode,
            # so this GET executes on the watched connection
            balance = int(pipe.get('account:balance') or 0)

            if balance >= amount:
                pipe.multi()
                pipe.decrby('account:balance', amount)
                pipe.execute()
            else:
                pipe.unwatch()
            break
        except redis.WatchError:
            continue  # Key changed between WATCH and EXEC; retry

Design Patterns Summary

Pattern            Implementation             Use Case
Cache-Aside        Application logic          Read caching
Write-Through      Application logic          Write caching
Publish/Subscribe  Redis PUB/SUB              Real-time messaging
Distributed Lock   SET NX + Lua release       Coordination
Rate Limiter       Sorted Set sliding window  API limits
Leaderboard        Sorted Set                 Rankings
Time Series        Sorted Set + rollups       Metrics
Unique Counter     HyperLogLog                Cardinality estimation
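As one example from the table, the sliding-window rate limiter is typically built on a Sorted Set (ZREMRANGEBYSCORE to drop old hits, ZCARD to count, ZADD to record, usually wrapped in one Lua script for atomicity). A pure-Python model of the same algorithm, with a plain list standing in for the ZSET:

```python
class SlidingWindowLimiter:
    """Models the Redis ZSET rate-limit pattern: drop timestamps
    older than the window, count the rest, admit if under limit."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.hits = {}   # key -> list of timestamps (stand-in for a ZSET)

    def allow(self, key: str, now: float) -> bool:
        cutoff = now - self.window
        live = [t for t in self.hits.get(key, []) if t > cutoff]  # ZREMRANGEBYSCORE
        if len(live) < self.limit:                                # ZCARD
            live.append(now)                                      # ZADD
            self.hits[key] = live
            return True
        self.hits[key] = live
        return False

limiter = SlidingWindowLimiter(limit=2, window_seconds=60)
print(limiter.allow('api:alice', now=0))    # True
print(limiter.allow('api:alice', now=1))    # True
print(limiter.allow('api:alice', now=2))    # False (2 hits in window)
print(limiter.allow('api:alice', now=62))   # True  (old hits expired)
```

In production the check-and-record must be atomic (hence the Lua script); this in-memory version just demonstrates the window arithmetic.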

Conclusion

Redis’s performance stems from carefully designed data structures optimized for specific use cases. Understanding SDS for strings, SkipLists for sorted data, QuickLists for lists, and hash tables for key-value access helps you design better Redis implementations.

Combined with its event-driven architecture, efficient persistence mechanisms, and smart memory management, Redis provides a robust foundation for high-performance applications.

In the next article, we’ll explore how Redis powers AI applications through vector search and semantic caching.
