Introduction
Understanding MariaDB’s internal architecture helps developers and DBAs make informed decisions about storage engine selection, query optimization, and performance tuning. MariaDB’s pluggable storage engine architecture is one of its most distinctive features, offering flexibility to choose the optimal engine for different workloads.
This article explores MariaDB’s internal mechanisms, including storage engines, query processing pipeline, caching mechanisms, and the architectural decisions that make MariaDB unique.
Architecture Overview
Client-Server Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ MariaDB Server โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Connection Pool โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Thread Pool โ Authentication โ Protocol Layer โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ SQL Layer โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโ โ
โ โ Parser โ โ Optimizerโ โ Executor โ โ Caches โ โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโโ โโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Storage Engine Layer (Pluggable) โ
โ โโโโโโโโโ โโโโโโโโโ โโโโโโโโโโ โโโโโโโโโ โโโโโโโโโ โ
โ โInnoDB โ โ Aria โ โMyRocks โ โMemory โ โCSV โ โ
โ โโโโโโโโโ โโโโโโโโโ โโโโโโโโโโ โโโโโโโโโ โโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ OS/File System Layer โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Query Processing Flow
SQL Query
โ
โผ
โโโโโโโโโโโโโโโ
โ Parser โ 1. Parse SQL into AST
โโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโ
โ Analyzer โ 2. Validate, resolve names
โโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโ
โ Optimizer โ 3. Generate execution plan
โโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโ
โ Executor โ 4. Execute plan on storage engines
โโโโโโโโโโโโโโโ
โ
โผ
Result Set
Storage Engines
Pluggable Architecture
MariaDB’s storage engine architecture allows different engines for different tables:
-- Check available engines
SHOW ENGINES;
-- Create table with specific engine
CREATE TABLE transactional (...) ENGINE=InnoDB;
CREATE TABLE archive (...) ENGINE=Aria;
CREATE TABLE analytics (...) ENGINE=ColumnStore;
InnoDB (Default)
InnoDB is MariaDB’s default storage engine, providing ACID transactions and row-level locking.
InnoDB Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ InnoDB Buffer Pool โ
โ โโโโโโโโโโโ โโโโโโโโโโโ โโโโโโโโโโโ โ
โ โ Data โ โ Index โ โ Insert โ โ
โ โ Pages โ โ Pages โ โ Buffer โ โ
โ โโโโโโโโโโโ โโโโโโโโโโโ โโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Log Files (Redo Logs) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ System Tablespace โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Doublewrite Buffer โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Change Buffer โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
InnoDB Data Structures
-- InnoDB uses B+ trees for indexes
-- Clustered index: primary key
-- Secondary indexes: reference primary key
-- Page structure (16KB default)
-- - Page header (38 bytes)
-- - Infimum/Supremum records
-- - User records
-- - Free space
-- - Page trailer (8 bytes)
-- Row format (DYNAMIC/COMPACT)
-- - Variable-length columns stored in overflow pages
-- - NULL bitmap for null values
-- - Row header with transaction IDs
InnoDB Transaction Log
-- Redo log (transaction log)
-- Write-ahead logging (WAL)
-- Durability: innodb_flush_log_at_trx_commit
-- 1: Flush log to disk on commit (most durable)
-- 2: Flush to OS cache on commit
-- 0: Flush every second
-- Undo log: for MVCC and rollback
-- Stored in system tablespace (MariaDB 10.6+)
-- Or undo tablespaces (MariaDB 10.6+)
Aria Storage Engine
Aria is MariaDB’s improved replacement for MyISAM, with crash-safe tables.
Aria Features
-- Create Aria table
CREATE TABLE logs (...) ENGINE=Aria;
-- Aria features:
-- - Crash-safe tables (unlike MyISAM)
-- - Transaction support (since MariaDB 10.4)
-- - Page-based caching
-- - Full-text search support
-- - Better performance than MyISAM
Aria vs MyISAM
| Feature | Aria | MyISAM |
|---|---|---|
| Crash-safe | Yes | No |
| Transactions | Yes (10.4+) | No |
| Concurrency | Better | Limited |
| Index caching | Yes | Yes |
| Row count accuracy | Yes | No |
MyRocks Storage Engine
MyRocks uses RocksDB (embedded key-value store) for excellent write performance and compression.
MyRocks Architecture
-- Create MyRocks table
CREATE TABLE events (...) ENGINE=MyRocks;
-- MyRocks characteristics:
-- - Log-structured Merge (LSM) tree
-- - Write-optimized
-- - Compression (2-3x better than InnoDB)
-- - Less space than InnoDB
-- - Good for write-heavy workloads
ColumnStore (Analytical)
ColumnStore provides columnar storage for analytical workloads (OLAP).
ColumnStore Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ MariaDB Server (User Module) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Execution Manager โ โ
โ โ - Query parsing & optimization โ โ
โ โ - Job distribution to Performance Modules โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Performance Modules (PM) - Storage & Processing โ
โ โโโโโโโโโโโโโโ โโโโโโโโโโโโโโ โโโโโโโโโโโโโโ โ
โ โ PM1 โ โ PM2 โ โ PM3 โ โ
โ โ Local disk โ โ Local disk โ โ Local disk โ โ
โ โโโโโโโโโโโโโโ โโโโโโโโโโโโโโ โโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโค
โ Data Storage (Columnar) โ
โ - Extents (8-64MB) โ
โ - Block Compression โ
โ - No secondary indexes โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
ColumnStore Features
-- Create ColumnStore table
CREATE TABLE sales (
id INT,
date DATE,
product_id INT,
customer_id INT,
amount DECIMAL(10,2)
) ENGINE=ColumnStore;
-- Features:
-- - Columnar storage (analytical queries)
-- - Distributed MPP architecture
-- - Data compression
-- - No indexes needed
-- - Parallel query processing
Query Processing
Parser and Rewriter
-- MariaDB parser handles:
-- 1. Lexical analysis (tokenization)
-- 2. Syntax parsing
-- 3. Semantic analysis (permissions, types)
-- 4. Query rewrite
-- Check rewritten query
EXPLAIN FORMAT=JSON
SELECT * FROM users WHERE name = 'John';
Query Optimizer
MariaDB uses a cost-based optimizer:
-- View query execution plan
EXPLAIN SELECT * FROM employees WHERE department_id = 1;
-- Detailed JSON plan
EXPLAIN FORMAT=JSON
SELECT e.name, d.name
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > 50000;
-- Check optimizer trace
SET optimizer_trace='enabled=on';
SELECT * FROM employees WHERE department_id = 1;
SELECT * FROM information_schema.optimizer_trace;
SET optimizer_trace='enabled=off';
Index Merge
-- MariaDB can merge multiple index scans
EXPLAIN
SELECT * FROM employees
WHERE department_id = 1 OR salary > 100000;
-- Index intersection
EXPLAIN
SELECT * FROM employees
WHERE department_id = 1 AND salary > 50000;
Caching Mechanisms
Query Cache (Deprecated in 10.5+)
-- Query cache was removed in MariaDB 10.5
-- Use the following instead:
-- 1. Handler socket (for high performance)
-- 2. Memcached plugin (InnoDB)
-- 3. Query rewrite + prepared statements
-- 4. Application-level caching
InnoDB Buffer Pool
-- Buffer pool is the main cache
-- Contains: data pages, index pages, insert buffer, lock info
-- Check buffer pool status
SHOW ENGINE INNODB STATUS;
-- Buffer pool size (set in my.cnf)
-- innodb_buffer_pool_size = 4G
-- Monitor buffer pool
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool%';
-- Pages in buffer pool
SELECT * FROM information_schema.INNODB_BUFFER_PAGE
WHERE TABLE_NAME = 'users' LIMIT 10;
Thread Cache
-- Thread cache reduces thread creation overhead
-- thread_cache_size = 32 (in my.cnf)
-- Check thread cache usage
SHOW GLOBAL STATUS LIKE 'Threads%';
-- Threads_cached: idle threads in cache
-- Threads_connected: active connections
-- Threads_created: total created
-- Threads_running: active running
Memory Management
Global Buffers
# my.cnf configuration
[mysqld]
# Buffer pool (most important)
innodb_buffer_pool_size=4G
# Log buffer
innodb_log_buffer_size=64M
# Sort buffer
sort_buffer_size=4M
# Join buffer
join_buffer_size=4M
# Read buffer
read_buffer_size=2M
# Read rnd buffer
read_rnd_buffer_size=4M
Per-Connection Buffers
-- These are allocated per query/connection
-- Use carefully to avoid memory issues
-- Default values
-- sort_buffer_size: 256KB
-- join_buffer_size: 256KB
-- read_buffer_size: 128KB
-- read_rnd_buffer_size: 256KB
-- Check current values
SHOW VARIABLES LIKE '%buffer_size%';
Locking and Concurrency
Row-Level Locking
-- InnoDB uses row-level locking
-- Lock types:
-- - Shared (S): allows reading
-- - Exclusive (X): allows write
-- Lock modes:
-- - Record lock: single row
-- - Gap lock: range between records
-- - Next-key lock: record + gap
-- Show locks
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;
Metadata Locking
-- Metadata locks prevent DDL/DML conflicts
-- Check metadata locks
SELECT * FROM information_schema.metadata_lock_info;
-- Kill blocking session
-- SHOW PROCESSLIST;
-- KILL [SESSION_ID];
Deadlock Handling
-- InnoDB deadlock detection
-- Check for deadlocks
SHOW ENGINE INNODB STATUS;
-- Last detected deadlock
-- In STATUS output, look for "LATEST DETECTED DEADLOCK"
-- Set deadlock timeout
SET deadlock_timeout=10; -- milliseconds
Replication Internals
Binary Log
-- Binary log formats
-- 1. STATEMENT: SQL statements
-- 2. ROW: Changed rows (safer)
-- 3. MIXED: Auto-select
-- Binary log contents
-- - DDL statements
-- - DML statements that modify data
-- - Transaction events
-- Check binary log
SHOW MASTER STATUS;
SHOW BINLOG EVENTS IN 'mysql-bin.000001';
Replication Process
Master Slave
| |
| 1. DML Transaction |
| 2. Write to binlog |
| 3. Send to slave (I/O) |
|------------------------------>|
| | 4. Write to relay log
| | 5. Apply (SQL thread)
| |
GTID (Global Transaction ID)
-- Enable GTID
-- In my.cnf:
-- log_slave_updates=ON
-- gtid_mode=ON
-- enforce_gtid_consistency=ON
-- Use GTID replication
CHANGE MASTER TO
MASTER_USE_GTID=slave_pos;
-- Or MASTER_AUTO_POSITION
CHANGE MASTER TO
MASTER_AUTO_POSITION=1;
MVCC (Multi-Version Concurrency Control)
How MVCC Works
Transaction 1 (Read) Transaction 2 (Write)
| |
| SELECT (sees old version) |
|<----------------------------------|
| | UPDATE (creates new version)
| |-------------------------->|
| |
| COMMIT |
| |
| SELECT (now sees new version) |
|<----------------------------------|
InnoDB MVCC
-- Each row has:
-- - DB_TRX_ID: transaction ID
-- - DB_ROLL_PTR: pointer to undo log
-- Read consistency:
-- - Transaction reads committed changes
-- - Sees snapshot as of statement start
-- Isolation levels:
-- READ UNCOMMITTED: sees uncommitted
-- READ COMMITTED: sees committed
-- REPEATABLE READ (default): same snapshot
-- SERIALIZABLE: locking
Best Practices for Performance
Storage Engine Selection
-- OLTP (Transactional): InnoDB
CREATE TABLE orders (...) ENGINE=InnoDB;
-- OLAP (Analytical): ColumnStore
CREATE TABLE sales_facts (...) ENGINE=ColumnStore;
-- Write-heavy: MyRocks
CREATE TABLE events (...) ENGINE=MyRocks;
-- Archive/Logs: Aria
CREATE TABLE logs (...) ENGINE=Aria;
Memory Tuning
-- Buffer pool should be 70% of RAM
-- For 8GB RAM: innodb_buffer_pool_size = 5G
-- Log file size: 25% of buffer pool
-- For 5G buffer: innodb_log_file_size = 1.25G
Query Optimization
-- Always use EXPLAIN
EXPLAIN SELECT * FROM employees WHERE department_id = 1;
-- Avoid SELECT *
SELECT name, salary FROM employees;
-- Use covering indexes
CREATE INDEX idx_covering ON employees(department_id, salary, name);
-- Use EXPLAIN ANALYZE (MariaDB 10.6+)
EXPLAIN ANALYZE
SELECT * FROM employees WHERE department_id = 1;
Common Pitfalls
Pitfall 1: Wrong Storage Engine
-- Don't use MyISAM for transactional data
CREATE TABLE transactions (...) ENGINE=MyISAM; -- BAD
-- Use InnoDB
CREATE TABLE transactions (...) ENGINE=InnoDB; -- GOOD
Pitfall 2: Too Small Buffer Pool
-- Set appropriate buffer pool size
-- my.cnf:
-- innodb_buffer_pool_size = 4G
Pitfall 3: Not Using Prepared Statements
-- Bad: concatenating values
SELECT * FROM users WHERE id = ' + str(id) + '
-- Good: prepared statements
PREPARE stmt FROM 'SELECT * FROM users WHERE id = ?';
SET @id = 1;
EXECUTE stmt USING @id;
Resources
Conclusion
MariaDB’s pluggable storage engine architecture provides flexibility to optimize for different workloads. Understanding these internals helps in making informed decisions about storage engine selection, query optimization, and performance tuning.
In the next article, we’ll explore MariaDB trends in 2025-2026, including vector search, new features, and emerging use cases.
Comments