Introduction
Understanding MySQL’s internal architecture helps you make better design decisions, optimize queries, and troubleshoot performance issues. This article explores the core components that make MySQL work: from the InnoDB storage engine to query execution and transaction management.
MySQL Architecture Overview
Layered Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ MySQL Server (SQL Layer) โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Connection Manager โ โ
โ โ Query Cache (removed 8.0) โ โ
โ โ Query Optimizer โ โ
โ โ Parser โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Storage Engine Layer โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โ
โ โ InnoDB โ โ MyISAM โ ... โ
โ โโโโโโโโโโโโ โโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ OS / File System โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Key Components
-- Show storage engines
SHOW ENGINES;
-- Check InnoDB status
SHOW ENGINE INNODB STATUS;
-- View system variables
SHOW VARIABLES LIKE 'innodb%';
InnoDB Storage Engine
Architecture
InnoDB Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Buffer Pool โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Pages (16KB each) โ โ
โ โ - Data pages โ โ
โ โ - Index pages โ โ
โ โ - Insert buffer โ โ
โ โ - Adaptive hash index โ โ
โ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Log Files โ
โ - Redo log โ
โ - Undo log โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Tablespaces โ
โ - System tablespace โ
โ - File-per-table tablespaces โ
โ - Undo tablespaces โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Page Structure
-- InnoDB page size is 16KB
SHOW VARIABLES LIKE 'innodb_page_size';
-- Page structure
-- โโโโโโโโโโโโโโโโโโโโโโ
-- โ Page Header โ 38 bytes
-- โโโโโโโโโโโโโโโโโโโโโโค
-- โ Infimum + Supremumโ 26 bytes
-- โโโโโโโโโโโโโโโโโโโโโโค
-- โ User Records โ Variable
-- โโโโโโโโโโโโโโโโโโโโโโค
-- โ Free Space โ Variable
-- โโโโโโโโโโโโโโโโโโโโโโค
-- โ Page Directory โ Variable
-- โโโโโโโโโโโโโโโโโโโโโโ
Buffer Pool
Purpose and Configuration
-- Buffer pool is memory cache for table data
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- Set buffer pool size (should be 70-80% of available RAM)
SET GLOBAL innodb_buffer_pool_size = 2147483648; -- 2GB
-- Buffer pool instances (for scalability)
SHOW VARIABLES LIKE 'innodb_buffer_pool_instances';
-- View buffer pool status
SELECT
pool_id,
pool_size,
free_buffers,
database_pages
FROM information_schema.INNODB_BUFFER_POOL_STATS;
-- Monitor page activity
SELECT * FROM information_schema.INNODB_BUFFER_PAGE;
Buffer Pool and Caching
-- What pages are in buffer pool
SELECT
table_name,
index_name,
COUNT(*) as pages,
ROUND(SUM(data_size) / 1024 / 1024, 2) as MB
FROM information_schema.INNODB_BUFFER_PAGE
GROUP BY table_name, index_name
ORDER BY pages DESC
LIMIT 20;
-- Buffer pool hit ratio
SHOW STATUS LIKE 'Innodb_buffer_pool_reads';
SHOW STATUS LIKE 'Innodb_buffer_pool_read_requests';
-- Calculate hit ratio
-- (1 - reads / read_requests) * 100 = hit ratio
-- Should be > 95%
Redo Log
Write-Ahead Logging
-- Redo log files
SHOW VARIABLES LIKE 'innodb_log_file_size';
SHOW VARIABLES LIKE 'innodb_log_files_in_group';
-- How redo logs work
-- 1. Transaction modifies data in buffer pool
-- 2. Changes written to redo log (persistently)
-- 3. Later flushed to data files
-- Check log sequence number (LSN)
SHOW ENGINE INNODB STATUS\G
-- Look for: log sequence number
-- Log flush
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
-- 1: Write to disk on commit (default, most safe)
-- 2: Write to OS cache on commit
-- 0/ASYNC: Write to OS cache every second
Undo Log
Transaction Rollback
-- Undo logs store old versions of data
-- Used for:
-- 1. Transaction rollback
-- 2. MVCC (consistent reads)
-- Undo tablespaces
SHOW VARIABLES LIKE 'innodb_undo_tablespaces';
-- Check undo usage
SHOW ENGINE INNODB STATUS\G
-- Look for: undo tablespaces
MVCC (Multi-Version Concurrency Control)
How InnoDB Implements MVCC
-- InnoDB MVCC uses:
-- 1. Transaction IDs (DB_TRX_ID)
-- 2. Roll Pointers (DB_ROLL_PTR)
-- 3. Undo logs
-- Every row has hidden columns:
-- - DB_TRX_ID: Transaction that last modified
-- - DB_ROLL_PTR: Pointer to undo log
-- - DB_ROW_ID: Row ID (if no primary key)
-- Example: Concurrent reads
-- Transaction 1: INSERT
START TRANSACTION;
INSERT INTO accounts (id, balance) VALUES (1, 1000);
COMMIT;
-- Transaction 2: READ (sees original value)
START TRANSACTION;
SELECT * FROM accounts; -- Sees 1000
COMMIT;
-- Transaction 3: UPDATE
START TRANSACTION;
UPDATE accounts SET balance = 2000 WHERE id = 1;
COMMIT;
-- Transaction 4: READ (sees updated value)
SELECT * FROM accounts; -- Sees 2000
Transaction Isolation Levels
-- REPEATABLE READ (default in InnoDB)
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Consistent read: sees snapshot from first read
-- May see phantom rows in certain cases
-- READ COMMITTED
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- Each read sees latest committed values
-- May see non-repeatable reads
-- READ UNCOMMITTED
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- May see uncommitted changes (dirty reads)
-- SERIALIZABLE
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Locks all selected rows
-- Equivalent to REPEATABLE READ + SELECT ... FOR SHARE
Index Structures
B-Tree Index
-- Clustered vs Secondary Indexes
-- Clustered index:
-- - Primary key is clustered index
-- - Data stored in index order
-- - One per table
-- Secondary index:
-- - Contains primary key value
-- - Can have multiple per table
-- B-Tree structure
-- Root (page)
-- โโโ Branch pages
-- โ โโโ ...
-- โโโ Leaf pages (actual data)
-- Index creation
CREATE INDEX idx_users_email ON users(email);
-- Composite index (leftmost prefix)
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Prefix index
CREATE INDEX idx_users_name ON users(name(10));
Index Statistics
-- Analyze table for index statistics
ANALYZE TABLE users;
-- View index information
SHOW INDEX FROM users;
-- Check index cardinality
SELECT
INDEX_NAME,
COLUMN_NAME,
CARDINALITY,
SEQ_IN_INDEX
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'mydb'
AND TABLE_NAME = 'users';
Query Processing
Query Execution Pipeline
SQL Query
โ
1. Parser (syntax check)
โ
2. Resolver (name resolution)
โ
3. Optimizer (create plan)
โ
4. Executor (run plan)
โ
Results
EXPLAIN Analysis
-- Basic EXPLAIN
EXPLAIN SELECT * FROM users WHERE email = '[email protected]';
-- EXPLAIN FORMAT=JSON (detailed)
EXPLAIN FORMAT=JSON
SELECT * FROM orders o
JOIN users u ON o.user_id = u.id
WHERE u.country = 'USA';
-- EXPLAIN ANALYZE (MySQL 8.0.18+)
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 1;
-- Common output types:
-- ALL: Full table scan (bad)
-- index: Full index scan
-- range: Index range scan
-- ref: Non-unique index lookup
-- eq_ref: Unique index lookup (good)
-- const: Constant lookup (excellent)
-- system: Single row (excellent)
Join Execution
-- Nested Loop Join
EXPLAIN SELECT * FROM orders o
JOIN users u ON o.user_id = u.id;
-- Hash Join (MySQL 8.0.18+)
EXPLAIN SELECT * FROM orders o
JOIN users u ON o.user_id = u.id;
-- Block Nested Loop (for no index)
EXPLAIN SELECT * FROM orders o
JOIN users u ON o.user_id = u.id
WHERE u.country = 'USA';
Transaction Management
ACID Properties
-- Atomicity: All or nothing
START TRANSACTION;
INSERT INTO accounts (id, balance) VALUES (1, 1000);
INSERT INTO accounts (id, balance) VALUES (2, 500);
-- Both succeed or both fail
COMMIT;
-- Consistency: Valid state
-- Enforced by constraints
ALTER TABLE accounts ADD CHECK (balance >= 0);
-- Isolation: Concurrent transactions
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Durability: Survives crashes
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
-- = 1 means fully durable
Locking
-- Row-level locks
SELECT * FROM orders WHERE id = 1 FOR UPDATE; -- Exclusive lock
SELECT * FROM orders WHERE id = 1 FOR SHARE; -- Shared lock
-- Gap locks
SELECT * FROM orders WHERE id BETWEEN 1 AND 10 FOR UPDATE;
-- Deadlock detection
SHOW ENGINE INNODB STATUS\G
-- Look for: LATEST DETECTED DEADLOCK
-- View locks
SELECT * FROM information_schema.INNODB_LOCKS;
-- View lock waits
SELECT * FROM information_schema.INNODB_LOCK_WAITS;
Memory Architecture
Key Memory Areas
-- Buffer pool (data and indexes)
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- Log buffer
SHOW VARIABLES LIKE 'innodb_log_buffer_size';
-- Sort buffer
SHOW VARIABLES LIKE 'sort_buffer_size';
-- Join buffer
SHOW VARIABLES LIKE 'join_buffer_size';
-- Read buffer
SHOW VARIABLES LIKE 'read_buffer_size';
-- Thread cache
SHOW VARIABLES LIKE 'thread_cache_size';
-- Query cache (removed in 8.0)
-- Use InnoDB buffer pool instead
Data Dictionary
MySQL 8.0 Data Dictionary
-- MySQL 8.0 uses InnoDB for data dictionary
-- No more .frm files
-- View data dictionary tables
SELECT * FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mysql'
AND TABLE_NAME LIKE 'innodb%';
-- Check dictionary
SELECT * FROM mysql.innodb_table_stats;
SELECT * FROM mysql.innodb_index_stats;
Conclusion
Understanding MySQL’s internalsโInnoDB, MVCC, the buffer pool, and query processingโhelps you design better schemas, write faster queries, and diagnose performance issues. These fundamentals apply whether you’re tuning a simple application or managing a complex distributed system.
In the next article, we’ll explore MySQL’s recent developments: new features in version 8.0 and 8.4 LTS, JSON enhancements, and the evolving ecosystem.
Comments