Introduction
Understanding OpenSearch’s internal architecture helps you optimize queries and troubleshoot issues. This article explores how OpenSearch achieves distributed search and analytics.
Apache Lucene Foundation
OpenSearch is built on Apache Lucene:
┌─────────────────────────────────────────┐
│ OpenSearch │
│ ┌───────────────────────────────────┐ │
│ │ REST API Layer │ │
│ └───────────────────────────────────┘ │
│ ┌───────────────────────────────────┐ │
│ │ Cluster Management │ │
│ └───────────────────────────────────┘ │
│ ┌───────────────────────────────────┐ │
│ │ Lucene Library │ │
│ │ - IndexWriter │ │
│ │ - IndexReader │ │
│ │ - Searcher │ │
│ └───────────────────────────────────┘ │
└─────────────────────────────────────────┘
Segment-Based Storage
Index Structure
Index (shard)
├── _0.segment
│ ├── _0.si (Segment Info)
│ ├── _0.fdm (Field Metadata)
│ ├── _0.fdt (Field Data)
│ ├── _0.fdx (Field Index)
│ ├── _0.tmd (Term Metadata)
│ ├── _0.tvd (Term Vector Data)
│ ├── _0.tvx (Term Vector Index)
│ └── _0.nvd (Norm Data)
├── _1.segment
└── _2.segment
Document Addition
# When you index a document:
# 1. Added to in-memory buffer
# 2. Written to translog
# 3. Periodically flushed to segment
POST /products/_doc
{
"name": "Mouse",
"price": 29.99
}
Segment Merging
# Force merge
POST /products/_forcemerge
{
"max_num_segments": 1
}
Sharding Strategy
Primary and Replica Shards
# Index with shards
PUT /products
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 2
}
}
# Shard distribution:
# Primary: 0, 1, 2
# Replicas: 0, 1, 2 (2 copies each)
Shard Routing
# Document routing
POST /products/_doc
{
"name": "Mouse"
}
# Uses: hash(_id) % num_primary_shards
Replication
Primary-Replica Flow
Write Request
│
▼
┌─────────────────┐
│ Primary Shard │──► In-memory buffer
└─────────────────┘
│
▼ (parallel)
┌───────────────┐
│ Replica 1 │──► Translog
└───────────────┘
│
┌───────────────┐
│ Replica 2 │──► Translog
└───────────────┘
Read Request
# Coordinating node routes to:
# 1. One primary + all replicas
# 2. Waits for responses
# 3. Returns to client
Near Real-Time Search
Refresh Interval
# Default refresh: 1 second
PUT /products
{
"settings": {
"refresh_interval": "1s"
}
}
# Make visible for search
POST /products/_refresh
# Disable auto-refresh
PUT /products
{
"settings": {
"refresh_interval": "-1"
}
}
Translog
# Translog provides durability
PUT /products
{
"settings": {
"translog": {
"sync_interval": "5s",
"size": "5mb"
}
}
}
Query Execution
Query Phase
# 1. Coordinator receives request
# 2. Broadcasts to all shards
# 3. Each shard returns top-N results
# 4. Coordinator merges and returns
Search Flow
Search Request
│
▼
┌─────────────────────┐
│ Query Parsing │──► Parse DSL
└─────────────────────┘
│
▼
┌─────────────────────┐
│ Query Execution │──► Execute on shards
└─────────────────────┘
│
▼
┌─────────────────────┐
│ Reduce Phase │──► Merge results
└─────────────────────┘
│
▼
Response
Caching
Query Cache
# Enabled by default for filter queries
# Caches results per segment
# Clear cache
POST /products/_cache/clear
Field Data Cache
# Used for aggregations, sorts
# Loading costs memory
# Monitor
GET /products/_stats
Conclusion
Understanding OpenSearch internals—Lucene segments, sharding, and replication—helps you design better indexes and troubleshoot performance issues.
Comments