Introduction
Understanding OpenSearch’s internal architecture helps you optimize queries and troubleshoot issues. This article explores how OpenSearch achieves distributed search and analytics.
Apache Lucene Foundation
OpenSearch is built on Apache Lucene:
┌─────────────────────────────────────────┐
│               OpenSearch                │
│  ┌───────────────────────────────────┐  │
│  │          REST API Layer           │  │
│  └───────────────────────────────────┘  │
│  ┌───────────────────────────────────┐  │
│  │        Cluster Management         │  │
│  └───────────────────────────────────┘  │
│  ┌───────────────────────────────────┐  │
│  │          Lucene Library           │  │
│  │  - IndexWriter                    │  │
│  │  - IndexReader                    │  │
│  │  - IndexSearcher                  │  │
│  └───────────────────────────────────┘  │
└─────────────────────────────────────────┘
Segment-Based Storage
Index Structure
Index (shard)
├── _0.segment
│   ├── _0.si  (Segment Info)
│   ├── _0.fdm (Field Metadata)
│   ├── _0.fdt (Field Data)
│   ├── _0.fdx (Field Index)
│   ├── _0.tmd (Term Metadata)
│   ├── _0.tvd (Term Vector Data)
│   ├── _0.tvx (Term Vector Index)
│   └── _0.nvd (Norm Data)
├── _1.segment
└── _2.segment
Document Addition
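# When you index a document:
# 1. It is added to the in-memory indexing buffer
# 2. The operation is appended to the translog for durability
# 3. A refresh writes the buffer to a new searchable segment;
#    a flush commits segments to disk and trims the translog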
POST /products/_doc
{
  "name": "Mouse",
  "price": 29.99
}
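The write path can be pictured with a toy model. This is a minimal sketch, not OpenSearch's implementation: `ToyShard` and its methods are hypothetical names, and real segments are Lucene files rather than Python lists. It only illustrates that a document sits in the buffer and translog first, and becomes searchable once a refresh turns the buffer into a segment.

```python
class ToyShard:
    """Toy model of a shard's write path (illustrative only)."""

    def __init__(self):
        self.buffer = []     # in-memory indexing buffer, not yet searchable
        self.translog = []   # operations not yet covered by a commit
        self.segments = []   # immutable segments, each a list of documents
        self.committed = 0   # number of segments included in the last commit

    def index(self, doc):
        # Buffer the document and record the operation in the translog.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Write the buffer out as a new searchable segment.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # Commit all segments and drop the translog entries they cover.
        self.refresh()
        self.committed = len(self.segments)
        self.translog.clear()

    def search(self, name):
        # Only documents that live in segments are visible to search.
        return [d for seg in self.segments for d in seg if d["name"] == name]


shard = ToyShard()
shard.index({"name": "Mouse", "price": 29.99})
before = shard.search("Mouse")  # still only in the buffer
shard.refresh()
after = shard.search("Mouse")   # now part of a segment
```

In this model `before` is empty and `after` contains the document, which mirrors the short delay between indexing and search visibility.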
Segment Merging
# Force merge down to a single segment
# max_num_segments is passed as a query parameter
POST /products/_forcemerge?max_num_segments=1
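Conceptually, a merge rewrites several small segments into one larger segment and leaves out documents that were marked as deleted. The sketch below is a simplification under stated assumptions: `merge_segments` is a hypothetical helper, and segments are modeled as plain lists rather than Lucene's on-disk structures.

```python
def merge_segments(segments, deleted_ids):
    """Combine several segments into one, skipping documents marked as deleted."""
    merged = []
    for segment in segments:
        for doc in segment:
            if doc["id"] not in deleted_ids:
                merged.append(doc)
    return merged


segments = [
    [{"id": 1, "name": "Mouse"}, {"id": 2, "name": "Keyboard"}],
    [{"id": 3, "name": "Monitor"}],
    [{"id": 4, "name": "Cable"}],
]
# Document 2 was deleted earlier; the merge physically removes it.
merged = merge_segments(segments, deleted_ids={2})
```

Fewer segments mean fewer structures to consult per search, which is why merging tends to help read-heavy indexes that no longer receive writes.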
Sharding Strategy
Primary and Replica Shards
# Index with shards
PUT /products
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}
# Shard distribution:
# Primaries: 0, 1, 2
# Replicas: 2 copies of each primary (6 replica shards, 9 shards in total)
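The arithmetic behind this layout is easy to check. In the sketch below, `shard_copies` is a hypothetical helper, not an OpenSearch API. The node count relies on the fact that two copies of the same shard are never allocated to the same node.

```python
def shard_copies(number_of_shards, number_of_replicas):
    """Count the shard copies created by the given index settings."""
    primaries = number_of_shards
    replicas = number_of_shards * number_of_replicas
    return {
        "primaries": primaries,
        "replicas": replicas,
        "total": primaries + replicas,
        # Each copy of a shard needs its own node to be fully allocated.
        "min_nodes_for_full_allocation": number_of_replicas + 1,
    }


layout = shard_copies(number_of_shards=3, number_of_replicas=2)
```

With the settings above, `layout` reports 3 primaries, 6 replicas, 9 shards in total, and at least 3 nodes before every replica can be assigned.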
Shard Routing
# Document routing
POST /products/_doc
{
  "name": "Mouse"
}
# Uses: shard = hash(_routing) % number_of_primary_shards
# (_routing defaults to the document's _id)
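The routing formula can be sketched in a few lines. This is an illustration only: `route` is a hypothetical function, and `zlib.crc32` stands in for the Murmur3 hash that OpenSearch actually uses, so the shard numbers it produces will not match a real cluster.

```python
import zlib


def route(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document _id) to a primary shard."""
    # crc32 is a stand-in hash for illustration; OpenSearch uses Murmur3.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


# The same routing value always lands on the same shard.
shard_for_doc = route("product-42", 3)
same_again = route("product-42", 3)
```

Because the result depends on the number of primary shards, that number is fixed when the index is created; changing it would send existing documents to different shards.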
Replication
Primary-Replica Flow
Write Request
       │
       ▼
┌───────────────┐
│ Primary Shard │──► In-memory buffer
└───────────────┘
       │
       ▼ (parallel)
┌─────────────┐
│  Replica 1  │──► Translog
└─────────────┘
       │
┌─────────────┐
│  Replica 2  │──► Translog
└─────────────┘
Read Request
# The coordinating node:
# 1. Routes the request to one copy of each relevant shard (primary or replica)
# 2. Waits for the responses
# 3. Merges them and returns the result to the client
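A read needs only one copy of each shard, and any copy (primary or replica) can serve it. The sketch below uses simple rotation to spread reads across copies. It is an assumption-laden illustration: `make_selector` and the copy names are hypothetical, and OpenSearch's own selection logic also takes node load and response times into account.

```python
from itertools import cycle


def make_selector(copies_by_shard):
    """Return a function that picks one copy per shard, rotating between copies."""
    rotations = {shard: cycle(copies) for shard, copies in copies_by_shard.items()}

    def select():
        return {shard: next(rotation) for shard, rotation in rotations.items()}

    return select


copies = {
    0: ["primary-0", "replica-0a", "replica-0b"],
    1: ["primary-1", "replica-1a", "replica-1b"],
    2: ["primary-2", "replica-2a", "replica-2b"],
}
select = make_selector(copies)
first_request = select()   # one copy per shard
second_request = select()  # the next copy of each shard
```

This is why adding replicas increases read throughput: each additional copy is another place a search can run.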
Near Real-Time Search
Refresh Interval
# Default refresh interval: 1 second
PUT /products/_settings
{
  "index": {
    "refresh_interval": "1s"
  }
}
# Make recent writes visible to search immediately
POST /products/_refresh
# Disable automatic refresh (for example, during a bulk load)
PUT /products/_settings
{
  "index": {
    "refresh_interval": "-1"
  }
}
Translog
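# The translog provides durability between flushes
# Settings shown as they would be passed at index creation
# sync_interval applies when translog durability is set to async
PUT /products
{
  "settings": {
    "translog": {
      "sync_interval": "5s",
      "flush_threshold_size": "512mb"
    }
  }
}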
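Durability comes from replaying the translog after a crash: the shard starts from its last commit and reapplies every operation recorded since. The following is a minimal sketch of that idea, assuming a hypothetical `recover` function and a simplified operation format that is not OpenSearch's on-disk translog layout.

```python
def recover(committed_docs, translog_ops):
    """Rebuild shard state from the last commit plus the translog."""
    docs = dict(committed_docs)  # copy, so the commit itself is left untouched
    for op in translog_ops:
        if op["type"] == "index":
            docs[op["id"]] = op["source"]
        elif op["type"] == "delete":
            docs.pop(op["id"], None)
    return docs


committed = {"1": {"name": "Mouse"}}
translog = [
    {"type": "index", "id": "2", "source": {"name": "Keyboard"}},
    {"type": "delete", "id": "1"},
]
state = recover(committed, translog)
```

A larger flush threshold means fewer flushes but a longer replay after a crash, which is the trade-off these settings control.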
Query Execution
Query Phase
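# 1. The coordinating node receives the request
# 2. It forwards the query to one copy of every shard in the index
# 3. Each shard returns its top-N document IDs and sort values (scores)
# 4. The coordinator merges these into a global top-N, fetches the
#    matching documents, and returns them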
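The scatter-gather pattern can be sketched with a heap-based merge. The function names and the `(score, doc_id)` tuples below are assumptions made for illustration; real shards return richer per-hit metadata.

```python
import heapq


def shard_top_n(hits, n):
    """Query phase on one shard: return the n best (score, doc_id) pairs."""
    return heapq.nlargest(n, hits)


def coordinator_merge(per_shard_results, n):
    """Reduce step: merge each shard's top-n into a global top-n."""
    all_hits = (hit for result in per_shard_results for hit in result)
    return heapq.nlargest(n, all_hits)


shards = [
    [(0.9, "a"), (0.2, "b"), (0.5, "c")],
    [(0.8, "d"), (0.7, "e")],
    [(0.1, "f"), (0.95, "g")],
]
top = coordinator_merge([shard_top_n(hits, 2) for hits in shards], 2)
```

Each shard must return its own top N because the coordinator cannot know in advance which shard holds the best hits. That is also why deep pagination is expensive: the per-shard result size grows with the page offset.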
Search Flow
Search Request
          │
          ▼
┌───────────────────┐
│   Query Parsing   │──► Parse DSL
└───────────────────┘
          │
          ▼
┌───────────────────┐
│  Query Execution  │──► Execute on shards
└───────────────────┘
          │
          ▼
┌───────────────────┐
│   Reduce Phase    │──► Merge results
└───────────────────┘
          │
          ▼
Response
Caching
Query Cache
# Enabled by default for filter queries
# Caches results per segment
# Clear cache
POST /products/_cache/clear
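Caching per segment works well because segments are immutable: a cached filter result stays valid for as long as the segment exists. The sketch below is a toy model, assuming a hypothetical `SegmentFilterCache` class; the real query cache stores bitsets and applies its own eviction and admission policies.

```python
class SegmentFilterCache:
    """Toy per-segment cache for term filters (illustrative only)."""

    def __init__(self):
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def matching_docs(self, segment_id, segment_docs, field, value):
        # The key includes the segment, so new segments never invalidate old entries.
        key = (segment_id, field, value)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        result = frozenset(
            i for i, doc in enumerate(segment_docs) if doc.get(field) == value
        )
        self._cache[key] = result
        return result

    def clear(self):
        self._cache.clear()


cache = SegmentFilterCache()
segment = [{"name": "Mouse"}, {"name": "Keyboard"}, {"name": "Mouse"}]
first = cache.matching_docs("_0", segment, "name", "Mouse")   # computed
second = cache.matching_docs("_0", segment, "name", "Mouse")  # served from cache
```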
Field Data Cache
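# Used for aggregations and sorting, mainly on text fields with fielddata enabled
# Held in JVM heap, so loading it can be memory-intensive
# Monitor field data usage
GET /products/_stats/fielddata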
Conclusion
Understanding OpenSearch internals (Lucene segments, sharding, and replication) helps you design better indexes and troubleshoot performance issues.