Introduction
OpenSearch has emerged as a powerful open-source platform for AI applications, pairing a mature Elasticsearch-derived search engine with built-in machine learning features. The k-NN (k-nearest neighbors) plugin enables efficient vector similarity search, making OpenSearch a compelling choice for semantic search, retrieval-augmented generation (RAG), and other AI-powered applications. In 2026, OpenSearch continues to gain traction as organizations seek open-source alternatives to proprietary vector databases while leveraging their existing search infrastructure.
This comprehensive guide covers everything you need to know about using OpenSearch for AI applications, from basic vector search setup to advanced RAG implementations and production best practices.
Understanding OpenSearch k-NN Vector Search
What is k-NN in OpenSearch?
k-NN is OpenSearch’s native vector similarity search capability that enables finding the most similar vectors to a query vector from a large dataset. Unlike traditional keyword search that matches exact terms, k-NN search finds semantically similar content by comparing numerical representations of meaning (embeddings).
Key features of OpenSearch k-NN:
- Approximate Nearest Neighbor (ANN) search: Fast similarity search using HNSW graphs
- Exact k-NN search: Precise results for smaller datasets
- Multiple distance metrics: Cosine similarity, Euclidean distance, inner product
- Filter support: Apply metadata filters to vector search results
- Scalable indexing: HNSW algorithm for efficient large-scale search
Supported Vector Field Types
OpenSearch supports several k-NN field configurations:
```json
{
  "mappings": {
    "properties": {
      "embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16
          }
        }
      }
    }
  }
}
```
The key parameters:
- dimension: Number of dimensions in your embedding vectors (1536 for OpenAI ada-002, 768 for many transformer models)
- space_type: Distance metric (cosinesimil, l2, innerproduct)
- engine: Vector search engine (faiss, lucene, or nmslib; nmslib is deprecated in recent releases)
- ef_construction: Size of the dynamic candidate list during index build (100-1000)
- m: Number of connections per node in HNSW graph (16 is a good default)
Vector Distance Metrics
OpenSearch supports multiple distance metrics for vector similarity:
- Cosine similarity (cosinesimil): measures the angle between vectors. Best for semantic similarity; insensitive to vector magnitude.
- Euclidean distance (l2): measures straight-line distance. Best when vector magnitude matters; sensitive to scale.
- Inner product (innerproduct): measures the dot product. Equivalent to cosine similarity for normalized vectors, with cheaper computation.
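The differences are easy to see with toy vectors. A pure-Python sketch, no cluster required:

```python
import math

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def l2_dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]  # same direction, twice the magnitude

# cosinesimil ignores magnitude: parallel vectors are maximally similar
print(cosine_sim(a, b))   # 1.0 (up to float error)

# l2 is sensitive to magnitude: these parallel vectors still differ
print(l2_dist(a, b))      # sqrt(1 + 4 + 9) ≈ 3.742

# innerproduct equals cosine once both vectors are unit-normalized
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(x * x for x in b))
a_unit = [x / norm_a for x in a]
b_unit = [x / norm_b for x in b]
print(inner_product(a_unit, b_unit))  # ≈ cosine_sim(a, b)
```

This is why most embedding providers recommend normalizing vectors: it lets you use the cheaper inner product while keeping cosine semantics.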
Setting Up Vector Search in OpenSearch
Step 1: Create Vector Index
Create an index with k-NN vector fields:
```json
PUT /documents
{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100,
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "id": { "type": "keyword" },
      "title": {
        "type": "text",
        "fields": {
          "keyword": { "type": "keyword" }
        }
      },
      "content": { "type": "text" },
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16
          }
        }
      },
      "title_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16
          }
        }
      },
      "created_at": { "type": "date" },
      "category": { "type": "keyword" }
    }
  }
}
```
Step 2: Configure k-NN Parameters
k-NN tuning lives in two places: cluster-level memory settings in opensearch.yml, and per-index settings such as index.knn.algo_param.ef_search, which is applied in the index settings (as in Step 1 above), not in opensearch.yml:

```yaml
# opensearch.yml — cluster-level k-NN memory settings
knn.memory.circuit_breaker.enabled: true
knn.memory.circuit_breaker.limit: 50%
```
Step 3: Index Documents with Vectors
Index documents with both text and vector fields:
```json
POST /documents/_doc/1
{
  "id": "doc-1",
  "title": "Introduction to Machine Learning",
  "content": "Machine learning is a subset of artificial intelligence that enables systems to learn from data...",
  "content_embedding": [0.1, 0.2, 0.3, ..., 0.9],
  "title_embedding": [0.2, 0.3, 0.4, ..., 0.8],
  "created_at": "2026-03-05T10:00:00Z",
  "category": "technology"
}
```

The embedding arrays are truncated here for readability; each must contain exactly the 1536 values declared in the mapping.
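Indexing one document per HTTP call is fine for demos, but at ingestion scale you would batch requests through the _bulk API. A sketch of building bulk actions for opensearch-py's helpers.bulk (the document dicts and IDs here are illustrative; the call that actually sends them needs a running cluster and is shown commented out):

```python
# Group document dicts into _bulk index actions for opensearch-py.
def build_bulk_actions(docs, index_name="documents"):
    """Turn document dicts into bulk index actions."""
    return [
        {"_index": index_name, "_id": doc["id"], "_source": doc}
        for doc in docs
    ]

docs = [
    {"id": "doc-1", "title": "Intro to ML", "content_embedding": [0.1, 0.2]},
    {"id": "doc-2", "title": "Vector search", "content_embedding": [0.3, 0.4]},
]
actions = build_bulk_actions(docs)

# With a live cluster:
# from opensearchpy import OpenSearch, helpers
# client = OpenSearch(["https://localhost:9200"])
# helpers.bulk(client, actions)
```

Bulk requests of a few hundred to a few thousand documents per batch typically give a large indexing throughput improvement over per-document calls.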
Building RAG Pipelines with OpenSearch
RAG Architecture Overview
Retrieval-Augmented Generation (RAG) combines vector search with LLMs to provide context-aware responses. The architecture consists of:
- Document Ingestion: Parse, chunk, and embed documents
- Vector Indexing: Store embeddings in OpenSearch
- Query Processing: Embed user queries and search for similar content
- Context Augmentation: Combine retrieved documents with LLM prompts
- Response Generation: Generate answers using LLM with context
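The five steps above can be sketched end to end with stubbed components before touching OpenSearch itself. Here embed(), search(), and generate() stand in for a real embedding model, a k-NN query, and an LLM call:

```python
def embed(text):
    # Stand-in for an embedding model: lengths of the first three words
    return [float(len(w)) for w in text.split()[:3]]

def search(query_vec, store, k=2):
    # Stand-in for a k-NN query: naive inner-product scoring over a list
    scored = sorted(
        store,
        key=lambda d: -sum(a * b for a, b in zip(query_vec, d["vec"])),
    )
    return scored[:k]

def generate(prompt):
    # Stand-in for an LLM call
    return f"Answer based on: {prompt[:60]}"

store = [
    {"text": "OpenSearch supports k-NN search.", "vec": [1.0, 2.0, 3.0]},
    {"text": "HNSW is an ANN algorithm.", "vec": [3.0, 2.0, 1.0]},
]

question = "What does OpenSearch support?"
hits = search(embed(question), store)                       # retrieval
context = "\n".join(h["text"] for h in hits)                # augmentation
answer = generate(f"Context:\n{context}\n\nQ: {question}")  # generation
```

The sections that follow replace each stub with its production counterpart.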
Document Ingestion Pipeline
```python
from typing import Dict, List

from opensearchpy import OpenSearch


class OpenSearchRAGIngestion:
    def __init__(self, opensearch_url: str, embedding_model, index_name: str = "documents"):
        # verify_certs=False is for local development only
        self.client = OpenSearch([opensearch_url], verify_certs=False)
        self.model = embedding_model
        self.index_name = index_name
        self.chunk_size = 1000      # words per chunk
        self.chunk_overlap = 200    # words shared between consecutive chunks

    def chunk_text(self, text: str) -> List[str]:
        """Split text into word-based chunks with overlap."""
        words = text.split()
        chunks = []
        for i in range(0, len(words), self.chunk_size - self.chunk_overlap):
            chunk = ' '.join(words[i:i + self.chunk_size])
            if chunk:
                chunks.append(chunk)
        return chunks

    def embed_and_index(self, document: Dict) -> List[str]:
        """Chunk, embed, and index a single document; returns chunk IDs."""
        doc_ids = []
        for i, chunk in enumerate(self.chunk_text(document['content'])):
            embedding = self.model.encode(chunk).tolist()
            chunk_doc = {
                'id': f"{document['id']}_chunk_{i}",
                'parent_id': document['id'],
                'chunk_index': i,
                'content': chunk,
                'content_embedding': embedding,
                'title': document.get('title', ''),
                'created_at': document.get('created_at', ''),
                'source': document.get('source', ''),
                'category': document.get('category', ''),
            }
            self.client.index(
                index=self.index_name,
                body=chunk_doc,
                id=chunk_doc['id'],
            )
            doc_ids.append(chunk_doc['id'])
        return doc_ids

    def batch_ingest(self, documents: List[Dict]) -> int:
        """Ingest multiple documents; returns the total number of chunks."""
        return sum(len(self.embed_and_index(doc)) for doc in documents)
```
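The word-based chunking logic is easy to sanity-check in isolation. Scaled-down sizes make the overlap visible (the sizes here are illustrative; the class defaults are 1000/200):

```python
def chunk_text(text, chunk_size=10, chunk_overlap=4):
    """Same logic as the class method, as a free function for testing."""
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_size - chunk_overlap):
        chunk = " ".join(words[i:i + chunk_size])
        if chunk:
            chunks.append(chunk)
    return chunks

text = " ".join(f"w{i}" for i in range(22))
chunks = chunk_text(text)
# Step is chunk_size - chunk_overlap = 6 words, so consecutive chunks
# share 4 words: w6..w9 appear at the end of chunk 0 and the start of
# chunk 1. Overlap like this keeps sentences that straddle a chunk
# boundary retrievable from at least one chunk.
```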
Query Processing and Retrieval
```python
class OpenSearchRAGQuery:
    def __init__(self, opensearch_url: str, embedding_model,
                 index_name: str = "documents", k: int = 5):
        self.client = OpenSearch([opensearch_url], verify_certs=False)  # dev only
        self.model = embedding_model
        self.index_name = index_name
        self.k = k

    def retrieve(self, query: str, filters: Dict = None) -> List[Dict]:
        """Retrieve the k most relevant chunks for a query."""
        query_embedding = self.model.encode(query).tolist()

        knn_clause = {
            "vector": query_embedding,
            "k": self.k,
        }

        # Filters go inside the knn clause so they are applied during the
        # ANN search itself (efficient filtering requires the lucene or
        # faiss engine).
        if filters:
            must = []
            for field, value in filters.items():
                if isinstance(value, list):
                    must.append({"terms": {field: value}})
                else:
                    must.append({"term": {field: value}})
            knn_clause["filter"] = {"bool": {"must": must}}

        search_body = {
            "size": self.k,
            "query": {"knn": {"content_embedding": knn_clause}},
            "_source": ["id", "parent_id", "content", "title"],
        }

        results = self.client.search(index=self.index_name, body=search_body)
        return [
            {
                'id': hit['_id'],
                'parent_id': hit['_source'].get('parent_id'),
                'content': hit['_source'].get('content', ''),
                'title': hit['_source'].get('title', ''),
                'score': hit['_score'],  # similarity is returned as _score
            }
            for hit in results['hits']['hits']
        ]

    def retrieve_with_metadata(self, query: str,
                               metadata_fields: List[str] = None) -> List[Dict]:
        """Retrieve chunks plus any extra stored metadata fields."""
        query_embedding = self.model.encode(query).tolist()
        fields = ['id', 'parent_id', 'content', 'title'] + (metadata_fields or [])
        search_body = {
            "size": self.k,
            "_source": fields,
            "query": {
                "knn": {
                    "content_embedding": {
                        "vector": query_embedding,
                        "k": self.k,
                    }
                }
            },
        }
        results = self.client.search(index=self.index_name, body=search_body)
        return [
            {field: hit['_source'].get(field) for field in fields}
            for hit in results['hits']['hits']
        ]
```
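Steps 4 and 5 of the pipeline, context augmentation and generation, sit downstream of retrieve(). A sketch of the prompt-assembly step, with a character budget so retrieved chunks don't overflow the LLM's context window (the chunk dict shape matches what retrieve() returns; the template wording is an illustrative choice):

```python
def build_prompt(question, chunks, max_chars=4000):
    """Assemble an LLM prompt from retrieved chunks, capped at max_chars."""
    context_parts = []
    used = 0
    for c in chunks:
        piece = f"[{c.get('title', 'untitled')}]\n{c['content']}"
        if used + len(piece) > max_chars:
            break  # stop before overflowing the context budget
        context_parts.append(piece)
        used += len(piece)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

chunks = [
    {"title": "ML intro", "content": "Machine learning is a subset of AI."},
    {"title": "Vectors", "content": "Embeddings are numeric representations."},
]
prompt = build_prompt("What is machine learning?", chunks)
```

The resulting string is what you would pass to your LLM client of choice; token-based budgeting (via the model's tokenizer) is more precise than character counts if you have one available.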
Hybrid Search Implementation
Combine vector search with keyword search for better results:
```python
class HybridSearchRAG:
    def __init__(self, opensearch_url: str, embedding_model,
                 index_name: str = "documents", k: int = 5):
        self.client = OpenSearch([opensearch_url], verify_certs=False)  # dev only
        self.model = embedding_model
        self.index_name = index_name
        self.k = k

    def hybrid_search(self, query: str,
                      keyword_boost: float = 2.0,
                      vector_boost: float = 1.0) -> List[Dict]:
        """Combine keyword (BM25) and vector scores in one bool query."""
        query_embedding = self.model.encode(query).tolist()
        search_body = {
            "size": self.k,
            "query": {
                "bool": {
                    "should": [
                        {
                            "knn": {
                                "content_embedding": {
                                    "vector": query_embedding,
                                    "k": self.k,
                                    "boost": vector_boost,
                                }
                            }
                        },
                        {
                            "match": {
                                "content": {
                                    "query": query,
                                    "boost": keyword_boost,
                                }
                            }
                        },
                    ]
                }
            },
            "_source": ["id", "parent_id", "content", "title"],
        }
        results = self.client.search(index=self.index_name, body=search_body)
        return [
            {
                'id': hit['_id'],
                'content': hit['_source'].get('content', ''),
                'title': hit['_source'].get('title', ''),
                'score': hit['_score'],
                'retrieval_method': 'hybrid',
            }
            for hit in results['hits']['hits']
        ]
```
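A caveat on the boost approach: BM25 scores are unbounded while k-NN similarity scores are roughly bounded, so fixed boosts mix incompatible scales. OpenSearch 2.10+ addresses this server-side with the hybrid query type plus a normalization-processor search pipeline. The core idea, min-max normalizing each result list before a weighted sum, looks like this (scores below are illustrative):

```python
def min_max(scores):
    """Rescale a {doc_id: score} map into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 1.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def fuse(keyword_scores, vector_scores, keyword_weight=0.3):
    """Weighted sum of normalized keyword and vector scores."""
    kw, vec = min_max(keyword_scores), min_max(vector_scores)
    doc_ids = set(kw) | set(vec)
    return {
        d: keyword_weight * kw.get(d, 0.0) + (1 - keyword_weight) * vec.get(d, 0.0)
        for d in doc_ids
    }

bm25 = {"doc-1": 12.3, "doc-2": 4.1}    # BM25 scores: unbounded
knn = {"doc-2": 0.91, "doc-3": 0.84}    # similarity scores: roughly [0, 1]
fused = fuse(bm25, knn)
best = max(fused, key=fused.get)        # doc-2: strong in both lists
```

Using the built-in search pipeline is preferable in production since it normalizes per query inside the cluster, but the client-side version is handy when you need custom fusion logic.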
Advanced OpenSearch AI Features
Multi-Vector Search
Support multiple embedding types for richer search:
```json
{
  "mappings": {
    "properties": {
      "title_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": { "name": "hnsw", "space_type": "cosinesimil", "engine": "faiss" }
      },
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": { "name": "hnsw", "space_type": "cosinesimil", "engine": "faiss" }
      },
      "metadata_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": { "name": "hnsw", "space_type": "cosinesimil", "engine": "faiss" }
      }
    }
  }
}
```
Query multiple vectors:
```python
def multi_vector_search(self, query: str,
                        weights: Dict[str, float] = None) -> List[Dict]:
    """Search across multiple vector fields, weighting each by boost."""
    query_embedding = self.model.encode(query).tolist()
    weights = weights or {
        'title_embedding': 2.0,
        'content_embedding': 1.0,
        'metadata_embedding': 0.5,
    }
    search_body = {
        "size": self.k,
        "query": {
            "bool": {
                "should": [
                    {
                        "knn": {
                            field: {  # keys above are already full field names
                                "vector": query_embedding,
                                "k": self.k,
                                "boost": weight,
                            }
                        }
                    }
                    for field, weight in weights.items()
                ]
            }
        },
        "_source": ["id", "content", "title"],
    }
    return self.client.search(index=self.index_name, body=search_body)
```
Filtered Vector Search
Apply metadata filters to vector search:
```python
def filtered_vector_search(self, query: str,
                           filters: Dict,
                           top_k: int = 10) -> List[Dict]:
    """Vector search with metadata filters applied during the ANN search."""
    query_embedding = self.model.encode(query).tolist()

    must = []
    for field, value in filters.items():
        if isinstance(value, list):
            must.append({"terms": {field: value}})
        else:
            must.append({"term": {field: value}})

    search_body = {
        "size": top_k,
        "query": {
            "knn": {
                "content_embedding": {
                    "vector": query_embedding,
                    "k": top_k,
                    # Efficient filtering inside the knn clause requires
                    # the lucene or faiss engine; it guarantees top_k
                    # results that actually match the filter, unlike
                    # post-filtering the ANN results.
                    "filter": {"bool": {"must": must}},
                }
            }
        },
        "_source": ["id", "content", "title"],
    }
    return self.client.search(index=self.index_name, body=search_body)
```
Range-based Vector Search
Combine vector search with range queries:
```python
def range_vector_search(self, query: str,
                        date_from: str = None,
                        date_to: str = None,
                        score_min: float = 0.5) -> List[Dict]:
    """Vector search restricted to a date range, with a score floor."""
    query_embedding = self.model.encode(query).tolist()

    date_range = {}
    if date_from:
        date_range["gte"] = date_from
    if date_to:
        date_range["lte"] = date_to

    knn_clause = {
        "vector": query_embedding,
        "k": 10,
    }
    if date_range:
        knn_clause["filter"] = {"range": {"created_at": date_range}}

    search_body = {
        "size": 10,
        "query": {"knn": {"content_embedding": knn_clause}},
        "_source": ["id", "content", "title"],
    }
    results = self.client.search(index=self.index_name, body=search_body)

    # Drop low-similarity hits client-side (setting min_score in the
    # search body would also work).
    return [hit for hit in results['hits']['hits']
            if hit['_score'] >= score_min]
```
Production Best Practices
Index Optimization
Optimize HNSW parameters for your use case:
```json
{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 500
    }
  },
  "mappings": {
    "properties": {
      "embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 1000,
            "m": 24
          }
        }
      }
    }
  }
}
```

Note that ef_construction and m belong in the method parameters of the mapping; only ef_search is set at the index level. (The knn.algo_param.ef_construction and knn.algo_param.m index settings are a legacy mechanism from the nmslib-only era.)
Parameters to tune:
- ef_construction: Higher values (500-1000) improve index quality but increase indexing time
- m: Higher values (24-48) improve accuracy but increase index size
- ef_search: Higher values (300-1000) improve recall but increase query latency
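When sizing hardware for these parameters, the OpenSearch k-NN documentation gives a rule-of-thumb native-memory estimate for an HNSW graph of roughly 1.1 × (4 × dimension + 8 × m) bytes per vector. A quick calculator:

```python
def hnsw_memory_gb(num_vectors, dimension, m):
    """Rule-of-thumb off-heap memory for a faiss/nmslib HNSW graph, in GiB."""
    bytes_total = 1.1 * (4 * dimension + 8 * m) * num_vectors
    return bytes_total / (1024 ** 3)

# 10M OpenAI ada-002 vectors (d=1536) with m=16 needs on the order of 64 GiB
est = hnsw_memory_gb(10_000_000, 1536, 16)
print(f"{est:.1f} GiB")
```

This memory is off-heap, which is why the circuit-breaker limit and JVM heap sizing in the next section matter: the graphs compete with the heap for the same RAM.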
Memory and Performance Tuning
Configure OpenSearch for vector search performance:
```yaml
# opensearch.yml — memory settings relevant to vector search
indices.memory.index_buffer_size: 20%
knn.memory.circuit_breaker.enabled: true
knn.memory.circuit_breaker.limit: 50%
```

Heap size is configured in config/jvm.options, not opensearch.yml. Because the k-NN graphs live in off-heap native memory, keep the heap at or below roughly half of system RAM so the graphs have room:

```
# config/jvm.options
-Xms8g
-Xmx8g
```
Monitoring and Metrics
Track vector search performance:
```
# Cluster-wide k-NN stats (cache hits/misses, graph memory, query counts)
GET /_plugins/_knn/stats

# Stats for specific nodes and specific metrics
GET /_plugins/_knn/nodeId1,nodeId2/stats/graph_memory_usage
```

Key statistics to monitor:
- graph_memory_usage: Off-heap memory consumed by loaded graphs
- hit_count / miss_count: Graph cache effectiveness
- knn_query_requests: Query throughput
- cache_capacity_reached: Whether the graph cache limit is being hit
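A small helper for turning the raw hit/miss counters into a cache-hit ratio. The sample dict below mimics the shape of a per-node stats payload (field names follow my reading of the k-NN stats API; verify against your cluster's output):

```python
def cache_hit_ratio(node_stats):
    """Fraction of graph lookups served from the cache."""
    hits = node_stats.get("hit_count", 0)
    misses = node_stats.get("miss_count", 0)
    total = hits + misses
    return hits / total if total else 0.0

sample = {"hit_count": 900, "miss_count": 100, "graph_memory_usage": 2048}
ratio = cache_hit_ratio(sample)  # 0.9
```

A persistently low ratio usually means the graphs don't fit in the circuit-breaker budget and are being evicted between queries.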
Security Considerations
k-NN queries run through the standard _search API, so securing them means securing the index with the security plugin's usual controls:

```yaml
# opensearch.yml — Security plugin
plugins.security.disabled: false
plugins.security.authcz.admin_dn:
  - "CN=admin,OU=Example,O=Example,L=Example,ST=Example,C=US"
```

```yaml
# roles.yml — a role allowed to create mappings and run (k-NN) searches
knn_user:
  index_permissions:
    - index_patterns:
        - "documents*"
      allowed_actions:
        - "indices:admin/mapping/put"
        - "indices:data/read/search"
        - "indices:data/write/index"
```
Use Cases and Examples
E-commerce Product Search
Build semantic product search with OpenSearch:
```python
class EcommerceProductSearch:
    def __init__(self, opensearch_url, embedding_model, index_name="products"):
        self.client = OpenSearch([opensearch_url], verify_certs=False)  # dev only
        self.model = embedding_model
        self.index_name = index_name

    def search_products(self, query: str,
                        category: str = None,
                        min_price: float = 0,
                        max_price: float = 1000) -> List[Dict]:
        """Semantic product search with price and category filters."""
        query_embedding = self.model.encode(query).tolist()

        must = [{"range": {"price": {"gte": min_price, "lte": max_price}}}]
        if category:
            must.append({"term": {"category": category}})

        search_body = {
            "size": 20,
            "query": {
                "knn": {
                    "product_embedding": {
                        "vector": query_embedding,
                        "k": 20,
                        # Filter inside the knn clause (lucene/faiss engines)
                        # so all 20 results respect price and category.
                        "filter": {"bool": {"must": must}},
                    }
                }
            },
            "_source": ["id", "name", "price", "category"],
        }
        return self.client.search(index=self.index_name, body=search_body)
```
Content Recommendation System
Build content recommendations:
```python
class ContentRecommendation:
    def __init__(self, opensearch_url, embedding_model, index_name="documents"):
        self.client = OpenSearch([opensearch_url], verify_certs=False)  # dev only
        self.model = embedding_model
        self.index_name = index_name

    def get_similar_content(self, content_id: str,
                            exclude_id: str = None,
                            limit: int = 5) -> List[Dict]:
        """Find content similar to a given document."""
        # Fetch the source document's stored embedding
        response = self.client.get(index=self.index_name, id=content_id)
        embedding = response.get('_source', {}).get('content_embedding')
        if not embedding:
            return []

        # By default exclude the source document itself, since it is its
        # own nearest neighbor; ask k for one extra hit to compensate.
        exclude_id = exclude_id or content_id

        search_body = {
            "size": limit,
            "query": {
                "knn": {
                    "content_embedding": {
                        "vector": embedding,
                        "k": limit + 1,
                        "filter": {
                            "bool": {
                                "must_not": [{"ids": {"values": [exclude_id]}}]
                            }
                        },
                    }
                }
            },
            "_source": ["id", "title"],
        }
        return self.client.search(index=self.index_name, body=search_body)
```
Document Deduplication
Find and remove duplicate documents:
```python
class DocumentDeduplicator:
    def __init__(self, opensearch_url, embedding_model, threshold=0.95):
        self.client = OpenSearch([opensearch_url], verify_certs=False)  # dev only
        self.model = embedding_model
        self.threshold = threshold  # applied to _score, see caution below

    def find_duplicates(self, document_id: str) -> List[str]:
        """Find near-duplicate documents via vector similarity."""
        # Fetch the document's stored embedding
        response = self.client.get(index="documents", id=document_id)
        embedding = response.get('_source', {}).get('content_embedding')
        if not embedding:
            return []

        search_body = {
            "size": 100,
            "query": {
                "knn": {
                    "content_embedding": {
                        "vector": embedding,
                        "k": 100,
                    }
                }
            },
            "_source": ["id", "title"],
        }
        results = self.client.search(index="documents", body=search_body)

        # Caution: _score is the plugin's transform of the space's distance,
        # not the raw cosine similarity, so calibrate the threshold against
        # your engine/space_type combination.
        return [
            hit['_id']
            for hit in results['hits']['hits']
            if hit['_id'] != document_id and hit['_score'] >= self.threshold
        ]
```
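Because _score is an engine-dependent transform of the distance rather than the raw cosine similarity, a more robust variant re-checks candidates with an exact cosine computation on the stored vectors before applying the threshold. A pure-Python sketch (the candidate tuples would come from the k-NN hits plus each hit's stored embedding):

```python
import math

def exact_cosine(a, b):
    """Exact cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def confirm_duplicates(source_vec, candidates, threshold=0.95):
    """candidates: list of (doc_id, embedding) pairs from the k-NN hits."""
    return [
        doc_id for doc_id, vec in candidates
        if exact_cosine(source_vec, vec) >= threshold
    ]

src = [1.0, 0.0]
cands = [("dup", [0.99, 0.01]), ("other", [0.0, 1.0])]
print(confirm_duplicates(src, cands))  # ['dup']
```

The k-NN query narrows millions of documents down to 100 candidates; the exact re-check then makes the threshold mean what it says.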
Comparison with Alternatives
OpenSearch vs Dedicated Vector Databases
| Feature | OpenSearch | Pinecone | Weaviate | PostgreSQL pgvector |
|---|---|---|---|---|
| Open Source | Yes | No | Yes | Yes |
| Hybrid Search | Native | Limited | Native | Requires custom |
| Existing Infrastructure | Leverage current | New | New | New |
| Scalability | Horizontal | Managed | Horizontal | Vertical + read replicas |
| Cost | Self-managed | Pay-per-use | Pay-per-use | Infrastructure cost |
| Learning Curve | Moderate | Low | Moderate | Low |
When to Choose OpenSearch for AI
Choose OpenSearch when:
- You already have OpenSearch/Elasticsearch infrastructure
- You need hybrid search (keyword + vector)
- You want open-source control
- You need enterprise features (security, monitoring)
- You have mixed workloads (search + AI)
Consider alternatives when:
- You need pure vector search at massive scale
- You prefer managed services
- You need specialized vector algorithms
- You want simpler setup
Conclusion
OpenSearch has matured into a capable platform for AI applications, offering native k-NN vector search, hybrid search capabilities, and seamless integration with LLMs. The key advantages for AI workloads include:
- Open Source Flexibility - Full control over your search infrastructure
- Hybrid Search - Combine keyword and vector search out of the box
- Existing Infrastructure - Leverage your current OpenSearch/Elasticsearch deployment
- Enterprise Features - Security, monitoring, and scalability
- Cost Efficiency - No per-query pricing or vendor lock-in
By following the patterns and best practices outlined in this guide, you can build production-ready AI applications with OpenSearch, from semantic search engines to RAG pipelines and recommendation systems. The integration with modern embedding models and LLMs makes OpenSearch a compelling choice for organizations seeking open-source AI solutions.
As vector search continues to evolve, OpenSearch’s active development community and enterprise adoption ensure it will remain a viable option for AI applications in 2026 and beyond.