OpenSearch for AI: Vector Search, RAG Pipelines, and Semantic Search 2026

Introduction

OpenSearch has emerged as a powerful open-source platform for AI applications, building on its Elasticsearch roots with enhanced machine learning features. The k-NN (k-Nearest Neighbors) plugin enables efficient vector similarity search, making OpenSearch a compelling choice for semantic search, retrieval-augmented generation (RAG), and other AI-powered applications. In 2026, OpenSearch continues to gain traction as organizations seek open-source alternatives to proprietary vector databases while leveraging their existing search infrastructure.

This comprehensive guide covers everything you need to know about using OpenSearch for AI applications, from basic vector search setup to advanced RAG implementations and production best practices.


What is k-NN in OpenSearch?

k-NN is OpenSearch’s native vector similarity search capability that enables finding the most similar vectors to a query vector from a large dataset. Unlike traditional keyword search that matches exact terms, k-NN search finds semantically similar content by comparing numerical representations of meaning (embeddings).

Key features of OpenSearch k-NN:

  • Approximate Nearest Neighbor (ANN) search: Fast similarity search using HNSW graphs
  • Exact k-NN search: Precise results for smaller datasets
  • Multiple distance metrics: Cosine similarity, Euclidean distance, inner product
  • Filter support: Apply metadata filters to vector search results
  • Scalable indexing: HNSW algorithm for efficient large-scale search
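
For example, a minimal k-NN search request looks like this (the index and field names are illustrative, and the query vector would normally come from an embedding model):

```
GET /my-index/_search
{
  "size": 5,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [0.1, 0.2, 0.3],
        "k": 5
      }
    }
  }
}
```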

Supported Vector Field Types

OpenSearch supports several k-NN field configurations:

{
  "mappings": {
    "properties": {
      "embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16
          }
        }
      }
    }
  }
}

The key parameters:

  • dimension: Number of dimensions in your embedding vectors (1536 for OpenAI ada-002, 768 for many transformer models)
  • space_type: Distance metric (cosinesimil, l2, innerproduct)
  • engine: Vector search engine (faiss, nmslib)
  • ef_construction: Size of the dynamic candidate list during index build (100-1000)
  • m: Number of connections per node in HNSW graph (16 is a good default)
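
These parameters also drive memory consumption. As a rough sizing aid, the OpenSearch documentation estimates native HNSW memory at about 1.1 * (4 * dimension + 8 * m) bytes per vector; a small sketch:

```python
def estimate_hnsw_memory_bytes(num_vectors: int, dimension: int, m: int) -> int:
    """Approximate native memory for an HNSW graph, per the
    ~1.1 * (4 * dimension + 8 * m) bytes-per-vector rule of thumb."""
    return int(1.1 * (4 * dimension + 8 * m) * num_vectors)

# 1M OpenAI ada-002 vectors (1536 dims) with m=16: roughly 6.4 GiB
gib = estimate_hnsw_memory_bytes(1_000_000, 1536, 16) / 1024**3
print(f"{gib:.1f} GiB")
```

This is an estimate for the graph structure only; leave additional headroom for the JVM heap and OS page cache.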

Vector Distance Metrics

OpenSearch supports multiple distance metrics for vector similarity:

Cosine Similarity - Measures the angle between vectors:

{
  "space_type": "cosinesimil",
  "description": "Best for semantic similarity, insensitive to vector magnitude"
}

Euclidean Distance - Measures straight-line distance:

{
  "space_type": "l2",
  "description": "Best when vector magnitude matters, sensitive to scale"
}

Inner Product - Measures dot product:

{
  "space_type": "innerproduct",
  "description": "Equivalent to cosine for normalized vectors, faster computation"
}
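
The relationship between these metrics is easy to verify: cosine similarity ignores magnitude, and for unit-length (normalized) vectors the inner product equals the cosine similarity, which is why normalizing embeddings lets you use the cheaper innerproduct space. A quick check in plain Python:

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product over the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def inner(a, b):
    """Plain inner (dot) product."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

a, b = [3.0, 4.0], [4.0, 3.0]
# cosine is magnitude-invariant; for normalized vectors it equals
# the plain inner product
print(round(cosine(a, b), 4))                       # 0.96
print(round(inner(normalize(a), normalize(b)), 4))  # 0.96
```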

Setting Up Vector Search in OpenSearch

Step 1: Create Vector Index

Create an index with k-NN vector fields:

PUT /documents
{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100,
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "title": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "content": {
        "type": "text"
      },
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16
          }
        }
      },
      "title_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 100,
            "m": 16
          }
        }
      },
      "created_at": {
        "type": "date"
      },
      "category": {
        "type": "keyword"
      }
    }
  }
}

Step 2: Configure k-NN Parameters

Tune cluster-level k-NN settings in opensearch.yml (note that ef_search is an index-level setting, configured per index as in Step 1, not in opensearch.yml):

# k-NN configuration (cluster-level)
knn.memory.circuit_breaker.enabled: true
knn.memory.circuit_breaker.limit: 50%

Step 3: Index Documents with Vectors

Index documents with both text and vector fields:

POST /documents/_doc/1
{
  "id": "doc-1",
  "title": "Introduction to Machine Learning",
  "content": "Machine learning is a subset of artificial intelligence that enables systems to learn from data...",
  "content_embedding": [0.1, 0.2, 0.3, ..., 0.9],
  "title_embedding": [0.2, 0.3, 0.4, ..., 0.8],
  "created_at": "2026-03-05T10:00:00Z",
  "category": "technology"
}
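
Indexing documents one at a time works for demos, but production loads should go through the _bulk API. A sketch of building the NDJSON payload by hand (the helper name here is ours; the opensearch-py client also ships a `helpers.bulk` utility that handles this for you):

```python
import json

def build_bulk_payload(index_name, docs):
    """Build an NDJSON body for the _bulk API: an action line
    followed by a source line for each document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload("documents", [
    {"id": "doc-1", "title": "Intro to ML", "content_embedding": [0.1, 0.2]},
])
# POST the payload to /_bulk with Content-Type: application/x-ndjson
```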

Building RAG Pipelines with OpenSearch

RAG Architecture Overview

Retrieval-Augmented Generation (RAG) combines vector search with LLMs to provide context-aware responses. The architecture consists of:

  1. Document Ingestion: Parse, chunk, and embed documents
  2. Vector Indexing: Store embeddings in OpenSearch
  3. Query Processing: Embed user queries and search for similar content
  4. Context Augmentation: Combine retrieved documents with LLM prompts
  5. Response Generation: Generate answers using LLM with context

Document Ingestion Pipeline

from opensearchpy import OpenSearch
from typing import List, Dict, Optional

class OpenSearchRAGIngestion:
    def __init__(self, opensearch_url: str, embedding_model, index_name: str = "documents"):
        self.client = OpenSearch([opensearch_url], verify_certs=False)
        self.model = embedding_model
        self.index_name = index_name
        self.chunk_size = 1000
        self.chunk_overlap = 200
    
    def chunk_text(self, text: str) -> List[str]:
        """Split text into chunks with overlap"""
        chunks = []
        words = text.split()
        
        for i in range(0, len(words), self.chunk_size - self.chunk_overlap):
            chunk = ' '.join(words[i:i + self.chunk_size])
            if chunk:
                chunks.append(chunk)
        
        return chunks
    
    def embed_and_index(self, document: Dict) -> List[str]:
        """Embed and index a document"""
        doc_ids = []
        
        # Create chunks
        chunks = self.chunk_text(document['content'])
        
        for i, chunk in enumerate(chunks):
            # Generate embedding
            embedding = self.model.encode(chunk).tolist()
            
            # Create document with chunk metadata
            chunk_doc = {
                'id': f"{document['id']}_chunk_{i}",
                'parent_id': document['id'],
                'chunk_index': i,
                'content': chunk,
                'content_embedding': embedding,
                'title': document.get('title', ''),
                'created_at': document.get('created_at', ''),
                'source': document.get('source', ''),
                'category': document.get('category', '')
            }
            
            # Index the chunk
            response = self.client.index(
                index=self.index_name,
                body=chunk_doc,
                id=chunk_doc['id']
            )
            doc_ids.append(chunk_doc['id'])
        
        return doc_ids
    
    def batch_ingest(self, documents: List[Dict]) -> int:
        """Batch ingest multiple documents"""
        total_chunks = 0
        
        for doc in documents:
            total_chunks += len(self.embed_and_index(doc))
        
        return total_chunks

Query Processing and Retrieval

class OpenSearchRAGQuery:
    def __init__(self, opensearch_url: str, embedding_model, 
                 index_name: str = "documents", k: int = 5):
        self.client = OpenSearch([opensearch_url], verify_certs=False)
        self.model = embedding_model
        self.index_name = index_name
        self.k = k
    
    def retrieve(self, query: str, filters: Optional[Dict] = None) -> List[Dict]:
        """Retrieve relevant documents for a query"""
        # Generate query embedding
        query_embedding = self.model.encode(query).tolist()
        
        # Build the k-NN clause; metadata filters go inside it so the
        # engine applies them during the vector search (efficient
        # filtering, supported by the lucene and faiss engines)
        knn_clause = {
            "vector": query_embedding,
            "k": self.k
        }
        
        if filters:
            filter_clauses = []
            for field, value in filters.items():
                if isinstance(value, list):
                    filter_clauses.append({"terms": {field: value}})
                else:
                    filter_clauses.append({"term": {field: value}})
            knn_clause["filter"] = {"bool": {"must": filter_clauses}}
        
        search_body = {
            "size": self.k,
            "query": {
                "knn": {
                    "content_embedding": knn_clause
                }
            },
            "_source": ["id", "parent_id", "content", "title"]
        }
        
        # Execute search
        results = self.client.search(index=self.index_name, body=search_body)
        
        return [
            {
                'id': hit['_id'],
                'parent_id': hit['_source'].get('parent_id'),
                'content': hit['_source'].get('content', ''),
                'title': hit['_source'].get('title', ''),
                'score': hit['_score'],
                'relevance': float(hit['_score'])
            }
            for hit in results['hits']['hits']
        ]
    
    def retrieve_with_metadata(self, query: str, 
                               metadata_fields: Optional[List[str]] = None) -> List[Dict]:
        """Retrieve documents with additional metadata"""
        query_embedding = self.model.encode(query).tolist()
        
        fields = ['id', 'parent_id', 'content', 'title'] + (metadata_fields or [])
        
        search_body = {
            "size": self.k,
            "_source": fields,
            "query": {
                "knn": {
                    "content_embedding": {
                        "vector": query_embedding,
                        "k": self.k
                    }
                }
            }
        }
        
        results = self.client.search(index=self.index_name, body=search_body)
        
        return [
            {**{field: hit['_source'].get(field) for field in fields},
             'score': hit['_score']}
            for hit in results['hits']['hits']
        ]

Hybrid Search Implementation

Combine vector search with keyword search for better results:

class HybridSearchRAG:
    def __init__(self, opensearch_url: str, embedding_model, 
                 index_name: str = "documents", k: int = 5):
        self.client = OpenSearch([opensearch_url], verify_certs=False)
        self.model = embedding_model
        self.index_name = index_name
        self.k = k
    
    def hybrid_search(self, query: str, 
                     keyword_boost: float = 2.0,
                     vector_boost: float = 1.0) -> List[Dict]:
        """Combine keyword and vector search results"""
        query_embedding = self.model.encode(query).tolist()
        
        # Build hybrid search query
        search_body = {
            "size": self.k,
            "query": {
                "bool": {
                    "should": [
                        {
                            "knn": {
                                "content_embedding": {
                                    "vector": query_embedding,
                                    "k": self.k,
                                    "boost": vector_boost
                                }
                            }
                        },
                        {
                            "match": {
                                "content": {
                                    "query": query,
                                    "boost": keyword_boost
                                }
                            }
                        }
                    ]
                }
            },
            "_source": ["id", "parent_id", "content", "title", "score"]
        }
        
        results = self.client.search(index=self.index_name, body=search_body)
        
        return [
            {
                'id': hit['_id'],
                'content': hit['_source'].get('content', ''),
                'title': hit['_source'].get('title', ''),
                'score': hit['_score'],
                'retrieval_method': 'hybrid'
            }
            for hit in results['hits']['hits']
        ]
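
One caveat with boost-based hybrid scoring: BM25 and k-NN scores live on different scales, so one side can silently dominate. OpenSearch 2.x addresses this with search pipelines and a score-normalization processor; a simple client-side alternative is Reciprocal Rank Fusion (RRF), sketched here over two ranked lists of document IDs:

```python
def reciprocal_rank_fusion(result_lists, k=60, top_n=5):
    """Merge ranked lists of doc IDs with RRF:
    score(d) = sum over lists of 1 / (k + rank_of_d)."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

keyword_hits = ["a", "b", "c"]   # top IDs from the match query
vector_hits = ["b", "d", "a"]    # top IDs from the knn query
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))  # ['b', 'a', 'd', 'c']
```

Because RRF works on ranks rather than raw scores, it needs no calibration between the two retrieval methods.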

Advanced OpenSearch AI Features

Multi-Vector Search

Support multiple embedding types for richer search:

{
  "mappings": {
    "properties": {
      "title_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss"
        }
      },
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss"
        }
      },
      "metadata_embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss"
        }
      }
    }
  }
}

Query multiple vectors:

def multi_vector_search(self, query: str, 
                       weights: Dict[str, float] = None) -> List[Dict]:
    """Search across multiple vector fields"""
    query_embedding = self.model.encode(query).tolist()
    
    weights = weights or {
        'title_embedding': 2.0,
        'content_embedding': 1.0,
        'metadata_embedding': 0.5
    }
    
    search_body = {
        "size": self.k,
        "query": {
            "bool": {
                "should": [
                    {
                        "knn": {
                            f"{field}_embedding": {
                                "vector": query_embedding,
                                "k": self.k,
                                "boost": weight
                            }
                        }
                    }
                    for field, weight in weights.items()
                ]
            }
        },
        "_source": ["id", "content", "title", "score"]
    }
    
    results = self.client.search(index=self.index_name, body=search_body)
    return results['hits']['hits']

Filtered Vector Search

Apply metadata filters to vector search:

def filtered_vector_search(self, query: str, 
                          filters: Dict,
                          top_k: int = 10) -> List[Dict]:
    """Vector search with metadata filters"""
    query_embedding = self.model.encode(query).tolist()
    
    # Build metadata filter clauses
    filter_clauses = []
    for field, value in filters.items():
        if isinstance(value, list):
            filter_clauses.append({"terms": {field: value}})
        else:
            filter_clauses.append({"term": {field: value}})
    
    # Apply the filter inside the knn clause so the engine filters
    # during the vector search; putting knn in an optional should
    # clause would also return filtered docs with no vector relevance
    search_body = {
        "size": top_k,
        "query": {
            "knn": {
                "content_embedding": {
                    "vector": query_embedding,
                    "k": top_k,
                    "filter": {"bool": {"must": filter_clauses}}
                }
            }
        },
        "_source": ["id", "content", "title"]
    }
    
    results = self.client.search(index=self.index_name, body=search_body)
    return results['hits']['hits']

Range and Score Filtering

Combine vector search with range queries:

def range_vector_search(self, query: str,
                       date_from: str = None,
                       date_to: str = None,
                       score_min: float = 0.5) -> List[Dict]:
    """Vector search with date and score ranges"""
    query_embedding = self.model.encode(query).tolist()
    
    filter_query = {"bool": {"must": []}}
    
    if date_from:
        filter_query["bool"]["must"].append({
            "range": {
                "created_at": {
                    "gte": date_from
                }
            }
        })
    
    if date_to:
        filter_query["bool"]["must"].append({
            "range": {
                "created_at": {
                    "lte": date_to
                }
            }
        })
    
    search_body = {
        "size": 10,
        "query": {
            "bool": {
                "must": filter_query["bool"]["must"],
                "should": [
                    {
                        "knn": {
                            "content_embedding": {
                                "vector": query_embedding,
                                "k": 10
                            }
                        }
                    }
                ]
            }
        },
        "_source": ["id", "content", "title", "score"]
    }
    
    results = self.client.search(index=self.index_name, body=search_body)
    
    # Filter by minimum score
    return [r for r in results['hits']['hits'] 
            if r['_score'] >= score_min]

Production Best Practices

Index Optimization

Optimize HNSW parameters for your use case:

{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 500
    }
  },
  "mappings": {
    "properties": {
      "embedding": {
        "type": "knn_vector",
        "dimension": 1536,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss",
          "parameters": {
            "ef_construction": 1000,
            "m": 24
          }
        }
      }
    }
  }
}

Parameters to tune:

  • ef_construction: Higher values (500-1000) improve index quality but increase indexing time
  • m: Higher values (24-48) improve accuracy but increase index size
  • ef_search: Higher values (300-1000) improve recall but increase query latency

Memory and Performance Tuning

Configure OpenSearch for vector search performance:

# opensearch.yml - Memory settings
indices.memory.index_buffer_size: 20%
knn.memory.circuit_breaker.enabled: true
knn.memory.circuit_breaker.limit: 50%

# jvm.options - heap sizing for vector workloads
-Xms8g
-Xmx8g
-XX:+UseG1GC

Monitoring and Metrics

Track vector search performance with the k-NN stats APIs:

# Check k-NN plugin stats (cache usage, graph memory, query counts)
GET /_plugins/_knn/stats

# Check index-level stats
GET /documents/_stats

# Per-node k-NN stats
GET /_plugins/_knn/<node_id>/stats

Key metrics to monitor:

  • graph_query_requests / graph_query_errors: Query volume and failures
  • hit_count / miss_count: Graph cache effectiveness
  • graph_memory_usage: Memory consumed by loaded graph structures
  • cache_capacity_reached: Whether the graph cache has filled up
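
A small helper to derive the cache hit ratio from a node's stats response (field names follow the k-NN stats API; adjust if your version differs):

```python
def knn_cache_hit_ratio(node_stats: dict) -> float:
    """Compute the graph-cache hit ratio from a node's k-NN stats,
    assuming hit_count / miss_count fields as returned by
    GET /_plugins/_knn/stats."""
    hits = node_stats.get("hit_count", 0)
    misses = node_stats.get("miss_count", 0)
    total = hits + misses
    return hits / total if total else 0.0

print(knn_cache_hit_ratio({"hit_count": 90, "miss_count": 10}))  # 0.9
```

A persistently low ratio suggests the graph cache is too small for your working set; raise the circuit breaker limit or add memory.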

Security Considerations

Secure k-NN indices with the security plugin; vector searches use the standard search permissions:

# opensearch.yml - Security settings
plugins.security.disabled: false
plugins.security.authcz.admin_dn:
  - "CN=admin,OU=Example,O=Example,L=Example,ST=Example,C=US"

# roles.yml - Role-based access control
knn_user:
  index_permissions:
    - index_patterns:
        - "documents*"
      allowed_actions:
        - "indices:data/read/search"
        - "indices:data/write/index"

Use Cases and Examples

E-commerce Product Search

Build semantic product search with OpenSearch:

class EcommerceProductSearch:
    def __init__(self, opensearch_url, embedding_model, index_name="products"):
        self.client = OpenSearch([opensearch_url], verify_certs=False)
        self.model = embedding_model
        self.index_name = index_name
    
    def search_products(self, query: str, 
                       category: str = None,
                       min_price: float = 0,
                       max_price: float = 1000) -> List[Dict]:
        """Search products with semantic understanding"""
        query_embedding = self.model.encode(query).tolist()
        
        # Build metadata filters and apply them inside the knn clause
        filter_clauses = [
            {"range": {"price": {"gte": min_price, "lte": max_price}}}
        ]
        if category:
            filter_clauses.append({"term": {"category": category}})
        
        search_body = {
            "size": 20,
            "query": {
                "knn": {
                    "product_embedding": {
                        "vector": query_embedding,
                        "k": 20,
                        "filter": {"bool": {"must": filter_clauses}}
                    }
                }
            },
            "_source": ["id", "name", "price", "category"]
        }
        
        results = self.client.search(index=self.index_name, body=search_body)
        return results['hits']['hits']

Content Recommendation System

Build content recommendations:

class ContentRecommendation:
    def __init__(self, opensearch_url, embedding_model, index_name="documents"):
        self.client = OpenSearch([opensearch_url], verify_certs=False)
        self.model = embedding_model
        self.index_name = index_name
    
    def get_similar_content(self, content_id: str, 
                           exclude_id: str = None,
                           limit: int = 5) -> List[Dict]:
        """Find similar content to a given document"""
        # Get the original document's embedding
        response = self.client.get(index=self.index_name, id=content_id)
        
        if not response.get('_source'):
            return []
        
        embedding = response['_source'].get('content_embedding', [])
        
        if not embedding:
            return []
        
        filter_query = {"bool": {"must": []}}
        if exclude_id:
            filter_query["bool"]["must"].append({
                "bool": {
                    "must_not": {"term": {"_id": exclude_id}}
                }
            })
        
        search_body = {
            "size": limit,
            "query": {
                "bool": {
                    "must": filter_query["bool"]["must"],
                    "should": [
                        {
                            "knn": {
                                "content_embedding": {
                                    "vector": embedding,
                                    "k": limit
                                }
                            }
                        }
                    ]
                }
            },
            "_source": ["id", "title", "score"]
        }
        
        results = self.client.search(index=self.index_name, body=search_body)
        return results

Document Deduplication

Find and remove duplicate documents:

class DocumentDeduplicator:
    def __init__(self, opensearch_url, embedding_model, threshold=0.95):
        self.client = OpenSearch([opensearch_url], verify_certs=False)
        self.model = embedding_model
        self.threshold = threshold
    
    def find_duplicates(self, document_id: str) -> List[str]:
        """Find duplicate documents based on vector similarity"""
        # Get the document's embedding
        response = self.client.get(index="documents", id=document_id)
        embedding = response['_source'].get('content_embedding', [])
        
        if not embedding:
            return []
        
        search_body = {
            "size": 100,
            "query": {
                "knn": {
                    "content_embedding": {
                        "vector": embedding,
                        "k": 100
                    }
                }
            },
            "_source": ["id", "title", "score"]
        }
        
        results = self.client.search(index="documents", body=search_body)
        
        # Filter by score threshold (note: _score is the engine's
        # transformed similarity, not a raw cosine value, so calibrate
        # the threshold for your engine and space_type)
        duplicates = []
        for hit in results['hits']['hits']:
            if hit['_id'] != document_id and hit['_score'] >= self.threshold:
                duplicates.append(hit['_id'])
        
        return duplicates

Comparison with Alternatives

OpenSearch vs Dedicated Vector Databases

Feature                 | OpenSearch       | Pinecone    | Weaviate   | PostgreSQL pgvector
----------------------- | ---------------- | ----------- | ---------- | -------------------
Open Source             | Yes              | No          | Yes        | Yes
Hybrid Search           | Native           | Limited     | Native     | Requires custom
Existing Infrastructure | Leverage current | New         | New        | New
Scalability             | Horizontal       | Managed     | Horizontal | Vertical + read replicas
Cost                    | Self-managed     | Pay-per-use | Pay-per-use| Infrastructure cost
Learning Curve          | Moderate         | Low         | Moderate   | Low

When to Choose OpenSearch for AI

Choose OpenSearch when:

  • You already have OpenSearch/Elasticsearch infrastructure
  • You need hybrid search (keyword + vector)
  • You want open-source control
  • You need enterprise features (security, monitoring)
  • You have mixed workloads (search + AI)

Consider alternatives when:

  • You need pure vector search at massive scale
  • You prefer managed services
  • You need specialized vector algorithms
  • You want simpler setup


Conclusion

OpenSearch has matured into a capable platform for AI applications, offering native k-NN vector search, hybrid search capabilities, and seamless integration with LLMs. The key advantages for AI workloads include:

  1. Open Source Flexibility - Full control over your search infrastructure
  2. Hybrid Search - Combine keyword and vector search out of the box
  3. Existing Infrastructure - Leverage your current OpenSearch/Elasticsearch deployment
  4. Enterprise Features - Security, monitoring, and scalability
  5. Cost Efficiency - No per-query pricing or vendor lock-in

By following the patterns and best practices outlined in this guide, you can build production-ready AI applications with OpenSearch, from semantic search engines to RAG pipelines and recommendation systems. The integration with modern embedding models and LLMs makes OpenSearch a compelling choice for organizations seeking open-source AI solutions.

As vector search continues to evolve, OpenSearch’s active development community and enterprise adoption ensure it will remain a viable option for AI applications in 2026 and beyond.
