Meilisearch supports vector search, enabling semantic and similarity-based queries using embeddings. This feature, available in Meilisearch v1.3+, allows searching by meaning rather than exact keywords. This post covers setup, data preparation, indexing, and querying with vectors.
What is Vector Search?
Vector search uses machine learning embeddings to represent text as vectors in a high-dimensional space. Similar items are closer together, enabling searches like “find documents similar to this one.”
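To make "closer together" concrete, here is a minimal sketch that compares vectors with cosine similarity, one common closeness measure for embeddings. The three-dimensional vectors are invented for illustration (real embeddings have hundreds of dimensions), and nothing here is taken from Meilisearch's internals.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", made up for this example
ml_basics = [0.9, 0.1, 0.0]
deep_learning = [0.8, 0.3, 0.1]
cooking = [0.0, 0.2, 0.9]

# The two ML-related vectors score much higher than the unrelated pair
print(cosine_similarity(ml_basics, deep_learning))
print(cosine_similarity(ml_basics, cooking))
```

A score near 1 means the vectors point in almost the same direction; a score near 0 means they are unrelated.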
Prerequisites
- Meilisearch v1.3 or later. The embedders setting and the hybrid search parameter used in this post need v1.6 or later.
- A way to generate embeddings (e.g., OpenAI API, Sentence Transformers).
- Go, Python, or another language for integration.
Step 1: Setting Up Meilisearch
Download and run Meilisearch:
curl -L https://install.meilisearch.com | sh
./meilisearch --master-key="your_master_key"
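On releases where vector search is still experimental, enable the vectorStore experimental feature (through the /experimental-features route) before configuring embedders. Recent releases ship vector search as a stable feature and need no flag.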
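For older releases that still gate vector search behind a flag, the sketch below builds the PATCH request that turns on vectorStore. It assumes the instance runs at http://localhost:7700 with the master key from the command above. The helper name build_vector_store_request is made up for this example. The request is only built, not sent.

```python
import json
import urllib.request


def build_vector_store_request(base_url, master_key):
    """Build (but do not send) a PATCH request enabling the vectorStore flag."""
    body = json.dumps({"vectorStore": True}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/experimental-features",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + master_key,
        },
    )


# Assumed local instance and placeholder key from Step 1
request = build_vector_store_request("http://localhost:7700", "your_master_key")
print(request.get_method(), request.full_url)

# Uncomment to send it to a running instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

On a release where vector search is already stable, skip this step.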
Step 2: Preparing Data with Embeddings
Generate embeddings for your documents. For example, using Python with Sentence Transformers:
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 produces 384-dimensional embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')

documents = [
    {"id": 1, "title": "Machine Learning Basics", "content": "Introduction to ML..."},
    {"id": 2, "title": "Deep Learning", "content": "Advanced neural networks..."},
]

# Attach one embedding per document under the "default" embedder name
for doc in documents:
    doc["_vectors"] = {"default": model.encode(doc["content"]).tolist()}
The _vectors field holds the embeddings, keyed by embedder name. The name used here, “default”, must match the embedder configured in the index settings.
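Before indexing, it can help to confirm that every document carries a vector of the expected length. The following is a minimal sketch: find_invalid_vectors is a hypothetical helper, and it checks only the flat-list form of _vectors used in this post. The sample documents use dummy zero vectors so the snippet runs without a model.

```python
EXPECTED_DIMENSIONS = 384  # output size of all-MiniLM-L6-v2


def find_invalid_vectors(documents, embedder="default", dimensions=EXPECTED_DIMENSIONS):
    """Return ids of documents whose vector is missing or has the wrong length."""
    invalid = []
    for doc in documents:
        vector = doc.get("_vectors", {}).get(embedder)
        if not isinstance(vector, list) or len(vector) != dimensions:
            invalid.append(doc["id"])
    return invalid


# Stand-in documents with dummy vectors
sample = [
    {"id": 1, "_vectors": {"default": [0.0] * 384}},  # correct length
    {"id": 2, "_vectors": {"default": [0.0] * 383}},  # one value short
    {"id": 3},                                        # no vector at all
]
print(find_invalid_vectors(sample))  # prints [2, 3]
```

Running the check on the real documents list before add_documents surfaces mismatches early, on the client side.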
Step 3: Indexing Documents
Add documents to Meilisearch with vectors:
import meilisearch
client = meilisearch.Client('http://localhost:7700', 'your_master_key')
index = client.index('articles')
index.add_documents(documents)
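Configure the index for vector search. The embedder name must match the key used under _vectors, and dimensions must match the model's output size (384 for all-MiniLM-L6-v2):
index.update_settings({
    "embedders": {
        "default": {
            "source": "userProvided",
            "dimensions": 384  # Match your embedding size
        }
    }
})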
Step 4: Performing Vector Searches
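Search using a query vector. Recent Meilisearch versions require hybrid.embedder to name the embedder to search against:
query = "neural networks"
query_embedding = model.encode(query).tolist()
results = index.search("", {
    "vector": query_embedding,
    # semanticRatio: 0.0 is keyword-only, 1.0 is vector-only
    "hybrid": {"embedder": "default", "semanticRatio": 0.5}
})
for hit in results['hits']:
    print(hit['title'])
For pure vector search, keep the query string empty and set semanticRatio to 1.0.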
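The search options can be assembled by a small helper so that the embedder name and ratio are always set together. This is a sketch under assumptions: vector_search_params is a hypothetical function, and the resulting dictionary is meant to be passed as the second argument to index.search.

```python
def vector_search_params(vector, semantic_ratio=1.0, embedder="default", filter_expr=None):
    """Build the options dict for a vector or hybrid search."""
    if not 0.0 <= semantic_ratio <= 1.0:
        raise ValueError("semantic_ratio must be between 0.0 and 1.0")
    params = {
        "vector": list(vector),
        "hybrid": {"embedder": embedder, "semanticRatio": semantic_ratio},
    }
    if filter_expr is not None:
        params["filter"] = filter_expr
    return params


# Dummy 3-dimensional vector, for illustration only
pure_vector = vector_search_params([0.1, 0.2, 0.3])
blended = vector_search_params([0.1, 0.2, 0.3], semantic_ratio=0.5)
print(pure_vector["hybrid"])
print(blended["hybrid"])
```

With a running instance, a call such as index.search("", vector_search_params(query_embedding)) would then perform a pure vector search.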
Step 5: Advanced Features
Hybrid Search
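Combine keyword and vector search. A semanticRatio of 0.0 ranks by keywords only, 1.0 ranks by vector similarity only, and values in between blend the two:
results = index.search("AI", {
    "vector": query_embedding,
    "hybrid": {"embedder": "default", "semanticRatio": 0.7}
})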
Filtering and Faceting
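Apply filters with vector search. The filtered attribute (category here) must be present in your documents and listed in the index's filterableAttributes setting:
results = index.search("", {
    "vector": query_embedding,
    "hybrid": {"embedder": "default"},  # required by recent versions
    "filter": "category = 'tech'"
})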
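Filter strings are easy to get wrong when assembled by hand. The sketch below builds simple equality conditions as a list of strings, a form the filter parameter accepts alongside a single string (list entries are combined with AND). equality_filters is a hypothetical helper. It rejects values containing quotes or backslashes rather than guessing at escaping rules.

```python
def equality_filters(conditions):
    """Turn {"attribute": value} pairs into a list of equality filter strings."""
    filters = []
    for attribute, value in conditions.items():
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            literal = str(value)
        else:
            text = str(value)
            if "'" in text or "\\" in text:
                raise ValueError("quotes and backslashes are not handled by this sketch")
            literal = "'" + text + "'"
        filters.append(attribute + " = " + literal)
    return filters


print(equality_filters({"category": "tech", "year": 2024}))
# prints ["category = 'tech'", 'year = 2024']
```

The returned list can be used as the "filter" value in the search options.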
Custom Embedders
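Instead of supplying vectors yourself, let Meilisearch generate embeddings through one of its built-in embedder sources (e.g., OpenAI):
index.update_settings({
    "embedders": {
        "openai": {
            "source": "openAi",
            "apiKey": "your_openai_key",
            "model": "text-embedding-ada-002",
            # Text sent to the embedder for each document; the trailing space keeps field values apart
            "documentTemplate": "{% for field in fields %}{{ field.value }} {% endfor %}"
        }
    }
})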
Best Practices
- Choose appropriate embedding models for your domain.
- Normalize vectors to unit length if your embedding model does not already do so.
- Monitor performance; vector search can be resource-intensive.
- Use semantic ratios to balance keyword and vector results.
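As an illustration of the normalization point above, here is a minimal L2-normalization sketch in plain Python. l2_normalize is a hypothetical helper. Whether normalization is needed depends on your model and is not something this post's source specifies.

```python
import math


def l2_normalize(vector):
    """Scale a vector so that its Euclidean length is 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


unit = l2_normalize([3.0, 4.0])
print(unit)  # prints [0.6, 0.8]
```

With Sentence Transformers, passing normalize_embeddings=True to model.encode should achieve the same result without a separate step.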
Conclusion
Vector search in Meilisearch enables powerful semantic queries. Start with simple embeddings, experiment with hybrid search, and integrate into your applications for better relevance.
For more, see the Meilisearch vector search docs.