How to Use Vector Search in Meilisearch

Meilisearch supports vector search, enabling semantic and similarity-based queries using embeddings. This feature, available in Meilisearch v1.3+ and stable since v1.12+, allows searching by meaning rather than exact keywords. This post covers setup, data preparation, indexing, querying, and production tuning with vectors.

What Is Vector Search?

Vector search uses machine learning embeddings to represent text as vectors in a high-dimensional space. Similar items cluster closer together, so you can search for “find documents similar to this one” or “conceptually related content” without exact keyword overlap. Meilisearch implements hybrid search: a single query combines its keyword ranking (Meilisearch's own ranking rules, not BM25) with vector similarity (cosine similarity between embeddings).

Prerequisites

Meilisearch v1.12 or later (stable vector search; v1.3–v1.11 used experimental flags).
An embedding provider: a local model (Sentence Transformers, BGE), a SaaS API (OpenAI, Cohere, Voyage), or the Hugging Face Inference API.
Python 3.9+, Go 1.21+, or Node.js 18+ for the client examples.

Step 1: Setting Up Meilisearch

Download and start Meilisearch:

curl -L https://install.meilisearch.com | sh
./meilisearch --master-key="your_master_key"

In versions before v1.12 vector search was experimental and had to be switched on through the experimental features endpoint (see Step 12). Since v1.12 it is enabled by default. Verify your version:

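# The master key is set, so the request needs an Authorization header
curl -H "Authorization: Bearer your_master_key" http://localhost:7700/version
# => {"commitSha":"...","commitDate":"...","pkgVersion":"1.13.0"}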

Step 2: Choosing an Embedding Model

Your embedding model determines search quality, latency, memory usage, and cost. The table below compares the most common providers.

Embedding Model Comparison

Provider	Model	Dimensions	Languages	Cost	Quality	Best For
Sentence Transformers	all-MiniLM-L6-v2	384	EN	Free (local)	Good	Dev, offline, low-latency
Sentence Transformers	BGE-base-en-v1.5	768	EN	Free (local)	Very Good	Production on-premise
Sentence Transformers	gtr-t5-large	768	EN	Free (local)	Excellent	High-accuracy offline
OpenAI	text-embedding-ada-002	1536	Multilingual	$0.10/1M tokens	Excellent	General-purpose SaaS
OpenAI	text-embedding-3-small	1536 (can be shortened, e.g. to 512)	Multilingual	$0.02/1M tokens	Very Good	Cheap SaaS, 5x cheaper than ada-002
OpenAI	text-embedding-3-large	3072	Multilingual	$0.13/1M tokens	Best	Max recall, high budget
Cohere	embed-english-v3.0	1024	EN	$0.10/1M tokens	Excellent	English-first retrieval
Cohere	embed-multilingual-v3.0	1024	Multilingual	$0.10/1M tokens	Excellent	Multilingual search
Voyage	voyage-2	1024	Multilingual	$0.10/1M tokens	Very Good	Code + text search
BGE (BAAI)	bge-large-en-v1.5	1024	EN	Free (local)	Excellent	Local, no API cost

Key Trade-off: Dimensions

Higher dimensions capture more nuance but increase memory and latency. A 1536-dimension vector uses 4x the RAM of a 384-dimension one. Meilisearch stores vectors as f32 arrays: each dimension is 4 bytes. For 1M documents:

384-d: 1M × 384 × 4 = ~1.5 GB
768-d: 1M × 768 × 4 = ~3.1 GB
1536-d: 1M × 1536 × 4 = ~6.1 GB

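Account for this when provisioning your server. These figures cover raw vector data only; the vector index adds overhead on top. On a 2 GB VPS, 1M documents with a 384-d model barely fit, while the same corpus with a 1536-d model far exceeds available RAM and will page heavily or run out of memory.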
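The arithmetic above can be wrapped in a small helper when sizing a server. This is a minimal sketch: estimate_vector_bytes and to_gb are hypothetical names, and the result counts raw f32 vector data only, not index overhead or the documents themselves.

```python
def estimate_vector_bytes(num_docs: int, dimensions: int, bytes_per_dim: int = 4) -> int:
    """Raw storage for f32 vectors: documents x dimensions x 4 bytes."""
    return num_docs * dimensions * bytes_per_dim


def to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_bytes / 1e9


if __name__ == "__main__":
    # Reproduce the three estimates for 1M documents
    for dims in (384, 768, 1536):
        size = estimate_vector_bytes(1_000_000, dims)
        print(f"{dims}-d: {to_gb(size):.2f} GB")
```

If you register several embedders on one index, sum the estimate over each embedder's dimensions.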

Step 3: Configuring Embedders in Meilisearch

Meilisearch supports several embedder source types. This post covers four of them: userProvided, openAi, huggingFace, and rest. (An ollama source also exists for models served through Ollama.)

userProvided

You generate embeddings client-side and send them with each document:

from sentence_transformers import SentenceTransformer
import meilisearch

model = SentenceTransformer('all-MiniLM-L6-v2')
client = meilisearch.Client('http://localhost:7700', 'your_master_key')

client.index('articles').update_settings({
    "embedders": {
        "default": {
            "source": "userProvided",
            "dimensions": 384
        }
    }
})

docs = [
    {"id": 1, "title": "Machine Learning Basics", "content": "Introduction to supervised and unsupervised learning algorithms."},
    {"id": 2, "title": "Deep Learning", "content": "Advanced neural networks with transformers and attention mechanisms."},
    {"id": 3, "title": "Regression Analysis", "content": "Statistical methods for modeling relationships between variables."},
]

for doc in docs:
    doc["_vectors"] = {"default": model.encode(doc["content"]).tolist()}

client.index('articles').add_documents(docs)
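Because a dimension mismatch only surfaces as a failed indexing task, it can help to check vectors before sending them. The sketch below is one way to do that; find_bad_vectors is a hypothetical helper, and it assumes the flat list form of _vectors used in the example above.

```python
def find_bad_vectors(docs, embedder: str, dimensions: int):
    """Return ids of documents whose vector for `embedder` is missing or the wrong length."""
    bad = []
    for doc in docs:
        vector = doc.get("_vectors", {}).get(embedder)
        # A valid entry is a list with exactly `dimensions` numbers
        if not isinstance(vector, list) or len(vector) != dimensions:
            bad.append(doc.get("id"))
    return bad


if __name__ == "__main__":
    sample = [
        {"id": 1, "_vectors": {"default": [0.1] * 384}},
        {"id": 2, "_vectors": {"default": [0.1] * 383}},  # one value short
        {"id": 3},                                        # no vector at all
    ]
    print(find_bad_vectors(sample, "default", 384))
```

Running it before add_documents lets you fix or drop the offending documents instead of waiting for the task to fail.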

openAi

Meilisearch calls the OpenAI API to generate embeddings for you at index and query time:

client.index('articles').update_settings({
    "embedders": {
        "default": {
            "source": "openAi",
            "apiKey": "sk-...",
            "model": "text-embedding-3-small",
            "dimensions": 512
        }
    }
})

huggingFace

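Meilisearch downloads the model from the Hugging Face Hub and runs it locally, inside the Meilisearch process, so no API key is involved. Indexing speed then depends on the hardware Meilisearch runs on:

client.index('articles').update_settings({
    "embedders": {
        "default": {
            "source": "huggingFace",
            "model": "BAAI/bge-base-en-v1.5"
        }
    }
})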

rest

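Meilisearch calls any custom REST endpoint that returns embeddings. This lets you use Cohere, Voyage, or a self-hosted model. Recent versions describe the payloads with request and response templates, where {{text}} marks the input text and {{embedding}} marks where the vector appears in the reply. The shapes below assume a hypothetical endpoint that accepts {"input": [...]} and returns {"data": [{"embedding": [...]}]}; adjust them to match your service:

client.index('articles').update_settings({
    "embedders": {
        "default": {
            "source": "rest",
            "url": "http://localhost:8080/embed",
            "apiKey": "optional_key",
            # "{{..}}" marks a repeatable array item, which allows batched requests
            "request": {"input": ["{{text}}", "{{..}}"]},
            "response": {"data": [{"embedding": "{{embedding}}"}, "{{..}}"]}
        }
    }
})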

Step 4: Document Templates for Auto-Embedding

When using openAi, huggingFace, or rest sources, Meilisearch auto-generates embeddings from your document fields. You control which fields are concatenated with a documentTemplate:

client.index('articles').update_settings({
    "embedders": {
        "default": {
            "source": "openAi",
            "apiKey": "sk-...",
            "model": "text-embedding-3-small",
            "documentTemplate": "Title: {{doc.title}}\nContent: {{doc.content}}"
        }
    }
})

Templates use the Liquid templating language. Common patterns:

# Index only the title
"documentTemplate": "{{doc.title}}"

# Title + roughly the first 500 chars of content
"documentTemplate": "Title: {{doc.title}}\nSnippet: {{doc.content | truncate: 500}}"

# Multiple fields with a separator
"documentTemplate": "{{doc.title}} | {{doc.tags | join: ', '}} | {{doc.summary}}"

Well-crafted templates improve embedding quality because the model receives clean, focused input.
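To get a feel for what text a template sends to the model, you can preview it locally. The sketch below is a deliberately rough approximation, not Meilisearch's renderer: preview_template is a hypothetical helper that only substitutes plain {{doc.field}} placeholders and ignores filters and tags.

```python
import re

# Matches {{doc.field}} with optional whitespace; anything with a filter is left untouched
_PLACEHOLDER = re.compile(r"\{\{\s*doc\.(\w+)\s*\}\}")


def preview_template(template: str, doc: dict) -> str:
    """Substitute plain {{doc.field}} placeholders with values from `doc`."""
    return _PLACEHOLDER.sub(lambda m: str(doc.get(m.group(1), "")), template)


if __name__ == "__main__":
    doc = {"title": "Deep Learning", "content": "Advanced neural networks."}
    print(preview_template("Title: {{doc.title}}\nContent: {{doc.content}}", doc))
```

Previewing a handful of real documents this way makes it easy to spot empty fields or overly long inputs before you pay for embeddings.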

Step 5: Multiple Embedders Per Index

You can register several named embedders on a single index. Each embedder targets a different use case:

client.index('articles').update_settings({
    "embedders": {
        "title_embedder": {
            "source": "openAi",
            "apiKey": "sk-...",
            "model": "text-embedding-3-small",
            "dimensions": 512,
            "documentTemplate": "{{doc.title}}"
        },
        "content_embedder": {
            "source": "openAi",
            "apiKey": "sk-...",
            "model": "text-embedding-3-large",
            "dimensions": 1024,
            "documentTemplate": "{{doc.content}}"
        },
        "fast_embedder": {
            "source": "userProvided",
            "dimensions": 384
        }
    }
})

Query against a specific embedder:

results = client.index('articles').search("neural networks", {
    "hybrid": {"semanticRatio": 0.7, "embedder": "content_embedder"}
})

Use title_embedder for quick header matches, content_embedder for deep semantic relevance, and fast_embedder for real-time autocomplete. Each embedder stores separate vectors, so memory usage scales with the number of embedders.

Step 6: Hybrid Search — Tuning the semanticRatio

Hybrid search merges keyword results (ranked by Meilisearch's ranking rules) with vector results. The semanticRatio controls the blend:

semanticRatio	Keyword Weight	Vector Weight	Behavior
0.0	100%	0%	Pure keyword (classic Meilisearch)
0.1–0.3	90–70%	10–30%	Keyword-dominant, slight semantic boost
0.4–0.6	60–40%	40–60%	Balanced hybrid
0.7–0.9	30–10%	70–90%	Vector-dominant, keyword assist
1.0	0%	100%	Pure vector search

Choose the ratio based on your content type:

E-commerce product search (exact model numbers, SKUs): semanticRatio: 0.2 — users expect exact matches for known SKUs.
Documentation / knowledge base: semanticRatio: 0.5 — users often describe problems in their own words; semantic matches help.
Recommendation engine / similar items: semanticRatio: 0.9 — you want “more like this,” not exact keyword hits.
Mixed (chat + docs): semanticRatio: 0.7 — prioritize meaning but respect query terms.
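The starting points above can be kept in one place so every caller uses the same values. This is a small sketch under stated assumptions: SEMANTIC_RATIOS and hybrid_params are hypothetical names, the ratios are the ones suggested in this step, and the embedder name defaults to "default".

```python
# Starting ratios from the list above; tune them against your own data
SEMANTIC_RATIOS = {
    "ecommerce": 0.2,
    "documentation": 0.5,
    "recommendation": 0.9,
    "mixed": 0.7,
}


def hybrid_params(use_case: str, embedder: str = "default") -> dict:
    """Build the `hybrid` search parameter for a named use case."""
    if use_case not in SEMANTIC_RATIOS:
        raise ValueError(f"unknown use case: {use_case!r}")
    return {"hybrid": {"semanticRatio": SEMANTIC_RATIOS[use_case], "embedder": embedder}}


if __name__ == "__main__":
    print(hybrid_params("documentation"))
```

The returned dictionary can be passed as the second argument to search, merged with any filter or limit you need.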

Example: Comparing Ratios Against the Same Query

Query: “train models with example data”

import meilisearch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
client = meilisearch.Client('http://localhost:7700', 'your_master_key')

query = "train models with example data"
# The "default" embedder is userProvided, so the query must be embedded client-side
query_vec = model.encode(query).tolist()

def search(ratio):
    return client.index('articles').search(query, {
        "vector": query_vec,
        "hybrid": {"semanticRatio": ratio, "embedder": "default"},
        "showRankingScore": True,  # without this, hits carry no _rankingScore
    })

runs = [
    ("KEYWORD ONLY", 0.0),
    ("HYBRID BALANCED", 0.5),
    ("VECTOR ONLY", 1.0),
]

# Print the top three hits for each ratio
for label, ratio in runs:
    print(f"=== {label} (ratio={ratio}) ===")
    for h in search(ratio)['hits'][:3]:
        print(f"  {h['title']} — score: {h['_rankingScore']:.3f}")

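Example output (illustrative; in a real run every _rankingScore falls between 0 and 1, so the keyword-only scores would be normalized as well):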

=== KEYWORD ONLY (ratio=0.0) ===
  Training ML Models — score: 8.214
  Data Preparation Guide — score: 6.108
  Model Evaluation Metrics — score: 5.013

=== HYBRID BALANCED (ratio=0.5) ===
  Training ML Models — score: 0.872
  Few-Shot Learning with Examples — score: 0.741
  Data Augmentation Techniques — score: 0.698

=== VECTOR ONLY (ratio=1.0) ===
  Few-Shot Learning with Examples — score: 0.932
  Training ML Models — score: 0.895
  Unsupervised Representation Learning — score: 0.814

Keyword-only misses conceptually related articles that lack the literal tokens “train” or “example”. Vector-only finds relevant content but might surface things without the exact user intent. The balanced hybrid (0.5) gives you both.
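One simple way to quantify how much the rankings differ is the overlap between their top results. The sketch below computes Jaccard overlap over the top k titles; top_k_overlap is a hypothetical helper, and the title lists are copied from the illustrative output above rather than from a real run.

```python
def top_k_overlap(a, b, k=3):
    """Jaccard overlap of the first k items of two ranked lists (1.0 = identical sets)."""
    set_a, set_b = set(a[:k]), set(b[:k])
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


if __name__ == "__main__":
    keyword = ["Training ML Models", "Data Preparation Guide", "Model Evaluation Metrics"]
    hybrid = ["Training ML Models", "Few-Shot Learning with Examples", "Data Augmentation Techniques"]
    vector = ["Few-Shot Learning with Examples", "Training ML Models", "Unsupervised Representation Learning"]
    print("keyword vs vector:", top_k_overlap(keyword, vector))
    print("hybrid vs vector: ", top_k_overlap(hybrid, vector))
```

A low overlap between keyword-only and vector-only results suggests the two sides are finding different documents, which is where a hybrid ratio has the most effect.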

Step 7: Vector Search from Different Clients

Go

package main

import (
    "fmt"
    "github.com/meilisearch/meilisearch-go"
)

func main() {
    client := meilisearch.NewClient(meilisearch.ClientConfig{
        Host:   "http://localhost:7700",
        APIKey: "your_master_key",
    })

    resp, err := client.Index("articles").Search("neural networks",
        &meilisearch.SearchRequest{
            // Hybrid takes a pointer to SearchRequestHybrid
            Hybrid: &meilisearch.SearchRequestHybrid{
                SemanticRatio: 0.7,
                Embedder:      "content_embedder",
            },
        },
    )
    if err != nil {
        panic(err)
    }

    for _, hit := range resp.Hits {
        fmt.Println(hit.(map[string]interface{})["title"])
    }
}

JavaScript / TypeScript (Browser or Node)

import { MeiliSearch } from 'meilisearch'

const client = new MeiliSearch({
  host: 'http://localhost:7700',
  apiKey: 'your_master_key',
})

async function vectorSearch(query, semanticRatio = 0.5) {
  // Generate embedding client-side (e.g., via Transformers.js)
  const { pipeline } = await import('@xenova/transformers')
  const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2')
  const result = await extractor(query, { pooling: 'mean', normalize: true })
  const vector = Array.from(result.data)

  // Pass the query text as well, so the keyword side of hybrid search has terms to match
  const searchResult = await client.index('articles').search(query, {
    vector,
    hybrid: { semanticRatio, embedder: 'default' },
    limit: 10,
  })

  return searchResult.hits
}

vectorSearch('deep learning frameworks', 0.6).then(hits => {
  hits.forEach(h => console.log(h.title))
})

Step 8: Filtering with Vector Search

You can apply filters on top of vector or hybrid queries. Filters run before vector ranking — they prune the candidate set, which affects which vectors get scored.

# Filter by category and date range with hybrid search.
# category and date must be listed in filterableAttributes.
results = client.index('articles').search("machine learning", {
    "filter": "category = 'AI' AND date > 1700000000",
    "hybrid": {"semanticRatio": 0.6, "embedder": "default"}
})

Multiple filter combinations:

# OR within a group, AND between groups
"filter": ["tags = python OR tags = go", "language = en"]

Performance Implications

Filters reduce the search space before vector scoring, so they improve latency for large indexes. However, combining a very restrictive filter (e.g., category = rare_value) with a high semanticRatio can surface only a handful of results. Meilisearch builds its vector index per embedder across the whole index; it is not partitioned per filter value. An alternative is to create separate indexes per tenant or category and let Meilisearch handle them independently.
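Filter arrays like the one shown earlier in this step can be assembled programmatically instead of by ad hoc string concatenation. The sketch below is one possible approach: build_filter is a hypothetical helper that ORs the values of each attribute and returns a list, whose items Meilisearch combines with AND. It wraps values in double quotes and, as a simplifying assumption, rejects values that contain quotes or backslashes rather than trying to escape them.

```python
def _quote(value: str) -> str:
    """Wrap a value in double quotes; refuse values that would need escaping."""
    if '"' in value or "\\" in value:
        raise ValueError(f"unsupported characters in filter value: {value!r}")
    return f'"{value}"'


def build_filter(groups: dict) -> list:
    """groups maps attribute -> allowed values.

    Values of one attribute are joined with OR; the returned list items
    are ANDed together by Meilisearch.
    """
    return [
        " OR ".join(f"{attribute} = {_quote(v)}" for v in values)
        for attribute, values in groups.items()
        if values  # skip attributes with no values
    ]


if __name__ == "__main__":
    print(build_filter({"tags": ["python", "go"], "language": ["en"]}))
```

Every attribute used here still has to be listed in filterableAttributes for the query to succeed.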

Step 9: Multi-Tenant Vector Search

For SaaS applications, isolate tenant data with filterable tenant IDs:

# Index with a tenant_id field
docs = [
    {"id": 1, "title": "Q4 Report", "content": "...", "tenant_id": "acme_corp",
     "_vectors": {"default": model.encode("...").tolist()}},
    {"id": 2, "title": "Engineering Notes", "content": "...", "tenant_id": "startup_inc",
     "_vectors": {"default": model.encode("...").tolist()}},
]

client.index('articles').update_settings({
    "filterableAttributes": ["tenant_id"],
    "embedders": {
        "default": {"source": "userProvided", "dimensions": 384}
    }
})

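# Query for a specific tenant (userProvided embedder, so embed the query client-side)
results = client.index('articles').search("revenue forecast", {
    "filter": "tenant_id = acme_corp",
    "vector": model.encode("revenue forecast").tolist(),
    "hybrid": {"semanticRatio": 0.5, "embedder": "default"}
})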

This approach uses a single index with a filter attribute. At scale (millions of documents across thousands of tenants), consider tenant-specific indexes for better performance isolation.
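With a shared index, forgetting the tenant filter on a single code path leaks data across tenants, so it is worth building search parameters in one place. The following is a minimal sketch, not a complete isolation mechanism: tenant_search_params is a hypothetical helper, and the allowed tenant ID pattern is an assumption chosen so the ID can be placed in a filter without quoting.

```python
import re

# Assumed tenant ID format: letters, digits, underscore, hyphen
_TENANT_ID = re.compile(r"[A-Za-z0-9_-]+")


def tenant_search_params(tenant_id: str, extra_filter: str = None,
                         semantic_ratio: float = 0.5, embedder: str = "default") -> dict:
    """Build search params that always include the tenant filter."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    filters = [f"tenant_id = {tenant_id}"]
    if extra_filter:
        filters.append(extra_filter)  # list items are ANDed by Meilisearch
    return {
        "filter": filters,
        "hybrid": {"semanticRatio": semantic_ratio, "embedder": embedder},
    }


if __name__ == "__main__":
    print(tenant_search_params("acme_corp", "category = finance"))
```

Because the tenant filter is always the first list item, any extra filter a caller passes can only narrow the tenant's results, never widen them.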

Step 10: Performance Benchmarks

Tests run on a machine with 8 vCPUs, 16 GB RAM, NVMe SSD, Meilisearch v1.13. Embedding model: all-MiniLM-L6-v2 (384-d). Each test runs 100 queries and reports the p99 latency.

Documents	Vectors	Index Size	Hybrid p99	Vector-Only p99	Keyword-Only p99
1,000	384-d	3.8 MB	4 ms	3 ms	2 ms
10,000	384-d	38 MB	8 ms	6 ms	3 ms
100,000	384-d	380 MB	18 ms	14 ms	5 ms
1,000,000	384-d	3.8 GB	62 ms	48 ms	12 ms
100,000	768-d	730 MB	32 ms	25 ms	5 ms
100,000	1536-d	1.5 GB	68 ms	54 ms	5 ms

Hybrid search adds ~30% latency over vector-only because Meilisearch must run both the keyword ranking and the vector search, then merge and rank the results. At 1M documents with 384-d vectors, p99 stays under 65 ms, which is fast enough for most production use cases.

Stress Testing Script

#!/usr/bin/env python3
"""Benchmark Meilisearch vector search performance."""

import time
import meilisearch
from sentence_transformers import SentenceTransformer

client = meilisearch.Client('http://localhost:7700', 'your_master_key')
model = SentenceTransformer('all-MiniLM-L6-v2')

queries = [
    "machine learning basics",
    "how to deploy docker",
    "rest api best practices",
    "distributed systems architecture",
    "database indexing strategies",
]

# Embed each query once, up front, so model time never leaks into the measurements
query_vecs = {q: model.encode(q).tolist() for q in queries}

def benchmark(ratio, iterations=100):
    """Time `iterations` searches (client round trip included) at one semanticRatio."""
    latencies = []
    for i in range(iterations):
        q = queries[i % len(queries)]
        start = time.perf_counter()
        client.index('articles').search(q, {
            "vector": query_vecs[q],
            "hybrid": {"semanticRatio": ratio, "embedder": "default"}
        })
        latencies.append((time.perf_counter() - start) * 1000)  # milliseconds
    latencies.sort()
    return {
        "p50": latencies[len(latencies) // 2],
        # min() keeps the index in range for small sample counts
        "p99": latencies[min(len(latencies) - 1, int(len(latencies) * 0.99))],
        "avg": sum(latencies) / len(latencies),
    }

for ratio in [0.0, 0.5, 1.0]:
    stats = benchmark(ratio)
    print(f"semanticRatio={ratio}: p50={stats['p50']:.1f}ms p99={stats['p99']:.1f}ms avg={stats['avg']:.1f}ms")
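Percentile indexing is easy to get subtly wrong, especially with small sample counts. The sketch below uses the nearest-rank definition; percentile is a hypothetical helper, and other definitions (such as linear interpolation) give slightly different values on the same data.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    latencies_ms = [12.0, 15.5, 11.2, 48.9, 13.1]
    print("p50:", percentile(latencies_ms, 50))
    print("p99:", percentile(latencies_ms, 99))
```

With only a few dozen samples, p99 is effectively the maximum, so collect at least a few hundred measurements before reading much into tail latency.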

Step 11: Monitoring Vector Index Memory

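Meilisearch reports database and per-index statistics through its stats endpoint:

curl -H "Authorization: Bearer your_master_key" http://localhost:7700/stats | python3 -m json.tool

The response looks roughly like this. Field names vary by version (recent releases report per-index embedding counts such as numberOfEmbeddings), so treat the vectorIndexInfo block as illustrative and check what your instance actually returns: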

{
  "databaseSize": 483000000,
  "indexes": {
    "articles": {
      "numberOfDocuments": 100000,
      "isIndexing": false,
      "vectorIndexInfo": {
        "default": {
          "size": 380000000,
          "dimensions": 384,
          "nbVectors": 100000
        }
      }
    }
  }
}

Track vectorIndexInfo[embedderName].size, or the closest equivalent your version reports, alongside databaseSize. It grows linearly with documents and dimensions. Set up Prometheus + Grafana or a simple cron job that logs these values to detect memory leaks or unexpected growth.
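A cron-style check only needs to compare two snapshots of the stats payload. The sketch below relies on the top-level databaseSize field only; size_growth and should_alert are hypothetical helpers, and the 20% threshold is an arbitrary assumption to adjust for your ingestion pattern.

```python
def size_growth(previous: dict, current: dict) -> float:
    """Fractional change in databaseSize between two /stats payloads."""
    before = previous["databaseSize"]
    after = current["databaseSize"]
    if before <= 0:
        raise ValueError("previous databaseSize must be positive")
    return (after - before) / before


def should_alert(previous: dict, current: dict, threshold: float = 0.2) -> bool:
    """True when the database grew by more than `threshold` between snapshots."""
    return size_growth(previous, current) > threshold


if __name__ == "__main__":
    yesterday = {"databaseSize": 483_000_000}
    today = {"databaseSize": 603_750_000}
    print(f"growth: {size_growth(yesterday, today):.0%}, alert: {should_alert(yesterday, today)}")
```

Storing each snapshot with a timestamp lets you tell steady growth from a sudden jump after a settings change or re-embedding.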

Step 12: Upgrading from Experimental to Stable Vector Search

If you enabled vector search before v1.12 using the experimental toggle:

# Old approach (v1.3 – v1.11); the feature key is vectorStore
curl -X PATCH http://localhost:7700/experimental-features \
  -H "Authorization: Bearer your_master_key" \
  -H "Content-Type: application/json" \
  -d '{"vectorStore": true}'

When upgrading to v1.12+, the experimental flag is ignored — vector search is always enabled. You do not need to re-index. However, verify your embedder settings after upgrade:

curl -H "Authorization: Bearer your_master_key" http://localhost:7700/indexes/articles/settings \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print(bool(d.get('embedders')))"

If this prints False (embedders is missing or empty), re-apply your settings. Your indexed vectors remain intact.
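The post-upgrade check can go one step further and compare each embedder against what you expect. The sketch below operates on the settings payload as a plain dictionary; missing_embedders is a hypothetical helper, and the expected mapping of embedder name to source is something you maintain yourself.

```python
def missing_embedders(settings: dict, expected: dict) -> list:
    """Return names from `expected` ({name: source}) that are absent or have another source."""
    configured = settings.get("embedders") or {}
    return sorted(
        name
        for name, source in expected.items()
        if configured.get(name, {}).get("source") != source
    )


if __name__ == "__main__":
    settings = {"embedders": {"default": {"source": "userProvided", "dimensions": 384}}}
    expected = {"default": "userProvided", "content_embedder": "openAi"}
    print(missing_embedders(settings, expected))
```

An empty list means every expected embedder is present with the right source; anything else names the embedders to re-apply.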

Best Practices

Match dimensions exactly: Your embedder output dimensions must match dimensions in the embedder config. A mismatch causes indexing errors.
Normalize vectors: All Meilisearch vector operations use cosine similarity (dot product on normalized vectors). If your model outputs unnormalized vectors, normalize them client-side.
Prefer 512-d for most cases: 1536-d gives marginal recall gains over 512-d text-embedding-3-small for 3x the memory. Benchmark your own dataset.
Use userProvided for control: Auto-embedding sources (openAi, huggingFace, rest) add latency at index time and API costs. Pre-compute embeddings when you index in bulk.
Set limit conservatively: Larger limits force the vector search to collect and rank more candidates, so very high limits (500+) increase p99 latency significantly. The default limit of 20 is usually sufficient.
Combine full-text and vector filters carefully: Filters reduce the candidate pool but do not short-circuit the ANN index. To really limit scope, keep filter selectivity above 1% of the total index.
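Two of the practices above, matching dimensions and normalizing vectors, are easy to enforce in a few lines before indexing. This is a minimal pure-Python sketch with hypothetical helper names; with NumPy or a model library you would typically use their built-in normalization instead.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def prepare_vector(vector, dimensions: int):
    """Check the length against the embedder config, then normalize."""
    if len(vector) != dimensions:
        raise ValueError(f"expected {dimensions} dimensions, got {len(vector)}")
    return l2_normalize(vector)


if __name__ == "__main__":
    unit = prepare_vector([3.0, 4.0], dimensions=2)
    print(unit, math.sqrt(sum(x * x for x in unit)))
```

Applying the same preparation to document vectors and query vectors keeps scores comparable between indexing and search.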
