Meilisearch is built for full-text search, but vector embeddings unlock visual similarity queries that let you find images by content rather than tags. You feed images through a model like CLIP to produce vectors, index those vectors in Meilisearch, then search by comparing vector distances. This guide walks through model selection, batch processing, production deployment, caching, benchmarking, and ongoing management.
What is Image Search in Meilisearch?
Meilisearch supports vector search through user-provided embeddings. You generate a vector from an image using a machine learning model, attach that vector to a document, and Meilisearch indexes it for nearest-neighbor queries. Search works by encoding a query image (or text) into the same vector space and returning the closest matches by cosine or dot-product similarity.
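To make the nearest-neighbor idea concrete, here is a minimal brute-force sketch of ranking by cosine similarity. It is only an illustration: the function names are hypothetical, and Meilisearch itself relies on an approximate nearest-neighbor index rather than comparing the query against every stored vector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def nearest(query: list[float], vectors: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]
```

With real embeddings, the query vector would come from the same model that produced the stored vectors.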
Prerequisites
- Meilisearch v1.6+ for the embedders setting and hybrid search used in this guide (v1.12+ recommended for production vector features).
- Python 3.10+ with torch, transformers, meilisearch, and Pillow.
- An image dataset (local files or URLs).
- Docker and Docker Compose for production deployment.
Choosing an Embedding Model
The model you pick determines accuracy, speed, vector dimensions, and hardware requirements. No single model is best for every scenario.
Model Comparison
| Model | Dimensions | Accuracy (Recall@10) | Inference Speed (ms/img on GPU) | Model Size | Strengths | Weaknesses |
|---|---|---|---|---|---|---|
| CLIP ViT-B/32 | 512 | ~85% | 8-12 | 600 MB | Strong zero-shot, text+image joint space | Moderate accuracy for fine-grained domains |
| CLIP ViT-L/14 | 768 | ~91% | 20-30 | 1.7 GB | Higher accuracy, good detail capture | Slower, larger memory footprint |
| SigLIP ViT-SO400M | 1152 | ~93% | 25-35 | 2.1 GB | Sigmoid loss improves fine-grained matching | Larger vectors increase index size |
| BLIP-2 | 768 | ~89% | 30-45 | 3.9 GB | Multimodal understanding, caption-aware | Heavy model, higher latency |
| DINOv2 ViT-L | 1024 | ~88% | 20-28 | 1.6 GB | Self-supervised, excellent feature diversity | No built-in text joint embedding |
| CLIP ViT-L/14@336px | 768 | ~93% | 35-50 | 1.8 GB | Highest CLIP accuracy | 336px input, slower pre-processing |
For most applications, start with CLIP ViT-B/32. It balances speed and accuracy. Upgrade to ViT-L/14 when you need better retrieval quality. Use SigLIP if your dataset has fine-grained visual distinctions. DINOv2 is a strong choice if you only do image-to-image search (no text queries).
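Vector dimensions translate directly into storage, so it can help to estimate raw vector size before committing to a model. The sketch below assumes 4-byte float32 values and uses the dimensions from the table above; it ignores index overhead, so treat the numbers as a lower bound. The helper names are hypothetical.

```python
BYTES_PER_FLOAT32 = 4

# Dimensions copied from the comparison table above
MODEL_DIMENSIONS = {
    "CLIP ViT-B/32": 512,
    "CLIP ViT-L/14": 768,
    "SigLIP ViT-SO400M": 1152,
    "DINOv2 ViT-L": 1024,
}


def raw_vector_bytes(dimensions: int, n_images: int) -> int:
    """Bytes needed to hold n_images float32 vectors, without any index overhead."""
    return dimensions * BYTES_PER_FLOAT32 * n_images


def storage_gb(n_images: int) -> dict[str, float]:
    """Raw vector size in GB (10^9 bytes) for each model in MODEL_DIMENSIONS."""
    return {
        name: raw_vector_bytes(dims, n_images) / 1e9
        for name, dims in MODEL_DIMENSIONS.items()
    }
```

For one million images this gives roughly 2 GB for CLIP ViT-B/32 and about 4.6 GB for SigLIP.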
Generating Embeddings with CLIP
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


def get_image_embedding(image_path_or_url: str) -> list[float]:
    # Load from a URL or a local path; convert to RGB so grayscale and RGBA files work
    if image_path_or_url.startswith("http"):
        response = requests.get(image_path_or_url, stream=True, timeout=10)
        response.raise_for_status()
        image = Image.open(response.raw).convert("RGB")
    else:
        image = Image.open(image_path_or_url).convert("RGB")
    inputs = processor(images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        embedding = model.get_image_features(**inputs)
    # Shape (1, 512) -> flat list of 512 floats
    return embedding.squeeze().cpu().tolist()
Generating Embeddings with SigLIP
from transformers import AutoModel, AutoProcessor

# Reuses requests, torch, Image, and device from the CLIP example
siglip_model = AutoModel.from_pretrained(
    "google/siglip-so400m-patch14-384"
).to(device)
siglip_processor = AutoProcessor.from_pretrained(
    "google/siglip-so400m-patch14-384"
)


def get_siglip_embedding(image_path_or_url: str) -> list[float]:
    if image_path_or_url.startswith("http"):
        image = Image.open(requests.get(image_path_or_url, stream=True).raw)
    else:
        image = Image.open(image_path_or_url)
    inputs = siglip_processor(
        images=image, return_tensors="pt"
    ).to(device)
    with torch.no_grad():
        embedding = siglip_model.get_image_features(**inputs)
    return embedding.squeeze().cpu().tolist()
Generating Embeddings with DINOv2
from transformers import AutoImageProcessor, Dinov2Model

# Reuses requests, torch, Image, and device from the CLIP example
dinov2_model = Dinov2Model.from_pretrained(
    "facebook/dinov2-large"
).to(device)
dinov2_processor = AutoImageProcessor.from_pretrained(
    "facebook/dinov2-large"
)


def get_dinov2_embedding(image_path_or_url: str) -> list[float]:
    if image_path_or_url.startswith("http"):
        image = Image.open(requests.get(image_path_or_url, stream=True).raw)
    else:
        image = Image.open(image_path_or_url)
    inputs = dinov2_processor(
        images=image, return_tensors="pt"
    ).to(device)
    with torch.no_grad():
        outputs = dinov2_model(**inputs)
    # Mean-pool the token embeddings into a single 1024-dim vector
    return outputs.last_hidden_state.mean(dim=1).squeeze().cpu().tolist()
Architecture: Image Search Pipeline
The pipeline has four stages:
[Image Upload]  →  [Embedding Service]  →  [Meilisearch Index]  →  [Query Gateway]
       │                    │                       │                      │
   Resize &            Encode with            Store vectors          Encode query
   normalize           chosen model            + metadata             image/text
       │                    │                       │                      │
       v                    v                       v                      v
 Static storage       GPU/CPU worker          userProvided           Vector search
 (S3, local FS)     (batched inference)      embedder config            via API
- Ingestion: Images arrive via upload or batch import. Resize and normalize them.
- Embedding Service: A Python/Triton service runs the model, batches images for GPU efficiency, and returns vectors.
- Meilisearch Index: Each document gets a _vectors field with the embedding. Configure the embedder as userProvided with matching dimensions.
- Query Gateway: Encodes query images at request time, hits Meilisearch’s vector search endpoint, and returns ranked results.
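The contract between the embedding service and the index is the document shape. The following sketch shows one way to assemble a document and the matching embedder settings; build_document and embedder_settings are hypothetical helper names, and the embedder name "default" simply mirrors the rest of this guide.

```python
def build_document(
    image_id: str,
    image_url: str,
    vector: list[float],
    embedder: str = "default",
    **metadata,
) -> dict:
    """Assemble a document with its embedding under _vectors.<embedder>."""
    if not vector:
        raise ValueError("vector must not be empty")
    doc = {"id": image_id, "image_url": image_url, **metadata}
    doc["_vectors"] = {embedder: [float(x) for x in vector]}
    return doc


def embedder_settings(dimensions: int, embedder: str = "default") -> dict:
    """Settings payload declaring a userProvided embedder of the given size."""
    return {
        "embedders": {
            embedder: {"source": "userProvided", "dimensions": dimensions}
        }
    }
```

The dimensions passed to embedder_settings should equal the length of every vector passed to build_document.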
Batch Processing Large Image Collections
Processing 10,000+ images one at a time is painfully slow. Batch inference on GPU and parallel CPU workers cut that down dramatically.
Parallel Batch Embedding with Asyncio
import asyncio
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import meilisearch
import torch
from PIL import Image

# Reuses device, model, and processor from the CLIP example
executor = ThreadPoolExecutor(max_workers=4)


async def process_batch(
    image_paths: list[str],
    model: torch.nn.Module,
    processor,
    batch_size: int = 32,
) -> list[dict]:
    loop = asyncio.get_running_loop()
    documents = []
    for i in range(0, len(image_paths), batch_size):
        batch_paths = image_paths[i:i + batch_size]
        # Run blocking inference in a worker thread so the event loop stays free
        batch_embeddings = await loop.run_in_executor(
            executor, _infer_batch, batch_paths, model, processor
        )
        for path, emb in zip(batch_paths, batch_embeddings):
            documents.append({
                "id": Path(path).stem,
                "image_url": str(path),
                "_vectors": {"default": emb},
            })
    return documents


def _infer_batch(batch_paths, model, processor):
    images = [Image.open(p).convert("RGB") for p in batch_paths]
    inputs = processor(images=images, return_tensors="pt").to(device)
    with torch.no_grad():
        vecs = model.get_image_features(**inputs)
    return vecs.cpu().tolist()


async def index_all_images(image_dir: str, batch_size: int = 32):
    # First 5,000 files only; drop the slice to process the whole directory.
    # Paths are converted to str so the documents stay JSON-serializable.
    paths = [str(p) for p in sorted(Path(image_dir).glob("*.jpg"))[:5000]]
    client = meilisearch.Client("http://localhost:7700", "master_key")
    for i in range(0, len(paths), 100):
        batch_paths = paths[i:i + 100]
        docs = await process_batch(batch_paths, model, processor, batch_size)
        # add_documents only enqueues an indexing task
        client.index("images").add_documents(docs)
        print(f"Queued {i + len(batch_paths)}/{len(paths)} images")


asyncio.run(index_all_images("./dataset"))
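The script above batches at two levels: upload chunks of 100 documents, each split into inference batches of 32. A small sketch of that chunking logic in isolation, with hypothetical helper names, makes the resulting batch shapes easy to check:

```python
from collections.abc import Iterator, Sequence


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_batches(paths: Sequence, upload_size: int = 100, infer_size: int = 32) -> list[list]:
    """Group paths into upload chunks, each split into inference batches."""
    return [
        list(chunked(upload, infer_size))
        for upload in chunked(paths, upload_size)
    ]
```

Note that an upload chunk of 100 yields inference batches of 32, 32, 32, and 4; choosing an upload size that is a multiple of the inference batch size avoids the small trailing batch.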
Rate Limiting and Memory Management
Model inference consumes GPU memory. Large batches cause OOM errors. Control batch size and add rate limiting.
import time
from functools import wraps

import torch
from PIL import Image


def rate_limit(max_per_second: int):
    interval = 1.0 / max_per_second
    last_call = [0.0]  # mutable cell shared by all calls to the wrapped function

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # monotonic() is not affected by wall-clock adjustments
            elapsed = time.monotonic() - last_call[0]
            if elapsed < interval:
                time.sleep(interval - elapsed)
            result = func(*args, **kwargs)
            last_call[0] = time.monotonic()
            return result
        return wrapper
    return decorator


@rate_limit(max_per_second=10)
def safe_infer(image: Image.Image, model, processor):
    inputs = processor(images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        return model.get_image_features(**inputs).squeeze().cpu().tolist()


def clear_gpu_cache():
    # Releases cached, unused GPU memory held by PyTorch's allocator
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
Monitor GPU memory with nvidia-smi between batches and call clear_gpu_cache() every 1000 images to avoid fragmentation.
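One way to wire in the "every 1000 images" rule is a small counter that fires a callback whenever a threshold is crossed. This is a sketch under the assumption that you count images per batch; PeriodicHook is a hypothetical name, and in practice the callback would be clear_gpu_cache.

```python
from collections.abc import Callable


class PeriodicHook:
    """Call `callback` each time the running item count crosses a multiple of `every`."""

    def __init__(self, every: int, callback: Callable[[], None]):
        if every < 1:
            raise ValueError("every must be at least 1")
        self.every = every
        self.callback = callback
        self.seen = 0

    def tick(self, count: int = 1) -> None:
        """Record `count` processed items and fire the callback if a threshold was crossed."""
        before = self.seen // self.every
        self.seen += count
        if self.seen // self.every > before:
            self.callback()
```

Calling hook.tick(len(batch)) after each inference batch keeps the cleanup logic out of the inference code.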
Precomputed Embedding Storage and Management
Storing embeddings in Meilisearch is fine for search, but you also need durable storage for reindexing and rollback.
Embedding Cache with Parquet
from pathlib import Path

import pandas as pd  # to_parquet/read_parquet need pyarrow or fastparquet installed

EMBEDDING_FILE = "image_embeddings.parquet"


def save_embeddings(documents: list[dict]):
    records = [
        {
            "id": doc["id"],
            "image_url": doc["image_url"],
            "embedding": doc["_vectors"]["default"],
        }
        for doc in documents
    ]
    pd.DataFrame(records).to_parquet(EMBEDDING_FILE, index=False)


def load_embeddings() -> list[dict]:
    if not Path(EMBEDDING_FILE).exists():
        return []
    df = pd.read_parquet(EMBEDDING_FILE)
    docs = []
    for _, row in df.iterrows():
        docs.append({
            "id": row["id"],
            "image_url": row["image_url"],
            # The column comes back as a NumPy array; convert it to plain floats
            # so the document is JSON-serializable
            "_vectors": {"default": [float(x) for x in row["embedding"]]},
        })
    return docs
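Because save_embeddings rewrites the whole file, incremental runs need to merge new records into the stored ones first. A minimal sketch of that merge, plus a helper that lists which images still lack an embedding, could look like this (both function names are hypothetical):

```python
def merge_records(existing: list[dict], new: list[dict]) -> list[dict]:
    """Upsert `new` records into `existing` by id; newer records win."""
    merged = {rec["id"]: rec for rec in existing}
    for rec in new:
        merged[rec["id"]] = rec
    return list(merged.values())


def missing_ids(all_ids: list[str], stored: list[dict]) -> list[str]:
    """Ids from `all_ids` that have no stored embedding yet, in input order."""
    stored_ids = {rec["id"] for rec in stored}
    return [image_id for image_id in all_ids if image_id not in stored_ids]
```

A typical run would load the stored records, embed only the ids returned by missing_ids, merge, and save.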
Embedding Caching with Redis
Cache recently computed embeddings to avoid re-encoding the same image.
import asyncio
import hashlib
import json

import redis.asyncio as aioredis


class EmbeddingCache:
    def __init__(self, redis_url: str = "redis://localhost:6379"):
        self.redis = aioredis.from_url(redis_url)
        self.ttl = 86400  # 24 hours

    def _key(self, image_url: str) -> str:
        # MD5 only shortens the key here; it is not used for security
        return f"emb:{hashlib.md5(image_url.encode()).hexdigest()}"

    async def get(self, image_url: str) -> list[float] | None:
        data = await self.redis.get(self._key(image_url))
        return json.loads(data) if data else None

    async def set(self, image_url: str, embedding: list[float]):
        await self.redis.setex(
            self._key(image_url), self.ttl, json.dumps(embedding)
        )


cache = EmbeddingCache()


async def cached_embed(image_url: str) -> list[float]:
    cached = await cache.get(image_url)
    if cached is not None:
        return cached
    # Inference blocks, so run it in a thread to keep the event loop responsive
    embedding = await asyncio.to_thread(get_image_embedding, image_url)
    await cache.set(image_url, embedding)
    return embedding
Hybrid Image Search: Text + Visual Features
Pure image-to-image search ignores metadata. Hybrid search filters by categories, tags, or dates while ranking by visual similarity.
Indexing with Metadata and Vectors
documents = [
    {
        "id": "img_001",
        "title": "Sunset over Golden Gate Bridge",
        "image_url": "https://example.com/sunset.jpg",
        "category": "landscape",
        "tags": ["sunset", "bridge", "san-francisco"],
        "date_taken": "2025-06-15",
        # embedding_001 is the 512-float list returned by get_image_embedding
        "_vectors": {"default": embedding_001},
    },
]
Configure Filterable Attributes
index = client.index("images")
index.update_settings({
    "embedders": {
        "default": {
            "source": "userProvided",
            "dimensions": 512
        }
    },
    "filterableAttributes": ["category", "tags", "date_taken"],
    "sortableAttributes": ["date_taken"]
})
Query with Filters
def search_images(
    query_embedding: list[float],
    category: str | None = None,
    tags: list[str] | None = None,
    limit: int = 20,
) -> list[dict]:
    filter_parts = []
    if category:
        filter_parts.append(f"category = '{category}'")
    if tags:
        tag_filters = [f"tags = '{t}'" for t in tags]
        filter_parts.append(f"({' AND '.join(tag_filters)})")
    params = {
        "vector": query_embedding,
        # Newer Meilisearch releases may require the embedder name alongside
        # "vector"; semanticRatio 1.0 keeps the ranking purely vector-based
        "hybrid": {"embedder": "default", "semanticRatio": 1.0},
        "limit": limit,
    }
    # Only send a filter when there is one
    if filter_parts:
        params["filter"] = " AND ".join(filter_parts)
    return client.index("images").search("", params)["hits"]
Text-to-Image Hybrid Search
Combine a text query with vector similarity using Meilisearch’s hybrid search.
text = "urban landscape"
text_inputs = processor(text=text, return_tensors="pt").to(device)
with torch.no_grad():
    text_emb = model.get_text_features(**text_inputs).squeeze().cpu().tolist()
results = client.index("images").search(text, {
    "vector": text_emb,
    "hybrid": {"embedder": "default", "semanticRatio": 0.7},
    "limit": 10,
})
Production Deployment with Docker Compose
Run Meilisearch and the embedding service as containers. Scale the embedding service independently.
# docker-compose.yml
version: "3.9"
services:
  meilisearch:
    image: getmeilisearch/meilisearch:v1.12
    ports:
      - "7700:7700"
    environment:
      MEILI_MASTER_KEY: ${MEILI_MASTER_KEY}
      MEILI_ENV: production
    volumes:
      - meili_data:/meili_data
    deploy:
      resources:
        limits:
          memory: 4G
  embedding-service:
    build:
      context: .
      dockerfile: Dockerfile.embedding
    ports:
      - "8000:8000"
    environment:
      MODEL_NAME: "openai/clip-vit-base-patch32"
      DEVICE: "cuda"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    volumes:
      - embedding_cache:/cache
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
volumes:
  meili_data:
  embedding_cache:
  redis_data:
Embedding Service API
# embedding_service.py
import meilisearch
import torch
from fastapi import FastAPI, UploadFile  # file uploads need python-multipart installed
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

app = FastAPI()
device = "cuda" if torch.cuda.is_available() else "cpu"
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")


@app.post("/embed")
async def embed_image(file: UploadFile):
    image = Image.open(file.file).convert("RGB")
    inputs = processor(images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        vec = model.get_image_features(**inputs)
    return {"embedding": vec.squeeze().cpu().tolist()}


@app.post("/embed-text")
async def embed_text(text: str):
    # text arrives as a query parameter, e.g. POST /embed-text?text=sunset
    inputs = processor(text=text, return_tensors="pt").to(device)
    with torch.no_grad():
        vec = model.get_text_features(**inputs)
    return {"embedding": vec.squeeze().cpu().tolist()}


@app.post("/search-by-image")
async def search_by_image(file: UploadFile, limit: int = 10):
    emb = await embed_image(file)
    client = meilisearch.Client("http://meilisearch:7700", "master_key")
    results = client.index("images").search("", {
        "vector": emb["embedding"],
        # Newer Meilisearch releases may require the embedder name with "vector"
        "hybrid": {"embedder": "default", "semanticRatio": 1.0},
        "limit": limit,
    })
    return results["hits"]
Performance Benchmarking
Measure indexing throughput and search latency to understand capacity and plan hardware.
Benchmarking Script
import statistics
import time

import meilisearch


def benchmark_indexing(documents: list[dict], batch_sizes: list[int]):
    client = meilisearch.Client("http://localhost:7700", "master_key")
    for batch_size in batch_sizes:
        # Index operations are asynchronous tasks, so wait for each to finish
        client.wait_for_task(client.index("images_bench").delete().task_uid)
        client.wait_for_task(client.create_index("images_bench").task_uid)
        task = client.index("images_bench").update_settings({
            "embedders": {
                "default": {"source": "userProvided", "dimensions": 512}
            }
        })
        client.wait_for_task(task.task_uid)
        start = time.perf_counter()
        tasks = []
        for i in range(0, len(documents), batch_size):
            batch = documents[i:i + batch_size]
            tasks.append(client.index("images_bench").add_documents(batch))
        # add_documents only enqueues; wait so the timing covers actual indexing
        for task in tasks:
            client.wait_for_task(task.task_uid, timeout_in_ms=600_000)
        elapsed = time.perf_counter() - start
        rate = len(documents) / elapsed
        print(f"Batch size {batch_size}: {len(documents)} docs in {elapsed:.1f}s = {rate:.0f} docs/s")


def benchmark_search(embedding: list[float], n_queries: int = 100):
    client = meilisearch.Client("http://localhost:7700", "master_key")
    latencies = []
    for _ in range(n_queries):
        start = time.perf_counter()
        client.index("images_bench").search("", {
            "vector": embedding,
            # Newer Meilisearch releases may require the embedder name with "vector"
            "hybrid": {"embedder": "default", "semanticRatio": 1.0},
            "limit": 10,
        })
        latencies.append((time.perf_counter() - start) * 1000)
    p50 = statistics.median(latencies)
    p99 = sorted(latencies)[int(len(latencies) * 0.99)]
    print(f"Search latency (ms): P50 {p50:.1f}, P99 {p99:.1f}")


# Typical results on 10K images, CLIP ViT-B/32, 512-dim vectors:
# Batch size 32: 1420 docs/s
# Batch size 128: 2100 docs/s
# Search P50: 4.2ms, P99: 12.8ms
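The P99 in the script is taken by indexing into the sorted list, which for exactly 100 samples returns the maximum. If you want a definition that behaves predictably for any sample count, the nearest-rank percentile is one common choice. The sketch below is a standalone illustration with a hypothetical function name.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]
```

With small sample counts the tail percentiles are dominated by a handful of requests, so more queries give a more stable P99.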
Monitoring Vector Search Performance
Track these metrics over time to catch degradation.
import meilisearch
import psutil


def monitor_index_stats(index_name: str = "images"):
    client = meilisearch.Client("http://localhost:7700", "master_key")
    index_stats = client.index(index_name).get_stats()
    print(f"Number of documents: {index_stats.number_of_documents}")
    # Database size and last update come from the instance-level stats
    all_stats = client.get_all_stats()
    print(f"Database size: {all_stats['databaseSize'] / 1024 / 1024:.1f} MB")
    print(f"Last update: {all_stats['lastUpdate']}")
    # psutil reports the machine running this script, so run it on the Meilisearch host
    print(f"Memory usage: {psutil.virtual_memory().percent}%")
    # Compare vector field cardinality
    # Sudden drops indicate deletion or corruption
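Detecting a "sudden drop" needs a threshold. A minimal sketch that compares two consecutive document counts is shown below; the 5% default is an arbitrary assumption for illustration, and detect_drop is a hypothetical name.

```python
def detect_drop(previous: int, current: int, max_drop_ratio: float = 0.05) -> bool:
    """Return True when the count fell by more than max_drop_ratio since the last check."""
    if previous <= 0:
        # Nothing to compare against yet
        return False
    return (previous - current) / previous > max_drop_ratio
```

In a scheduled job you would persist the last observed count and alert when this returns True.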
Handling Image Updates and Deletions
When images change, regenerate the embedding and update the document. Schedule periodic re-indexing for stale data.
Update Pipeline
def update_image(doc_id: str, new_image_url: str):
    # Re-encode the image and overwrite both the URL and the stored vector
    new_embedding = get_image_embedding(new_image_url)
    client.index("images").update_documents([
        {"id": doc_id, "image_url": new_image_url,
         "_vectors": {"default": new_embedding}}
    ])


def delete_image(doc_id: str):
    client.index("images").delete_document(doc_id)
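To decide which documents need update_image or delete_image, you can compare the current dataset with what was indexed last time. The sketch below assumes you keep a content hash per document id on both sides; plan_sync is a hypothetical helper, not part of any Meilisearch client.

```python
def plan_sync(source: dict[str, str], indexed: dict[str, str]) -> dict[str, list[str]]:
    """Compare id -> content-hash maps and list the ids to add, update, and delete."""
    to_add = sorted(source.keys() - indexed.keys())
    to_delete = sorted(indexed.keys() - source.keys())
    to_update = sorted(
        doc_id
        for doc_id in source.keys() & indexed.keys()
        if source[doc_id] != indexed[doc_id]
    )
    return {"add": to_add, "update": to_update, "delete": to_delete}
```

Only the ids under "add" and "update" need new embeddings, which keeps periodic re-indexing cheap when most images are unchanged.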
CI/CD Re-indexing Pipeline
For datasets that change frequently, automate re-indexing with a scheduled pipeline.
# .github/workflows/reindex-images.yml
name: Reindex Images
on:
  schedule:
    - cron: "0 6 * * 0" # weekly
  workflow_dispatch:
jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: |
          pip install torch transformers meilisearch Pillow requests
      - name: Generate embeddings
        run: python scripts/generate_embeddings.py --input ./dataset --output embeddings.parquet
      - name: Upload to Meilisearch
        env:
          MEILI_URL: ${{ secrets.MEILI_URL }}
          MEILI_KEY: ${{ secrets.MEILI_KEY }}
        run: python scripts/upload_embeddings.py --file embeddings.parquet
Troubleshooting
Dimension Mismatch
Meilisearch rejects documents when the vector dimension does not match the embedder configuration. Always verify dimensions match.
embedder_config = {
    "source": "userProvided",
    "dimensions": 512,
}
# Must match your model output
assert len(embedding) == 512, f"Expected 512, got {len(embedding)}"
If you switch models, create a new index instead of reusing the old one with mismatched dimensions.
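Checking a whole batch before upload gives clearer errors than a failed indexing task. The following sketch scans documents for missing, wrongly sized, or non-finite vectors; find_invalid_vectors is a hypothetical helper and the embedder name "default" mirrors the configuration used in this guide.

```python
import math


def find_invalid_vectors(
    documents: list[dict], expected_dim: int, embedder: str = "default"
) -> list[tuple[str, str]]:
    """Return (document id, problem) pairs for vectors that would not fit the embedder."""
    problems = []
    for doc in documents:
        vec = doc.get("_vectors", {}).get(embedder)
        if vec is None:
            problems.append((doc["id"], "missing vector"))
        elif len(vec) != expected_dim:
            problems.append((doc["id"], f"expected {expected_dim} dimensions, got {len(vec)}"))
        elif not all(math.isfinite(x) for x in vec):
            problems.append((doc["id"], "non-finite value"))
    return problems
```

An empty result means every document carries a vector of the expected size.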
GPU Out of Memory
Reduce batch size. For CLIP ViT-L/14 on a 16 GB GPU, keep batch size under 64.
# Progressive batch-size reduction
for batch_size in [128, 64, 32, 16, 8]:
    try:
        # process_batch is a coroutine, so it has to be run to completion here
        docs = asyncio.run(process_batch(paths, model, processor, batch_size))
        break
    except torch.cuda.OutOfMemoryError:
        clear_gpu_cache()
        continue
Slow Search Queries
Search latency increases with collection size and vector dimensions.
- Reduce dimensions (e.g., switch from SigLIP 1152 to CLIP 512).
- Lower Meilisearch’s searchCutoffMs setting if you prefer bounded latency over recall.
- Partition images into multiple indexes by category and query the relevant one.
- Add more RAM to the Meilisearch host, since vector search is memory-bound.
Stale Embeddings After Re-index
Clear the embedding cache when you update the model, or version your cache keys.
class VersionedCache(EmbeddingCache):
    def __init__(self, model_version: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.model_version = model_version

    def _key(self, image_url: str) -> str:
        # Prefix keys with the model version so a model change never hits old entries
        base = super()._key(image_url)
        return f"{self.model_version}:{base}"
Best Practices
- Precompute embeddings once and store them in durable storage (Parquet, S3, Redis). Do not recompute them on every re-index.
- Resize images before encoding: 224x224 for CLIP, 384x384 for SigLIP. Larger images waste GPU cycles.
- Monitor index memory: each 512-dim float32 vector takes ~2 KB (512 × 4 bytes), so 1 million images need ~2 GB for the raw vectors alone.
- Benchmark before deploying: run the benchmarking script on a representative subset to set realistic SLOs.
- Use a staging index: run re-indexing against a staging index, then swap it with the live index using Meilisearch’s swap-indexes endpoint when ready.
- Version your models: encode the model name in the embedder key so you can switch models without downtime.
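For the last point, one possible naming scheme derives the embedder key from the model identifier, so two models can sit side by side in the same index during a migration. The helpers below are hypothetical, and the key format is an assumption rather than a Meilisearch requirement.

```python
import re


def embedder_name(model_id: str) -> str:
    """Turn a model id such as 'openai/clip-vit-base-patch32' into a simple embedder key."""
    return re.sub(r"[^a-z0-9]+", "-", model_id.lower()).strip("-")


def embedders_config(models: dict[str, int]) -> dict[str, dict]:
    """Build the embedders settings for several models, keyed by their derived names."""
    return {
        embedder_name(model_id): {"source": "userProvided", "dimensions": dims}
        for model_id, dims in models.items()
    }
```

Queries then pick a model by passing the matching key as the embedder name, and the old embedder can be removed once traffic has moved over.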
Conclusion
Image search in Meilisearch is practical today. Pick a model that fits your accuracy and latency budget, precompute embeddings with batched inference, cache aggressively, and monitor search performance in production. The combination of Meilisearch’s fast vector engine with a dedicated embedding service handles datasets from hundreds to millions of images.
For more, see the Meilisearch vector search docs and the Meilisearch GitHub repository.
Resources
- Meilisearch Vector Search Documentation
- Meilisearch GitHub
- OpenAI CLIP Model Card
- SigLIP on Hugging Face
- DINOv2 on Hugging Face
- BLIP-2 on Hugging Face
- Redis PyPI Page
- PyTorch Documentation
- Hugging Face Transformers