How to Use Image Search in Meilisearch

Published: November 25, 2025 Updated: May 8, 2026 Larry Qu 11 min read

Meilisearch is built for full-text search, but vector embeddings unlock visual similarity queries that let you find images by content rather than tags. You feed images through a model like CLIP to produce vectors, index those vectors in Meilisearch, then search by comparing vector distances. This guide walks through model selection, batch processing, production deployment, caching, benchmarking, and ongoing management.

What is Image Search in Meilisearch?

Meilisearch supports vector search through user-provided embeddings. You generate a vector from an image using a machine learning model, attach that vector to a document, and Meilisearch indexes it for nearest-neighbor queries. Search works by encoding a query image (or text) into the same vector space and returning the closest matches by cosine or dot-product similarity.
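To make the nearest-neighbor idea concrete, here is a minimal brute-force sketch using cosine similarity, one of the two measures named above. The three-dimensional vectors and the names `cosine_similarity`, `nearest`, and `toy_index` are illustrative assumptions; Meilisearch itself maintains a vector index rather than scanning every document this way.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def nearest(query: list[float], docs: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Brute-force top-k: score every document, then sort by similarity."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy "index": real image embeddings have hundreds of dimensions.
toy_index = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.1],
    "car": [0.0, 0.2, 0.9],
}
top = nearest([1.0, 0.0, 0.0], toy_index, k=2)
print([doc_id for doc_id, _ in top])
```

The query vector points almost exactly along the "cat" vector, so "cat" ranks first and "dog" second, while "car" is nearly orthogonal and drops out of the top two.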

Prerequisites

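  • Meilisearch v1.3+ (v1.12+ recommended for production vector features). On releases that still ship vector search as an experimental feature, enable the vectorStore experimental feature before configuring embedders.
  • Python 3.10+ with torch, transformers, meilisearch, Pillow, and requests. Later sections also use pandas, pyarrow, redis, fastapi, and psutil.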
  • An image dataset (local files or URLs).
  • Docker and Docker Compose for production deployment.

Choosing an Embedding Model

The model you pick determines accuracy, speed, vector dimensions, and hardware requirements. No single model is best for every scenario.

Model Comparison

| Model | Dimensions | Accuracy (Recall@10) | Inference speed (ms/image, GPU) | Model size | Strengths | Weaknesses |
|---|---|---|---|---|---|---|
| CLIP ViT-B/32 | 512 | ~85% | 8-12 | 600 MB | Strong zero-shot, text+image joint space | Moderate accuracy for fine-grained domains |
| CLIP ViT-L/14 | 768 | ~91% | 20-30 | 1.7 GB | Higher accuracy, good detail capture | Slower, larger memory footprint |
| SigLIP ViT-SO400M | 1152 | ~93% | 25-35 | 2.1 GB | Sigmoid loss improves fine-grained matching | Larger vectors increase index size |
| BLIP-2 | 768 | ~89% | 30-45 | 3.9 GB | Multimodal understanding, caption-aware | Heavy model, higher latency |
| DINOv2 ViT-L | 1024 | ~88% | 20-28 | 1.6 GB | Self-supervised, excellent feature diversity | No built-in text joint embedding |
| CLIP ViT-L/14@336px | 768 | ~93% | 35-50 | 1.8 GB | Highest CLIP accuracy | 336px input, slower pre-processing |

For most applications, start with CLIP ViT-B/32. It balances speed and accuracy. Upgrade to ViT-L/14 when you need better retrieval quality. Use SigLIP if your dataset has fine-grained visual distinctions. DINOv2 is a strong choice if you only do image-to-image search (no text queries).

Generating Embeddings with CLIP

import torch
from transformers import CLIPProcessor, CLIPModel
from PIL import Image
import requests

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

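def get_image_embedding(image_path_or_url: str) -> list[float]:
    """Return the CLIP image embedding for a local path or an HTTP(S) URL."""
    if image_path_or_url.startswith(("http://", "https://")):
        # Fail fast on slow hosts and on 4xx/5xx responses.
        response = requests.get(image_path_or_url, stream=True, timeout=10)
        response.raise_for_status()
        image = Image.open(response.raw)
    else:
        image = Image.open(image_path_or_url)
    # The model expects 3-channel input; convert grayscale, palette, and RGBA images.
    inputs = processor(images=image.convert("RGB"), return_tensors="pt").to(device)
    with torch.no_grad():
        embedding = model.get_image_features(**inputs)
    return embedding.squeeze(0).cpu().tolist()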

Generating Embeddings with SigLIP

from transformers import AutoProcessor, AutoModel

siglip_model = AutoModel.from_pretrained(
    "google/siglip-so400m-patch14-384"
).to(device)
siglip_processor = AutoProcessor.from_pretrained(
    "google/siglip-so400m-patch14-384"
)

def get_siglip_embedding(image_path_or_url: str) -> list[float]:
    """Return the SigLIP image embedding (1152 dimensions for this checkpoint)."""
    if image_path_or_url.startswith(("http://", "https://")):
        response = requests.get(image_path_or_url, stream=True, timeout=10)
        response.raise_for_status()
        image = Image.open(response.raw)
    else:
        image = Image.open(image_path_or_url)
    inputs = siglip_processor(
        images=image.convert("RGB"), return_tensors="pt"
    ).to(device)
    with torch.no_grad():
        embedding = siglip_model.get_image_features(**inputs)
    return embedding.squeeze(0).cpu().tolist()

Generating Embeddings with DINOv2

from transformers import AutoImageProcessor, Dinov2Model

dinov2_model = Dinov2Model.from_pretrained(
    "facebook/dinov2-large"
).to(device)
dinov2_processor = AutoImageProcessor.from_pretrained(
    "facebook/dinov2-large"
)

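def get_dinov2_embedding(image_path_or_url: str) -> list[float]:
    """Return a DINOv2 image embedding (1024 dimensions for the large model)."""
    if image_path_or_url.startswith(("http://", "https://")):
        response = requests.get(image_path_or_url, stream=True, timeout=10)
        response.raise_for_status()
        image = Image.open(response.raw)
    else:
        image = Image.open(image_path_or_url)
    inputs = dinov2_processor(
        images=image.convert("RGB"), return_tensors="pt"
    ).to(device)
    with torch.no_grad():
        outputs = dinov2_model(**inputs)
    # DINOv2 has no projection head here, so mean-pool the token embeddings.
    return outputs.last_hidden_state.mean(dim=1).squeeze(0).cpu().tolist()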

Architecture: Image Search Pipeline

The pipeline has four stages:

[Image Upload] → [Embedding Service] → [Meilisearch Index] → [Query Gateway]
     │                   │                      │                     │
     │  Resize &         │  Encode with         │  Store vectors      │  Encode query
     │  normalize        │  chosen model        │  + metadata         │  image/text
     │                   │                      │                     │
     v                   v                      v                     v
  Static storage     GPU/CPU worker         userProvided         Vector search
  (S3, local FS)    (batched inference)     embedder config      via API

  1. Ingestion: Images arrive via upload or batch import. Resize and normalize them.
  2. Embedding Service: A Python/Triton service runs the model, batches images for GPU efficiency, and returns vectors.
  3. Meilisearch Index: Each document gets an _vectors field with the embedding. Configure the embedder as userProvided with matching dimensions.
  4. Query Gateway: Encodes query images at request time, hits Meilisearch’s vector search endpoint, and returns ranked results.
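As a rough illustration of stage 3, the sketch below builds the two payloads involved: the settings object that declares a userProvided embedder, and a document that carries its embedding under _vectors. The helper names `embedder_settings` and `build_document` are hypothetical, and the 4-dimensional vector is only there to keep the example short.

```python
def embedder_settings(name: str, dimensions: int) -> dict:
    """Settings payload declaring a user-provided embedder."""
    return {"embedders": {name: {"source": "userProvided", "dimensions": dimensions}}}

def build_document(doc_id: str, image_url: str, embedding: list[float], embedder: str) -> dict:
    """Document payload: metadata plus the vector, keyed by embedder name."""
    return {
        "id": doc_id,
        "image_url": image_url,
        "_vectors": {embedder: embedding},
    }

# Toy values; a CLIP ViT-B/32 vector would have 512 entries.
settings = embedder_settings("default", 4)
doc = build_document("img_001", "https://example.com/a.jpg", [0.1, 0.2, 0.3, 0.4], "default")
print(settings)
print(doc)
```

The key under _vectors has to be the same name used in the embedders setting, and the vector length has to equal the declared dimensions.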

Batch Processing Large Image Collections

Processing 10,000+ images one at a time is painfully slow. Batch inference on GPU and parallel CPU workers cut that down dramatically.

Parallel Batch Embedding with Asyncio

import asyncio
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import meilisearch
import torch
from PIL import Image

executor = ThreadPoolExecutor(max_workers=4)

async def process_batch(
    image_paths: list[str],
    model: torch.nn.Module,
    processor,
    batch_size: int = 32
) -> list[dict]:
    embeddings = []
    for i in range(0, len(image_paths), batch_size):
        batch_paths = image_paths[i:i + batch_size]
        batch_embeddings = await asyncio.get_running_loop().run_in_executor(
            executor,
            _infer_batch,
            batch_paths,
            model,
            processor
        )
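        for path, emb in zip(batch_paths, batch_embeddings):
            embeddings.append({
                # Document IDs may contain only letters, digits, hyphens, and underscores.
                "id": Path(path).stem,
                "image_url": str(path),  # Path objects are not JSON-serializable
                "_vectors": {"default": emb}
            })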
    return embeddings

def _infer_batch(batch_paths, model, processor):
    images = [Image.open(p).convert("RGB") for p in batch_paths]
    inputs = processor(images=images, return_tensors="pt").to(device)
    with torch.no_grad():
        vecs = model.get_image_features(**inputs)
    return vecs.cpu().tolist()

async def index_all_images(image_dir: str, batch_size: int = 32):
    # Cap this run at 5,000 files; drop the slice to index the whole directory.
    paths = sorted(Path(image_dir).glob("*.jpg"))[:5000]
    client = meilisearch.Client("http://localhost:7700", "master_key")

    for i in range(0, len(paths), 100):
        batch_paths = paths[i:i + 100]
        docs = await process_batch(batch_paths, model, processor, batch_size)
        client.index("images").add_documents(docs)
        print(f"Indexed {i + len(batch_paths)}/{len(paths)} images")

asyncio.run(index_all_images("./dataset"))

Rate Limiting and Memory Management

Model inference consumes GPU memory. Large batches cause OOM errors. Control batch size and add rate limiting.

import time
from functools import wraps

def rate_limit(max_per_second: int):
    interval = 1.0 / max_per_second
    last_call = [0.0]

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            elapsed = time.time() - last_call[0]
            if elapsed < interval:
                time.sleep(interval - elapsed)
            result = func(*args, **kwargs)
            last_call[0] = time.time()
            return result
        return wrapper
    return decorator

@rate_limit(max_per_second=10)
def safe_infer(image: Image.Image, model, processor):
    inputs = processor(images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        return model.get_image_features(**inputs).squeeze().cpu().tolist()

def clear_gpu_cache():
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

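Monitor GPU memory with nvidia-smi between batches. Calling clear_gpu_cache() periodically (for example, every 1,000 images) returns PyTorch's cached but unused blocks to the driver, which can reduce fragmentation. It does not free tensors that are still referenced.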

Precomputed Embedding Storage and Management

Storing embeddings in Meilisearch is fine for search, but you also need durable storage for reindexing and rollback.

Embedding Cache with Parquet

from pathlib import Path

import pandas as pd  # to_parquet/read_parquet need pyarrow installed

EMBEDDING_FILE = "image_embeddings.parquet"

def save_embeddings(documents: list[dict]):
    records = []
    for doc in documents:
        records.append({
            "id": doc["id"],
            "image_url": doc["image_url"],
            "embedding": doc["_vectors"]["default"]
        })
    df = pd.DataFrame(records)
    df.to_parquet(EMBEDDING_FILE, index=False)

def load_embeddings() -> list[dict]:
    if not Path(EMBEDDING_FILE).exists():
        return []
    df = pd.read_parquet(EMBEDDING_FILE)
    docs = []
    for _, row in df.iterrows():
        docs.append({
            "id": row["id"],
            "image_url": row["image_url"],
            # Parquet list columns come back as NumPy arrays; convert for JSON.
            "_vectors": {"default": row["embedding"].tolist()}
        })
    return docs

Embedding Caching with Redis

Cache recently computed embeddings to avoid re-encoding the same image.

import redis.asyncio as aioredis
import json
import hashlib

class EmbeddingCache:
    def __init__(self, redis_url: str = "redis://localhost:6379"):
        self.redis = aioredis.from_url(redis_url)
        self.ttl = 86400  # 24 hours

    def _key(self, image_url: str) -> str:
        return f"emb:{hashlib.md5(image_url.encode()).hexdigest()}"

    async def get(self, image_url: str) -> list[float] | None:
        data = await self.redis.get(self._key(image_url))
        return json.loads(data) if data else None

    async def set(self, image_url: str, embedding: list[float]):
        await self.redis.setex(
            self._key(image_url), self.ttl, json.dumps(embedding)
        )

cache = EmbeddingCache()

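async def cached_embed(image_url: str) -> list[float]:
    cached = await cache.get(image_url)
    if cached is not None:
        return cached
    # get_image_embedding is synchronous and blocks the event loop while it runs;
    # in a busy service, move it to a worker thread.
    embedding = get_image_embedding(image_url)
    await cache.set(image_url, embedding)
    return embedding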

Hybrid Image Search: Text + Visual Features

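Pure image-to-image search ignores metadata. Combining vector search with filters lets you restrict results by category, tags, or date while still ranking by visual similarity. Meilisearch's hybrid mode goes one step further and blends keyword relevance with vector similarity.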

Indexing with Metadata and Vectors

# embedding_001 is the vector returned by get_image_embedding() for this image.
documents = [
    {
        "id": "img_001",
        "title": "Sunset over Golden Gate Bridge",
        "image_url": "https://example.com/sunset.jpg",
        "category": "landscape",
        "tags": ["sunset", "bridge", "san-francisco"],
        "date_taken": "2025-06-15",
        "_vectors": {"default": embedding_001}
    },
]

Configure Filterable Attributes

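client = meilisearch.Client("http://localhost:7700", "master_key")
index = client.index("images")

# "dimensions" must equal the model's output size (512 for CLIP ViT-B/32).
index.update_settings({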
    "embedders": {
        "default": {
            "source": "userProvided",
            "dimensions": 512
        }
    },
    "filterableAttributes": ["category", "tags", "date_taken"],
    "sortableAttributes": ["date_taken"]
})

Query with Filters

def search_images(
    query_embedding: list[float],
    category: str | None = None,
    tags: list[str] | None = None,
    limit: int = 20
) -> list[dict]:
    filter_parts = []
    if category:
        filter_parts.append(f"category = '{category}'")
    if tags:
        tag_filters = [f"tags = '{t}'" for t in tags]
        filter_parts.append(f"({' AND '.join(tag_filters)})")
    filter_expr = " AND ".join(filter_parts) if filter_parts else None

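    return client.index("images").search("", {
        "vector": query_embedding,
        # Recent Meilisearch releases expect the embedder to be named alongside
        # "vector"; semanticRatio 1.0 ranks purely by vector similarity.
        "hybrid": {"embedder": "default", "semanticRatio": 1.0},
        "filter": filter_expr,
        "limit": limit
    })["hits"]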

Combine a text query with vector similarity using Meilisearch’s hybrid search.

text = "urban landscape"
text_inputs = processor(text=text, return_tensors="pt").to(device)
with torch.no_grad():
    text_emb = model.get_text_features(**text_inputs).squeeze().cpu().tolist()

results = client.index("images").search(text, {
    "vector": text_emb,
    "hybrid": {"embedder": "default", "semanticRatio": 0.7},
    "limit": 10
})

Production Deployment with Docker Compose

Run Meilisearch and the embedding service as containers. Scale the embedding service independently.

# docker-compose.yml
version: "3.9"

services:
  meilisearch:
    image: getmeilisearch/meilisearch:v1.12
    ports:
      - "7700:7700"
    environment:
      MEILI_MASTER_KEY: ${MEILI_MASTER_KEY}
      MEILI_ENV: production
    volumes:
      - meili_data:/meili_data
    deploy:
      resources:
        limits:
          memory: 4G

  embedding-service:
    build:
      context: .
      dockerfile: Dockerfile.embedding
    ports:
      - "8000:8000"
    environment:
      MODEL_NAME: "openai/clip-vit-base-patch32"
      DEVICE: "cuda"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    volumes:
      - embedding_cache:/cache
    depends_on:
      - meilisearch
      - redis

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  meili_data:
  embedding_cache:
  redis_data:

Embedding Service API

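# embedding_service.py
# UploadFile parsing requires the python-multipart package.
import os

import meilisearch
import torch
from fastapi import FastAPI, UploadFile
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# MODEL_NAME and DEVICE are set in docker-compose.yml. Reading the key from
# MEILI_MASTER_KEY is an assumption: pass that variable to this container too.
MODEL_NAME = os.environ.get("MODEL_NAME", "openai/clip-vit-base-patch32")
device = os.environ.get("DEVICE", "cuda" if torch.cuda.is_available() else "cpu")
MEILI_KEY = os.environ.get("MEILI_MASTER_KEY", "master_key")

app = FastAPI()
model = CLIPModel.from_pretrained(MODEL_NAME).to(device)
processor = CLIPProcessor.from_pretrained(MODEL_NAME)
client = meilisearch.Client("http://meilisearch:7700", MEILI_KEY)

def _embed_upload(file: UploadFile) -> list[float]:
    image = Image.open(file.file).convert("RGB")
    inputs = processor(images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        vec = model.get_image_features(**inputs)
    return vec.squeeze(0).cpu().tolist()

# Plain `def` endpoints run in FastAPI's thread pool, so blocking model
# inference does not stall the event loop.
@app.post("/embed")
def embed_image(file: UploadFile):
    return {"embedding": _embed_upload(file)}

@app.post("/embed-text")
def embed_text(text: str):
    inputs = processor(text=text, return_tensors="pt", padding=True, truncation=True).to(device)
    with torch.no_grad():
        vec = model.get_text_features(**inputs)
    return {"embedding": vec.squeeze(0).cpu().tolist()}

@app.post("/search-by-image")
def search_by_image(file: UploadFile, limit: int = 10):
    results = client.index("images").search("", {
        "vector": _embed_upload(file),
        "hybrid": {"embedder": "default", "semanticRatio": 1.0},
        "limit": limit
    })
    return results["hits"]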

Performance Benchmarking

Measure indexing throughput and search latency to understand capacity and plan hardware.

Benchmarking Script

import time
import statistics

import meilisearch

def benchmark_indexing(documents: list[dict], batch_sizes: list[int]):
    client = meilisearch.Client("http://localhost:7700", "master_key")

    def wait(task):
        # Index operations are asynchronous; block until Meilisearch finishes the task.
        return client.wait_for_task(task.task_uid, timeout_in_ms=600_000)

    for batch_size in batch_sizes:
        wait(client.index("images_bench").delete())  # the task fails harmlessly on the first run
        wait(client.create_index("images_bench", {"primaryKey": "id"}))
        wait(client.index("images_bench").update_settings({
            "embedders": {
                "default": {"source": "userProvided", "dimensions": 512}
            }
        }))

        start = time.perf_counter()
        tasks = []
        for i in range(0, len(documents), batch_size):
            batch = documents[i:i + batch_size]
            tasks.append(client.index("images_bench").add_documents(batch))
        for task in tasks:
            wait(task)  # measure indexing time, not just the time to enqueue
        elapsed = time.perf_counter() - start
        rate = len(documents) / elapsed
        print(f"Batch size {batch_size}: {len(documents)} docs in {elapsed:.1f}s = {rate:.0f} docs/s")

def benchmark_search(embedding: list[float], n_queries: int = 100):
    client = meilisearch.Client("http://localhost:7700", "master_key")
    latencies = []
    for _ in range(n_queries):
        start = time.perf_counter()
        client.index("images_bench").search("", {
            "vector": embedding,
            "hybrid": {"embedder": "default", "semanticRatio": 1.0},
            "limit": 10
        })
        latencies.append((time.perf_counter() - start) * 1000)
    p50 = statistics.median(latencies)
    p99 = sorted(latencies)[int(len(latencies) * 0.99)]
    print(f"Search latency (ms): P50 {p50:.1f}, P99 {p99:.1f}")

# Typical results on 10K images, CLIP ViT-B/32, 512-dim vectors
# (indexing throughput is documents per second and varies with hardware):
# Batch size 32: 1420 docs/s
# Batch size 128: 2100 docs/s
# Search P50: 4.2ms, P99: 12.8ms
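The benchmark reads P99 by indexing into the sorted list at int(n * 0.99), which for 100 samples selects the maximum. If you prefer an explicit definition, a nearest-rank percentile could be sketched as below; the function name `percentile` and the synthetic latencies are illustrative only.

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number percentiles stay exact.
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

latencies_ms = [float(i) for i in range(1, 101)]  # synthetic: 1.0 .. 100.0
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(p50, p99)
```

With these 100 synthetic samples the nearest-rank P99 is the 99th value rather than the largest one, so a single outlier does not define the tail on its own.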

Monitoring Vector Search Performance

Track these metrics over time to catch degradation.

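import meilisearch
import psutil

def monitor_index_stats(index_name: str = "images"):
    client = meilisearch.Client("http://localhost:7700", "master_key")
    index_stats = client.index(index_name).get_stats()
    # Database size and last update are instance-wide figures, not per-index.
    global_stats = client.get_all_stats()
    print(f"Number of documents: {index_stats.number_of_documents}")
    print(f"Indexing in progress: {index_stats.is_indexing}")
    print(f"Database size (all indexes): {global_stats['databaseSize'] / 1024 / 1024:.1f} MB")
    print(f"Last update: {global_stats['lastUpdate']}")
    # psutil reports memory for the host this script runs on.
    print(f"Host memory usage: {psutil.virtual_memory().percent}%")

    # Track the document count over time;
    # sudden drops indicate deletion or corruption.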

Handling Image Updates and Deletions

When images change, regenerate the embedding and update the document. Schedule periodic re-indexing for stale data.

Update Pipeline

def update_image(doc_id: str, new_image_url: str):
    new_embedding = get_image_embedding(new_image_url)
    client.index("images").update_documents([
        {"id": doc_id, "image_url": new_image_url,
         "_vectors": {"default": new_embedding}}
    ])

def delete_image(doc_id: str):
    client.index("images").delete_document(doc_id)
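To decide which images actually need a new embedding, one option is to compare content hashes rather than re-encoding everything. The sketch below assumes local files and a `known` mapping of path to digest saved from the previous run; `file_digest` and `changed_images` are hypothetical helper names.

```python
import hashlib
from pathlib import Path

def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's bytes, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def changed_images(paths: list[Path], known: dict[str, str]) -> list[Path]:
    """Return the paths whose digest is new or differs from the stored one."""
    return [p for p in paths if known.get(str(p)) != file_digest(p)]
```

Only the paths returned by changed_images would then be passed to update_image, and the stored digests refreshed afterwards.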

CI/CD Re-indexing Pipeline

For datasets that change frequently, automate re-indexing with a scheduled pipeline.

# .github/workflows/reindex-images.yml
name: Reindex Images
on:
  schedule:
    - cron: "0 6 * * 0"  # weekly
  workflow_dispatch:

jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: |
          pip install torch transformers meilisearch Pillow requests pandas pyarrow
      - name: Generate embeddings
        run: python scripts/generate_embeddings.py --input ./dataset --output embeddings.parquet
      - name: Upload to Meilisearch
        env:
          MEILI_URL: ${{ secrets.MEILI_URL }}
          MEILI_KEY: ${{ secrets.MEILI_KEY }}
        run: python scripts/upload_embeddings.py --file embeddings.parquet
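One way the upload step could avoid serving a half-built index is to load documents into a staging index and then swap it with the live one. The sketch below is an assumption about how such a script might look, not the contents of scripts/upload_embeddings.py. It takes a meilisearch-python client as an argument and uses its create_index, update_settings, add_documents, swap_indexes, and wait_for_task methods; the index names and the function name `reindex_with_swap` are hypothetical, and the live index must already exist for the swap to succeed.

```python
def reindex_with_swap(client, docs: list[dict], live: str = "images",
                      staging: str = "images_staging", dimensions: int = 512) -> None:
    """Build `staging` from scratch, then swap it with `live`."""

    def finish(task, required: bool = True) -> None:
        # Index operations are asynchronous; wait and check the final status.
        done = client.wait_for_task(task.task_uid, timeout_in_ms=600_000)
        if required and done.status != "succeeded":
            raise RuntimeError(f"task {task.task_uid} ended with status {done.status}")

    # Drop any leftover staging index; this task fails if none exists, which is fine.
    finish(client.index(staging).delete(), required=False)
    finish(client.create_index(staging, {"primaryKey": "id"}))
    finish(client.index(staging).update_settings({
        "embedders": {"default": {"source": "userProvided", "dimensions": dimensions}}
    }))
    finish(client.index(staging).add_documents(docs))
    # After the swap, `live` serves the new data and `staging` holds the old copy.
    finish(client.swap_indexes([{"indexes": [live, staging]}]))
```

Keeping the old data in the staging index after the swap gives a simple rollback path: swapping the two indexes again restores the previous state.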

Troubleshooting

Dimension Mismatch

Meilisearch rejects documents when the vector dimension does not match the embedder configuration. Always verify dimensions match.

embedder_config = {
    "source": "userProvided",
    "dimensions": 512
}
# Must match your model output
assert len(embedding) == 512, f"Expected 512, got {len(embedding)}"

If you switch models, create a new index instead of reusing the old one with mismatched dimensions.
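When a batch mixes vectors from different models, it can help to separate out the offending documents before sending anything to Meilisearch. A minimal sketch, with the hypothetical helper name `split_by_dimension` and toy 3-dimensional vectors:

```python
def split_by_dimension(docs: list[dict], embedder: str, expected: int) -> tuple[list[dict], list[str]]:
    """Separate indexable documents from those with missing or wrong-sized vectors."""
    valid, rejected_ids = [], []
    for doc in docs:
        vector = doc.get("_vectors", {}).get(embedder)
        if isinstance(vector, list) and len(vector) == expected:
            valid.append(doc)
        else:
            rejected_ids.append(doc["id"])
    return valid, rejected_ids

batch = [
    {"id": "ok", "_vectors": {"default": [0.1, 0.2, 0.3]}},
    {"id": "short", "_vectors": {"default": [0.1, 0.2]}},
    {"id": "missing"},
]
valid, rejected_ids = split_by_dimension(batch, "default", expected=3)
print(len(valid), rejected_ids)
```

Logging the rejected IDs instead of raising keeps a long indexing run going while still leaving a record of which images need to be re-encoded.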

GPU Out of Memory

Reduce batch size. For CLIP ViT-L/14 on a 16 GB GPU, keep batch size under 64.

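# Progressive batch-size reduction. process_batch is a coroutine, so it has to
# be run on an event loop for inference (and any OOM error) to happen at all.
for batch_size in [128, 64, 32, 16, 8]:
    try:
        docs = asyncio.run(process_batch(paths, model, processor, batch_size))
        break
    except torch.cuda.OutOfMemoryError:
        clear_gpu_cache()
else:
    raise RuntimeError("Out of GPU memory even at batch size 8")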

Slow Search Queries

Search latency increases with collection size and vector dimensions.

  • Reduce dimensions (e.g., switch from SigLIP 1152 to CLIP 512).
  • Lower Meilisearch’s searchCutoffMs setting if you prefer bounded latency over complete results. Raising it has the opposite effect.
  • Partition images into multiple indexes by category and query the relevant one.
  • Add more RAM to the Meilisearch host — vector search is memory-bound.

Stale Embeddings After Re-index

Clear the embedding cache when you update the model, or version your cache keys.

class VersionedCache(EmbeddingCache):
    def __init__(self, model_version: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.model_version = model_version

    def _key(self, image_url: str) -> str:
        base = super()._key(image_url)
        return f"{self.model_version}:{base}"

Best Practices

  • Precompute embeddings once and keep them in durable storage (Parquet files, S3). Treat Redis as a cache, since its entries expire. Do not recompute embeddings on every re-index.
  • Resize images before encoding: 224x224 for CLIP ViT-B/32 and ViT-L/14, 336x336 for ViT-L/14@336px, 384x384 for the SigLIP checkpoint used here. The processor downsizes larger inputs anyway, so oversized files only add decode and transfer time.
  • Monitor index memory: each 512-dim float32 vector takes ~2 KB of raw storage. 1 million images consume ~2 GB for vectors alone, before index overhead.
  • Benchmark before deploying: run the benchmarking script on a representative subset to set realistic SLOs.
  • Use a staging index: run re-indexing against a staging index, then exchange it with the live index through Meilisearch’s swap-indexes endpoint (Meilisearch has no index aliases).
  • Version your models: encode the model name in the embedder key so you can switch models without downtime.
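The memory figure in the list above follows from simple arithmetic, sketched here under the assumption that vectors are stored as 4-byte floats and ignoring any overhead the index structure adds on top. The function name `raw_vector_bytes` is illustrative.

```python
def raw_vector_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw storage for dense vectors, excluding index overhead."""
    return num_vectors * dimensions * bytes_per_value

per_vector = raw_vector_bytes(1, 512)                # one CLIP ViT-B/32 vector
clip_million = raw_vector_bytes(1_000_000, 512)      # 1M images, 512 dimensions
siglip_million = raw_vector_bytes(1_000_000, 1152)   # 1M images, 1152 dimensions

print(per_vector, clip_million / 1e9, siglip_million / 1e9)
```

One 512-dimension vector is 2,048 bytes, so a million of them come to about 2.05 GB, and the same collection at 1152 dimensions needs about 4.6 GB. This is why vector size is worth weighing against accuracy when choosing a model.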

Conclusion

Image search in Meilisearch is practical today. Pick a model that fits your accuracy and latency budget, precompute embeddings with batched inference, cache aggressively, and monitor search performance in production. The combination of Meilisearch’s fast vector engine with a dedicated embedding service handles datasets from hundreds to millions of images.

For more, see the Meilisearch vector search docs and the Meilisearch GitHub repository.
