While Meilisearch is primarily a text search engine, it supports image search through vector embeddings, enabling visual similarity queries. This experimental feature uses machine learning models to convert images into vectors, allowing searches like “find images similar to this one.” This post covers setup, data preparation, indexing, and querying images.
What is Image Search in Meilisearch?
Image search leverages vector search by generating embeddings from images. Models like CLIP or ResNet create vectors representing visual features, which Meilisearch indexes for similarity searches.
Prerequisites
- Meilisearch v1.3 or later.
- Python with the transformers, torch, Pillow, and meilisearch packages installed (for generating CLIP embeddings and talking to Meilisearch).
- An image dataset or a list of image URLs.
Step 1: Setting Up Meilisearch
Download and run Meilisearch:
curl -L https://install.meilisearch.com | sh
./meilisearch --master-key="your_master_key"
Vector search is enabled by default in recent versions; on older releases where it is still experimental, you need to turn it on explicitly.
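On those older releases, the feature can be enabled through the /experimental-features endpoint. A minimal sketch using the requests library (adjust the URL and master key to match your instance):

import requests

# Enable the experimental vector store (only needed on older Meilisearch releases;
# recent versions ship with vector search enabled).
resp = requests.patch(
    "http://localhost:7700/experimental-features",
    headers={"Authorization": "Bearer your_master_key"},
    json={"vectorStore": True},
)
print(resp.json())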
Step 2: Preparing Image Embeddings
Generate embeddings using a model like CLIP:
import torch
from transformers import CLIPProcessor, CLIPModel
from PIL import Image
import requests

# Load the CLIP model and its preprocessor once at startup.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def get_image_embedding(image_path_or_url):
    # Accept either a remote URL or a local file path.
    if image_path_or_url.startswith("http"):
        image = Image.open(requests.get(image_path_or_url, stream=True).raw)
    else:
        image = Image.open(image_path_or_url)
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        embeddings = model.get_image_features(**inputs)
    # Return a plain Python list so it can be serialized as JSON.
    return embeddings.squeeze().tolist()
documents = [
    {"id": 1, "title": "Cat Image", "image_url": "https://example.com/cat.jpg"},
    {"id": 2, "title": "Dog Image", "image_url": "https://example.com/dog.jpg"},
]

# Attach a vector for the "default" embedder to each document.
for doc in documents:
    doc["_vectors"] = {"default": get_image_embedding(doc["image_url"])}
Step 3: Indexing Images
Configure a userProvided embedder on the index, then add the documents with their vectors:
import meilisearch

client = meilisearch.Client('http://localhost:7700', 'your_master_key')
index = client.index('images')

# Declare the embedder before adding documents so Meilisearch knows how to
# handle the "_vectors.default" field. "userProvided" means we compute the
# vectors ourselves and only send them to Meilisearch.
index.update_settings({
    "embedders": {
        "default": {
            "source": "userProvided",
            "dimensions": 512  # CLIP ViT-B/32 embedding size
        }
    }
})

index.add_documents(documents)
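Both the settings update and the document addition are asynchronous: they return task objects and are processed in the background, so querying immediately after indexing can return no hits. A small sketch for waiting on the task, assuming the current meilisearch Python client's task helpers:

# Capture the task returned by add_documents and block until it is processed.
task = index.add_documents(documents)
client.wait_for_task(task.task_uid)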
Step 4: Performing Image Searches
Search using an image’s embedding:
query_image_url = "https://example.com/query_cat.jpg"
query_embedding = get_image_embedding(query_image_url)

# An empty query string with a "vector" performs a pure semantic search.
# Note: newer Meilisearch versions may also require a "hybrid" parameter
# naming the embedder, e.g. "hybrid": {"embedder": "default", "semanticRatio": 1}.
results = index.search("", {
    "vector": query_embedding,
    "limit": 5,
    "showRankingScore": True
})

for hit in results['hits']:
    # _rankingScore reflects how closely the hit matches the query vector.
    print(f"Similar: {hit['title']}, Score: {hit['_rankingScore']}")
For text-to-image search, encode text and search:
text_query = "a fluffy cat"
text_inputs = processor(text=text_query, return_tensors="pt")
with torch.no_grad():
text_embedding = model.get_text_features(**text_inputs).squeeze().tolist()
results = index.search("", {"vector": text_embedding})
Step 5: Advanced Features
Filtering Vector Results
Combine vector search with metadata filters; the filtered field must be declared as a filterable attribute (see the snippet after the example):
# Only consider documents whose category matches the filter expression.
results = index.search("", {
    "vector": query_embedding,
    "filter": "category = 'animals'"
})
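For this to work, the documents need a category field (the sample documents above don't include one) and the field has to be registered as filterable:

# Declare which fields may be used in filter expressions.
index.update_filterable_attributes(['category'])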
Custom Embedders
To use a different embedding model, the workflow stays the same: generate vectors with your model of choice and supply them through the userProvided embedder, either with a client library or by calling Meilisearch's REST API directly.
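As an illustration, here is a minimal sketch of pushing the same documents (with their precomputed vectors) straight to the REST API using requests instead of the Python client:

# POST documents to the index's documents endpoint; the payload is the same
# JSON the Python client would send.
requests.post(
    "http://localhost:7700/indexes/images/documents",
    headers={"Authorization": "Bearer your_master_key"},
    json=documents,
)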
Best Practices
- Use high-quality models like CLIP for better accuracy.
- Precompute embeddings to save time.
- Handle large images by resizing them before embedding (see the sketch after this list).
- Monitor index size; embeddings are memory-intensive.
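A rough sketch of that resizing tip, downscaling with Pillow before handing the image to the processor (the helper name and max_side value are just illustrative):

from PIL import Image

def load_resized(path, max_side=512):
    # The CLIP processor resizes to 224x224 anyway, so downscaling huge
    # images up front mainly saves memory and I/O, not accuracy.
    image = Image.open(path)
    image.thumbnail((max_side, max_side))  # in-place, preserves aspect ratio
    return image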
Conclusion
Image search in Meilisearch via vectors enables powerful visual queries. Experiment with different models and integrate into apps for enhanced search capabilities.
For more, see the Meilisearch vector search docs.