
Apache Solr: The Complete Guide to Enterprise Search

Introduction

Apache Solr is a powerful, open-source enterprise search platform built on Apache Lucene. Created at CNET Networks in 2004 and donated to the Apache Software Foundation in 2006, Solr graduated from the Apache Incubator in 2007 as a Lucene subproject and became a top-level Apache project in its own right in 2021. It remains a common choice for organizations that need full-text search, faceted navigation, and rich document retrieval.

In 2026, Solr powers search for major platforms including Netflix, Airbnb, and eBay. This comprehensive guide covers everything you need to get started with Solr.


What is Apache Solr?

Solr is a search server that provides:

  • Full-Text Search: Powerful text matching with Lucene
  • Faceted Navigation: Filter and categorize results
  • Distributed Search: Scale horizontally with SolrCloud
  • Rich Documents: Index PDF, Word, HTML, and more
  • Real-Time Indexing: Near real-time search availability

Solr vs Elasticsearch

Feature          Solr             Elasticsearch
License          Apache 2.0       SSPL / Elastic License 2.0 (AGPLv3 option since 2024)
Architecture     Mature, rigid    Flexible, JSON-first
Search           Powerful         Good
Faceting         Excellent        Good
Learning Curve   Steeper          Easier

Installation

Docker Installation

# Start Solr
docker run --name solr -d -p 8983:8983 solr:latest

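# Create a core ("solr create" works in standalone and SolrCloud mode;
# the older create_core subcommand is deprecated in recent releases)
docker exec solr solr create -c gettingstarted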

Package Installation

# Download Solr
wget https://archive.apache.org/dist/solr/solr/9.10.1/solr-9.10.1.tgz

# Extract and start
tar -xzf solr-9.10.1.tgz
cd solr-9.10.1
bin/solr start

# Create core
bin/solr create_core -c mycore
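
After creating a core it can be useful to confirm that Solr actually loaded it. The sketch below queries the CoreAdmin STATUS endpoint with only the Python standard library. The helper names (`core_status_url`, `core_exists`, `check_core`) are made up for this example, and it assumes a standalone Solr at `http://localhost:8983/solr` with no authentication.

```python
import json
import urllib.parse
import urllib.request


def core_status_url(base_url: str, core: str) -> str:
    """Build the CoreAdmin STATUS URL for a single core."""
    query = urllib.parse.urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


def core_exists(status_response: dict, core: str) -> bool:
    """Solr reports an unknown core as an empty entry under "status"."""
    return bool(status_response.get("status", {}).get(core))


def check_core(base_url: str, core: str) -> bool:
    """Ask a running Solr whether the core is loaded (performs an HTTP GET)."""
    with urllib.request.urlopen(core_status_url(base_url, core), timeout=10) as resp:
        return core_exists(json.load(resp), core)
```

Against a running instance, `check_core("http://localhost:8983/solr", "mycore")` should return `True` once the core has been created.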

Core Concepts

Documents and Fields

// Document structure
{
  "id": "1",
  "title": "Wireless Mouse",
  "description": "Ergonomic wireless mouse",
  "price": 29.99,
  "category": "Electronics",
  "in_stock": true
}

Schema (managed-schema.xml or schema.xml)

// Explicit field definitions (shown as JSON for readability)
{
  "fields": [
    {"name": "id", "type": "string", "indexed": true, "stored": true},
    {"name": "title", "type": "text_en", "indexed": true, "stored": true},
    {"name": "price", "type": "pfloat", "indexed": true, "stored": true},
    {"name": "category", "type": "string", "indexed": true, "stored": true},
    {"name": "in_stock", "type": "boolean", "indexed": true, "stored": true}
  ]
}
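
With a managed schema, fields like these are usually added through the Schema API rather than by editing the file by hand. The following is a minimal sketch of building such a request in Python. The helper names are hypothetical, the `brand` field in the usage note is only an illustration, and the code assumes the core uses a managed schema and accepts unauthenticated requests.

```python
import json
import urllib.request


def add_field_command(name, field_type, indexed=True, stored=True):
    """Build the JSON body for a Schema API "add-field" command."""
    return {
        "add-field": {
            "name": name,
            "type": field_type,
            "indexed": indexed,
            "stored": stored,
        }
    }


def schema_request(core_url, command):
    """Wrap a Schema API command in a POST request to <core>/schema."""
    return urllib.request.Request(
        f"{core_url.rstrip('/')}/schema",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def apply_schema_command(core_url, command):
    """Send the command to a running Solr and return the parsed response."""
    with urllib.request.urlopen(schema_request(core_url, command), timeout=10) as resp:
        return json.load(resp)
```

For example, `apply_schema_command("http://localhost:8983/solr/gettingstarted", add_field_command("brand", "string"))` would add a `brand` string field, provided no field with that name exists yet.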

Indexing Documents

JSON Documents

# Index a single document
curl -X POST -H "Content-Type: application/json" \
  "http://localhost:8983/solr/gettingstarted/update" \
  --data-binary '[
    {
      "id": "1",
      "title": "Wireless Mouse",
      "price": 29.99,
      "category": "Electronics"
    }
  ]'

# Commit changes
curl -X POST "http://localhost:8983/solr/gettingstarted/update?commit=true"
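
The same two steps (post the documents, then commit) can be combined into one request by appending `commit=true` to the update URL. Here is a rough Python equivalent of the curl calls above, using only the standard library. The function names are invented for this sketch, and it assumes the `gettingstarted` core from earlier with no authentication.

```python
import json
import urllib.request


def build_update_request(core_url, docs, commit=True):
    """Build a POST to <core>/update carrying a JSON array of documents."""
    url = f"{core_url.rstrip('/')}/update"
    if commit:
        # Commit in the same request so the documents become searchable.
        url += "?commit=true"
    return urllib.request.Request(
        url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_documents(core_url, docs, commit=True):
    """Send the documents to a running Solr; status 0 means success."""
    request = build_update_request(core_url, docs, commit=commit)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)["responseHeader"]["status"]
```

Committing on every request is convenient for experiments. For bulk loads it is usually cheaper to send many batches with `commit=False` and commit once at the end.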

CSV Documents

# Index CSV
curl -X POST -H "Content-Type: application/csv" \
  "http://localhost:8983/solr/gettingstarted/update?separator=%2C&encapsulator=%22&header=true" \
  --data-binary @products.csv
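
The request above expects a comma-separated file with a header row and double-quote encapsulation. As a sketch of how `products.csv` might be produced, the snippet below uses Python's `csv` module, which quotes fields containing commas automatically. The function name and the sample product are assumptions made for illustration.

```python
import csv
import io


def products_to_csv(products, fieldnames):
    """Serialize dicts to CSV with a header row, ',' separator and '"' quoting."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter=",",
        quotechar='"',
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(products)
    return buffer.getvalue()


def write_products_file(path, products, fieldnames):
    """Write the CSV to disk so it can be posted with --data-binary @path."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(products_to_csv(products, fieldnames))
```

The header names should match field names in the schema, since `header=true` tells Solr to read them from the first line.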

Querying

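# Simple query (matches the default search field, df; with a stock configset
# you may need a copyField into that field, or query a named field instead)
curl "http://localhost:8983/solr/gettingstarted/select?q=wireless"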

# Query specific field
curl "http://localhost:8983/solr/gettingstarted/select?q=title:wireless"

# Wildcard search
curl "http://localhost:8983/solr/gettingstarted/select?q=wire*"

Query Parameters

# Pagination
curl "http://localhost:8983/solr/gettingstarted/select?q=wireless&rows=10&start=0"

# Sorting (the space in "price asc" must be URL-encoded)
curl "http://localhost:8983/solr/gettingstarted/select?q=wireless&sort=price%20asc"

# Filtering
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&fq=category:Electronics"
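
Hand-writing query strings gets error-prone once values contain spaces or colons. One way to avoid that is to build the `/select` URL from a parameter list and let `urllib.parse.urlencode` do the escaping. This is a small sketch; `select_url` is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def select_url(core_url, q="*:*", fq=None, sort=None, start=0, rows=10):
    """Build a /select URL; fq may hold several filter queries."""
    params = [("q", q)]
    for filter_query in fq or []:
        # Each filter query becomes its own fq parameter.
        params.append(("fq", filter_query))
    if sort:
        params.append(("sort", sort))
    params.append(("start", start))
    params.append(("rows", rows))
    return f"{core_url.rstrip('/')}/select?{urlencode(params)}"


url = select_url(
    "http://localhost:8983/solr/gettingstarted",
    q="wireless",
    fq=["category:Electronics"],
    sort="price asc",
)
print(url)
```

Filters in `fq` restrict the result set without affecting relevance scores, which is why the category restriction goes there rather than into `q`.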

Boolean Queries

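# "+" marks a clause that must match (AND); "-" would exclude a clause (NOT).
# A literal "+" in a URL decodes to a space, so let curl encode the query.
curl -G "http://localhost:8983/solr/gettingstarted/select" \
  --data-urlencode "q=+(wireless mouse) +category:Electronics"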
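
When query clauses are assembled from user input, characters such as `+`, `:` or `(` need a backslash so the query parser treats them as text rather than operators. The sketch below shows one conservative way to do that and to combine clauses with the `+` (required) and `-` (prohibited) prefixes. All function names are assumptions for this example, and `field_clause` is only meant for single-token values without whitespace.

```python
from urllib.parse import urlencode

# Characters with special meaning in the standard query parser.
SPECIAL_CHARS = set('+-!(){}[]^"~*?:\\/&|')


def escape_term(text):
    """Backslash-escape query syntax characters in a user-supplied term."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def field_clause(field, value):
    """Build field:value with the value escaped (single-token values only)."""
    return f"{field}:{escape_term(value)}"


def boolean_query(must=(), should=(), must_not=()):
    """Join clauses: '+' = required, no prefix = optional, '-' = prohibited."""
    parts = [f"+{clause}" for clause in must]
    parts += list(should)
    parts += [f"-{clause}" for clause in must_not]
    return " ".join(parts)


query = boolean_query(
    must=[field_clause("title", "wireless"), field_clause("category", "Electronics")],
    must_not=[field_clause("in_stock", "false")],
)
print(query)
print(urlencode({"q": query}))
```

Passing the result through `urlencode` turns each `+` into `%2B`, which avoids the problem of a bare `+` being read as a space on the server side.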

Faceting

# Category facet
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&facet=true&facet.field=category"

# Range facet (price ranges)
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&facet=true&facet.range=price&facet.range.start=0&facet.range.end=100&facet.range.gap=25"

# Pivot facet (hierarchical; assumes documents also have a brand field)
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&facet=true&facet.pivot=category,brand"

Python Integration

Using pysolr

import pysolr

# Connect to Solr
solr = pysolr.Solr('http://localhost:8983/solr/gettingstarted')

# Index document (commit=True makes it searchable right away)
solr.add([
    {
        'id': '1',
        'title': 'Wireless Mouse',
        'price': 29.99,
        'category': 'Electronics'
    }
], commit=True)

# Search
results = solr.search('wireless')

print(f"Found {results.hits} documents")
for doc in results:
    print(doc['title'])
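
A single `search` call returns only one page of results. To walk through every match, the `start` and `rows` parameters can be advanced in a loop. The generator below is a minimal sketch: `iter_all_docs` is a hypothetical helper, and it only assumes a client whose `search` method accepts `start` and `rows` keyword arguments and returns an iterable with a `hits` attribute, as `pysolr.Solr` does.

```python
def iter_all_docs(solr, query, page_size=100, **params):
    """Yield every matching document by paging with start/rows."""
    start = 0
    while True:
        results = solr.search(query, start=start, rows=page_size, **params)
        docs = list(results)
        if not docs:
            # An empty page means there is nothing left to fetch.
            break
        yield from docs
        start += len(docs)
        if start >= results.hits:
            # All documents reported by Solr have been returned.
            break
```

With the connection from above, usage would look like `for doc in iter_all_docs(solr, 'wireless', page_size=50): ...`. For very large result sets, Solr's cursor-based paging is generally a better fit than large `start` offsets.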

Conclusion

Solr provides powerful enterprise search capabilities with full-text search, faceted navigation, and distributed scaling. Understanding Solr’s query syntax and schema design enables building sophisticated search applications.

In the next article, we’ll explore Solr operations: collection management, backups, and performance tuning.
