Introduction
Apache Solr is an open-source enterprise search platform built on Apache Lucene. Created at CNET in 2004, Solr was donated to the Apache Software Foundation in 2006, graduated from the Apache Incubator in 2007, and became a standalone top-level project in 2021. It remains a common choice for organizations that need full-text search, faceted navigation, and rich document retrieval.
In 2026, Solr powers search for major platforms including Netflix, Airbnb, and eBay. This guide covers installation, core concepts, indexing, querying, faceting, and Python integration.
What is Apache Solr?
Solr is a search server that provides:
- Full-Text Search: Powerful text matching with Lucene
- Faceted Navigation: Filter and categorize results
- Distributed Search: Scale horizontally with SolrCloud
- Rich Documents: Index PDF, Word, HTML, and more
- Near Real-Time Indexing: Documents become searchable shortly after they are indexed
Solr vs Elasticsearch
| Feature | Solr | Elasticsearch |
|---|---|---|
| License | Apache 2.0 | Elastic License 2.0, SSPL, or AGPLv3 |
| Architecture | Mature, rigid | Flexible, JSON-first |
| Search core | Apache Lucene | Apache Lucene |
| Faceting | Excellent | Good |
| Learning Curve | Steeper | Easier |
Installation
Docker Installation
# Start Solr (pin a major version rather than relying on latest)
docker run --name solr -d -p 8983:8983 solr:9
# Create a core once Solr has finished starting
docker exec solr solr create -c gettingstarted
Package Installation
# Download Solr
wget https://archive.apache.org/dist/solr/solr/9.10.1/solr-9.10.1.tgz
# Extract and start
tar -xzf solr-9.10.1.tgz
cd solr-9.10.1
bin/solr start
# Create the core used in the rest of this guide
bin/solr create -c gettingstarted
Core Concepts
Documents and Fields
// Document structure
{
  "id": "1",
  "title": "Wireless Mouse",
  "description": "Ergonomic wireless mouse",
  "price": 29.99,
  "category": "Electronics",
  "in_stock": true
}
Schema (managed-schema.xml or schema.xml)
// Explicit field definitions, shown in the JSON form used by the Schema API
{
  "fields": [
    {"name": "id", "type": "string", "indexed": true, "stored": true},
    {"name": "title", "type": "text_en", "indexed": true, "stored": true},
    {"name": "description", "type": "text_en", "indexed": true, "stored": true},
    {"name": "price", "type": "pfloat", "indexed": true, "stored": true},
    {"name": "category", "type": "string", "indexed": true, "stored": true},
    {"name": "in_stock", "type": "boolean", "indexed": true, "stored": true}
  ]
}
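Fields like these can also be added programmatically through Solr's Schema API, which accepts `add-field` commands as JSON posted to the core's `/schema` endpoint. The following is a minimal sketch that only builds the request body; the helper name `build_add_field_payload` is an assumption for illustration, and `id` is left out on the assumption that the core's configset already defines it.

```python
import json

# Field definitions taken from the schema listing above.
FIELDS = [
    {"name": "title", "type": "text_en", "indexed": True, "stored": True},
    {"name": "price", "type": "pfloat", "indexed": True, "stored": True},
    {"name": "category", "type": "string", "indexed": True, "stored": True},
    {"name": "in_stock", "type": "boolean", "indexed": True, "stored": True},
]


def build_add_field_payload(fields):
    """Return the JSON body for a Schema API request that adds several fields.

    Hypothetical helper for illustration; it is not part of Solr.
    """
    return json.dumps({"add-field": list(fields)}, indent=2)


if __name__ == "__main__":
    # POST this body to http://localhost:8983/solr/gettingstarted/schema
    # with Content-Type: application/json.
    print(build_add_field_payload(FIELDS))
```

Keeping the definitions in a plain list makes it easy to review the schema in version control before applying it to a core.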
Indexing Documents
JSON Documents
# Index a single document
curl -X POST -H "Content-Type: application/json" \
  "http://localhost:8983/solr/gettingstarted/update" \
  --data-binary '[
    {
      "id": "1",
      "title": "Wireless Mouse",
      "price": 29.99,
      "category": "Electronics"
    }
  ]'
# Commit changes
curl -X POST "http://localhost:8983/solr/gettingstarted/update?commit=true"
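The same update request can be assembled with Python's standard library. This sketch builds the request without sending it, so it can be inspected without a running server; `build_update_request` is a hypothetical helper, and folding `commit=true` into the same call is a choice made here for brevity.

```python
import json
import urllib.request

SOLR_UPDATE_URL = "http://localhost:8983/solr/gettingstarted/update"


def build_update_request(docs, commit=True):
    """Build (but do not send) a POST request that indexes `docs` as JSON."""
    url = SOLR_UPDATE_URL + ("?commit=true" if commit else "")
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_update_request(
        [{"id": "1", "title": "Wireless Mouse", "price": 29.99, "category": "Electronics"}]
    )
    print(request.get_method(), request.full_url)
    # To send it (requires a running Solr):
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```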
CSV Documents
# Index CSV and commit so the rows become searchable
curl -X POST -H "Content-Type: application/csv" \
  "http://localhost:8983/solr/gettingstarted/update?separator=%2C&encapsulator=%22&header=true&commit=true" \
  --data-binary @products.csv
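The command above expects a `products.csv` file with a header row, comma separators, and double quotes as the encapsulator. A minimal sketch of producing such content in Python follows; the second product row is made-up sample data and `products_to_csv` is a hypothetical helper.

```python
import csv
import io

# Sample rows; the second product is invented for illustration.
PRODUCTS = [
    {"id": "1", "title": "Wireless Mouse", "price": 29.99, "category": "Electronics"},
    {"id": "2", "title": "Mechanical Keyboard", "price": 79.5, "category": "Electronics"},
]


def products_to_csv(rows):
    """Serialize rows to CSV text with a header line.

    The csv module defaults (comma delimiter, double-quote quoting) match the
    separator and encapsulator passed to Solr in the curl command.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["id", "title", "price", "category"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(products_to_csv(PRODUCTS), end="")
```

Writing the returned string to `products.csv` gives a file that can be posted with the curl command shown earlier.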
Querying
Basic Search
# Simple query (searches the default field configured by df)
curl "http://localhost:8983/solr/gettingstarted/select?q=wireless"
# Query specific field
curl "http://localhost:8983/solr/gettingstarted/select?q=title:wireless"
# Wildcard search
curl "http://localhost:8983/solr/gettingstarted/select?q=wire*"
Query Parameters
# Pagination
curl "http://localhost:8983/solr/gettingstarted/select?q=wireless&rows=10&start=0"
# Sorting (the space in "price asc" must be URL-encoded)
curl "http://localhost:8983/solr/gettingstarted/select?q=wireless&sort=price%20asc"
# Filtering
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&fq=category:Electronics"
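Because these parameters often contain spaces and colons, it can help to let a library do the URL encoding. The sketch below combines pagination, sorting, and filtering into one URL using only the standard library; `build_select_url` and its `page` and `page_size` arguments are assumptions made for illustration.

```python
from urllib.parse import urlencode

SELECT_URL = "http://localhost:8983/solr/gettingstarted/select"


def build_select_url(query, page=1, page_size=10, sort=None, filters=()):
    """Return a /select URL with pagination, sorting and filter queries encoded."""
    params = [
        ("q", query),
        ("rows", page_size),
        ("start", (page - 1) * page_size),  # Solr's start offset is zero-based
    ]
    if sort:
        params.append(("sort", sort))
    # Each filter becomes its own fq parameter; Solr intersects them.
    params.extend(("fq", f) for f in filters)
    return SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(
        build_select_url(
            "wireless",
            page=3,
            sort="price asc",
            filters=["category:Electronics"],
        )
    )
```

Passing each filter as a separate `fq` lets Solr cache them independently, which is one reason to prefer `fq` over adding clauses to `q`.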
Boolean Queries
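# Require both clauses with +; AND, OR, and NOT are also supported.
# -G with --data-urlencode encodes the + signs and spaces correctly.
curl -G "http://localhost:8983/solr/gettingstarted/select" \
  --data-urlencode "q=+(wireless mouse) +category:Electronics"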
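The `+` prefix needs care in URLs, because an unencoded `+` in a query string is decoded as a space. A small sketch of composing a query where every clause is required and then percent-encoding it; `require_all` is a hypothetical helper name.

```python
from urllib.parse import quote


def require_all(*clauses):
    """Join clauses so that each one is mandatory (prefixed with '+')."""
    return " ".join(f"+{clause}" for clause in clauses)


query = require_all("(wireless mouse)", "category:Electronics")
# safe="" forces every reserved character, including '+', to be percent-encoded.
encoded = quote(query, safe="")

print(query)
print("http://localhost:8983/solr/gettingstarted/select?q=" + encoded)
```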
Faceted Search
# Category facet
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&facet=true&facet.field=category"
# Range facet (price ranges)
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&facet=true&facet.range=price&facet.range.start=0&facet.range.end=100&facet.range.gap=25"
# Pivot facet (hierarchical); assumes documents also have a brand field
curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&facet=true&facet.pivot=category,brand"
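In the default JSON response, field facet counts come back as a flat list that alternates values and counts. A minimal sketch of turning that list into a dictionary follows; the response fragment and its counts are invented for illustration, and `facet_counts_to_dict` is a hypothetical helper.

```python
def facet_counts_to_dict(flat_counts):
    """Convert a flat [value, count, value, count, ...] list to a dict."""
    values = flat_counts[0::2]
    counts = flat_counts[1::2]
    return dict(zip(values, counts))


# Trimmed, made-up response body for a facet.field=category request.
sample_response = {
    "facet_counts": {
        "facet_fields": {"category": ["Electronics", 12, "Books", 7, "Toys", 0]}
    }
}

category_counts = facet_counts_to_dict(
    sample_response["facet_counts"]["facet_fields"]["category"]
)
print(category_counts)
```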
Python Integration
Using pysolr
import pysolr
# Connect to Solr; always_commit=True makes each add() searchable right away
solr = pysolr.Solr('http://localhost:8983/solr/gettingstarted', always_commit=True)
# Index document (price is a number so it matches the pfloat field type)
solr.add([
    {
        'id': '1',
        'title': 'Wireless Mouse',
        'price': 29.99,
        'category': 'Electronics'
    }
])
# Search the title field
results = solr.search('title:wireless')
print(f"Found {results.hits} documents")
for doc in results:
    print(doc['title'])
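Extra query parameters such as `fq`, `rows`, and `sort` can be passed to pysolr's `search` as keyword arguments. The sketch below wraps that in a hypothetical `search_in_category` helper and uses a stand-in `FakeClient` class, also an assumption, so it runs without a Solr server.

```python
def search_in_category(client, text, category, rows=10):
    """Run a keyword search restricted to one category, cheapest first.

    `client` is expected to behave like pysolr.Solr: search(q, **params).
    """
    params = {
        "fq": f'category:"{category}"',
        "rows": rows,
        "sort": "price asc",
    }
    results = client.search(text, **params)
    return [doc["title"] for doc in results]


class FakeClient:
    """Stand-in for pysolr.Solr so the sketch runs without a server."""

    def __init__(self, docs):
        self.docs = docs
        self.last_call = None

    def search(self, q, **params):
        # Record the call so the parameters can be inspected afterwards.
        self.last_call = (q, params)
        return self.docs


if __name__ == "__main__":
    client = FakeClient([{"id": "1", "title": "Wireless Mouse"}])
    print(search_in_category(client, "wireless", "Electronics"))
```

Against a real server, the same function can be called with `pysolr.Solr('http://localhost:8983/solr/gettingstarted')` in place of `FakeClient`.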
Conclusion
Solr provides powerful enterprise search capabilities with full-text search, faceted navigation, and distributed scaling. Understanding Solr’s query syntax and schema design enables building sophisticated search applications.
In the next article, we’ll explore Solr operations: collection management, backups, and performance tuning.