Introduction
Running Apache Solr in production demands disciplined management of collections, secure access controls, reliable backup strategies, and ongoing performance tuning. This guide walks through every essential operation — from installing Solr in standalone or SolrCloud mode to handling rolling upgrades and disaster recovery. Each section includes working commands, configuration snippets, and production-hardened defaults you can apply immediately.
Installation and Deployment Modes
Solr runs in two primary modes, standalone and SolrCloud, and either can be deployed on bare metal, Docker, or Kubernetes. Choose based on your scale and availability requirements.
Standalone (Single Node)
# Download and extract Solr 9.x
wget https://dlcdn.apache.org/solr/solr/9.8.0/solr-9.8.0.tgz
tar xzf solr-9.8.0.tgz
# Start Solr in standalone mode
solr-9.8.0/bin/solr start -p 8983
# Create a core (standalone equivalent of collection)
solr-9.8.0/bin/solr create_core -c products -d _default
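On a loaded node, a freshly created core may take a moment before it answers queries, so deployment scripts often poll the ping handler before proceeding. A minimal retry sketch (the `wait_for` helper and the 30-attempt budget are our own convention, not a Solr tool):

```shell
#!/bin/bash
# wait_for: retry a command until it succeeds or the attempt budget runs out.
wait_for() {
  local attempts=$1; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    sleep 1
  done
  return 1
}

# Block until the new core answers its ping handler (assumes the default port):
# wait_for 30 curl -sf "http://localhost:8983/solr/products/admin/ping"
```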
SolrCloud (Multi-Node Cluster)
# Start first node in cloud mode
solr-9.8.0/bin/solr start -cloud -p 8983 -z zoo1:2181,zoo2:2181,zoo3:2181
# Start additional nodes
solr-9.8.0/bin/solr start -cloud -p 8984 -z zoo1:2181,zoo2:2181,zoo3:2181
# Verify cluster status
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS"
Docker Deployment
# docker-compose.yml
version: "3.8"
services:
  solr:
    image: solr:9.8
    ports:
      - "8983:8983"
    environment:
      - ZK_HOST=zoo1:2181,zoo2:2181,zoo3:2181
      - SOLR_HEAP=4g
      - SOLR_OPTS=-Djava.security.egd=file:/dev/./urandom
    volumes:
      - solr_data:/var/solr
    deploy:
      replicas: 3
  zoo1:
    image: zookeeper:3.9
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
volumes:
  solr_data:
Kubernetes Deployment
# solr-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: solr
spec:
  serviceName: solr
  replicas: 3
  selector:
    matchLabels:
      app: solr
  template:
    metadata:
      labels:
        app: solr
    spec:
      containers:
        - name: solr
          image: solr:9.8
          ports:
            - containerPort: 8983
              name: solr-client
          env:
            - name: ZK_HOST
              value: zoo1:2181,zoo2:2181,zoo3:2181
            - name: SOLR_HEAP
              value: 4g
            - name: SOLR_OPTS
              value: "-Djava.security.egd=file:/dev/./urandom"
          volumeMounts:
            - name: solr-data
              mountPath: /var/solr
          livenessProbe:
            httpGet:
              path: /solr/admin/info/system
              port: 8983
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /solr/admin/collections?action=CLUSTERSTATUS
              port: 8983
            initialDelaySeconds: 30
            periodSeconds: 15
  volumeClaimTemplates:
    - metadata:
        name: solr-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
Collection Management
All collection operations go through the Collections API. Below are the essential actions every administrator needs.
CREATE
# Create collection with explicit shard and replica count
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=products&numShards=3&replicationFactor=2&collection.configName=products_config"
# Create collection with compositeId routing on a schema field, plus a custom core property
# (property.* entries are arbitrary user-defined properties, not Solr tuning knobs)
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=logs_202604&numShards=6&replicationFactor=2&router.name=compositeId&router.field=tenant_id&property.commitDistance=10000"
RELOAD
Reloading picks up changes to solrconfig.xml and managed-schema without deleting the collection.
# Reload collection
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=products"
# Response: {"responseHeader":{"status":0,"QTime":142}}
DELETE
# Delete collection and its index data
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=old_products"
# followAliases defaults to false; set it to true to resolve an alias name to its target collection
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=products_v1&followAliases=true"
SPLITSHARD
Split a hot shard into two sub-shards to distribute load.
# Split shard shard1 into two pieces
curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=products&shard=shard1"
# Split with explicit hash ranges (hex, ascending, together covering the parent shard's range)
curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=products&shard=shard1&ranges=0-1f4c,1f4d-3e8c"
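If you want balanced explicit ranges, you can compute the midpoint of the parent shard's hash range yourself. A bash sketch (hex in, hex out; the `split_range` helper is ours, not a Solr tool):

```shell
#!/bin/bash
# split_range LO HI: emit two contiguous sub-ranges covering LO-HI (hex, no 0x prefix).
split_range() {
  local lo=$((16#$1)) hi=$((16#$2))
  local mid=$(( lo + (hi - lo) / 2 ))
  printf '%x-%x,%x-%x\n' "$lo" "$mid" "$((mid + 1))" "$hi"
}

split_range 80000000 bfffffff   # 80000000-9fffffff,a0000000-bfffffff
```

Feed the output straight into the ranges parameter of the SPLITSHARD call above.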
DELETESHARD
Remove an empty shard after splitting or rebalancing.
# Delete shard (must be inactive or empty)
curl "http://localhost:8983/solr/admin/collections?action=DELETESHARD&collection=products&shard=shard1_0"
ADDREPLICA
Add replicas to increase read capacity or fault tolerance.
# Add replica to specific shard
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=products&shard=shard1&node=192.168.1.10:8983_solr"
# Add replica to the least-loaded node
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=products&shard=shard1"
Collection Aliases
Aliases let you swap collections behind a fixed name — essential for zero-downtime reindexing.
# Create alias pointing to one collection
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=search_products&collections=products_v2"
# Point alias to multiple collections (routed by time)
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=logs&collections=logs_202604,logs_202605"
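Membership of a time-routed alias is easy to script. A sketch that derives the previous and current monthly collection names from a reference date (assumes GNU date; the helper name and `logs_` prefix mirror the example above):

```shell
#!/bin/bash
# month_collections REF_DATE: print "logs_YYYYMM,logs_YYYYMM" for the previous and current month.
month_collections() {
  local ref=$1
  local prev cur
  prev=$(date -d "$ref -1 month" +logs_%Y%m)
  cur=$(date -d "$ref" +logs_%Y%m)
  echo "${prev},${cur}"
}

month_collections 2026-05-15   # logs_202604,logs_202605
```

Run it from cron at month rollover and pass the result to CREATEALIAS.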
# Rotate alias for zero-downtime reindex (CREATEALIAS on an existing alias repoints it atomically)
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=search_products&collections=products_v3"
Cluster Status
# Full cluster status
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS"
# Status for specific collection
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=products"
# List all collections
curl "http://localhost:8983/solr/admin/collections?action=LIST"
Collection API Parameters Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | string | required | Collection name |
| numShards | int | 1 | Number of shards |
| replicationFactor | int | 1 | NRT replicas per shard |
| collection.configName | string | _default | Configset name |
| maxShardsPerNode | int | removed | Removed in Solr 9; use replica placement plugins instead |
| router.name | string | compositeId | Routing strategy |
| router.field | string | uniqueKey field | Field used for routing |
| property.* | string | none | Arbitrary user-defined core properties |
| followAliases | bool | false | Resolve alias names to their target collections |
| nrtReplicas | int | replicationFactor | Near-real-time replica count |
| tlogReplicas | int | 0 | Transaction-log replica count |
| pullReplicas | int | 0 | Pull replica count |
| async | string | none | Async request ID for tracking |
Backup and Restore
Full Collection Backup
# Backup collection to local filesystem
curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=products_snapshot_20260426&collection=products&location=/backup/solr"
# Force a full, self-contained copy (Solr 8.9+ backups are incremental by default)
curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=products_full&collection=products&location=/backup/solr/indexes&incremental=false"
Incremental Backup
Since Solr 8.9, backups are natively incremental: repeating a BACKUP call with the same name and location copies only segments added since the previous run.
#!/bin/bash
# incremental-solr-backup.sh
BACKUP_DIR="/backup/solr"
COLLECTION="products"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# Weekly full backup on Sunday: a fresh name starts a new backup chain
if [ "$(date +%u)" -eq 7 ]; then
  curl -s "http://localhost:8983/solr/admin/collections?action=BACKUP&name=${COLLECTION}_full_${TIMESTAMP}&collection=${COLLECTION}&location=${BACKUP_DIR}&incremental=false"
else
  # Daily incremental: reuse a fixed name so Solr copies only new segments
  curl -s "http://localhost:8983/solr/admin/collections?action=BACKUP&name=${COLLECTION}_incr&collection=${COLLECTION}&location=${BACKUP_DIR}"
fi
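Backup directories grow without bound unless you prune them. A minimal retention sketch (find-based; the `products_` prefix and the day window mirror the script above and are adjustable):

```shell
#!/bin/bash
# prune_backups DIR DAYS: list backup entries under DIR older than DAYS.
prune_backups() {
  local dir=$1 days=$2
  find "$dir" -maxdepth 1 -name 'products_*' -mtime +"$days" -print
}
```

Run it from cron after the nightly backup; switch `-print` to `-delete` only after verifying the listing looks right.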
Snapshot Backup
# Create a named index backup via the replication handler
curl "http://localhost:8983/solr/products/replication?command=backup&name=pre_upgrade_snapshot&location=/backup/solr/snapshots"
# Inspect backup status and details
curl "http://localhost:8983/solr/products/replication?command=details"
# Delete a named backup
curl "http://localhost:8983/solr/products/replication?command=deletebackup&name=pre_upgrade_snapshot&location=/backup/solr/snapshots"
Restore Collection
# Restore from backup
curl "http://localhost:8983/solr/admin/collections?action=RESTORE&name=products_snapshot_20260426&collection=products_restored&location=/backup/solr"
# Restore to existing collection (overwrite)
curl "http://localhost:8983/solr/admin/collections?action=RESTORE&name=products_snapshot_20260426&collection=products&location=/backup/solr&overwriteExisting=true"
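Large backups and restores are better submitted with the async parameter and polled via REQUESTSTATUS than left on a hanging HTTP connection. A grep-based sketch for pulling the state out of the status response (jq is cleaner when available; the JSON shown is abbreviated):

```shell
#!/bin/bash
# Submit asynchronously, then poll:
#   curl ".../admin/collections?action=RESTORE&...&async=restore-001"
#   curl ".../admin/collections?action=REQUESTSTATUS&requestid=restore-001"

# request_state: extract the "state" value from a REQUESTSTATUS JSON body on stdin.
request_state() {
  grep -o '"state":"[^"]*"' | head -n1 | cut -d'"' -f4
}

echo '{"responseHeader":{"status":0},"status":{"state":"completed"}}' | request_state
```

Loop until it prints completed (or failed), then clean up with action=DELETESTATUS.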
Cloud Storage Backup
# Solr 9 ships S3 and GCS backup repository modules; define a repository in solr.xml,
# then pass repository=<name> on the BACKUP call. Lacking those, mount object storage
# as a filesystem backup target:
s3fs solr-backups /backup/solr -o endpoint=us-east-1 -o use_path_request_style
# Run backup targeting the mount
curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=products&collection=products&location=/backup/solr/prod"
# For GCS, gcsfuse works the same way
gcsfuse solr-backup-bucket /backup/solr
Security Hardening
Basic Authentication
# Create security.json (the hash below is the stock demo hash for password 'SolrRocks';
# replace it. The format is "base64(sha256(sha256(salt+password))) base64(salt)", space-separated)
cat > security.json << 'EOF'
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "blockUnknown": true,
    "credentials": {
      "admin": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      {"name": "security-edit", "role": "admin"},
      {"name": "collection-admin-edit", "role": "admin"},
      {"name": "core-admin-edit", "role": "admin"},
      {"name": "read", "role": "readers"},
      {"name": "update", "role": "writers"}
    ],
    "user-role": {
      "admin": ["admin"],
      "reader_user": ["readers"],
      "writer_user": ["writers"]
    }
  }
}
EOF
# Upload to ZooKeeper
solr-9.8.0/bin/solr zk cp file:security.json zk:/security.json -z zoo1:2181
Kerberos Authentication
The principal and keytab are supplied as system properties in solr.in.sh; security.json only selects the plugin:
{
  "authentication": {
    "class": "solr.KerberosPlugin"
  }
}
# In solr.in.sh
SOLR_AUTH_TYPE="kerberos"
SOLR_AUTHENTICATION_OPTS="-Dsolr.kerberos.principal=HTTP/[email protected] -Dsolr.kerberos.keytab=/etc/solr/solr.keytab"
TLS Configuration
# In solr.in.sh
SOLR_SSL_ENABLED=true
SOLR_SSL_KEY_STORE=/etc/solr/certs/solr-keystore.jks
SOLR_SSL_KEY_STORE_PASSWORD=changeme
SOLR_SSL_TRUST_STORE=/etc/solr/certs/solr-truststore.jks
SOLR_SSL_TRUST_STORE_PASSWORD=changeme
SOLR_SSL_NEED_CLIENT_AUTH=false
SOLR_SSL_WANT_CLIENT_AUTH=false
SOLR_SSL_CHECK_PEER_NAME=true
Audit Logging
Audit logging is configured in the auditlogging section of security.json (Solr 8+), not in solrconfig.xml:
{
  "auditlogging": {
    "class": "solr.SolrLogAuditLoggerPlugin",
    "eventTypes": ["REJECTED", "UNAUTHORIZED", "ERROR", "COMPLETED"]
  }
}
Monitoring
Metrics API Endpoints
# All metrics
curl "http://localhost:8983/solr/admin/metrics"
# Core metrics only
curl "http://localhost:8983/solr/admin/metrics?group=core"
# JVM metrics
curl "http://localhost:8983/solr/admin/metrics?group=jvm"
# Cache metrics (cache stats live under the core group)
curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=CACHE"
# Request handler metrics for /select
curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select"
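For quick checks without a full monitoring stack, you can pull a single numeric value out of the metrics JSON. A grep sketch (jq is the cleaner tool; the response snippet below is abbreviated, not the full payload):

```shell
#!/bin/bash
# metric_value KEY: print the first numeric value for KEY in a metrics JSON body on stdin.
metric_value() {
  local key=$1
  grep -o "\"$key\":[0-9.]*" | head -n1 | cut -d: -f2
}

echo '{"metrics":{"solr.jvm":{"memory.heap.usage":0.42}}}' | metric_value "memory.heap.usage"
```

Pipe the output of a curl against /admin/metrics into it from a cron health check.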
Prometheus Integration
# Prometheus cannot scrape Solr's JSON metrics endpoint directly; run the
# prometheus-exporter shipped with Solr and point Prometheus at it.
# Start the exporter against the cluster (9854 is the Ref Guide's example port)
./prometheus-exporter/bin/solr-exporter -p 9854 -z zoo1:2181,zoo2:2181,zoo3:2181 -f ./prometheus-exporter/conf/solr-exporter-config.xml
# prometheus-solr.yml
scrape_configs:
  - job_name: "solr"
    static_configs:
      - targets:
          - "solr-exporter:9854"
Key Metrics to Watch
| Metric | Category | Warning | Critical | Action |
|---|---|---|---|---|
| solr.core.query.requestsPerSecond | Throughput | 80% of peak | 95% of peak | Add replicas or scale |
| solr.core.cache.hitratio | Cache | < 0.70 | < 0.50 | Increase cache size |
| solr.jvm.memory.heap.usage | Memory | > 75% | > 90% | Increase heap or tune GC |
| solr.jvm.gc.g1-old-generation.time | GC | > 200ms | > 500ms | Tune GC or reduce heap pressure |
| solr.node.fs.total.usableSpace | Disk | < 20% free | < 10% free | Add nodes or purge old indices |
| solr.core.segment.count | Index | > 100 | > 200 | Force merge or tune merge policy |
| solr.core.update.autoCommitCount | Throughput | rapid spikes | sustained | Tune autoCommit/autoSoftCommit intervals |
Solr Admin UI
Administration console:
http://localhost:8983/solr/#/
Key pages:
- Dashboard: node health, JVM, disk
- Collections: cluster topology, shard/replica states
- Core Admin: per-core metrics, index size
- Plugins: loaded analyzers, caches, update processors
- Logging: live log level adjustment for debugging
Grafana Dashboard Example
{
  "title": "Solr Cluster Overview",
  "panels": [
    {
      "title": "Query Throughput (req/s)",
      "type": "graph",
      "targets": [{"expr": "solr_core_query_requestsPerSecond"}]
    },
    {
      "title": "Cache Hit Ratios",
      "type": "graph",
      "targets": [{"expr": "solr_core_cache_hitratio"}]
    },
    {
      "title": "JVM Heap Usage %",
      "type": "graph",
      "targets": [{"expr": "solr_jvm_memory_heap_usage"}]
    },
    {
      "title": "GC Pause Duration",
      "type": "graph",
      "targets": [{"expr": "solr_jvm_gc_g1_old_generation_time"}]
    }
  ]
}
Performance Tuning
JVM Heap and GC Tuning
# solr.in.sh — production JVM settings
SOLR_HEAP=8g   # sets -Xms and -Xmx equal; size for the working set, leave the rest
               # of RAM to the OS page cache (Lucene depends on it), stay below ~31g
GC_TUNE="-XX:+UseG1GC \
-XX:MaxGCPauseMillis=100 \
-XX:+ParallelRefProcEnabled \
-XX:+UseStringDeduplication \
-XX:+UnlockExperimentalVMOptions \
-XX:G1NewSizePercent=30 \
-XX:G1MaxNewSizePercent=50 \
-XX:G1HeapRegionSize=8m \
-XX:G1ReservePercent=15 \
-XX:G1HeapWastePercent=5 \
-XX:G1MixedGCCountTarget=8"
# For large heaps (>16 GB), Shenandoah is an alternative:
# GC_TUNE="-XX:+UseShenandoahGC -XX:ShenandoahGCHeuristics=adaptive"
SOLR_OPTS="$GC_TUNE -XX:+AlwaysPreTouch -Djava.security.egd=file:/dev/./urandom"
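The G1HeapRegionSize value roughly follows the JVM's own heuristic of targeting about 2048 regions, rounded down to a power of two and capped at 32m. A sketch of that arithmetic (rule of thumb only; verify the chosen value with jcmd <pid> VM.flags):

```shell
#!/bin/bash
# region_size_mb HEAP_GB: approximate G1 region size targeting ~2048 regions.
region_size_mb() {
  local heap_gb=$1
  local target=$(( heap_gb * 1024 / 2048 ))  # desired region size in MB
  local size=1
  while (( size * 2 <= target )); do size=$(( size * 2 )); done
  (( size > 32 )) && size=32
  echo "${size}m"
}

region_size_mb 8    # 4m
region_size_mb 16   # 8m
```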
JVM GC Options Reference
| Option | Purpose | Recommendation |
|---|---|---|
| -Xms / -Xmx | Heap min and max | Set equal; size for the working set, leave RAM for the page cache |
| -XX:MaxGCPauseMillis | Target GC pause | 50–200ms for query latency |
| -XX:G1HeapRegionSize | Region granularity | 4m–16m depending on heap |
| -XX:+ParallelRefProcEnabled | Parallelize reference processing | Always enable |
| -XX:+UseStringDeduplication | Deduplicate identical strings | Enable on data-heavy indices |
| -XX:G1ReservePercent | Headroom for promotion failure | 10–15% |
| -XX:+AlwaysPreTouch | Pre-touch heap pages at startup | Reduces latency spikes |
| -XX:+UseShenandoahGC | Low-pause GC | Consider for heaps > 16 GB |
Cache Configuration
<!-- solrconfig.xml — tuned cache settings (Solr 9 supports only solr.CaffeineCache) -->
<query>
  <!-- filterCache: stores filtered doc ID sets -->
  <filterCache class="solr.CaffeineCache"
               size="16384"
               initialSize="8192"
               autowarmCount="512"
               maxRamMB="512"/>
  <!-- queryResultCache: stores ordered result-set windows -->
  <queryResultCache class="solr.CaffeineCache"
                    size="8192"
                    initialSize="4096"
                    maxRamMB="256"/>
  <!-- documentCache: stores stored-field lookups -->
  <documentCache class="solr.CaffeineCache"
                 size="16384"
                 initialSize="8192"
                 maxRamMB="256"/>
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
  <queryResultWindowSize>100</queryResultWindowSize>
  <queryResultMaxDocsCached>500</queryResultMaxDocsCached>
</query>
Cache Types and Settings
| Cache | Purpose | Hit Ratio Target | Max RAM | Eviction |
|---|---|---|---|---|
| filterCache | Filtered doc ID lists | > 80% | 256–1024 MB | Caffeine (W-TinyLFU) |
| queryResultCache | Complete query results | > 70% | 128–512 MB | Caffeine (W-TinyLFU) |
| documentCache | Stored field retrievals | > 90% | 128–256 MB | Caffeine (W-TinyLFU) |
| fieldValueCache | Facet/group field value lookups | > 90% | 64–256 MB | Caffeine (W-TinyLFU) |
Auto Soft-Commit and Index Warming
<!-- solrconfig.xml — near-real-time and commit settings -->
<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>900000</maxTime>  <!-- hard commit every 15 min; does not open a searcher -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxDocs>1000</maxDocs>
  <maxTime>60000</maxTime>   <!-- new data becomes visible within 60 s -->
</autoSoftCommit>
<!-- Index warming: prime caches after a searcher reopens -->
<query>
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">cat:electronics</str></lst>
      <lst><str name="q">price:[10 TO 100]</str></lst>
      <lst>
        <str name="q">*:*</str><str name="rows">0</str>
        <str name="facet">true</str><str name="facet.field">category</str>
      </lst>
    </arr>
  </listener>
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">*:*</str><str name="rows">0</str></lst>
    </arr>
  </listener>
</query>
Merge Policy Tuning
<!-- solrconfig.xml — tiered merge policy for write-heavy workloads -->
<mergePolicyFactory class="solr.TieredMergePolicyFactory">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <int name="maxMergedSegmentMB">5120</int>
  <double name="deletesPctAllowed">25.0</double>
  <double name="floorSegmentMB">50</double>
  <double name="forceMergeDeletesPctAllowed">20.0</double>
</mergePolicyFactory>
<!-- Force merge during a low-traffic window (use sparingly) -->
curl "http://localhost:8983/solr/products/update?optimize=true&maxSegments=3&waitSearcher=true"
Log Management and Troubleshooting
Log Levels
# Set a log level (format: set=<logger>:<level>, levels ALL/TRACE/DEBUG/INFO/WARN/ERROR/OFF)
curl "http://localhost:8983/solr/admin/info/logging?set=org.apache.solr.core:DEBUG"
# Get current log levels
curl "http://localhost:8983/solr/admin/info/logging"
# Reset to the default level
curl "http://localhost:8983/solr/admin/info/logging?set=org.apache.solr.core:INFO"
Common Troubleshooting Commands
# Check cluster state
curl -s "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS" | jq .cluster
# Check shard health
curl -s "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=products" | jq '.cluster.collections.products.shards'
# Ping a core
curl "http://localhost:8983/solr/products/admin/ping"
# Rebalance shard leaders (honors preferredLeader replica properties)
curl "http://localhost:8983/solr/admin/collections?action=REBALANCELEADERS&collection=products"
# Force a leader when a shard has none
curl "http://localhost:8983/solr/admin/collections?action=FORCELEADER&collection=products&shard=shard1"
# Unload a stuck core; deleteIndex/deleteDataDir destroy its data, so confirm a healthy replica exists
curl "http://localhost:8983/solr/admin/cores?action=UNLOAD&core=products_shard1_replica_n1&deleteIndex=true&deleteDataDir=true"
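Script-friendly health checks usually reduce the ping response to a yes/no. A sketch (the response excerpt is abbreviated; a full check should also treat curl failures as unhealthy):

```shell
#!/bin/bash
# ping_ok: succeed when a ping response body on stdin reports "status":"OK".
ping_ok() {
  grep -q '"status":"OK"'
}

echo '{"responseHeader":{"status":0,"QTime":1},"status":"OK"}' | ping_ok && echo healthy
```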
Log Rotation
<!-- log4j2.xml — production logging configuration -->
<RollingFile name="solrlog" fileName="${sys:solr.log.dir}/solr.log"
             filePattern="${sys:solr.log.dir}/solr.%d{yyyy-MM-dd}.%i.log.gz">
  <PatternLayout pattern="%d{ISO8601} %-5p [%t] %c{1.} - %m%n"/>
  <Policies>
    <TimeBasedTriggeringPolicy interval="1" modulate="true"/>
    <SizeBasedTriggeringPolicy size="256 MB"/>
  </Policies>
  <DefaultRolloverStrategy max="30">
    <Delete basePath="${sys:solr.log.dir}" maxDepth="1">
      <IfFileName glob="solr.*.log.gz"/>
      <IfLastModified age="30d"/>
    </Delete>
  </DefaultRolloverStrategy>
</RollingFile>
Rolling Restarts and Upgrades
Rolling Restart Procedure
#!/bin/bash
# rolling-restart.sh
NODES=("solr-node1:8983" "solr-node2:8983" "solr-node3:8983")
ZK_HOST="zoo1:2181,zoo2:2181,zoo3:2181"
for NODE in "${NODES[@]}"; do
  HOST=${NODE%%:*}
  echo "=== Restarting $NODE ==="
  # 1. Check collection health before touching the node
  solr-9.8.0/bin/solr healthcheck -c products -z "$ZK_HOST"
  # 2. Stop Solr gracefully
  ssh "$HOST" "solr-9.8.0/bin/solr stop -p 8983"
  # 3. Wait for ZooKeeper to register the node as down
  sleep 10
  # 4. Confirm the node left the live_nodes list
  curl -s "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS" | \
    jq '.cluster.live_nodes'
  # 5. Start Solr again
  ssh "$HOST" "solr-9.8.0/bin/solr start -cloud -p 8983 -z $ZK_HOST"
  # 6. Wait for the node to rejoin and replicas to sync
  sleep 30
  # 7. Verify the node is healthy
  curl -s "http://$NODE/solr/admin/cores?action=STATUS" | jq '.status | length'
  echo "=== $NODE restarted successfully ==="
done
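Fixed sleeps between nodes are a blunt instrument; a safer gate is to wait until CLUSTERSTATUS reports every replica active. A grep-based counting sketch (jq is more robust; the JSON sample is abbreviated):

```shell
#!/bin/bash
# inactive_replicas: count "state" entries in a CLUSTERSTATUS JSON body that are not "active".
inactive_replicas() {
  grep -o '"state":"[a-z]*"' | grep -cv '"state":"active"'
}

printf '%s' '{"replicas":{"r1":{"state":"active"},"r2":{"state":"recovering"}}}' | inactive_replicas
```

Poll it between nodes and proceed only when it prints 0.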
Upgrade Strategy
Solr upgrade checklist:
1. Read release notes for breaking changes (config API, parameter renames)
2. Test upgrade on staging cluster with production data size
3. Take full backup of all collections and ZooKeeper data
4. Back up ZooKeeper state: copy configs down with solr-9.8.0/bin/solr zk cp -r zk:/configs file:./zk-configs -z localhost:2181
5. Run rolling restart, one node at a time, verifying health between each
6. After all nodes are upgraded, plan a full reindex if you crossed a major version (Lucene reads indexes from only one major version back)
7. Run full cluster health check and performance comparison
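Per-node verification during the rollout can include confirming every node reports the upgraded version from /solr/admin/info/system. A grep sketch for extracting the version field (abbreviated response; jq works too):

```shell
#!/bin/bash
# solr_version: extract solr-spec-version from an /admin/info/system JSON body on stdin.
solr_version() {
  grep -o '"solr-spec-version":"[^"]*"' | head -n1 | cut -d'"' -f4
}

echo '{"lucene":{"solr-spec-version":"9.8.0"}}' | solr_version
```

Run it against each node and fail the rollout if any version differs.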
Disaster Recovery Planning
Multi-Datacenter Strategy
# Configure cross-DC backup on primary
curl "http://localhost:8983/solr/admin/collections?action=BACKUP" \
--data-urlencode "name=dr_snapshot" \
--data-urlencode "collection=products" \
--data-urlencode "location=/nfs/solr-backups"
# On standby DC, restore from NFS backup
curl "http://standby-dc:8983/solr/admin/collections?action=RESTORE" \
--data-urlencode "name=dr_snapshot" \
--data-urlencode "collection=products" \
--data-urlencode "location=/nfs/solr-backups"
Automated Recovery Script
#!/bin/bash
# solr-dr-failover.sh
PRIMARY_NODE="solr-primary:8983"
STANDBY_NODE="solr-standby:8983"
log_error()   { echo "ERROR: $*" >&2; }
log_success() { echo "OK: $*"; }
check_health() {
  local node=$1
  local status
  status=$(curl -s -o /dev/null -w "%{http_code}" "http://$node/solr/admin/info/system")
  [ "$status" = "200" ]
}
if ! check_health "$PRIMARY_NODE"; then
  log_error "Primary Solr unhealthy, initiating failover"
  # 1. Promote standby (here: repoint a Kubernetes service)
  kubectl patch service solr -p '{"spec":{"selector":{"role":"standby"}}}'
  # 2. Restore latest backup to the promoted cluster
  curl "http://$STANDBY_NODE/solr/admin/collections?action=RESTORE&name=latest_dr&collection=products&location=/backup/solr"
  # 3. Verify the promoted node
  if check_health "$STANDBY_NODE"; then
    log_success "Failover complete, standby promoted"
  else
    log_error "Failover failed"
    exit 1
  fi
fi
Production Checklist
| Category | Item | Verification |
|---|---|---|
| Cluster | ZooKeeper ensemble has 3+ nodes | `echo stat \| nc zoo1 2181` |
| Cluster | SolrCloud mode enabled | curl /solr/admin/collections?action=CLUSTERSTATUS |
| Cluster | Replication factor >= 2 | Collection list output |
| JVM | Heap sized with page-cache headroom (not all of host RAM) | curl /solr/admin/info/system |
| JVM | -Xms equals -Xmx | Solr start log |
| JVM | G1GC or Shenandoah configured | jcmd <pid> VM.flags |
| Disk | Dedicated data mount (not root) | df -h /var/solr |
| Disk | 80% or less utilization | Monitoring alert |
| Backup | Automated backup running daily | crontab -l |
| Backup | Off-site backup copy exists | Verify S3/GCS bucket |
| Security | Authentication enabled | curl without credentials returns 401 |
| Security | TLS configured for all endpoints | openssl s_client |
| Security | ZooKeeper ACLs set | getAcl /solr |
| Monitoring | Prometheus scraping all nodes | Target status page |
| Monitoring | Grafana dashboards configured | Dashboard list |
| Monitoring | Alert rules defined | Alertmanager config |
| Logging | Log rotation configured | Check log4j2.xml |
| Logging | Audit logging enabled | Verify audit.log exists |
Conclusion
Solr operations demand vigilance across every layer — from ZooKeeper cluster health and JVM tuning to security hardening and disaster recovery planning. Automate backups, monitor cache hit ratios and GC pauses in production, and always run rolling restarts during maintenance windows. With these practices in place, your Solr cluster will handle enterprise-scale search reliably.
Resources
- Apache Solr Reference Guide
- Solr Metrics API Documentation
- Solr Security Setup
- Solr Performance Tuning
- Prometheus Solr Exporter
- G1GC Tuning Guide