Introduction
Running Apache Solr in production demands disciplined management of collections, secure access controls, reliable backup strategies, and ongoing performance tuning. This guide walks through every essential operation — from installing Solr in standalone or SolrCloud mode to handling rolling upgrades and disaster recovery. Each section includes working commands, configuration snippets, and production-hardened defaults you can apply immediately.
Installation and Deployment Modes
Solr runs in two primary modes, standalone and SolrCloud, and either can be deployed on bare metal, Docker, or Kubernetes. Choose based on your scale and availability requirements.
Standalone (Single Node)
# Download and extract Solr 9.x
wget https://dlcdn.apache.org/solr/solr/9.8.0/solr-9.8.0.tgz
tar xzf solr-9.8.0.tgz
# Start Solr in standalone mode
solr-9.8.0/bin/solr start -p 8983
# Create a core (standalone equivalent of collection)
solr-9.8.0/bin/solr create_core -c products -d _default
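On a loaded node, a freshly created core may take a moment before it answers queries, so deployment scripts often poll the ping handler before proceeding. A minimal retry sketch (the `wait_for` helper and the 30-attempt budget are our own convention, not a Solr tool):

```shell
#!/bin/bash
# wait_for: retry a command until it succeeds or the attempt budget runs out.
wait_for() {
  local attempts=$1; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    sleep 1
  done
  return 1
}

# Block until the new core answers its ping handler (assumes the default port):
# wait_for 30 curl -sf "http://localhost:8983/solr/products/admin/ping"
```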
SolrCloud (Multi-Node Cluster)
# Start first node in cloud mode
solr-9.8.0/bin/solr start -cloud -p 8983 -z zoo1:2181,zoo2:2181,zoo3:2181
# Start additional nodes
solr-9.8.0/bin/solr start -cloud -p 8984 -z zoo1:2181,zoo2:2181,zoo3:2181
# Verify cluster status
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS"
Docker Deployment
# docker-compose.yml
version: "3.8"
services:
  solr:
    image: solr:9.8
    ports:
      - "8983:8983"
    environment:
      - ZK_HOST=zoo1:2181,zoo2:2181,zoo3:2181
      - SOLR_HEAP=4g
      - SOLR_OPTS=-Djava.security.egd=file:/dev/./urandom
    volumes:
      - solr_data:/var/solr
    deploy:
      replicas: 3
  zoo1:
    image: zookeeper:3.9
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
volumes:
  solr_data:
Kubernetes Deployment
# solr-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: solr
spec:
  serviceName: solr
  replicas: 3
  selector:
    matchLabels:
      app: solr
  template:
    metadata:
      labels:
        app: solr
    spec:
      containers:
        - name: solr
          image: solr:9.8
          ports:
            - containerPort: 8983
              name: solr-client
          env:
            - name: ZK_HOST
              value: zoo1:2181,zoo2:2181,zoo3:2181
            - name: SOLR_HEAP
              value: 4g
            - name: SOLR_OPTS
              value: "-Djava.security.egd=file:/dev/./urandom"
          volumeMounts:
            - name: solr-data
              mountPath: /var/solr
          livenessProbe:
            httpGet:
              path: /solr/admin/info/system
              port: 8983
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /solr/admin/collections?action=CLUSTERSTATUS
              port: 8983
            initialDelaySeconds: 30
            periodSeconds: 15
  volumeClaimTemplates:
    - metadata:
        name: solr-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
Collection Management
All collection operations go through the Collections API. Below are the essential actions every administrator needs.
CREATE
# Create collection with explicit shard and replica count
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=products&numShards=3&replicationFactor=2&collection.configName=products_config"
# Create collection with compositeId routing on a schema field, plus a custom core property
# (property.* entries are arbitrary user-defined properties, not Solr tuning knobs)
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=logs_202604&numShards=6&replicationFactor=2&router.name=compositeId&router.field=tenant_id&property.commitDistance=10000"
RELOAD
Reloading picks up changes to solrconfig.xml and managed-schema without deleting the collection.
# Reload collection
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=products"
# Response: {"responseHeader":{"status":0,"QTime":142}}
DELETE
# Delete collection and its index data
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=old_products"
# followAliases defaults to false; set it to true to resolve an alias name to its target collection
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=products_v1&followAliases=true"
SPLITSHARD
Split a hot shard into two sub-shards to distribute load.
# Split shard shard1 into two pieces
curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=products&shard=shard1"
# Split with explicit hash ranges (hex, ascending, together covering the parent shard's range)
curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=products&shard=shard1&ranges=0-1f4c,1f4d-3e8c"
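If you want balanced explicit ranges, you can compute the midpoint of the parent shard's hash range yourself. A bash sketch (hex in, hex out; the `split_range` helper is ours, not a Solr tool):

```shell
#!/bin/bash
# split_range LO HI: emit two contiguous sub-ranges covering LO-HI (hex, no 0x prefix).
split_range() {
  local lo=$((16#$1)) hi=$((16#$2))
  local mid=$(( lo + (hi - lo) / 2 ))
  printf '%x-%x,%x-%x\n' "$lo" "$mid" "$((mid + 1))" "$hi"
}

split_range 80000000 bfffffff   # 80000000-9fffffff,a0000000-bfffffff
```

Feed the output straight into the ranges parameter of the SPLITSHARD call above.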
DELETESHARD
Remove an empty shard after splitting or rebalancing.
# Delete shard (must be inactive or empty)
curl "http://localhost:8983/solr/admin/collections?action=DELETESHARD&collection=products&shard=shard1_0"
ADDREPLICA
Add replicas to increase read capacity or fault tolerance.
# Add replica to specific shard
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=products&shard=shard1&node=192.168.1.10:8983_solr"
# Add replica to the least-loaded node
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=products&shard=shard1"
Collection Aliases
Aliases let you swap collections behind a fixed name — essential for zero-downtime reindexing.
# Create alias pointing to one collection
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=search_products&collections=products_v2"
# Point alias to multiple collections (routed by time)
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=logs&collections=logs_202604,logs_202605"
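Membership of a time-routed alias is easy to script. A sketch that derives the previous and current monthly collection names from a reference date (assumes GNU date; the helper name and `logs_` prefix mirror the example above):

```shell
#!/bin/bash
# month_collections REF_DATE: print "logs_YYYYMM,logs_YYYYMM" for the previous and current month.
month_collections() {
  local ref=$1
  local prev cur
  prev=$(date -d "$ref -1 month" +logs_%Y%m)
  cur=$(date -d "$ref" +logs_%Y%m)
  echo "${prev},${cur}"
}

month_collections 2026-05-15   # logs_202604,logs_202605
```

Run it from cron at month rollover and pass the result to CREATEALIAS.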
# Rotate alias for zero-downtime reindex (CREATEALIAS on an existing alias repoints it atomically)
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=search_products&collections=products_v3"
Cluster Status
# Full cluster status
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS"
# Status for specific collection
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=products"
# List all collections
curl "http://localhost:8983/solr/admin/collections?action=LIST"
Collection API Parameters Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | string | required | Collection name |
| numShards | int | 1 | Number of shards |
| replicationFactor | int | 1 | NRT replicas per shard |
| collection.configName | string | _default | Configset name |
| maxShardsPerNode | int | removed | Removed in Solr 9; use replica placement plugins instead |
| router.name | string | compositeId | Routing strategy |
| router.field | string | uniqueKey field | Field used for routing |
| property.* | string | none | Arbitrary user-defined core properties |
| followAliases | bool | false | Resolve alias names to their target collections |
| nrtReplicas | int | replicationFactor | Near-real-time replica count |
| tlogReplicas | int | 0 | Transaction-log replica count |
| pullReplicas | int | 0 | Pull replica count |
| async | string | none | Async request ID for tracking |
Backup and Restore
Full Collection Backup
# Backup collection to local filesystem
curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=products_snapshot_20260426&collection=products&location=/backup/solr"
# Force a full, self-contained copy (Solr 8.9+ backups are incremental by default)
curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=products_full&collection=products&location=/backup/solr/indexes&incremental=false"
Incremental Backup
Since Solr 8.9, backups are natively incremental: repeating a BACKUP call with the same name and location copies only segments added since the previous run.
#!/bin/bash
# incremental-solr-backup.sh
BACKUP_DIR="/backup/solr"
COLLECTION="products"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# Weekly full backup on Sunday: a fresh name starts a new backup chain
if [ "$(date +%u)" -eq 7 ]; then
  curl -s "http://localhost:8983/solr/admin/collections?action=BACKUP&name=${COLLECTION}_full_${TIMESTAMP}&collection=${COLLECTION}&location=${BACKUP_DIR}&incremental=false"
else
  # Daily incremental: reuse a fixed name so Solr copies only new segments
  curl -s "http://localhost:8983/solr/admin/collections?action=BACKUP&name=${COLLECTION}_incr&collection=${COLLECTION}&location=${BACKUP_DIR}"
fi
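Backup directories grow without bound unless you prune them. A minimal retention sketch (find-based; the `products_` prefix and the day window mirror the script above and are adjustable):

```shell
#!/bin/bash
# prune_backups DIR DAYS: list backup entries under DIR older than DAYS.
prune_backups() {
  local dir=$1 days=$2
  find "$dir" -maxdepth 1 -name 'products_*' -mtime +"$days" -print
}
```

Run it from cron after the nightly backup; switch `-print` to `-delete` only after verifying the listing looks right.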
Snapshot Backup
# Create a named index backup via the replication handler
curl "http://localhost:8983/solr/products/replication?command=backup&name=pre_upgrade_snapshot&location=/backup/solr/snapshots"
# Inspect backup status and details
curl "http://localhost:8983/solr/products/replication?command=details"
# Delete a named backup
curl "http://localhost:8983/solr/products/replication?command=deletebackup&name=pre_upgrade_snapshot&location=/backup/solr/snapshots"
Restore Collection
# Restore from backup
curl "http://localhost:8983/solr/admin/collections?action=RESTORE&name=products_snapshot_20260426&collection=products_restored&location=/backup/solr"
# Restore to existing collection (overwrite)
curl "http://localhost:8983/solr/admin/collections?action=RESTORE&name=products_snapshot_20260426&collection=products&location=/backup/solr&overwriteExisting=true"
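Large backups and restores are better submitted with the async parameter and polled via REQUESTSTATUS than left on a hanging HTTP connection. A grep-based sketch for pulling the state out of the status response (jq is cleaner when available; the JSON shown is abbreviated):

```shell
#!/bin/bash
# Submit asynchronously, then poll:
#   curl ".../admin/collections?action=RESTORE&...&async=restore-001"
#   curl ".../admin/collections?action=REQUESTSTATUS&requestid=restore-001"

# request_state: extract the "state" value from a REQUESTSTATUS JSON body on stdin.
request_state() {
  grep -o '"state":"[^"]*"' | head -n1 | cut -d'"' -f4
}

echo '{"responseHeader":{"status":0},"status":{"state":"completed"}}' | request_state
```

Loop until it prints completed (or failed), then clean up with action=DELETESTATUS.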
Cloud Storage Backup
# Solr 9 ships S3 and GCS backup repository modules; define a repository in solr.xml,
# then pass repository=<name> on the BACKUP call. Lacking those, mount object storage
# as a filesystem backup target:
s3fs solr-backups /backup/solr -o endpoint=us-east-1 -o use_path_request_style
# Run backup targeting the mount
curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=products&collection=products&location=/backup/solr/prod"
# For GCS, gcsfuse works the same way
gcsfuse solr-backup-bucket /backup/solr
Security Hardening
Basic Authentication
# Create security.json (the hash below is the stock demo hash for password 'SolrRocks';
# replace it. The format is "base64(sha256(sha256(salt+password))) base64(salt)", space-separated)
cat > security.json << 'EOF'
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "blockUnknown": true,
    "credentials": {
      "admin": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      {"name": "security-edit", "role": "admin"},
      {"name": "collection-admin-edit", "role": "admin"},
      {"name": "core-admin-edit", "role": "admin"},
      {"name": "read", "role": "readers"},
      {"name": "update", "role": "writers"}
    ],
    "user-role": {
      "admin": ["admin"],
      "reader_user": ["readers"],
      "writer_user": ["writers"]
    }
  }
}
EOF
# Upload to ZooKeeper
solr-9.8.0/bin/solr zk cp file:security.json zk:/security.json -z zoo1:2181
Kerberos Authentication
The principal and keytab are supplied as system properties in solr.in.sh; security.json only selects the plugin:
{
  "authentication": {
    "class": "solr.KerberosPlugin"
  }
}
# In solr.in.sh
SOLR_AUTH_TYPE="kerberos"
SOLR_AUTHENTICATION_OPTS="-Dsolr.kerberos.principal=HTTP/[email protected] -Dsolr.kerberos.keytab=/etc/solr/solr.keytab"
TLS Configuration
# In solr.in.sh
SOLR_SSL_ENABLED=true
SOLR_SSL_KEY_STORE=/etc/solr/certs/solr-keystore.jks
SOLR_SSL_KEY_STORE_PASSWORD=changeme
SOLR_SSL_TRUST_STORE=/etc/solr/certs/solr-truststore.jks
SOLR_SSL_TRUST_STORE_PASSWORD=changeme
SOLR_SSL_NEED_CLIENT_AUTH=false
SOLR_SSL_WANT_CLIENT_AUTH=false
SOLR_SSL_CHECK_PEER_NAME=true
Audit Logging
Audit logging is configured in the auditlogging section of security.json (Solr 8+), not in solrconfig.xml:
{
  "auditlogging": {
    "class": "solr.SolrLogAuditLoggerPlugin",
    "eventTypes": ["REJECTED", "UNAUTHORIZED", "ERROR", "COMPLETED"]
  }
}
Monitoring
Metrics API Endpoints
# All metrics
curl "http://localhost:8983/solr/admin/metrics"
# Core metrics only
curl "http://localhost:8983/solr/admin/metrics?group=core"
# JVM metrics
curl "http://localhost:8983/solr/admin/metrics?group=jvm"
# Cache metrics (cache stats live under the core group)
curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=CACHE"
# Request handler metrics for /select
curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select"
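For quick checks without a full monitoring stack, you can pull a single numeric value out of the metrics JSON. A grep sketch (jq is the cleaner tool; the response snippet below is abbreviated, not the full payload):

```shell
#!/bin/bash
# metric_value KEY: print the first numeric value for KEY in a metrics JSON body on stdin.
metric_value() {
  local key=$1
  grep -o "\"$key\":[0-9.]*" | head -n1 | cut -d: -f2
}

echo '{"metrics":{"solr.jvm":{"memory.heap.usage":0.42}}}' | metric_value "memory.heap.usage"
```

Pipe the output of a curl against /admin/metrics into it from a cron health check.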
Prometheus Integration
# Prometheus cannot scrape Solr's JSON metrics endpoint directly; run the
# prometheus-exporter shipped with Solr and point Prometheus at it.
# Start the exporter against the cluster (9854 is the Ref Guide's example port)
./prometheus-exporter/bin/solr-exporter -p 9854 -z zoo1:2181,zoo2:2181,zoo3:2181 -f ./prometheus-exporter/conf/solr-exporter-config.xml
# prometheus-solr.yml
scrape_configs:
  - job_name: "solr"
    static_configs:
      - targets:
          - "solr-exporter:9854"
Key Metrics to Watch
| Metric | Category | Warning | Critical | Action |
|---|---|---|---|---|
| solr.core.query.requestsPerSecond | Throughput | 80% of peak | 95% of peak | Add replicas or scale |
| solr.core.cache.hitratio | Cache | < 0.70 | < 0.50 | Increase cache size |
| solr.jvm.memory.heap.usage | Memory | > 75% | > 90% | Increase heap or tune GC |
| solr.jvm.gc.g1-old-generation.time | GC | > 200ms | > 500ms | Tune GC or reduce heap pressure |
| solr.node.fs.total.usableSpace | Disk | < 20% free | < 10% free | Add nodes or purge old indices |
| solr.core.segment.count | Index | > 100 | > 200 | Force merge or tune merge policy |
| solr.core.update.autoCommitCount | Throughput | rapid spikes | sustained | Tune autoCommit/autoSoftCommit intervals |
Solr Admin UI
Administration console:
http://localhost:8983/solr/#/
Key pages:
- Dashboard: node health, JVM, disk
- Collections: cluster topology, shard/replica states
- Core Admin: per-core metrics, index size
- Plugins: loaded analyzers, caches, update processors
- Logging: live log level adjustment for debugging
Grafana Dashboard Example
{
  "title": "Solr Cluster Overview",
  "panels": [
    {
      "title": "Query Throughput (req/s)",
      "type": "graph",
      "targets": [{"expr": "solr_core_query_requestsPerSecond"}]
    },
    {
      "title": "Cache Hit Ratios",
      "type": "graph",
      "targets": [{"expr": "solr_core_cache_hitratio"}]
    },
    {
      "title": "JVM Heap Usage %",
      "type": "graph",
      "targets": [{"expr": "solr_jvm_memory_heap_usage"}]
    },
    {
      "title": "GC Pause Duration",
      "type": "graph",
      "targets": [{"expr": "solr_jvm_gc_g1_old_generation_time"}]
    }
  ]
}
Performance Tuning
JVM Heap and GC Tuning
# solr.in.sh — production JVM settings
SOLR_HEAP=8g   # sets -Xms and -Xmx equal; size for the working set, leave the rest
               # of RAM to the OS page cache (Lucene depends on it), stay below ~31g
GC_TUNE="-XX:+UseG1GC \
-XX:MaxGCPauseMillis=100 \
-XX:+ParallelRefProcEnabled \
-XX:+UseStringDeduplication \
-XX:+UnlockExperimentalVMOptions \
-XX:G1NewSizePercent=30 \
-XX:G1MaxNewSizePercent=50 \
-XX:G1HeapRegionSize=8m \
-XX:G1ReservePercent=15 \
-XX:G1HeapWastePercent=5 \
-XX:G1MixedGCCountTarget=8"
# For large heaps (>16 GB), Shenandoah is an alternative:
# GC_TUNE="-XX:+UseShenandoahGC -XX:ShenandoahGCHeuristics=adaptive"
SOLR_OPTS="$GC_TUNE -XX:+AlwaysPreTouch -Djava.security.egd=file:/dev/./urandom"
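The G1HeapRegionSize value roughly follows the JVM's own heuristic of targeting about 2048 regions, rounded down to a power of two and capped at 32m. A sketch of that arithmetic (rule of thumb only; verify the chosen value with jcmd <pid> VM.flags):

```shell
#!/bin/bash
# region_size_mb HEAP_GB: approximate G1 region size targeting ~2048 regions.
region_size_mb() {
  local heap_gb=$1
  local target=$(( heap_gb * 1024 / 2048 ))  # desired region size in MB
  local size=1
  while (( size * 2 <= target )); do size=$(( size * 2 )); done
  (( size > 32 )) && size=32
  echo "${size}m"
}

region_size_mb 8    # 4m
region_size_mb 16   # 8m
```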
JVM GC Options Reference
| Option | Purpose | Recommendation |
|---|---|---|
| -Xms / -Xmx | Heap min and max | Set equal; size for the working set, leave RAM for the page cache |
| -XX:MaxGCPauseMillis | Target GC pause | 50–200ms for query latency |
| -XX:G1HeapRegionSize | Region granularity | 4m–16m depending on heap |
| -XX:+ParallelRefProcEnabled | Parallelize reference processing | Always enable |
| -XX:+UseStringDeduplication | Deduplicate identical strings | Enable on data-heavy indices |
| -XX:G1ReservePercent | Headroom for promotion failure | 10–15% |
| -XX:+AlwaysPreTouch | Pre-touch heap pages at startup | Reduces latency spikes |
| -XX:+UseShenandoahGC | Low-pause GC | Consider for heaps > 16 GB |
Cache Configuration
<!-- solrconfig.xml — tuned cache settings (Solr 9 supports only solr.CaffeineCache) -->
<query>
  <!-- filterCache: stores filtered doc ID sets -->
  <filterCache class="solr.CaffeineCache"
               size="16384"
               initialSize="8192"
               autowarmCount="512"
               maxRamMB="512"/>
  <!-- queryResultCache: stores ordered result-set windows -->
  <queryResultCache class="solr.CaffeineCache"
                    size="8192"
                    initialSize="4096"
                    maxRamMB="256"/>
  <!-- documentCache: stores stored-field lookups -->
  <documentCache class="solr.CaffeineCache"
                 size="16384"
                 initialSize="8192"
                 maxRamMB="256"/>
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
  <queryResultWindowSize>100</queryResultWindowSize>
  <queryResultMaxDocsCached>500</queryResultMaxDocsCached>
</query>
Cache Types and Settings
| Cache | Purpose | Hit Ratio Target | Max RAM | Eviction |
|---|---|---|---|---|
| filterCache | Filtered doc ID lists | > 80% | 256–1024 MB | Caffeine (W-TinyLFU) |
| queryResultCache | Complete query results | > 70% | 128–512 MB | Caffeine (W-TinyLFU) |
| documentCache | Stored field retrievals | > 90% | 128–256 MB | Caffeine (W-TinyLFU) |
| fieldValueCache | Facet/group field value lookups | > 90% | 64–256 MB | Caffeine (W-TinyLFU) |
Auto Soft-Commit and Index Warming
<!-- solrconfig.xml — near-real-time and commit settings -->
<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>900000</maxTime>  <!-- hard commit every 15 min; does not open a searcher -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxDocs>1000</maxDocs>
  <maxTime>60000</maxTime>   <!-- new data becomes visible within 60 s -->
</autoSoftCommit>
<!-- Index warming: prime caches after a searcher reopens -->
<query>
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">cat:electronics</str></lst>
      <lst><str name="q">price:[10 TO 100]</str></lst>
      <lst>
        <str name="q">*:*</str><str name="rows">0</str>
        <str name="facet">true</str><str name="facet.field">category</str>
      </lst>
    </arr>
  </listener>
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">*:*</str><str name="rows">0</str></lst>
    </arr>
  </listener>
</query>
Merge Policy Tuning
<!-- solrconfig.xml — tiered merge policy for write-heavy workloads -->
<mergePolicyFactory class="solr.TieredMergePolicyFactory">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <int name="maxMergedSegmentMB">5120</int>
  <double name="deletesPctAllowed">25.0</double>
  <double name="floorSegmentMB">50</double>
  <double name="forceMergeDeletesPctAllowed">20.0</double>
</mergePolicyFactory>
<!-- Force merge during a low-traffic window (use sparingly) -->
curl "http://localhost:8983/solr/products/update?optimize=true&maxSegments=3&waitSearcher=true"
Log Management and Troubleshooting
Log Levels
# Set a log level (format: set=<logger>:<level>, levels ALL/TRACE/DEBUG/INFO/WARN/ERROR/OFF)
curl "http://localhost:8983/solr/admin/info/logging?set=org.apache.solr.core:DEBUG"
# Get current log levels
curl "http://localhost:8983/solr/admin/info/logging"
# Reset to the default level
curl "http://localhost:8983/solr/admin/info/logging?set=org.apache.solr.core:INFO"
Common Troubleshooting Commands
# Check cluster state
curl -s "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS" | jq .cluster
# Check shard health
curl -s "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=products" | jq '.cluster.collections.products.shards'
# Ping a core
curl "http://localhost:8983/solr/products/admin/ping"
# Rebalance shard leaders (honors preferredLeader replica properties)
curl "http://localhost:8983/solr/admin/collections?action=REBALANCELEADERS&collection=products"
# Force a leader when a shard has none
curl "http://localhost:8983/solr/admin/collections?action=FORCELEADER&collection=products&shard=shard1"
# Unload a stuck core; deleteIndex/deleteDataDir destroy its data, so confirm a healthy replica exists
curl "http://localhost:8983/solr/admin/cores?action=UNLOAD&core=products_shard1_replica_n1&deleteIndex=true&deleteDataDir=true"
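Script-friendly health checks usually reduce the ping response to a yes/no. A sketch (the response excerpt is abbreviated; a full check should also treat curl failures as unhealthy):

```shell
#!/bin/bash
# ping_ok: succeed when a ping response body on stdin reports "status":"OK".
ping_ok() {
  grep -q '"status":"OK"'
}

echo '{"responseHeader":{"status":0,"QTime":1},"status":"OK"}' | ping_ok && echo healthy
```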
Log Rotation
<!-- log4j2.xml — production logging configuration -->
<RollingFile name="solrlog" fileName="${sys:solr.log.dir}/solr.log"
             filePattern="${sys:solr.log.dir}/solr.%d{yyyy-MM-dd}.%i.log.gz">
  <PatternLayout pattern="%d{ISO8601} %-5p [%t] %c{1.} - %m%n"/>
  <Policies>
    <TimeBasedTriggeringPolicy interval="1" modulate="true"/>
    <SizeBasedTriggeringPolicy size="256 MB"/>
  </Policies>
  <DefaultRolloverStrategy max="30">
    <Delete basePath="${sys:solr.log.dir}" maxDepth="1">
      <IfFileName glob="solr.*.log.gz"/>
      <IfLastModified age="30d"/>
    </Delete>
  </DefaultRolloverStrategy>
</RollingFile>
Rolling Restarts and Upgrades
Rolling Restart Procedure
#!/bin/bash
# rolling-restart.sh
NODES=("solr-node1:8983" "solr-node2:8983" "solr-node3:8983")
ZK_HOST="zoo1:2181,zoo2:2181,zoo3:2181"
for NODE in "${NODES[@]}"; do
  HOST=${NODE%%:*}
  echo "=== Restarting $NODE ==="
  # 1. Check collection health before touching the node
  solr-9.8.0/bin/solr healthcheck -c products -z "$ZK_HOST"
  # 2. Stop Solr gracefully
  ssh "$HOST" "solr-9.8.0/bin/solr stop -p 8983"
  # 3. Wait for ZooKeeper to register the node as down
  sleep 10
  # 4. Confirm the node left the live_nodes list
  curl -s "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS" | \
    jq '.cluster.live_nodes'
  # 5. Start Solr again
  ssh "$HOST" "solr-9.8.0/bin/solr start -cloud -p 8983 -z $ZK_HOST"
  # 6. Wait for the node to rejoin and replicas to sync
  sleep 30
  # 7. Verify the node is healthy
  curl -s "http://$NODE/solr/admin/cores?action=STATUS" | jq '.status | length'
  echo "=== $NODE restarted successfully ==="
done
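Fixed sleeps between nodes are a blunt instrument; a safer gate is to wait until CLUSTERSTATUS reports every replica active. A grep-based counting sketch (jq is more robust; the JSON sample is abbreviated):

```shell
#!/bin/bash
# inactive_replicas: count "state" entries in a CLUSTERSTATUS JSON body that are not "active".
inactive_replicas() {
  grep -o '"state":"[a-z]*"' | grep -cv '"state":"active"'
}

printf '%s' '{"replicas":{"r1":{"state":"active"},"r2":{"state":"recovering"}}}' | inactive_replicas
```

Poll it between nodes and proceed only when it prints 0.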
Upgrade Strategy
Solr upgrade checklist:
1. Read release notes for breaking changes (config API, parameter renames)
2. Test upgrade on staging cluster with production data size
3. Take full backup of all collections and ZooKeeper data
4. Back up ZooKeeper state: copy configs down with solr-9.8.0/bin/solr zk cp -r zk:/configs file:./zk-configs -z localhost:2181
5. Run rolling restart, one node at a time, verifying health between each
6. After all nodes are upgraded, plan a full reindex if you crossed a major version (Lucene reads indexes from only one major version back)
7. Run full cluster health check and performance comparison
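Per-node verification during the rollout can include confirming every node reports the upgraded version from /solr/admin/info/system. A grep sketch for extracting the version field (abbreviated response; jq works too):

```shell
#!/bin/bash
# solr_version: extract solr-spec-version from an /admin/info/system JSON body on stdin.
solr_version() {
  grep -o '"solr-spec-version":"[^"]*"' | head -n1 | cut -d'"' -f4
}

echo '{"lucene":{"solr-spec-version":"9.8.0"}}' | solr_version
```

Run it against each node and fail the rollout if any version differs.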
Disaster Recovery Planning
Multi-Datacenter Strategy
# Configure cross-DC backup on primary
curl "http://localhost:8983/solr/admin/collections?action=BACKUP" \
--data-urlencode "name=dr_snapshot" \
--data-urlencode "collection=products" \
--data-urlencode "location=/nfs/solr-backups"
# On standby DC, restore from NFS backup
curl "http://standby-dc:8983/solr/admin/collections?action=RESTORE" \
--data-urlencode "name=dr_snapshot" \
--data-urlencode "collection=products" \
--data-urlencode "location=/nfs/solr-backups"
Automated Recovery Script
#!/bin/bash
# solr-dr-failover.sh
PRIMARY_NODE="solr-primary:8983"
STANDBY_NODE="solr-standby:8983"
log_error()   { echo "ERROR: $*" >&2; }
log_success() { echo "OK: $*"; }
check_health() {
  local node=$1
  local status
  status=$(curl -s -o /dev/null -w "%{http_code}" "http://$node/solr/admin/info/system")
  [ "$status" = "200" ]
}
if ! check_health "$PRIMARY_NODE"; then
  log_error "Primary Solr unhealthy, initiating failover"
  # 1. Promote standby (here: repoint a Kubernetes service)
  kubectl patch service solr -p '{"spec":{"selector":{"role":"standby"}}}'
  # 2. Restore latest backup to the promoted cluster
  curl "http://$STANDBY_NODE/solr/admin/collections?action=RESTORE&name=latest_dr&collection=products&location=/backup/solr"
  # 3. Verify the promoted node
  if check_health "$STANDBY_NODE"; then
    log_success "Failover complete, standby promoted"
  else
    log_error "Failover failed"
    exit 1
  fi
fi
Production Checklist
| Category | Item | Verification |
|---|---|---|
| Cluster | ZooKeeper ensemble has 3+ nodes | `echo stat \| nc zoo1 2181` |
| Cluster | SolrCloud mode enabled | curl /solr/admin/collections?action=CLUSTERSTATUS |
| Cluster | Replication factor >= 2 | Collection list output |
| JVM | Heap sized with page-cache headroom (not all of host RAM) | curl /solr/admin/info/system |
| JVM | -Xms equals -Xmx | Solr start log |
| JVM | G1GC or Shenandoah configured | jcmd <pid> VM.flags |
| Disk | Dedicated data mount (not root) | df -h /var/solr |
| Disk | 80% or less utilization | Monitoring alert |
| Backup | Automated backup running daily | crontab -l |
| Backup | Off-site backup copy exists | Verify S3/GCS bucket |
| Security | Authentication enabled | curl without credentials returns 401 |
| Security | TLS configured for all endpoints | openssl s_client |
| Security | ZooKeeper ACLs set | getAcl /solr |
| Monitoring | Prometheus scraping all nodes | Target status page |
| Monitoring | Grafana dashboards configured | Dashboard list |
| Monitoring | Alert rules defined | Alertmanager config |
| Logging | Log rotation configured | Check log4j2.xml |
| Logging | Audit logging enabled | Verify audit.log exists |
Conclusion
Solr operations demand vigilance across every layer — from ZooKeeper cluster health and JVM tuning to security hardening and disaster recovery planning. Automate backups, monitor cache hit ratios and GC pauses in production, and always run rolling restarts during maintenance windows. With these practices in place, your Solr cluster will handle enterprise-scale search reliably.
Resources
- Apache Solr Reference Guide
- Solr Metrics API Documentation
- Solr Security Setup
- Solr Performance Tuning
- Prometheus Solr Exporter
- G1GC Tuning Guide