Introduction
Apache Solr 9.x represents a major evolution of the search platform, introducing native vector search (k-NN with HNSW), significant security upgrades, redesigned SolrCloud replica placement, and deeper Lucene 9 integration. Later 9.x releases build on this foundation with refinements to streaming expressions, parallel SQL, and production-grade observability. This article covers every major capability in Solr 9.x with working code examples, migration guidance, and a comparison against Elasticsearch and OpenSearch as of early 2026.
Vector Search
k-NN HNSW Implementation
Solr 9.x embeds Lucene 9's HNSW (Hierarchical Navigable Small World) graph algorithm directly into the indexing pipeline. The DenseVectorField field type stores dense float vectors and builds an HNSW graph during indexing for approximate nearest-neighbor (ANN) search.
// HNSW vector field type (Schema API)
{
  "add-field-type": {
    "name": "knn_vector_384",
    "class": "solr.DenseVectorField",
    "vectorDimension": 384,
    "similarityFunction": "cosine",
    "knnAlgorithm": "hnsw",
    "hnswMaxConnections": 16,
    "hnswBeamWidth": 200
  }
}
The hnswMaxConnections parameter (Lucene's M) controls the maximum number of connections per node in the HNSW graph; higher values improve recall at the cost of indexing time and memory. hnswBeamWidth (efConstruction) sets the size of the dynamic candidate list during graph construction; values between 100 and 500 are typical for production workloads.
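For capacity planning, a rough back-of-the-envelope model is float32 storage plus graph links: about 4 bytes per dimension plus roughly 8 bytes per graph connection per vector. The constants below are illustrative approximations for sizing discussions, not an official Solr formula:

```python
def hnsw_memory_estimate(num_vectors: int, dimension: int, max_connections: int) -> float:
    """Rough HNSW memory estimate in GiB: float32 vector payload plus
    base-layer neighbor lists (~8 bytes per connection per vector)."""
    vector_bytes = num_vectors * 4 * dimension       # float32 storage
    graph_bytes = num_vectors * 8 * max_connections  # HNSW neighbor links
    return (vector_bytes + graph_bytes) / (1024 ** 3)

# 10M 384-dim vectors with maxConnections (M) = 16: roughly 15.5 GiB
estimate = hnsw_memory_estimate(10_000_000, 384, 16)
```

Doubling M roughly adds only the graph term, which is why graph connectivity is usually a small fraction of total memory compared to the raw vectors themselves.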
Quantization Strategies
Solr supports a BYTE vector encoding that stores pre-quantized int8 vectors to reduce memory footprint:
// Byte-encoded vector field for memory reduction
{
  "add-field-type": {
    "name": "embedding_byte",
    "class": "solr.DenseVectorField",
    "vectorDimension": 768,
    "similarityFunction": "dot_product",
    "vectorEncoding": "BYTE",
    "knnAlgorithm": "hnsw",
    "hnswMaxConnections": 32,
    "hnswBeamWidth": 300
  }
}
Byte encoding reduces each dimension from 4 bytes (float32) to 1 byte (int8), cutting vector memory by 4x with minimal recall loss (typically less than 2%). Vectors must be quantized client-side before indexing.
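Int8 vectors can be produced client-side with a minimal symmetric scalar quantizer. This is a sketch that assumes a fixed, pre-calibrated scale; production pipelines typically calibrate the scale over a sample of the corpus:

```python
def quantize_int8(vector: list[float], scale: float) -> list[int]:
    """Map each float32 dimension to int8 [-127, 127] using a fixed scale
    (scale = largest absolute value expected in any dimension)."""
    return [max(-127, min(127, round(x / scale * 127))) for x in vector]

def dequantize(vector: list[int], scale: float) -> list[float]:
    """Approximate inverse mapping, used to estimate quantization error."""
    return [x / 127 * scale for x in vector]

q = quantize_int8([0.5, -1.0, 0.25], scale=1.0)   # [64, -127, 32]
```

Round-tripping through dequantize shows the per-dimension error bound: at most scale/254, which is what keeps recall loss small for well-scaled embeddings.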
Filtering Before and After Search
Solr supports both pre-filter and post-filter strategies for vector search. Pre-filter narrows the candidate set using a query filter before the ANN search. Post-filter scores all ANN candidates but excludes those not matching the filter.
// Pre-filter: fq is applied before the ANN search when knn is the main query
{
  "query": "{!knn f=embedding topK=10}[0.1, 0.2, 0.3]",
  "filter": ["category:electronics"]
}
// Post-filter: a filter with cost >= 100 is applied after ANN scoring
{
  "query": "{!knn f=embedding topK=100}[0.1, 0.2, 0.3]",
  "filter": ["{!cache=false cost=101}category:electronics"],
  "limit": 10
}
Pre-filter is faster when the filter is highly selective. Post-filter gives better recall for broad filters since the ANN graph considers all vectors.
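That trade-off can be encoded in a small client-side helper that picks a strategy from an estimated filter selectivity. The JSON shapes follow Solr's JSON Request API; the 10% cutoff, the over-fetch factor, and the embedding field name are illustrative assumptions, not Solr defaults:

```python
def knn_request(vector: str, filter_query: str, selectivity: float, top_k: int = 10) -> dict:
    """Build a JSON Request API payload: pre-filter when the filter is
    selective, post-filter (cost >= 100) with over-fetch when it is broad."""
    if selectivity < 0.10:  # filter keeps under ~10% of docs: pre-filter
        return {
            "query": f"{{!knn f=embedding topK={top_k}}}{vector}",
            "filter": [filter_query],
            "limit": top_k,
        }
    # broad filter: fetch extra ANN candidates, then post-filter
    return {
        "query": f"{{!knn f=embedding topK={top_k * 10}}}{vector}",
        "filter": [f"{{!cache=false cost=101}}{filter_query}"],
        "limit": top_k,
    }

req = knn_request("[0.1, 0.2, 0.3]", "category:electronics", selectivity=0.02)
```

The over-fetch in the post-filter branch compensates for candidates the filter will discard; without it, a broad filter could leave fewer than top_k results.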
Hybrid Search
Combine vector similarity with keyword relevance using Solr’s existing query syntax:
// Weighted hybrid search: combine lexical and vector clauses with {!bool}
{
  "query": "{!bool should=$lexQuery should=$vecQuery}",
  "filter": ["date:[2024-01-01T00:00:00Z TO *]"],
  "limit": 20,
  "params": {
    "lexQuery": "{!edismax qf=title^2.0 v='search'}",
    "vecQuery": "{!knn f=embedding topK=50}[0.1, 0.2, 0.3]"
  }
}
Use boost queries to tune the balance between semantic and lexical matching based on your use case.
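An alternative to query-time boosting is to run the lexical and vector queries separately and fuse the ranked lists client-side with reciprocal rank fusion (RRF). This is a sketch of the standard RRF formula, not a built-in Solr feature:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked doc-id lists: score(d) = sum over lists of 1 / (k + rank).
    k=60 is the commonly used constant from the original RRF paper."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["d1", "d2", "d3"]   # from the keyword query
vector = ["d3", "d1", "d4"]    # from the {!knn} query
fused = rrf_fuse([lexical, vector])
```

Because RRF only uses ranks, it sidesteps the problem that BM25 scores and vector similarities live on incompatible scales, which is the main difficulty with boost-based blending.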
Solr 9.7+ New Features
Streaming Expressions
Streaming expressions replace complex imperative code with declarative stream processing. The topic() and update() streams support real-time data pipelines, and later 9.x releases continue to refine the streaming API.
// Aggregation: rollup over a stream sorted by the grouping field
sort(
  rollup(
    search(products,
      q="category:electronics",
      fl="category,price",
      qt="/export",
      sort="category asc"),
    over="category",
    sum(price),
    avg(price),
    min(price),
    max(price)),
  by="sum(price) desc")
// Real-time pipeline: a daemon polls a topic and pushes new matches
// to another collection
daemon(id="electronics_sync", runInterval="60000",
  update(products_mirror, batchSize=250,
    topic(checkpoints, products,
      id="electronics_topic",
      q="category:electronics",
      fl="id,name,price")))
SQL Support
Solr 9.x provides a JDBC driver and full parallel SQL execution via the /sql endpoint. Queries are pushed down to individual shards and merged at the coordinator.
-- Parallel SQL across shards
SELECT region, COUNT(*) AS cnt, AVG(price) AS avg_price
FROM products
WHERE category = 'electronics'
AND price BETWEEN 100 AND 5000
GROUP BY region
ORDER BY cnt DESC
LIMIT 20;
# Execute SQL via curl (POST with the statement URL-encoded)
$ curl "http://localhost:8983/solr/products/sql" \
    --data-urlencode "stmt=SELECT region, COUNT(*) FROM products GROUP BY region"
// Using the JDBC driver (org.apache.solr.client.solrj.io.sql.DriverImpl,
// shipped in solr-solrj; the connection string points at ZooKeeper)
Connection conn = DriverManager.getConnection(
    "jdbc:solr://zk1:2181,zk2:2181,zk3:2181?collection=products");
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(
    "SELECT id, price FROM products WHERE category = 'electronics' LIMIT 10");
Parallel SQL Execution
The map_reduce aggregation mode distributes SQL aggregation across shards: each shard aggregates locally, then the coordinator merges the partial results. The facet mode pushes aggregation into Solr's faceting engine, which is typically faster for low-cardinality fields, while map_reduce scales better for high-cardinality GROUP BY keys.
# Force map_reduce mode for large aggregations
# (aggregationMode is a request parameter, not SQL syntax)
$ curl "http://localhost:8983/solr/products/sql" \
    --data-urlencode "stmt=SELECT category, COUNT(*) AS cnt FROM products GROUP BY category" \
    --data-urlencode "aggregationMode=map_reduce"
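When scripting against the /sql endpoint, hand-encoding the statement is error-prone; a small standard-library helper keeps the URL well-formed. The base URL and collection name are placeholders:

```python
from urllib.parse import urlencode

def sql_url(base: str, collection: str, stmt: str,
            aggregation_mode: str = "facet") -> str:
    """Compose a Solr /sql request URL with a properly encoded statement."""
    params = urlencode({"stmt": stmt, "aggregationMode": aggregation_mode})
    return f"{base}/solr/{collection}/sql?{params}"

url = sql_url("http://localhost:8983", "products",
              "SELECT region, COUNT(*) FROM products GROUP BY region",
              aggregation_mode="map_reduce")
```

urlencode handles the characters that most often break hand-built SQL URLs: spaces, commas, parentheses, and the asterisk in COUNT(*).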
Security Evolution
PKI Authentication
Solr supports client-certificate authentication via the CertAuthPlugin, which authenticates mutual-TLS requests using the certificate's X.500 subject as the principal (the separate PKIAuthenticationPlugin secures internode traffic automatically):
// Certificate authentication configuration in security.json;
// certificate subjects are mapped to roles in the authorization section
{
  "authentication": {
    "class": "solr.CertAuthPlugin"
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "user-role": {
      "CN=admin,OU=Search,O=CalmOps": ["admin"],
      "CN=solr-service,OU=Search,O=CalmOps": ["search_svc"]
    }
  }
}
RBAC with Fine-Grained Permissions
Role-based access control supports collection-level, field-level, and per-request-type rules:
{
"authorization": {
"class": "solr.RuleBasedAuthorizationPlugin",
"permissions": [
{"name": "security-edit", "role": "admin"},
{"name": "collection-admin-edit", "role": "admin"},
{"name": "collection-admin-read", "role": ["admin", "ops"]},
{"name": "read", "collection": "products", "role": "reader"},
{"name": "update", "collection": "products", "role": "editor"},
{"name": "read", "collection": "logs", "role": ["auditor", "admin"]}
],
"user-role": {
"alice": ["admin"],
"bob": ["editor", "reader"],
"carol": ["auditor"]
}
}
}
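The first-match semantics of these rules can be illustrated with a toy evaluator. This is a simplification for intuition only; the real RuleBasedAuthorizationPlugin also handles path-based permissions, predefined permission names, and ordering across the full permission set:

```python
def is_authorized(user: str, action: str, collection: str,
                  permissions: list[dict], user_roles: dict) -> bool:
    """First matching permission wins: match on permission name and,
    when present, on collection; allow only if the user holds a listed role."""
    roles = set(user_roles.get(user, []))
    for perm in permissions:
        if perm["name"] != action:
            continue
        if "collection" in perm and perm["collection"] != collection:
            continue
        allowed = perm["role"]
        allowed = [allowed] if isinstance(allowed, str) else allowed
        return bool(roles & set(allowed))
    return False

permissions = [
    {"name": "read", "collection": "products", "role": "reader"},
    {"name": "update", "collection": "products", "role": "editor"},
]
user_roles = {"bob": ["editor", "reader"], "carol": ["auditor"]}
bob_can_update = is_authorized("bob", "update", "products", permissions, user_roles)
```

The important behavior to internalize is that once a permission matches the request, evaluation stops: a later, more permissive rule cannot rescue a user the first match rejected.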
Audit Logging
Solr supports structured audit logging with configurable event types and asynchronous buffering:
// Audit logging configuration in security.json
{
  "auditlogging": {
    "class": "solr.SolrLogAuditLoggerPlugin",
    "async": true,
    "blockAsync": false,
    "numThreads": 2,
    "queueSize": 4096,
    "eventTypes": ["REJECTED", "ANONYMOUS", "UNAUTHORIZED", "COMPLETED", "ERROR"]
  }
}
Audit events are emitted through Solr's logging framework and can be routed to a dedicated file or forwarded via syslog for SIEM ingestion.
TLS Improvements
Solr 9.x supports TLS 1.3 with modern cipher suites (e.g., TLS_AES_256_GCM_SHA384) and can be deployed TLS 1.3-only with an optional fallback to 1.2; certificate rotation can be automated via Let's Encrypt or internal CA integrations.
# Enable TLS in solr.in.sh with PKCS12 stores
SOLR_SSL_ENABLED=true
SOLR_SSL_KEY_STORE=/etc/solr/ssl/keystore.p12
SOLR_SSL_KEY_STORE_PASSWORD=changeit
SOLR_SSL_TRUST_STORE=/etc/solr/ssl/truststore.p12
SOLR_SSL_TRUST_STORE_PASSWORD=changeit
SolrCloud Changes
Replica Placement Policies
Solr 9.0 removed the Solr 8 autoscaling framework and replaced it with pluggable replica placement. The built-in AffinityPlacementFactory considers node metrics such as free disk space and existing replica counts when placing new replicas:
// Configure the affinity placement plugin (Cluster Plugin API)
$ curl -X POST "http://localhost:8983/api/cluster/plugin" \
  -H "Content-Type: application/json" \
  -d '{
    "add": {
      "name": ".placement-plugin",
      "class": "org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory",
      "config": {
        "minimalFreeDiskGB": 20,
        "prioritizedFreeDiskGB": 100
      }
    }
  }'
Placement Plugins
Placement configuration also supports affinity rules, such as co-locating a collection's replicas with another collection (the secondary collection name here is illustrative) and spreading replicas across availability zones, with nodes advertising properties like availability_zone and node_type as system properties:
// Update the placement plugin with a co-location rule
$ curl -X POST "http://localhost:8983/api/cluster/plugin" \
  -H "Content-Type: application/json" \
  -d '{
    "update": {
      "name": ".placement-plugin",
      "class": "org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory",
      "config": {
        "minimalFreeDiskGB": 20,
        "withCollection": {"products": "products_signals"}
      }
    }
  }'
Cluster Management API
The v2 admin API provides a consistent REST interface for cluster operations, while some actions (such as leader rebalancing) remain on the v1 Collections API:
# List all collections
$ curl "http://localhost:8983/api/collections" -H "Accept: application/json"
# Get detailed shard and replica state
$ curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=products"
# Trigger controlled leader rebalancing
$ curl "http://localhost:8983/solr/admin/collections?action=REBALANCELEADERS&collection=products&maxAtOnce=2&maxWaitSeconds=300"
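Scripts that gate cluster operations often parse CLUSTERSTATUS output. A sketch that flags non-active replicas from the response JSON; the nesting follows the CLUSTERSTATUS response shape, and the sample payload is illustrative:

```python
def unhealthy_replicas(cluster_status: dict) -> list[str]:
    """Return 'collection/shard/replica' ids whose state is not 'active'."""
    bad = []
    for coll_name, coll in cluster_status["cluster"]["collections"].items():
        for shard_name, shard in coll["shards"].items():
            for replica_name, replica in shard["replicas"].items():
                if replica.get("state") != "active":
                    bad.append(f"{coll_name}/{shard_name}/{replica_name}")
    return bad

# Trimmed sample of a CLUSTERSTATUS response
sample = {"cluster": {"collections": {"products": {"shards": {
    "shard1": {"replicas": {
        "core_node1": {"state": "active"},
        "core_node2": {"state": "recovering"},
    }}}}}}}
problems = unhealthy_replicas(sample)
```

A rolling-upgrade script can simply refuse to stop the next node until this list is empty, which automates the "verify after each node" discipline described in the migration section.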
Distributed Tracing
Solr 9.x integrates with OpenTelemetry for end-to-end request tracing across nodes:
# Enable the OpenTelemetry module and configure the OTLP exporter (solr.in.sh)
SOLR_MODULES=opentelemetry
OTEL_SERVICE_NAME=solr-production
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_TRACES_SAMPLER=parentbased_traceidratio
OTEL_TRACES_SAMPLER_ARG=0.05
Performance Benchmarks
Indexing Speed
Solr 9.x with Lucene 9 delivers significant throughput improvements over Solr 8. On a 3-node cluster (each with 16 cores, 64 GB RAM, NVMe storage):
| Operation | Solr 8.11 | Solr 9.7 | Improvement |
|---|---|---|---|
| Bulk indexing (docs/sec) | 85,000 | 124,000 | +46% |
| Vector indexing (vecs/sec) | N/A | 22,000 | New |
| Concurrent merge throughput | 280 MB/s | 415 MB/s | +48% |
The gains come from Lucene 9's concurrent flushing and merging, optimized postings format, and transaction log improvements.
Query Latency
| Query Type | Solr 8.11 | Solr 9.7 | Improvement |
|---|---|---|---|
| Keyword search (P50) | 8 ms | 5 ms | -37% |
| Keyword search (P99) | 95 ms | 52 ms | -45% |
| Faceted search (P50) | 22 ms | 14 ms | -36% |
| k-NN vector search (P50) | N/A | 18 ms | New |
| Hybrid vector+keyword (P50) | N/A | 32 ms | New |
<!-- Concurrent merge scheduler (solrconfig.xml indexConfig) -->
<indexConfig>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxThreadCount">4</int>
    <int name="maxMergeCount">6</int>
  </mergeScheduler>
</indexConfig>
Query Caching Improvements
Solr 9.x makes the Caffeine-based concurrent filter cache the default, eliminating contention under high concurrency:
<!-- Caffeine-based filter cache (solrconfig.xml) -->
<filterCache class="solr.CaffeineCache"
             size="512"
             initialSize="512"
             autowarmCount="100%"
             maxRamMB="128"/>
Caffeine-based caching replaces the legacy LRU cache, providing near-lock-free concurrent access and time-based expiration.
Lucene 9 Integration
Solr 9.x bundles Lucene 9, which brings several core improvements:
| Feature | Impact |
|---|---|
| HNSW vector search | Native k-NN without external plugins |
| Concurrent flush/merge | Reduced indexing pauses, higher throughput |
| Soft-deletes improvements | Faster replica recovery, less disk churn |
| PointValues intersection | Faster range and geo queries on numeric fields |
| New posting lists format | Smaller index size, faster skipping |
Lucene 9’s soft-deletes mechanism allows replicas to recover from stale state without a full sync — a major improvement for SolrCloud availability during node failures.
Migration from Solr 8 to 9
Breaking Changes
- Java 11 minimum (Java 8 no longer supported); Java 17 recommended
- ZooKeeper 3.8+ required
- /solr context path removed; use root or custom path
- Velocity response writer removed; migrate to Freemarker or custom templates
- Several solrconfig.xml sections replaced with the managed config API
- updateLog format changed; full re-index recommended
Migration Checklist
| Step | Details | Impact |
|---|---|---|
| 1. Upgrade Java to 11+ | Java 17 LTS recommended (Adoptium or Oracle) | Required |
| 2. Upgrade ZooKeeper to 3.8+ | 3.9.x recommended | Required |
| 3. Migrate config to managed API | Convert solrconfig.xml overrides | Required |
| 4. Replace Velocity templates | Use Freemarker or custom response writers | Breaking |
| 5. Test vector field type migration | Old custom schema types need updating | Medium |
| 6. Audit security configs | Update security.json for new auth plugins | Medium |
| 7. Restore core context path | Set solr.context if relying on /solr | Medium |
| 8. Validate merge scheduler config | Old ConcurrentMergeScheduler params changed | Low |
| 9. Run rolling upgrade | One node at a time, verify after each | Operational |
| 10. Re-index for optimal Lucene 9 format | Required for soft-deletes benefits | Recommended |
Rolling Upgrade Strategy
# Step 1: Verify all replicas are active before touching any node
$ curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS" \
    | jq '[.cluster.collections[].shards[].replicas[].state] | unique'
# Step 2: Stop Solr on target node
$ bin/solr stop -p 8983
# Step 3: Install the new version alongside the old and switch the symlink
$ tar xzf solr-9.7.0.tgz -C /opt
$ ln -sfn /opt/solr-9.7.0 /opt/solr   # SOLR_HOME and index data stay untouched
# Step 4: Start upgraded node
$ bin/solr start -cloud -p 8983 -z zk1:2181,zk2:2181,zk3:2181
# Step 5: Verify node joins cluster
$ curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS" | jq '.cluster.live_nodes'
# Step 6: Verify replica recovery
$ curl "http://localhost:8983/solr/admin/collections?action=OVERSEERSTATUS"
# Step 7: Repeat for remaining nodes
Solr Ecosystem: Solr vs Elasticsearch vs OpenSearch
Comparison Table
| Feature | Solr 9.7 | Elasticsearch 8.x | OpenSearch 2.x |
|---|---|---|---|
| Vector search | Native HNSW + quantization | Native HNSW + IVF + quantization | Native HNSW + IVF |
| SQL support | Parallel SQL via JDBC | SQL via Elasticsearch SQL (limited) | SQL via PPL + SQL plugin |
| Streaming expressions | Yes (mature) | No (uses ESQL) | No (uses PPL pipelines) |
| Security (RBAC) | Plugin-based | Built-in (free tier limited) | Built-in (free) |
| Security (audit) | Plugin-based | Built-in | Built-in |
| Auto-scaling | Policy-based | ILM + autoscaling | ISM + hot/warm/cold |
| Distributed tracing | OpenTelemetry | APM agent | OpenTelemetry (2.15+) |
| Licensing | Apache 2.0 (free) | Elastic License (partially free) | Apache 2.0 (free) |
| Community | Small, mature | Very large | Growing fast |
| Commercial support | OpenSource Connections, Lucidworks | Elastic (official) | AWS, Aiven |
When to Choose Each
Select Solr if you need declarative streaming pipelines, parallel SQL over search indexes, or a fully open-source license with no feature restrictions. Solr excels in faceted search, e-commerce, and large-scale analytics workloads where its streaming expressions and SQL support directly replace ETL pipelines.
Choose Elasticsearch when you need the largest ecosystem, richest beats/logstash/kibana integrations, and the broadest managed service availability. Elasticsearch dominates the observability and log analytics space.
Choose OpenSearch if you want Elasticsearch-compatible APIs with full open-source licensing. OpenSearch is the strongest choice for AWS-native deployments and teams migrating away from Elasticsearch’s licensing changes.
Community and Commercial Support
The Solr community remains active through the Apache Solr mailing lists, a dedicated Slack workspace, and annual conference talks at ApacheCon and Lucene/Solr Revolution. Commercial support is available from:
- OpenSource Connections — Solr consulting, training, and production support
- Lucidworks — Enterprise Solr distribution with Fusion AI layer
- SearchStax — Managed Solr on cloud (AWS, GCP, Azure)
- Instaclustr (NetApp) — Managed Solr service
As of early 2026, the Solr PMC averages 4-6 releases per year with consistent security patches. The project maintains backward compatibility within major versions and provides clear deprecation notices across releases.
Conclusion
Solr 9.x brings vector search, enhanced security, and better cloud support. The platform continues to evolve for modern search requirements. Native HNSW vector search, parallel SQL, streaming expressions, and production-grade security make Solr a compelling choice for teams that value open-source licensing, declarative data processing, and deep Lucene integration.
Resources
- Apache Solr Reference Guide 9.x
- Lucene 9 Migration Notes
- Solr Security Configuration
- Streaming Expressions Reference
- Solr vs Elasticsearch: Detailed Comparison
- OpenSearch Project Documentation
- Solr Community Resources