Introduction
Running Neo4j in production requires understanding operational aspects that ensure reliability, performance, and maintainability. This article covers deployment options, configuration tuning, backup strategies, monitoring approaches, and high availability configurations for production Neo4j environments.
Whether you’re deploying a single instance for development or a cluster for mission-critical applications, understanding these operational practices is essential for successful Neo4j deployments.
Deployment Options
Neo4j offers several deployment models to match different requirements.
Single Instance Deployment
For development and smaller workloads:
# Docker Compose for single instance
version: '3.8'
services:
  neo4j:
    image: neo4j:5.26
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      # Replace the default password before exposing this instance
      - NEO4J_AUTH=neo4j/password
      - NEO4J_server_memory_heap_initial__size=2G
      - NEO4J_server_memory_heap_max__size=4G
      - NEO4J_server_memory_pagecache_size=2G
    volumes:
      - neo4j_data:/data
      - neo4j_logs:/logs
# Named volumes must be declared at the top level
volumes:
  neo4j_data:
  neo4j_logs:
Neo4j Enterprise Clustering
For high availability and scalability, Neo4j Enterprise provides causal clustering:
# docker-compose.yml for causal cluster
# Cluster settings use the Neo4j 5 names (dbms.cluster.*); the 4.x causal_clustering.* names are rejected by 5.x
version: '3.8'
services:
  neo4j-core1:
    image: neo4j:5.26-enterprise
    hostname: core1
    environment:
      # Enterprise images do not start until the license is accepted
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_server_memory_heap_initial__size=4G
      - NEO4J_server_memory_heap_max__size=8G
      - NEO4J_server_memory_pagecache_size=4G
      - NEO4J_dbms_cluster_minimum__initial__system__primaries__count=3
      - NEO4J_dbms_cluster_discovery_endpoints=core1:5000,core2:5000,core3:5000
    volumes:
      - ./core1/data:/data
      - ./core1/logs:/logs
    ports:
      - "7474:7474"
      - "7687:7687"
  neo4j-core2:
    image: neo4j:5.26-enterprise
    hostname: core2
    environment:
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_server_memory_heap_initial__size=4G
      - NEO4J_server_memory_heap_max__size=8G
      - NEO4J_server_memory_pagecache_size=4G
      - NEO4J_dbms_cluster_discovery_endpoints=core1:5000,core2:5000,core3:5000
    volumes:
      - ./core2/data:/data
      - ./core2/logs:/logs
  neo4j-core3:
    image: neo4j:5.26-enterprise
    hostname: core3
    environment:
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_server_memory_heap_initial__size=4G
      - NEO4J_server_memory_heap_max__size=8G
      - NEO4J_server_memory_pagecache_size=4G
      - NEO4J_dbms_cluster_discovery_endpoints=core1:5000,core2:5000,core3:5000
    volumes:
      - ./core3/data:/data
      - ./core3/logs:/logs
Kubernetes Deployment
For cloud-native deployments:
# neo4j-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: neo4j
spec:
  serviceName: neo4j
  replicas: 3
  selector:
    matchLabels:
      app: neo4j
  template:
    metadata:
      labels:
        app: neo4j  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: neo4j
          image: neo4j:5.26-enterprise
          ports:
            - containerPort: 7474
              name: http
            - containerPort: 7687
              name: bolt
          env:
            # Enterprise images do not start until the license is accepted
            - name: NEO4J_ACCEPT_LICENSE_AGREEMENT
              value: "yes"
            - name: NEO4J_dbms_cluster_minimum__initial__system__primaries__count
              value: "3"
            - name: NEO4J_server_memory_heap_initial__size
              value: "4G"
            - name: NEO4J_server_memory_heap_max__size
              value: "8G"
Configuration Tuning
Neo4j’s performance depends heavily on proper memory configuration.
Memory Configuration
The most critical settings involve Neo4j’s three memory pools:
# neo4j.conf
# Heap memory - used for query execution and graph operations
server.memory.heap.initial_size=4G
server.memory.heap.max_size=8G
# Page cache - caches Neo4j store files for fast read access
server.memory.pagecache.size=4G
# Transaction memory - upper bound on memory used by all transactions combined
dbms.memory.transaction.total.max=1G
Guidelines:
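- Heap: 1/3 to 1/2 of available RAM
- Page cache: 1/3 to 1/2 of available RAM
- Leave memory for the OS and other processes: heap and page cache together must stay below total RAM, so do not pick the upper bound for both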
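As a rough illustration of the guidelines above, the following sketch derives starting values from a machine's total RAM. The function name `suggest_memory` and the even one-third split are assumptions chosen for illustration, not an official sizing rule; setting the initial heap equal to the maximum is a common choice to avoid heap resizing at runtime.

```shell
#!/usr/bin/env bash
# Sketch: split total RAM (in whole GB) into heap, page cache, and OS headroom.
# The 1/3 shares are an assumption taken from the lower bound of the guidelines.
suggest_memory() {
  local total_gb=$1
  local heap_gb=$(( total_gb / 3 ))
  local pagecache_gb=$(( total_gb / 3 ))
  # Whatever is not assigned stays available to the OS and other processes
  local os_gb=$(( total_gb - heap_gb - pagecache_gb ))
  echo "server.memory.heap.initial_size=${heap_gb}G"
  echo "server.memory.heap.max_size=${heap_gb}G"
  echo "server.memory.pagecache.size=${pagecache_gb}G"
  echo "# left for OS and other processes: ${os_gb}G"
}

# Example: a host with 24 GB of RAM
suggest_memory 24
```

Neo4j also ships its own estimator, `neo4j-admin server memory-recommendation`, which takes the size of the existing store into account and is a better starting point on a real host.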
Query Configuration
# Query execution (number of cached query plans per database)
server.db.query_cache_size=1000
# Transaction settings
db.transaction.timeout=60s
# Index configuration
db.indexes.default.schema_fill_factor=0.75
db.indexes.default.stalenessSeconds=600
Network Configuration
# Network connectors
server.bolt.enabled=true
server.bolt.listen_address=0.0.0.0:7687
server.http.enabled=true
server.http.listen_address=0.0.0.0:7474
# Connection pool
server.bolt.connection_pool_max_size=200
server.bolt.connection_pool_sweeping_enabled=true
Backup and Recovery
Protecting your graph data is critical.
Online Backup
Neo4j supports online backups:
# Perform backup (online backup requires Enterprise Edition and a running server)
neo4j-admin database backup neo4j --to-path=/backups/
# Backup with compression
neo4j-admin database backup neo4j --to-path=/backups/ --compress=true
# Incremental (differential) backup on top of an existing backup in the target directory
neo4j-admin database backup neo4j --to-path=/backups/ --type=DIFF
Scheduled Backups
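#!/bin/bash
# backup.sh
# Abort on errors so a failed backup never triggers the cleanup step
set -euo pipefail
BACKUP_DIR="/backups/neo4j"
DATE=$(date +%Y%m%d_%H%M%S)
# Create backup in a fresh timestamped directory
mkdir -p "$BACKUP_DIR/backup_$DATE"
neo4j-admin database backup neo4j --to-path="$BACKUP_DIR/backup_$DATE"
# Keep only the last 7 backups (run from BACKUP_DIR so rm receives valid paths)
cd "$BACKUP_DIR"
ls -t | tail -n +8 | xargs -r rm -rf --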
Restore from Backup
# Stop Neo4j
systemctl stop neo4j
# Restore database (replaces the existing neo4j database)
neo4j-admin database restore --from-path=/backups/neo4j/ --overwrite-destination=true neo4j
# Start Neo4j
systemctl start neo4j
Export and Import
// Export to CSV using APOC
CALL apoc.export.csv.all('export.csv', {})
// Export to JSON
CALL apoc.export.json.all('export.json', {})
Monitoring
Effective monitoring ensures system health.
Neo4j Monitoring Endpoint
# Prometheus metrics (requires server.metrics.prometheus.enabled=true)
curl -o metrics.txt http://localhost:2004/metrics
Key metrics include:
- neo4j_dbms_memory_heap_used_bytes - Heap memory usage
- neo4j_dbms_memory_pagecache_used_bytes - Page cache usage
- neo4j_bolt_messages_done_total - Bolt protocol messages
- neo4j_transaction_started_total - Started transactions
- neo4j_transaction_committed_total - Committed transactions
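For a quick look at these metrics without a full Prometheus setup, the scraped text file can be inspected directly. The sketch below assumes the standard Prometheus text format (one `name value` pair per line); the helper name `metric_value` and the sample numbers are hypothetical, and the metric names are taken from the list above.

```shell
#!/usr/bin/env bash
# metric_value FILE NAME: print the value of an unlabelled metric
# from a Prometheus text-format file (comment lines start with '#').
metric_value() {
  awk -v name="$2" '$1 == name { print $2 }' "$1"
}

# Hypothetical sample standing in for the downloaded metrics.txt
sample=$(mktemp)
cat > "$sample" <<'EOF'
# TYPE neo4j_transaction_started_total counter
neo4j_transaction_started_total 1200
# TYPE neo4j_transaction_committed_total counter
neo4j_transaction_committed_total 1150
EOF

started=$(metric_value "$sample" neo4j_transaction_started_total)
committed=$(metric_value "$sample" neo4j_transaction_committed_total)
# Transactions that were started but not committed (rolled back or still open)
echo "transactions not committed: $(( started - committed ))"
```

The integer arithmetic works here because the sample values are whole numbers; real exports may contain floating-point values, in which case the subtraction is better done in awk.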
Integration with Prometheus
# prometheus.yml
scrape_configs:
  - job_name: 'neo4j'
    static_configs:
      - targets: ['neo4j:2004']
Neo4j Logs
Monitor various log files:
# Query log - track slow queries
tail -f /var/log/neo4j/query.log
# Debug log - detailed system information
tail -f /var/log/neo4j/debug.log
# GC logs - garbage collection information
tail -f /var/log/neo4j/gc.log
Query Logging
Enable query logging for performance analysis:
# Enable query logging (INFO logs queries at completion; VERBOSE also logs their start)
db.logs.query.enabled=INFO
db.logs.query.threshold=1s
db.logs.query.plan_description_enabled=true
Analyze slow queries:
// Find the longest-running queries that are currently executing
SHOW TRANSACTIONS YIELD currentQuery, elapsedTime, cpuTime
RETURN currentQuery, elapsedTime, cpuTime
ORDER BY elapsedTime DESC
LIMIT 10
Security
Secure your Neo4j deployment.
Authentication and Authorization
// Create user
CREATE USER alice SET PASSWORD 'securePassword'
// Set role
GRANT ROLE reader TO alice
// Built-in roles: PUBLIC, reader, editor, publisher, architect, admin
SSL Configuration
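# Require TLS on Bolt
server.bolt.tls_level=REQUIRED
# Serve the HTTP API over HTTPS only
server.http.enabled=false
server.https.enabled=true
# SSL certificate configuration (file names are relative to base_directory)
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.base_directory=/path/to
dbms.ssl.policy.bolt.public_certificate=cert.pem
dbms.ssl.policy.bolt.private_key=key.pem
dbms.ssl.policy.https.enabled=true
dbms.ssl.policy.https.base_directory=/path/to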
LDAP Integration
# LDAP authentication (Enterprise Edition)
dbms.security.auth_enabled=true
dbms.security.authentication_providers=ldap
dbms.security.ldap.authentication.user_dn_template=uid={0},dc=example,dc=com
dbms.security.ldap.host=ldap://ldap.example.com:389
Performance Tuning
Optimize for your workload.
Index Usage
// Explain query plan
EXPLAIN MATCH (p:Person {name: 'Alice'}) RETURN p
// Profile query execution
PROFILE MATCH (p:Person {name: 'Alice'})-[:KNOWS]->(friend) RETURN friend
Connection Pool Tuning
# Connection pool settings
server.bolt.connection_pool_max_size=200
server.bolt.connection_pool_sweeping_enabled=true
server.bolt.connection_pool_sweeping_interval=300
Caching
# Increase query plan cache
server.db.query_cache_size=10000
# Relationship cache
db.cache.implementation=soft
db.cache.type=soft
High Availability
Configure HA for mission-critical deployments.
Causal Clustering
In causal clustering:
- Core servers (called primaries in Neo4j 5) - Use the Raft consensus protocol to provide transactional guarantees
- Read replicas (called secondaries in Neo4j 5) - Handle read queries for horizontal scaling
// Check cluster status (replaces dbms.cluster.overview(), which was removed in Neo4j 5)
SHOW SERVERS
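Because the core servers commit writes through Raft, a write needs acknowledgement from a strict majority of them. The sketch below works through that arithmetic for a few cluster sizes; the function names are illustrative only.

```shell
#!/usr/bin/env bash
# Raft commits a write once a strict majority of core members has accepted it.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of core members that can be lost while a majority still remains.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 3 4 5; do
  echo "cores=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Three cores tolerate one failure and five tolerate two, while a fourth core adds no extra fault tolerance over three. This is why core counts are usually odd.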
Switchover and Failover
# Force switchover to secondary
neo4j-admin database failover --database=neo4j --target-server=server-id
Load Balancing
# Configure load balancer
load.balancing.plugin=round_robin
Upgrade Procedures
Keep Neo4j updated:
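# Backup database (online backup needs a running server, so do this before stopping)
neo4j-admin database backup neo4j --to-path=/backups/pre-upgrade
# Stop Neo4j
systemctl stop neo4j
# Update Neo4j (Debian package versions carry the "1:" epoch prefix)
apt-get update
apt-get install neo4j=1:5.26.0
# Start Neo4j
systemctl start neo4j
# Verify the running version
cypher-shell -u neo4j -p password "CALL dbms.components()"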
Conclusion
Operating Neo4j in production requires attention to deployment, configuration, monitoring, and security. The practices in this article provide a foundation for reliable Neo4j deployments, from single instances to causal clusters. Proper configuration of memory, indexes, and monitoring ensures optimal performance.
In the next article, we’ll explore Neo4j’s internal architecture to understand how it achieves its graph processing capabilities.