
Neo4j Operations: Deployment, Configuration, and Management

Introduction

Running Neo4j in production requires understanding operational aspects that ensure reliability, performance, and maintainability. This article covers deployment options, configuration tuning, backup strategies, monitoring approaches, and high availability configurations for production Neo4j environments.

Whether you’re deploying a single instance for development or a cluster for mission-critical applications, understanding these operational practices is essential for successful Neo4j deployments.

Deployment Options

Neo4j offers several deployment models to match different requirements.

Single Instance Deployment

For development and smaller workloads:

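# Docker Compose for single instance
version: '3.8'
services:
  neo4j:
    image: neo4j:5.26
    ports:
      - "7474:7474"   # HTTP (Neo4j Browser)
      - "7687:7687"   # Bolt (drivers)
    environment:
      # Example credentials only; change them before exposing the instance
      - NEO4J_AUTH=neo4j/password
      - NEO4J_server_memory_heap_initial__size=2G
      - NEO4J_server_memory_heap_max__size=4G
      - NEO4J_server_memory_pagecache_size=2G
    volumes:
      - neo4j_data:/data
      - neo4j_logs:/logs

# Named volumes must be declared at the top level
volumes:
  neo4j_data:
  neo4j_logs: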
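
The environment variables above follow the naming rule of the Neo4j Docker image: prefix the setting with NEO4J_, double every underscore, then turn each dot into a single underscore. A minimal Python sketch of that mapping follows; the function name is illustrative and not part of any Neo4j tooling.

```python
def to_docker_env(setting: str) -> str:
    """Map a neo4j.conf setting name to the env var name used by the Docker image."""
    # Order matters: escape existing underscores first, then convert dots.
    return "NEO4J_" + setting.replace("_", "__").replace(".", "_")


if __name__ == "__main__":
    for name in ("server.memory.heap.initial_size", "server.memory.pagecache.size"):
        print(name, "->", to_docker_env(name))
```

This makes it easy to move a setting between neo4j.conf and a Compose file without guessing where the double underscores go.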

Neo4j Enterprise Clustering

For high availability and scalability, Neo4j Enterprise provides clustering (known as causal clustering before Neo4j 5):

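# docker-compose.yml for a three-server cluster (Neo4j 5 setting names)
version: '3.8'
services:
  neo4j-core1:
    image: neo4j:5.26-enterprise
    hostname: core1
    environment:
      # The Enterprise image does not start until the license is accepted
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_server_default__advertised__address=core1
      - NEO4J_server_memory_heap_initial__size=4G
      - NEO4J_server_memory_heap_max__size=8G
      - NEO4J_server_memory_pagecache_size=4G
      - NEO4J_dbms_cluster_minimum__initial__system__primaries__count=3
      - NEO4J_dbms_cluster_discovery_endpoints=core1:5000,core2:5000,core3:5000
      - NEO4J_initial_dbms_default__primaries__count=3
    volumes:
      - ./core1/data:/data
      - ./core1/logs:/logs
    ports:
      - "7474:7474"
      - "7687:7687"

  neo4j-core2:
    image: neo4j:5.26-enterprise
    hostname: core2
    environment:
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_server_default__advertised__address=core2
      - NEO4J_server_memory_heap_initial__size=4G
      - NEO4J_server_memory_heap_max__size=8G
      - NEO4J_server_memory_pagecache_size=4G
      - NEO4J_dbms_cluster_minimum__initial__system__primaries__count=3
      - NEO4J_dbms_cluster_discovery_endpoints=core1:5000,core2:5000,core3:5000
      - NEO4J_initial_dbms_default__primaries__count=3
    volumes:
      - ./core2/data:/data
      - ./core2/logs:/logs

  neo4j-core3:
    image: neo4j:5.26-enterprise
    hostname: core3
    environment:
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_server_default__advertised__address=core3
      - NEO4J_server_memory_heap_initial__size=4G
      - NEO4J_server_memory_heap_max__size=8G
      - NEO4J_server_memory_pagecache_size=4G
      - NEO4J_dbms_cluster_minimum__initial__system__primaries__count=3
      - NEO4J_dbms_cluster_discovery_endpoints=core1:5000,core2:5000,core3:5000
      - NEO4J_initial_dbms_default__primaries__count=3
    volumes:
      - ./core3/data:/data
      - ./core3/logs:/logs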

Kubernetes Deployment

For cloud-native deployments:

# neo4j-statefulset.yaml
# Skeleton only: a headless Service, discovery settings, and volumeClaimTemplates are also needed
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: neo4j
spec:
  serviceName: neo4j
  replicas: 3
  selector:
    matchLabels:
      app: neo4j
  template:
    metadata:
      labels:
        app: neo4j   # must match spec.selector.matchLabels
    spec:
      containers:
      - name: neo4j
        image: neo4j:5.26-enterprise
        ports:
        - containerPort: 7474
          name: http
        - containerPort: 7687
          name: bolt
        env:
        - name: NEO4J_ACCEPT_LICENSE_AGREEMENT
          value: "yes"
        - name: NEO4J_dbms_cluster_minimum__initial__system__primaries__count
          value: "3"
        - name: NEO4J_server_memory_heap_initial__size
          value: "4G"
        - name: NEO4J_server_memory_heap_max__size
          value: "8G"

Configuration Tuning

Neo4j’s performance depends heavily on proper memory configuration.

Memory Configuration

The most critical settings involve Neo4j’s three memory pools:

# neo4j.conf

# Heap memory - used for query execution and graph operations
server.memory.heap.initial_size=4G
server.memory.heap.max_size=8G

# Page cache - caches Neo4j store files for fast read access
server.memory.pagecache.size=4G

# Transaction memory - caps the memory that transaction state may use across all databases
dbms.memory.transaction.total.max=1G

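Guidelines:

  • Heap: 1/3 to 1/2 of available RAM
  • Page cache: 1/3 to 1/2 of available RAM, ideally large enough to hold the store files
  • Leave memory for the OS and other processes: heap and page cache together must stay below total RAM
  • Run neo4j-admin server memory-recommendation to get starting values for the host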
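
The guidelines can be turned into simple arithmetic. The sketch below is only an illustration under stated assumptions: the function name, the default fractions of 1/3 each, and rounding down to whole gigabytes are choices made here, not Neo4j recommendations.

```python
from fractions import Fraction


def split_memory(total_gb: int,
                 heap_frac: Fraction = Fraction(1, 3),
                 cache_frac: Fraction = Fraction(1, 3)) -> dict:
    """Split total RAM (in GB) into heap, page cache, and the remainder for the OS."""
    heap_frac, cache_frac = Fraction(heap_frac), Fraction(cache_frac)
    if heap_frac <= 0 or cache_frac <= 0 or heap_frac + cache_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    heap = int(total_gb * heap_frac)     # rounded down to whole gigabytes
    cache = int(total_gb * cache_frac)   # rounded down to whole gigabytes
    return {"heap_gb": heap, "pagecache_gb": cache, "os_gb": total_gb - heap - cache}


if __name__ == "__main__":
    print(split_memory(24))
```

Note that picking the upper bound of both ranges (1/2 heap and 1/2 page cache) is rejected, because nothing would remain for the operating system.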

Query Configuration

# Query execution - number of cached query plans per database
server.db.query_cache_size=1000

# Transaction settings
db.transaction.timeout=60s

# Index sampling - keeps the statistics used by the query planner up to date
db.index_sampling.background_enabled=true
db.index_sampling.update_percentage=5

Network Configuration

# Network connectors
server.bolt.enabled=true
server.bolt.listen_address=0.0.0.0:7687
server.http.enabled=true
server.http.listen_address=0.0.0.0:7474

# Bolt thread pool - worker threads that serve Bolt connections
server.bolt.thread_pool_max_size=200
server.bolt.thread_pool_keep_alive=5m

Backup and Recovery

Protecting your graph data is critical.

Online Backup

Neo4j Enterprise supports online backups of a running database:

# Perform backup (the target directory must already exist)
neo4j-admin database backup neo4j --to-path=/backups/

# Backup with compression
neo4j-admin database backup neo4j --to-path=/backups/ --compress=true

# Incremental (differential) backup on top of an existing full backup in the same directory
neo4j-admin database backup neo4j --to-path=/backups/ --type=DIFF

Scheduled Backups

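#!/bin/bash
# backup.sh
set -euo pipefail

BACKUP_DIR="/backups/neo4j"
DATE=$(date +%Y%m%d_%H%M%S)
TARGET="$BACKUP_DIR/backup_$DATE"

# Create backup (the target directory must exist before the backup runs)
mkdir -p "$TARGET"
neo4j-admin database backup neo4j --to-path="$TARGET"

# Keep only the 7 newest backup directories (full paths, so rm does not depend on the working directory)
ls -1dt "$BACKUP_DIR"/backup_* | tail -n +8 | xargs -r rm -rf --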
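
The retention step is easy to get wrong, so it helps to reason about it as a pure function before deleting anything. The sketch below assumes backup names follow the backup_YYYYMMDD_HHMMSS pattern used in the script, so that sorting the names also sorts them by time; the function name is hypothetical.

```python
def backups_to_delete(names: list[str], keep: int = 7) -> list[str]:
    """Return the backup names that fall outside the newest `keep` entries."""
    if keep < 0:
        raise ValueError("keep must not be negative")
    # Timestamped names sort chronologically, so reverse order is newest first.
    newest_first = sorted(names, reverse=True)
    return newest_first[keep:]


if __name__ == "__main__":
    existing = [f"backup_202401{day:02d}_020000" for day in range(1, 11)]
    print(backups_to_delete(existing))
```

Printing the candidates first, and deleting only after reviewing them, is a cheap safeguard when a retention rule is introduced or changed.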

Restore from Backup

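# Stop Neo4j
systemctl stop neo4j

# Restore database from a backup artifact
# (replace BACKUP_FILE with the artifact name; --overwrite-destination replaces the existing database)
neo4j-admin database restore --from-path=/backups/neo4j/BACKUP_FILE.backup --overwrite-destination=true neo4j

# Start Neo4j
systemctl start neo4j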

Export and Import

// Export to CSV using APOC (requires apoc.export.file.enabled=true in apoc.conf)
CALL apoc.export.csv.all('export.csv', {})

// Export to JSON
CALL apoc.export.json.all('export.json', {})

Monitoring

Effective monitoring ensures system health.

Neo4j Monitoring Endpoint

# Prometheus metrics (Enterprise Edition; requires server.metrics.prometheus.enabled=true)
curl -o metrics.txt http://localhost:2004/metrics

Key metrics include (exact names vary with the Neo4j version and the configured metrics prefix):

  • neo4j_dbms_memory_heap_used_bytes - Heap memory usage
  • neo4j_dbms_memory_pagecache_used_bytes - Page cache usage
  • neo4j_bolt_messages_done_total - Bolt protocol messages
  • neo4j_transaction_started_total - Started transactions
  • neo4j_transaction_committed_total - Committed transactions
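
Once the metrics are saved to a file, a few lines of Python are enough to pull out individual samples. This is a minimal sketch of parsing the Prometheus text format, assuming samples without spaces inside labels; the sample values are made up, and the metric names are taken from the list above and may differ on a real server.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition into a {metric_name: value} mapping."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, rest = line.partition(" ")
        name = name.split("{", 1)[0]  # drop a label set, if any
        try:
            samples[name] = float(rest.split()[0])
        except (ValueError, IndexError):
            continue  # ignore lines that do not carry a numeric sample
    return samples


# Illustrative input only; real output comes from the metrics endpoint.
SAMPLE = """\
# TYPE neo4j_transaction_started_total counter
neo4j_transaction_started_total 120
neo4j_transaction_committed_total 114
"""

if __name__ == "__main__":
    metrics = parse_metrics(SAMPLE)
    ratio = metrics["neo4j_transaction_committed_total"] / metrics["neo4j_transaction_started_total"]
    print(f"commit ratio: {ratio:.2f}")
```

A falling ratio of committed to started transactions is one signal worth alerting on, since it points to rollbacks or timeouts.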

Integration with Prometheus

# prometheus.yml
scrape_configs:
  - job_name: 'neo4j'
    static_configs:
      - targets: ['neo4j:2004']

Neo4j Logs

Monitor various log files:

# Query log - track slow queries
tail -f /var/log/neo4j/query.log

# Debug log - detailed system information
tail -f /var/log/neo4j/debug.log

# GC logs - garbage collection information
tail -f /var/log/neo4j/gc.log

Query Logging

Enable query logging for performance analysis:

# Enable query logging (accepted values: OFF, INFO, VERBOSE)
db.logs.query.enabled=INFO
db.logs.query.threshold=1s
db.logs.query.plan_description_enabled=true

Inspect long-running queries:

// Find the longest-running queries that are executing right now
// (SHOW TRANSACTIONS replaces dbms.listQueries(), which was removed in Neo4j 5)
SHOW TRANSACTIONS
YIELD currentQuery, elapsedTime, cpuTime
RETURN currentQuery, elapsedTime, cpuTime
ORDER BY elapsedTime DESC
LIMIT 10

Security

Secure your Neo4j deployment.

Authentication and Authorization

// Create user
CREATE USER alice SET PASSWORD 'securePassword'

// Set role
GRANT ROLE reader TO alice

// Built-in roles: PUBLIC, reader, editor, publisher, architect, admin

SSL Configuration

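# Enable SSL
server.bolt.tls_level=REQUIRED
server.https.enabled=true

# SSL certificate configuration (file names are relative to base_directory)
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.base_directory=/path/to
dbms.ssl.policy.bolt.public_certificate=cert.pem
dbms.ssl.policy.bolt.private_key=key.pem
# Repeat with dbms.ssl.policy.https.* for the HTTPS connector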

LDAP Integration

# LDAP authentication
dbms.security.auth_enabled=true
dbms.security.authentication_providers=ldap
dbms.security.ldap.host=ldap://ldap.example.com:389
dbms.security.ldap.authentication.user_dn_template=uid={0},dc=example,dc=com

Performance Tuning

Optimize for your workload.

Index Usage

// Explain query plan
EXPLAIN MATCH (p:Person {name: 'Alice'}) RETURN p

// Profile query execution
PROFILE MATCH (p:Person {name: 'Alice'})-[:KNOWS]->(friend) RETURN friend

Connection Pool Tuning

# Bolt thread pool settings (threads that serve client connections)
server.bolt.thread_pool_min_size=5
server.bolt.thread_pool_max_size=200
server.bolt.thread_pool_keep_alive=5m

Caching

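# Increase query plan cache
server.db.query_cache_size=10000

# Graph data is cached in the page cache; the object cache settings
# (cache type "soft" and similar) from early Neo4j versions no longer exist
server.memory.pagecache.size=4G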

High Availability

Configure HA for mission-critical deployments.

Causal Clustering

In causal clustering:

  1. Core servers (primaries in Neo4j 5) - Provide RAFT consensus for transactional guarantees
  2. Read replicas (secondaries in Neo4j 5) - Handle read queries for horizontal scaling

// Check cluster status (Neo4j 5 replaces dbms.cluster.overview() with SHOW commands)
SHOW SERVERS
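
Raft commits a write once a majority of core servers acknowledge it, which determines how many core servers can fail before writes stop. The arithmetic is sketched below in Python; the function names are illustrative.

```python
def quorum(core_servers: int) -> int:
    """Smallest number of core servers that forms a majority."""
    if core_servers < 1:
        raise ValueError("a cluster needs at least one core server")
    return core_servers // 2 + 1


def tolerated_failures(core_servers: int) -> int:
    """Number of core servers that can be lost while a majority remains."""
    return core_servers - quorum(core_servers)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} cores: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Because a fourth core server adds no extra fault tolerance over three, clusters are usually sized with an odd number of core servers.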

Switchover and Failover

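// Leader failover is automatic: the remaining core servers elect a new leader through Raft.
// For planned maintenance, move databases off a server first (Neo4j 5, run against the system database)
DEALLOCATE DATABASES FROM SERVER 'server-id'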

Load Balancing

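# Configure server-side load balancing for routed connections
dbms.routing.load_balancing.plugin=server_policies
dbms.routing.load_balancing.shuffle_enabled=true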

Upgrade Procedures

Keep Neo4j updated:

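# Stop Neo4j
systemctl stop neo4j

# Back up the database (offline dump, because online backup needs a running server)
mkdir -p /backups/pre-upgrade
neo4j-admin database dump neo4j --to-path=/backups/pre-upgrade

# Update Neo4j (Debian package versions carry the 1: epoch prefix)
apt-get update
apt-get install neo4j=1:5.26.0

# Start Neo4j
systemctl start neo4j

# Verify
cypher-shell -u neo4j -p password "CALL dbms.components()"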

Conclusion

Operating Neo4j in production requires attention to deployment, configuration, monitoring, and security. The practices in this article provide a foundation for reliable Neo4j deployments, from single instances to causal clusters. Proper configuration of memory, indexes, and monitoring ensures optimal performance.

In the next article, we’ll explore Neo4j’s internal architecture to understand how it achieves its graph processing capabilities.
