Set up Elasticsearch 8 cross-cluster replication for disaster recovery and high availability

Advanced · 45 min · Apr 14, 2026
Applies to: Ubuntu 24.04, Debian 12, AlmaLinux 9, Rocky Linux 9

Configure Elasticsearch 8 cross-cluster replication (CCR) to replicate indices across multiple clusters for disaster recovery. This tutorial covers security setup, remote cluster connections, replication policies, and automated failover procedures.

Prerequisites

  • At least 2 Linux servers with 4GB RAM each
  • Network connectivity between clusters on ports 9200 and 9300
  • Basic understanding of Elasticsearch concepts
  • Root or sudo access on all servers
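
Before going further, it is worth confirming that ports 9200 and 9300 are actually reachable between the clusters. This sketch uses bash's built-in /dev/tcp pseudo-device, so it needs no extra tools; the IP addresses shown in the usage example are the placeholder addresses used throughout this tutorial, so pass your own.

```shell
#!/bin/bash
# check-ccr-ports.sh -- pre-flight reachability check for the CCR ports.
check_port() {
  local host=$1 port=$2
  # /dev/tcp/<host>/<port> opens a TCP connection from within bash
  if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Check the HTTP (9200) and transport (9300) ports for every host given
for host in "$@"; do
  for port in 9200 9300; do
    printf '%s:%s is %s\n' "$host" "$port" "$(check_port "$host" "$port")"
  done
done
```

Run it as, for example, bash check-ccr-ports.sh 203.0.113.10 203.0.113.20 and make sure every line reports "open" before continuing.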

What this solves

Cross-cluster replication (CCR) in Elasticsearch 8 allows you to replicate indices from a primary cluster to one or more remote clusters for disaster recovery and high availability. This ensures business continuity by maintaining synchronized copies of your data across geographically distributed locations, enabling rapid failover when primary clusters experience outages.

Step-by-step configuration

Update system packages

Start by updating your system packages to ensure you have the latest security patches.

# Debian/Ubuntu
sudo apt update && sudo apt upgrade -y

# RHEL-based (AlmaLinux/Rocky)
sudo dnf update -y

Install Elasticsearch 8 on both clusters

Install Elasticsearch 8 on all nodes that will participate in cross-cluster replication. This includes both primary and secondary clusters.

# Debian/Ubuntu
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update
sudo apt install -y elasticsearch

# RHEL-based (AlmaLinux/Rocky)
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
cat << EOF | sudo tee /etc/yum.repos.d/elasticsearch.repo
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md
EOF
sudo dnf install --enablerepo=elasticsearch -y elasticsearch

Configure primary cluster security

Configure the primary Elasticsearch cluster with TLS encryption and authentication. Replace cluster names and IP addresses with your actual values.

# /etc/elasticsearch/elasticsearch.yml

# Cluster configuration
cluster.name: primary-cluster
node.name: primary-node-1
network.host: 203.0.113.10
http.port: 9200
transport.port: 9300

# Security configuration
xpack.security.enabled: true
xpack.security.enrollment.enabled: true

# Transport layer security
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12

# HTTP layer security
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: elastic-certificates.p12

# Cross-cluster replication settings
xpack.ccr.enabled: true
xpack.ccr.ui.enabled: true

# Discovery settings
discovery.seed_hosts: ["203.0.113.10:9300"]
cluster.initial_master_nodes: ["primary-node-1"]

# Performance settings
bootstrap.memory_lock: true
indices.memory.index_buffer_size: 10%

Generate security certificates

Create TLS certificates for secure communication between clusters. Run this on the primary cluster first.

cd /usr/share/elasticsearch
sudo bin/elasticsearch-certutil ca --out /etc/elasticsearch/elastic-stack-ca.p12 --pass ""
sudo bin/elasticsearch-certutil cert --ca /etc/elasticsearch/elastic-stack-ca.p12 --out /etc/elasticsearch/elastic-certificates.p12 --pass ""
sudo chown elasticsearch:elasticsearch /etc/elasticsearch/elastic-*.p12
sudo chmod 660 /etc/elasticsearch/elastic-*.p12
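
To confirm the bundle was generated correctly, you can inspect it with openssl. The helper below is a small sketch; it assumes the empty password used by the certutil commands above (hence the empty -passin value).

```shell
# show_cert: print the subject and validity window of a PKCS#12 bundle
show_cert() {
  # -clcerts -nokeys extracts only the client certificate, which is then
  # decoded by the x509 subcommand
  openssl pkcs12 -in "$1" -passin pass: -nokeys -clcerts 2>/dev/null \
    | openssl x509 -noout -subject -dates
}

# Example, using the path from the step above:
# show_cert /etc/elasticsearch/elastic-certificates.p12
```

The notBefore/notAfter dates it prints are worth noting down: an expired certificate is a common cause of silent replication failures later on.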

Configure secondary cluster

Set up the secondary cluster that will receive replicated data. Copy the certificates from the primary cluster first.

sudo scp root@203.0.113.10:/etc/elasticsearch/elastic-*.p12 /etc/elasticsearch/
sudo chown elasticsearch:elasticsearch /etc/elasticsearch/elastic-*.p12
sudo chmod 660 /etc/elasticsearch/elastic-*.p12
# /etc/elasticsearch/elasticsearch.yml on the secondary node

# Cluster configuration
cluster.name: secondary-cluster
node.name: secondary-node-1
network.host: 203.0.113.20
http.port: 9200
transport.port: 9300

# Security configuration
xpack.security.enabled: true
xpack.security.enrollment.enabled: true

# Transport layer security
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12

# HTTP layer security
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: elastic-certificates.p12

# Cross-cluster replication settings
xpack.ccr.enabled: true
xpack.ccr.ui.enabled: true

# Discovery settings
discovery.seed_hosts: ["203.0.113.20:9300"]
cluster.initial_master_nodes: ["secondary-node-1"]

# Performance settings
bootstrap.memory_lock: true
indices.memory.index_buffer_size: 10%

Configure system memory settings

Set up memory locking to prevent Elasticsearch from swapping to disk, which improves performance.

# Raise the memlock limits for the elasticsearch user
cat << EOF | sudo tee -a /etc/security/limits.conf
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
EOF

# Add a systemd override so the service inherits the limit
sudo mkdir -p /etc/systemd/system/elasticsearch.service.d/
cat << EOF | sudo tee /etc/systemd/system/elasticsearch.service.d/override.conf
[Service]
LimitMEMLOCK=infinity
EOF
sudo systemctl daemon-reload

Start Elasticsearch clusters

Enable and start Elasticsearch on both primary and secondary clusters.

sudo systemctl enable --now elasticsearch
sudo systemctl status elasticsearch
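
Rather than polling the status output by hand, you can wait for the cluster health API to report green or yellow. This is a sketch: ELASTIC_PASSWORD is a placeholder environment variable for your elastic password, and the URL in the example is the primary node used throughout this tutorial.

```shell
#!/bin/bash
# parse_status: pull the "status" field out of the /_cluster/health JSON
parse_status() {
  sed -n 's/.*"status":"\([a-z]*\)".*/\1/p'
}

# Poll the health endpoint until the cluster is green or yellow
wait_for_cluster() {
  local url=$1 tries=${2:-30} status
  for ((i = 1; i <= tries; i++)); do
    status=$(curl -ks -u "elastic:${ELASTIC_PASSWORD}" "${url}/_cluster/health" | parse_status)
    case "$status" in
      green|yellow) echo "cluster is ${status}"; return 0 ;;
    esac
    sleep 5
  done
  echo "cluster did not become healthy in time" >&2
  return 1
}

# Example:
# wait_for_cluster https://203.0.113.10:9200
```

A single-node cluster will typically settle at yellow, since replica shards have nowhere to allocate; that is expected here.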

Set up built-in user passwords

Configure passwords for built-in Elasticsearch users. Run this on both clusters.

sudo /usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
Note: In Elasticsearch 8, elasticsearch-setup-passwords is deprecated; elasticsearch-reset-password -u elastic is the current equivalent. Use strong passwords and store them securely. You'll need the elastic user credentials for cross-cluster configuration.
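
With the clusters running and credentials set, you can also confirm that the bootstrap.memory_lock setting from earlier actually took effect. The helper below just extracts the mlockall boolean from the nodes API; the address and ELASTIC_PASSWORD variable in the example are placeholders for your environment.

```shell
# mlock_status: pull the "mlockall" boolean out of the _nodes process info
mlock_status() {
  sed -n 's/.*"mlockall":\([a-z]*\).*/\1/p'
}

# Example (substitute your node address and elastic password):
# curl -ks -u "elastic:${ELASTIC_PASSWORD}" \
#   "https://203.0.113.10:9200/_nodes?filter_path=nodes.*.process.mlockall" | mlock_status
```

If this prints false, revisit the limits.conf entries and the systemd override from the memory settings step.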

Create cross-cluster user

Create a dedicated user for cross-cluster replication with minimal required privileges on the primary cluster.

curl -k -u elastic -X POST "https://203.0.113.10:9200/_security/user/ccr_user" -H "Content-Type: application/json" -d'
{
  "password": "YourSecurePassword123!",
  "roles": ["cross_cluster_replication_remote"]
}'
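
It is worth verifying the new user can actually authenticate before wiring it into replication. The helper below extracts the username from the _security/_authenticate response; the host and password in the example are the placeholder values from the step above.

```shell
# es_username: extract the "username" field from an _authenticate response
es_username() {
  sed -n 's/.*"username":"\([^"]*\)".*/\1/p'
}

# Example (address and password from the step above):
# curl -ks -u "ccr_user:YourSecurePassword123!" \
#   "https://203.0.113.10:9200/_security/_authenticate" | es_username
```

If the call returns a security exception instead of the user document, recheck the password and the role assignment before continuing.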

Configure remote cluster connection

Set up the connection from the secondary cluster to the primary cluster using the cross-cluster user credentials.

curl -k -u elastic -X PUT "https://203.0.113.20:9200/_cluster/settings" -H "Content-Type: application/json" -d'
{
  "persistent": {
    "cluster": {
      "remote": {
        "primary-cluster": {
          "mode": "proxy",
          "proxy_address": "203.0.113.10:9300",
          "server_name": "primary-cluster",
          "proxy_socket_connections": 18,
          "skip_unavailable": false
        }
      }
    }
  }
}'

Create sample index on primary cluster

Create a test index with some data on the primary cluster to demonstrate replication.

curl -k -u elastic -X PUT "https://203.0.113.10:9200/logs-production" -H "Content-Type: application/json" -d'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "index.soft_deletes.enabled": true
  },
  "mappings": {
    "properties": {
      "timestamp": {"type": "date"},
      "message": {"type": "text"},
      "level": {"type": "keyword"},
      "service": {"type": "keyword"}
    }
  }
}'

curl -k -u elastic -X POST "https://203.0.113.10:9200/logs-production/_doc" -H "Content-Type: application/json" -d'
{
  "timestamp": "2024-01-15T10:00:00Z",
  "message": "Application started successfully",
  "level": "INFO",
  "service": "web-api"
}'

Set up cross-cluster replication policy

Create a follower index on the secondary cluster that will replicate data from the primary cluster.

curl -k -u elastic -X PUT "https://203.0.113.20:9200/logs-production-follower/_ccr/follow" -H "Content-Type: application/json" -d'
{
  "remote_cluster": "primary-cluster",
  "leader_index": "logs-production",
  "settings": {
    "index.number_of_replicas": 0
  },
  "max_read_request_operation_count": 5120,
  "max_outstanding_read_requests": 12,
  "max_read_request_size": "32mb",
  "max_write_request_operation_count": 5120,
  "max_write_request_size": "9mb",
  "max_outstanding_write_requests": 9,
  "max_write_buffer_count": 2147483647,
  "max_write_buffer_size": "512mb",
  "max_retry_delay": "500ms",
  "read_poll_timeout": "1m"
}'
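
A quick way to confirm the follower has caught up is to compare document counts on both sides. This sketch parses the _count response; the addresses are the example IPs from this tutorial and ELASTIC_PASSWORD is a placeholder variable.

```shell
# count_docs: pull the numeric "count" field out of an _count response
count_docs() {
  sed -n 's/.*"count":\([0-9]*\).*/\1/p'
}

# Compare leader and follower document counts:
# leader=$(curl -ks -u "elastic:${ELASTIC_PASSWORD}" \
#   "https://203.0.113.10:9200/logs-production/_count" | count_docs)
# follower=$(curl -ks -u "elastic:${ELASTIC_PASSWORD}" \
#   "https://203.0.113.20:9200/logs-production-follower/_count" | count_docs)
# echo "leader=${leader} follower=${follower}"
```

The counts may lag by a few seconds under write load; a persistent gap points at the replication settings or network latency.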

Create auto-follow pattern

Set up automatic replication for new indices matching a specific pattern.

curl -k -u elastic -X PUT "https://203.0.113.20:9200/_ccr/auto_follow/logs_pattern" -H "Content-Type: application/json" -d'
{
  "remote_cluster": "primary-cluster",
  "leader_index_patterns": ["logs-*"],
  "follow_index_pattern": "{{leader_index}}-follower",
  "settings": {
    "index.number_of_replicas": 0
  },
  "max_read_request_operation_count": 5120,
  "max_outstanding_read_requests": 12,
  "max_read_request_size": "32mb",
  "max_write_request_operation_count": 5120,
  "max_write_request_size": "9mb",
  "max_outstanding_write_requests": 9,
  "max_write_buffer_count": 2147483647,
  "max_write_buffer_size": "512mb",
  "max_retry_delay": "500ms",
  "read_poll_timeout": "1m"
}'
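
After creating the pattern, a GET on the _ccr/auto_follow endpoint lists what is configured. The helper below pulls pattern names out of that response for quick scripting; the address in the example is the tutorial's placeholder secondary node.

```shell
# pattern_names: extract auto-follow pattern names from a _ccr/auto_follow response
pattern_names() {
  sed -n 's/.*"name":"\([^"]*\)".*/\1/p'
}

# Example:
# curl -ks -u "elastic:${ELASTIC_PASSWORD}" \
#   "https://203.0.113.20:9200/_ccr/auto_follow" | pattern_names
```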

Configure monitoring and alerting

Set up monitoring scripts to track replication status and detect failures automatically.

#!/bin/bash
# /usr/local/bin/check-ccr-status.sh

# Configuration
SECONDARY_CLUSTER="https://203.0.113.20:9200"
USERNAME="elastic"
PASSWORD="YourElasticPassword"
ALERT_EMAIL="admin@example.com"
LOG_FILE="/var/log/elasticsearch/ccr-monitor.log"

# Function to log with timestamp
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

# Check follower index stats
check_follower_stats() {
    local response
    response=$(curl -k -s -u "$USERNAME:$PASSWORD" "$SECONDARY_CLUSTER/_ccr/stats")
    if [[ $? -ne 0 ]]; then
        log_message "ERROR: Failed to retrieve CCR stats"
        return 1
    fi

    local failed_operations
    failed_operations=$(echo "$response" | jq -r '.follow_stats.indices[].shards[].failed_read_requests // 0' | awk '{sum+=$1} END {print sum+0}')

    if [[ $failed_operations -gt 0 ]]; then
        log_message "WARNING: $failed_operations failed read requests detected"
        echo "CCR replication errors detected on secondary cluster" | mail -s "Elasticsearch CCR Alert" "$ALERT_EMAIL"
        return 1
    fi

    log_message "INFO: CCR status check passed"
    return 0
}

# Check remote cluster connectivity
check_remote_connectivity() {
    local response
    response=$(curl -k -s -u "$USERNAME:$PASSWORD" "$SECONDARY_CLUSTER/_remote/info")
    if [[ $? -ne 0 ]]; then
        log_message "ERROR: Failed to check remote cluster connectivity"
        return 1
    fi

    local connected
    connected=$(echo "$response" | jq -r '."primary-cluster".connected')

    if [[ "$connected" != "true" ]]; then
        log_message "ERROR: Remote cluster 'primary-cluster' is not connected"
        echo "Primary cluster connectivity lost" | mail -s "Elasticsearch CCR Connection Alert" "$ALERT_EMAIL"
        return 1
    fi

    log_message "INFO: Remote cluster connectivity check passed"
    return 0
}

# Main execution
main() {
    log_message "Starting CCR health check"

    check_remote_connectivity
    connectivity_status=$?

    check_follower_stats
    stats_status=$?

    if [[ $connectivity_status -eq 0 && $stats_status -eq 0 ]]; then
        log_message "SUCCESS: All CCR health checks passed"
        exit 0
    else
        log_message "FAILURE: One or more CCR health checks failed"
        exit 1
    fi
}

main
sudo chmod +x /usr/local/bin/check-ccr-status.sh
sudo mkdir -p /var/log/elasticsearch
sudo chown elasticsearch:elasticsearch /var/log/elasticsearch

Set up automated monitoring

Create a systemd timer to run CCR health checks every 5 minutes.

# /etc/systemd/system/ccr-monitor.service
[Unit]
Description=Elasticsearch Cross-Cluster Replication Monitor
After=network.target

[Service]
Type=oneshot
User=elasticsearch
Group=elasticsearch
ExecStart=/usr/local/bin/check-ccr-status.sh
StandardOutput=journal
StandardError=journal

# /etc/systemd/system/ccr-monitor.timer
[Unit]
Description=Run CCR Monitor every 5 minutes
Requires=ccr-monitor.service

[Timer]
OnCalendar=*:0/5
Persistent=true

[Install]
WantedBy=timers.target
sudo systemctl daemon-reload
sudo systemctl enable --now ccr-monitor.timer
sudo systemctl status ccr-monitor.timer

Configure failover procedures

Create scripts to handle automatic failover when the primary cluster becomes unavailable.

#!/bin/bash
# /usr/local/bin/failover-to-secondary.sh

# Configuration
SECONDARY_CLUSTER="https://203.0.113.20:9200"
USERNAME="elastic"
PASSWORD="YourElasticPassword"
LOG_FILE="/var/log/elasticsearch/failover.log"

# Function to log with timestamp
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

# Promote follower indices to regular indices
promote_follower_indices() {
    local indices
    indices=$(curl -k -s -u "$USERNAME:$PASSWORD" "$SECONDARY_CLUSTER/_cat/indices/*-follower?h=index&format=json" | jq -r '.[].index')

    for index in $indices; do
        log_message "Promoting follower index: $index"

        # Pause following
        curl -k -s -u "$USERNAME:$PASSWORD" -X POST "$SECONDARY_CLUSTER/$index/_ccr/pause_follow"

        # The unfollow API requires a closed index, so close it first
        curl -k -s -u "$USERNAME:$PASSWORD" -X POST "$SECONDARY_CLUSTER/$index/_close"

        # Unfollow the index, converting it into a regular index
        curl -k -s -u "$USERNAME:$PASSWORD" -X POST "$SECONDARY_CLUSTER/$index/_ccr/unfollow"

        # Reopen the index for reads and writes
        curl -k -s -u "$USERNAME:$PASSWORD" -X POST "$SECONDARY_CLUSTER/$index/_open"

        # Create alias without '-follower' suffix
        local alias_name=${index%-follower}
        curl -k -s -u "$USERNAME:$PASSWORD" -X POST "$SECONDARY_CLUSTER/_aliases" -H "Content-Type: application/json" -d"
        {
          \"actions\": [
            { \"add\": { \"index\": \"$index\", \"alias\": \"$alias_name\" } }
          ]
        }"

        log_message "Successfully promoted $index to $alias_name"
    done
}

# Main execution
log_message "Starting failover procedure"
promote_follower_indices
log_message "Failover procedure completed"
sudo chmod +x /usr/local/bin/failover-to-secondary.sh
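
The ${index%-follower} expansion in the script strips the suffix so the alias matches the original leader index name. A tiny sketch to preview which alias each follower would be promoted to, before running the real failover:

```shell
# preview_promotion: show the alias each follower index would map to,
# mirroring the ${index%-follower} suffix-strip in the failover script
preview_promotion() {
  local index
  for index in "$@"; do
    printf '%s -> %s\n' "$index" "${index%-follower}"
  done
}

# Example (index names are illustrative):
# preview_promotion logs-production-follower
```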

Verify your setup

Test the cross-cluster replication configuration and verify data synchronization.

# Check remote cluster connection
curl -k -u elastic "https://203.0.113.20:9200/_remote/info"

# Verify follower index exists
curl -k -u elastic "https://203.0.113.20:9200/_cat/indices/*-follower"

# Check CCR stats
curl -k -u elastic "https://203.0.113.20:9200/_ccr/stats"

# Test data replication by adding data to primary
curl -k -u elastic -X POST "https://203.0.113.10:9200/logs-production/_doc" -H "Content-Type: application/json" -d'
{
  "timestamp": "2024-01-15T10:05:00Z",
  "message": "Test replication message",
  "level": "INFO",
  "service": "test"
}'

# Verify data appears on secondary (wait a few seconds)
sleep 10
curl -k -u elastic "https://203.0.113.20:9200/logs-production-follower/_search?q=test"

# Test monitoring script
sudo -u elasticsearch /usr/local/bin/check-ccr-status.sh

Common issues

Symptom | Cause | Fix
Remote cluster connection fails | TLS certificate mismatch or network connectivity problems | Verify certificates are identical on both clusters and check firewall rules for port 9300
Follower index creation fails | Insufficient privileges or leader index doesn't exist | Ensure the cross-cluster user has the cross_cluster_replication_remote role and that the leader index exists
Replication lag increases | Network latency or resource constraints | Increase CCR buffer sizes and check network latency between clusters
Authentication errors during replication | Password expiration or user permissions | Update the cross-cluster user credentials and verify role assignments
Monitoring script fails | Missing dependencies or incorrect permissions | Install the jq package and ensure the elasticsearch user can execute the monitoring scripts
