Set up Elasticsearch 8 with hot-warm-cold node architecture and automated index lifecycle management policies to optimize storage costs and query performance. Configure ILM policies that automatically move data through different tiers based on age and usage patterns.
What this solves
Elasticsearch clusters accumulate massive amounts of data over time, making storage expensive and queries slower. Index Lifecycle Management (ILM) with a hot-warm-cold architecture automatically moves older data to cheaper storage tiers while keeping recent data on fast nodes. For log-heavy workloads this can cut storage costs substantially (figures in the 60-80% range are commonly cited) while maintaining query performance for active data.
Prerequisites
- Elasticsearch 8.x cluster already running with at least 3 nodes (see our Elasticsearch 8 installation guide)
- 8GB RAM minimum per node
- Tiered storage: fast SSD for hot nodes, standard SSD for warm, spinning disks for cold
- Root or sudo access on each node
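The age-based tiering works like a simple lookup rule. A minimal sketch (the `tier_for_age` helper is illustrative; the 7/30/365-day thresholds mirror the logs policy configured later in this guide):

```shell
# Illustrative mapping from index age to lifecycle tier, using the
# thresholds of the logs policy in this guide (7d warm, 30d cold, 365d delete).
tier_for_age() {
  age_days="$1"
  if [ "$age_days" -ge 365 ]; then echo "delete"
  elif [ "$age_days" -ge 30 ]; then echo "cold"
  elif [ "$age_days" -ge 7 ]; then echo "warm"
  else echo "hot"
  fi
}

tier_for_age 3     # hot    - still being written and queried heavily
tier_for_age 10    # warm   - read-only, occasional queries
tier_for_age 45    # cold   - rarely queried, cheap storage
tier_for_age 400   # delete - past retention
```

In the real cluster, ILM performs these transitions for you; the sketch only shows where a given index ends up at a given age.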
Step-by-step configuration
Configure hot node settings
Hot nodes handle active indexing and recent data queries. They need fast CPUs and SSDs.
# Hot node configuration
node.name: es-hot-01
node.roles: [ master, data_hot, data_content, ingest ]
node.attr.data_tier: hot
node.attr.box_type: hot
# Performance settings for hot nodes
path.data: /var/lib/elasticsearch
bootstrap.memory_lock: true
cluster.name: production-cluster
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Discovery settings
discovery.seed_hosts: ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
cluster.initial_master_nodes: ["es-hot-01", "es-warm-01", "es-cold-01"]
# Security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
Configure warm node settings
Warm nodes store less frequently accessed data and handle occasional queries.
# Warm node configuration
node.name: es-warm-01
node.roles: [ master, data_warm ]
node.attr.data_tier: warm
node.attr.box_type: warm
# Warm nodes can have less memory allocated
path.data: /var/lib/elasticsearch
bootstrap.memory_lock: true
cluster.name: production-cluster
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Discovery settings
discovery.seed_hosts: ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
cluster.initial_master_nodes: ["es-hot-01", "es-warm-01", "es-cold-01"]
# Security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
Configure cold node settings
Cold nodes store archived data with slower access times but much lower storage costs.
# Cold node configuration
node.name: es-cold-01
node.roles: [ master, data_cold ]
node.attr.data_tier: cold
node.attr.box_type: cold
# Cold storage optimization
path.data: /var/lib/elasticsearch
bootstrap.memory_lock: true
cluster.name: production-cluster
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Less aggressive caching for cold nodes
indices.queries.cache.size: 5%
indices.fielddata.cache.size: 10%
# Discovery settings
discovery.seed_hosts: ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
cluster.initial_master_nodes: ["es-hot-01", "es-warm-01", "es-cold-01"]
# Security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
Set JVM heap sizes per node type
Configure appropriate heap sizes based on node roles and available memory.
# Hot nodes - 50% of available RAM, max 31GB
-Xms4g
-Xmx4g
# Warm/cold nodes - can use less heap
-Xms2g
-Xmx2g
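The heap figures above follow the usual rule of thumb: half the node's RAM, capped at roughly 31GB so the JVM keeps compressed object pointers. A sketch of that rule (`heap_for_ram_mb` is an illustrative helper; sizes in MB):

```shell
# Pick a heap size: min(RAM / 2, 31 GB). Below ~32GB the JVM can use
# compressed oops, so heaps just above that threshold waste memory.
heap_for_ram_mb() {
  ram_mb="$1"
  half=$(( ram_mb / 2 ))
  cap=$(( 31 * 1024 ))
  if [ "$half" -gt "$cap" ]; then half=$cap; fi
  echo "${half}m"
}

heap_for_ram_mb 8192     # 4096m  - matches the -Xms4g/-Xmx4g hot-node setting
heap_for_ram_mb 131072   # 31744m - capped even on a 128GB machine
```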
Restart Elasticsearch on all nodes
Apply the configuration changes by restarting each node one at a time.
# Restart hot nodes first
sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch
# Wait 30 seconds, then restart warm nodes
sleep 30
sudo systemctl restart elasticsearch
# Wait 30 seconds, then restart cold nodes
sleep 30
sudo systemctl restart elasticsearch
Create ILM policy for log data
Define a lifecycle policy that moves data through hot-warm-cold phases automatically. With security enabled as configured above, the curl commands in this guide also need credentials and TLS options, e.g. -u elastic:<password> --cacert /etc/elasticsearch/certs/http_ca.crt against https://localhost:9200.
curl -X PUT "localhost:9200/_ilm/policy/logs-policy" -H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_primary_shard_size": "10gb",
"max_age": "7d"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "7d",
"actions": {
"migrate": {
"enabled": true
},
"forcemerge": {
"max_num_segments": 1
},
"shrink": {
"number_of_shards": 1
},
"set_priority": {
"priority": 50
}
}
},
"cold": {
"min_age": "30d",
"actions": {
"migrate": {
"enabled": true
},
"set_priority": {
"priority": 0
}
}
},
"delete": {
"min_age": "365d",
"actions": {
"delete": {}
}
}
}
}
}'
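Phase min_age values must ascend (here 0ms → 7d → 30d → 365d) or ILM will reject the policy. A small sanity-check sketch (`age_to_hours` is an illustrative helper for "Nd"/"Nh" duration strings, not an Elasticsearch tool):

```shell
# Convert ILM min_age strings like "7d" or "12h" to hours so phase
# ordering can be checked before uploading a policy.
age_to_hours() {
  v="${1%?}"              # numeric part, e.g. "7" from "7d"
  unit="${1#"$v"}"        # unit suffix, "d" or "h"
  case "$unit" in
    d) echo $(( v * 24 )) ;;
    h) echo "$v" ;;
    *) echo "unsupported unit: $unit" >&2; return 1 ;;
  esac
}

warm=$(age_to_hours 7d)    # 168
cold=$(age_to_hours 30d)   # 720
del=$(age_to_hours 365d)   # 8760
[ "$warm" -lt "$cold" ] && [ "$cold" -lt "$del" ] && echo "phase ages ascend: ok"
```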
Create index template with ILM integration
Set up an index template that automatically applies the ILM policy to new indices.
curl -X PUT "localhost:9200/_index_template/logs-template" -H 'Content-Type: application/json' -d'
{
"index_patterns": ["logs-*"],
"data_stream": {},
"template": {
"settings": {
"index": {
"lifecycle": {
"name": "logs-policy"
},
"number_of_shards": 2,
"number_of_replicas": 1,
"refresh_interval": "30s"
}
},
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"level": {
"type": "keyword"
},
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"host": {
"type": "keyword"
}
}
}
},
"priority": 200
}'
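Once the template is in place, each rollover creates a new backing index behind the data stream, named .ds-&lt;stream&gt;-&lt;yyyy.MM.dd&gt;-&lt;generation&gt; with a six-digit generation counter. A sketch of the naming scheme (`backing_index` is an illustrative helper):

```shell
# Data stream backing indices follow .ds-<stream>-<yyyy.MM.dd>-<generation>;
# the generation is zero-padded to six digits and increments on each rollover.
backing_index() {
  printf '.ds-%s-%s-%06d\n' "$1" "$2" "$3"
}

backing_index logs-application 2024.01.15 1   # .ds-logs-application-2024.01.15-000001
backing_index logs-application 2024.01.22 2   # .ds-logs-application-2024.01.22-000002
```

Knowing this pattern helps when reading `_cat/indices` output: ILM moves these backing indices between tiers, not the data stream itself.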
Create metrics ILM policy
Configure a separate policy for metrics data with different retention periods.
curl -X PUT "localhost:9200/_ilm/policy/metrics-policy" -H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_primary_shard_size": "5gb",
"max_age": "1d"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "3d",
"actions": {
"migrate": {
"enabled": true
},
"forcemerge": {
"max_num_segments": 1
},
"set_priority": {
"priority": 50
}
}
},
"cold": {
"min_age": "15d",
"actions": {
"migrate": {
"enabled": true
},
"set_priority": {
"priority": 0
}
}
},
"delete": {
"min_age": "90d",
"actions": {
"delete": {}
}
}
}
}
}'
Create data stream for testing
Create a data stream to test the ILM policy functionality.
# Create the data stream
curl -X PUT "localhost:9200/_data_stream/logs-application"
# Add some test data
curl -X POST "localhost:9200/logs-application/_doc" -H 'Content-Type: application/json' -d'
{
"@timestamp": "2024-01-15T10:30:00Z",
"level": "INFO",
"message": "Application started successfully",
"host": "web-01"
}'
curl -X POST "localhost:9200/logs-application/_doc" -H 'Content-Type: application/json' -d'
{
"@timestamp": "2024-01-15T10:31:00Z",
"level": "ERROR",
"message": "Database connection failed",
"host": "web-02"
}'
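Instead of one curl per document, the same test data can be sent in a single bulk request. A sketch (data streams only accept the create action in _bulk; the body is newline-delimited JSON with a required trailing newline, and the /tmp/bulk.ndjson path is arbitrary):

```shell
# Build a bulk payload for the logs-application data stream.
# Each document is preceded by a {"create":{}} action line.
cat > /tmp/bulk.ndjson <<'EOF'
{"create":{}}
{"@timestamp":"2024-01-15T10:30:00Z","level":"INFO","message":"Application started successfully","host":"web-01"}
{"create":{}}
{"@timestamp":"2024-01-15T10:31:00Z","level":"ERROR","message":"Database connection failed","host":"web-02"}
EOF

# Send it (note the NDJSON content type, not application/json):
# curl -X POST "localhost:9200/logs-application/_bulk" \
#   -H 'Content-Type: application/x-ndjson' --data-binary @/tmp/bulk.ndjson
```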
Configure allocation balancing
In Elasticsearch 8, movement between hot, warm, and cold nodes is handled by each index's _tier_preference setting, which the ILM migrate action updates automatically, so no allocation-awareness attributes are needed for tiering (awareness attributes are meant for zone or rack awareness). You can still tune how shards are balanced across the nodes within each tier:
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
"persistent": {
"cluster.routing.allocation.balance.shard": 0.45,
"cluster.routing.allocation.balance.index": 0.55
}
}'
Install Kibana for monitoring
Install Kibana to visualize ILM policy execution and cluster health.
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update
sudo apt install -y kibana
Configure Kibana for ILM monitoring
Set up Kibana to connect to your Elasticsearch cluster and monitor ILM policies.
# Kibana configuration
server.port: 5601
server.host: "0.0.0.0"
server.name: "kibana-server"
# Elasticsearch configuration
elasticsearch.hosts: ["https://localhost:9200"]
elasticsearch.username: "kibana_system"
elasticsearch.password: "your-kibana-password"
# Security settings
elasticsearch.ssl.certificateAuthorities: ["/etc/elasticsearch/certs/http_ca.crt"]
server.ssl.enabled: true
server.ssl.certificate: "/etc/kibana/certs/kibana.crt"
server.ssl.key: "/etc/kibana/certs/kibana.key"
# Monitoring settings
monitoring.ui.ccs.enabled: false
monitoring.ui.enabled: true
Start and enable Kibana
Enable Kibana to start automatically and launch the service.
sudo systemctl daemon-reload
sudo systemctl enable kibana
sudo systemctl start kibana
sudo systemctl status kibana
Verify your setup
Check that your hot-warm-cold architecture and ILM policies are working correctly.
# Check cluster health and node roles
curl -X GET "localhost:9200/_cluster/health?pretty"
curl -X GET "localhost:9200/_cat/nodes?v&h=name,node.role,heap.percent,disk.used_percent"
# Verify ILM policies
curl -X GET "localhost:9200/_ilm/policy?pretty"
# Check data stream and index allocation
curl -X GET "localhost:9200/_cat/indices?v&h=index,health,pri,rep,store.size,pri.store.size"
# View ILM policy status
curl -X GET "localhost:9200/logs-application/_ilm/explain?pretty"
# Check shard allocation across tiers
curl -X GET "localhost:9200/_cat/allocation?v"
Open https://your-server-ip:5601 in a browser to access the Kibana interface. Log in with the elastic superuser credentials and navigate to Stack Management > Index Lifecycle Policies to monitor your ILM policies.
Monitor ILM performance with Kibana dashboards
Create ILM monitoring dashboard
Set up Kibana visualizations to track ILM policy execution and data tier usage.
# Apply the lifecycle policy to monitoring indices
# (the legacy _template API is deprecated in 8.x; use composable templates)
PUT _index_template/ilm-monitoring
{
  "index_patterns": [".monitoring-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-policy"
    }
  }
}
# Query to check phase transitions
GET _cat/indices?v&h=index,pri,rep,health,status,pri.store.size&s=pri.store.size:desc
# Monitor ILM execution status
GET _ilm/status
# Check specific index ILM explain
GET logs-application/_ilm/explain
Set up Watcher alerts for ILM issues
Configure automatic alerts when ILM policies fail. ILM records every step it executes in the ilm-history-* data stream; entries with success: false indicate failures. The email action below assumes an email account is configured in elasticsearch.yml.
curl -X PUT "localhost:9200/_watcher/watch/ilm-failure-alert" -H 'Content-Type: application/json' -d'
{
"trigger": {
"schedule": {
"interval": "1h"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": ["ilm-history-*"],
"body": {
"query": {
"bool": {
"must": [
{
"range": {
"@timestamp": {
"gte": "now-1h"
}
}
}
],
"filter": [
{
"term": {
"success": false
}
}
]
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"actions": {
"send_email": {
"email": {
"profile": "standard",
"to": ["admin@example.com"],
"subject": "ILM Policy Failure Alert",
"body": "ILM policies have failed for {{ctx.payload.hits.total}} indices. Please check the cluster."
}
}
}
}'
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Indices stuck in hot phase | Insufficient warm/cold nodes | Add more warm/cold nodes or adjust allocation settings |
| ILM policy not executing | ILM disabled or insufficient permissions | Run PUT _ilm/start and check user permissions |
| Shards not moving to warm tier | No nodes with the warm role | Verify nodes have the data_warm role and the migrate action is enabled |
| High disk usage on hot nodes | Rollover conditions not met | Adjust max_primary_shard_size or max_age in policy |
| Search performance degraded | Too many segments in warm/cold | Enable forcemerge action in warm phase |
| Node allocation warnings | Conflicting allocation filters | Check index.routing.allocation settings and _tier_preference on affected indices |
Next steps
- Set up cross-cluster replication for disaster recovery
- Monitor your Elasticsearch cluster with Prometheus
- Implement advanced security and encryption
- Optimize indexing performance for high-volume data
- Set up automated snapshots and disaster recovery
Running this in production?
Automated install script
Run this on each node to apply its tier-specific configuration (the script configures one node per run)
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
readonly RED='\033[0;31m'
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly NC='\033[0m'
# Configuration variables
CLUSTER_NAME="${1:-production-cluster}"
NODE_TYPE="${2:-hot}"
NODE_IPS="${3:-10.0.1.10,10.0.1.11,10.0.1.12}"
usage() {
echo "Usage: $0 [cluster_name] [node_type] [node_ips_comma_separated]"
echo " cluster_name: Name of the Elasticsearch cluster (default: production-cluster)"
echo " node_type: hot, warm, or cold (default: hot)"
echo " node_ips: Comma-separated list of node IPs (default: 10.0.1.10,10.0.1.11,10.0.1.12)"
exit 1
}
log_info() { echo -e "${GREEN}$1${NC}"; }
log_warn() { echo -e "${YELLOW}$1${NC}"; }
log_error() { echo -e "${RED}$1${NC}" >&2; }
cleanup() {
log_error "[CLEANUP] Installation failed. Check logs above."
if systemctl is-active --quiet elasticsearch 2>/dev/null; then
systemctl stop elasticsearch
fi
}
trap cleanup ERR
# Validate node type
if [[ ! "$NODE_TYPE" =~ ^(hot|warm|cold)$ ]]; then
log_error "Invalid node type: $NODE_TYPE. Must be hot, warm, or cold."
usage
fi
# Check prerequisites
echo "[1/8] Checking prerequisites..."
if [[ $EUID -ne 0 ]]; then
log_error "This script must be run as root"
exit 1
fi
if ! command -v curl &> /dev/null; then
log_error "curl is required but not installed"
exit 1
fi
# Detect distribution
echo "[2/8] Detecting distribution..."
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_JVM_CONFIG="$ES_CONFIG_DIR/jvm.options"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_JVM_CONFIG="$ES_CONFIG_DIR/jvm.options"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_JVM_CONFIG="$ES_CONFIG_DIR/jvm.options"
;;
*)
log_error "Unsupported distribution: $ID"
exit 1
;;
esac
log_info "Detected: $PRETTY_NAME"
else
log_error "Cannot detect distribution"
exit 1
fi
# Check if Elasticsearch is already installed
echo "[3/8] Checking Elasticsearch installation..."
if ! systemctl list-unit-files elasticsearch.service &>/dev/null; then
log_error "Elasticsearch is not installed. Please install Elasticsearch 8.x first."
exit 1
fi
# Stop Elasticsearch for configuration
echo "[4/8] Stopping Elasticsearch for configuration..."
systemctl stop elasticsearch
# Configure node-specific settings
echo "[5/8] Configuring $NODE_TYPE node settings..."
# Convert comma-separated IPs to array format for discovery.seed_hosts
IFS=',' read -ra IP_ARRAY <<< "$NODE_IPS"
SEED_HOSTS=""
for ip in "${IP_ARRAY[@]}"; do
SEED_HOSTS="$SEED_HOSTS\"$ip\", "
done
SEED_HOSTS="[${SEED_HOSTS%, }]"
# Create main configuration
cat > $ES_CONFIG_DIR/elasticsearch.yml << EOF
# Cluster configuration
cluster.name: $CLUSTER_NAME
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Node configuration
node.name: es-$NODE_TYPE-$(hostname)
EOF
# Add node-specific roles and attributes
case "$NODE_TYPE" in
hot)
cat >> $ES_CONFIG_DIR/elasticsearch.yml << EOF
node.roles: [ master, data_hot, data_content, ingest ]
node.attr.data_tier: hot
node.attr.box_type: hot
EOF
;;
warm)
cat >> $ES_CONFIG_DIR/elasticsearch.yml << EOF
node.roles: [ master, data_warm ]
node.attr.data_tier: warm
node.attr.box_type: warm
EOF
;;
cold)
cat >> $ES_CONFIG_DIR/elasticsearch.yml << EOF
node.roles: [ master, data_cold ]
node.attr.data_tier: cold
node.attr.box_type: cold
# Cold storage optimization
indices.queries.cache.size: 5%
indices.fielddata.cache.size: 10%
EOF
;;
esac
# Add common settings
cat >> $ES_CONFIG_DIR/elasticsearch.yml << EOF
# Path settings
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
# Memory settings
bootstrap.memory_lock: true
# Discovery settings
discovery.seed_hosts: $SEED_HOSTS
# cluster.initial_master_nodes must list the node.name of every master-eligible
# node; edit this line to match your actual node names before first bootstrap
cluster.initial_master_nodes: ["es-hot-01", "es-warm-01", "es-cold-01"]
# Security settings
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
EOF
# Configure JVM heap sizes
echo "[6/8] Configuring JVM heap sizes..."
case "$NODE_TYPE" in
hot)
HEAP_SIZE="4g"
;;
warm|cold)
HEAP_SIZE="2g"
;;
esac
# Update JVM options
sed -i '/^-Xms/d; /^-Xmx/d' $ES_JVM_CONFIG
echo "-Xms$HEAP_SIZE" >> $ES_JVM_CONFIG
echo "-Xmx$HEAP_SIZE" >> $ES_JVM_CONFIG
# Set proper ownership and permissions
chown -R elasticsearch:elasticsearch $ES_CONFIG_DIR
chmod 750 $ES_CONFIG_DIR
chmod 640 $ES_CONFIG_DIR/elasticsearch.yml
chmod 640 $ES_JVM_CONFIG
# Configure system limits
echo "[7/8] Configuring system limits..."
cat > /etc/security/limits.d/elasticsearch.conf << EOF
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
elasticsearch soft nofile 65536
elasticsearch hard nofile 65536
EOF
# Enable memory locking in systemd
mkdir -p /etc/systemd/system/elasticsearch.service.d
cat > /etc/systemd/system/elasticsearch.service.d/override.conf << EOF
[Service]
LimitMEMLOCK=infinity
EOF
systemctl daemon-reload
# Start and enable Elasticsearch
echo "[8/8] Starting Elasticsearch..."
systemctl enable elasticsearch
systemctl start elasticsearch
# Wait for service to start
sleep 10
# Verify installation
echo "Verifying installation..."
if systemctl is-active --quiet elasticsearch; then
log_info "✓ Elasticsearch service is running"
else
log_error "✗ Elasticsearch service failed to start"
exit 1
fi
# Check if port is listening
if netstat -tuln 2>/dev/null | grep -q ":9200" || ss -tuln 2>/dev/null | grep -q ":9200"; then
log_info "✓ Elasticsearch is listening on port 9200"
else
log_warn "! Elasticsearch may not be listening on port 9200 yet (check logs)"
fi
log_info "Elasticsearch $NODE_TYPE node configuration completed successfully!"
log_info "Cluster name: $CLUSTER_NAME"
log_info "Node type: $NODE_TYPE"
log_info "Config file: $ES_CONFIG_DIR/elasticsearch.yml"
log_info "JVM config: $ES_JVM_CONFIG"
log_warn "Note: Configure SSL certificates and set up passwords before production use"
log_warn "Check logs: journalctl -u elasticsearch"
Review the script before running. Execute as root, e.g.: sudo bash install.sh production-cluster hot 10.0.1.10,10.0.1.11,10.0.1.12