Set up Elasticsearch 8 with hot-warm-cold node architecture and automated index lifecycle management policies to optimize storage costs and query performance. Configure ILM policies that automatically move data through different tiers based on age and usage patterns.
What this solves
Elasticsearch clusters accumulate massive amounts of data over time, making storage expensive and queries slower. Index Lifecycle Management (ILM) with a hot-warm-cold architecture automatically moves older data to cheaper storage tiers while keeping recent data on fast nodes. For log-heavy workloads this can cut storage costs substantially (figures in the 60-80% range are commonly cited) while maintaining query performance for active data.
Prerequisites
- Elasticsearch 8.x cluster already running with at least 3 nodes (see our Elasticsearch 8 installation guide)
- 8GB RAM minimum per node
- Tiered storage: fast SSD for hot nodes, standard SSD for warm, spinning disks for cold
- Root or sudo access on each node
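The age-based tiering works like a simple lookup rule. A minimal sketch (the `tier_for_age` helper is illustrative; the 7/30/365-day thresholds mirror the logs policy configured later in this guide):

```shell
# Illustrative mapping from index age to lifecycle tier, using the
# thresholds of the logs policy in this guide (7d warm, 30d cold, 365d delete).
tier_for_age() {
  age_days="$1"
  if [ "$age_days" -ge 365 ]; then echo "delete"
  elif [ "$age_days" -ge 30 ]; then echo "cold"
  elif [ "$age_days" -ge 7 ]; then echo "warm"
  else echo "hot"
  fi
}

tier_for_age 3     # hot    - still being written and queried heavily
tier_for_age 10    # warm   - read-only, occasional queries
tier_for_age 45    # cold   - rarely queried, cheap storage
tier_for_age 400   # delete - past retention
```

In the real cluster, ILM performs these transitions for you; the sketch only shows where a given index ends up at a given age.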
Step-by-step configuration
Configure hot node settings
Hot nodes handle active indexing and recent data queries. They need fast CPUs and SSDs.
# Hot node configuration
node.name: es-hot-01
node.roles: [ master, data_hot, data_content, ingest ]
node.attr.data_tier: hot
node.attr.box_type: hot
# Performance settings for hot nodes
path.data: /var/lib/elasticsearch
bootstrap.memory_lock: true
cluster.name: production-cluster
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Discovery settings
discovery.seed_hosts: ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
cluster.initial_master_nodes: ["es-hot-01", "es-warm-01", "es-cold-01"]
# Security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
Configure warm node settings
Warm nodes store less frequently accessed data and handle occasional queries.
# Warm node configuration
node.name: es-warm-01
node.roles: [ master, data_warm ]
node.attr.data_tier: warm
node.attr.box_type: warm
# Warm nodes can have less memory allocated
path.data: /var/lib/elasticsearch
bootstrap.memory_lock: true
cluster.name: production-cluster
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Discovery settings
discovery.seed_hosts: ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
cluster.initial_master_nodes: ["es-hot-01", "es-warm-01", "es-cold-01"]
# Security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
Configure cold node settings
Cold nodes store archived data with slower access times but much lower storage costs.
# Cold node configuration
node.name: es-cold-01
node.roles: [ master, data_cold ]
node.attr.data_tier: cold
node.attr.box_type: cold
# Cold storage optimization
path.data: /var/lib/elasticsearch
bootstrap.memory_lock: true
cluster.name: production-cluster
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Less aggressive caching for cold nodes
indices.queries.cache.size: 5%
indices.fielddata.cache.size: 10%
# Discovery settings
discovery.seed_hosts: ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
cluster.initial_master_nodes: ["es-hot-01", "es-warm-01", "es-cold-01"]
# Security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
Set JVM heap sizes per node type
Configure appropriate heap sizes based on node roles and available memory.
# Hot nodes - 50% of available RAM, max 31GB
-Xms4g
-Xmx4g
# Warm/cold nodes - can use less heap
-Xms2g
-Xmx2g
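The heap figures above follow the usual rule of thumb: half the node's RAM, capped at roughly 31GB so the JVM keeps compressed object pointers. A sketch of that rule (`heap_for_ram_mb` is an illustrative helper; sizes in MB):

```shell
# Pick a heap size: min(RAM / 2, 31 GB). Below ~32GB the JVM can use
# compressed oops, so heaps just above that threshold waste memory.
heap_for_ram_mb() {
  ram_mb="$1"
  half=$(( ram_mb / 2 ))
  cap=$(( 31 * 1024 ))
  if [ "$half" -gt "$cap" ]; then half=$cap; fi
  echo "${half}m"
}

heap_for_ram_mb 8192     # 4096m  - matches the -Xms4g/-Xmx4g hot-node setting
heap_for_ram_mb 131072   # 31744m - capped even on a 128GB machine
```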
Restart Elasticsearch on all nodes
Apply the configuration changes by restarting each node one at a time.
# Restart hot nodes first
sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch
# Wait 30 seconds, then restart warm nodes
sleep 30
sudo systemctl restart elasticsearch
# Wait 30 seconds, then restart cold nodes
sleep 30
sudo systemctl restart elasticsearch
Create ILM policy for log data
Define a lifecycle policy that moves data through hot-warm-cold phases automatically. With security enabled as configured above, the curl commands in this guide also need credentials and TLS options, e.g. -u elastic:<password> --cacert /etc/elasticsearch/certs/http_ca.crt against https://localhost:9200.
curl -X PUT "localhost:9200/_ilm/policy/logs-policy" -H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_primary_shard_size": "10gb",
"max_age": "7d"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "7d",
"actions": {
"migrate": {
"enabled": true
},
"forcemerge": {
"max_num_segments": 1
},
"shrink": {
"number_of_shards": 1
},
"set_priority": {
"priority": 50
}
}
},
"cold": {
"min_age": "30d",
"actions": {
"migrate": {
"enabled": true
},
"set_priority": {
"priority": 0
}
}
},
"delete": {
"min_age": "365d",
"actions": {
"delete": {}
}
}
}
}
}'
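Phase min_age values must ascend (here 0ms → 7d → 30d → 365d) or ILM will reject the policy. A small sanity-check sketch (`age_to_hours` is an illustrative helper for "Nd"/"Nh" duration strings, not an Elasticsearch tool):

```shell
# Convert ILM min_age strings like "7d" or "12h" to hours so phase
# ordering can be checked before uploading a policy.
age_to_hours() {
  v="${1%?}"              # numeric part, e.g. "7" from "7d"
  unit="${1#"$v"}"        # unit suffix, "d" or "h"
  case "$unit" in
    d) echo $(( v * 24 )) ;;
    h) echo "$v" ;;
    *) echo "unsupported unit: $unit" >&2; return 1 ;;
  esac
}

warm=$(age_to_hours 7d)    # 168
cold=$(age_to_hours 30d)   # 720
del=$(age_to_hours 365d)   # 8760
[ "$warm" -lt "$cold" ] && [ "$cold" -lt "$del" ] && echo "phase ages ascend: ok"
```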
Create index template with ILM integration
Set up an index template that automatically applies the ILM policy to new indices.
curl -X PUT "localhost:9200/_index_template/logs-template" -H 'Content-Type: application/json' -d'
{
"index_patterns": ["logs-*"],
"data_stream": {},
"template": {
"settings": {
"index": {
"lifecycle": {
"name": "logs-policy"
},
"number_of_shards": 2,
"number_of_replicas": 1,
"refresh_interval": "30s"
}
},
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"level": {
"type": "keyword"
},
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"host": {
"type": "keyword"
}
}
}
},
"priority": 200
}'
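Once the template is in place, each rollover creates a new backing index behind the data stream, named .ds-&lt;stream&gt;-&lt;yyyy.MM.dd&gt;-&lt;generation&gt; with a six-digit generation counter. A sketch of the naming scheme (`backing_index` is an illustrative helper):

```shell
# Data stream backing indices follow .ds-<stream>-<yyyy.MM.dd>-<generation>;
# the generation is zero-padded to six digits and increments on each rollover.
backing_index() {
  printf '.ds-%s-%s-%06d\n' "$1" "$2" "$3"
}

backing_index logs-application 2024.01.15 1   # .ds-logs-application-2024.01.15-000001
backing_index logs-application 2024.01.22 2   # .ds-logs-application-2024.01.22-000002
```

Knowing this pattern helps when reading `_cat/indices` output: ILM moves these backing indices between tiers, not the data stream itself.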
Create metrics ILM policy
Configure a separate policy for metrics data with different retention periods.
curl -X PUT "localhost:9200/_ilm/policy/metrics-policy" -H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_primary_shard_size": "5gb",
"max_age": "1d"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "3d",
"actions": {
"migrate": {
"enabled": true
},
"forcemerge": {
"max_num_segments": 1
},
"set_priority": {
"priority": 50
}
}
},
"cold": {
"min_age": "15d",
"actions": {
"migrate": {
"enabled": true
},
"set_priority": {
"priority": 0
}
}
},
"delete": {
"min_age": "90d",
"actions": {
"delete": {}
}
}
}
}
}'
Create data stream for testing
Create a data stream to test the ILM policy functionality.
# Create the data stream
curl -X PUT "localhost:9200/_data_stream/logs-application"
# Add some test data
curl -X POST "localhost:9200/logs-application/_doc" -H 'Content-Type: application/json' -d'
{
"@timestamp": "2024-01-15T10:30:00Z",
"level": "INFO",
"message": "Application started successfully",
"host": "web-01"
}'
curl -X POST "localhost:9200/logs-application/_doc" -H 'Content-Type: application/json' -d'
{
"@timestamp": "2024-01-15T10:31:00Z",
"level": "ERROR",
"message": "Database connection failed",
"host": "web-02"
}'
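Instead of one curl per document, the same test data can be sent in a single bulk request. A sketch (data streams only accept the create action in _bulk; the body is newline-delimited JSON with a required trailing newline, and the /tmp/bulk.ndjson path is arbitrary):

```shell
# Build a bulk payload for the logs-application data stream.
# Each document is preceded by a {"create":{}} action line.
cat > /tmp/bulk.ndjson <<'EOF'
{"create":{}}
{"@timestamp":"2024-01-15T10:30:00Z","level":"INFO","message":"Application started successfully","host":"web-01"}
{"create":{}}
{"@timestamp":"2024-01-15T10:31:00Z","level":"ERROR","message":"Database connection failed","host":"web-02"}
EOF

# Send it (note the NDJSON content type, not application/json):
# curl -X POST "localhost:9200/logs-application/_bulk" \
#   -H 'Content-Type: application/x-ndjson' --data-binary @/tmp/bulk.ndjson
```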
Configure allocation balancing
In Elasticsearch 8, movement between hot, warm, and cold nodes is handled by each index's _tier_preference setting, which the ILM migrate action updates automatically, so no allocation-awareness attributes are needed for tiering (awareness attributes are meant for zone or rack awareness). You can still tune how shards are balanced across the nodes within each tier:
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
"persistent": {
"cluster.routing.allocation.balance.shard": 0.45,
"cluster.routing.allocation.balance.index": 0.55
}
}'
Install Kibana for monitoring
Install Kibana to visualize ILM policy execution and cluster health.
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update
sudo apt install -y kibana
Configure Kibana for ILM monitoring
Set up Kibana to connect to your Elasticsearch cluster and monitor ILM policies.
# Kibana configuration
server.port: 5601
server.host: "0.0.0.0"
server.name: "kibana-server"
# Elasticsearch configuration
elasticsearch.hosts: ["https://localhost:9200"]
elasticsearch.username: "kibana_system"
elasticsearch.password: "your-kibana-password"
# Security settings
elasticsearch.ssl.certificateAuthorities: ["/etc/elasticsearch/certs/http_ca.crt"]
server.ssl.enabled: true
server.ssl.certificate: "/etc/kibana/certs/kibana.crt"
server.ssl.key: "/etc/kibana/certs/kibana.key"
# Monitoring settings
monitoring.ui.ccs.enabled: false
monitoring.ui.enabled: true
Start and enable Kibana
Enable Kibana to start automatically and launch the service.
sudo systemctl daemon-reload
sudo systemctl enable kibana
sudo systemctl start kibana
sudo systemctl status kibana
Verify your setup
Check that your hot-warm-cold architecture and ILM policies are working correctly.
# Check cluster health and node roles
curl -X GET "localhost:9200/_cluster/health?pretty"
curl -X GET "localhost:9200/_cat/nodes?v&h=name,node.role,heap.percent,disk.used_percent"
# Verify ILM policies
curl -X GET "localhost:9200/_ilm/policy?pretty"
# Check data stream and index allocation
curl -X GET "localhost:9200/_cat/indices?v&h=index,health,pri,rep,store.size,pri.store.size"
# View ILM policy status
curl -X GET "localhost:9200/logs-application/_ilm/explain?pretty"
# Check shard allocation across tiers
curl -X GET "localhost:9200/_cat/allocation?v"
Open https://your-server-ip:5601 in a browser to access the Kibana interface. Log in with the elastic superuser credentials and navigate to Stack Management > Index Lifecycle Policies to monitor your ILM policies.
Monitor ILM performance with Kibana dashboards
Create ILM monitoring dashboard
Set up Kibana visualizations to track ILM policy execution and data tier usage.
# Apply the lifecycle policy to monitoring indices
# (the legacy _template API is deprecated in 8.x; use composable templates)
PUT _index_template/ilm-monitoring
{
  "index_patterns": [".monitoring-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-policy"
    }
  }
}
# Query to check phase transitions
GET _cat/indices?v&h=index,pri,rep,health,status,pri.store.size&s=pri.store.size:desc
# Monitor ILM execution status
GET _ilm/status
# Check specific index ILM explain
GET logs-application/_ilm/explain
Set up Watcher alerts for ILM issues
Configure automatic alerts when ILM policies fail. ILM records every step it executes in the ilm-history-* data stream; entries with success: false indicate failures. The email action below assumes an email account is configured in elasticsearch.yml.
curl -X PUT "localhost:9200/_watcher/watch/ilm-failure-alert" -H 'Content-Type: application/json' -d'
{
"trigger": {
"schedule": {
"interval": "1h"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": ["ilm-history-*"],
"body": {
"query": {
"bool": {
"must": [
{
"range": {
"@timestamp": {
"gte": "now-1h"
}
}
}
],
"filter": [
{
"term": {
"success": false
}
}
]
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"actions": {
"send_email": {
"email": {
"profile": "standard",
"to": ["admin@example.com"],
"subject": "ILM Policy Failure Alert",
"body": "ILM policies have failed for {{ctx.payload.hits.total}} indices. Please check the cluster."
}
}
}
}'
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Indices stuck in hot phase | Insufficient warm/cold nodes | Add more warm/cold nodes or adjust allocation settings |
| ILM policy not executing | ILM disabled or insufficient permissions | Run PUT _ilm/start and check user permissions |
| Shards not moving to warm tier | No nodes with the warm role | Verify nodes have the data_warm role and the migrate action is enabled |
| High disk usage on hot nodes | Rollover conditions not met | Adjust max_primary_shard_size or max_age in policy |
| Search performance degraded | Too many segments in warm/cold | Enable forcemerge action in warm phase |
| Node allocation warnings | Conflicting allocation filters | Check index.routing.allocation settings and _tier_preference on affected indices |
Next steps
- Set up cross-cluster replication for disaster recovery
- Monitor your Elasticsearch cluster with Prometheus
- Implement advanced security and encryption
- Optimize indexing performance for high-volume data
- Set up automated snapshots and disaster recovery
Running this in production?
Automated install script
Run this on each node to apply its tier-specific configuration (the script configures one node per run)
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
readonly RED='\033[0;31m'
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly NC='\033[0m'
# Configuration variables
CLUSTER_NAME="${1:-production-cluster}"
NODE_TYPE="${2:-hot}"
NODE_IPS="${3:-10.0.1.10,10.0.1.11,10.0.1.12}"
usage() {
echo "Usage: $0 [cluster_name] [node_type] [node_ips_comma_separated]"
echo " cluster_name: Name of the Elasticsearch cluster (default: production-cluster)"
echo " node_type: hot, warm, or cold (default: hot)"
echo " node_ips: Comma-separated list of node IPs (default: 10.0.1.10,10.0.1.11,10.0.1.12)"
exit 1
}
log_info() { echo -e "${GREEN}$1${NC}"; }
log_warn() { echo -e "${YELLOW}$1${NC}"; }
log_error() { echo -e "${RED}$1${NC}" >&2; }
cleanup() {
log_error "[CLEANUP] Installation failed. Check logs above."
if systemctl is-active --quiet elasticsearch 2>/dev/null; then
systemctl stop elasticsearch
fi
}
trap cleanup ERR
# Validate node type
if [[ ! "$NODE_TYPE" =~ ^(hot|warm|cold)$ ]]; then
log_error "Invalid node type: $NODE_TYPE. Must be hot, warm, or cold."
usage
fi
# Check prerequisites
echo "[1/8] Checking prerequisites..."
if [[ $EUID -ne 0 ]]; then
log_error "This script must be run as root"
exit 1
fi
if ! command -v curl &> /dev/null; then
log_error "curl is required but not installed"
exit 1
fi
# Detect distribution
echo "[2/8] Detecting distribution..."
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_JVM_CONFIG="$ES_CONFIG_DIR/jvm.options"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_JVM_CONFIG="$ES_CONFIG_DIR/jvm.options"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_JVM_CONFIG="$ES_CONFIG_DIR/jvm.options"
;;
*)
log_error "Unsupported distribution: $ID"
exit 1
;;
esac
log_info "Detected: $PRETTY_NAME"
else
log_error "Cannot detect distribution"
exit 1
fi
# Check if Elasticsearch is already installed
echo "[3/8] Checking Elasticsearch installation..."
if ! systemctl list-unit-files elasticsearch.service &>/dev/null; then
log_error "Elasticsearch is not installed. Please install Elasticsearch 8.x first."
exit 1
fi
# Stop Elasticsearch for configuration
echo "[4/8] Stopping Elasticsearch for configuration..."
systemctl stop elasticsearch
# Configure node-specific settings
echo "[5/8] Configuring $NODE_TYPE node settings..."
# Convert comma-separated IPs to array format for discovery.seed_hosts
IFS=',' read -ra IP_ARRAY <<< "$NODE_IPS"
SEED_HOSTS=""
for ip in "${IP_ARRAY[@]}"; do
SEED_HOSTS="$SEED_HOSTS\"$ip\", "
done
SEED_HOSTS="[${SEED_HOSTS%, }]"
# Create main configuration
cat > $ES_CONFIG_DIR/elasticsearch.yml << EOF
# Cluster configuration
cluster.name: $CLUSTER_NAME
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
# Node configuration
node.name: es-$NODE_TYPE-$(hostname)
EOF
# Add node-specific roles and attributes
case "$NODE_TYPE" in
hot)
cat >> $ES_CONFIG_DIR/elasticsearch.yml << EOF
node.roles: [ master, data_hot, data_content, ingest ]
node.attr.data_tier: hot
node.attr.box_type: hot
EOF
;;
warm)
cat >> $ES_CONFIG_DIR/elasticsearch.yml << EOF
node.roles: [ master, data_warm ]
node.attr.data_tier: warm
node.attr.box_type: warm
EOF
;;
cold)
cat >> $ES_CONFIG_DIR/elasticsearch.yml << EOF
node.roles: [ master, data_cold ]
node.attr.data_tier: cold
node.attr.box_type: cold
# Cold storage optimization
indices.queries.cache.size: 5%
indices.fielddata.cache.size: 10%
EOF
;;
esac
# Add common settings
cat >> $ES_CONFIG_DIR/elasticsearch.yml << EOF
# Path settings
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
# Memory settings
bootstrap.memory_lock: true
# Discovery settings
discovery.seed_hosts: $SEED_HOSTS
# cluster.initial_master_nodes must list the node.name of every master-eligible
# node; edit this line to match your actual node names before first bootstrap
cluster.initial_master_nodes: ["es-hot-01", "es-warm-01", "es-cold-01"]
# Security settings
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
EOF
# Configure JVM heap sizes
echo "[6/8] Configuring JVM heap sizes..."
case "$NODE_TYPE" in
hot)
HEAP_SIZE="4g"
;;
warm|cold)
HEAP_SIZE="2g"
;;
esac
# Update JVM options
sed -i '/^-Xms/d; /^-Xmx/d' $ES_JVM_CONFIG
echo "-Xms$HEAP_SIZE" >> $ES_JVM_CONFIG
echo "-Xmx$HEAP_SIZE" >> $ES_JVM_CONFIG
# Set proper ownership and permissions
chown -R elasticsearch:elasticsearch $ES_CONFIG_DIR
chmod 750 $ES_CONFIG_DIR
chmod 640 $ES_CONFIG_DIR/elasticsearch.yml
chmod 640 $ES_JVM_CONFIG
# Configure system limits
echo "[7/8] Configuring system limits..."
cat > /etc/security/limits.d/elasticsearch.conf << EOF
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
elasticsearch soft nofile 65536
elasticsearch hard nofile 65536
EOF
# Enable memory locking in systemd
mkdir -p /etc/systemd/system/elasticsearch.service.d
cat > /etc/systemd/system/elasticsearch.service.d/override.conf << EOF
[Service]
LimitMEMLOCK=infinity
EOF
systemctl daemon-reload
# Start and enable Elasticsearch
echo "[8/8] Starting Elasticsearch..."
systemctl enable elasticsearch
systemctl start elasticsearch
# Wait for service to start
sleep 10
# Verify installation
echo "Verifying installation..."
if systemctl is-active --quiet elasticsearch; then
log_info "✓ Elasticsearch service is running"
else
log_error "✗ Elasticsearch service failed to start"
exit 1
fi
# Check if port is listening
if netstat -tuln 2>/dev/null | grep -q ":9200" || ss -tuln 2>/dev/null | grep -q ":9200"; then
log_info "✓ Elasticsearch is listening on port 9200"
else
log_warn "! Elasticsearch may not be listening on port 9200 yet (check logs)"
fi
log_info "Elasticsearch $NODE_TYPE node configuration completed successfully!"
log_info "Cluster name: $CLUSTER_NAME"
log_info "Node type: $NODE_TYPE"
log_info "Config file: $ES_CONFIG_DIR/elasticsearch.yml"
log_info "JVM config: $ES_JVM_CONFIG"
log_warn "Note: Configure SSL certificates and set up passwords before production use"
log_warn "Check logs: journalctl -u elasticsearch"
Review the script before running. Execute as root, e.g.: sudo bash install.sh production-cluster hot 10.0.1.10,10.0.1.11,10.0.1.12