Implement Elasticsearch 8 snapshot lifecycle management with S3 storage for automated backups

Intermediate 45 min Apr 25, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up automated Elasticsearch 8 backups using snapshot lifecycle management policies with S3 repository storage. Configure retention policies, scheduling, and monitoring for production backup strategies.

Prerequisites

  • Elasticsearch 8.x installed and running
  • AWS account with S3 access
  • Root or sudo access
  • Basic familiarity with AWS IAM

What this solves

Elasticsearch snapshot lifecycle management (SLM) automates the creation, retention, and deletion of cluster snapshots to ensure data protection without manual intervention. This tutorial sets up automated backups to Amazon S3 storage with configurable retention policies and monitoring.

Step-by-step configuration

Install and configure AWS CLI

Install the AWS command line interface to manage S3 credentials and test connectivity before configuring Elasticsearch.

# Ubuntu 24.04 / Debian 12
sudo apt-get update && sudo apt-get install -y curl unzip
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
aws --version

# AlmaLinux 9 / Rocky Linux 9
sudo dnf install -y unzip
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
aws --version
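The download URL above is specific to x86_64 hosts. As a minimal sketch, assuming the machine reports its architecture through `uname -m`, a small helper (the name `awscli_url` is illustrative, not part of the AWS CLI) can pick the matching installer archive:

```shell
# Map a machine architecture (as printed by `uname -m`) to the AWS CLI v2
# installer URL. The function name is hypothetical.
awscli_url() {
  case "$1" in
    x86_64)        echo "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" ;;
    aarch64|arm64) echo "https://awscli.amazonaws.com/awscli-exe-linux-aarch64.zip" ;;
    *)             echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Usage (downloads the archive for the current machine):
#   curl -fsSL "$(awscli_url "$(uname -m)")" -o awscliv2.zip
```

The remaining install steps (unzip, `sudo ./aws/install`) are the same on both architectures.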

Create S3 bucket for snapshots

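Create a dedicated S3 bucket for Elasticsearch snapshots with versioning enabled for additional data protection. With versioning on, objects that SLM retention deletes remain in the bucket as noncurrent versions until an S3 lifecycle rule expires them, so account for that storage.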

aws configure
aws s3 mb s3://elasticsearch-snapshots-prod-2024
aws s3api put-bucket-versioning --bucket elasticsearch-snapshots-prod-2024 --versioning-configuration Status=Enabled

Create IAM policy for Elasticsearch S3 access

Create an IAM policy with the minimal permissions Elasticsearch needs to read, write, and delete snapshots in the designated bucket. Save the following document as elasticsearch-s3-policy.json; the next step loads it from that file.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Resource": "arn:aws:s3:::elasticsearch-snapshots-prod-2024"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::elasticsearch-snapshots-prod-2024/*"
    }
  ]
}
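If the bucket name differs between environments, the policy document can be generated rather than edited by hand. The following is a sketch only: the function name `render_snapshot_policy` is hypothetical, and the actions are copied from the policy above.

```shell
# Print the snapshot IAM policy for the bucket named in $1.
render_snapshot_policy() {
  bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Usage: write the file that `aws iam create-policy` reads.
#   render_snapshot_policy elasticsearch-snapshots-prod-2024 > elasticsearch-s3-policy.json
```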

Create IAM user and attach policy

Create a dedicated IAM user for Elasticsearch snapshot operations and attach the policy created above.

aws iam create-policy --policy-name ElasticsearchS3SnapshotPolicy --policy-document file://elasticsearch-s3-policy.json
aws iam create-user --user-name elasticsearch-snapshot-user
aws iam attach-user-policy --user-name elasticsearch-snapshot-user --policy-arn arn:aws:iam::YOUR_ACCOUNT_ID:policy/ElasticsearchS3SnapshotPolicy
aws iam create-access-key --user-name elasticsearch-snapshot-user
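The YOUR_ACCOUNT_ID placeholder has to be replaced with the 12-digit account ID. One way to avoid typing it, sketched here under the assumption that the AWS CLI is already configured for the target account, is to read it from `aws sts get-caller-identity`. The helper name `snapshot_policy_arn` is illustrative.

```shell
# Build the ARN of a customer-managed IAM policy ($1 = policy name)
# from the account ID of the currently configured AWS credentials.
snapshot_policy_arn() {
  account_id=$(aws sts get-caller-identity --query Account --output text) || return 1
  echo "arn:aws:iam::${account_id}:policy/$1"
}

# Usage:
#   aws iam attach-user-policy --user-name elasticsearch-snapshot-user \
#     --policy-arn "$(snapshot_policy_arn ElasticsearchS3SnapshotPolicy)"
```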

Configure Elasticsearch keystore with S3 credentials

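Add the AWS credentials to Elasticsearch's secure keystore so it can authenticate with S3 without storing secrets in configuration files. Each add command prompts for a value: enter the AccessKeyId and SecretAccessKey returned by create-access-key in the previous step.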

sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch-keystore add s3.client.default.access_key
sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch-keystore add s3.client.default.secret_key
sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch
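Before moving on, it can help to confirm that both settings actually landed in the keystore. This sketch assumes `elasticsearch-keystore list` prints one setting name per line; the function name `missing_s3_keys` is hypothetical.

```shell
# Read `elasticsearch-keystore list` output on stdin and print any required
# S3 client settings that are missing. Prints nothing when both are present.
missing_s3_keys() {
  present=$(cat)
  for key in s3.client.default.access_key s3.client.default.secret_key; do
    printf '%s\n' "$present" | grep -qxF "$key" || echo "$key"
  done
}

# Usage:
#   sudo /usr/share/elasticsearch/bin/elasticsearch-keystore list | missing_s3_keys
```

Empty output means both credentials are stored; any printed name still needs to be added.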

Confirm S3 repository support

In Elasticsearch 8.x the S3 repository type ships with the default distribution as a bundled module, so there is no repository-s3 plugin to install (that step applied to 7.x and earlier). Confirm that the node came back up cleanly after the keystore change before registering the repository.

sudo systemctl status elasticsearch

Create S3 snapshot repository

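Register the S3 bucket as a snapshot repository, setting the chunk size, metadata compression, and throughput limits. The curl commands in this tutorial target plain HTTP on localhost:9200. On a default Elasticsearch 8 install security is enabled, so use https://localhost:9200 instead and add -u elastic --cacert /etc/elasticsearch/certs/http_ca.crt to each call.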

curl -X PUT "localhost:9200/_snapshot/s3_repository" -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "elasticsearch-snapshots-prod-2024",
    "region": "us-east-1",
    "compress": true,
    "chunk_size": "1gb",
    "max_restore_bytes_per_sec": "40mb",
    "max_snapshot_bytes_per_sec": "40mb"
  }
}'
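On clusters with security enabled, repeating the TLS and authentication flags on every call gets noisy. A small wrapper can carry them. This is a sketch under assumptions: the certificate path is the default for package installs, the `elastic` user and the `ES_PASSWORD` variable are placeholders for your own credentials, and `es_curl` is a hypothetical helper name.

```shell
# Defaults are assumptions for a package install; override via the environment.
ES_URL="${ES_URL:-https://localhost:9200}"
ES_CACERT="${ES_CACERT:-/etc/elasticsearch/certs/http_ca.crt}"
ES_USER="${ES_USER:-elastic}"

# es_curl METHOD API_PATH [extra curl args...]
es_curl() {
  method="$1"; api_path="$2"; shift 2
  curl --silent --show-error \
    --cacert "$ES_CACERT" \
    --user "${ES_USER}:${ES_PASSWORD}" \
    -H 'Content-Type: application/json' \
    -X "$method" "${ES_URL}${api_path}" "$@"
}

# Usage:
#   ES_PASSWORD='...' es_curl GET /_snapshot/s3_repository
```

Extra arguments such as `-d '{...}'` are passed through to curl unchanged.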

Verify repository configuration

Test the S3 repository configuration to ensure Elasticsearch can successfully connect and write to the bucket.

curl -X POST "localhost:9200/_snapshot/s3_repository/_verify"
curl -X GET "localhost:9200/_snapshot/s3_repository"
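For scripted checks, the HTTP status of the verify call is usually enough: a successful verification answers 200, and a failed one returns an error status. The sketch below relies on curl's `-w '%{http_code}'` option; `repo_is_healthy` is an illustrative name, and the plain localhost:9200 address mirrors the commands above.

```shell
# Succeed only if the verify endpoint for repository $1 answers HTTP 200.
repo_is_healthy() {
  http_status=$(curl -s -o /dev/null -w '%{http_code}' \
    -X POST "localhost:9200/_snapshot/$1/_verify")
  [ "$http_status" = "200" ]
}

# Usage:
#   repo_is_healthy s3_repository && echo "repository OK" || echo "verification failed"
```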

Create snapshot lifecycle management policy

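Define an SLM policy that takes a snapshot every day at 02:00 UTC and deletes snapshots older than 30 days, while always keeping at least 7 and never more than 50. SLM schedules use Elasticsearch cron syntax, which starts with a seconds field. The name is a date math expression so that each snapshot gets a unique, dated name; the value shown is an example and any valid expression works.

curl -X PUT "localhost:9200/_slm/policy/daily-snapshots" -H 'Content-Type: application/json' -d'
{
  "schedule": "0 0 2 * * ?",
  "name": "<daily-snap-{now/d}>",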
  "repository": "s3_repository",
  "config": {
    "indices": "*",
    "ignore_unavailable": false,
    "include_global_state": true,
    "metadata": {
      "taken_by": "snapshot-lifecycle-management",
      "taken_because": "daily automated backup"
    }
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 7,
    "max_count": 50
  }
}'
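A common mistake is pasting a five-field Unix cron expression into the schedule. Elasticsearch cron expressions have six fields (seconds, minutes, hours, day of month, month, day of week) plus an optional year. The check below is a rough sketch that only counts fields and does not validate their contents; `has_es_cron_shape` is a hypothetical name.

```shell
# Succeed if the expression in $1 has 6 or 7 whitespace-separated fields.
has_es_cron_shape() {
  fields=$(printf '%s\n' "$1" | awk '{ print NF }')
  [ "$fields" -eq 6 ] || [ "$fields" -eq 7 ]
}

# Usage:
#   has_es_cron_shape "0 0 2 * * ?" && echo "looks like Elasticsearch cron"
#   has_es_cron_shape "0 2 * * *"   || echo "five fields: Unix cron, not accepted"
```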

Create weekly long-term retention policy

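Configure a second SLM policy that takes a snapshot every Sunday at 03:00 UTC and keeps it for 180 days for disaster recovery scenarios. As with any SLM policy, the schedule uses six-field Elasticsearch cron syntax and the name is a date math expression (the value shown is an example).

curl -X PUT "localhost:9200/_slm/policy/weekly-snapshots" -H 'Content-Type: application/json' -d'
{
  "schedule": "0 0 3 ? * SUN",
  "name": "<weekly-snap-{now/d}>",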
  "repository": "s3_repository",
  "config": {
    "indices": "*",
    "ignore_unavailable": false,
    "include_global_state": true,
    "metadata": {
      "taken_by": "snapshot-lifecycle-management",
      "taken_because": "weekly long-term backup"
    }
  },
  "retention": {
    "expire_after": "180d",
    "min_count": 4,
    "max_count": 26
  }
}'
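How expire_after, min_count, and max_count interact is easier to see with numbers. The following is a simplified model, not Elasticsearch's implementation: it assumes snapshot ages in whole days listed newest first, and it ignores details such as failed snapshots. The function name `retention_plan` is illustrative.

```shell
# retention_plan EXPIRE_DAYS MIN_COUNT MAX_COUNT AGE...
# Ages are snapshot ages in whole days, listed newest first.
retention_plan() {
  expire_days="$1"; min_count="$2"; max_count="$3"; shift 3
  position=0
  for age in "$@"; do
    if [ "$position" -lt "$min_count" ]; then
      echo "keep $age"        # protected by min_count, even if expired
    elif [ "$age" -gt "$expire_days" ] || [ "$position" -ge "$max_count" ]; then
      echo "delete $age"      # expired, or beyond max_count
    else
      echo "keep $age"        # not expired and within max_count
    fi
    position=$((position + 1))
  done
}

# Usage: 30-day expiry, keep at least 2, at most 3, four snapshots on hand.
#   retention_plan 30 2 3 1 10 40 50
```

In that usage example the two newest snapshots are kept, and the 40-day and 50-day snapshots are deleted because they are past expiry and no longer protected by min_count.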

Execute manual snapshot for testing

Trigger the policy manually to verify the configuration before the first scheduled run. The first call returns the name of the snapshot it started; the second shows the policy's last_success and last_failure details.

curl -X POST "localhost:9200/_slm/policy/daily-snapshots/_execute"
curl -X GET "localhost:9200/_slm/policy/daily-snapshots?human"
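The execute call responds with a small JSON object containing a snapshot_name field. To follow that specific snapshot from a script, the name can be pulled out with sed, as in this sketch (jq would be more robust if it is installed; `snapshot_name_from_response` is a hypothetical helper):

```shell
# Read the _execute response on stdin and print the value of "snapshot_name".
snapshot_name_from_response() {
  sed -n 's/.*"snapshot_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage:
#   name=$(curl -s -X POST "localhost:9200/_slm/policy/daily-snapshots/_execute" \
#     | snapshot_name_from_response)
#   curl -s "localhost:9200/_snapshot/s3_repository/${name}?pretty"
```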

Configure SLM policy monitoring

Add dedicated loggers for SLM and snapshot operations to track policy execution and failures. Raise the level to debug while troubleshooting. Additivity is disabled so each message is written once instead of a second time through the root logger.

# Add these lines to the existing /etc/elasticsearch/log4j2.properties
logger.slm.name = org.elasticsearch.xpack.slm
logger.slm.level = info
logger.slm.additivity = false
logger.slm.appenderRef.console.ref = console
logger.slm.appenderRef.rolling.ref = rolling

logger.snapshot.name = org.elasticsearch.snapshots
logger.snapshot.level = info
logger.snapshot.additivity = false
logger.snapshot.appenderRef.console.ref = console
logger.snapshot.appenderRef.rolling.ref = rolling

Set up index lifecycle management integration

Configure an index lifecycle management (ILM) policy whose delete phase waits for the daily-snapshots SLM policy to run, so an index is removed only after a snapshot that can contain it has been taken.

curl -X PUT "localhost:9200/_ilm/policy/logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "1d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "wait_for_snapshot": {
            "policy": "daily-snapshots"
          },
          "delete": {}
        }
      }
    }
  }
}'
Note: This integrates with our index lifecycle management tutorial for complete data management.

Monitor snapshot lifecycle management

Check SLM policy status

Monitor the execution status and statistics of your snapshot lifecycle management policies.

curl -X GET "localhost:9200/_slm/policy"
curl -X GET "localhost:9200/_slm/stats"
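The stats response includes a total_snapshots_failed counter that is convenient for simple scripted monitoring. A minimal sketch, assuming the counter accumulates over time (so a monitoring job would compare it with the previous reading); `slm_failed_total` is an illustrative name:

```shell
# Read the JSON from GET _slm/stats on stdin and print total_snapshots_failed.
slm_failed_total() {
  sed -n 's/.*"total_snapshots_failed"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage:
#   failed=$(curl -s "localhost:9200/_slm/stats" | slm_failed_total)
#   [ "${failed:-0}" -gt 0 ] && echo "SLM reports ${failed} failed snapshots"
```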

Monitor snapshot progress

Check the status of ongoing and completed snapshots to verify successful backup operations.

curl -X GET "localhost:9200/_snapshot/s3_repository/_all"
curl -X GET "localhost:9200/_snapshot/_status"
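The JSON listing above is verbose. For a quick scan, the cat snapshots API can return just the id and status columns, and a short awk filter can show only the rows that are not SUCCESS (which also includes snapshots still IN_PROGRESS). This is a sketch; `unsuccessful_snapshots` is a hypothetical helper name.

```shell
# Read "id status" rows on stdin and print the rows whose status is not SUCCESS.
unsuccessful_snapshots() {
  awk 'NF >= 2 && $2 != "SUCCESS" { print $1, $2 }'
}

# Usage:
#   curl -s "localhost:9200/_cat/snapshots/s3_repository?h=id,status" | unsuccessful_snapshots
```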

Set up alerting for snapshot failures

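Create a watch that emails the operations team when a snapshot failure is logged. Watcher requires a paid subscription or an active trial license, the email action needs an email account configured in elasticsearch.yml, and the query below assumes Elasticsearch's own logs are being indexed with @timestamp and message fields.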

curl -X PUT "localhost:9200/_watcher/watch/snapshot_failure_alert" -H 'Content-Type: application/json' -d'
{
  "trigger": {
    "schedule": {
      "interval": "5m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": ["_all"],
        "body": {
          "query": {
            "bool": {
              "must": [
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-10m"
                    }
                  }
                },
                {
                  "match": {
                    "message": "snapshot failed"
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "to": ["ops@example.com"],
        "subject": "Elasticsearch Snapshot Failed",
        "body": "Snapshot operation failed. Check cluster logs for details."
      }
    }
  }
}'

Verify your setup

# Check SLM policies are active
curl -X GET "localhost:9200/_slm/policy"

# Verify S3 repository connectivity
curl -X POST "localhost:9200/_snapshot/s3_repository/_verify"

# Check recent snapshots
curl -X GET "localhost:9200/_snapshot/s3_repository/_all?pretty"

# Monitor SLM execution stats
curl -X GET "localhost:9200/_slm/stats?pretty"

# Check AWS S3 bucket contents
aws s3 ls s3://elasticsearch-snapshots-prod-2024/

Common issues

| Symptom | Cause | Fix |
| --- | --- | --- |
| Repository verification fails | AWS credentials not configured | Check keystore with elasticsearch-keystore list |
| Snapshots not appearing in S3 | Insufficient IAM permissions | Verify IAM policy allows s3:PutObject and s3:ListBucket |
| SLM policy not executing | Invalid cron schedule | Use six-field Elasticsearch cron syntax, then trigger a run with POST /_slm/policy/POLICY_NAME/_execute |
| Large snapshots time out | Snapshot throughput throttled (40mb per second in this repository) | Increase max_snapshot_bytes_per_sec in repository settings |
| S3 access denied errors | Bucket region mismatch | Ensure repository region matches S3 bucket region |
| Snapshot retention not working | Min/max count conflicts | Adjust min_count and max_count in retention policy |

Troubleshoot automated snapshots

Debug SLM policy execution

Check detailed logs and execution history to identify issues with snapshot lifecycle management.

# Check SLM execution history
curl -X GET "localhost:9200/_slm/policy/daily-snapshots?human"

# View detailed policy statistics
curl -X GET "localhost:9200/_slm/stats?pretty"

# Check Elasticsearch logs for SLM errors
sudo tail -f /var/log/elasticsearch/elasticsearch.log | grep -i slm

Test manual snapshot operations

Verify repository configuration by performing manual snapshot and restore operations.

# Create manual test snapshot
curl -X PUT "localhost:9200/_snapshot/s3_repository/test_snapshot_$(date +%Y%m%d_%H%M%S)?wait_for_completion=true"

# List all snapshots in repository
curl -X GET "localhost:9200/_snapshot/s3_repository/_all?pretty"

# Delete test snapshot
curl -X DELETE "localhost:9200/_snapshot/s3_repository/test_snapshot_*"
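The commands above create and delete a test snapshot but do not exercise a restore. Before deleting the test snapshot, a restore can be rehearsed without touching live indices by renaming on the way in. The sketch below builds a request body for the restore API; the `restored_` prefix, the index pattern, and the helper name `restore_body` are illustrative choices, and SNAPSHOT_NAME is a placeholder.

```shell
# Print a restore request body that restores the indices matching $1
# under a "restored_" prefix and leaves cluster-wide state untouched.
restore_body() {
  cat <<EOF
{
  "indices": "$1",
  "include_global_state": false,
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_\$1"
}
EOF
}

# Usage:
#   curl -X POST "localhost:9200/_snapshot/s3_repository/SNAPSHOT_NAME/_restore" \
#     -H 'Content-Type: application/json' -d "$(restore_body 'logs-*')"
```

Once the restored copies have been checked, they can be deleted like any other index.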

Monitor S3 storage costs

Track S3 storage usage and costs for snapshot retention optimization.

# List snapshot objects with their size and last-modified time
aws s3api list-objects-v2 --bucket elasticsearch-snapshots-prod-2024 --query "Contents[].{Key:Key,Size:Size,LastModified:LastModified}" --output table

# Calculate total object count and bucket size
aws s3 ls s3://elasticsearch-snapshots-prod-2024/ --recursive --human-readable --summarize
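For tracking growth over time, a single number is easier to record than a listing. As a sketch, the total size in bytes can be taken from a JMESPath `sum()` query and converted to GiB. The helper name `bytes_to_gib` is hypothetical, and the query counts current object versions only.

```shell
# Convert a byte count ($1) to GiB with two decimal places.
bytes_to_gib() {
  LC_ALL=C awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / (1024 * 1024 * 1024) }'
}

# Usage:
#   total=$(aws s3api list-objects-v2 --bucket elasticsearch-snapshots-prod-2024 \
#     --query "sum(Contents[].Size)" --output json)
#   echo "$(bytes_to_gib "$total") GiB"
```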
