Configure Jaeger data retention policies and automated archiving with Elasticsearch backend

Intermediate | 45 min | Apr 07, 2026
Ubuntu 24.04, Debian 12, AlmaLinux 9, Rocky Linux 9

Learn to configure Jaeger data retention policies with Elasticsearch backend for automated trace archiving. This tutorial covers index lifecycle management, storage optimization, and performance monitoring to prevent disk space issues while maintaining observability requirements.

Prerequisites

  • Root or sudo access
  • 4GB+ RAM recommended
  • 20GB+ disk space
  • Elasticsearch 8.x compatible system

What this solves

Jaeger trace data grows rapidly in production environments, leading to storage exhaustion and performance degradation. This tutorial configures automated retention policies and archiving to maintain optimal Elasticsearch performance while preserving important trace data for compliance and debugging.

Step-by-step configuration

Install Elasticsearch and Jaeger

Start by installing Elasticsearch 8 and the Jaeger 1.50.0 binaries. Elasticsearch 8 ships with a bundled JDK, so no separate Java package is needed. Run only the block that matches your distribution.

Ubuntu / Debian:

curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update
sudo apt install -y elasticsearch
wget https://github.com/jaegertracing/jaeger/releases/download/v1.50.0/jaeger-1.50.0-linux-amd64.tar.gz
tar -xzf jaeger-1.50.0-linux-amd64.tar.gz
sudo mkdir -p /opt/jaeger
sudo mv jaeger-1.50.0-linux-amd64/* /opt/jaeger/
sudo chmod +x /opt/jaeger/*

AlmaLinux / Rocky Linux:

sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
sudo tee /etc/yum.repos.d/elasticsearch.repo > /dev/null << EOF
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md
EOF
sudo dnf install -y --enablerepo=elasticsearch elasticsearch
wget https://github.com/jaegertracing/jaeger/releases/download/v1.50.0/jaeger-1.50.0-linux-amd64.tar.gz
tar -xzf jaeger-1.50.0-linux-amd64.tar.gz
sudo mkdir -p /opt/jaeger
sudo mv jaeger-1.50.0-linux-amd64/* /opt/jaeger/
sudo chmod +x /opt/jaeger/*

Configure Elasticsearch for Jaeger

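Edit /etc/elasticsearch/elasticsearch.yml with settings tuned for trace data storage. This configuration disables the Elasticsearch security features, which is only acceptable because the node listens on localhost; enable xpack.security and TLS before exposing the node on a network or running it in production.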

cluster.name: jaeger-cluster
node.name: jaeger-node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: localhost
http.port: 9200
discovery.type: single-node
xpack.security.enabled: false
xpack.security.enrollment.enabled: false
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false
action.auto_create_index: true
indices.requests.cache.size: 5%
indices.fielddata.cache.size: 20%
thread_pool.write.queue_size: 1000
cluster.routing.allocation.disk.watermark.low: 85%
cluster.routing.allocation.disk.watermark.high: 90%
cluster.routing.allocation.disk.watermark.flood_stage: 95%
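
The three watermark settings act as thresholds on the used percentage of the data disk: past the low watermark Elasticsearch stops allocating new shards to the node, past the high watermark it tries to move shards away, and at flood stage it blocks writes to the affected indices. The sketch below only illustrates that ordering. The function name watermark_state is hypothetical, the thresholds are hard-coded to the values from the configuration above, and "at or above" is a simplification of the check Elasticsearch performs internally.

```shell
#!/bin/sh
# watermark_state USED_PERCENT
# Print which disk watermark (if any) a given used-percentage has reached.
# Thresholds mirror the elasticsearch.yml values above (85 / 90 / 95).
watermark_state() {
    used="$1"
    if [ "$used" -ge 95 ]; then
        echo "flood_stage"
    elif [ "$used" -ge 90 ]; then
        echo "high"
    elif [ "$used" -ge 85 ]; then
        echo "low"
    else
        echo "ok"
    fi
}

# Usage: classify the current usage of the Elasticsearch data path.
# "df --output=pcent" is a GNU coreutils option.
if [ -d /var/lib/elasticsearch ]; then
    used_now=$(df --output=pcent /var/lib/elasticsearch | tail -n 1 | tr -dc '0-9')
    echo "Disk ${used_now}% used: $(watermark_state "$used_now")"
fi
```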

Configure JVM heap settings

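Set the JVM heap size for Elasticsearch in a file under /etc/elasticsearch/jvm.options.d/ (for example heap.options). Use about half of the available RAM and keep the heap below roughly 31GB so the JVM can continue to use compressed object pointers. Set the minimum and maximum to the same value. The values below suit a host with 4GB of RAM.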

-Xms2g
-Xmx2g
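
The sizing rule can be turned into a small calculation. This is a sketch under stated assumptions: heap_size_mb is a hypothetical helper, and the 31GB cap is a conservative value chosen here to stay under the compressed-pointer threshold, not a figure taken from this tutorial.

```shell
#!/bin/sh
# heap_size_mb TOTAL_MB
# Print a suggested heap size in MB: half of total RAM, capped at 31 GB.
heap_size_mb() {
    total_mb="$1"
    cap_mb=$((31 * 1024))
    half_mb=$((total_mb / 2))
    if [ "$half_mb" -gt "$cap_mb" ]; then
        echo "$cap_mb"
    else
        echo "$half_mb"
    fi
}

# Usage: derive the two jvm.options lines from MemTotal (reported in kB).
if [ -r /proc/meminfo ]; then
    total=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    heap=$(heap_size_mb "$total")
    printf -- '-Xms%sm\n-Xmx%sm\n' "$heap" "$heap"
fi
```

On a shared host, subtract the memory needed by the Jaeger collector and query processes before halving.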

Start and enable Elasticsearch

Start Elasticsearch and configure it to start automatically on system boot.

sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
sudo systemctl status elasticsearch

Create Jaeger systemd services

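Create the systemd unit for the Jaeger collector at /etc/systemd/system/jaeger-collector.service. SPAN_STORAGE_TYPE selects the Elasticsearch backend; without it the collector defaults to Cassandra and does not register the --es.* flags. The --es.use-aliases and --es.use-ilm flags make the collector write through the jaeger-span-write and jaeger-service-write aliases instead of creating daily indices, which is what the rollover-based ILM policy in the later steps expects.

[Unit]
Description=Jaeger Collector
After=network.target elasticsearch.service
Requires=elasticsearch.service

[Service]
Type=simple
User=jaeger
Group=jaeger
Environment=SPAN_STORAGE_TYPE=elasticsearch
ExecStart=/opt/jaeger/jaeger-collector \
  --es.server-urls=http://localhost:9200 \
  --es.use-aliases=true \
  --es.use-ilm=true \
  --es.num-shards=1 \
  --es.num-replicas=0 \
  --collector.grpc-server.host-port=:14250 \
  --collector.http-server.host-port=:14268 \
  --collector.zipkin.host-port=:9411
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target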

Create Jaeger query service

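Create the unit for the Jaeger query service, which serves the web UI and the query API, at /etc/systemd/system/jaeger-query.service. As with the collector, SPAN_STORAGE_TYPE selects the Elasticsearch backend. With --es.use-aliases=true the query service reads through the jaeger-span-read and jaeger-service-read aliases, so those aliases must exist on the indices.

[Unit]
Description=Jaeger Query
After=network.target elasticsearch.service
Requires=elasticsearch.service

[Service]
Type=simple
User=jaeger
Group=jaeger
Environment=SPAN_STORAGE_TYPE=elasticsearch
ExecStart=/opt/jaeger/jaeger-query \
  --es.server-urls=http://localhost:9200 \
  --es.use-aliases=true \
  --query.http-server.host-port=:16686 \
  --query.grpc-server.host-port=:16685 \
  --query.base-path=/
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target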

Create jaeger user and set permissions

Create a dedicated user for Jaeger services and set appropriate permissions for security.

sudo useradd --system --no-create-home --shell /bin/false jaeger
sudo chown -R jaeger:jaeger /opt/jaeger
sudo chmod 755 /opt/jaeger
sudo chmod 755 /opt/jaeger/*

Configure index lifecycle management policy

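Create an ILM policy that moves trace indices through hot, warm, cold, and delete phases. The write index rolls over at 10GB, 1 day, or 10 million documents, whichever comes first. Rolled-over indices enter the warm phase after 2 days (where they are force-merged to one segment), the cold phase after 7 days, and are deleted after 30 days. For rolled-over indices these ages are measured from the rollover time, not from index creation.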

curl -X PUT "localhost:9200/_ilm/policy/jaeger-ilm-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_size": "10gb",
            "max_age": "1d",
            "max_docs": 10000000
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "2d",
        "actions": {
          "set_priority": {
            "priority": 50
          },
          "allocate": {
            "number_of_replicas": 0
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "7d",
        "actions": {
          "set_priority": {
            "priority": 0
          },
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'
sleep 2
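
To make the timeline concrete, the mapping from index age to phase can be written out as a lookup. This is an illustrative sketch only: ilm_phase_for_age is a hypothetical helper that mirrors the min_age values in the policy above, and it takes the age in whole days since rollover. Elasticsearch evaluates the policy itself on its own poll interval, so real transitions can lag these boundaries.

```shell
#!/bin/sh
# ilm_phase_for_age DAYS_SINCE_ROLLOVER
# Print the phase the policy above assigns to an index of the given age.
ilm_phase_for_age() {
    days="$1"
    if [ "$days" -ge 30 ]; then
        echo "delete"
    elif [ "$days" -ge 7 ]; then
        echo "cold"
    elif [ "$days" -ge 2 ]; then
        echo "warm"
    else
        echo "hot"
    fi
}

# Usage: print the phase for a few sample ages.
for age in 0 1 2 6 7 29 30; do
    echo "day $age: $(ilm_phase_for_age "$age")"
done
```

An index that is still the write index stays in the hot phase until it rolls over, regardless of its age.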

Create index templates with ILM policy

Configure index templates for Jaeger spans and services so that every new index, including those created by rollover, picks up the ILM policy. The template also attaches the jaeger-span-read alias, which keeps rolled-over indices searchable when Jaeger runs in alias mode.

curl -X PUT "localhost:9200/_index_template/jaeger-span-template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["jaeger-span-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "index.lifecycle.name": "jaeger-ilm-policy",
      "index.lifecycle.rollover_alias": "jaeger-span-write",
      "index.refresh_interval": "5s",
      "index.translog.durability": "async"
    },
    "aliases": {
      "jaeger-span-read": {}
    },
    "mappings": {
      "properties": {
        "traceID": { "type": "keyword" },
        "spanID": { "type": "keyword" },
        "parentSpanID": { "type": "keyword" },
        "operationName": { "type": "keyword" },
        "startTime": { "type": "long" },
        "duration": { "type": "long" },
        "process.serviceName": { "type": "keyword" },
        "process.tags": { "type": "nested" }
      }
    }
  },
  "priority": 500,
  "version": 1
}'
sleep 2

Create service template

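Configure the template for Jaeger service indices. The rollover action in the ILM policy needs index.lifecycle.rollover_alias on every index it manages, so the setting is included here as well, together with the jaeger-service-read alias used in alias mode.

curl -X PUT "localhost:9200/_index_template/jaeger-service-template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["jaeger-service-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "index.lifecycle.name": "jaeger-ilm-policy",
      "index.lifecycle.rollover_alias": "jaeger-service-write",
      "index.refresh_interval": "30s"
    },
    "aliases": {
      "jaeger-service-read": {}
    },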
    "mappings": {
      "properties": {
        "serviceName": { "type": "keyword" },
        "operationName": { "type": "keyword" },
        "timestamp": { "type": "date" }
      }
    }
  },
  "priority": 500,
  "version": 1
}'
sleep 2

Create initial indices and aliases

Create the initial indices with write aliases so that rollover has a starting point. The read aliases (jaeger-span-read and jaeger-service-read) are what Jaeger queries when alias mode is enabled with --es.use-aliases=true.

curl -X PUT "localhost:9200/jaeger-span-000001" -H 'Content-Type: application/json' -d'
{
  "aliases": {
    "jaeger-span-write": {
      "is_write_index": true
    },
    "jaeger-span-read": {}
  }
}'
sleep 2
curl -X PUT "localhost:9200/jaeger-service-000001" -H 'Content-Type: application/json' -d'
{
  "aliases": {
    "jaeger-service-write": {
      "is_write_index": true
    },
    "jaeger-service-read": {}
  }
}'
sleep 2
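
The -000001 suffix matters because rollover derives the next index name by incrementing that number and keeping the six-digit zero padding. A minimal sketch of that naming rule follows; next_rollover_index is a hypothetical helper written for illustration, since Elasticsearch computes the name itself during rollover.

```shell
#!/bin/sh
# next_rollover_index INDEX_NAME
# Print the name rollover would give the next index in the series.
next_rollover_index() {
    name="$1"
    prefix="${name%-*}"
    # Strip the zero padding so the shell does not read the number as octal.
    suffix=$(printf '%s' "${name##*-}" | sed 's/^0*//')
    printf '%s-%06d\n' "$prefix" "$(( ${suffix:-0} + 1 ))"
}

# Usage
next_rollover_index "jaeger-span-000001"
next_rollover_index "jaeger-service-000009"
```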

Configure archival storage

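Create a script at /opt/jaeger/archive-traces.sh that exports older span indices to compressed files before ILM deletes them. The script follows the scroll cursor so that indices with more than 1000 documents are exported completely, and it skips indices that already have an archive. It requires curl and jq.

#!/bin/bash
# /opt/jaeger/archive-traces.sh
set -o pipefail

# Configuration
ES_HOST="localhost:9200"
ARCHIVE_DIR="/var/lib/jaeger/archive"
ARCHIVE_AGE_DAYS=7
COMPRESSION_LEVEL=9

# Create archive directory
mkdir -p "$ARCHIVE_DIR"

# Cutoff date: indices created before this date are archived
OLD_DATE=$(date -d "$ARCHIVE_AGE_DAYS days ago" +%Y-%m-%d)

# Function to archive index
archive_index() {
    local index_name="$1"
    local archive_file="${ARCHIVE_DIR}/${index_name}-$(date +%Y%m%d).json.gz"
    local page scroll_id

    echo "Archiving index: $index_name"

    # Export index data, requesting scroll pages until one comes back empty
    {
        page=$(curl -s "${ES_HOST}/${index_name}/_search?scroll=10m&size=1000" \
            -H 'Content-Type: application/json' \
            -d '{"query":{"match_all":{}}}')
        while [ "$(printf '%s' "$page" | jq '.hits.hits | length')" -gt 0 ] 2>/dev/null; do
            printf '%s' "$page" | jq -c '.hits.hits[]._source'
            scroll_id=$(printf '%s' "$page" | jq -r '._scroll_id')
            page=$(curl -s "${ES_HOST}/_search/scroll" \
                -H 'Content-Type: application/json' \
                -d "{\"scroll\":\"10m\",\"scroll_id\":\"${scroll_id}\"}")
        done
    } | gzip -"$COMPRESSION_LEVEL" > "$archive_file"

    if [ $? -eq 0 ]; then
        echo "Successfully archived $index_name to $archive_file"
        return 0
    else
        echo "Failed to archive $index_name"
        return 1
    fi
}

# Get list of indices to archive
INDICES=$(curl -s "${ES_HOST}/_cat/indices/jaeger-span-*?h=index,creation.date.string" | \
    awk -v cutoff="$OLD_DATE" '$2 < cutoff {print $1}')

# Archive each old index, skipping any that an earlier run already exported
for index in $INDICES; do
    if ls "${ARCHIVE_DIR}/${index}-"*.json.gz >/dev/null 2>&1; then
        continue
    fi
    archive_index "$index"
done

# Clean up archives older than 90 days
find "$ARCHIVE_DIR" -name "*.json.gz" -mtime +90 -delete
echo "Archive process completed"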
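
The index selection relies on a plain string comparison: ISO 8601 timestamps sort in chronological order, so awk can compare the creation.date.string column against a YYYY-MM-DD cutoff without parsing dates. The sketch below isolates that filter so it can be tried on sample input. The function name indices_older_than and the sample rows are made up for illustration.

```shell
#!/bin/sh
# indices_older_than CUTOFF_DATE
# Read "index creation-timestamp" rows on stdin and print the names of
# indices whose timestamp sorts before the cutoff (YYYY-MM-DD).
indices_older_than() {
    awk -v cutoff="$1" '$2 < cutoff {print $1}'
}

# Usage with sample rows shaped like the _cat/indices output used above.
printf '%s\n' \
    'jaeger-span-000001 2026-03-01T00:00:00.000Z' \
    'jaeger-span-000002 2026-04-06T10:00:00.000Z' |
    indices_older_than "2026-03-31"
```

An index created on the cutoff date itself is not selected, because its full timestamp sorts after the bare date.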

Make archive script executable and set permissions

Set proper permissions for the archive script and create the archive directory structure.

sudo chmod +x /opt/jaeger/archive-traces.sh
sudo mkdir -p /var/lib/jaeger/archive
sudo chown -R jaeger:jaeger /var/lib/jaeger
sudo chmod 755 /var/lib/jaeger
sudo chmod 755 /var/lib/jaeger/archive

Create monitoring script

Create a monitoring script at /opt/jaeger/monitor-storage.sh that records storage usage and ILM status, then make it executable with sudo chmod +x /opt/jaeger/monitor-storage.sh. The script runs as the unprivileged jaeger user, so create the log directory once as root beforehand: sudo mkdir -p /var/log/jaeger && sudo chown jaeger:jaeger /var/log/jaeger.

#!/bin/bash
# /opt/jaeger/monitor-storage.sh

# Configuration
ES_HOST="localhost:9200"
LOG_FILE="/var/log/jaeger/storage-monitor.log"
ALERT_THRESHOLD_GB=80

# Function to log with timestamp
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

# Check cluster health
CLUSTER_STATUS=$(curl -s "${ES_HOST}/_cluster/health" | jq -r '.status')
log "Cluster status: $CLUSTER_STATUS"

# Check indices status
INDICES_INFO=$(curl -s "${ES_HOST}/_cat/indices/jaeger-*?v&h=index,health,status,store.size&s=store.size:desc")
log "Indices status:"
echo "$INDICES_INFO" >> "$LOG_FILE"

# Check that the ILM policy exists (the hyphenated key needs bracket syntax in jq)
ILM_ACTIVE=$(curl -s "${ES_HOST}/_ilm/policy/jaeger-ilm-policy" | \
    jq -r 'if .["jaeger-ilm-policy"].policy then "Yes" else "No" end')
log "ILM policy active: $ILM_ACTIVE"

# Check total storage usage; bytes=b returns plain byte counts so mixed units cannot skew the sum
TOTAL_SIZE=$(curl -s "${ES_HOST}/_cat/indices/jaeger-*?h=store.size&bytes=b" | \
    awk '{sum += $1} END {printf "%d\n", sum / 1073741824}')
if [ "$TOTAL_SIZE" -gt "$ALERT_THRESHOLD_GB" ]; then
    log "WARNING: Total Jaeger storage ($TOTAL_SIZE GB) exceeds threshold ($ALERT_THRESHOLD_GB GB)"
else
    log "Storage usage: ${TOTAL_SIZE} GB (within limits)"
fi

# Check for stuck indices
STUCK_INDICES=$(curl -s "${ES_HOST}/_ilm/explain/jaeger-*" | \
    jq -r '.indices | to_entries[] | select(.value.step == "ERROR") | .key')
if [ -n "$STUCK_INDICES" ]; then
    log "ERROR: Indices with ILM errors: $STUCK_INDICES"
else
    log "All indices processing normally"
fi
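
Adding up store.size values only gives a meaningful total when every row uses the same unit. By default the cat API prints human-readable sizes such as 512kb or 3.2gb, so one option is to request raw byte counts with its bytes=b parameter and convert once at the end. The following is a sketch of that conversion; sum_bytes_as_gb is a hypothetical helper and the byte counts in the usage line are sample values.

```shell
#!/bin/sh
# sum_bytes_as_gb
# Read one byte count per line on stdin and print the total in whole GB
# (1 GB = 1073741824 bytes), truncated so it can be used with [ -gt ].
sum_bytes_as_gb() {
    awk '{sum += $1} END {printf "%d\n", sum / 1073741824}'
}

# Usage with sample values: 1 GB + 2 GB
printf '%s\n' 1073741824 2147483648 | sum_bytes_as_gb
```

Against a live cluster the input would come from curl -s "localhost:9200/_cat/indices/jaeger-*?h=store.size&bytes=b".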

Create systemd timer for automated tasks

Configure systemd timers so that archiving and monitoring run without manual intervention. Start with the archive service unit, saved as /etc/systemd/system/jaeger-archive.service.

[Unit]
Description=Jaeger Archive Service
Requires=elasticsearch.service
After=elasticsearch.service

[Service]
Type=oneshot
User=jaeger
Group=jaeger
ExecStart=/opt/jaeger/archive-traces.sh
StandardOutput=journal
StandardError=journal

Create archive timer

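Configure the timer to run archiving daily at 2 AM, when system load is typically low, and save it as /etc/systemd/system/jaeger-archive.timer. A timer activates the service unit with the matching name by default, so it needs no explicit dependency on jaeger-archive.service.

[Unit]
Description=Run Jaeger Archive Daily

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target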

Create monitoring service and timer

Configure the monitoring script to run every 4 hours so that storage and ILM status are logged regularly. Save the service unit as /etc/systemd/system/jaeger-monitor.service.

[Unit]
Description=Jaeger Storage Monitor
Requires=elasticsearch.service
After=elasticsearch.service

[Service]
Type=oneshot
User=jaeger
Group=jaeger
ExecStart=/opt/jaeger/monitor-storage.sh
StandardOutput=journal
StandardError=journal

Create monitoring timer

Set up the monitoring timer to fire every 4 hours, on the hour, and save it as /etc/systemd/system/jaeger-monitor.timer.

[Unit]
Description=Run Jaeger Storage Monitor Every 4 Hours

[Timer]
OnCalendar=*-*-* 00,04,08,12,16,20:00:00
Persistent=true

[Install]
WantedBy=timers.target
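
A four-hour schedule anchored at midnight fires at 00:00, 04:00, 08:00, 12:00, 16:00, and 20:00. The sketch below computes the next slot for a given hour, as a quick way to reason about when the monitor will run next. The helper name next_monitor_hour is hypothetical and the logic is illustrative; systemd does its own scheduling.

```shell
#!/bin/sh
# next_monitor_hour CURRENT_HOUR
# Print the next 4-hourly slot (two digits) strictly after the given hour.
next_monitor_hour() {
    hour="${1#0}"          # drop a leading zero so 08 is not read as octal
    hour="${hour:-0}"
    next=$(( (hour / 4 + 1) * 4 ))
    if [ "$next" -ge 24 ]; then
        next=0
    fi
    printf '%02d\n' "$next"
}

# Usage: next slot after the current hour
next_monitor_hour "$(date +%H)"
```

On a host with systemd, systemd-analyze calendar "*-*-* 00,04,08,12,16,20:00:00" prints the normalized expression and the next time it elapses.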

Enable and start all services

Enable and start all Jaeger services and timers to begin trace collection with automated retention management.

sudo systemctl daemon-reload
sudo systemctl enable --now jaeger-collector
sudo systemctl enable --now jaeger-query
sudo systemctl enable --now jaeger-archive.timer
sudo systemctl enable --now jaeger-monitor.timer
sudo systemctl status jaeger-collector
sudo systemctl status jaeger-query

Note: This setup logs storage metrics and archives old indices, but it does not send alerts on its own. For production environments, consider integrating alerting with your monitoring stack.

Verify your setup

# Check Elasticsearch cluster health
curl -s "localhost:9200/_cluster/health?pretty"

# Verify ILM policy is active
curl -s "localhost:9200/_ilm/policy/jaeger-ilm-policy?pretty"

# Check Jaeger services status
sudo systemctl status jaeger-collector
sudo systemctl status jaeger-query

# Access Jaeger UI
curl -s "http://localhost:16686/search"

# Check indices and their lifecycle status
curl -s "localhost:9200/_cat/indices/jaeger-*?v&s=index"

# Verify timers are active
sudo systemctl list-timers 'jaeger-*'

# Test archive script manually
sudo -u jaeger /opt/jaeger/archive-traces.sh

# Check monitoring log
sudo tail -f /var/log/jaeger/storage-monitor.log

Monitor storage and performance

Configure storage alerts

Set up disk usage monitoring and alerts to prevent storage exhaustion before it impacts the system.

#!/bin/bash
# Storage monitoring with alerts

ES_HOST="localhost:9200"
MAX_INDEX_SIZE_GB=5
MAX_TOTAL_SIZE_GB=50
EMAIL_ALERT="admin@example.com"

# Check individual index sizes
LARGE_INDICES=$(curl -s "${ES_HOST}/_cat/indices/jaeger-*?h=index,store.size" | \
    awk -v max="$MAX_INDEX_SIZE_GB" '$2 ~ /[0-9]+gb/ && $2+0 > max {print $1" ("$2")"}')
if [ -n "$LARGE_INDICES" ]; then
    echo "Large indices detected: $LARGE_INDICES" | \
        mail -s "Jaeger: Large indices alert" "$EMAIL_ALERT"
fi

# Monitor phase transitions
PHASE_INFO=$(curl -s "${ES_HOST}/_ilm/explain/jaeger-*" | \
    jq -r '.indices | to_entries[] | "\(.key): \(.value.phase)"')
echo "Current ILM phases:"
echo "$PHASE_INFO"

Performance optimization settings

Adjust Elasticsearch settings for optimal Jaeger performance based on workload characteristics.

# Optimize for write-heavy workloads
curl -X PUT "localhost:9200/jaeger-span-*/_settings" -H 'Content-Type: application/json' -d'
{
  "index.refresh_interval": "10s",
  "index.number_of_replicas": 0,
  "index.translog.durability": "async",
  "index.translog.sync_interval": "30s"
}'

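# Configure cluster-level settings
# Persistent settings are used because transient cluster settings are deprecated in Elasticsearch 8.
# indices.memory.index_buffer_size is a static node setting and cannot be changed through this API;
# set it in /etc/elasticsearch/elasticsearch.yml (for example indices.memory.index_buffer_size: 20%)
# and restart Elasticsearch.
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}'

# Enable slow query logging
curl -X PUT "localhost:9200/jaeger-*/_settings" -H 'Content-Type: application/json' -d'
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.fetch.warn": "1s"
}'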

Common issues

Symptom | Cause | Fix
ILM policy not executing | Indices not using write aliases | Recreate indices with proper aliases: curl -X PUT localhost:9200/jaeger-span-000001 -H 'Content-Type: application/json' -d '{"aliases":{"jaeger-span-write":{"is_write_index":true}}}'
High disk usage despite ILM | Delete phase not configured | Re-apply the full policy from the ILM step with the delete phase included ("delete":{"min_age":"30d","actions":{"delete":{}}}). A PUT to _ilm/policy replaces the whole policy, so send every phase, not just the one being changed.
Archive script fails | Missing jq dependency | Install jq: sudo apt install jq or sudo dnf install jq
Jaeger services won't start | Elasticsearch not ready | Check ES health: curl localhost:9200/_cluster/health
Slow query performance | Too many indices in warm phase | Lower the warm phase min_age (for example to 1d) in the full policy and re-apply it. As above, the PUT must contain every phase.
Storage alerts not working | Mail system not configured | Install and configure postfix: sudo apt install postfix
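
For the "Elasticsearch not ready" case, a small retry loop can replace the manual health check. This is a minimal sketch: wait_for is a hypothetical helper, not part of Jaeger or Elasticsearch, and the attempt count and delay in the usage comment are arbitrary example values.

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only when another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage: wait up to about a minute for the cluster to reach at least yellow,
# then start the collector. curl -f makes HTTP errors count as failures.
# wait_for 30 2 curl -fs "localhost:9200/_cluster/health?wait_for_status=yellow&timeout=1s" \
#     && sudo systemctl start jaeger-collector
```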
