Learn to configure Jaeger data retention policies with an Elasticsearch backend for automated trace archiving. This tutorial covers index lifecycle management, storage optimization, and performance monitoring to prevent disk space issues while meeting observability requirements.
Prerequisites
- Root or sudo access
- 4GB+ RAM recommended
- 20GB+ disk space
- Elasticsearch 8.x compatible system
What this solves
Jaeger trace data grows rapidly in production environments, leading to storage exhaustion and performance degradation. This tutorial configures automated retention policies and archiving to maintain optimal Elasticsearch performance while preserving important trace data for compliance and debugging.
Step-by-step configuration
Install Elasticsearch and Jaeger
Start by installing Elasticsearch 8 and Jaeger with proper dependencies. This establishes the foundation for trace storage and retention management.
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt update
sudo apt install -y elasticsearch openjdk-11-jdk jq
wget https://github.com/jaegertracing/jaeger/releases/download/v1.50.0/jaeger-1.50.0-linux-amd64.tar.gz
tar -xzf jaeger-1.50.0-linux-amd64.tar.gz
sudo mkdir -p /opt/jaeger
sudo mv jaeger-1.50.0-linux-amd64/* /opt/jaeger/
sudo chmod +x /opt/jaeger/*
Configure Elasticsearch for Jaeger
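Put the following settings in /etc/elasticsearch/elasticsearch.yml to tune Elasticsearch for trace storage. This example disables X-Pack security so Jaeger can connect over plain HTTP on localhost. That is only acceptable for a single-node test setup; re-enable security and TLS before exposing the node to a network.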
cluster.name: jaeger-cluster
node.name: jaeger-node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: localhost
http.port: 9200
discovery.type: single-node
xpack.security.enabled: false
xpack.security.enrollment.enabled: false
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false
action.auto_create_index: true
indices.requests.cache.size: 5%
indices.fielddata.cache.size: 20%
thread_pool.write.queue_size: 1000
cluster.routing.allocation.disk.watermark.low: 85%
cluster.routing.allocation.disk.watermark.high: 90%
cluster.routing.allocation.disk.watermark.flood_stage: 95%
Configure JVM heap settings
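Set the JVM heap size for Elasticsearch based on available system memory. Use no more than half of the RAM and stay below roughly 31GB so the JVM can keep using compressed object pointers. Set -Xms and -Xmx to the same value. On package installs these lines go in a file under /etc/elasticsearch/jvm.options.d/.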
-Xms2g
-Xmx2g
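The sizing rule above can be sketched as a small helper. This is a minimal illustration, not part of Elasticsearch: suggest_heap_mb and total_ram_mb are hypothetical names, and the 31GB cap is the assumption stated above.

```shell
#!/usr/bin/env bash
# Suggest a heap size in MB: half of the given RAM, capped at 31 GB.
suggest_heap_mb() {
  local total_mb="$1"
  local half=$(( total_mb / 2 ))
  local cap=$(( 31 * 1024 ))
  if (( half > cap )); then
    half=$cap
  fi
  echo "$half"
}

# Read total RAM in MB from /proc/meminfo (Linux only).
total_ram_mb() {
  awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo
}
```

To print the two option lines for the current host, one could run: heap=$(suggest_heap_mb "$(total_ram_mb)"); printf -- '-Xms%sm\n-Xmx%sm\n' "$heap" "$heap"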
Start and enable Elasticsearch
Start Elasticsearch and configure it to start automatically on system boot.
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
sudo systemctl status elasticsearch
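systemctl reports the service as active before Elasticsearch accepts HTTP requests, so later curl calls can fail if they run too early. A possible readiness check is sketched below; es_status_ok and wait_for_es are hypothetical helper names, and the retry count and delay are arbitrary choices.

```shell
#!/usr/bin/env bash
# Succeeds when a _cluster/health JSON body reports green or yellow.
es_status_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"(green|yellow)"'
}

# Poll the cluster health endpoint until it is usable or the tries run out.
wait_for_es() {
  local host="${1:-localhost:9200}" tries="${2:-30}" body i
  for (( i = 1; i <= tries; i++ )); do
    body=$(curl -s --max-time 5 "http://${host}/_cluster/health" || true)
    if es_status_ok "$body"; then
      return 0
    fi
    sleep 2
  done
  return 1
}
```

Calling wait_for_es localhost:9200 before the ILM and template steps avoids relying on fixed sleep durations.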
Create Jaeger systemd services
Create systemd service files for Jaeger collector and query components with proper configuration for production use.
[Unit]
Description=Jaeger Collector
After=network.target elasticsearch.service
Requires=elasticsearch.service
[Service]
Type=simple
User=jaeger
Group=jaeger
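# Jaeger defaults to Cassandra; select the Elasticsearch backend so the --es.* flags are accepted
Environment=SPAN_STORAGE_TYPE=elasticsearch
ExecStart=/opt/jaeger/jaeger-collector \
--es.server-urls=http://localhost:9200 \
--es.num-shards=1 \
--es.num-replicas=0 \
--es.use-aliases=true \
--collector.grpc-server.host-port=:14250 \
--collector.http-server.host-port=:14268 \
--collector.zipkin.host-port=:9411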
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Create Jaeger query service
Configure the Jaeger query service for web UI access and API queries with Elasticsearch backend.
[Unit]
Description=Jaeger Query
After=network.target elasticsearch.service
Requires=elasticsearch.service
[Service]
Type=simple
User=jaeger
Group=jaeger
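# Jaeger defaults to Cassandra; select the Elasticsearch backend so the --es.* flags are accepted
Environment=SPAN_STORAGE_TYPE=elasticsearch
ExecStart=/opt/jaeger/jaeger-query \
--es.server-urls=http://localhost:9200 \
--es.use-aliases=true \
--query.http-server.host-port=:16686 \
--query.grpc-server.host-port=:16685 \
--query.base-path=/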
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Create jaeger user and set permissions
Create a dedicated user for Jaeger services and set appropriate permissions for security.
sudo useradd --system --no-create-home --shell /bin/false jaeger
sudo chown -R jaeger:jaeger /opt/jaeger
sudo chmod 755 /opt/jaeger
sudo chmod 755 /opt/jaeger/*
Configure index lifecycle management policy
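Create an ILM policy for automatic trace data lifecycle management with hot, warm, cold, and delete phases. For indices that roll over, the min_age of the warm, cold, and delete phases is measured from the rollover time, not from index creation.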
curl -X PUT "localhost:9200/_ilm/policy/jaeger-ilm-policy" -H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_size": "10gb",
"max_age": "1d",
"max_docs": 10000000
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "2d",
"actions": {
"set_priority": {
"priority": 50
},
"allocate": {
"number_of_replicas": 0
},
"forcemerge": {
"max_num_segments": 1
}
}
},
"cold": {
"min_age": "7d",
"actions": {
"set_priority": {
"priority": 0
},
"allocate": {
"number_of_replicas": 0
}
}
},
"delete": {
"min_age": "30d",
"actions": {
"delete": {}
}
}
}
}
}'
sleep 2
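A phase whose min_age is smaller than that of an earlier phase is almost certainly a mistake. The sketch below checks the ordering before a policy is submitted. age_to_seconds and phases_in_order are hypothetical helpers that only understand plain integer ages with the d, h, m, s, and ms suffixes.

```shell
#!/usr/bin/env bash
# Convert an ILM age such as 0ms, 12h, or 30d to whole seconds.
age_to_seconds() {
  local v="$1" n
  case "$v" in
    *ms) n="${v%ms}"; echo $(( n / 1000 )) ;;
    *d)  n="${v%d}";  echo $(( n * 86400 )) ;;
    *h)  n="${v%h}";  echo $(( n * 3600 )) ;;
    *m)  n="${v%m}";  echo $(( n * 60 )) ;;
    *s)  n="${v%s}";  echo $(( n )) ;;
    *)   return 1 ;;
  esac
}

# Succeeds when the given ages never decrease from one phase to the next.
phases_in_order() {
  local prev=0 cur age
  for age in "$@"; do
    cur=$(age_to_seconds "$age") || return 1
    (( cur >= prev )) || return 1
    prev=$cur
  done
  return 0
}
```

For the policy above, phases_in_order 0ms 2d 7d 30d succeeds, while swapping the warm and cold ages would fail the check.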
Create index templates with ILM policy
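Configure index templates for Jaeger spans and services that automatically apply the ILM policy to new indices. The jaeger-span-read alias in the template gives every span index the read alias that Jaeger queries when it runs with --es.use-aliases=true.
curl -X PUT "localhost:9200/_index_template/jaeger-span-template" -H 'Content-Type: application/json' -d'
{
"index_patterns": ["jaeger-span-*"],
"template": {
"aliases": {
"jaeger-span-read": {}
},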
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"index.lifecycle.name": "jaeger-ilm-policy",
"index.lifecycle.rollover_alias": "jaeger-span-write",
"index.refresh_interval": "5s",
"index.translog.durability": "async"
},
"mappings": {
"properties": {
"traceID": { "type": "keyword" },
"spanID": { "type": "keyword" },
"parentSpanID": { "type": "keyword" },
"operationName": { "type": "keyword" },
"startTime": { "type": "long" },
"duration": { "type": "long" },
"process.serviceName": { "type": "keyword" },
"process.tags": { "type": "nested" }
}
}
},
"priority": 500,
"version": 1
}'
sleep 2
Create service template
Configure the template for Jaeger service indices. It needs its own rollover alias, because the rollover action in the ILM policy fails on an index that has none.
curl -X PUT "localhost:9200/_index_template/jaeger-service-template" -H 'Content-Type: application/json' -d'
{
"index_patterns": ["jaeger-service-*"],
"template": {
"aliases": {
"jaeger-service-read": {}
},
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"index.lifecycle.name": "jaeger-ilm-policy",
"index.lifecycle.rollover_alias": "jaeger-service-write",
"index.refresh_interval": "30s"
},
"mappings": {
"properties": {
"serviceName": { "type": "keyword" },
"operationName": { "type": "keyword" },
"timestamp": { "type": "date" }
}
}
},
"priority": 500,
"version": 1
}'
sleep 2
Create initial indices and aliases
Create the initial indices with write aliases for the rollover functionality to work properly.
curl -X PUT "localhost:9200/jaeger-span-000001" -H 'Content-Type: application/json' -d'
{
"aliases": {
"jaeger-span-write": {
"is_write_index": true
}
}
}'
sleep 2
curl -X PUT "localhost:9200/jaeger-service-000001" -H 'Content-Type: application/json' -d'
{
"aliases": {
"jaeger-service-write": {
"is_write_index": true
}
}
}'
sleep 2
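Rollover depends on each write alias pointing at exactly one write index. One way to confirm this, sketched below, is to filter the output of the _cat/aliases API. write_index_for is a hypothetical helper name; it expects the three columns alias, index, and is_write_index on standard input.

```shell
#!/usr/bin/env bash
# Print the index that is the current write index for the given alias.
# Input: lines of "alias index is_write_index" as produced by
#   curl -s "localhost:9200/_cat/aliases/jaeger-*-write?h=alias,index,is_write_index"
write_index_for() {
  local alias_name="$1"
  awk -v a="$alias_name" '$1 == a && $3 == "true" {print $2}'
}
```

Piping the curl command from the comment into write_index_for jaeger-span-write should print a single index name. Empty output or several names suggests the alias needs to be repaired before ILM can roll it over.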
Configure archival storage
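Create a script at /opt/jaeger/archive-traces.sh that exports older trace indices to compressed files before ILM deletes them. It needs curl, jq, and gzip.
#!/bin/bash
set -o pipefail
# Configuration
ES_HOST="localhost:9200"
ARCHIVE_DIR="/var/lib/jaeger/archive"
ARCHIVE_AGE_DAYS=7
COMPRESSION_LEVEL=9
# Create archive directory
mkdir -p "$ARCHIVE_DIR"
# Indices created before this date are archived
OLD_DATE=$(date -d "$ARCHIVE_AGE_DAYS days ago" +%Y-%m-%d)
# Print every document of an index as one JSON object per line.
# A single _search call returns only the first page, so follow the scroll cursor until it is empty.
export_index() {
  local index_name="$1" response scroll_id
  response=$(curl -sf "${ES_HOST}/${index_name}/_search?scroll=10m&size=1000" \
    -H 'Content-Type: application/json' \
    -d '{"query":{"match_all":{}}}') || return 1
  scroll_id=$(jq -r '._scroll_id' <<< "$response")
  while [ "$(jq '.hits.hits | length' <<< "$response")" -gt 0 ]; do
    jq -c '.hits.hits[]._source' <<< "$response"
    response=$(curl -sf "${ES_HOST}/_search/scroll" \
      -H 'Content-Type: application/json' \
      -d "{\"scroll\":\"10m\",\"scroll_id\":\"${scroll_id}\"}") || return 1
    scroll_id=$(jq -r '._scroll_id' <<< "$response")
  done
  # Release the scroll context on the server
  curl -s -X DELETE "${ES_HOST}/_search/scroll" \
    -H 'Content-Type: application/json' \
    -d "{\"scroll_id\":\"${scroll_id}\"}" > /dev/null
  return 0
}
# Function to archive one index into a gzip file
archive_index() {
  local index_name="$1"
  local archive_file="${ARCHIVE_DIR}/${index_name}-$(date +%Y%m%d).json.gz"
  echo "Archiving index: $index_name"
  # With pipefail set, the pipeline fails if either the export or gzip fails
  if export_index "$index_name" | gzip "-${COMPRESSION_LEVEL}" > "$archive_file"; then
    echo "Successfully archived $index_name to $archive_file"
    return 0
  else
    echo "Failed to archive $index_name" >&2
    rm -f "$archive_file"
    return 1
  fi
}
# Get list of indices to archive
INDICES=$(curl -s "${ES_HOST}/_cat/indices/jaeger-span-*?h=index,creation.date.string" | \
  awk -v cutoff="$OLD_DATE" '$2 < cutoff {print $1}')
# Archive each old index
for index in $INDICES; do
  archive_index "$index"
done
# Clean up archives older than 90 days
find "$ARCHIVE_DIR" -name "*.json.gz" -mtime +90 -delete
echo "Archive process completed"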
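Because ILM deletes the source index later, it is worth checking that an archive is readable before relying on it. The sketch below assumes the archive format described above, one JSON document per line inside a gzip file. archive_is_valid and archive_doc_count are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Succeeds when the gzip stream is intact.
archive_is_valid() {
  gzip -t < "$1" 2>/dev/null
}

# Print the number of archived documents (lines) in a gzip archive.
archive_doc_count() {
  gzip -dc < "$1" | wc -l | tr -d '[:space:]'
}
```

Comparing archive_doc_count against the docs.count column of _cat/indices for the same index is one way to spot a truncated export.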
Make archive script executable and set permissions
Set proper permissions for the archive script and create the archive directory structure.
sudo chmod +x /opt/jaeger/archive-traces.sh
sudo mkdir -p /var/lib/jaeger/archive
sudo chown -R jaeger:jaeger /var/lib/jaeger
sudo chmod 755 /var/lib/jaeger
sudo chmod 755 /var/lib/jaeger/archive
Create monitoring script
Create a monitoring script to track storage usage and ILM policy execution for operational visibility.
#!/bin/bash
# Configuration
ES_HOST="localhost:9200"
LOG_FILE="/var/log/jaeger/storage-monitor.log"
ALERT_THRESHOLD_GB=80
# The log directory must already exist and be writable by the jaeger user, which has no sudo rights.
# Create it once as root: mkdir -p /var/log/jaeger && chown jaeger:jaeger /var/log/jaeger
# Function to log with timestamp
log() {
  echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
# Check cluster health
CLUSTER_STATUS=$(curl -s "${ES_HOST}/_cluster/health" | jq -r '.status')
log "Cluster status: $CLUSTER_STATUS"
# Check indices status
INDICES_INFO=$(curl -s "${ES_HOST}/_cat/indices/jaeger-*?v&h=index,health,status,store.size&s=store.size:desc")
log "Indices status:"
echo "$INDICES_INFO" >> "$LOG_FILE"
# Check that the ILM policy exists (hyphenated keys need bracket syntax in jq)
ILM_ACTIVE=$(curl -s "${ES_HOST}/_ilm/policy/jaeger-ilm-policy" | \
  jq -r 'if .["jaeger-ilm-policy"].policy then "Yes" else "No" end')
log "ILM policy active: $ILM_ACTIVE"
# Check total storage usage; bytes=gb makes Elasticsearch report every size in whole gigabytes
TOTAL_SIZE=$(curl -s "${ES_HOST}/_cat/indices/jaeger-*?h=store.size&bytes=gb" | \
  awk '{sum += $1} END {print sum + 0}')
if [ "$TOTAL_SIZE" -gt "$ALERT_THRESHOLD_GB" ]; then
  log "WARNING: Total Jaeger storage ($TOTAL_SIZE GB) exceeds threshold ($ALERT_THRESHOLD_GB GB)"
else
  log "Storage usage: ${TOTAL_SIZE} GB (within limits)"
fi
# Check for stuck indices
STUCK_INDICES=$(curl -s "${ES_HOST}/_ilm/explain/jaeger-*" | \
  jq -r '.indices | to_entries[] | select(.value.step == "ERROR") | .key')
if [ -n "$STUCK_INDICES" ]; then
  log "ERROR: Indices with ILM errors: $STUCK_INDICES"
else
  log "All indices processing normally"
fi
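The default store.size column mixes units such as kb, mb, and gb, so the values cannot be summed directly. When the human-readable form is all that is available, a conversion step is needed first. The sketch below shows one possible approach; size_to_mb is a hypothetical helper and it handles only the b, kb, mb, gb, and tb suffixes.

```shell
#!/usr/bin/env bash
# Convert an Elasticsearch size string such as 512mb or 1.5gb to whole megabytes.
size_to_mb() {
  awk -v s="$1" 'BEGIN {
    unit = s
    sub(/^[0-9.]+/, "", unit)   # strip the numeric part, leaving the unit
    num = s + 0                 # awk reads the leading number of the string
    if (unit == "b") f = 1 / 1048576
    else if (unit == "kb") f = 1 / 1024
    else if (unit == "mb") f = 1
    else if (unit == "gb") f = 1024
    else if (unit == "tb") f = 1048576
    else exit 1
    printf "%.0f\n", num * f
  }'
}
```

Summing the converted values gives a total that can be compared against a threshold expressed in megabytes.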
Create systemd timer for automated tasks
Configure systemd timers for automated archiving and monitoring to run without manual intervention.
[Unit]
Description=Jaeger Archive Service
Requires=elasticsearch.service
After=elasticsearch.service
[Service]
Type=oneshot
User=jaeger
Group=jaeger
ExecStart=/opt/jaeger/archive-traces.sh
StandardOutput=journal
StandardError=journal
Create archive timer
Configure the timer to run archiving daily at 2 AM when system load is typically low.
[Unit]
Description=Run Jaeger Archive Daily
[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true
[Install]
WantedBy=timers.target
Create monitoring service and timer
Configure automated monitoring to run every 4 hours for continuous visibility into storage and performance.
[Unit]
Description=Jaeger Storage Monitor
Requires=elasticsearch.service
After=elasticsearch.service
[Service]
Type=oneshot
User=jaeger
Group=jaeger
ExecStart=/opt/jaeger/monitor-storage.sh
StandardOutput=journal
StandardError=journal
Create monitoring timer
Set up the monitoring timer to execute every 4 hours for regular health checks.
[Unit]
Description=Run Jaeger Storage Monitor Every 4 Hours
[Timer]
OnCalendar=*-*-* 00,04,08,12,16,20:00:00
Persistent=true
[Install]
WantedBy=timers.target
Enable and start all services
Enable and start all Jaeger services and timers to begin trace collection with automated retention management.
sudo systemctl daemon-reload
sudo systemctl enable --now jaeger-collector
sudo systemctl enable --now jaeger-query
sudo systemctl enable --now jaeger-archive.timer
sudo systemctl enable --now jaeger-monitor.timer
sudo systemctl status jaeger-collector
sudo systemctl status jaeger-query
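A service can be active in systemd while its listener is not yet accepting connections. The sketch below probes the ports used in the unit files above (14250, 14268, and 16686) with bash's built-in /dev/tcp redirection. port_open and check_jaeger_ports are hypothetical helper names, and the two-second timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Succeeds when a TCP connection to host:port can be opened within 2 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report the state of the collector and query ports on localhost.
check_jaeger_ports() {
  local port
  for port in 14250 14268 16686; do
    if port_open 127.0.0.1 "$port"; then
      echo "port $port: open"
    else
      echo "port $port: closed"
    fi
  done
}
```

Running check_jaeger_ports a few seconds after starting the services gives a quick view of which listeners came up.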
Verify your setup
# Check Elasticsearch cluster health
curl -s "localhost:9200/_cluster/health?pretty"
# Verify ILM policy is active
curl -s "localhost:9200/_ilm/policy/jaeger-ilm-policy?pretty"
# Check Jaeger services status
sudo systemctl status jaeger-collector
sudo systemctl status jaeger-query
# Access Jaeger UI
curl -s "http://localhost:16686/search"
# Check indices and their lifecycle status
curl -s "localhost:9200/_cat/indices/jaeger-*?v&s=index"
# Verify timers are active
sudo systemctl list-timers 'jaeger-*'
# Test archive script manually
sudo -u jaeger /opt/jaeger/archive-traces.sh
# Check monitoring log
sudo tail -f /var/log/jaeger/storage-monitor.log
Monitor storage and performance
Configure storage alerts
Set up disk usage monitoring and alerts to prevent storage exhaustion before it impacts the system.
#!/bin/bash
# Storage monitoring with alerts
ES_HOST="localhost:9200"
MAX_INDEX_SIZE_GB=5
MAX_TOTAL_SIZE_GB=50
EMAIL_ALERT="admin@example.com"
# Check individual index sizes
LARGE_INDICES=$(curl -s "${ES_HOST}/_cat/indices/jaeger-*?h=index,store.size" | \
  awk -v max="$MAX_INDEX_SIZE_GB" '$2 ~ /[0-9]+gb/ && $2+0 > max {print $1" ("$2")"}')
if [ -n "$LARGE_INDICES" ]; then
  echo "Large indices detected: $LARGE_INDICES" | \
    mail -s "Jaeger: Large indices alert" "$EMAIL_ALERT"
fi
# Monitor phase transitions
PHASE_INFO=$(curl -s "${ES_HOST}/_ilm/explain/jaeger-*" | \
  jq -r '.indices | to_entries[] | "\(.key): \(.value.phase)"')
echo "Current ILM phases:"
echo "$PHASE_INFO"
Performance optimization settings
Adjust Elasticsearch settings for optimal Jaeger performance based on workload characteristics.
# Optimize for write-heavy workloads
curl -X PUT "localhost:9200/jaeger-span-*/_settings" -H 'Content-Type: application/json' -d'
{
"index.refresh_interval": "10s",
"index.number_of_replicas": 0,
"index.translog.durability": "async",
"index.translog.sync_interval": "30s"
}'
# Configure cluster-level disk watermarks.
# indices.memory.index_buffer_size is a static node setting; set it in elasticsearch.yml, not through this API.
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
"persistent": {
"cluster.routing.allocation.disk.watermark.low": "85%",
"cluster.routing.allocation.disk.watermark.high": "90%",
"cluster.routing.allocation.disk.watermark.flood_stage": "95%"
}
}'
# Enable slow query logging
curl -X PUT "localhost:9200/jaeger-*/_settings" -H 'Content-Type: application/json' -d'
{
"index.search.slowlog.threshold.query.warn": "10s",
"index.search.slowlog.threshold.fetch.warn": "1s"
}'
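The watermark percentages configured above can also be checked from the shell, which helps when deciding whether to shorten retention before Elasticsearch starts blocking writes. This is a minimal sketch: disk_usage_pct and watermark_state are hypothetical helpers, and the thresholds simply mirror the 85%, 90%, and 95% values used in this tutorial.

```shell
#!/usr/bin/env bash
# Print the used-space percentage of the filesystem holding the given path.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 {gsub(/%/, "", $5); print $5}'
}

# Map a usage percentage to the watermark it has reached.
watermark_state() {
  local pct="$1"
  if (( pct >= 95 )); then
    echo flood_stage
  elif (( pct >= 90 )); then
    echo high
  elif (( pct >= 85 )); then
    echo low
  else
    echo ok
  fi
}
```

For example, watermark_state "$(disk_usage_pct /var/lib/elasticsearch)" reports which threshold the data path has crossed.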
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| ILM policy not executing | Indices not using write aliases | Recreate indices with proper aliases: curl -X PUT localhost:9200/jaeger-span-000001 -H 'Content-Type: application/json' -d '{"aliases":{"jaeger-span-write":{"is_write_index":true}}}' |
| High disk usage despite ILM | Delete phase missing or has no delete action | Inspect the policy with curl localhost:9200/_ilm/policy/jaeger-ilm-policy?pretty and resubmit the complete policy, including "delete":{"min_age":"30d","actions":{"delete":{}}}. A PUT replaces the whole policy, so sending only the delete phase would drop the other phases. |
| Archive script fails | Missing jq dependency | Install jq: sudo apt install jq or sudo dnf install jq |
| Jaeger services won't start | Elasticsearch not ready | Check ES health: curl localhost:9200/_cluster/health |
| Slow query performance | Too many indices in warm phase | Lower the warm phase min_age (for example to 1d) in the full policy JSON and resubmit it. A PUT containing only the warm phase would remove the hot, cold, and delete phases. |
| Storage alerts not working | Mail system not configured | Install and configure postfix: sudo apt install postfix |
Next steps
- Set up Jaeger high availability clustering with load balancing and failover
- Implement Elasticsearch 8 index lifecycle management and retention policies
- Setup centralized log aggregation with Elasticsearch 8, Logstash 8, and Kibana 8 (ELK Stack)
- Configure Jaeger authentication with OAuth2 and RBAC for enterprise security
- Setup Jaeger sampling strategies for high-volume production tracing
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Configuration
JAEGER_VERSION="1.50.0"
ES_VERSION="8.x"
JAEGER_USER="jaeger"
ARCHIVE_DIR="/var/lib/jaeger/archive"
LOG_DIR="/var/log/jaeger"
# Functions
log() {
echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1" >&2
}
warn() {
echo -e "${YELLOW}[WARNING]${NC} $1"
}
cleanup() {
warn "Installation failed. Cleaning up..."
systemctl stop elasticsearch jaeger-collector jaeger-query 2>/dev/null || true
exit 1
}
trap cleanup ERR
usage() {
echo "Usage: $0 [OPTIONS]"
echo "Options:"
echo " -h, --help Show this help message"
echo " --es-host HOST Elasticsearch host (default: localhost)"
echo " --retention-days N Days before indices move to the cold phase (default: 7)"
exit 1
}
# Parse arguments
ES_HOST="localhost"
RETENTION_DAYS="7"
while [[ $# -gt 0 ]]; do
case $1 in
-h|--help)
usage
;;
--es-host)
ES_HOST="$2"
shift 2
;;
--retention-days)
RETENTION_DAYS="$2"
shift 2
;;
*)
error "Unknown option: $1"
usage
;;
esac
done
# Check prerequisites
if [[ $EUID -ne 0 ]]; then
error "This script must be run as root"
exit 1
fi
# Auto-detect distribution
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
PKG_UPDATE="apt update"
JAVA_PACKAGE="openjdk-11-jdk"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
PKG_UPDATE="dnf update -y"
JAVA_PACKAGE="java-11-openjdk"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
PKG_UPDATE="yum update -y"
JAVA_PACKAGE="java-11-openjdk"
;;
*)
error "Unsupported distribution: $ID"
exit 1
;;
esac
else
error "Cannot detect distribution"
exit 1
fi
log "[1/8] Installing prerequisites..."
$PKG_UPDATE
$PKG_INSTALL curl wget tar gzip jq
# Install Elasticsearch
log "[2/8] Installing Elasticsearch..."
if [[ "$PKG_MGR" == "apt" ]]; then
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/${ES_VERSION}/apt stable main" > /etc/apt/sources.list.d/elastic-${ES_VERSION}.list
apt update
$PKG_INSTALL elasticsearch $JAVA_PACKAGE
else
rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
cat > /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch]
name=Elasticsearch repository for ${ES_VERSION} packages
baseurl=https://artifacts.elastic.co/packages/${ES_VERSION}/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md
EOF
$PKG_INSTALL --enablerepo=elasticsearch elasticsearch $JAVA_PACKAGE
fi
# Configure Elasticsearch
log "[3/8] Configuring Elasticsearch..."
cat > /etc/elasticsearch/elasticsearch.yml << EOF
cluster.name: jaeger-cluster
node.name: jaeger-node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 127.0.0.1
http.port: 9200
discovery.type: single-node
xpack.security.enabled: false
xpack.security.enrollment.enabled: false
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false
EOF
systemctl enable elasticsearch
systemctl start elasticsearch
# Install Jaeger
log "[4/8] Installing Jaeger..."
useradd -r -s /bin/false $JAEGER_USER 2>/dev/null || true
mkdir -p /opt/jaeger
wget -O /tmp/jaeger.tar.gz "https://github.com/jaegertracing/jaeger/releases/download/v${JAEGER_VERSION}/jaeger-${JAEGER_VERSION}-linux-amd64.tar.gz"
tar -xzf /tmp/jaeger.tar.gz -C /tmp
cp /tmp/jaeger-${JAEGER_VERSION}-linux-amd64/* /opt/jaeger/
chown -R $JAEGER_USER:$JAEGER_USER /opt/jaeger
chmod 755 /opt/jaeger/*
# Create Jaeger systemd services
log "[5/8] Creating Jaeger services..."
cat > /etc/systemd/system/jaeger-collector.service << EOF
[Unit]
Description=Jaeger Collector
After=elasticsearch.service
Requires=elasticsearch.service
[Service]
Type=simple
User=$JAEGER_USER
Group=$JAEGER_USER
Environment=SPAN_STORAGE_TYPE=elasticsearch
ExecStart=/opt/jaeger/jaeger-collector --es.server-urls=http://$ES_HOST:9200
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
cat > /etc/systemd/system/jaeger-query.service << EOF
[Unit]
Description=Jaeger Query
After=elasticsearch.service
Requires=elasticsearch.service
[Service]
Type=simple
User=$JAEGER_USER
Group=$JAEGER_USER
Environment=SPAN_STORAGE_TYPE=elasticsearch
ExecStart=/opt/jaeger/jaeger-query --es.server-urls=http://$ES_HOST:9200
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable jaeger-collector jaeger-query
systemctl start jaeger-collector jaeger-query
# Configure ILM policy
log "[6/8] Configuring ILM policy..."
sleep 10 # Wait for Elasticsearch to be ready
curl -X PUT "http://$ES_HOST:9200/_ilm/policy/jaeger-ilm-policy" -H "Content-Type: application/json" -d '{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_size": "1gb",
"max_age": "1d"
}
}
},
"warm": {
"min_age": "2d",
"actions": {
"shrink": {
"number_of_shards": 1
}
}
},
"cold": {
"min_age": "'$RETENTION_DAYS'd",
"actions": {
"allocate": {
"number_of_replicas": 0
}
}
},
"delete": {
"min_age": "30d",
"actions": {
"delete": {}
}
}
}
}
}'
# Create archive script
log "[7/8] Creating archive and monitoring scripts..."
mkdir -p $ARCHIVE_DIR $LOG_DIR
chown -R $JAEGER_USER:$JAEGER_USER /var/lib/jaeger /var/log/jaeger
chmod 755 /var/lib/jaeger $ARCHIVE_DIR $LOG_DIR
cat > /opt/jaeger/archive-traces.sh << 'EOF'
#!/bin/bash
set -euo pipefail
ES_HOST="localhost:9200"
ARCHIVE_DIR="/var/lib/jaeger/archive"
RETENTION_DAYS=7
OLD_DATE=$(date -d "$RETENTION_DAYS days ago" '+%Y-%m-%d')
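archive_index() {
local index_name="$1"
local archive_file="$ARCHIVE_DIR/${index_name}-$(date +%Y%m%d).json.gz"
local response scroll_id
# A bare _search returns only 10 hits, so page through the index with the scroll API
response=$(curl -sf "$ES_HOST/$index_name/_search?scroll=10m&size=1000" \
-H 'Content-Type: application/json' -d '{"query":{"match_all":{}}}')
while [ "$(jq '.hits.hits | length' <<< "$response")" -gt 0 ]; do
jq -c '.hits.hits[]._source' <<< "$response"
scroll_id=$(jq -r '._scroll_id' <<< "$response")
response=$(curl -sf "$ES_HOST/_search/scroll" -H 'Content-Type: application/json' \
-d "{\"scroll\":\"10m\",\"scroll_id\":\"$scroll_id\"}")
done | gzip > "$archive_file"
# With set -euo pipefail, a failed export aborts the script before this line
echo "Successfully archived $index_name to $archive_file"
}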
INDICES=$(curl -s "${ES_HOST}/_cat/indices/jaeger-span-*?h=index,creation.date.string" | \
awk -v cutoff="$OLD_DATE" '$2 < cutoff {print $1}')
for index in $INDICES; do
archive_index "$index"
done
find "$ARCHIVE_DIR" -name "*.json.gz" -mtime +90 -delete
echo "Archive process completed"
EOF
chmod 755 /opt/jaeger/archive-traces.sh
chown $JAEGER_USER:$JAEGER_USER /opt/jaeger/archive-traces.sh
# Create monitoring script
cat > /opt/jaeger/storage-monitor.sh << 'EOF'
#!/bin/bash
set -euo pipefail
ES_HOST="localhost:9200"
LOG_FILE="/var/log/jaeger/storage-monitor.log"
ALERT_THRESHOLD_GB=80
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
CLUSTER_STATUS=$(curl -s "${ES_HOST}/_cluster/health" | jq -r '.status')
log "Cluster status: $CLUSTER_STATUS"
INDICES_INFO=$(curl -s "${ES_HOST}/_cat/indices/jaeger-*?v&h=index,health,status,store.size&s=store.size:desc")
log "Indices status:"
echo "$INDICES_INFO" >> "$LOG_FILE"
TOTAL_SIZE=$(curl -s "${ES_HOST}/_cat/indices/jaeger-*?h=store.size&bytes=gb" | \
awk '{sum += $1} END {print sum + 0}')
if [ "${TOTAL_SIZE:-0}" -gt "$ALERT_THRESHOLD_GB" ]; then
log "WARNING: Total Jaeger storage ($TOTAL_SIZE GB) exceeds threshold ($ALERT_THRESHOLD_GB GB)"
else
log "Storage usage: ${TOTAL_SIZE:-0} GB (within limits)"
fi
EOF
chmod 755 /opt/jaeger/storage-monitor.sh
chown $JAEGER_USER:$JAEGER_USER /opt/jaeger/storage-monitor.sh
# Create systemd timers
log "[8/8] Creating systemd timers..."
cat > /etc/systemd/system/jaeger-archive.service << EOF
[Unit]
Description=Jaeger Archive Service
After=elasticsearch.service
[Service]
Type=oneshot
User=$JAEGER_USER
Group=$JAEGER_USER
ExecStart=/opt/jaeger/archive-traces.sh
EOF
cat > /etc/systemd/system/jaeger-archive.timer << EOF
[Unit]
Description=Run Jaeger Archive Daily
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
EOF
cat > /etc/systemd/system/jaeger-monitor.service << EOF
[Unit]
Description=Jaeger Storage Monitor
After=elasticsearch.service
[Service]
Type=oneshot
User=$JAEGER_USER
Group=$JAEGER_USER
ExecStart=/opt/jaeger/storage-monitor.sh
EOF
cat > /etc/systemd/system/jaeger-monitor.timer << EOF
[Unit]
Description=Run Jaeger Monitor Hourly
[Timer]
OnCalendar=hourly
Persistent=true
[Install]
WantedBy=timers.target
EOF
systemctl daemon-reload
systemctl enable jaeger-archive.timer jaeger-monitor.timer
systemctl start jaeger-archive.timer jaeger-monitor.timer
# Verification
log "Verifying installation..."
sleep 5
if systemctl is-active --quiet elasticsearch; then
log "✓ Elasticsearch is running"
else
error "✗ Elasticsearch is not running"
fi
if systemctl is-active --quiet jaeger-collector; then
log "✓ Jaeger Collector is running"
else
error "✗ Jaeger Collector is not running"
fi
if systemctl is-active --quiet jaeger-query; then
log "✓ Jaeger Query is running"
else
error "✗ Jaeger Query is not running"
fi
if curl -s "http://$ES_HOST:9200/_cluster/health" | jq -e '.status' >/dev/null 2>&1; then
log "✓ Elasticsearch cluster is accessible"
else
error "✗ Elasticsearch cluster is not accessible"
fi
log "Installation completed successfully!"
log "Jaeger Query UI: http://localhost:16686"
log "Elasticsearch: http://$ES_HOST:9200"
log "Archive directory: $ARCHIVE_DIR"
log "Log directory: $LOG_DIR"
Review the script before running. It must run as root, so execute it with: sudo bash install.sh
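The install script passes the --retention-days value straight into the ILM policy JSON, so a value such as 7d or an empty string would produce an invalid policy. A guard like the one sketched below could be added to the argument parsing. is_positive_int is a hypothetical helper name and is not part of the script above.

```shell
#!/usr/bin/env bash
# Succeeds only for a positive whole number without sign, unit, or leading zero.
is_positive_int() {
  [[ "$1" =~ ^[1-9][0-9]*$ ]]
}
```

In the script this could be used as: is_positive_int "$RETENTION_DAYS" || { error "--retention-days must be a positive integer"; exit 1; }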