Set up ELK stack alerting with Watcher and email notifications for monitoring and incident response

Intermediate · 45 min · Apr 17, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure Elasticsearch Watcher to monitor log data and automatically send email alerts when critical system events occur. Create sophisticated alert conditions, manage email notification templates, and set up automated incident response workflows.

Prerequisites

  • Elasticsearch 8.x cluster with a license that includes Watcher (Gold or higher, or an active trial; see the license check below)
  • SMTP email account or service
  • Log data indexed in Elasticsearch
  • Basic knowledge of Elasticsearch queries
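
You can confirm the license tier up front with the license API (Watcher requires Gold or higher, or an active trial):

curl -X GET "localhost:9200/_license?pretty"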

What this solves

Elasticsearch Watcher enables proactive monitoring by automatically detecting anomalies, errors, and critical events in your log data. Instead of manually checking dashboards, Watcher runs scheduled queries and triggers email notifications when predefined conditions are met. This tutorial shows you how to configure email alerts for system failures, security incidents, and performance degradation.

Step-by-step configuration

Enable Watcher in Elasticsearch

Watcher is included with Elasticsearch but requires a Gold or higher license (a 30-day trial license also works). It is enabled by default; the setting below in elasticsearch.yml makes that explicit.

xpack.watcher.enabled: true

Restart Elasticsearch to apply the changes.

sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch

Configure email account settings

Set up SMTP configuration for sending email notifications. Add these settings to your elasticsearch.yml file. Note that in 8.x the SMTP password is a secure setting: it cannot go in elasticsearch.yml and must be stored in the Elasticsearch keystore (next step).

xpack.notification.email:
  account:
    smtp_account:
      profile: gmail
      email_defaults:
        from: 'alerts@example.com'
      smtp:
        auth: true
        starttls.enable: true
        host: smtp.gmail.com
        port: 587
        user: 'alerts@example.com'
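
If you relay through a provider other than Gmail, the standard profile works with any SMTP server. A hypothetical example (host and addresses are placeholders, not values from this tutorial):

xpack.notification.email:
  account:
    smtp_account:
      profile: standard
      email_defaults:
        from: 'alerts@example.com'
      smtp:
        auth: true
        starttls.enable: true
        host: smtp.example.com
        port: 587
        user: 'alerts@example.com'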

Create keystore for email credentials

Store the SMTP password in the Elasticsearch keystore as a secure setting instead of in plain-text configuration files. On package installs a keystore already exists, so you can usually skip the create step (it prompts before overwriting an existing keystore).

sudo /usr/share/elasticsearch/bin/elasticsearch-keystore create
sudo /usr/share/elasticsearch/bin/elasticsearch-keystore add xpack.notification.email.account.smtp_account.smtp.secure_password

Set appropriate permissions on the keystore file.

sudo chown elasticsearch:elasticsearch /etc/elasticsearch/elasticsearch.keystore
sudo chmod 660 /etc/elasticsearch/elasticsearch.keystore
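
To confirm the secure setting was stored, list the keystore entries:

sudo /usr/share/elasticsearch/bin/elasticsearch-keystore list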

Restart Elasticsearch with new configuration

Apply the email configuration changes by restarting the service.

sudo systemctl restart elasticsearch

Verify Watcher is running, then send a test email. There is no dedicated email-test endpoint; executing an inline, one-off watch through the execute watch API serves that purpose.

curl -X GET "localhost:9200/_watcher/stats?pretty"
curl -X POST "localhost:9200/_watcher/watch/_execute?pretty" -H 'Content-Type: application/json' -d '{
  "watch": {
    "trigger": { "schedule": { "interval": "1h" } },
    "input": { "simple": {} },
    "condition": { "always": {} },
    "actions": {
      "test_email": {
        "email": {
          "account": "smtp_account",
          "to": ["admin@example.com"],
          "subject": "Test Email",
          "body": "This is a test email from Elasticsearch Watcher"
        }
      }
    }
  }
}'

Create a basic error monitoring watch

Set up a watch that monitors for ERROR level log entries and sends email alerts when they occur frequently.

curl -X PUT "localhost:9200/_watcher/watch/error_monitor?pretty" -H 'Content-Type: application/json' -d '{
  "trigger": {
    "schedule": {
      "interval": "5m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": ["logstash-", "filebeat-"],
        "rest_total_hits_as_int": true,
        "body": {
          "query": {
            "bool": {
              "must": [
                {
                  "match": {
                    "log.level": "ERROR"
                  }
                },
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-5m"
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "error_count": {
              "terms": {
                "field": "host.name.keyword",
                "size": 10
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "payload.hits.total": {
        "gt": 10
      }
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "profile": "standard",
        "account": "smtp_account",
        "to": ["sysadmin@example.com"],
        "subject": "High Error Rate Detected - {{payload.hits.total}} errors in 5 minutes",
        "body": {
          "html": "

Error Alert

Detected {{payload.hits.total}} ERROR level log entries in the last 5 minutes.

Top Affected Hosts:

    {{#payload.aggregations.error_count.buckets}}
  • {{key}}: {{doc_count}} errors
  • {{/payload.aggregations.error_count.buckets}}

Please investigate immediately.

" } } } } }'

Create a disk space monitoring watch

Monitor system disk usage and alert when disk space exceeds critical thresholds.

curl -X PUT "localhost:9200/_watcher/watch/disk_space_monitor?pretty" -H 'Content-Type: application/json' -d '{
  "trigger": {
    "schedule": {
      "interval": "10m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": ["metricbeat-*"],
        "rest_total_hits_as_int": true,
        "body": {
          "query": {
            "bool": {
              "must": [
                {
                  "match": {
                    "metricset.name": "filesystem"
                  }
                },
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-10m"
                    }
                  }
                },
                {
                  "range": {
                    "system.filesystem.used.pct": {
                      "gte": 0.85
                    }
                  }
                }
              ]
            }
          },
          "sort": [
            {
              "system.filesystem.used.pct": {
                "order": "desc"
              }
            }
          ],
          "size": 10
        }
      }
    }
  },
  "condition": {
    "compare": {
      "payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "profile": "standard",
        "account": "smtp_account",
        "to": ["sysadmin@example.com"],
        "subject": "Critical Disk Space Alert - {{payload.hits.total}} filesystems above 85%",
        "body": {
          "html": "

Disk Space Warning

The following filesystems have exceeded 85% usage:

{{#payload.hits.hits}}{{/payload.hits.hits}}
HostMount PointUsage %Available
{{_source.host.name}}{{_source.system.filesystem.mount_point}}{{_source.system.filesystem.used.pct}}%{{_source.system.filesystem.available}}

Please free up disk space immediately.

" } } } } }'

Create a failed login monitoring watch

Monitor authentication logs for failed login attempts and detect potential brute force attacks.

curl -X PUT "localhost:9200/_watcher/watch/failed_login_monitor?pretty" -H 'Content-Type: application/json' -d '{
  "trigger": {
    "schedule": {
      "interval": "2m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": ["filebeat-*"],
        "rest_total_hits_as_int": true,
        "body": {
          "query": {
            "bool": {
              "must": [
                {
                  "match_phrase": {
                    "message": "authentication failure"
                  }
                },
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-2m"
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "failed_by_ip": {
              "terms": {
                "field": "source.ip.keyword",
                "size": 10,
                "min_doc_count": 5
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "payload.aggregations.failed_by_ip.buckets.0.doc_count": {
        "gte": 5
      }
    }
  },
  "actions": {
    "send_security_alert": {
      "email": {
        "profile": "standard",
        "account": "smtp_account",
        "to": ["security@example.com"],
        "subject": "Security Alert - Potential Brute Force Attack Detected",
        "priority": "high",
        "body": {
          "html": "

Security Alert

Potential brute force attack detected!

Multiple failed login attempts from the same IP addresses:

    {{#payload.aggregations.failed_by_ip.buckets}}
  • {{key}}: {{doc_count}} failed attempts
  • {{/payload.aggregations.failed_by_ip.buckets}}

Total failed attempts: {{payload.hits.total}}

Please investigate and consider blocking these IP addresses.

" } } } } }'

Set up service availability monitoring

Create a watch that monitors service health checks and alerts when critical services become unavailable.

curl -X PUT "localhost:9200/_watcher/watch/service_health_monitor?pretty" -H 'Content-Type: application/json' -d '{
  "trigger": {
    "schedule": {
      "interval": "3m"
    }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": ["heartbeat-*"],
        "rest_total_hits_as_int": true,
        "body": {
          "query": {
            "bool": {
              "must": [
                {
                  "match": {
                    "monitor.status": "down"
                  }
                },
                {
                  "range": {
                    "@timestamp": {
                      "gte": "now-3m"
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "down_services": {
              "terms": {
                "field": "monitor.name.keyword",
                "size": 20
              },
              "aggs": {
                "latest_error": {
                  "top_hits": {
                    "size": 1,
                    "sort": [
                      {
                        "@timestamp": {
                          "order": "desc"
                        }
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "payload.hits.total": {
        "gt": 0
      }
    }
  },
  "actions": {
    "send_service_alert": {
      "email": {
        "profile": "standard",
        "account": "smtp_account",
        "to": ["oncall@example.com"],
        "subject": "Service Down Alert - {{payload.aggregations.down_services.buckets.size}} services unavailable",
        "priority": "high",
        "body": {
          "html": "

Service Availability Alert

The following services are currently down:

{{#payload.aggregations.down_services.buckets}}{{/payload.aggregations.down_services.buckets}}
ServiceError CountURLError Message
{{key}}{{doc_count}}{{latest_error.hits.hits.0._source.url.full}}{{latest_error.hits.hits.0._source.error.message}}

Immediate investigation required.

" } } } } }'

Configure watch throttling and scheduling

Prevent alert spam by adding throttle periods to your watches. Watcher has no partial-update API; to change a watch you PUT the complete definition again with the new settings included. The example below re-creates the error watch with a watch-level throttle_period of 15 minutes and a 30-minute throttle on the email action.

curl -X PUT "localhost:9200/_watcher/watch/error_monitor?pretty" -H 'Content-Type: application/json' -d '{
  "trigger": { "schedule": { "interval": "5m" } },
  "input": {
    "search": {
      "request": {
        "indices": ["logstash-*", "filebeat-*"],
        "rest_total_hits_as_int": true,
        "body": {
          "query": {
            "bool": {
              "must": [
                { "match": { "log.level": "ERROR" } },
                { "range": { "@timestamp": { "gte": "now-5m" } } }
              ]
            }
          },
          "aggs": {
            "error_count": { "terms": { "field": "host.name.keyword", "size": 10 } }
          }
        }
      }
    }
  },
  "condition": { "compare": { "ctx.payload.hits.total": { "gt": 10 } } },
  "throttle_period": "15m",
  "actions": {
    "send_email": {
      "throttle_period": "30m",
      "email": {
        "profile": "standard",
        "account": "smtp_account",
        "to": ["sysadmin@example.com"],
        "subject": "High Error Rate Detected - {{ctx.payload.hits.total}} errors in 5 minutes",
        "body": {
          "html": "<h2>Error Alert</h2><p>Detected {{ctx.payload.hits.total}} ERROR level log entries in the last 5 minutes.</p><h3>Top Affected Hosts</h3><ul>{{#ctx.payload.aggregations.error_count.buckets}}<li>{{key}}: {{doc_count}} errors</li>{{/ctx.payload.aggregations.error_count.buckets}}</ul><p>Please investigate immediately.</p><p>This alert will not repeat for 30 minutes.</p>"
        }
      }
    }
  }
}'

Verify your setup

Test your Watcher configuration and email functionality with these verification commands.

Check Watcher status

curl -X GET "localhost:9200/_watcher/stats?pretty"

List all configured watches

curl -X GET "localhost:9200/_watcher/watch/_all?pretty"

Execute a watch manually for testing

curl -X POST "localhost:9200/_watcher/watch/error_monitor/_execute?pretty"

Check watch execution history

curl -X GET "localhost:9200/.watcher-history-*/_search?pretty" -H 'Content-Type: application/json' -d '{ "query": { "match_all": {} }, "sort": [ { "result.execution_time": { "order": "desc" } } ], "size": 5 }'

Send a test email via an inline watch

curl -X POST "localhost:9200/_watcher/watch/_execute?pretty" -H 'Content-Type: application/json' -d '{ "watch": { "trigger": { "schedule": { "interval": "1h" } }, "input": { "simple": {} }, "condition": { "always": {} }, "actions": { "test_email": { "email": { "account": "smtp_account", "to": ["test@example.com"], "subject": "Watcher Test Email", "body": "This email confirms Watcher is configured correctly." } } } } }'
Note: Monitor your email provider's sending limits and consider using dedicated SMTP services like SendGrid or Amazon SES for production environments to ensure reliable alert delivery.

Common issues

Symptom | Cause | Fix
Watcher not executing | License expired or missing | Check license status with curl -X GET "localhost:9200/_license" and install a valid license
Emails not sending | SMTP authentication failure | Verify the smtp.secure_password entry in the keystore and send a test email with an inline watch via the _execute API
Too many alert emails | Missing throttle configuration | Add throttle_period at the watch and action level
Watch condition never triggers | Index pattern or query syntax errors | Test the query manually in Kibana Dev Tools and verify index names
Email template rendering fails | Mustache syntax errors | Use the _execute API to test templates and check watch history for errors
Watch execution timeouts | Complex queries on large datasets | Optimize queries with tight time ranges and field filtering
