Monitor System Time Drift with Prometheus and Grafana

Set up comprehensive time synchronization monitoring with Prometheus node exporter metrics, Grafana dashboards, and automated alerting to prevent system clock drift issues in production environments.

Prerequisites

Root access to target servers
A running Prometheus server and Grafana instance, plus basic knowledge of both
Understanding of NTP and time synchronization concepts
Network access to NTP servers (UDP port 123)

What this solves

System time drift can cause authentication failures, log correlation issues, and database consistency problems in distributed systems. This tutorial shows you how to monitor time synchronization health across your infrastructure with Prometheus metrics and Grafana dashboards, and how to send automatic notifications through Alertmanager when clocks drift beyond acceptable thresholds.

Step-by-step configuration

Install and configure Prometheus node exporter

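Node exporter exposes time-related metrics, including the kernel clock offset (node_timex_offset_seconds) and the synchronization status (node_timex_sync_status). Install it first so that Prometheus has time metrics to scrape. The commands below download the linux-amd64 build; pick the archive that matches your CPU architecture.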

wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz
sudo cp node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/
sudo useradd --no-create-home --shell /bin/false node_exporter
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter

Create systemd service for node exporter

Configure node exporter to run as a system service with the systemd, NTP, and time collectors enabled, so that time synchronization metrics are collected continuously. Save the following unit as /etc/systemd/system/node_exporter.service.

[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter --collector.systemd --collector.ntp --collector.time
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Enable and start node exporter

Start the service and verify it's exposing time metrics on port 9100.

sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
sudo systemctl status node_exporter
curl -s localhost:9100/metrics | grep node_timex
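
The raw exposition output can be hard to scan, so a small filter helps when checking a host by hand. The sketch below is illustrative only: `extract_timex` is a hypothetical helper name, and the sample metric values are made up so that the snippet runs without a live exporter.

```shell
# extract_timex (hypothetical helper): read Prometheus text-format metrics
# on stdin and print the clock offset in milliseconds and the sync flag.
extract_timex() {
    awk '
        $1 == "node_timex_offset_seconds" { printf "offset_ms=%.3f\n", $2 * 1000 }
        $1 == "node_timex_sync_status"    { printf "synced=%d\n", $2 }
    '
}

# Made-up sample in the exposition format, for demonstration only.
sample_metrics='# TYPE node_timex_offset_seconds gauge
node_timex_offset_seconds -0.000123
# TYPE node_timex_sync_status gauge
node_timex_sync_status 1'

printf '%s\n' "$sample_metrics" | extract_timex
```

On a real host, pipe `curl -s localhost:9100/metrics` into `extract_timex` instead of the sample text.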

Install and configure chrony for NTP

Install chrony to provide accurate time synchronization and enable detailed time metrics collection.

# Debian and Ubuntu
sudo apt update
sudo apt install -y chrony

# RHEL, AlmaLinux, Rocky Linux, and Fedora
sudo dnf install -y chrony

Configure chrony with monitoring settings

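Enable statistics and detailed logging for better time drift monitoring and troubleshooting. The configuration file is /etc/chrony/chrony.conf on Debian and Ubuntu and /etc/chrony.conf on RHEL-family systems.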

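# Public NTP servers
pool 2.pool.ntp.org iburst
pool 1.pool.ntp.org iburst
pool 0.pool.ntp.org iburst

# Record statistics
driftfile /var/lib/chrony/chrony.drift
dumpdir /var/lib/chrony
logdir /var/log/chrony
log statistics measurements tracking

# Ignore frequency estimates whose skew exceeds 100 ppm
maxupdateskew 100.0

# Enable command port for monitoring
cmdport 323
cmdallow 127.0.0.1

# Answer NTP queries from localhost so that node exporter's ntp collector,
# which queries 127.0.0.1 by default, can read this daemon's state
allow 127.0.0.1

# Step the clock if the offset exceeds 1 second, but only during the first 3 updates
makestep 1.0 3

# Enable RTC synchronization
rtcsync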
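
The `makestep 1.0 3` line is the easiest setting to misread, so here is a rough model of the decision it encodes. This is a sketch of the documented behaviour, not chrony's implementation: `correction_mode` is a hypothetical helper, and its second argument is assumed to be the ordinal number of the clock update since chronyd started.

```shell
# correction_mode (hypothetical helper): model of "makestep 1.0 3".
#   $1 = measured offset in seconds (sign is ignored)
#   $2 = number of the clock update since chronyd started (1 = first)
# Prints "step" when the clock would be stepped, "slew" when it would be
# adjusted gradually.
correction_mode() {
    awk -v offset="$1" -v updates="$2" -v threshold=1.0 -v limit=3 'BEGIN {
        if (offset < 0) offset = -offset
        if (offset > threshold && updates <= limit) print "step"
        else print "slew"
    }'
}

correction_mode 2.5 1    # large offset right after start
correction_mode 2.5 4    # same offset, but past the first 3 updates
correction_mode 0.2 1    # small offset
```

The three calls print `step`, `slew`, and `slew`. A large offset that appears after the first three updates is therefore corrected gradually, which is one reason it stays visible in node_timex_offset_seconds long enough for an alert to fire.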

Start chrony service

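Enable chrony at boot and restart it so that the new configuration takes effect.

# The unit is named chrony on Debian/Ubuntu and chronyd on RHEL-family systems
sudo systemctl enable --now chrony
sudo systemctl restart chrony
sudo systemctl status chrony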
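
Once chrony is running, `chronyc tracking` reports the current offset on its "System time" line. The sketch below shows one way to turn that line into a signed number for quick comparisons with the Prometheus metric. `tracking_offset` is a hypothetical helper, the sample text is illustrative rather than captured from a real host, and the sign convention (negative when the system clock is slow) is an assumption of this example.

```shell
# tracking_offset (hypothetical helper): read `chronyc tracking` output on
# stdin and print the "System time" offset in seconds, negative when the
# system clock is slow of NTP time and positive when it is fast.
tracking_offset() {
    awk -F' *: *' '
        $1 == "System time" {
            split($2, part, " ")      # part[1] = value, part[3] = slow|fast
            sign = (part[3] == "slow") ? "-" : "+"
            print sign part[1]
        }
    '
}

# Illustrative sample in the layout printed by `chronyc tracking`.
sample_tracking='Stratum         : 3
System time     : 0.000006523 seconds slow of NTP time
Last offset     : -0.000006747 seconds
Leap status     : Normal'

printf '%s\n' "$sample_tracking" | tracking_offset
```

On a live system the equivalent call is `chronyc tracking | tracking_offset`.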

Configure Prometheus to scrape time metrics

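Add the node exporter target to your Prometheus configuration (typically /etc/prometheus/prometheus.yml) to collect time-related metrics. Replace localhost with the address of each server you want to monitor.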

global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "time_drift_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - localhost:9093

scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
    scrape_interval: 10s
    metrics_path: /metrics

Create Prometheus alerting rules for time drift

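Define alert rules that trigger when system clocks drift beyond acceptable thresholds or NTP synchronization fails. Save them as time_drift_rules.yml in the same directory as the Prometheus configuration, matching the rule_files entry above.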

groups:
  - name: time_drift_alerts
    rules:
    - alert: ClockDriftHigh
      expr: abs(node_timex_offset_seconds) > 0.05
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "System clock drift detected on {{ $labels.instance }}"
        description: "Clock offset is {{ $value }}s, exceeding 50ms threshold"
    
    - alert: ClockDriftCritical
      expr: abs(node_timex_offset_seconds) > 0.5
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: "Critical clock drift on {{ $labels.instance }}"
        description: "Clock offset is {{ $value }}s, exceeding 500ms threshold"
    
    - alert: NTPSyncLost
      expr: node_timex_sync_status != 1
      for: 3m
      labels:
        severity: critical
      annotations:
        summary: "NTP synchronization lost on {{ $labels.instance }}"
        description: "System clock is not synchronized with NTP servers"
    
    - alert: TimeServerUnreachable
      expr: node_ntp_stratum == 16
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "NTP servers unreachable on {{ $labels.instance }}"
        description: "System cannot reach configured NTP servers"
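
To see how the two offset rules partition values, the thresholds can be mirrored in a few lines of shell. This is only a sketch for reasoning about the numbers: `classify_offset` is a hypothetical helper, and it ignores the `for:` durations that Prometheus applies before an alert actually fires.

```shell
# classify_offset (hypothetical helper): mirror the ClockDriftHigh (50 ms)
# and ClockDriftCritical (500 ms) thresholds for one offset in seconds.
classify_offset() {
    awk -v o="$1" 'BEGIN {
        if (o < 0) o = -o                  # abs(), as in the PromQL expressions
        if (o > 0.5)       print "critical"
        else if (o > 0.05) print "warning"
        else               print "ok"
    }'
}

classify_offset 0.012
classify_offset -0.073
classify_offset 0.9
```

The three calls print `ok`, `warning`, and `critical`. In Prometheus itself, an offset above 500 ms satisfies both expressions, so ClockDriftHigh and ClockDriftCritical can fire together for the same instance.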

Install Alertmanager for notifications

Set up Alertmanager to handle time drift alerts and send notifications via email or Slack.

wget https://github.com/prometheus/alertmanager/releases/download/v0.26.0/alertmanager-0.26.0.linux-amd64.tar.gz
tar xvfz alertmanager-0.26.0.linux-amd64.tar.gz
sudo cp alertmanager-0.26.0.linux-amd64/alertmanager /usr/local/bin/
sudo cp alertmanager-0.26.0.linux-amd64/amtool /usr/local/bin/
sudo mkdir -p /etc/alertmanager /var/lib/alertmanager
sudo useradd --no-create-home --shell /bin/false alertmanager
sudo chown -R alertmanager:alertmanager /etc/alertmanager /var/lib/alertmanager

Configure Alertmanager for time drift notifications

Set up notification channels and routing for time drift alerts with appropriate escalation. Save the configuration as /etc/alertmanager/alertmanager.yml.

global:
  smtp_smarthost: 'localhost:587'
  smtp_from: 'alerts@example.com'

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'time-drift-alerts'
  routes:
  - match:
      severity: critical
    receiver: 'critical-alerts'
    repeat_interval: 15m

receivers:
- name: 'time-drift-alerts'
  email_configs:
  - to: 'ops-team@example.com'
    headers:
      Subject: 'Time Drift Alert: {{ .GroupLabels.alertname }}'
    text: |
      {{ range .Alerts }}
      Alert: {{ .Annotations.summary }}
      Description: {{ .Annotations.description }}
      Instance: {{ .Labels.instance }}
      Severity: {{ .Labels.severity }}
      {{ end }}

- name: 'critical-alerts'
  email_configs:
  - to: 'critical-ops@example.com'
    headers:
      Subject: 'CRITICAL: Time Drift Alert'
    text: |
      {{ range .Alerts }}
      CRITICAL TIME DRIFT DETECTED
      Alert: {{ .Annotations.summary }}
      Description: {{ .Annotations.description }}
      Instance: {{ .Labels.instance }}
      {{ end }}
  slack_configs:
  - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
    channel: '#alerts'
    title: 'Critical Time Drift Alert'
    text: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'

Create Alertmanager systemd service

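Configure Alertmanager to run as a system service for reliable alert handling. Save the following unit as /etc/systemd/system/alertmanager.service, then run sudo systemctl daemon-reload.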

[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
User=alertmanager
Group=alertmanager
Type=simple
WorkingDirectory=/etc/alertmanager
ExecStart=/usr/local/bin/alertmanager --config.file=/etc/alertmanager/alertmanager.yml --storage.path=/var/lib/alertmanager
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Create Grafana dashboard for time drift visualization

Define a dashboard that shows the current clock offset, the synchronization status, and the offset trend over time. Save the following JSON as /tmp/time_drift_dashboard.json.

{
  "dashboard": {
    "id": null,
    "uid": "time-drift",
    "title": "System Time Drift Monitoring",
    "tags": ["time", "ntp", "monitoring"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "Clock Offset",
        "type": "stat",
        "targets": [
          {
            "expr": "abs(node_timex_offset_seconds) * 1000",
            "legendFormat": "Offset (ms)"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "thresholds": {
              "steps": [
                {"color": "green", "value": null},
                {"color": "yellow", "value": 50},
                {"color": "red", "value": 500}
              ]
            }
          }
        },
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
      },
      {
        "id": 2,
        "title": "NTP Synchronization Status",
        "type": "stat",
        "targets": [
          {
            "expr": "node_timex_sync_status",
            "legendFormat": "Sync Status"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "mappings": [
              {
                "type": "value",
                "options": {
                  "0": {"text": "Not Synced", "color": "red"},
                  "1": {"text": "Synced", "color": "green"}
                }
              }
            ]
          }
        },
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
      },
      {
        "id": 3,
        "title": "Clock Offset Over Time",
        "type": "timeseries",
        "targets": [
          {
            "expr": "node_timex_offset_seconds * 1000",
            "legendFormat": "Clock Offset (ms)"
          }
        ],
        "fieldConfig": {
          "defaults": {"unit": "ms"}
        },
        "gridPos": {"h": 9, "w": 24, "x": 0, "y": 8}
      }
    ],
    "time": {
      "from": "now-1h",
      "to": "now"
    },
    "refresh": "30s"
  }
}

Import dashboard into Grafana

Use the Grafana HTTP API to import the time drift monitoring dashboard. The example uses Grafana's default admin:admin credentials; substitute your own.

curl -X POST \
  http://admin:admin@localhost:3000/api/dashboards/db \
  -H 'Content-Type: application/json' \
  -d @/tmp/time_drift_dashboard.json

Start all services

Enable and start all monitoring services to begin time drift detection.

sudo systemctl enable --now prometheus
sudo systemctl enable --now alertmanager
sudo systemctl enable --now grafana-server

Configure alert escalation policies

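Set up escalation rules for persistent time drift issues that require immediate attention. Alertmanager accepts only one top-level route and one receivers list, so merge these entries into the existing sections of alertmanager.yml instead of adding a second route block.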

route:
  receiver: 'default'
  routes:
  - match:
      alertname: ClockDriftCritical
    receiver: 'critical-escalation'
    continue: true
    routes:
    - match:
        severity: critical
      receiver: 'pager-duty'
      repeat_interval: 5m
      group_wait: 0s

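receivers:
# Every receiver named in the route must be defined, or Alertmanager refuses
# to load the configuration. The keys below are placeholders.
- name: 'default'
- name: 'critical-escalation'
  webhook_configs:
  - url: 'https://api.pagerduty.com/integration/YOUR-KEY/enqueue'
    send_resolved: true
- name: 'pager-duty'
  pagerduty_configs:
  - routing_key: 'YOUR-PAGERDUTY-ROUTING-KEY'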

Verify your setup

Check that all components are running and collecting time metrics properly.

# Verify node exporter is exposing time metrics
curl -s localhost:9100/metrics | grep -E "(timex_offset|timex_sync)"

# Check chrony synchronization status
chronyc tracking
chronyc sources -v

# Verify Prometheus is scraping metrics
curl -s "localhost:9090/api/v1/query?query=node_timex_offset_seconds"

# List the loaded alert rules
curl -s "localhost:9090/api/v1/rules" | jq '.data.groups[].rules[].name'

# Check Alertmanager status
curl -s localhost:9093/api/v2/status | jq .

# Verify Grafana dashboard
curl -s -u admin:admin "localhost:3000/api/dashboards/uid/time-drift"

Note: Time drift monitoring requires at least 5-10 minutes of data collection before accurate trends appear in Grafana dashboards.

Common issues

Symptom	Cause	Fix
No time metrics in Prometheus	Node exporter not running or misconfigured	`sudo systemctl restart node_exporter` and check `--collector.ntp` flag
Clock drift alerts not firing	Alert rules not loaded or thresholds too high	Verify rules with `promtool check rules time_drift_rules.yml`
NTP sync status shows 0	Chrony not synchronizing with time servers	Check firewall rules for UDP 123 and verify NTP pool connectivity
Alertmanager not sending emails	SMTP configuration incorrect	Validate the file with `amtool check-config /etc/alertmanager/alertmanager.yml`, then verify SMTP settings
Grafana dashboard shows no data	Data source not configured or wrong query	Verify Prometheus data source URL and test queries manually
High clock drift on VM	Hypervisor time synchronization disabled	Enable VMware Tools time sync or Hyper-V time integration services

Advanced configuration

Fine-tune your time monitoring setup for different environments and use cases. You can configure multiple NTP sources, set custom drift thresholds based on your application requirements, and integrate with existing monitoring systems. For high-precision applications, consider using hardware time sources and implementing stepped time correction policies. The monitoring system can also be extended to track time server performance and automatically switch between time sources during outages.

Next steps

Configure NGINX monitoring with Prometheus and Grafana dashboards for comprehensive web server monitoring
Configure Prometheus Blackbox Exporter for endpoint monitoring to monitor service availability
Configure backup monitoring with Prometheus and Grafana for infrastructure oversight
Implement Grafana advanced alerting with webhooks for notification integration
Configure Linux system time synchronization with chrony and NTP hardening for security best practices

Automated install script

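Run this script to automate the node exporter and chrony steps and to generate Prometheus configuration templates. It does not install Prometheus, Alertmanager, or Grafana.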

install.sh

#!/usr/bin/env bash

set -euo pipefail

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

# Configuration
NODE_EXPORTER_VERSION="1.7.0"
PROMETHEUS_PORT="9090"
NODE_EXPORTER_PORT="9100"

# Usage function
usage() {
    echo "Usage: $0 [--prometheus-host HOSTNAME] [--no-prometheus]"
    echo "  --prometheus-host: Hostname/IP where Prometheus is running (default: localhost)"
    echo "  --no-prometheus: Skip Prometheus configuration steps"
    exit 1
}

# Parse arguments
PROMETHEUS_HOST="localhost"
SKIP_PROMETHEUS=false

while [[ $# -gt 0 ]]; do
    case $1 in
        --prometheus-host)
            PROMETHEUS_HOST="$2"
            shift 2
            ;;
        --no-prometheus)
            SKIP_PROMETHEUS=true
            shift
            ;;
        -h|--help)
            usage
            ;;
        *)
            echo -e "${RED}Unknown option: $1${NC}"
            usage
            ;;
    esac
done

# Cleanup function for rollback
cleanup() {
    if [[ $? -ne 0 ]]; then
        echo -e "${RED}Installation failed. Cleaning up...${NC}"
        systemctl stop node_exporter 2>/dev/null || true
        systemctl disable node_exporter 2>/dev/null || true
        rm -f /etc/systemd/system/node_exporter.service
        rm -f /usr/local/bin/node_exporter
        userdel node_exporter 2>/dev/null || true
    fi
}
trap cleanup ERR

# Check prerequisites
echo -e "${YELLOW}[1/8] Checking prerequisites...${NC}"
if [[ $EUID -ne 0 ]]; then
    echo -e "${RED}This script must be run as root${NC}"
    exit 1
fi

# Detect distribution
if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
        ubuntu|debian)
            PKG_MGR="apt"
            PKG_INSTALL="apt install -y"
            PKG_UPDATE="apt update"
            ;;
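        almalinux|rocky|centos|rhel|ol|fedora)
            PKG_MGR="dnf"
            PKG_INSTALL="dnf install -y"
            # Use makecache: "check-update || true" breaks when expanded from a
            # variable, because "||" and "true" are passed to dnf as arguments
            PKG_UPDATE="dnf makecache"
            ;;
        amzn)
            PKG_MGR="yum"
            PKG_INSTALL="yum install -y"
            PKG_UPDATE="yum makecache"
            ;;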
        *)
            echo -e "${RED}Unsupported distribution: $ID${NC}"
            exit 1
            ;;
    esac
else
    echo -e "${RED}Cannot detect distribution${NC}"
    exit 1
fi

echo -e "${GREEN}Distribution detected: $PRETTY_NAME${NC}"

# Update package manager
echo -e "${YELLOW}[2/8] Updating package manager...${NC}"
$PKG_UPDATE

# Install required packages
echo -e "${YELLOW}[3/8] Installing required packages...${NC}"
$PKG_INSTALL wget tar chrony curl

# Download and install node exporter
echo -e "${YELLOW}[4/8] Installing Node Exporter...${NC}"
cd /tmp
wget -q "https://github.com/prometheus/node_exporter/releases/download/v${NODE_EXPORTER_VERSION}/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz"
tar xzf "node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz"
cp "node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64/node_exporter" /usr/local/bin/
chown root:root /usr/local/bin/node_exporter
chmod 755 /usr/local/bin/node_exporter

# Create node_exporter user
if ! id "node_exporter" &>/dev/null; then
    useradd --no-create-home --shell /bin/false node_exporter
fi

# Create systemd service
echo -e "${YELLOW}[5/8] Creating systemd service...${NC}"
cat > /etc/systemd/system/node_exporter.service << 'EOF'
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter --collector.systemd --collector.ntp --collector.time
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

chmod 644 /etc/systemd/system/node_exporter.service
systemctl daemon-reload
systemctl enable node_exporter
systemctl start node_exporter

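# Configure chrony
echo -e "${YELLOW}[6/8] Configuring chrony...${NC}"
# Debian/Ubuntu use /etc/chrony/chrony.conf and the chrony unit;
# RHEL-family systems use /etc/chrony.conf and the chronyd unit
if [[ "$PKG_MGR" == "apt" ]]; then
    CHRONY_CONF="/etc/chrony/chrony.conf"
    CHRONY_SVC="chrony"
else
    CHRONY_CONF="/etc/chrony.conf"
    CHRONY_SVC="chronyd"
fi
cat > "$CHRONY_CONF" << 'EOF'
# Public NTP servers
pool 2.pool.ntp.org iburst
pool 1.pool.ntp.org iburst
pool 0.pool.ntp.org iburst

# Record statistics
driftfile /var/lib/chrony/chrony.drift
dumpdir /var/lib/chrony
logdir /var/log/chrony
log statistics measurements tracking

# Ignore frequency estimates whose skew exceeds 100 ppm
maxupdateskew 100.0

# Enable command port for monitoring
cmdport 323
cmdallow 127.0.0.1

# Answer NTP queries from localhost for node exporter's ntp collector
allow 127.0.0.1

# Step the clock if the offset exceeds 1 second, but only during the first 3 updates
makestep 1.0 3

# Enable RTC synchronization
rtcsync
EOF

chmod 644 "$CHRONY_CONF"
systemctl enable "$CHRONY_SVC"
systemctl restart "$CHRONY_SVC"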

# Configure firewall
echo -e "${YELLOW}[7/8] Configuring firewall...${NC}"
if command -v firewall-cmd &> /dev/null && systemctl is-active firewalld &> /dev/null; then
    firewall-cmd --permanent --add-port=${NODE_EXPORTER_PORT}/tcp
    firewall-cmd --reload
elif command -v ufw &> /dev/null; then
    ufw allow ${NODE_EXPORTER_PORT}/tcp
fi

# Create Prometheus configuration if not skipped
if [[ "$SKIP_PROMETHEUS" == false ]]; then
    echo -e "${YELLOW}[8/8] Creating Prometheus configuration templates...${NC}"
    
    # Create Prometheus config directory if it doesn't exist
    mkdir -p /etc/prometheus
    
    # Create time drift rules file
    cat > /etc/prometheus/time_drift_rules.yml << EOF
groups:
  - name: time_drift_alerts
    rules:
    - alert: ClockDriftHigh
      expr: abs(node_timex_offset_seconds) > 0.05
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "System clock drift detected on {{ \$labels.instance }}"
        description: "Clock drift is {{ \$value }} seconds on {{ \$labels.instance }}"
    
    - alert: NTPSyncLost
      expr: node_timex_sync_status == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "NTP synchronization lost on {{ \$labels.instance }}"
        description: "System is not synchronized with NTP on {{ \$labels.instance }}"
    
    - alert: ClockSkewHigh
      expr: node_timex_estimated_error_seconds > 0.1
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High clock skew detected on {{ \$labels.instance }}"
        description: "Clock skew is {{ \$value }} seconds on {{ \$labels.instance }}"
EOF
    
    # Create sample Prometheus config
    cat > /etc/prometheus/prometheus_time_monitoring.yml << EOF
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "time_drift_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - ${PROMETHEUS_HOST}:9093

scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['${PROMETHEUS_HOST}:${NODE_EXPORTER_PORT}']
    scrape_interval: 10s
    metrics_path: /metrics
EOF
    
    chmod 644 /etc/prometheus/time_drift_rules.yml
    chmod 644 /etc/prometheus/prometheus_time_monitoring.yml
    chown -R root:root /etc/prometheus
    
    echo -e "${GREEN}Prometheus configuration files created in /etc/prometheus/${NC}"
else
    echo -e "${YELLOW}[8/8] Skipping Prometheus configuration as requested${NC}"
fi

# Verification
echo -e "${YELLOW}Verifying installation...${NC}"

# Check node_exporter service
if systemctl is-active node_exporter &> /dev/null; then
    echo -e "${GREEN}✓ Node Exporter is running${NC}"
else
    echo -e "${RED}✗ Node Exporter failed to start${NC}"
    exit 1
fi

# Check chrony service
if systemctl is-active chronyd &> /dev/null; then
    echo -e "${GREEN}✓ Chrony is running${NC}"
else
    echo -e "${RED}✗ Chrony failed to start${NC}"
    exit 1
fi

# Check metrics endpoint
if curl -s "http://localhost:${NODE_EXPORTER_PORT}/metrics" | grep -q "node_timex_offset_seconds"; then
    echo -e "${GREEN}✓ Time metrics are available${NC}"
else
    echo -e "${RED}✗ Time metrics not found${NC}"
    exit 1
fi

# Check chrony synchronization
sleep 5
if chronyc sources &> /dev/null; then
    echo -e "${GREEN}✓ NTP sources are configured${NC}"
else
    echo -e "${YELLOW}⚠ NTP synchronization may take time to establish${NC}"
fi

echo -e "\n${GREEN}Installation completed successfully!${NC}"
echo -e "Node Exporter is running on port ${NODE_EXPORTER_PORT}"
echo -e "Time metrics endpoint: http://localhost:${NODE_EXPORTER_PORT}/metrics"
if [[ "$SKIP_PROMETHEUS" == false ]]; then
    echo -e "Prometheus configuration templates created in /etc/prometheus/"
fi
echo -e "\nKey metrics to monitor:"
echo -e "  - node_timex_offset_seconds (clock offset)"
echo -e "  - node_timex_sync_status (NTP sync status)"
echo -e "  - node_timex_estimated_error_seconds (clock accuracy)"

rm -f "/tmp/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz"
rm -rf "/tmp/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64"

Review the script before running it. It must run as root: sudo bash install.sh
