Set up comprehensive time synchronization monitoring using chrony, Prometheus node exporter, and custom Grafana dashboards with alerting for time drift and NTP service failures.
Prerequisites
- Prometheus server installed
- Grafana server installed
- Sudo access
- Network connectivity to NTP servers
What this solves
Accurate time synchronization is critical for distributed systems, logging, security protocols, and compliance requirements. This tutorial sets up monitoring for your NTP service using chrony, collects time drift metrics with Prometheus, and creates Grafana dashboards with alerting for time synchronization issues.
Step-by-step configuration
Install and configure chrony NTP service
Start by installing chrony, a modern NTP implementation that provides better accuracy and faster synchronization than traditional ntpd.
sudo apt update
sudo apt install -y chrony
Configure chrony with monitoring-friendly settings
Edit the chrony configuration (/etc/chrony/chrony.conf on Debian and Ubuntu) to use reliable NTP servers and enable statistics logging for monitoring.
# Use public NTP servers from the pool.ntp.org project
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst
server 3.pool.ntp.org iburst
# Record the rate at which the system clock gains/loses time
driftfile /var/lib/chrony/drift
# Allow the system clock to be stepped in the first three updates
makestep 1.0 3
# Enable kernel synchronization of the real-time clock (RTC)
rtcsync
# Enable hardware timestamping on all interfaces that support it
hwtimestamp *
# Increase the minimum number of selectable sources required to adjust the system clock
minsources 2
# Allow NTP client access from local network
allow 192.168.0.0/16
allow 10.0.0.0/8
allow 172.16.0.0/12
# Serve time even if not synchronized to a time source
local stratum 10
# Enable statistics logging for monitoring
log statistics measurements tracking tempcomp
logdir /var/log/chrony
Create chrony log directory and set permissions
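Create the log directory for chrony statistics and give ownership to the account the daemon runs as. On Debian and Ubuntu that account is _chrony; RHEL-family packages use chrony instead.
sudo mkdir -p /var/log/chrony
sudo chown _chrony:_chrony /var/log/chrony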
sudo chmod 755 /var/log/chrony
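Once chronyd has been running with the log directive enabled, the tracking log in this directory can be read back from a script. The following is a rough sketch only: it assumes the usual tracking.log layout, in which data rows start with a YYYY-MM-DD date and the offset in seconds is the seventh column. The helper name latest_logged_offset is hypothetical.

```shell
#!/usr/bin/env bash
# Print the offset (assumed to be column 7, in seconds) from the newest data
# row of a chrony tracking.log read on stdin. Header and separator rows are
# skipped because their first field is not a YYYY-MM-DD date.
latest_logged_offset() {
    awk '$1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ { last = $7 }
         END { if (last != "") print last }'
}

# Usage (after chronyd has written some samples):
#   latest_logged_offset < /var/log/chrony/tracking.log
```

Nothing is printed when the log contains no data rows yet, which makes an empty result easy to detect in a calling script.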
Enable and start chrony service
Start the chrony service and enable it to start automatically on boot.
sudo systemctl enable chrony
sudo systemctl start chrony
sudo systemctl status chrony
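To confirm synchronization from a script rather than by reading the status output, the CSV form of the tracking report (chronyc -c tracking) can be parsed. This is a minimal sketch that assumes the field order used by current chrony releases: stratum in field 3, current system clock offset in field 5, and leap status in field 14. The function name tracking_summary is hypothetical.

```shell
#!/usr/bin/env bash
# Summarize one line of `chronyc -c tracking` CSV read from stdin.
# Assumed field order: 3 = stratum, 5 = system time offset (s), 14 = leap status.
tracking_summary() {
    awk -F, '{
        offset = $5 + 0
        if (offset < 0) offset = -offset   # report the magnitude only
        printf "stratum=%d offset_s=%.9f leap=%s\n", $3, offset, $14
    }'
}

# Usage (requires a running chronyd):
#   chronyc -c tracking | tracking_summary
```

A leap status other than Normal, or a stratum of 0 or 16, would suggest that chronyd has not selected a source yet.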
Install Prometheus node exporter
Install the Prometheus node exporter, which collects system metrics. Its timex collector exposes kernel clock data such as node_timex_offset_seconds and node_timex_sync_status.
sudo apt install -y prometheus-node-exporter
Download and install NTP exporter
Install a dedicated NTP exporter to collect detailed chrony metrics for Prometheus.
cd /tmp
wget https://github.com/sapcc/ntp_exporter/releases/download/v1.1.0/ntp_exporter-1.1.0.linux-amd64.tar.gz
tar -xzf ntp_exporter-1.1.0.linux-amd64.tar.gz
sudo mv ntp_exporter-1.1.0.linux-amd64/ntp_exporter /usr/local/bin/
sudo chmod +x /usr/local/bin/ntp_exporter
Create NTP exporter systemd service
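Create a systemd unit file at /etc/systemd/system/ntp-exporter.service so the exporter runs as a service; the file name must match the ntp-exporter unit enabled in a later step. Exporter flags vary between releases, so confirm them with ntp_exporter --help before enabling the unit.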
[Unit]
Description=NTP Exporter for Prometheus
After=network.target
[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/ntp_exporter -chrony.address unix:///var/run/chrony/chronyd.sock
Restart=always
RestartSec=3
[Install]
WantedBy=multi-user.target
Create prometheus user and configure permissions
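Create a dedicated user for the NTP exporter, skipping creation if a prometheus account already exists, and grant it access to the chrony socket. On Debian and Ubuntu the chrony group is named _chrony. Because /var/run is a tmpfs, the chmod changes below do not survive a reboot.
id -u prometheus >/dev/null 2>&1 || sudo useradd --system --no-create-home --shell /bin/false prometheus
sudo usermod -a -G _chrony prometheus
sudo chmod 755 /var/run/chrony
sudo chmod 666 /var/run/chrony/chronyd.sock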
Start NTP exporter service
Enable and start the NTP exporter service.
sudo systemctl daemon-reload
sudo systemctl enable ntp-exporter
sudo systemctl start ntp-exporter
sudo systemctl status ntp-exporter
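Once the service is up, a quick way to confirm that it serves data is to pull a single metric out of its text output. The helper below is a sketch, not part of the exporter: metric_value is a hypothetical name, and it assumes samples carry no trailing timestamp, which is the normal case for a /metrics endpoint.

```shell
#!/usr/bin/env bash
# Print the value of every sample of one metric from Prometheus text
# exposition format read on stdin. Comment lines (# HELP / # TYPE) are
# skipped, and the metric name must match exactly, with or without labels.
# Assumes samples have no trailing timestamp, so the value is the last field.
metric_value() {
    awk -v name="$1" '
        /^#/ { next }
        {
            metric = $1
            sub(/[{].*/, "", metric)   # drop the label set, if any
            if (metric == name) print $NF
        }'
}

# Usage:
#   curl -s http://localhost:9559/metrics | metric_value ntp_drift_seconds
```

Matching on the exact name avoids the false positives a plain grep would return for metrics that merely share a prefix.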
Configure Prometheus to scrape NTP metrics
Add the NTP exporter to your Prometheus configuration (typically /etc/prometheus/prometheus.yml). This assumes you have Prometheus already installed following our Prometheus and Grafana setup guide.
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  - "/etc/prometheus/rules/*.yml"
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
    scrape_interval: 5s
  - job_name: 'ntp-exporter'
    static_configs:
      - targets: ['localhost:9559']
    scrape_interval: 30s
    metrics_path: /metrics
Create NTP alerting rules
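Create Prometheus alerting rules to detect time synchronization issues. Save them in a .yml file under /etc/prometheus/rules/ (for example ntp_alerts.yml) so the rule_files glob in the Prometheus configuration picks them up.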
sudo mkdir -p /etc/prometheus/rules
groups:
  - name: ntp_alerts
    rules:
      - alert: NTPDrift
        expr: abs(ntp_drift_seconds) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "NTP time drift detected on {{ $labels.instance }}"
          description: "System clock drift is {{ $value }}s on {{ $labels.instance }}, which exceeds the 0.5s threshold."
      - alert: NTPHighDrift
        expr: abs(ntp_drift_seconds) > 2.0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High NTP time drift on {{ $labels.instance }}"
          description: "System clock drift is {{ $value }}s on {{ $labels.instance }}, which exceeds the critical 2.0s threshold."
      # Stratum 16 is the value NTP uses for "unsynchronized", so compare with >=
      - alert: NTPNotSynchronized
        expr: ntp_stratum >= 16
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "NTP not synchronized on {{ $labels.instance }}"
          description: "NTP stratum is {{ $value }} on {{ $labels.instance }}, indicating no time synchronization."
      - alert: NTPServiceDown
        expr: up{job="ntp-exporter"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "NTP exporter is down on {{ $labels.instance }}"
          description: "NTP exporter has been down for more than 2 minutes on {{ $labels.instance }}."
      # Requires the node exporter systemd collector (--collector.systemd).
      # The unit is chrony.service on Debian/Ubuntu and chronyd.service on RHEL-family systems.
      - alert: ChronydDown
        expr: node_systemd_unit_state{name="chrony.service",state="active"} != 1
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "Chronyd service is not running on {{ $labels.instance }}"
          description: "Chronyd service is not in active state on {{ $labels.instance }}."
      - alert: NTPSourcesLow
        expr: ntp_source_count < 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Low number of NTP sources on {{ $labels.instance }}"
          description: "Only {{ $value }} NTP sources are available on {{ $labels.instance }}; at least 2 are recommended."
      - alert: NTPRootDelay
        expr: ntp_root_delay_seconds > 0.1
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "High NTP root delay on {{ $labels.instance }}"
          description: "NTP root delay is {{ $value }}s on {{ $labels.instance }}, indicating potential network issues."
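Before Prometheus loads these files, it may be worth validating them with promtool, which ships with Prometheus. The wrapper below is a sketch: validate_prometheus is a hypothetical name, and the default paths are assumptions based on the locations used in this guide.

```shell
#!/usr/bin/env bash
# Validate the main config and every rule file with promtool.
# $1 = config path, $2 = rules directory; both defaults are assumptions.
validate_prometheus() {
    local config="${1:-/etc/prometheus/prometheus.yml}"
    local rules_dir="${2:-/etc/prometheus/rules}"
    local f
    promtool check config "$config" || return 1
    for f in "$rules_dir"/*.yml; do
        [ -e "$f" ] || continue          # the glob matched nothing
        promtool check rules "$f" || return 1
    done
}

# Usage:
#   validate_prometheus && echo "configuration looks valid"
```

A non-zero exit status means promtool rejected one of the files, so a restart would most likely fail or drop the rules.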
Restart Prometheus to load new configuration
Restart Prometheus to apply the new scrape configuration and alerting rules.
sudo systemctl restart prometheus
sudo systemctl status prometheus
Configure Grafana data source
If you haven't already configured Prometheus as a data source in Grafana, add it now. Access your Grafana instance and add Prometheus as a data source pointing to http://localhost:9090.
Create NTP monitoring dashboard
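Create a Grafana dashboard for NTP monitoring from the JSON below. The top-level dashboard wrapper matches the payload format of Grafana's HTTP API (POST /api/dashboards/db); when importing through the Grafana UI instead, paste only the inner object.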
{
"dashboard": {
"id": null,
"title": "NTP Time Synchronization Monitoring",
"tags": ["ntp", "time", "chrony"],
"timezone": "browser",
"panels": [
{
"id": 1,
"title": "Time Drift",
"type": "stat",
"targets": [
{
"expr": "ntp_drift_seconds",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "s",
"thresholds": {
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 0.1},
{"color": "red", "value": 0.5}
]
}
}
},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
},
{
"id": 2,
"title": "NTP Stratum",
"type": "stat",
"targets": [
{
"expr": "ntp_stratum",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"thresholds": {
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 8},
{"color": "red", "value": 15}
]
}
}
},
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
},
{
"id": 3,
"title": "Time Drift Over Time",
"type": "timeseries",
"targets": [
{
"expr": "ntp_drift_seconds",
"refId": "A",
"legendFormat": "Time Drift"
}
],
"fieldConfig": {
"defaults": {
"unit": "s"
}
},
"gridPos": {"h": 8, "w": 24, "x": 0, "y": 8}
},
{
"id": 4,
"title": "NTP Sources",
"type": "stat",
"targets": [
{
"expr": "ntp_source_count",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"thresholds": {
"steps": [
{"color": "red", "value": null},
{"color": "yellow", "value": 1},
{"color": "green", "value": 2}
]
}
}
},
"gridPos": {"h": 8, "w": 8, "x": 0, "y": 16}
},
{
"id": 5,
"title": "Root Delay",
"type": "stat",
"targets": [
{
"expr": "ntp_root_delay_seconds",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "s",
"thresholds": {
"steps": [
{"color": "green", "value": null},
{"color": "yellow", "value": 0.05},
{"color": "red", "value": 0.1}
]
}
}
},
"gridPos": {"h": 8, "w": 8, "x": 8, "y": 16}
},
{
"id": 6,
"title": "Chronyd Service Status",
"type": "stat",
"targets": [
{
"expr": "node_systemd_unit_state{name=\"chrony.service\",state=\"active\"}",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"mappings": [
{"options": {"0": {"text": "DOWN"}, "1": {"text": "UP"}}, "type": "value"}
],
"thresholds": {
"steps": [
{"color": "red", "value": null},
{"color": "green", "value": 1}
]
}
}
},
"gridPos": {"h": 8, "w": 8, "x": 16, "y": 16}
}
],
"time": {
"from": "now-1h",
"to": "now"
},
"refresh": "30s"
}
}
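Because the JSON above is already wrapped in a dashboard key, it can be sent to Grafana's HTTP API as it stands. The following is a sketch under a few assumptions: the JSON has been saved to a local file, GRAFANA_URL and GRAFANA_TOKEN are environment variables you define yourself (the token being a Grafana API or service account token), and import_dashboard is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# POST a dashboard JSON file (already wrapped in a top-level "dashboard" key)
# to Grafana's /api/dashboards/db endpoint.
# GRAFANA_URL and GRAFANA_TOKEN are assumed environment variables.
import_dashboard() {
    local file="$1"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    curl -sS -X POST "${GRAFANA_URL:-http://localhost:3000}/api/dashboards/db" \
        -H "Authorization: Bearer ${GRAFANA_TOKEN:-}" \
        -H 'Content-Type: application/json' \
        --data-binary "@${file}"
}

# Usage:
#   GRAFANA_TOKEN=... import_dashboard ntp-dashboard.json
```

Keeping the dashboard in a file and pushing it through the API makes the same definition easy to apply to several Grafana instances.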
Configure Alertmanager for NTP alerts
Configure Alertmanager to handle NTP alerts. This example shows email notifications, but you can adapt it for Slack or other channels following our Alertmanager webhook guide.
global:
  smtp_smarthost: 'localhost:587'
  smtp_from: 'alerts@example.com'
  smtp_auth_username: 'alerts@example.com'
  smtp_auth_password: 'your-email-password'
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'ntp-alerts'
receivers:
  - name: 'ntp-alerts'
    email_configs:
      - to: 'admin@example.com'
        # email_configs has no subject/body keys: the subject is set through
        # headers and the plain-text body through text
        headers:
          Subject: 'NTP Alert: {{ .GroupLabels.alertname }}'
        text: |
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          Instance: {{ .Labels.instance }}
          Severity: {{ .Labels.severity }}
          {{ end }}
Restart Alertmanager
Restart Alertmanager to apply the new configuration.
sudo systemctl restart alertmanager
sudo systemctl status alertmanager
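To check that notifications actually arrive, a synthetic alert can be pushed to Alertmanager's v2 API. This is a sketch only: the alert name NTPTestAlert, the helper names, and the ALERTMANAGER_URL variable are made up for illustration, and the annotations mirror the fields that the email template reads.

```shell
#!/usr/bin/env bash
# Build a one-element alert array for Alertmanager's POST /api/v2/alerts.
# $1 = value for the instance label. "NTPTestAlert" is a made-up alert name.
test_alert_payload() {
    printf '[{"labels":{"alertname":"NTPTestAlert","severity":"warning","instance":"%s"},"annotations":{"summary":"Test alert for NTP notification routing","description":"Manually fired to verify email delivery"}}]\n' "$1"
}

# Send the payload; ALERTMANAGER_URL is an assumed environment variable.
send_test_alert() {
    test_alert_payload "$1" |
        curl -sS -X POST "${ALERTMANAGER_URL:-http://localhost:9093}/api/v2/alerts" \
            -H 'Content-Type: application/json' \
            --data-binary @-
}

# Usage:
#   send_test_alert "$(hostname)"
```

With the routing shown above, the email should follow after the 10s group_wait; the test alert resolves on its own once Alertmanager stops receiving it.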
Verify your setup
Check that all components are working correctly:
# Verify chrony is synchronizing
chronyc sources -v
chronyc tracking
# Check NTP exporter metrics
curl -s http://localhost:9559/metrics | grep ntp_
# Verify Prometheus is scraping NTP metrics (quoted so the shell does not treat ? as a glob)
curl -s 'http://localhost:9090/api/v1/query?query=ntp_drift_seconds'
# Check service statuses
sudo systemctl status chrony
sudo systemctl status ntp-exporter
sudo systemctl status prometheus
sudo systemctl status alertmanager
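When checking a host by hand, it can help to see how a drift reading compares with the alert thresholds. The sketch below reuses the 0.5s and 2.0s limits from the NTPDrift and NTPHighDrift rules; classify_drift is a hypothetical helper, and awk is used because the shell itself has no floating-point comparison.

```shell
#!/usr/bin/env bash
# Classify a drift value (seconds) using the same thresholds as the
# NTPDrift (0.5 s) and NTPHighDrift (2.0 s) alert rules.
classify_drift() {
    awk -v d="$1" 'BEGIN {
        d += 0                       # force numeric interpretation
        if (d < 0) d = -d            # compare the magnitude, like abs()
        if (d > 2.0)      print "critical"
        else if (d > 0.5) print "warning"
        else              print "ok"
    }'
}

# Usage:
#   classify_drift -0.73
```

The comparisons are strict, as in the alert expressions, so a drift of exactly 0.5s is still reported as ok.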
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| NTP exporter fails to start | Cannot access chrony socket | Run sudo usermod -a -G _chrony prometheus (the group is chrony on RHEL-family systems), then restart the service |
| No metrics in Prometheus | Incorrect scrape configuration | Verify targets in Prometheus UI and check exporter is running on port 9559 |
| High time drift alerts | Network issues or bad NTP sources | Check chronyc sources and consider changing NTP pool servers |
| Chronyd not synchronizing | Firewall blocking NTP traffic | Allow UDP port 123: sudo ufw allow 123/udp |
| Dashboard shows no data | Grafana data source misconfigured | Verify Prometheus data source URL and test connection |
| Alertmanager not sending emails | SMTP configuration issues | Test SMTP settings and check Alertmanager logs: journalctl -u alertmanager |
Next steps
- Configure advanced Grafana dashboards and alerting with Prometheus integration
- Configure Prometheus Alertmanager with custom webhook integrations for Slack, Microsoft Teams, and PagerDuty notifications
- Configure NTP clustering for high availability infrastructure
- Implement Network Time Security (NTS) with authentication
- Setup centralized NTP monitoring across multiple datacenters