Set up comprehensive Node.js application monitoring using PM2 process manager with Prometheus metrics collection and Grafana dashboards for production-grade observability and alerting.
Prerequisites
- Node.js application to monitor
- Docker and Docker Compose installed
- Basic understanding of PM2 process manager
- Server with at least 2GB RAM
What this solves
Node.js applications in production need comprehensive monitoring to track performance, memory usage, CPU consumption, and process health. PM2 provides process management with built-in metrics collection, while Grafana offers powerful visualization and alerting capabilities. This setup gives you real-time insights into application performance, automatic restart capabilities, and proactive alerting when issues arise.
Step-by-step installation
Update system packages
Start by updating your package manager to ensure you get the latest versions of all components.
sudo apt update && sudo apt upgrade -y
Install Node.js and npm
Install Node.js runtime and npm package manager if not already present on your system.
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs
Install PM2 globally
PM2 is a production process manager for Node.js applications with a built-in load balancer and monitoring capabilities.
sudo npm install -g pm2
pm2 --version
Install Docker and Docker Compose
We'll use Docker to run Prometheus and Grafana for a clean, isolated monitoring stack.
sudo apt install -y docker.io docker-compose-plugin
sudo systemctl enable --now docker
sudo usermod -aG docker $USER
Log out and back in for group membership changes to take effect.
Create sample Node.js application
Create a simple Express.js application to demonstrate monitoring capabilities.
sudo mkdir -p /opt/myapp && sudo chown $USER:$USER /opt/myapp && cd /opt/myapp
npm init -y
npm install express prom-client
Configure Node.js application with Prometheus metrics
Create app.js in /opt/myapp. This Express application exposes Prometheus metrics on the /metrics endpoint.
const express = require('express');
const client = require('prom-client');

const app = express();
const port = process.env.PORT || 3000;

// Create a Registry to register the metrics
const register = new client.Registry();

// Add default labels, which are attached to all metrics
register.setDefaultLabels({
  app: 'myapp',
  instance: process.env.INSTANCE_ID || 'local'
});

// Enable the collection of default metrics
client.collectDefaultMetrics({ register });

// Create custom metrics
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['route', 'method', 'status'],
  buckets: [0.1, 0.5, 1, 2, 5]
});

const httpRequestsTotal = new client.Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['route', 'method', 'status']
});

register.registerMetric(httpRequestDuration);
register.registerMetric(httpRequestsTotal);

// Middleware to track requests: record duration and count once the response is sent
app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    const labels = {
      route: req.path,
      method: req.method,
      status: res.statusCode
    };
    httpRequestDuration.observe(labels, duration);
    httpRequestsTotal.inc(labels);
  });
  next();
});

// Routes
app.get('/', (req, res) => {
  res.json({ message: 'Hello World!', timestamp: new Date().toISOString() });
});

app.get('/health', (req, res) => {
  res.json({ status: 'healthy', pid: process.pid });
});

app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  const metrics = await register.metrics();
  res.end(metrics);
});

// Simulate some load
app.get('/load', (req, res) => {
  const iterations = Math.floor(Math.random() * 1000000) + 100000;
  let result = 0;
  for (let i = 0; i < iterations; i++) {
    result += Math.random();
  }
  res.json({ result, iterations });
});

app.listen(port, () => {
  // Template literal: the backticks are required for ${port} to be interpolated
  console.log(`App listening at http://localhost:${port}`);
});
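The middleware above uses req.path as the route label, so every distinct URL (for example, one per user ID) becomes its own time series in Prometheus. The following is a minimal sketch of one way to keep label cardinality bounded. It assumes a hypothetical helper named normalizeRoute, which is not part of the original application, and the placeholder names :id and :uuid are arbitrary choices.

```javascript
// Hypothetical helper: collapse variable path segments so that
// /users/42 and /users/97 share a single `route` label value.
function normalizeRoute(path) {
  const uuidPattern = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
  return path
    .split('/')
    .map((segment) => {
      if (/^\d+$/.test(segment)) return ':id';       // purely numeric IDs
      if (uuidPattern.test(segment)) return ':uuid'; // UUID-shaped IDs
      return segment;                                // static segments stay as-is
    })
    .join('/');
}

console.log(normalizeRoute('/users/42/orders/7')); // prints "/users/:id/orders/:id"
console.log(normalizeRoute('/health'));            // prints "/health"
```

In the middleware, the label could then be set with route: normalizeRoute(req.path) instead of route: req.path.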
Create PM2 ecosystem configuration
Create ecosystem.config.js in /opt/myapp. The PM2 ecosystem file defines how your application should run, including environment variables and monitoring settings.
module.exports = {
  apps: [{
    name: 'myapp',
    script: 'app.js',
    instances: 2,
    exec_mode: 'cluster',
    env: {
      NODE_ENV: 'production',
      PORT: 3000
    },
    env_production: {
      NODE_ENV: 'production',
      PORT: 3000
    },

    // Monitoring settings
    monitoring: true,
    pmx: true,

    // Auto restart settings
    max_memory_restart: '500M',
    min_uptime: '10s',
    max_restarts: 10,

    // Logging
    log_file: '/var/log/pm2/myapp.log',
    out_file: '/var/log/pm2/myapp-out.log',
    error_file: '/var/log/pm2/myapp-error.log',
    merge_logs: true,
    log_date_format: 'YYYY-MM-DD HH:mm:ss Z',

    // Process behavior
    autorestart: true,
    watch: false,
    ignore_watch: ['node_modules', 'logs'],

    // Health monitoring
    health_check_grace_period: 3000,
    health_check_fatal_exceptions: true
  }]
};
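The max_memory_restart setting takes a size string, while the memory alert threshold used later in this guide is written in plain bytes. The sketch below converts one to the other so the two can be compared. It assumes that PM2 interprets the K, M, and G suffixes as 1024-based multiples, and toBytes is a hypothetical helper rather than a PM2 API.

```javascript
// Hypothetical helper: convert a size string such as '500M' to bytes.
// Assumption: K, M and G are binary multiples (1024-based).
function toBytes(size) {
  const match = /^(\d+(?:\.\d+)?)\s*([KMG])?B?$/i.exec(String(size).trim());
  if (!match) throw new Error(`Unrecognized size: ${size}`);
  const multipliers = { K: 1024, M: 1024 ** 2, G: 1024 ** 3 };
  const multiplier = match[2] ? multipliers[match[2].toUpperCase()] : 1;
  return Math.round(parseFloat(match[1]) * multiplier);
}

const restartLimit = toBytes('500M'); // 524288000 bytes
const warnThreshold = 400000000;      // bytes, the value used in the memory alert

// A warning is only useful if it fires before PM2 restarts the process.
console.log(warnThreshold < restartLimit); // prints true
```

Keeping the warning threshold below the restart limit gives the alert a chance to fire before PM2 recycles the process.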
Create PM2 log directory
Create the log directory with proper permissions for PM2 to write application logs.
sudo mkdir -p /var/log/pm2
sudo chown -R $USER:$USER /var/log/pm2
sudo chmod 755 /var/log/pm2
Install and configure PM2 Prometheus module
Install the PM2 Prometheus module to expose PM2 metrics in Prometheus format.
pm2 install pm2-prometheus-exporter
Start application with PM2
Deploy your application using PM2 with the ecosystem configuration.
cd /opt/myapp
pm2 start ecosystem.config.js --env production
pm2 save
pm2 startup
Follow the instructions provided by pm2 startup to enable PM2 to start on system boot.
Create monitoring stack directory
Set up a directory structure for the Prometheus and Grafana configuration files.
sudo mkdir -p /opt/monitoring/{prometheus,grafana/dashboards,grafana/provisioning/{datasources,dashboards}}
sudo chown -R $USER:$USER /opt/monitoring
cd /opt/monitoring
Configure Prometheus
Create /opt/monitoring/prometheus/prometheus.yml to scrape metrics from PM2, your Node.js application, and Node Exporter.
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alert_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: []

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'pm2-exporter'
    static_configs:
      - targets: ['host.docker.internal:9209']
    scrape_interval: 5s
    metrics_path: '/metrics'

  - job_name: 'nodejs-app'
    static_configs:
      - targets: ['host.docker.internal:3000']
    scrape_interval: 10s
    metrics_path: '/metrics'

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['host.docker.internal:9100']
    scrape_interval: 10s
Create Prometheus alert rules
Create /opt/monitoring/prometheus/alert_rules.yml with alert rules for common Node.js application issues and PM2 process monitoring.
groups:
  - name: nodejs_alerts
    rules:
      - alert: NodeJSAppDown
        expr: up{job="nodejs-app"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Node.js application is down"
          description: "Node.js application has been down for more than 1 minute."

      - alert: PM2ProcessDown
        expr: pm2_process_uptime{name="myapp"} == 0
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "PM2 process {{ $labels.name }} is down"
          description: "PM2 process {{ $labels.name }} has been down for more than 30 seconds."

      - alert: HighMemoryUsage
        expr: pm2_process_memory{name="myapp"} > 400000000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage in {{ $labels.name }}"
          description: "Process {{ $labels.name }} is using more than 400MB of memory."

      - alert: HighCPUUsage
        expr: pm2_process_cpu_percent{name="myapp"} > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage in {{ $labels.name }}"
          description: "Process {{ $labels.name }} is using more than 80% CPU."

      - alert: FrequentRestarts
        expr: increase(pm2_process_restart_count{name="myapp"}[5m]) > 3
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Frequent restarts for {{ $labels.name }}"
          description: "Process {{ $labels.name }} has restarted more than 3 times in the last 5 minutes."

      # Ratio of 5xx requests to all requests, so the 0.1 threshold means 10%
      - alert: HighHTTPErrorRate
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High HTTP error rate"
          description: "More than 10% of HTTP requests over the last 5 minutes returned a 5xx status."
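An error rate expressed as a percentage is a ratio: the per-second rate of 5xx responses divided by the per-second rate of all responses over the same window. The sketch below works through that arithmetic in JavaScript using two made-up counter snapshots taken 300 seconds apart. It deliberately ignores counter resets and the extrapolation that Prometheus applies in rate(), so treat it as an illustration of the idea only.

```javascript
// Two hypothetical readings of http_requests_total, 300 seconds apart.
const earlier = { total: 12000, errors5xx: 40 };
const later = { total: 15000, errors5xx: 400 };
const windowSeconds = 300;

// Simplified per-second rate of a monotonically increasing counter.
function perSecondRate(before, after, seconds) {
  return (after - before) / seconds;
}

const errorRate = perSecondRate(earlier.errors5xx, later.errors5xx, windowSeconds); // 5xx responses per second
const totalRate = perSecondRate(earlier.total, later.total, windowSeconds);         // all responses per second
const errorRatio = errorRate / totalRate;                                           // fraction of requests that failed

console.log(`errors/s: ${errorRate.toFixed(2)}, requests/s: ${totalRate.toFixed(2)}`);
console.log(`error ratio: ${(errorRatio * 100).toFixed(1)}%`);
console.log(`above 10% threshold: ${errorRatio > 0.1}`);
```

Comparing the bare 5xx rate against 0.1 would instead mean "more than 0.1 errors per second", which is a different condition from "more than 10% of requests".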
Configure Grafana datasources
Set up Grafana to automatically connect to Prometheus as a data source. Save this file with a .yml extension under grafana/provisioning/datasources/.
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: true
Configure Grafana dashboard provisioning
Enable automatic dashboard loading in Grafana. Save this provider definition with a .yml extension under grafana/provisioning/dashboards/.
apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10
    allowUiUpdates: true
    options:
      path: /etc/grafana/provisioning/dashboards
Create Node.js monitoring dashboard
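Create a Grafana dashboard for Node.js and PM2 monitoring and save it as a .json file under grafana/dashboards/. File-based provisioning expects the dashboard model itself at the top level of the file, not wrapped in a "dashboard" key as in HTTP API requests.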
{
  "id": null,
  "title": "Node.js Application Monitoring",
  "tags": ["nodejs", "pm2", "monitoring"],
  "timezone": "browser",
  "refresh": "10s",
  "time": { "from": "now-1h", "to": "now" },
  "panels": [
    {
      "id": 1,
      "title": "Application Status",
      "type": "stat",
      "targets": [
        { "expr": "up{job=\"nodejs-app\"}", "refId": "A" }
      ],
      "fieldConfig": {
        "defaults": {
          "color": { "mode": "thresholds" },
          "thresholds": {
            "steps": [
              { "color": "red", "value": 0 },
              { "color": "green", "value": 1 }
            ]
          }
        }
      },
      "gridPos": { "h": 8, "w": 6, "x": 0, "y": 0 }
    },
    {
      "id": 2,
      "title": "HTTP Request Rate",
      "type": "graph",
      "targets": [
        {
          "expr": "rate(http_requests_total[5m])",
          "refId": "A",
          "legendFormat": "{{method}} {{route}} {{status}}"
        }
      ],
      "gridPos": { "h": 8, "w": 12, "x": 6, "y": 0 },
      "yAxes": [
        { "label": "Requests/sec", "show": true },
        { "show": true }
      ]
    },
    {
      "id": 3,
      "title": "PM2 Process Count",
      "type": "stat",
      "targets": [
        { "expr": "count(pm2_process_uptime > 0)", "refId": "A" }
      ],
      "gridPos": { "h": 8, "w": 6, "x": 18, "y": 0 }
    },
    {
      "id": 4,
      "title": "Memory Usage",
      "type": "graph",
      "targets": [
        {
          "expr": "pm2_process_memory / 1024 / 1024",
          "refId": "A",
          "legendFormat": "{{name}} ({{id}})"
        }
      ],
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 8 },
      "yAxes": [
        { "label": "Memory (MB)", "show": true },
        { "show": true }
      ]
    },
    {
      "id": 5,
      "title": "CPU Usage",
      "type": "graph",
      "targets": [
        {
          "expr": "pm2_process_cpu_percent",
          "refId": "A",
          "legendFormat": "{{name}} ({{id}})"
        }
      ],
      "gridPos": { "h": 8, "w": 12, "x": 12, "y": 8 },
      "yAxes": [
        { "label": "CPU %", "show": true, "max": 100 },
        { "show": true }
      ]
    },
    {
      "id": 6,
      "title": "HTTP Response Time",
      "type": "graph",
      "targets": [
        {
          "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
          "refId": "A",
          "legendFormat": "95th percentile"
        },
        {
          "expr": "histogram_quantile(0.50, rate(http_request_duration_seconds_bucket[5m]))",
          "refId": "B",
          "legendFormat": "50th percentile"
        }
      ],
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 16 },
      "yAxes": [
        { "label": "Response time (s)", "show": true },
        { "show": true }
      ]
    },
    {
      "id": 7,
      "title": "Process Restarts",
      "type": "graph",
      "targets": [
        {
          "expr": "pm2_process_restart_count",
          "refId": "A",
          "legendFormat": "{{name}} ({{id}})"
        }
      ],
      "gridPos": { "h": 8, "w": 12, "x": 12, "y": 16 },
      "yAxes": [
        { "label": "Restart count", "show": true },
        { "show": true }
      ]
    }
  ],
  "schemaVersion": 27,
  "version": 1
}
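The HTTP Response Time panel relies on histogram_quantile, which estimates a percentile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The following is a simplified JavaScript sketch of that estimate. It uses the bucket bounds defined in the application together with made-up counts, and it illustrates the idea rather than reproducing Prometheus's actual implementation.

```javascript
// Cumulative bucket counts keyed by upper bound (the `le` label),
// using the application's bucket bounds plus +Inf. Counts are sample data.
const buckets = [
  { le: 0.1, count: 50 },
  { le: 0.5, count: 80 },
  { le: 1, count: 95 },
  { le: 2, count: 99 },
  { le: 5, count: 100 },
  { le: Infinity, count: 100 },
];

// Estimate the q-quantile (0 < q <= 1) from cumulative buckets sorted by `le`.
function histogramQuantile(q, sortedBuckets) {
  const last = sortedBuckets.length - 1;
  const total = sortedBuckets[last].count;
  if (total === 0) return NaN;

  const rank = q * total;
  const i = sortedBuckets.findIndex((b) => b.count >= rank);

  // Rank falls into the +Inf bucket: report the highest finite bound.
  if (i === last) return sortedBuckets[last - 1].le;

  // The first bucket is assumed to start at 0.
  const lowerBound = i === 0 ? 0 : sortedBuckets[i - 1].le;
  const countBelow = i === 0 ? 0 : sortedBuckets[i - 1].count;
  const countInBucket = sortedBuckets[i].count - countBelow;

  // Linear interpolation within the bucket that contains the rank.
  const fraction = (rank - countBelow) / countInBucket;
  return lowerBound + (sortedBuckets[i].le - lowerBound) * fraction;
}

console.log(histogramQuantile(0.5, buckets).toFixed(3));
console.log(histogramQuantile(0.9, buckets).toFixed(3));
console.log(histogramQuantile(0.95, buckets).toFixed(3));
```

Because the estimate can never be finer than the bucket layout, widely spaced buckets such as 2 and 5 seconds give only a coarse picture of slow requests. Adding bounds near the latencies you care about improves the estimate.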
Install Node Exporter
Install Node Exporter to collect system metrics for comprehensive monitoring.
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xfz node_exporter-1.7.0.linux-amd64.tar.gz
sudo mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/
sudo rm -rf node_exporter-1.7.0.linux-amd64*
Create Node Exporter service
Create /etc/systemd/system/node_exporter.service so that Node Exporter runs as a systemd service and starts automatically at boot.
[Unit]
Description=Node Exporter
Documentation=https://prometheus.io/docs/guides/node-exporter/
Wants=network-online.target
After=network-online.target
[Service]
User=nobody
Group=nogroup
Type=simple
ExecStart=/usr/local/bin/node_exporter
SyslogIdentifier=node_exporter
Restart=always
RestartSec=1
[Install]
WantedBy=multi-user.target
Start Node Exporter
Enable and start the Node Exporter service.
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
sudo systemctl status node_exporter
Create Docker Compose configuration
Create /opt/monitoring/docker-compose.yml to run Prometheus and Grafana with Docker Compose for easy management.
version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus/alert_rules.yml:/etc/prometheus/alert_rules.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=30d'
      - '--web.enable-lifecycle'
    extra_hosts:
      - "host.docker.internal:host-gateway"

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin123
      - GF_USERS_ALLOW_SIGN_UP=false
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
      # Mounted as a subdirectory so it does not hide the provider file in
      # provisioning/dashboards; the file provider scans subdirectories for .json files
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards/json
    depends_on:
      - prometheus

volumes:
  prometheus_data:
    driver: local
  grafana_data:
    driver: local

networks:
  default:
    name: monitoring
Start monitoring stack
Launch Prometheus and Grafana containers using Docker Compose.
cd /opt/monitoring
docker compose up -d
docker compose ps
Configure firewall rules
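Open the ports used by the application and the monitoring services. The Prometheus and exporter ports serve unauthenticated metrics, so on an internet-facing server consider restricting these rules to trusted source addresses.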
sudo ufw allow 3000/tcp comment "Node.js App"
sudo ufw allow 3001/tcp comment "Grafana"
sudo ufw allow 9090/tcp comment "Prometheus"
sudo ufw allow 9100/tcp comment "Node Exporter"
sudo ufw allow 9209/tcp comment "PM2 Exporter"
Configure alerting
Set up Grafana alerting
Configure Grafana to send alerts via email when thresholds are breached.
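Access Grafana at http://your-server-ip:3001 with username admin and password admin123, and change the password after the first login. In current Grafana releases, navigate to Alerting > Contact points and create an email contact point (older releases call these Notification channels). Email delivery also requires SMTP settings in the Grafana configuration.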
Create alert rules in Grafana
Set up alert rules for critical Node.js application metrics. In Grafana, go to your dashboard and add alert rules for:
- Application uptime monitoring
- Memory usage thresholds
- CPU usage alerts
- Process restart frequency
- HTTP error rate monitoring
Application performance monitoring
Add custom application metrics
Enhance your Node.js application with business-specific metrics tracking.
const client = require('prom-client');
// Database connection pool metrics
const dbConnections = new client.Gauge({
name: 'database_connections_active',
help: 'Number of active database connections',
labelNames: ['pool', 'database']
});
// Cache hit/miss metrics
const cacheOperations = new client.Counter({
name: 'cache_operations_total',
help: 'Total cache operations',
labelNames: ['operation', 'status']
});
// Queue processing metrics
const queueSize = new client.Gauge({
name: 'queue_size',
help: 'Current queue size',
labelNames: ['queue_name']
});
// Business metrics
const userActions = new client.Counter({
name: 'user_actions_total',
help: 'Total user actions',
labelNames: ['action', 'status']
});
module.exports = {
dbConnections,
cacheOperations,
queueSize,
userActions
};
Configure PM2 monitoring settings
Fine-tune PM2 monitoring configuration for better observability.
pm2 set pm2-prometheus-exporter:port 9209
pm2 set pm2-prometheus-exporter:prefix pm2
pm2 restart pm2-prometheus-exporter
Verify your setup
Test all monitoring components to ensure they're working correctly.
# Check PM2 processes
pm2 status
pm2 monit

# Test application endpoints
curl http://localhost:3000/health
curl http://localhost:3000/metrics

# Check PM2 metrics export
curl http://localhost:9209/metrics

# Verify Node Exporter
curl http://localhost:9100/metrics

# Check Prometheus targets
curl http://localhost:9090/api/v1/targets

# Test monitoring stack
docker compose -f /opt/monitoring/docker-compose.yml ps
docker compose -f /opt/monitoring/docker-compose.yml logs --tail=50
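The targets endpoint returns a fairly large JSON document, and the part that matters for verification is whether each target's health is "up". The sketch below shows one way to pull out the failing targets from an already-parsed response. The function name unhealthyTargets and the sample values are hypothetical, and the sample is trimmed to the fields used here.

```javascript
// Hypothetical helper: list targets Prometheus could not scrape, given the
// parsed JSON body of GET /api/v1/targets.
function unhealthyTargets(response) {
  return response.data.activeTargets
    .filter((target) => target.health !== 'up')
    .map((target) => ({
      job: target.labels.job,
      url: target.scrapeUrl,
      error: target.lastError,
    }));
}

// Trimmed sample response with made-up values.
const sample = {
  status: 'success',
  data: {
    activeTargets: [
      {
        labels: { job: 'nodejs-app' },
        scrapeUrl: 'http://host.docker.internal:3000/metrics',
        health: 'up',
        lastError: '',
      },
      {
        labels: { job: 'pm2-exporter' },
        scrapeUrl: 'http://host.docker.internal:9209/metrics',
        health: 'down',
        lastError: 'connection refused',
      },
    ],
  },
};

console.log(unhealthyTargets(sample));
```

On Node.js 18 or later, the live response could be obtained with the built-in fetch, for example by awaiting fetch('http://localhost:9090/api/v1/targets') and calling .json() on the result before passing it to the helper.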
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| PM2 process keeps restarting | Memory limit exceeded or application errors | Check pm2 logs and increase max_memory_restart in ecosystem config |
| Prometheus can't scrape PM2 metrics | PM2 exporter not running or port blocked | Verify pm2 list shows exporter and check firewall rules |
| Grafana shows no data | Prometheus data source not configured | Check Grafana data source configuration and Prometheus connectivity |
| High memory usage alerts | Memory leaks or inefficient code | Use pm2 monit to identify problematic processes and optimize code |
| Node Exporter not starting | Port 9100 already in use | Check sudo netstat -tulpn \| grep 9100 and kill conflicting processes |
| Docker containers won't start | Port conflicts or permission issues | Check docker compose logs and ensure ports 9090, 3001 are available |
Next steps
- Implement Node.js application deployment with Git hooks and PM2 clustering
- Configure NGINX reverse proxy with SSL termination and load balancing
- Set up Elasticsearch monitoring with Metricbeat and Kibana dashboards
- Configure Node.js application clustering with PM2 and load balancing
- Implement Node.js JWT authentication with Redis session storage