Configure Node.js application monitoring with PM2 and Grafana

Intermediate, 45 min, May 08, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up comprehensive Node.js application monitoring using PM2 process manager with Prometheus metrics collection and Grafana dashboards for production-grade observability and alerting.

Prerequisites

  • Node.js application to monitor
  • Docker and Docker Compose installed
  • Basic understanding of PM2 process manager
  • Server with at least 2GB RAM

What this solves

Node.js applications in production need comprehensive monitoring to track performance, memory usage, CPU consumption, and process health. PM2 provides process management with built-in metrics collection, while Grafana offers powerful visualization and alerting capabilities. This setup gives you real-time insights into application performance, automatic restart capabilities, and proactive alerting when issues arise.

Step-by-step installation

Update system packages

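Start by updating your package index and installed packages so you get current versions of all components. Run the command that matches your distribution.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y

# AlmaLinux / Rocky Linux
sudo dnf update -y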

Install Node.js and npm

Install Node.js runtime and npm package manager if not already present on your system.

# Ubuntu / Debian
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs

# AlmaLinux / Rocky Linux
curl -fsSL https://rpm.nodesource.com/setup_20.x | sudo bash -
sudo dnf install -y nodejs npm

Install PM2 globally

PM2 is a production process manager for Node.js applications with built-in load balancer and monitoring capabilities.

sudo npm install -g pm2
pm2 --version

Install Docker and Docker Compose

We'll use Docker to run Prometheus and Grafana for a clean, isolated monitoring stack.

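# Ubuntu / Debian
# docker-compose-plugin is published in Docker's official apt repository; add that repository first if apt cannot find the package
sudo apt install -y docker.io docker-compose-plugin
sudo systemctl enable --now docker
sudo usermod -aG docker $USER

# AlmaLinux / Rocky Linux
# Package names depend on the repositories you have enabled
sudo dnf install -y docker docker-compose
sudo systemctl enable --now docker
sudo usermod -aG docker $USER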

Log out and back in for group membership changes to take effect.

Create sample Node.js application

Create a simple Express.js application to demonstrate monitoring capabilities.

sudo mkdir -p /opt/myapp && sudo chown $USER:$USER /opt/myapp && cd /opt/myapp
npm init -y
npm install express prom-client

Configure Node.js application with Prometheus metrics

Create /opt/myapp/app.js, an Express application that exposes Prometheus metrics at /metrics alongside a few sample routes.

const express = require('express');
const client = require('prom-client');

const app = express();
const port = process.env.PORT || 3000;

// Create a Registry to register the metrics
const register = new client.Registry();

// Add a default label which is added to all metrics
register.setDefaultLabels({
  app: 'myapp',
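  // PM2 sets NODE_APP_INSTANCE for each cluster worker, so use it as a fallback
  instance: process.env.INSTANCE_ID || process.env.NODE_APP_INSTANCE || 'local'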
});

// Enable the collection of default metrics
client.collectDefaultMetrics({ register });

// Create custom metrics
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['route', 'method', 'status'],
  buckets: [0.1, 0.5, 1, 2, 5]
});

const httpRequestsTotal = new client.Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['route', 'method', 'status']
});

register.registerMetric(httpRequestDuration);
register.registerMetric(httpRequestsTotal);

// Middleware to track requests
app.use((req, res, next) => {
  const start = Date.now();
  
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
    const labels = {
      route: req.path,
      method: req.method,
      status: res.statusCode
    };
    
    httpRequestDuration.observe(labels, duration);
    httpRequestsTotal.inc(labels);
  });
  
  next();
});

// Routes
app.get('/', (req, res) => {
  res.json({ message: 'Hello World!', timestamp: new Date().toISOString() });
});

app.get('/health', (req, res) => {
  res.json({ status: 'healthy', pid: process.pid });
});

app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  const metrics = await register.metrics();
  res.end(metrics);
});

// Simulate some load
app.get('/load', (req, res) => {
  const iterations = Math.floor(Math.random() * 1000000) + 100000;
  let result = 0;
  
  for (let i = 0; i < iterations; i++) {
    result += Math.random();
  }
  
  res.json({ result, iterations });
});

app.listen(port, () => {
  console.log(`App listening at http://localhost:${port}`);
});
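The middleware above labels every request with req.path, so URLs that embed identifiers (for example /users/42) would each create their own time series. Below is a minimal sketch of one way to bound that cardinality, assuming numeric and UUID-like path segments can safely be collapsed. normalizeRoute is a hypothetical helper, not part of Express or prom-client.

```javascript
// Hypothetical helper: collapse variable path segments so the `route`
// label stays low-cardinality. Not part of Express or prom-client.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function normalizeRoute(path) {
  const normalized = path
    .split('/')
    .map((segment) => {
      if (/^\d+$/.test(segment)) return ':id';   // /users/42 -> /users/:id
      if (UUID_RE.test(segment)) return ':uuid'; // /jobs/<uuid> -> /jobs/:uuid
      return segment;
    })
    .join('/');
  return normalized || '/';
}

console.log(normalizeRoute('/users/42/orders/7')); // /users/:id/orders/:id
```

In the middleware, route: normalizeRoute(req.path) could then take the place of route: req.path.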

Create PM2 ecosystem configuration

Create /opt/myapp/ecosystem.config.js. The PM2 ecosystem file defines how your application runs, including the number of instances, environment variables, restart limits, and log locations.

module.exports = {
  apps: [{
    name: 'myapp',
    script: 'app.js',
    instances: 2,
    exec_mode: 'cluster',
    env: {
      NODE_ENV: 'production',
      PORT: 3000
    },
    env_production: {
      NODE_ENV: 'production',
      PORT: 3000
    },
    // Monitoring settings
    monitoring: true,
    pmx: true,
    
    // Auto restart settings
    max_memory_restart: '500M',
    min_uptime: '10s',
    max_restarts: 10,
    
    // Logging
    log_file: '/var/log/pm2/myapp.log',
    out_file: '/var/log/pm2/myapp-out.log',
    error_file: '/var/log/pm2/myapp-error.log',
    merge_logs: true,
    log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
    
    // Process behavior
    autorestart: true,
    watch: false,
    ignore_watch: ['node_modules', 'logs'],
    
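    // Health monitoring
    // Note: these two keys are not among PM2's documented ecosystem options;
    // confirm that your PM2 version honors them before relying on them
    health_check_grace_period: 3000,
    health_check_fatal_exceptions: true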
  }]
};
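The max_memory_restart value is a size string rather than a byte count, which makes it easy to misjudge next to byte-based metrics. The sketch below converts such strings to bytes, assuming the K/M/G suffixes are multiples of 1024 (an assumption worth checking against your PM2 version). parseMemoryLimit is a hypothetical helper, not a PM2 function.

```javascript
// Hypothetical helper: convert a PM2-style size string ('500M', '2G') to bytes.
// Assumes binary multiples (1K = 1024 bytes); verify against your PM2 version.
const UNIT_BYTES = { K: 1024, M: 1024 ** 2, G: 1024 ** 3 };

function parseMemoryLimit(value) {
  const match = /^(\d+(?:\.\d+)?)\s*([KMG])B?$/i.exec(String(value).trim());
  if (!match) {
    throw new Error(`Unrecognised memory limit: ${value}`);
  }
  const [, amount, unit] = match;
  return Math.round(parseFloat(amount) * UNIT_BYTES[unit.toUpperCase()]);
}

const limitBytes = parseMemoryLimit('500M');
console.log(limitBytes); // 524288000
```

Under that assumption the 500M limit is 524,288,000 bytes, so a byte-based alert threshold has to sit below that figure to fire before PM2 restarts the process.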

Create PM2 log directory

Create the log directory with proper permissions for PM2 to write application logs.

sudo mkdir -p /var/log/pm2
sudo chown -R $USER:$USER /var/log/pm2
sudo chmod 755 /var/log/pm2

Install and configure PM2 Prometheus module

Install the PM2 Prometheus module to expose PM2 metrics in Prometheus format.

pm2 install pm2-prometheus-exporter

Start application with PM2

Deploy your application using PM2 with the ecosystem configuration.

cd /opt/myapp
pm2 start ecosystem.config.js --env production
pm2 save
pm2 startup

Follow the instructions provided by pm2 startup to enable PM2 to start on system boot.

Create monitoring stack directory

Set up a directory structure for the Prometheus and Grafana configuration files.

sudo mkdir -p /opt/monitoring/{prometheus,grafana/dashboards,grafana/provisioning/{datasources,dashboards}}
sudo chown -R $USER:$USER /opt/monitoring
cd /opt/monitoring

Configure Prometheus

Create /opt/monitoring/prometheus/prometheus.yml to scrape metrics from PM2, your Node.js application, Node Exporter, and Prometheus itself.

global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alert_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: []

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'pm2-exporter'
    static_configs:
      - targets: ['host.docker.internal:9209']
    scrape_interval: 5s
    metrics_path: '/metrics'

  - job_name: 'nodejs-app'
    static_configs:
      - targets: ['host.docker.internal:3000']
    scrape_interval: 10s
    metrics_path: '/metrics'

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['host.docker.internal:9100']
    scrape_interval: 10s

Create Prometheus alert rules

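Define alert rules for common Node.js application issues and PM2 process health in /opt/monitoring/prometheus/alert_rules.yml. The pm2_* metric names used here must match what your exporter version actually emits; compare them with the output of curl http://localhost:9209/metrics and adjust the expressions if they differ.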

groups:
  - name: nodejs_alerts
    rules:
      - alert: NodeJSAppDown
        expr: up{job="nodejs-app"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Node.js application is down"
          description: "Node.js application has been down for more than 1 minute."

      - alert: PM2ProcessDown
        expr: pm2_process_uptime{name="myapp"} == 0
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "PM2 process {{ $labels.name }} is down"
          description: "PM2 process {{ $labels.name }} has been down for more than 30 seconds."

      - alert: HighMemoryUsage
        expr: pm2_process_memory{name="myapp"} > 400000000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage in {{ $labels.name }}"
          description: "Process {{ $labels.name }} is using more than 400MB of memory."

      - alert: HighCPUUsage
        expr: pm2_process_cpu_percent{name="myapp"} > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage in {{ $labels.name }}"
          description: "Process {{ $labels.name }} is using more than 80% CPU."

      - alert: FrequentRestarts
        expr: increase(pm2_process_restart_count{name="myapp"}[5m]) > 3
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Frequent restarts for {{ $labels.name }}"
          description: "Process {{ $labels.name }} has restarted more than 3 times in the last 5 minutes."

      - alert: HighHTTPErrorRate
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High HTTP error rate"
          description: "HTTP 5xx error rate is above 10% for the last 5 minutes."
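To make the arithmetic behind an error-rate alert concrete, the sketch below computes the share of 5xx responses from two snapshots of http_requests_total, which is roughly what dividing the 5xx rate() by the total rate() yields in PromQL. The snapshot shape and the errorRatio name are illustrative assumptions, not Prometheus APIs.

```javascript
// Illustrative only: each snapshot maps an HTTP status code to the cumulative
// value of http_requests_total at that moment.
function errorRatio(before, after) {
  let total = 0;
  let errors = 0;
  for (const [status, value] of Object.entries(after)) {
    // Counters only grow, so the increase is the difference between snapshots.
    const increase = value - (before[status] ?? 0);
    total += increase;
    if (/^5\d\d$/.test(status)) errors += increase;
  }
  return total === 0 ? 0 : errors / total;
}

const before = { 200: 1000, 404: 20, 500: 5 };
const after = { 200: 1180, 404: 25, 500: 20 };
console.log(errorRatio(before, after)); // 0.075
```

Here 15 of the 200 new requests failed, a ratio of 7.5%, which stays under a 10% threshold.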

Configure Grafana datasources

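Set up Grafana to connect to Prometheus automatically as a data source. Save this file under /opt/monitoring/grafana/provisioning/datasources/ (for example as prometheus.yml).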

apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: true

Configure Grafana dashboard provisioning

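Enable automatic dashboard loading in Grafana. Save this provider file in /opt/monitoring/grafana/dashboards/ (for example as dashboards.yml): the Compose file mounts that directory over /etc/grafana/provisioning/dashboards, so a provider file placed under grafana/provisioning/dashboards would be hidden inside the container.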

apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10
    allowUiUpdates: true
    options:
      path: /etc/grafana/provisioning/dashboards

Create Node.js monitoring dashboard

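Create a Grafana dashboard for Node.js and PM2 monitoring and save it in /opt/monitoring/grafana/dashboards/ (for example as nodejs-monitoring.json). File-based provisioning expects the dashboard object at the top level of the file, so if Grafana fails to load the dashboard, remove the outer "dashboard" wrapper and keep only its contents.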

{
  "dashboard": {
    "id": null,
    "title": "Node.js Application Monitoring",
    "tags": ["nodejs", "pm2", "monitoring"],
    "timezone": "browser",
    "refresh": "10s",
    "time": {
      "from": "now-1h",
      "to": "now"
    },
    "panels": [
      {
        "id": 1,
        "title": "Application Status",
        "type": "stat",
        "targets": [
          {
            "expr": "up{job=\"nodejs-app\"}",
            "refId": "A"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "color": {
              "mode": "thresholds"
            },
            "thresholds": {
              "steps": [
                {
                  "color": "red",
                  "value": 0
                },
                {
                  "color": "green",
                  "value": 1
                }
              ]
            }
          }
        },
        "gridPos": {
          "h": 8,
          "w": 6,
          "x": 0,
          "y": 0
        }
      },
      {
        "id": 2,
        "title": "HTTP Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(http_requests_total[5m])",
            "refId": "A",
            "legendFormat": "{{method}} {{route}} {{status}}"
          }
        ],
        "gridPos": {
          "h": 8,
          "w": 12,
          "x": 6,
          "y": 0
        },
        "yAxes": [
          {
            "label": "Requests/sec",
            "show": true
          },
          {
            "show": true
          }
        ]
      },
      {
        "id": 3,
        "title": "PM2 Process Count",
        "type": "stat",
        "targets": [
          {
            "expr": "count(pm2_process_uptime > 0)",
            "refId": "A"
          }
        ],
        "gridPos": {
          "h": 8,
          "w": 6,
          "x": 18,
          "y": 0
        }
      },
      {
        "id": 4,
        "title": "Memory Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "pm2_process_memory / 1024 / 1024",
            "refId": "A",
            "legendFormat": "{{name}} ({{id}})"
          }
        ],
        "gridPos": {
          "h": 8,
          "w": 12,
          "x": 0,
          "y": 8
        },
        "yAxes": [
          {
            "label": "Memory (MB)",
            "show": true
          },
          {
            "show": true
          }
        ]
      },
      {
        "id": 5,
        "title": "CPU Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "pm2_process_cpu_percent",
            "refId": "A",
            "legendFormat": "{{name}} ({{id}})"
          }
        ],
        "gridPos": {
          "h": 8,
          "w": 12,
          "x": 12,
          "y": 8
        },
        "yAxes": [
          {
            "label": "CPU %",
            "show": true,
            "max": 100
          },
          {
            "show": true
          }
        ]
      },
      {
        "id": 6,
        "title": "HTTP Response Time",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
            "refId": "A",
            "legendFormat": "95th percentile"
          },
          {
            "expr": "histogram_quantile(0.50, rate(http_request_duration_seconds_bucket[5m]))",
            "refId": "B",
            "legendFormat": "50th percentile"
          }
        ],
        "gridPos": {
          "h": 8,
          "w": 12,
          "x": 0,
          "y": 16
        },
        "yAxes": [
          {
            "label": "Response time (s)",
            "show": true
          },
          {
            "show": true
          }
        ]
      },
      {
        "id": 7,
        "title": "Process Restarts",
        "type": "graph",
        "targets": [
          {
            "expr": "pm2_process_restart_count",
            "refId": "A",
            "legendFormat": "{{name}} ({{id}})"
          }
        ],
        "gridPos": {
          "h": 8,
          "w": 12,
          "x": 12,
          "y": 16
        },
        "yAxes": [
          {
            "label": "Restart count",
            "show": true
          },
          {
            "show": true
          }
        ]
      }
    ],
    "schemaVersion": 27,
    "version": 1
  }
}
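Grafana's file-based provisioning typically expects the dashboard model at the top level of the JSON file, while the { "dashboard": ... } envelope used above is the shape accepted by the HTTP API. The sketch below shows one way to normalise either form before writing the file. unwrapDashboard is a hypothetical helper, not a Grafana API.

```javascript
// Illustrative helper: return the dashboard model itself, whether or not the
// input is wrapped in an API-style { "dashboard": { ... } } envelope.
function unwrapDashboard(doc) {
  const looksWrapped =
    doc !== null &&
    typeof doc === 'object' &&
    typeof doc.dashboard === 'object' &&
    doc.dashboard !== null &&
    !('panels' in doc);
  return looksWrapped ? doc.dashboard : doc;
}

const wrapped = { dashboard: { title: 'Node.js Application Monitoring', panels: [] } };
console.log(unwrapDashboard(wrapped).title); // Node.js Application Monitoring
```

A dashboard that is already unwrapped passes through unchanged, so the helper can be applied to any file before it is copied into the dashboards directory.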

Install Node Exporter

Install Node Exporter to collect system metrics for comprehensive monitoring.

cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xfz node_exporter-1.7.0.linux-amd64.tar.gz
sudo mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/
sudo rm -rf node_exporter-1.7.0.linux-amd64*

Create Node Exporter service

Create a systemd unit at /etc/systemd/system/node_exporter.service so that Node Exporter starts automatically.

[Unit]
Description=Node Exporter
Documentation=https://prometheus.io/docs/guides/node-exporter/
Wants=network-online.target
After=network-online.target

[Service]
User=nobody
# On AlmaLinux and Rocky Linux use Group=nobody, because the nogroup group does not exist there
Group=nogroup
Type=simple
ExecStart=/usr/local/bin/node_exporter
SyslogIdentifier=node_exporter
Restart=always
RestartSec=1

[Install]
WantedBy=multi-user.target

Start Node Exporter

Enable and start the Node Exporter service.

sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
sudo systemctl status node_exporter

Create Docker Compose configuration

Create /opt/monitoring/docker-compose.yml to run Prometheus and Grafana with Docker Compose.

version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus/alert_rules.yml:/etc/prometheus/alert_rules.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=30d'
      - '--web.enable-lifecycle'
    extra_hosts:
      - "host.docker.internal:host-gateway"

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin123
      - GF_USERS_ALLOW_SIGN_UP=false
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards
    depends_on:
      - prometheus

volumes:
  prometheus_data:
    driver: local
  grafana_data:
    driver: local

networks:
  default:
    name: monitoring

Start monitoring stack

Launch Prometheus and Grafana containers using Docker Compose.

cd /opt/monitoring
docker compose up -d
docker compose ps

Configure firewall rules

Open the ports that other machines need to reach, using the commands for your distribution. Consider limiting 9090, 9100, and 9209 to trusted addresses, since these endpoints expose metrics without authentication by default.

# Ubuntu / Debian (ufw)
sudo ufw allow 3000/tcp comment "Node.js App"
sudo ufw allow 3001/tcp comment "Grafana"
sudo ufw allow 9090/tcp comment "Prometheus"
sudo ufw allow 9100/tcp comment "Node Exporter"
sudo ufw allow 9209/tcp comment "PM2 Exporter"

# AlmaLinux / Rocky Linux (firewalld)
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --permanent --add-port=3001/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --permanent --add-port=9100/tcp
sudo firewall-cmd --permanent --add-port=9209/tcp
sudo firewall-cmd --reload

Configure alerting

Set up Grafana alerting

Configure Grafana to send alerts via email when thresholds are breached.

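Access Grafana at http://your-server-ip:3001 with username admin and password admin123, and change the password after the first login. In current Grafana releases, notification targets are configured under Alerting > Contact points (older releases called these Notification channels); create an email contact point there. Email delivery also requires SMTP settings in Grafana's configuration.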

Create alert rules in Grafana

Set up alert rules for critical Node.js application metrics. In Grafana, go to your dashboard and add alert rules for:

  • Application uptime monitoring
  • Memory usage thresholds
  • CPU usage alerts
  • Process restart frequency
  • HTTP error rate monitoring
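Alert rules like these usually pair a threshold with a pending period (the for: 5m in the Prometheus rules earlier) so that brief spikes do not trigger notifications. The sketch below models that logic, assuming timestamped samples sorted in ascending order. isFiring is a hypothetical name, and this is a simplification of what Prometheus or Grafana actually does.

```javascript
// Simplified model of a threshold alert with a pending ("for") period.
// samples: [{ t: secondsSinceStart, value: number }], sorted by t ascending.
function isFiring(samples, threshold, forSeconds) {
  let breachStart = null; // time at which the current breach began
  for (const { t, value } of samples) {
    if (value > threshold) {
      if (breachStart === null) breachStart = t;
      if (t - breachStart >= forSeconds) return true;
    } else {
      breachStart = null; // condition cleared, so the pending timer resets
    }
  }
  return false;
}

const spike = [{ t: 0, value: 95 }, { t: 60, value: 40 }, { t: 120, value: 90 }];
const sustained = [0, 60, 120, 180, 240, 300].map((t) => ({ t, value: 85 }));
console.log(isFiring(spike, 80, 300));     // false
console.log(isFiring(sustained, 80, 300)); // true
```

The short spike never stays above 80 for five minutes, so only the sustained series fires.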

Application performance monitoring

Add custom application metrics

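Extend your Node.js application with business-specific metrics in a separate module (for example metrics.js). Metrics created this way register on prom-client's default registry, so they appear at /metrics only if you pass registers: [register] with the registry from app.js or call register.registerMetric() on each one.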

const client = require('prom-client');

// Database connection pool metrics
const dbConnections = new client.Gauge({
  name: 'database_connections_active',
  help: 'Number of active database connections',
  labelNames: ['pool', 'database']
});

// Cache hit/miss metrics
const cacheOperations = new client.Counter({
  name: 'cache_operations_total',
  help: 'Total cache operations',
  labelNames: ['operation', 'status']
});

// Queue processing metrics
const queueSize = new client.Gauge({
  name: 'queue_size',
  help: 'Current queue size',
  labelNames: ['queue_name']
});

// Business metrics
const userActions = new client.Counter({
  name: 'user_actions_total',
  help: 'Total user actions',
  labelNames: ['action', 'status']
});

module.exports = {
  dbConnections,
  cacheOperations,
  queueSize,
  userActions
};
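As one way to put a counter like cacheOperations to work, the sketch below wraps a cache lookup and records hits and misses. It relies only on the counter having an inc(labels) method, as prom-client counters do. instrumentedGet and the Map-backed cache are hypothetical stand-ins for your own cache client.

```javascript
// Hypothetical wrapper: look up a key and record the outcome on a counter.
// `counter` is any object with inc(labels), e.g. a prom-client Counter with
// labelNames ['operation', 'status'].
function instrumentedGet(cache, counter, key) {
  const hit = cache.has(key);
  counter.inc({ operation: 'get', status: hit ? 'hit' : 'miss' });
  return hit ? cache.get(key) : undefined;
}

// Stand-in counter so the sketch runs without prom-client installed.
const recorded = [];
const fakeCounter = { inc: (labels) => recorded.push(labels) };

const cache = new Map([['user:1', { name: 'Ada' }]]);
instrumentedGet(cache, fakeCounter, 'user:1'); // recorded as a hit
instrumentedGet(cache, fakeCounter, 'user:2'); // recorded as a miss
console.log(recorded.map((labels) => labels.status).join(',')); // hit,miss
```

In the application, the real cacheOperations counter would be passed in place of fakeCounter.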

Configure PM2 monitoring settings

Fine-tune PM2 monitoring configuration for better observability.

pm2 set pm2-prometheus-exporter:port 9209
pm2 set pm2-prometheus-exporter:prefix pm2
pm2 restart pm2-prometheus-exporter
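Because the prefix setting feeds into the metric names the exporter emits, it helps to list what is actually exposed before writing queries against it. The rough sketch below extracts metric names from Prometheus text exposition output. The sample text is made up for illustration and is not real exporter output, and metricNames is a hypothetical helper.

```javascript
// Extract distinct metric names with a given prefix from Prometheus text format.
// Lines starting with '#' are HELP/TYPE comments and are skipped.
function metricNames(expositionText, prefix) {
  const names = new Set();
  for (const line of expositionText.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) continue;
    // A sample line is either `name{labels} value` or `name value`.
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)/.exec(trimmed);
    if (match && match[1].startsWith(prefix)) names.add(match[1]);
  }
  return [...names].sort();
}

// Made-up sample; real output comes from the exporter's /metrics endpoint.
const sample = [
  '# HELP example_memory_bytes Example gauge',
  '# TYPE example_memory_bytes gauge',
  'example_memory_bytes{name="myapp",instance="0"} 52428800',
  'example_memory_bytes{name="myapp",instance="1"} 50331648',
  'other_metric 1',
].join('\n');
console.log(metricNames(sample, 'example_')); // [ 'example_memory_bytes' ]
```

Feeding it the body returned by curl http://localhost:9209/metrics with the prefix pm2 shows which names to use in alert rules and dashboard panels.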

Verify your setup

Test all monitoring components to ensure they're working correctly.

# Check PM2 processes
pm2 status
pm2 monit

# Test application endpoints
curl http://localhost:3000/health
curl http://localhost:3000/metrics

# Check PM2 metrics export
curl http://localhost:9209/metrics

# Verify Node Exporter
curl http://localhost:9100/metrics

# Check Prometheus targets
curl http://localhost:9090/api/v1/targets

# Test monitoring stack
docker compose -f /opt/monitoring/docker-compose.yml ps
docker compose -f /opt/monitoring/docker-compose.yml logs --tail=50
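The JSON returned by the Prometheus targets endpoint is verbose, so a small summary can make unhealthy scrape jobs easier to spot. The sketch below reduces a response to one row per target. The sample object is hand-written in the shape of the API response rather than captured output, and summarizeTargets is a hypothetical helper.

```javascript
// Summarise the JSON returned by Prometheus' /api/v1/targets endpoint.
function summarizeTargets(response) {
  const targets = response?.data?.activeTargets ?? [];
  return targets.map((target) => ({
    job: target.labels?.job ?? 'unknown',
    url: target.scrapeUrl,
    health: target.health, // 'up', 'down' or 'unknown'
    error: target.lastError || null,
  }));
}

// Hand-written sample in the shape of the API response, not captured output.
const sampleResponse = {
  status: 'success',
  data: {
    activeTargets: [
      { labels: { job: 'nodejs-app' }, scrapeUrl: 'http://host.docker.internal:3000/metrics', health: 'up', lastError: '' },
      { labels: { job: 'pm2-exporter' }, scrapeUrl: 'http://host.docker.internal:9209/metrics', health: 'down', lastError: 'connection refused' },
    ],
  },
};

const unhealthy = summarizeTargets(sampleResponse).filter((t) => t.health !== 'up');
console.log(unhealthy.map((t) => t.job)); // [ 'pm2-exporter' ]
```

Any job listed as unhealthy points to the exporter or firewall rule to check first.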

Common issues

Symptom | Cause | Fix
PM2 process keeps restarting | Memory limit exceeded or application errors | Check pm2 logs and increase max_memory_restart in ecosystem config
Prometheus can't scrape PM2 metrics | PM2 exporter not running or port blocked | Verify pm2 list shows exporter and check firewall rules
Grafana shows no data | Prometheus data source not configured | Check Grafana data source configuration and Prometheus connectivity
High memory usage alerts | Memory leaks or inefficient code | Use pm2 monit to identify problematic processes and optimize code
Node Exporter not starting | Port 9100 already in use | Check sudo netstat -tulpn | grep 9100 and kill conflicting processes
Docker containers won't start | Port conflicts or permission issues | Check docker compose logs and ensure ports 9090, 3001 are available

Note: For production deployments, consider implementing log rotation for PM2 logs, setting up SSL certificates for Grafana, and configuring proper backup strategies for Prometheus data. The NGINX monitoring tutorial provides additional insights for reverse proxy setups.
