Set up OpenTelemetry custom instrumentation and metrics collection with Prometheus integration

Intermediate 45 min Apr 16, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure OpenTelemetry Collector with custom metrics exporters and processors, set up application instrumentation with SDKs, and integrate with Prometheus and Grafana for comprehensive distributed system monitoring and observability.

Prerequisites

  • Root or sudo access
  • Python 3.8+ for sample applications
  • Node.js 16+ for sample applications
  • At least 2GB RAM
  • Ports 4317, 4318, 8888, 8889, 9090, 3000, 3001, and 5000 available

What this solves

OpenTelemetry provides a unified way to collect, process, and export telemetry data from your applications and infrastructure. This tutorial shows you how to set up custom instrumentation and metrics collection with Prometheus integration, enabling comprehensive monitoring of distributed systems with standardized telemetry data.

Step-by-step installation

Update system packages

Start by updating your package manager to ensure you get the latest versions of dependencies.

# Ubuntu/Debian
sudo apt update && sudo apt upgrade -y
sudo apt install -y wget curl unzip

# AlmaLinux/Rocky
sudo dnf update -y
sudo dnf install -y wget curl unzip

Download and install OpenTelemetry Collector

Download the OpenTelemetry Collector binary from the official releases and install it in a standard location.

OTEL_VERSION="0.91.0"
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v${OTEL_VERSION}/otelcol_${OTEL_VERSION}_linux_amd64.tar.gz
tar -xzf otelcol_${OTEL_VERSION}_linux_amd64.tar.gz
sudo mv otelcol /usr/local/bin/
sudo chmod +x /usr/local/bin/otelcol

Create OpenTelemetry user and directories

Create a dedicated user and directory structure for OpenTelemetry Collector with proper permissions.

sudo useradd --system --no-create-home --shell /bin/false otelcol
sudo mkdir -p /etc/otelcol /var/log/otelcol /var/lib/otelcol
sudo chown otelcol:otelcol /var/log/otelcol /var/lib/otelcol
sudo chmod 755 /etc/otelcol
sudo chmod 750 /var/log/otelcol /var/lib/otelcol

Configure OpenTelemetry Collector

Create the main configuration file at /etc/otelcol/config.yaml with receivers, processors, exporters, and service pipeline definitions.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          static_configs:
            - targets: ['localhost:8888']
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      disk: {}
      filesystem: {}
      memory: {}
      network: {}
      process: {}

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
  resource:
    attributes:
      - key: environment
        value: production
        action: upsert
      - key: service.instance.id
        from_attribute: host.name
        action: insert

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: "otel"
    const_labels:
      environment: "production"
  otlp/jaeger:
    endpoint: http://localhost:14250
    tls:
      insecure: true
  logging:
    loglevel: info

service:
  extensions: [health_check, pprof]
  pipelines:
    metrics:
      receivers: [otlp, prometheus, hostmetrics]
      processors: [memory_limiter, resource, batch]
      exporters: [prometheus, logging]
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp/jaeger, logging]
  telemetry:
    logs:
      level: info
    metrics:
      address: 0.0.0.0:8888

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: 0.0.0.0:1777
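Once the collector is running, port 8889 serves collected metrics in the Prometheus text exposition format. The sketch below is a minimal offline parser for that format, useful for sanity-checking scraped output; the sample lines are illustrative, not guaranteed collector output:

```python
import re

def parse_exposition(text):
    """Parse Prometheus text exposition lines into {(name, labels): value}."""
    series = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        m = re.match(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)$', line)
        if not m:
            continue
        name, labels, value = m.group(1), m.group(2) or "", m.group(3)
        series[(name, labels)] = float(value)
    return series

sample = """\
# HELP otel_http_requests_total Total number of HTTP requests
# TYPE otel_http_requests_total counter
otel_http_requests_total{endpoint="/api/users",method="GET"} 5
otel_http_requests_total{endpoint="/api/health",method="GET"} 2
"""
parsed = parse_exposition(sample)
print(parsed)
```

In practice you would feed it the body of `curl http://localhost:8889/metrics`.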

Create systemd service

Create /etc/systemd/system/otelcol.service to manage the OpenTelemetry Collector with proper restart policies and security settings.

[Unit]
Description=OpenTelemetry Collector
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=otelcol
Group=otelcol
ExecStart=/usr/local/bin/otelcol --config=/etc/otelcol/config.yaml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=otelcol
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=30

# Security settings
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/lib/otelcol /var/log/otelcol
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes

[Install]
WantedBy=multi-user.target

Install Prometheus for metrics storage

Install Prometheus to scrape and store metrics from the OpenTelemetry Collector.

# Ubuntu/Debian
sudo apt install -y prometheus
sudo systemctl enable prometheus

# AlmaLinux/Rocky (may require an extra repository such as EPEL)
sudo dnf install -y prometheus
sudo systemctl enable prometheus

Configure Prometheus to scrape OpenTelemetry metrics

Add the OpenTelemetry Collector as a scrape target in /etc/prometheus/prometheus.yml, then restart Prometheus.

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'otel-collector-metrics'
    static_configs:
      - targets: ['localhost:8889']
    scrape_interval: 30s
    metrics_path: /metrics
    
  - job_name: 'otel-collector-internal'
    static_configs:
      - targets: ['localhost:8888']
    scrape_interval: 30s
    metrics_path: /metrics
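After Prometheus picks up these targets, you can confirm data is arriving through its HTTP API. A small helper for building an instant-query URL — /api/v1/query is the standard Prometheus query endpoint, and the metric name assumes the otel namespace configured in the collector's exporter:

```python
from urllib.parse import urlencode

def instant_query_url(base, promql):
    """Build a Prometheus instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"

url = instant_query_url("http://localhost:9090",
                        'rate(otel_http_requests_total[5m])')
print(url)
# Fetch with urllib.request.urlopen(url) once traffic is flowing;
# the JSON response carries the matching series under data.result.
```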

Set up Python application instrumentation

Install OpenTelemetry Python SDK and create a sample application with custom metrics.

pip3 install flask opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp opentelemetry-instrumentation-requests opentelemetry-instrumentation-flask

Create instrumented Python application

Create a sample Flask application at /opt/sample-app.py with OpenTelemetry instrumentation and custom metrics.

from flask import Flask
import time
import random
from opentelemetry import trace, metrics
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor

Initialize tracing

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)

Initialize metrics

metric_reader = PeriodicExportingMetricReader(
    exporter=OTLPMetricExporter(endpoint="http://localhost:4317", insecure=True),
    export_interval_millis=30000,
)
metrics.set_meter_provider(MeterProvider(metric_readers=[metric_reader]))
meter = metrics.get_meter(__name__)

Create custom metrics

request_counter = meter.create_counter(
    "http_requests_total",
    description="Total number of HTTP requests",
    unit="1",
)
response_time_histogram = meter.create_histogram(
    "http_request_duration_seconds",
    description="HTTP request duration in seconds",
    unit="s",
)
active_connections = meter.create_up_down_counter(
    "active_connections",
    description="Number of active connections",
    unit="1",
)

Configure OTLP exporter

otlp_exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)
span_processor = BatchSpanProcessor(otlp_exporter)
trace.get_tracer_provider().add_span_processor(span_processor)

app = Flask(__name__)
FlaskInstrumentor().instrument_app(app)
RequestsInstrumentor().instrument()

@app.route('/api/users')
def get_users():
    start_time = time.time()
    with tracer.start_as_current_span("get_users") as span:
        span.set_attribute("operation", "fetch_users")
        span.set_attribute("user.count", 100)

        # Simulate work
        processing_time = random.uniform(0.1, 0.5)
        time.sleep(processing_time)

        # Record metrics
        request_counter.add(1, {"method": "GET", "endpoint": "/api/users"})
        response_time_histogram.record(
            time.time() - start_time,
            {"method": "GET", "endpoint": "/api/users"},
        )
        active_connections.add(1)

        return {"users": ["user1", "user2", "user3"]}

@app.route('/api/health')
def health_check():
    request_counter.add(1, {"method": "GET", "endpoint": "/api/health"})
    return {"status": "healthy"}

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=False)
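With the `namespace: "otel"` setting in the collector's Prometheus exporter, the counter above surfaces in PromQL as otel_http_requests_total, with the recorded attributes as labels. A rough sketch of that name mapping — a simplification, since the real exporter also sanitizes characters and may append unit suffixes:

```python
def prometheus_series(name, attributes, namespace="otel"):
    """Render an OTel metric name plus attributes as a Prometheus series
    string (simplified illustration of the collector's name mapping)."""
    labels = ",".join(f'{k}="{v}"' for k, v in sorted(attributes.items()))
    return f"{namespace}_{name}{{{labels}}}"

series = prometheus_series("http_requests_total",
                           {"method": "GET", "endpoint": "/api/users"})
print(series)  # otel_http_requests_total{endpoint="/api/users",method="GET"}
```

This is why the dashboard queries later in this tutorial use the otel_ prefix rather than the bare metric names from the application code.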

Create Node.js application instrumentation

Install OpenTelemetry Node.js SDK and create a sample Express application with custom metrics.

mkdir -p /opt/nodejs-app
cd /opt/nodejs-app
npm init -y
npm install express @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/sdk-metrics @opentelemetry/auto-instrumentations-node @opentelemetry/exporter-trace-otlp-grpc @opentelemetry/exporter-metrics-otlp-grpc

Create instrumented Node.js application

Create /opt/nodejs-app/app.js, a sample Express application with OpenTelemetry instrumentation and custom business metrics.

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-grpc');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const { metrics, trace, SpanStatusCode } = require('@opentelemetry/api');

// Initialize OpenTelemetry
const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4317',
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://localhost:4317',
    }),
    exportIntervalMillis: 30000,
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();

const express = require('express');
const app = express();

// Get meter and tracer
const meter = metrics.getMeter('nodejs-app', '1.0.0');
const tracer = trace.getTracer('nodejs-app', '1.0.0');

// Create custom metrics
const orderCounter = meter.createCounter('orders_total', {
  description: 'Total number of orders processed',
});

const orderValueHistogram = meter.createHistogram('order_value_dollars', {
  description: 'Order value distribution in dollars',
});

const inventoryGauge = meter.createUpDownCounter('inventory_items', {
  description: 'Current inventory levels',
});

app.use(express.json());

app.get('/api/orders', (req, res) => {
  const span = tracer.startSpan('get_orders');
  span.setAttributes({
    'operation': 'fetch_orders',
    'user.id': req.query.user_id || 'anonymous'
  });
  
  try {
    // Simulate fetching orders
    const orders = [
      { id: 1, value: 29.99, status: 'completed' },
      { id: 2, value: 149.50, status: 'pending' }
    ];
    
    // Record metrics
    orderCounter.add(orders.length, {
      status: 'success',
      endpoint: '/api/orders'
    });
    
    orders.forEach(order => {
      orderValueHistogram.record(order.value, {
        status: order.status
      });
    });
    
    span.setStatus({ code: SpanStatusCode.OK });
    res.json({ orders });
  } catch (error) {
    span.recordException(error);
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error.message
    });
    res.status(500).json({ error: 'Internal server error' });
  } finally {
    span.end();
  }
});

app.post('/api/orders', (req, res) => {
  const span = tracer.startSpan('create_order');
  span.setAttributes({
    'operation': 'create_order',
    'order.value': req.body.value
  });
  
  try {
    const order = {
      id: Math.floor(Math.random() * 10000),
      value: req.body.value || 0,
      status: 'created'
    };
    
    // Record metrics
    orderCounter.add(1, {
      status: 'created',
      endpoint: '/api/orders'
    });
    
    orderValueHistogram.record(order.value, {
      status: order.status
    });
    
    inventoryGauge.add(-1, {
      item: req.body.item || 'unknown'
    });
    
    span.setStatus({ code: SpanStatusCode.OK });
    res.status(201).json({ order });
  } catch (error) {
    span.recordException(error);
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error.message
    });
    res.status(500).json({ error: 'Internal server error' });
  } finally {
    span.end();
  }
});

app.get('/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

const port = process.env.PORT || 3000;
app.listen(port, () => {
  console.log(`Server running on port ${port}`);
});

Start all services

Enable and start OpenTelemetry Collector, Prometheus, and verify they are running correctly.

sudo systemctl daemon-reload
sudo systemctl enable --now otelcol
sudo systemctl start prometheus
sudo systemctl status otelcol prometheus

Configure firewall rules

Open necessary ports for OpenTelemetry Collector, Prometheus, and application access.

# Ubuntu/Debian (ufw)
sudo ufw allow 4317/tcp comment 'OpenTelemetry OTLP gRPC'
sudo ufw allow 4318/tcp comment 'OpenTelemetry OTLP HTTP'
sudo ufw allow 8889/tcp comment 'OpenTelemetry Prometheus metrics'
sudo ufw allow 9090/tcp comment 'Prometheus web UI'
sudo ufw reload

# AlmaLinux/Rocky (firewalld; --add-port does not accept comments)
sudo firewall-cmd --permanent --add-port=4317/tcp
sudo firewall-cmd --permanent --add-port=4318/tcp
sudo firewall-cmd --permanent --add-port=8889/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --reload

Install and configure Grafana

Install Grafana for visualizing metrics collected by Prometheus from OpenTelemetry.

# Ubuntu/Debian (apt-key is removed on current releases; use a keyring file)
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/grafana.gpg
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install -y grafana
sudo systemctl enable --now grafana-server
# AlmaLinux/Rocky
cat << 'EOF' | sudo tee /etc/yum.repos.d/grafana.repo
[grafana]
name=grafana
baseurl=https://rpm.grafana.com
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://rpm.grafana.com/gpg.key
EOF
sudo dnf install -y grafana
sudo systemctl enable --now grafana-server

Configure Prometheus data source in Grafana

Create /etc/grafana/provisioning/datasources/prometheus.yaml so Grafana provisions Prometheus as the data source for OpenTelemetry metrics, then restart grafana-server.

apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
    editable: true
    jsonData:
      httpMethod: POST
      exemplarTraceIdDestinations:
        - name: trace_id
          datasourceUid: jaeger
          urlDisplayLabel: "View in Jaeger"
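As an alternative to file provisioning, the same data source can be created through Grafana's HTTP API (POST /api/datasources). A hypothetical helper sketch — the payload mirrors the provisioned YAML above, and the admin credentials shown are Grafana's fresh-install defaults:

```python
import json

def datasource_payload(name, url, default=True):
    """Build the request body for Grafana's POST /api/datasources endpoint."""
    return {
        "name": name,
        "type": "prometheus",
        "access": "proxy",
        "url": url,
        "isDefault": default,
        "jsonData": {"httpMethod": "POST"},
    }

body = json.dumps(datasource_payload("Prometheus", "http://localhost:9090"))
print(body)
# POST this to http://localhost:3000/api/datasources with basic auth
# (admin:admin on a fresh install) and a Content-Type: application/json header.
```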

Create OpenTelemetry dashboard

Create a custom Grafana dashboard for monitoring OpenTelemetry metrics and application performance.

{
  "dashboard": {
    "id": null,
    "title": "OpenTelemetry Application Metrics",
    "tags": ["opentelemetry", "monitoring"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "HTTP Requests Total",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(rate(otel_http_requests_total[5m]))",
            "legendFormat": "Requests/sec"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "unit": "reqps"
          }
        },
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
      },
      {
        "id": 2,
        "title": "Request Duration",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(otel_http_request_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          },
          {
            "expr": "histogram_quantile(0.50, rate(otel_http_request_duration_seconds_bucket[5m]))",
            "legendFormat": "50th percentile"
          }
        ],
        "yAxes": [
          {
            "unit": "s"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
      },
      {
        "id": 3,
        "title": "Order Metrics",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(orders_total[5m])",
            "legendFormat": "Orders/sec"
          }
        ],
        "gridPos": {"h": 8, "w": 24, "x": 0, "y": 8}
      }
    ],
    "time": {
      "from": "now-1h",
      "to": "now"
    },
    "refresh": "5s"
  }
}
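The duration panel relies on Prometheus' histogram_quantile(), which linearly interpolates inside cumulative histogram buckets. A simplified Python sketch of that estimation (Prometheus handles additional edge cases, such as the +Inf bucket and empty histograms):

```python
def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) buckets
    via linear interpolation, like Prometheus' histogram_quantile()."""
    buckets = sorted(buckets)
    rank = q * buckets[-1][1]  # target rank within total observations
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if count == prev_count:
                return bound
            # interpolate linearly within the bucket containing the rank
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# cumulative counts: 10 obs <= 0.1s, 90 <= 0.5s, 100 <= 1.0s
p95 = histogram_quantile(0.95, [(0.1, 10), (0.5, 90), (1.0, 100)])
print(p95)  # 0.75
```

This is also why histogram bucket boundaries matter: the estimate can only be as precise as the buckets the SDK records.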

Test custom instrumentation

Start sample applications

Run the Python and Node.js applications to generate telemetry data for testing.

# Start Python app in background
python3 /opt/sample-app.py &

# Start Node.js app in background (PORT=3001 avoids clashing with Grafana on 3000)
cd /opt/nodejs-app
PORT=3001 node app.js &

# Generate test traffic
curl http://localhost:5000/api/users
curl http://localhost:3001/api/orders
curl -X POST http://localhost:3001/api/orders \
  -H "Content-Type: application/json" \
  -d '{"value": 99.99, "item": "laptop"}'

Verify your setup

# Check OpenTelemetry Collector status
sudo systemctl status otelcol

# Verify collector is receiving metrics
curl http://localhost:8888/metrics

# Check Prometheus metrics endpoint
curl http://localhost:8889/metrics

# Verify Prometheus is scraping targets
curl http://localhost:9090/api/v1/targets

# Check Grafana is running
sudo systemctl status grafana-server

# View collector logs
sudo journalctl -u otelcol -f
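The /api/v1/targets response is JSON, so a small helper can flag unhealthy targets programmatically. The payload shape (data.activeTargets[].health) follows the Prometheus HTTP API; the sample payload here is illustrative:

```python
import json

def down_targets(targets_json):
    """Return scrape URLs of any targets Prometheus reports as not 'up'."""
    payload = json.loads(targets_json)
    return [t["scrapeUrl"]
            for t in payload["data"]["activeTargets"]
            if t.get("health") != "up"]

sample = json.dumps({
    "status": "success",
    "data": {"activeTargets": [
        {"scrapeUrl": "http://localhost:8889/metrics", "health": "up"},
        {"scrapeUrl": "http://localhost:8888/metrics", "health": "down"},
    ]},
})
print(down_targets(sample))  # ['http://localhost:8888/metrics']
```

Feed it the body of `curl http://localhost:9090/api/v1/targets`; an empty list means every scrape target is healthy.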

Common issues

Symptom | Cause | Fix
Collector fails to start | Invalid YAML configuration | Run otelcol validate --config=/etc/otelcol/config.yaml
No metrics in Prometheus | Firewall blocking port 8889 | Check firewall rules and collector endpoint
Application spans not appearing | OTLP exporter connection failure | Verify port 4317/4318 accessibility
High memory usage | Memory limiter not configured | Adjust memory_limiter processor settings
Permission denied on log directory | Incorrect ownership | sudo chown otelcol:otelcol /var/log/otelcol
Grafana dashboard shows no data | Prometheus data source misconfigured | Check data source URL and connectivity
Never use chmod 777. It gives every user on the system full access to your files. Instead, fix ownership with chown and use minimal permissions like 750 for directories and 644 for files.
