Set up OpenTelemetry custom instrumentation and metrics collection with Prometheus integration

Intermediate 45 min Apr 16, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure OpenTelemetry Collector with custom metrics exporters and processors, set up application instrumentation with SDKs, and integrate with Prometheus and Grafana for comprehensive distributed system monitoring and observability.

Prerequisites

  • Root or sudo access
  • Python 3.8+ for sample applications
  • Node.js 16+ for sample applications
  • At least 2GB RAM
  • Ports 4317, 4318, 8888, 8889, 9090, 3000, 3001, and 5000 available

What this solves

OpenTelemetry provides a unified way to collect, process, and export telemetry data from your applications and infrastructure. This tutorial shows you how to set up custom instrumentation and metrics collection with Prometheus integration, enabling comprehensive monitoring of distributed systems with standardized telemetry data.

Step-by-step installation

Update system packages

Start by updating your package manager to ensure you get the latest versions of dependencies.

# Ubuntu/Debian
sudo apt update && sudo apt upgrade -y
sudo apt install -y wget curl unzip

# AlmaLinux/Rocky
sudo dnf update -y
sudo dnf install -y wget curl unzip

Download and install OpenTelemetry Collector

Download the OpenTelemetry Collector binary from the official releases and install it in a standard location.

OTEL_VERSION="0.91.0"
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v${OTEL_VERSION}/otelcol_${OTEL_VERSION}_linux_amd64.tar.gz
tar -xzf otelcol_${OTEL_VERSION}_linux_amd64.tar.gz
sudo mv otelcol /usr/local/bin/
sudo chmod +x /usr/local/bin/otelcol

Create OpenTelemetry user and directories

Create a dedicated user and directory structure for OpenTelemetry Collector with proper permissions.

sudo useradd --system --no-create-home --shell /bin/false otelcol
sudo mkdir -p /etc/otelcol /var/log/otelcol /var/lib/otelcol
sudo chown otelcol:otelcol /var/log/otelcol /var/lib/otelcol
sudo chmod 755 /etc/otelcol
sudo chmod 750 /var/log/otelcol /var/lib/otelcol

Configure OpenTelemetry Collector

Create the main configuration file at /etc/otelcol/config.yaml with receivers, processors, exporters, and service pipeline definitions.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          static_configs:
            - targets: ['localhost:8888']
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      disk: {}
      filesystem: {}
      memory: {}
      network: {}
      process: {}

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
  resource:
    attributes:
      - key: environment
        value: production
        action: upsert
      - key: service.instance.id
        from_attribute: host.name
        action: insert

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: "otel"
    const_labels:
      environment: "production"
  otlp/jaeger:
    endpoint: http://localhost:14250
    tls:
      insecure: true
  logging:
    loglevel: info

service:
  extensions: [health_check, pprof]
  pipelines:
    metrics:
      receivers: [otlp, prometheus, hostmetrics]
      processors: [memory_limiter, resource, batch]
      exporters: [prometheus, logging]
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp/jaeger, logging]
  telemetry:
    logs:
      level: info
    metrics:
      address: 0.0.0.0:8888

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: 0.0.0.0:1777
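Once the collector is running, port 8889 serves collected metrics in the Prometheus text exposition format. The sketch below is a minimal offline parser for that format, useful for sanity-checking scraped output; the sample lines are illustrative, not guaranteed collector output:

```python
import re

def parse_exposition(text):
    """Parse Prometheus text exposition lines into {(name, labels): value}."""
    series = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        m = re.match(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)$', line)
        if not m:
            continue
        name, labels, value = m.group(1), m.group(2) or "", m.group(3)
        series[(name, labels)] = float(value)
    return series

sample = """\
# HELP otel_http_requests_total Total number of HTTP requests
# TYPE otel_http_requests_total counter
otel_http_requests_total{endpoint="/api/users",method="GET"} 5
otel_http_requests_total{endpoint="/api/health",method="GET"} 2
"""
parsed = parse_exposition(sample)
print(parsed)
```

In practice you would feed it the body of `curl http://localhost:8889/metrics`.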

Create systemd service

Create /etc/systemd/system/otelcol.service to manage the OpenTelemetry Collector with proper restart policies and security settings.

[Unit]
Description=OpenTelemetry Collector
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=otelcol
Group=otelcol
ExecStart=/usr/local/bin/otelcol --config=/etc/otelcol/config.yaml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=otelcol
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=30

# Security settings
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/lib/otelcol /var/log/otelcol
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes

[Install]
WantedBy=multi-user.target

Install Prometheus for metrics storage

Install Prometheus to scrape and store metrics from the OpenTelemetry Collector.

# Ubuntu/Debian
sudo apt install -y prometheus
sudo systemctl enable prometheus

# AlmaLinux/Rocky (may require an extra repository such as EPEL)
sudo dnf install -y prometheus
sudo systemctl enable prometheus

Configure Prometheus to scrape OpenTelemetry metrics

Add the OpenTelemetry Collector as a scrape target in /etc/prometheus/prometheus.yml, then restart Prometheus.

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'otel-collector-metrics'
    static_configs:
      - targets: ['localhost:8889']
    scrape_interval: 30s
    metrics_path: /metrics
    
  - job_name: 'otel-collector-internal'
    static_configs:
      - targets: ['localhost:8888']
    scrape_interval: 30s
    metrics_path: /metrics
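After Prometheus picks up these targets, you can confirm data is arriving through its HTTP API. A small helper for building an instant-query URL — /api/v1/query is the standard Prometheus query endpoint, and the metric name assumes the otel namespace configured in the collector's exporter:

```python
from urllib.parse import urlencode

def instant_query_url(base, promql):
    """Build a Prometheus instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"

url = instant_query_url("http://localhost:9090",
                        'rate(otel_http_requests_total[5m])')
print(url)
# Fetch with urllib.request.urlopen(url) once traffic is flowing;
# the JSON response carries the matching series under data.result.
```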

Set up Python application instrumentation

Install OpenTelemetry Python SDK and create a sample application with custom metrics.

pip3 install flask opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp opentelemetry-instrumentation-requests opentelemetry-instrumentation-flask

Create instrumented Python application

Create a sample Flask application at /opt/sample-app.py with OpenTelemetry instrumentation and custom metrics.

from flask import Flask
import time
import random
from opentelemetry import trace, metrics
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor

Initialize tracing

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)

Initialize metrics

metric_reader = PeriodicExportingMetricReader(
    exporter=OTLPMetricExporter(endpoint="http://localhost:4317", insecure=True),
    export_interval_millis=30000,
)
metrics.set_meter_provider(MeterProvider(metric_readers=[metric_reader]))
meter = metrics.get_meter(__name__)

Create custom metrics

request_counter = meter.create_counter(
    "http_requests_total",
    description="Total number of HTTP requests",
    unit="1",
)
response_time_histogram = meter.create_histogram(
    "http_request_duration_seconds",
    description="HTTP request duration in seconds",
    unit="s",
)
active_connections = meter.create_up_down_counter(
    "active_connections",
    description="Number of active connections",
    unit="1",
)

Configure OTLP exporter

otlp_exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)
span_processor = BatchSpanProcessor(otlp_exporter)
trace.get_tracer_provider().add_span_processor(span_processor)

app = Flask(__name__)
FlaskInstrumentor().instrument_app(app)
RequestsInstrumentor().instrument()

@app.route('/api/users')
def get_users():
    start_time = time.time()
    with tracer.start_as_current_span("get_users") as span:
        span.set_attribute("operation", "fetch_users")
        span.set_attribute("user.count", 100)

        # Simulate work
        processing_time = random.uniform(0.1, 0.5)
        time.sleep(processing_time)

        # Record metrics
        request_counter.add(1, {"method": "GET", "endpoint": "/api/users"})
        response_time_histogram.record(
            time.time() - start_time,
            {"method": "GET", "endpoint": "/api/users"},
        )
        active_connections.add(1)

        return {"users": ["user1", "user2", "user3"]}

@app.route('/api/health')
def health_check():
    request_counter.add(1, {"method": "GET", "endpoint": "/api/health"})
    return {"status": "healthy"}

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=False)
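With the `namespace: "otel"` setting in the collector's Prometheus exporter, the counter above surfaces in PromQL as otel_http_requests_total, with the recorded attributes as labels. A rough sketch of that name mapping — a simplification, since the real exporter also sanitizes characters and may append unit suffixes:

```python
def prometheus_series(name, attributes, namespace="otel"):
    """Render an OTel metric name plus attributes as a Prometheus series
    string (simplified illustration of the collector's name mapping)."""
    labels = ",".join(f'{k}="{v}"' for k, v in sorted(attributes.items()))
    return f"{namespace}_{name}{{{labels}}}"

series = prometheus_series("http_requests_total",
                           {"method": "GET", "endpoint": "/api/users"})
print(series)  # otel_http_requests_total{endpoint="/api/users",method="GET"}
```

This is why the dashboard queries later in this tutorial use the otel_ prefix rather than the bare metric names from the application code.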

Create Node.js application instrumentation

Install OpenTelemetry Node.js SDK and create a sample Express application with custom metrics.

mkdir -p /opt/nodejs-app
cd /opt/nodejs-app
npm init -y
npm install express @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/sdk-metrics @opentelemetry/auto-instrumentations-node @opentelemetry/exporter-trace-otlp-grpc @opentelemetry/exporter-metrics-otlp-grpc

Create instrumented Node.js application

Create /opt/nodejs-app/app.js, a sample Express application with OpenTelemetry instrumentation and custom business metrics.

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-grpc');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const { metrics, trace, SpanStatusCode } = require('@opentelemetry/api');

// Initialize OpenTelemetry
const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4317',
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://localhost:4317',
    }),
    exportIntervalMillis: 30000,
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();

const express = require('express');
const app = express();

// Get meter and tracer
const meter = metrics.getMeter('nodejs-app', '1.0.0');
const tracer = trace.getTracer('nodejs-app', '1.0.0');

// Create custom metrics
const orderCounter = meter.createCounter('orders_total', {
  description: 'Total number of orders processed',
});

const orderValueHistogram = meter.createHistogram('order_value_dollars', {
  description: 'Order value distribution in dollars',
});

const inventoryGauge = meter.createUpDownCounter('inventory_items', {
  description: 'Current inventory levels',
});

app.use(express.json());

app.get('/api/orders', (req, res) => {
  const span = tracer.startSpan('get_orders');
  span.setAttributes({
    'operation': 'fetch_orders',
    'user.id': req.query.user_id || 'anonymous'
  });
  
  try {
    // Simulate fetching orders
    const orders = [
      { id: 1, value: 29.99, status: 'completed' },
      { id: 2, value: 149.50, status: 'pending' }
    ];
    
    // Record metrics
    orderCounter.add(orders.length, {
      status: 'success',
      endpoint: '/api/orders'
    });
    
    orders.forEach(order => {
      orderValueHistogram.record(order.value, {
        status: order.status
      });
    });
    
    span.setStatus({ code: SpanStatusCode.OK });
    res.json({ orders });
  } catch (error) {
    span.recordException(error);
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error.message
    });
    res.status(500).json({ error: 'Internal server error' });
  } finally {
    span.end();
  }
});

app.post('/api/orders', (req, res) => {
  const span = tracer.startSpan('create_order');
  span.setAttributes({
    'operation': 'create_order',
    'order.value': req.body.value
  });
  
  try {
    const order = {
      id: Math.floor(Math.random() * 10000),
      value: req.body.value || 0,
      status: 'created'
    };
    
    // Record metrics
    orderCounter.add(1, {
      status: 'created',
      endpoint: '/api/orders'
    });
    
    orderValueHistogram.record(order.value, {
      status: order.status
    });
    
    inventoryGauge.add(-1, {
      item: req.body.item || 'unknown'
    });
    
    span.setStatus({ code: SpanStatusCode.OK });
    res.status(201).json({ order });
  } catch (error) {
    span.recordException(error);
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error.message
    });
    res.status(500).json({ error: 'Internal server error' });
  } finally {
    span.end();
  }
});

app.get('/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

const port = process.env.PORT || 3000;
app.listen(port, () => {
  console.log(`Server running on port ${port}`);
});

Start all services

Enable and start OpenTelemetry Collector, Prometheus, and verify they are running correctly.

sudo systemctl daemon-reload
sudo systemctl enable --now otelcol
sudo systemctl start prometheus
sudo systemctl status otelcol prometheus

Configure firewall rules

Open necessary ports for OpenTelemetry Collector, Prometheus, and application access.

# Ubuntu/Debian (ufw)
sudo ufw allow 4317/tcp comment 'OpenTelemetry OTLP gRPC'
sudo ufw allow 4318/tcp comment 'OpenTelemetry OTLP HTTP'
sudo ufw allow 8889/tcp comment 'OpenTelemetry Prometheus metrics'
sudo ufw allow 9090/tcp comment 'Prometheus web UI'
sudo ufw reload

# AlmaLinux/Rocky (firewalld; --add-port does not accept comments)
sudo firewall-cmd --permanent --add-port=4317/tcp
sudo firewall-cmd --permanent --add-port=4318/tcp
sudo firewall-cmd --permanent --add-port=8889/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --reload

Install and configure Grafana

Install Grafana for visualizing metrics collected by Prometheus from OpenTelemetry.

# Ubuntu/Debian (apt-key is removed on current releases; use a keyring file)
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/grafana.gpg
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install -y grafana
sudo systemctl enable --now grafana-server
# AlmaLinux/Rocky
cat << 'EOF' | sudo tee /etc/yum.repos.d/grafana.repo
[grafana]
name=grafana
baseurl=https://rpm.grafana.com
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://rpm.grafana.com/gpg.key
EOF
sudo dnf install -y grafana
sudo systemctl enable --now grafana-server

Configure Prometheus data source in Grafana

Create /etc/grafana/provisioning/datasources/prometheus.yaml so Grafana provisions Prometheus as the data source for OpenTelemetry metrics, then restart grafana-server.

apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
    editable: true
    jsonData:
      httpMethod: POST
      exemplarTraceIdDestinations:
        - name: trace_id
          datasourceUid: jaeger
          urlDisplayLabel: "View in Jaeger"
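As an alternative to file provisioning, the same data source can be created through Grafana's HTTP API (POST /api/datasources). A hypothetical helper sketch — the payload mirrors the provisioned YAML above, and the admin credentials shown are Grafana's fresh-install defaults:

```python
import json

def datasource_payload(name, url, default=True):
    """Build the request body for Grafana's POST /api/datasources endpoint."""
    return {
        "name": name,
        "type": "prometheus",
        "access": "proxy",
        "url": url,
        "isDefault": default,
        "jsonData": {"httpMethod": "POST"},
    }

body = json.dumps(datasource_payload("Prometheus", "http://localhost:9090"))
print(body)
# POST this to http://localhost:3000/api/datasources with basic auth
# (admin:admin on a fresh install) and a Content-Type: application/json header.
```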

Create OpenTelemetry dashboard

Create a custom Grafana dashboard for monitoring OpenTelemetry metrics and application performance.

{
  "dashboard": {
    "id": null,
    "title": "OpenTelemetry Application Metrics",
    "tags": ["opentelemetry", "monitoring"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "HTTP Requests Total",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(rate(otel_http_requests_total[5m]))",
            "legendFormat": "Requests/sec"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "unit": "reqps"
          }
        },
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
      },
      {
        "id": 2,
        "title": "Request Duration",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(otel_http_request_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          },
          {
            "expr": "histogram_quantile(0.50, rate(otel_http_request_duration_seconds_bucket[5m]))",
            "legendFormat": "50th percentile"
          }
        ],
        "yAxes": [
          {
            "unit": "s"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
      },
      {
        "id": 3,
        "title": "Order Metrics",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(orders_total[5m])",
            "legendFormat": "Orders/sec"
          }
        ],
        "gridPos": {"h": 8, "w": 24, "x": 0, "y": 8}
      }
    ],
    "time": {
      "from": "now-1h",
      "to": "now"
    },
    "refresh": "5s"
  }
}
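The duration panel relies on Prometheus' histogram_quantile(), which linearly interpolates inside cumulative histogram buckets. A simplified Python sketch of that estimation (Prometheus handles additional edge cases, such as the +Inf bucket and empty histograms):

```python
def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) buckets
    via linear interpolation, like Prometheus' histogram_quantile()."""
    buckets = sorted(buckets)
    rank = q * buckets[-1][1]  # target rank within total observations
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if count == prev_count:
                return bound
            # interpolate linearly within the bucket containing the rank
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# cumulative counts: 10 obs <= 0.1s, 90 <= 0.5s, 100 <= 1.0s
p95 = histogram_quantile(0.95, [(0.1, 10), (0.5, 90), (1.0, 100)])
print(p95)  # 0.75
```

This is also why histogram bucket boundaries matter: the estimate can only be as precise as the buckets the SDK records.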

Test custom instrumentation

Start sample applications

Run the Python and Node.js applications to generate telemetry data for testing.

# Start Python app in background
python3 /opt/sample-app.py &

# Start Node.js app in background (PORT=3001 avoids clashing with Grafana on 3000)
cd /opt/nodejs-app
PORT=3001 node app.js &

# Generate test traffic
curl http://localhost:5000/api/users
curl http://localhost:3001/api/orders
curl -X POST http://localhost:3001/api/orders \
  -H "Content-Type: application/json" \
  -d '{"value": 99.99, "item": "laptop"}'

Verify your setup

# Check OpenTelemetry Collector status
sudo systemctl status otelcol

# Verify collector is receiving metrics
curl http://localhost:8888/metrics

# Check Prometheus metrics endpoint
curl http://localhost:8889/metrics

# Verify Prometheus is scraping targets
curl http://localhost:9090/api/v1/targets

# Check Grafana is running
sudo systemctl status grafana-server

# View collector logs
sudo journalctl -u otelcol -f
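The /api/v1/targets response is JSON, so a small helper can flag unhealthy targets programmatically. The payload shape (data.activeTargets[].health) follows the Prometheus HTTP API; the sample payload here is illustrative:

```python
import json

def down_targets(targets_json):
    """Return scrape URLs of any targets Prometheus reports as not 'up'."""
    payload = json.loads(targets_json)
    return [t["scrapeUrl"]
            for t in payload["data"]["activeTargets"]
            if t.get("health") != "up"]

sample = json.dumps({
    "status": "success",
    "data": {"activeTargets": [
        {"scrapeUrl": "http://localhost:8889/metrics", "health": "up"},
        {"scrapeUrl": "http://localhost:8888/metrics", "health": "down"},
    ]},
})
print(down_targets(sample))  # ['http://localhost:8888/metrics']
```

Feed it the body of `curl http://localhost:9090/api/v1/targets`; an empty list means every scrape target is healthy.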

Common issues

Symptom | Cause | Fix
Collector fails to start | Invalid YAML configuration | Run otelcol validate --config=/etc/otelcol/config.yaml
No metrics in Prometheus | Firewall blocking port 8889 | Check firewall rules and collector endpoint
Application spans not appearing | OTLP exporter connection failure | Verify port 4317/4318 accessibility
High memory usage | Memory limiter not configured | Adjust memory_limiter processor settings
Permission denied on log directory | Incorrect ownership | sudo chown otelcol:otelcol /var/log/otelcol
Grafana dashboard shows no data | Prometheus data source misconfigured | Check data source URL and connectivity
Never use chmod 777. It gives every user on the system full access to your files. Instead, fix ownership with chown and use minimal permissions like 750 for directories and 644 for files.
