Configure comprehensive monitoring for your Istio service mesh using Prometheus for metrics collection and Grafana for visualization. Set up observability dashboards to track traffic flow, security policies, and service performance with production-ready alerting rules.
Prerequisites
- Kubernetes cluster (for example, one bootstrapped with kubeadm)
- Istio service mesh installed
- kubectl access with cluster-admin rights
- At least 8GB RAM and 4 CPU cores
What this solves
Istio service mesh generates extensive telemetry data about service-to-service communication, security policies, and traffic patterns, but without proper monitoring configuration, this valuable data remains invisible. This tutorial shows you how to configure Prometheus to collect Istio metrics and set up Grafana dashboards for comprehensive service mesh observability. You'll implement monitoring for traffic flow, security policy enforcement, and performance metrics with alerting rules for production environments.
Step-by-step configuration
Update system packages
Start by updating your package manager to ensure you have the latest versions for all dependencies.
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget gnupg software-properties-common
Enable Istio telemetry components
Configure Istio to enable Prometheus metrics collection and ensure telemetry v2 is active for comprehensive observability data.
kubectl apply -f - <<EOF
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  metrics:
  - providers:
    - name: prometheus
EOF
Install Prometheus operator
Deploy the Prometheus operator to manage Prometheus instances and monitoring configurations in your Kubernetes cluster.
kubectl create namespace monitoring
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.71.0/bundle.yaml
Configure Prometheus for Istio metrics
Create a Prometheus instance specifically configured to scrape Istio control plane and data plane metrics.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: istio-prometheus
  namespace: monitoring
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      app: istio-proxy
  podMonitorSelector:
    matchLabels:
      app: istio-proxy
  ruleSelector:
    matchLabels:
      app: istio
  resources:
    requests:
      memory: 400Mi
      cpu: 100m
    limits:
      memory: 2Gi
      cpu: 1000m
  retention: 7d
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: default
        resources:
          requests:
            storage: 50Gi
  # Optional: extra scrape configs from a Secret you create yourself;
  # remove this block if you don't provide the Secret
  additionalScrapeConfigs:
    name: istio-scrape-configs
    key: prometheus.yaml
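The 50Gi/7d figures above are starting points, not requirements. A rough way to size the volume, sketched here with assumed ingestion numbers (Prometheus needs roughly 1-2 bytes per sample after compression; your samples/sec depends on mesh size and scrape interval):

```shell
# Estimate TSDB disk need: samples/sec x retention_seconds x bytes/sample
SAMPLES_PER_SEC=5000        # assumption: a mid-sized mesh at a 15s scrape interval
BYTES_PER_SAMPLE=2          # Prometheus averages ~1-2 bytes per sample after compression
RETENTION_SECONDS=$((7 * 24 * 3600))
NEEDED_GB=$((SAMPLES_PER_SEC * RETENTION_SECONDS * BYTES_PER_SAMPLE / 1024 / 1024 / 1024))
echo "estimated TSDB size: ~${NEEDED_GB} GiB"   # ~5 GiB with these inputs
```

Leave generous headroom on top of the estimate, since sidecar metrics are high-cardinality and ingestion grows with every injected pod.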
Create Prometheus RBAC configuration
Set up service account and RBAC permissions for Prometheus to access Istio metrics endpoints across namespaces.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring
Configure Istio service monitors
Create a ServiceMonitor for the Istio control plane (istiod) and a PodMonitor for the Envoy sidecars and gateways. A PodMonitor is used for the data plane because the security.istio.io/tlsMode label is set on injected pods, not on services.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: istio-control-plane
  namespace: monitoring
  labels:
    app: istio-proxy
spec:
  selector:
    matchLabels:
      app: istiod
  namespaceSelector:
    matchNames:
    - istio-system
  endpoints:
  - port: http-monitoring
    interval: 15s
    path: /metrics
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: istio-proxy
  namespace: monitoring
  labels:
    app: istio-proxy
spec:
  selector:
    matchExpressions:
    - key: security.istio.io/tlsMode
      operator: Exists
  namespaceSelector:
    any: true
  podMetricsEndpoints:
  - port: http-envoy-prom
    interval: 15s
    path: /stats/prometheus
    relabelings:
    - sourceLabels: [__meta_kubernetes_pod_name]
      targetLabel: pod_name
    - sourceLabels: [__meta_kubernetes_namespace]
      targetLabel: namespace
Apply monitoring configurations
Deploy all monitoring components to your cluster and verify they're running correctly.
kubectl apply -f prometheus-rbac.yaml
kubectl apply -f istio-prometheus.yaml
kubectl apply -f istio-service-monitors.yaml
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=prometheus -n monitoring --timeout=300s
Install Grafana
Deploy Grafana with persistent storage and configure it to use the Prometheus instance as a data source.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:10.2.2
        ports:
        - containerPort: 3000
        env:
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: "admin123!@#"   # change this before exposing Grafana
        - name: GF_INSTALL_PLUGINS
          value: "grafana-piechart-panel"
        volumeMounts:
        - name: grafana-storage
          mountPath: /var/lib/grafana
        # Grafana only reads provisioning files from type-specific subdirectories,
        # so mount each key into its matching directory, not the provisioning root
        - name: grafana-config
          mountPath: /etc/grafana/provisioning/datasources/datasources.yml
          subPath: datasources.yml
        - name: grafana-config
          mountPath: /etc/grafana/provisioning/dashboards/dashboards.yml
          subPath: dashboards.yml
        resources:
          requests:
            memory: 256Mi
            cpu: 100m
          limits:
            memory: 512Mi
            cpu: 500m
      volumes:
      - name: grafana-storage
        persistentVolumeClaim:
          claimName: grafana-pvc
      - name: grafana-config
        configMap:
          name: grafana-config
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: default
Configure Grafana data source
Create a ConfigMap to automatically provision Prometheus as a data source in Grafana.
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-config
  namespace: monitoring
data:
  datasources.yml: |
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      access: proxy
      url: http://prometheus-operated:9090
      isDefault: true
      editable: false
  dashboards.yml: |
    apiVersion: 1
    providers:
    - name: 'istio'
      orgId: 1
      folder: 'Istio'
      type: file
      disableDeletion: false
      updateIntervalSeconds: 10
      options:
        path: /var/lib/grafana/dashboards
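Grafana only loads provisioning files from type-specific subdirectories under /etc/grafana/provisioning, so whatever mount strategy you use, the two keys above need to land in a layout like this (sketch):

```
/etc/grafana/provisioning/
├── datasources/
│   └── datasources.yml    # data source definitions
└── dashboards/
    └── dashboards.yml     # dashboard provider definitions
```

Files placed directly in the provisioning root are silently ignored, which is a common cause of a missing Prometheus data source after startup.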
Deploy Grafana service
Create a service to expose Grafana and apply all configurations to make it accessible.
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  selector:
    app: grafana
  ports:
  - name: http
    port: 3000
    targetPort: 3000
  type: LoadBalancer
kubectl apply -f grafana-config.yaml
kubectl apply -f grafana-deployment.yaml
kubectl apply -f grafana-service.yaml
Install Istio dashboard templates
Download and configure the official Istio Grafana dashboards for comprehensive service mesh monitoring.
mkdir -p /tmp/istio-dashboards
cd /tmp/istio-dashboards
Download official Istio dashboards
curl -L https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/grafana.yaml -o istio-grafana-dashboards.yaml
Extract dashboard JSON from the configmap
kubectl apply -f istio-grafana-dashboards.yaml
# jsonpath extracts each dashboard cleanly (note the escaped dot in the key name)
kubectl get configmap istio-grafana-dashboards -n istio-system -o jsonpath='{.data.istio-mesh-dashboard\.json}' > mesh-dashboard.json
kubectl get configmap istio-grafana-dashboards -n istio-system -o jsonpath='{.data.istio-service-dashboard\.json}' > service-dashboard.json
kubectl get configmap istio-grafana-dashboards -n istio-system -o jsonpath='{.data.istio-workload-dashboard\.json}' > workload-dashboard.json
Create dashboard ConfigMap
Create a ConfigMap containing the Istio dashboards so Grafana can automatically load them.
kubectl create configmap istio-dashboards \
--from-file=mesh-dashboard.json \
--from-file=service-dashboard.json \
--from-file=workload-dashboard.json \
-n monitoring
Mount the dashboards in Grafana
kubectl patch deployment grafana -n monitoring --patch '{
"spec": {
"template": {
"spec": {
"containers": [{
"name": "grafana",
"volumeMounts": [{
"name": "istio-dashboards",
"mountPath": "/var/lib/grafana/dashboards"
}]
}],
"volumes": [{
"name": "istio-dashboards",
"configMap": {
"name": "istio-dashboards"
}
}]
}
}
}
}'
Configure alerting rules
Set up PrometheusRule resources to define alerting conditions for Istio service mesh health and performance issues.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: istio-alerts
  namespace: monitoring
  labels:
    app: istio
spec:
  groups:
  - name: istio.rules
    rules:
    - alert: IstioHighRequestRate
      expr: sum(rate(istio_requests_total[5m])) > 1000
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "High request rate detected"
        description: "Request rate is {{ $value }} requests per second"
    - alert: IstioHighErrorRate
      expr: sum(rate(istio_requests_total{response_code!~"2.."}[5m])) / sum(rate(istio_requests_total[5m])) > 0.1
      for: 2m
      labels:
        severity: critical
      annotations:
        summary: "High error rate detected"
        description: "Error rate is {{ $value | humanizePercentage }}"
    - alert: IstioPilotPushErrors
      expr: sum(rate(pilot_xds_push_errors[5m])) > 5
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: "Istio Pilot push errors"
        description: "Pilot is experiencing {{ $value }} push errors per second"
    - alert: IstioProxyConfigurationDrift
      expr: sum(pilot_k8s_cfg_events{type="warning"}) > 0
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: "Istio proxy configuration warnings"
        description: "{{ $value }} configuration warnings detected"
Apply alerting configuration
Deploy the alerting rules and restart Grafana to ensure all configurations are loaded.
kubectl apply -f istio-alerts.yaml
kubectl rollout restart deployment/grafana -n monitoring
kubectl wait --for=condition=available deployment/grafana -n monitoring --timeout=300s
Enable traffic generation for testing
Deploy a sample application with Istio injection to generate metrics for dashboard testing. This tutorial assumes you have an Istio-enabled cluster as covered in our Istio installation guide.
kubectl create namespace bookinfo
kubectl label namespace bookinfo istio-injection=enabled
kubectl apply -n bookinfo -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl apply -n bookinfo -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/bookinfo/networking/bookinfo-gateway.yaml
Verify your setup
Check that all monitoring components are running and collecting metrics properly.
# Verify Prometheus is scraping Istio metrics
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090 &
sleep 3  # give the port-forward a moment to establish
curl -sG http://localhost:9090/api/v1/query --data-urlencode 'query=up{job=~".*istio.*"}' | jq '.data.result[] | select(.value[1]=="1") | .metric.instance'
Check Grafana accessibility
kubectl port-forward -n monitoring svc/grafana 3000:3000 &
sleep 3  # give the port-forward a moment to establish
curl -s http://localhost:3000/api/health
Verify Istio metrics are being collected
kubectl exec -n monitoring $(kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus -o jsonpath='{.items[0].metadata.name}') -- wget -qO- 'http://localhost:9090/api/v1/query?query=istio_requests_total' | head -20
Check service mesh connectivity
kubectl exec -n bookinfo deployment/ratings-v1 -- curl -s productpage:9080/productpage | grep -o "<title>.*</title>"
Configure dashboard access and security
Set up Grafana authentication
Configure proper authentication and access controls for your Grafana instance in production environments.
# Get Grafana external IP
GRAFANA_IP=$(kubectl get svc grafana -n monitoring -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "Grafana URL: http://$GRAFANA_IP:3000"
echo "Username: admin"
echo "Password: admin123!@#"
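The hardcoded password above is fine for a lab but should never reach production. One option, sketched here (the Secret name grafana-admin is illustrative), is to generate a random password and feed it to the deployment through a Secret instead of a literal env value:

```shell
# Generate a 24-character random admin password instead of hardcoding one
GRAFANA_PASS=$(openssl rand -base64 18)
echo "generated admin password: ${GRAFANA_PASS}"
# Store it in a Secret and reference it from the deployment with
# GF_SECURITY_ADMIN_PASSWORD -> valueFrom.secretKeyRef rather than a plain value:
# kubectl -n monitoring create secret generic grafana-admin --from-literal=admin-password="$GRAFANA_PASS"
```

Rotating the password then only requires updating the Secret and restarting the deployment, with nothing sensitive committed to the manifests.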
Create additional Grafana users via API
curl -X POST -u 'admin:admin123!@#' http://$GRAFANA_IP:3000/api/admin/users \
-H "Content-Type: application/json" \
-d '{
"name": "Istio Viewer",
"email": "viewer@example.com",
"login": "istio-viewer",
"password": "ViewerPass123!",
"role": "Viewer"
}'
Monitor traffic flow and security policies
Access your Grafana dashboards to monitor service mesh performance and security policy enforcement. Navigate to the Istio folder in Grafana to find pre-configured dashboards for mesh overview, service performance, and workload metrics. The mesh dashboard shows global traffic patterns and success rates, while service dashboards provide detailed metrics for individual microservices including request duration, throughput, and error rates.
Key metrics to monitor include request success rate, P99 latency, mutual TLS adoption rate, and policy violation counts. Set up alerts based on your SLA requirements, typically focusing on error rates above 1%, latency exceeding baseline by 50%, or any mutual TLS policy violations. For comprehensive observability, consider integrating with our distributed tracing setup to correlate metrics with trace data.
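As a starting point, those SLA targets translate into PromQL along these lines (expressions are illustrative: 3xx redirects are deliberately excluded from the error ratio, and the label names assume standard Istio telemetry):

```promql
# Error ratio: 5xx responses over all requests; alert above 0.01 for a 1% SLO
sum(rate(istio_requests_total{response_code=~"5.."}[5m]))
  / sum(rate(istio_requests_total[5m]))

# P99 request latency per destination service, in milliseconds
histogram_quantile(0.99,
  sum(rate(istio_request_duration_milliseconds_bucket[5m])) by (le, destination_service_name))

# Share of mesh traffic protected by mutual TLS
sum(rate(istio_requests_total{connection_security_policy="mutual_tls"}[5m]))
  / sum(rate(istio_requests_total[5m]))
```

Test each expression in the Prometheus UI before wiring it into a PrometheusRule, since an alert on a typo'd metric name simply never fires.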
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| No Istio metrics in Prometheus | ServiceMonitor not discovering endpoints | kubectl get servicemonitor -n monitoring -o yaml and verify selectors match Istio services |
| Grafana dashboards show "No data" | Data source configuration incorrect | Check Prometheus URL in Grafana data source: http://prometheus-operated:9090 |
| High memory usage in Prometheus | Too many high-cardinality metrics | Add metric relabeling rules to drop unnecessary labels in ServiceMonitor |
| Alerts not firing | PrometheusRule not loaded | kubectl get prometheusrule -n monitoring and verify rule selector matches Prometheus spec |
| Missing sidecar metrics | Pods not injected with Istio proxy | kubectl get pods -o jsonpath='{.items[*].spec.containers[*].name}' to verify the istio-proxy container is present |
Next steps
Automated install script
Run the script below to automate the entire setup.
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Global variables
PKG_MGR=""
PKG_INSTALL=""
PKG_UPDATE=""
DISTRO=""
ISTIO_VERSION="1.20.1"
PROMETHEUS_VERSION="2.48.0"
GRAFANA_VERSION="10.2.2"
# Progress counter
STEP=0
TOTAL_STEPS=8
print_step() {
  # Avoid ((STEP++)): with set -e, the post-increment of 0 returns a
  # non-zero status and aborts the script on the very first step
  STEP=$((STEP + 1))
  echo -e "${BLUE}[$STEP/$TOTAL_STEPS]${NC} $1"
}
print_success() {
  echo -e "${GREEN}✓${NC} $1"
}
print_error() {
  echo -e "${RED}✗${NC} $1" >&2
}
print_warning() {
  echo -e "${YELLOW}⚠${NC} $1"
}
usage() {
  echo "Usage: $0 [OPTIONS]"
  echo "Options:"
  echo "  -n, --namespace NAMESPACE    Kubernetes namespace for monitoring (default: istio-system)"
  echo "  -d, --domain DOMAIN          Domain for Grafana ingress (optional)"
  echo "  -h, --help                   Show this help message"
  exit 1
}
cleanup() {
  print_error "Script failed. Cleaning up..."
  # Remove any partially installed components
  rm -f /tmp/istio-*.tar.gz /tmp/prometheus-*.tar.gz
  exit 1
}
trap cleanup ERR
detect_distro() {
  if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
      ubuntu|debian)
        PKG_MGR="apt"
        PKG_INSTALL="apt install -y"
        PKG_UPDATE="apt update && apt upgrade -y"
        DISTRO="debian"
        ;;
      almalinux|rocky|centos|rhel|ol|fedora)
        PKG_MGR="dnf"
        PKG_INSTALL="dnf install -y"
        PKG_UPDATE="dnf update -y"
        DISTRO="rhel"
        ;;
      amzn)
        PKG_MGR="yum"
        PKG_INSTALL="yum install -y"
        PKG_UPDATE="yum update -y"
        DISTRO="rhel"
        ;;
      *)
        print_error "Unsupported distribution: $ID"
        exit 1
        ;;
    esac
  else
    print_error "Cannot detect distribution. /etc/os-release not found."
    exit 1
  fi
}
check_prerequisites() {
  # Check if running as root or with sudo
  if [[ $EUID -ne 0 ]]; then
    print_error "This script must be run as root or with sudo"
    exit 1
  fi
  # Check for required tools
  local required_tools=("curl" "wget" "tar")
  for tool in "${required_tools[@]}"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      print_error "Required tool '$tool' is not installed"
      exit 1
    fi
  done
  # Check for kubectl
  if ! command -v kubectl >/dev/null 2>&1; then
    print_error "kubectl is required but not installed. Please install kubectl first."
    exit 1
  fi
  # Check Kubernetes cluster connectivity
  if ! kubectl cluster-info >/dev/null 2>&1; then
    print_error "Cannot connect to Kubernetes cluster. Please check your kubeconfig."
    exit 1
  fi
}
install_dependencies() {
  print_step "Installing system dependencies..."
  # PKG_UPDATE may contain shell operators (&&), so run it through eval;
  # plain $PKG_UPDATE would pass "&&" to apt as a literal argument
  eval "$PKG_UPDATE"
  if [ "$DISTRO" = "debian" ]; then
    $PKG_INSTALL curl wget gnupg software-properties-common apt-transport-https
  else
    $PKG_INSTALL curl wget gnupg2 tar
  fi
  print_success "System dependencies installed"
}
install_istio() {
  print_step "Installing Istio..."
  cd /tmp
  curl -L "https://istio.io/downloadIstio" | ISTIO_VERSION="$ISTIO_VERSION" sh -
  # Move istioctl to system path
  mv "istio-$ISTIO_VERSION/bin/istioctl" /usr/local/bin/
  chmod 755 /usr/local/bin/istioctl
  # Install Istio with telemetry components
  istioctl install --set values.telemetry.v2.enabled=true --set values.pilot.env.EXTERNAL_ISTIOD=false -y
  # Enable Istio injection for default namespace
  kubectl label namespace default istio-injection=enabled --overwrite
  print_success "Istio installed and configured"
}
configure_prometheus() {
  print_step "Configuring Prometheus for Istio..."
  # Apply Istio telemetry configuration
  kubectl apply -f - <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: control-plane
spec:
  values:
    telemetry:
      v2:
        enabled: true
        prometheus:
          configOverride:
            metric_relabeling_configs:
            # A single keep rule: two consecutive keep rules would each
            # drop the metrics the other one matched
            - source_labels: [__name__]
              regex: 'istio_.*|envoy_.*'
              action: keep
EOF
  # Addon manifests live on the release-X.Y branch, so strip the patch version
  kubectl apply -f "https://raw.githubusercontent.com/istio/istio/release-${ISTIO_VERSION%.*}/samples/addons/prometheus.yaml"
  # Wait for Prometheus to be ready
  kubectl wait --for=condition=available --timeout=300s deployment/prometheus -n istio-system
  print_success "Prometheus configured for Istio metrics collection"
}
install_grafana() {
  print_step "Installing Grafana with Istio dashboards..."
  # Install Grafana via Istio addons (manifests live on the release-X.Y branch)
  kubectl apply -f "https://raw.githubusercontent.com/istio/istio/release-${ISTIO_VERSION%.*}/samples/addons/grafana.yaml"
  # Wait for Grafana to be ready
  kubectl wait --for=condition=available --timeout=300s deployment/grafana -n istio-system
  print_success "Grafana installed with Istio dashboards"
}
configure_telemetry() {
  print_step "Enabling comprehensive Istio telemetry..."
  # Enable telemetry v2 with comprehensive metrics
  kubectl apply -f - <<EOF
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: default-metrics
  namespace: istio-system
spec:
  metrics:
  # providers and overrides belong to the same list entry
  - providers:
    - name: prometheus
    overrides:
    - match:
        metric: ALL_METRICS
      tagOverrides:
        request_protocol:
          operation: UPSERT
          value: "request.protocol"  # a CEL expression over request attributes
EOF
  print_success "Istio telemetry v2 enabled with comprehensive metrics"
}
setup_monitoring_access() {
  print_step "Setting up monitoring access..."
  # Create port-forward helper script for easy access
  cat > /usr/local/bin/istio-monitoring <<'EOF'
#!/bin/bash
echo "Starting Istio monitoring port-forwards..."
echo "Grafana will be available at: http://localhost:3000"
echo "Prometheus will be available at: http://localhost:9090"
echo "Press Ctrl+C to stop all port-forwards"
kubectl port-forward -n istio-system svc/grafana 3000:3000 &
GRAFANA_PID=$!
kubectl port-forward -n istio-system svc/prometheus 9090:9090 &
PROMETHEUS_PID=$!
trap "kill $GRAFANA_PID $PROMETHEUS_PID 2>/dev/null" EXIT
wait
EOF
  chmod 755 /usr/local/bin/istio-monitoring
  print_success "Monitoring access script created at /usr/local/bin/istio-monitoring"
}
setup_alerting_rules() {
  print_step "Configuring Prometheus alerting rules for Istio..."
  kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio-alerts
  namespace: istio-system
data:
  istio.rules: |
    groups:
    - name: istio.rules
      rules:
      - alert: IstioHighRequestLatency
        expr: histogram_quantile(0.99, sum(rate(istio_request_duration_milliseconds_bucket[5m])) by (le, destination_service_name)) > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: High request latency detected
          description: "{{ \$labels.destination_service_name }} has 99th percentile latency above 1000ms"
      - alert: IstioHighErrorRate
        expr: sum(rate(istio_requests_total{response_code!~"2.*"}[5m])) / sum(rate(istio_requests_total[5m])) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: High error rate in service mesh
          description: "Error rate is above 10% for the last 2 minutes"
EOF
  print_success "Istio alerting rules configured"
}
verify_installation() {
  print_step "Verifying installation..."
  # Check each component individually; a single grep across all three would
  # pass as soon as any one of them is Running
  local component
  for component in istiod prometheus grafana; do
    if ! kubectl get pods -n istio-system | grep "$component" | grep -q Running; then
      print_error "Component '$component' is not running"
      return 1
    fi
  done
  # Check if metrics are being collected (promtool needs the server URL)
  if ! kubectl exec -n istio-system deployment/prometheus -- promtool query instant http://localhost:9090 'up{job="istio-mesh"}' >/dev/null 2>&1; then
    print_warning "Istio metrics may not be fully available yet (this can take a few minutes)"
  fi
  print_success "Installation verification completed"
  echo
  echo -e "${GREEN}🎉 Istio monitoring setup completed successfully!${NC}"
  echo
  echo "Next steps:"
  echo "1. Run 'istio-monitoring' to start port-forwards for local access"
  echo "2. Access Grafana at http://localhost:3000 (admin/admin)"
  echo "3. Access Prometheus at http://localhost:9090"
  echo "4. Deploy sample applications to see metrics in action"
  echo
  echo "Useful commands:"
  echo "  kubectl get pods -n istio-system    # Check component status"
  echo "  istioctl proxy-status               # Check proxy status"
  echo "  istioctl analyze                    # Analyze configuration"
}
# Parse command line arguments
NAMESPACE="istio-system"
DOMAIN=""
while [[ $# -gt 0 ]]; do
  case $1 in
    -n|--namespace)
      NAMESPACE="$2"
      shift 2
      ;;
    -d|--domain)
      DOMAIN="$2"
      shift 2
      ;;
    -h|--help)
      usage
      ;;
    *)
      print_error "Unknown option: $1"
      usage
      ;;
  esac
done
# Main execution
main() {
  echo -e "${BLUE}Istio Service Mesh Monitoring Setup${NC}"
  echo "===================================="
  detect_distro
  print_success "Detected distribution: $ID ($DISTRO-based)"
  check_prerequisites
  install_dependencies
  install_istio
  configure_prometheus
  install_grafana
  configure_telemetry
  setup_monitoring_access
  setup_alerting_rules
  verify_installation
}
main "$@"
Review the script before running. It installs system packages and writes to /usr/local/bin, so execute it as root: sudo bash install.sh