Configure the Prometheus adapter to expose custom application metrics to the Kubernetes Horizontal Pod Autoscaler, enabling scaling on business metrics such as queue depth, response time, and user load instead of raw CPU and memory usage.
What this solves
Kubernetes Horizontal Pod Autoscaler (HPA) by default only scales based on CPU and memory usage, which often doesn't reflect your application's actual load patterns. This tutorial shows you how to configure Prometheus adapter to expose custom application metrics to HPA, enabling intelligent scaling based on business-relevant metrics like queue depth, active connections, or request latency.
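Whatever metric drives it, HPA's core arithmetic is the same: for an AverageValue target it computes desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). A quick sketch of that calculation (the replica count and metric values are invented for illustration):

```shell
# HPA scaling arithmetic: desired = ceil(current * currentMetric / targetMetric)
current_replicas=2
current_value=90   # observed average http_requests_per_second per pod (example)
target_value=30    # HPA target averageValue (example)
desired=$(awk -v c="$current_replicas" -v m="$current_value" -v t="$target_value" \
  'BEGIN { d = c * m / t; printf "%d\n", (d == int(d)) ? d : int(d) + 1 }')
echo "desired replicas: $desired"
```

With an average of 90 requests per second against a target of 30, two replicas scale out to six.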
Prerequisites
- A running Kubernetes cluster with kubectl access
- Helm 3 installed on your system
- Prometheus server deployed in your cluster
- Applications already exposing metrics to Prometheus
Step-by-step configuration
Install the Prometheus adapter with Helm
The Prometheus adapter translates Prometheus metrics into the Kubernetes custom metrics API format that HPA can consume.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Create the Prometheus adapter configuration
This configuration defines which Prometheus metrics to expose and how to format them for HPA consumption.
prometheus:
  url: http://prometheus-server.monitoring.svc.cluster.local
  port: 80
rules:
  custom:
    - seriesQuery: 'http_requests_per_second{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)"
        as: "http_requests_per_second"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
    - seriesQuery: 'nginx_ingress_controller_requests{namespace!="",ingress!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          ingress:
            resource: ingress
      name:
        matches: "^(.*)"
        as: "nginx_requests_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
    - seriesQuery: 'rabbitmq_queue_messages{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)"
        as: "rabbitmq_queue_depth"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
  # In the Helm chart's values, resource rules live under rules.resource
  resource:
    cpu:
      containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}[3m])) by (<<.GroupBy>>)
      nodeQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,id='/'}[3m])) by (<<.GroupBy>>)
      resources:
        overrides:
          instance:
            resource: node
          namespace:
            resource: namespace
          pod:
            resource: pod
      containerLabel: container
    memory:
      containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}) by (<<.GroupBy>>)
      nodeQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,id='/'}) by (<<.GroupBy>>)
      resources:
        overrides:
          instance:
            resource: node
          namespace:
            resource: namespace
          pod:
            resource: pod
      containerLabel: container
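Each metricsQuery is a Go template that the adapter fills in at query time: <<.Series>> becomes the matched series name, <<.LabelMatchers>> the label filters for the requested object, and <<.GroupBy>> the grouping labels. A rough sketch of that substitution (the series and label values are illustrative), handy for pasting the expanded query into the Prometheus UI to sanity-check a rule:

```shell
# Expand the adapter's template placeholders by hand to see the final PromQL
template='sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
series='nginx_ingress_controller_requests'
matchers='namespace="default",ingress="my-ingress"'   # example label matchers
groupby='namespace, ingress'
query=$(printf '%s' "$template" | sed \
  -e "s/<<\.Series>>/$series/" \
  -e "s/<<\.LabelMatchers>>/$matchers/" \
  -e "s/<<\.GroupBy>>/$groupby/")
echo "$query"
```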
Deploy the Prometheus adapter
Install the adapter with your custom configuration to enable metric translation.
helm install prometheus-adapter prometheus-community/prometheus-adapter \
-n monitoring --create-namespace \
-f prometheus-adapter-values.yaml
Create a sample application with custom metrics
Deploy a test application that exposes metrics Prometheus can scrape for HPA scaling decisions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: sample-app
          image: nginx:alpine
          ports:
            - containerPort: 80
            - containerPort: 8080
          volumeMounts:
            - name: metrics-config
              mountPath: /etc/nginx/conf.d
          command: ["/bin/sh"]
          args:
            - -c
            - |
              # Start nginx in the background
              nginx -g "daemon off;" &
              # Minimal metrics server: regenerate a random gauge and serve it on :8080
              while true; do
                REQUESTS=$(shuf -i 10-100 -n 1)
                echo "# TYPE http_requests_per_second gauge" > /tmp/metrics
                echo "http_requests_per_second{pod=\"$HOSTNAME\",namespace=\"default\"} $REQUESTS" >> /tmp/metrics
                # Headers first, then a blank line, then the body
                nc -l -p 8080 -e sh -c 'printf "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n"; cat /tmp/metrics' &
                sleep 15
              done
      volumes:
        - name: metrics-config
          configMap:
            name: nginx-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  namespace: default
data:
  default.conf: |
    server {
      listen 80;
      location / {
        return 200 "Sample App Running";
        add_header Content-Type text/plain;
      }
    }
---
apiVersion: v1
kind: Service
metadata:
  name: sample-app
  namespace: default
  labels:
    app: sample-app
spec:
  ports:
    - port: 80
      targetPort: 80
      name: http
    - port: 8080
      targetPort: 8080
      name: metrics
  selector:
    app: sample-app
kubectl apply -f sample-app.yaml
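For reference, the payload the sample app serves on /metrics must follow the Prometheus text exposition format: an optional # TYPE line, then one sample per line. A sketch of the expected payload, with a made-up pod name and value:

```shell
# Build the exposition-format payload the sample app serves on :8080/metrics
pod_name="sample-app-abc123"   # hypothetical pod name
requests=42                    # hypothetical gauge value
payload=$(printf '# TYPE http_requests_per_second gauge\nhttp_requests_per_second{pod="%s",namespace="default"} %s\n' \
  "$pod_name" "$requests")
echo "$payload"
```

Labels go in curly braces after the metric name, and the sample value follows after a space; anything that deviates from this shape is silently dropped by the scraper.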
Configure Prometheus to scrape the application metrics
Ensure Prometheus discovers and scrapes your application's custom metrics through service annotations.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
# If Prometheus was deployed with Helm, apply the ConfigMap above and point the
# chart at it via server.configMapOverrideName (check that your chart version
# supports this value):
kubectl apply -f prometheus-scrape-config.yaml
helm upgrade prometheus prometheus-community/prometheus \
  -n monitoring \
  --set server.configMapOverrideName=prometheus-server
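The third relabel rule rewrites the scrape address to the port declared in the prometheus.io/port annotation. The rewrite can be approximated with sed (the address below is invented; note that Prometheus uses RE2, where (?::\d+)? is a non-capturing group, which is why its replacement is $1:$2 while the sed approximation needs a third group):

```shell
# Emulate the __address__ relabeling: ([^:]+)(?::\d+)?;(\d+) -> host:annotated_port
joined='10.244.1.17:80;8080'   # __address__ joined with prometheus.io/port by ';'
rewritten=$(echo "$joined" | sed -E 's/([^:]+)(:[0-9]+)?;([0-9]+)/\1:\3/')
echo "$rewritten"
```

The pod's discovered address 10.244.1.17:80 is rewritten to 10.244.1.17:8080 so the metrics port is scraped instead of the service port.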
Create a Horizontal Pod Autoscaler with custom metrics
Configure HPA to scale based on your custom application metrics instead of CPU or memory.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "30"
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 2
          periodSeconds: 60
      selectPolicy: Max
kubectl apply -f custom-metrics-hpa.yaml
Test autoscaling with metric injection
Generate load to trigger scaling based on your custom metrics and verify HPA responds appropriately.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: load-generator
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: load-generator
  template:
    metadata:
      labels:
        app: load-generator
    spec:
      containers:
        - name: load-generator
          image: busybox
          command: ["/bin/sh"]
          args:
            - -c
            - |
              while true; do
                for i in $(seq 1 10); do
                  wget -qO- http://sample-app.default.svc.cluster.local/ &
                done
                sleep 1
              done
---
apiVersion: batch/v1
kind: Job
metadata:
  name: metric-injector
  namespace: default
spec:
  template:
    spec:
      containers:
        - name: metric-injector
          image: bitnami/kubectl:latest
          command: ["/bin/sh"]
          args:
            - -c
            - |
              # Touch an annotation every 10s to force re-evaluation.
              # Note: the Job's service account needs RBAC permission
              # to patch deployments in the default namespace.
              for i in $(seq 1 300); do
                echo "Injecting high metric value: $(date)"
                kubectl patch deployment sample-app -p '{"spec":{"template":{"metadata":{"annotations":{"metric-injection":"'"$(date +%s)"'"}}}}}' || true
                sleep 10
              done
      restartPolicy: OnFailure
kubectl apply -f load-generator.yaml
Verify your setup
Check that the Prometheus adapter is exposing your custom metrics and HPA can access them.
# Verify Prometheus adapter is running
kubectl get pods -n monitoring | grep prometheus-adapter
Check available custom metrics
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
Verify specific metric values
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" | jq .
Monitor HPA status and scaling events
kubectl get hpa sample-app-hpa -w
kubectl describe hpa sample-app-hpa
Check current replica count
kubectl get deployment sample-app
View HPA events
kubectl get events --field-selector involvedObject.name=sample-app-hpa
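If jq isn't available where you run kubectl, a sed one-liner can still pull the interesting fields out of the metrics API response. The payload below is an abridged, illustrative example of the MetricValueList shape; on a live cluster, pipe the output of kubectl get --raw instead:

```shell
# Extract object name and metric value from a (sample) custom-metrics response
response='{"items":[{"describedObject":{"kind":"Pod","name":"sample-app-abc"},"metricName":"http_requests_per_second","value":"47"}]}'
summary=$(echo "$response" | sed -E 's/.*"name":"([^"]*)".*"value":"([^"]*)".*/\1=\2/')
echo "$summary"
```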
Advanced configuration options
Multiple metric types configuration
Configure HPA to use multiple custom metrics for more sophisticated scaling decisions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: multi-metric-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "50"
    - type: Object
      object:
        metric:
          name: nginx_requests_per_second
        describedObject:
          apiVersion: networking.k8s.io/v1
          kind: Ingress
          name: sample-app-ingress
        target:
          type: Value
          value: "100"
    # External metrics require a matching rules.external entry in the
    # adapter configuration (the rules.custom entry above is not enough)
    - type: External
      external:
        metric:
          name: rabbitmq_queue_depth
          selector:
            matchLabels:
              queue: workqueue
        target:
          type: AverageValue
          averageValue: "10"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600
      policies:
        - type: Pods
          value: 2
          periodSeconds: 120
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
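When several metrics are configured, HPA evaluates each one independently and scales to the largest desired replica count, so the most loaded dimension wins. A sketch of that evaluation (observed and target values are invented):

```shell
# With several metrics, HPA computes a desired count per metric and takes the max
current=3
ceil() { awk -v x="$1" 'BEGIN { printf "%d\n", (x == int(x)) ? x : int(x) + 1 }'; }
d_rps=$(ceil "$(awk -v c=$current 'BEGIN { print c * 80 / 50 }')")    # rps: 80 observed vs 50 target
d_queue=$(ceil "$(awk -v c=$current 'BEGIN { print c * 25 / 10 }')")  # queue: 25 observed vs 10 target
desired=$(( d_rps > d_queue ? d_rps : d_queue ))
echo "desired replicas: $desired"
```

Here the queue metric demands more replicas than the request rate, so it dictates the scale-out.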
Prometheus adapter query optimization
Fine-tune metric queries for better performance and accuracy in high-traffic environments.
rules:
  custom:
    # Histograms expose _bucket series; match those and feed them to
    # histogram_quantile (a bare http_request_duration_seconds series
    # does not exist on its own)
    - seriesQuery: 'http_request_duration_seconds_bucket{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)_bucket$"
        as: "http_request_latency_p99"
      metricsQuery: 'histogram_quantile(0.99, sum(rate(<<.Series>>{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>, le))'
    - seriesQuery: 'application_active_connections{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)"
        as: "active_connections"
      metricsQuery: 'avg(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
    - seriesQuery: 'custom_business_metric{namespace!="",service!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          service:
            resource: service
      name:
        matches: "^(.*)"
        as: "business_load_index"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
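The p99 rule works because histogram_quantile linearly interpolates within the bucket that contains the quantile rank. A worked example with invented cumulative bucket counts:

```shell
# How histogram_quantile finds p99: linear interpolation inside the bucket
# that contains the quantile rank (cumulative bucket counts are illustrative)
p99=$(awk 'BEGIN {
  total = 100                 # total observations
  rank = 0.99 * total         # rank of the 99th percentile = 99
  lo = 0.5; hi = 1.0          # bounds of the bucket holding that rank
  below = 98                  # observations at or below le=0.5
  in_bucket = total - below   # observations inside the (0.5, 1.0] bucket
  printf "%.2f\n", lo + (rank - below) / in_bucket * (hi - lo)
}')
echo "p99 latency: ${p99}s"
```

One consequence: the estimate is only as precise as your bucket boundaries, so choose buckets that bracket your latency SLO tightly.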
Production considerations
Scaling policy best practices
- Set appropriate stabilization windows to prevent flapping
- Use multiple metrics with different scaling behaviors
- Implement gradual scale-down policies to maintain service stability
- Monitor metric availability and implement fallback to CPU/memory scaling
- Test scaling behavior under various load patterns
Metric reliability considerations
If your cluster enforces resource quotas and limits, make sure the metrics pipeline itself survives resource pressure: set explicit requests and limits on the Prometheus adapter, monitor its own resource usage, and keep a CPU or memory target in each HPA as a fallback for when custom metrics are unavailable.
For comprehensive infrastructure monitoring beyond just autoscaling metrics, you might want to explore monitoring your entire Kubernetes cluster with Prometheus Operator.
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| HPA shows "unknown" for custom metrics | Prometheus adapter not configured correctly | Check adapter logs: kubectl logs -n monitoring deployment/prometheus-adapter |
| Metrics API returns empty results | Query doesn't match any series | Verify query in Prometheus UI, check label matching |
| Scaling is too aggressive | Missing stabilization windows | Add behavior section with appropriate stabilization settings |
| Custom metrics not appearing | Prometheus not scraping application | Verify service annotations and Prometheus targets |
| HPA scales down immediately | Metric returns zero during scale-up | Implement metric smoothing or minimum value thresholds |
| Adapter fails to start | Invalid Prometheus URL or configuration | Check adapter config and Prometheus service connectivity |
Next steps
- Implement Kubernetes cluster autoscaler for automatic node scaling
- Configure Kubernetes Vertical Pod Autoscaler for resource optimization
- Set up Kubernetes event-driven autoscaling with KEDA
- Monitor Kubernetes HPA performance with Grafana dashboards
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Default values
NAMESPACE="${1:-monitoring}"
PROMETHEUS_URL="${2:-http://prometheus-server.monitoring.svc.cluster.local}"
PROMETHEUS_PORT="${3:-80}"
# Usage function
usage() {
  echo "Usage: $0 [namespace] [prometheus_url] [prometheus_port]"
  echo "  namespace:       Kubernetes namespace (default: monitoring)"
  echo "  prometheus_url:  Prometheus server URL (default: http://prometheus-server.monitoring.svc.cluster.local)"
  echo "  prometheus_port: Prometheus port (default: 80)"
  exit 1
}
# Check if help requested
[[ "${1:-}" == "--help" || "${1:-}" == "-h" ]] && usage
# Detect distribution
if [ -f /etc/os-release ]; then
  . /etc/os-release
  case "$ID" in
    ubuntu|debian) PKG_MGR="apt"; PKG_INSTALL="apt install -y"; PKG_UPDATE="apt update" ;;
    almalinux|rocky|centos|rhel|ol|fedora) PKG_MGR="dnf"; PKG_INSTALL="dnf install -y"; PKG_UPDATE="dnf check-update || true" ;;
    amzn) PKG_MGR="yum"; PKG_INSTALL="yum install -y"; PKG_UPDATE="yum check-update || true" ;;
    *) echo -e "${RED}Unsupported distro: $ID${NC}"; exit 1 ;;
  esac
else
  echo -e "${RED}/etc/os-release not found. Cannot detect distribution.${NC}"
  exit 1
fi
# Cleanup function
cleanup() {
  echo -e "${RED}Installation failed. Cleaning up...${NC}"
  # Remove only what this script created; never delete the whole namespace,
  # which may also contain the user's existing Prometheus install
  helm uninstall prometheus-adapter -n "$NAMESPACE" 2>/dev/null || true
  rm -f /tmp/prometheus-adapter-values.yaml /tmp/sample-app.yaml /tmp/hpa.yaml
}
trap cleanup ERR
# Check prerequisites
echo -e "${BLUE}[1/8] Checking prerequisites...${NC}"
if [[ $EUID -ne 0 ]]; then
  echo -e "${RED}This script must be run as root${NC}"
  exit 1
fi
# Install required packages
echo -e "${BLUE}[2/8] Installing required packages...${NC}"
# eval so the "|| true" inside PKG_UPDATE is honored (check-update exits
# non-zero when updates exist, which would otherwise trip set -e)
eval "$PKG_UPDATE"
$PKG_INSTALL curl wget
# Check kubectl
if ! command -v kubectl &> /dev/null; then
  echo -e "${YELLOW}kubectl not found, installing...${NC}"
  case "$ID" in
    ubuntu|debian)
      mkdir -p /etc/apt/keyrings
      curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
      echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | tee /etc/apt/sources.list.d/kubernetes.list
      apt update && apt install -y kubectl
      ;;
    *)
      cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
EOF
      $PKG_INSTALL kubectl
      ;;
  esac
fi
# Check helm
if ! command -v helm &> /dev/null; then
  echo -e "${YELLOW}Helm not found, installing...${NC}"
  curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
fi
# Verify cluster connectivity
echo -e "${BLUE}[3/8] Verifying Kubernetes cluster connectivity...${NC}"
if ! kubectl cluster-info &> /dev/null; then
  echo -e "${RED}Cannot connect to Kubernetes cluster. Please check kubectl configuration.${NC}"
  exit 1
fi
# Add Helm repositories
echo -e "${BLUE}[4/8] Adding Helm repositories...${NC}"
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Create Prometheus adapter configuration
echo -e "${BLUE}[5/8] Creating Prometheus adapter configuration...${NC}"
cat > /tmp/prometheus-adapter-values.yaml <<EOF
prometheus:
  url: $PROMETHEUS_URL
  port: $PROMETHEUS_PORT
rules:
  custom:
    - seriesQuery: 'http_requests_per_second{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)"
        as: "http_requests_per_second"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
    - seriesQuery: 'nginx_ingress_controller_requests{namespace!="",ingress!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          ingress:
            resource: ingress
      name:
        matches: "^(.*)"
        as: "nginx_requests_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
    - seriesQuery: 'rabbitmq_queue_messages{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)"
        as: "rabbitmq_queue_depth"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
  resource:
    cpu:
      containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}[3m])) by (<<.GroupBy>>)
      nodeQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>,id='/'}[3m])) by (<<.GroupBy>>)
      resources:
        overrides:
          instance:
            resource: node
          namespace:
            resource: namespace
          pod:
            resource: pod
      containerLabel: container
    memory:
      containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,container!="POD",container!="",pod!=""}) by (<<.GroupBy>>)
      nodeQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,id='/'}) by (<<.GroupBy>>)
      resources:
        overrides:
          instance:
            resource: node
          namespace:
            resource: namespace
          pod:
            resource: pod
      containerLabel: container
EOF
chmod 644 /tmp/prometheus-adapter-values.yaml
# Deploy Prometheus adapter
echo -e "${BLUE}[6/8] Deploying Prometheus adapter...${NC}"
kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -
helm upgrade --install prometheus-adapter prometheus-community/prometheus-adapter \
-n "$NAMESPACE" \
-f /tmp/prometheus-adapter-values.yaml \
--wait
# Create sample application
echo -e "${BLUE}[7/8] Creating sample application with custom metrics...${NC}"
cat > /tmp/sample-app.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: sample-app
          image: nginx:alpine
          ports:
            - containerPort: 80
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: sample-app-service
  namespace: default
  labels:
    app: sample-app
spec:
  selector:
    app: sample-app
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: metrics
      port: 8080
      targetPort: 8080
EOF
kubectl apply -f /tmp/sample-app.yaml
# Create HPA configuration
cat > /tmp/hpa.yaml <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
EOF
kubectl apply -f /tmp/hpa.yaml
# Verification
echo -e "${BLUE}[8/8] Verifying installation...${NC}"
echo -e "${YELLOW}Waiting for Prometheus adapter to be ready...${NC}"
kubectl wait --for=condition=available deployment/prometheus-adapter -n "$NAMESPACE" --timeout=300s
echo -e "${YELLOW}Checking custom metrics API...${NC}"
sleep 30
if kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" &> /dev/null; then
  echo -e "${GREEN}✓ Custom metrics API is available${NC}"
else
  echo -e "${YELLOW}⚠ Custom metrics API not yet available (may take a few minutes)${NC}"
fi
echo -e "${YELLOW}Verifying HPA status...${NC}"
kubectl get hpa sample-app-hpa -n default
# Cleanup temporary files
rm -f /tmp/prometheus-adapter-values.yaml /tmp/sample-app.yaml /tmp/hpa.yaml
echo -e "${GREEN}Installation completed successfully!${NC}"
echo -e "${BLUE}Next steps:${NC}"
echo "1. Verify custom metrics: kubectl get --raw '/apis/custom.metrics.k8s.io/v1beta1'"
echo "2. Check HPA status: kubectl describe hpa sample-app-hpa"
echo "3. Test scaling by generating load on the sample application"
echo "4. Monitor with: kubectl get hpa -w"
Review the script before running. Execute with: bash install.sh