Configure Kubernetes horizontal pod autoscaler for dynamic scaling based on resource metrics

Intermediate · 45 min · Apr 11, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up HPA with CPU and memory targets for automatic pod scaling. Configure metrics server and Prometheus adapter for custom metrics monitoring. Enable dynamic workload scaling based on resource utilization.

Prerequisites

  • Kubernetes cluster with admin access
  • kubectl installed and configured
  • Helm 3 installed

What this solves

Horizontal Pod Autoscaler (HPA) automatically scales your Kubernetes workloads based on resource metrics like CPU and memory usage. This prevents performance degradation during traffic spikes and reduces costs by scaling down during low demand. HPA requires a metrics server to collect resource data and can use custom metrics from Prometheus for advanced scaling decisions.
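The controller's core calculation, as documented in the Kubernetes HPA algorithm details, can be sketched in a few lines of Python (the 0.1 tolerance is the controller's default and is configurable per cluster):

```python
import math

# Sketch of the HPA controller's core calculation:
#   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
# A tolerance band (0.1 by default) suppresses scaling on small deviations.
def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    return math.ceil(current_replicas * ratio)

# 4 pods averaging 80% CPU against a 50% target -> scale to 7
print(desired_replicas(4, 80, 50))   # 7
# 52% against a 50% target is within tolerance -> stay at 4
print(desired_replicas(4, 52, 50))   # 4
```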

Step-by-step installation

Update system packages

Start by updating your system packages to ensure compatibility with Kubernetes components.

# Debian/Ubuntu
sudo apt update && sudo apt upgrade -y

# AlmaLinux/Rocky Linux
sudo dnf update -y

Install kubectl and verify cluster access

Install kubectl if not already available and verify you can connect to your Kubernetes cluster.

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

Verify cluster connectivity:

kubectl cluster-info
kubectl get nodes

Deploy the metrics server

The metrics server collects resource metrics from kubelets and exposes them via the Metrics API for HPA consumption.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Wait for the metrics server to be ready:

kubectl wait --for=condition=ready pod -l k8s-app=metrics-server -n kube-system --timeout=300s

Configure metrics server for local clusters

If running on a local cluster like minikube or k3s, you may need to disable TLS verification for the metrics server.

kubectl patch deployment metrics-server -n kube-system --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/containers/0/args/-",
    "value": "--kubelet-insecure-tls"
  }
]'
Warning: Only use --kubelet-insecure-tls in development environments. Production clusters should use proper TLS certificates.

Create a test deployment

Deploy a sample application to test HPA functionality. This deployment includes resource requests required for HPA.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-test-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hpa-test-app
  template:
    metadata:
      labels:
        app: hpa-test-app
    spec:
      containers:
      - name: app
        image: nginx:1.25
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-test-service
  namespace: default
spec:
  selector:
    app: hpa-test-app
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
Save the manifests above as hpa-test-app.yaml, then apply them:

kubectl apply -f hpa-test-app.yaml

Configure HPA with CPU target

Create an HPA that scales on CPU utilization. This example keeps between 2 and 10 replicas, scaling up when average CPU utilization exceeds 50% of the pods' requests.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-test-cpu
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-test-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
Save the manifest as hpa-cpu.yaml, then apply it:

kubectl apply -f hpa-cpu.yaml
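The behavior section limits how quickly the HPA changes replica counts. A simplified sketch of how a single scaleUp Percent policy caps growth in one evaluation period (the real controller also tracks replica counts at the start of each period and honors the stabilization window; the rounding here is an assumption):

```python
import math

# Simplified sketch: a scaleUp Percent policy caps the replica count that
# one evaluation period may reach, relative to the current count.
def cap_scale_up(current, desired, percent):
    max_allowed = math.ceil(current * (1 + percent / 100))
    return min(desired, max_allowed)

# With value: 100 (+100% per period), 2 replicas can at most double in one
# period, even if the metric calls for 10
print(cap_scale_up(2, 10, 100))   # 4
```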

Configure HPA with memory target

Create an additional HPA that scales based on memory utilization alongside CPU metrics.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-test-cpu-memory
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-test-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Pods
        value: 2
        periodSeconds: 30
      - type: Percent
        value: 50
        periodSeconds: 30
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
Delete the CPU-only HPA first, since two HPAs targeting the same deployment will fight over the replica count, then apply the new manifest (saved as hpa-cpu-memory.yaml):

kubectl delete hpa hpa-test-cpu
kubectl apply -f hpa-cpu-memory.yaml
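When multiple metrics are configured, the controller computes a desired replica count for each metric independently and acts on the largest recommendation. A sketch:

```python
import math

def replicas_for_metric(current, usage, target):
    # Per-metric recommendation: ceil(current * usage / target)
    return math.ceil(current * usage / target)

def desired_across_metrics(current, metrics):
    # HPA evaluates every metric and scales to the largest recommendation
    return max(replicas_for_metric(current, u, t) for u, t in metrics)

# 3 replicas: CPU at 120% of a 60% target, memory at 80% of a 70% target.
# CPU recommends 6, memory recommends 4 -> scale to 6.
print(desired_across_metrics(3, [(120, 60), (80, 70)]))   # 6
```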

Install Prometheus for custom metrics

Deploy Prometheus to collect custom application metrics that can be used for HPA scaling decisions.

kubectl create namespace monitoring
kubectl create -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml

Note: the bundle's CRDs are too large for client-side kubectl apply (they exceed the last-applied-configuration annotation size limit), so use kubectl create or kubectl apply --server-side here.

Create a basic Prometheus configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
    scrape_configs:
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus:v2.45.0
        ports:
        - containerPort: 9090
        volumeMounts:
        - name: config
          mountPath: /etc/prometheus
        args:
          - '--config.file=/etc/prometheus/prometheus.yml'
          - '--storage.tsdb.path=/prometheus'
          - '--web.console.libraries=/etc/prometheus/console_libraries'
          - '--web.console.templates=/etc/prometheus/consoles'
      volumes:
      - name: config
        configMap:
          name: prometheus-config
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  selector:
    app: prometheus
  ports:
  - port: 9090
    targetPort: 9090
Save the manifests as prometheus.yaml, then apply them:

kubectl apply -f prometheus.yaml

Install Prometheus adapter

The Prometheus adapter exposes custom metrics from Prometheus to the Kubernetes custom metrics API for HPA consumption.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

Create a values file named prometheus-adapter-values.yaml with the adapter configuration:

prometheus:
  url: http://prometheus.monitoring.svc.cluster.local:9090
  port: 9090
rules:
  custom:
  - seriesQuery: 'http_requests_per_second{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace:
          resource: namespace
        pod:
          resource: pod
    name:
      matches: "^(.*)"
      as: "http_requests_per_second"
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
Then install the adapter with those values:

helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring \
  --values prometheus-adapter-values.yaml

Configure HPA with custom metrics

Create an HPA that uses custom metrics from Prometheus for scaling decisions based on application-specific metrics.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-custom-metrics
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-test-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
      - type: Pods
        value: 4
        periodSeconds: 30
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
Save the manifest as hpa-custom-metrics.yaml and apply it:

kubectl apply -f hpa-custom-metrics.yaml

Note: This example assumes your application exposes an http_requests_per_second metric. Adjust the metric name to match what your application actually exports.
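For a Pods metric with an AverageValue target, the controller compares the per-pod average of the metric to the target value. A sketch against the 100 req/s target above (the per-pod request rates are illustrative):

```python
import math

# AverageValue target: scale so the per-pod average approaches the target
def desired_from_average_value(current, per_pod_values, target_average):
    avg = sum(per_pod_values) / len(per_pod_values)
    return math.ceil(current * avg / target_average)

# 3 pods serving 150, 90 and 120 req/s against a 100 req/s target:
# the average is 120, so scale to ceil(3 * 1.2) = 4
print(desired_from_average_value(3, [150, 90, 120], 100))   # 4
```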

Test HPA scaling behavior

Generate load to test the HPA scaling functionality and observe the scaling behavior.

kubectl run load-generator --image=busybox --restart=Never --rm -it -- /bin/sh -c "while true; do wget -q -O- http://hpa-test-service; done"

In another terminal, monitor the HPA scaling:

kubectl get hpa -w
kubectl get pods -l app=hpa-test-app -w

Monitor and troubleshoot HPA scaling behavior

Monitor HPA status and events

Use these commands to monitor HPA behavior and troubleshoot scaling issues.

# Check HPA status
kubectl describe hpa hpa-custom-metrics

View HPA events

kubectl get events --field-selector involvedObject.kind=HorizontalPodAutoscaler

Check metrics server logs

kubectl logs -n kube-system -l k8s-app=metrics-server

Verify metrics availability

kubectl top nodes
kubectl top pods

Configure HPA with vertical scaling considerations

Set resource requests and limits appropriately to ensure HPA calculations are accurate and prevent resource contention.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-optimized-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hpa-optimized-app
  template:
    metadata:
      labels:
        app: hpa-optimized-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - name: app
        image: nginx:1.25
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        ports:
        - containerPort: 80
        - containerPort: 8080
          name: metrics
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 20
Save the manifest as optimized-deployment.yaml, then apply it:

kubectl apply -f optimized-deployment.yaml
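Utilization-based targets are computed against the pods' resource requests, not their limits, which is why the requests above directly affect HPA accuracy. A sketch of the calculation (the usage figures are illustrative):

```python
# Utilization is measured against resource *requests*, not limits:
# 70% CPU utilization on a 200m request means 140m of actual usage.
def average_utilization_pct(usages_m, request_m):
    total_usage = sum(usages_m)
    total_requests = request_m * len(usages_m)
    return round(100 * total_usage / total_requests)

# Two pods using 150m and 250m CPU against 200m requests each
print(average_utilization_pct([150, 250], 200))   # 100
```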

Verify your setup

# Check metrics server is running
kubectl get deployment metrics-server -n kube-system

Verify HPA is active

kubectl get hpa

Check current resource usage

kubectl top pods

Verify custom metrics API

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

Check Prometheus adapter

kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus-adapter

View HPA scaling history

kubectl describe hpa hpa-custom-metrics

Common issues

| Symptom | Cause | Fix |
| --- | --- | --- |
| HPA shows "unknown" for metrics | Metrics server not running or misconfigured | Check metrics server logs: kubectl logs -n kube-system -l k8s-app=metrics-server |
| HPA not scaling up | Resource requests not set on pods | Add CPU/memory requests to the deployment spec |
| Custom metrics unavailable | Prometheus adapter configuration error | Check adapter logs: kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus-adapter |
| Scaling too aggressive | Default behavior settings too sensitive | Increase stabilizationWindowSeconds and tighten scaling policies in the HPA spec |
| Pods stuck in Pending | Insufficient cluster resources | Check node capacity with kubectl describe nodes and verify resource quotas |
| HPA shows "FailedGetResourceMetric" | Metrics API not accessible | Verify the metrics server service: kubectl get svc -n kube-system metrics-server |
