Set up HPA with CPU and memory targets for automatic pod scaling. Configure metrics server and Prometheus adapter for custom metrics monitoring. Enable dynamic workload scaling based on resource utilization.
## Prerequisites

- Kubernetes cluster with admin access
- kubectl installed and configured
- Helm 3 installed
## What this solves
Horizontal Pod Autoscaler (HPA) automatically scales your Kubernetes workloads based on resource metrics like CPU and memory usage. This prevents performance degradation during traffic spikes and reduces costs by scaling down during low demand. HPA requires a metrics server to collect resource data and can use custom metrics from Prometheus for advanced scaling decisions.
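The scaling decision itself is a simple ratio. Below is a minimal shell sketch of the formula the controller uses (illustrative only — the real controller also applies a tolerance around the target, the min/max replica bounds, and any configured behavior policies):

```shell
# HPA's core formula:
#   desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)
hpa_desired() {
  current_replicas=$1
  current_value=$2
  target_value=$3
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (current_replicas * current_value + target_value - 1) / target_value ))
}

hpa_desired 2 90 50   # 2 pods at 90% CPU with a 50% target -> 4
hpa_desired 4 40 50   # 4 pods at 40% CPU with a 50% target -> 4
```

A workload running hot relative to its target roughly doubles, while one under target shrinks only as far as the ratio allows.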
## Step-by-step installation

### Update system packages

Start by updating the host's packages so the tooling you install next (curl, kubectl) is current.

```bash
sudo apt update && sudo apt upgrade -y
```
### Install kubectl and verify cluster access

Install kubectl if it is not already available, then confirm you can reach your Kubernetes cluster.

```bash
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
```

Verify cluster connectivity:

```bash
kubectl cluster-info
kubectl get nodes
```
### Deploy the metrics server

The metrics server collects resource metrics from kubelets and exposes them through the Metrics API, which the HPA queries.

```bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```

Wait for the metrics server to be ready:

```bash
kubectl wait --for=condition=ready pod -l k8s-app=metrics-server -n kube-system --timeout=300s
```
### Configure metrics server for local clusters

On a local cluster such as minikube or k3s, the kubelet typically serves a self-signed certificate, so you may need to disable TLS verification for the metrics server. Do not use this flag in production.

```bash
kubectl patch deployment metrics-server -n kube-system --type='json' -p='[
  {
    "op": "add",
    "path": "/spec/template/spec/containers/0/args/-",
    "value": "--kubelet-insecure-tls"
  }
]'
```
### Create a test deployment

Deploy a sample application to test HPA functionality. The deployment sets the resource requests that HPA needs to compute utilization. Save the following as `hpa-test-app.yaml`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-test-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hpa-test-app
  template:
    metadata:
      labels:
        app: hpa-test-app
    spec:
      containers:
        - name: app
          image: nginx:1.25
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-test-service
  namespace: default
spec:
  selector:
    app: hpa-test-app
  ports:
    - port: 80
      targetPort: 80
  type: ClusterIP
```

```bash
kubectl apply -f hpa-test-app.yaml
```
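One detail worth stressing: `Utilization` targets are percentages of the pod's resource *requests*, not its limits, which is why the deployment above must declare requests. A quick sketch of that arithmetic:

```shell
# Utilization% = actual usage / requested amount * 100.
# With the 100m CPU request above, 80m of usage reads as 80%.
utilization_pct() {
  usage_millicores=$1
  request_millicores=$2
  echo $(( usage_millicores * 100 / request_millicores ))
}

utilization_pct 80 100    # -> 80
utilization_pct 150 100   # -> 150 (usage may exceed requests, up to the limit)
```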
### Configure HPA with CPU target

Create an HPA that scales on CPU utilization. This example keeps between 2 and 10 replicas and scales when average CPU exceeds 50% of requests. Save it as `hpa-cpu.yaml`:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-test-cpu
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-test-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
```

```bash
kubectl apply -f hpa-cpu.yaml
```
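The `scaleUp` policy above (`Percent: 100` every 15 seconds) means replicas can at most double each period, capped at `maxReplicas`. A small sketch of the resulting ramp from 2 to 10 pods:

```shell
# Each period may add at most 100% of current replicas (i.e. double),
# never exceeding the max.
ramp() {
  replicas=$1
  max=$2
  path=$replicas
  while [ "$replicas" -lt "$max" ]; do
    replicas=$(( replicas * 2 ))
    if [ "$replicas" -gt "$max" ]; then replicas=$max; fi
    path="$path -> $replicas"
  done
  echo "$path"
}

ramp 2 10   # -> 2 -> 4 -> 8 -> 10
```

So a sustained spike is absorbed in roughly three policy periods (~45 seconds) on top of the stabilization window.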
### Configure HPA with memory target

Replace the CPU-only autoscaler with one that scales on both CPU and memory utilization. Two HPAs must not target the same workload, so delete the first HPA before applying the new one. Save the manifest as `hpa-cpu-memory.yaml`:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-test-cpu-memory
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-test-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
        - type: Pods
          value: 2
          periodSeconds: 30
        - type: Percent
          value: 50
          periodSeconds: 30
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
```

```bash
kubectl delete hpa hpa-test-cpu
kubectl apply -f hpa-cpu-memory.yaml
```
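When several metrics are configured, the controller computes a desired replica count for each metric independently and scales to the largest. A sketch with hypothetical readings for the HPA above (4 pods, 90% CPU against a 60% target, 50% memory against a 70% target):

```shell
# ceil(a / b) for positive integers
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

desired_cpu=$(ceil_div $(( 4 * 90 )) 60)   # ceil(360/60) = 6
desired_mem=$(ceil_div $(( 4 * 50 )) 70)   # ceil(200/70) = 3

# The HPA scales to the larger of the two proposals.
if [ "$desired_cpu" -ge "$desired_mem" ]; then desired=$desired_cpu; else desired=$desired_mem; fi
echo "$desired"   # -> 6
```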
### Install Prometheus for custom metrics

Deploy Prometheus to collect application metrics that can drive HPA scaling decisions. The operator's CRDs are too large for client-side apply, so use server-side apply for the bundle.

```bash
kubectl create namespace monitoring
kubectl apply --server-side -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml
```
Create a basic Prometheus configuration:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:v2.45.0
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
          args:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--storage.tsdb.path=/prometheus'
            - '--web.console.libraries=/etc/prometheus/console_libraries'
            - '--web.console.templates=/etc/prometheus/consoles'
      volumes:
        - name: config
          configMap:
            name: prometheus-config
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  selector:
    app: prometheus
  ports:
    - port: 9090
      targetPort: 9090
```

Save these manifests as `prometheus.yaml` and apply:

```bash
kubectl apply -f prometheus.yaml
```
### Install Prometheus adapter

The Prometheus adapter exposes metrics from Prometheus through the Kubernetes custom metrics API so the HPA can consume them.

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

Create `prometheus-adapter-values.yaml` with a rule for the custom metric. Note that `prometheus.url` must not include the port; the chart appends `prometheus.port` to it.

```yaml
prometheus:
  url: http://prometheus.monitoring.svc.cluster.local
  port: 9090

rules:
  custom:
    - seriesQuery: 'http_requests_per_second{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: "^(.*)"
        as: "http_requests_per_second"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

Install the adapter with this configuration:

```bash
helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring \
  --values prometheus-adapter-values.yaml
```
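For orientation, the adapter expands the Go-template `metricsQuery` above into a PromQL query on each request. For an HPA targeting pods in `default`, the rendered query looks roughly like the following (the exact label matchers are filled in by the adapter, so the namespace and pod pattern here are illustrative):

```
sum(http_requests_per_second{namespace="default",pod=~"hpa-test-app-.*"}) by (pod)
```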
### Configure HPA with custom metrics

Create an HPA that scales on the custom metric exposed through the adapter. Note that the stock nginx test app does not actually export `http_requests_per_second`; this example assumes your workload exposes such a metric. Since only one HPA should target a deployment, remove the earlier autoscaler first with `kubectl delete hpa hpa-test-cpu-memory`, then save the manifest below as `hpa-custom-metrics.yaml` and apply it with `kubectl apply -f hpa-custom-metrics.yaml`.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-custom-metrics
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-test-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
        - type: Pods
          value: 4
          periodSeconds: 30
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 25
          periodSeconds: 60
```
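A `Pods` metric with `type: AverageValue` and `averageValue: "100"` asks for roughly 100 requests per second per pod: the controller divides the metric's total across all pods by the target. A sketch with a hypothetical total of 850 req/s:

```shell
# desiredReplicas = ceil(total_metric / averageValue)
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

ceil_div 850 100   # 850 req/s total at 100 req/s per pod -> 9 pods
ceil_div 800 100   # 800 req/s total -> 8 pods
```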
## Test HPA scaling behavior

Generate load against the test service and observe the autoscaler react.

```bash
kubectl run load-generator --image=busybox --restart=Never --rm -it -- \
  /bin/sh -c "while true; do wget -q -O- http://hpa-test-service; done"
```

In another terminal, monitor the scaling:

```bash
kubectl get hpa -w
kubectl get pods -l app=hpa-test-app -w
```
## Monitor and troubleshoot HPA scaling behavior

### Monitor HPA status and events

Use these commands to inspect HPA behavior and troubleshoot scaling issues.

```bash
# Check HPA status and scaling events
kubectl describe hpa hpa-custom-metrics

# View HPA events cluster-wide
kubectl get events --field-selector involvedObject.kind=HorizontalPodAutoscaler

# Check metrics server logs
kubectl logs -n kube-system -l k8s-app=metrics-server

# Verify metrics availability
kubectl top nodes
kubectl top pods
```
### Tune resource requests and limits for accurate scaling

HPA utilization math is based on resource requests, so set requests (and limits) that reflect real usage; inaccurate requests skew scaling decisions and can cause resource contention. The Prometheus annotations below assume the container serves metrics on port 8080; stock nginx does not, so substitute your instrumented application image. Save the manifest as `optimized-deployment.yaml`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-optimized-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hpa-optimized-app
  template:
    metadata:
      labels:
        app: hpa-optimized-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: app
          image: nginx:1.25
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          ports:
            - containerPort: 80
            - containerPort: 8080
              name: metrics
          readinessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 20
```

```bash
kubectl apply -f optimized-deployment.yaml
```
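Before raising `maxReplicas`, it is worth sanity-checking that the cluster can actually schedule the worst case, since requests are reserved per replica:

```shell
# Worst-case reserved CPU at full scale-out: requests * maxReplicas.
# With 200m CPU requests and a maxReplicas of 20:
cpu_request_m=200
max_replicas=20
max_cpu_m=$(( cpu_request_m * max_replicas ))
echo "${max_cpu_m}m CPU"   # -> 4000m CPU (4 cores), plus headroom for system pods
```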
## Verify your setup

```bash
# Check metrics server is running
kubectl get deployment metrics-server -n kube-system

# Verify HPA is active
kubectl get hpa

# Check current resource usage
kubectl top pods

# Verify custom metrics API
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

# Check Prometheus adapter
kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus-adapter

# View HPA scaling history
kubectl describe hpa hpa-custom-metrics
```
## Common issues

| Symptom | Cause | Fix |
|---|---|---|
| HPA shows "unknown" for metrics | Metrics server not running or misconfigured | Check metrics server logs: `kubectl logs -n kube-system -l k8s-app=metrics-server` |
| HPA not scaling up | Resource requests not set on pods | Add CPU/memory requests to the deployment spec |
| Custom metrics unavailable | Prometheus adapter configuration error | Verify adapter logs: `kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus-adapter` |
| Scaling too aggressive | Behavior settings too sensitive | Adjust `stabilizationWindowSeconds` and scaling policies in the HPA spec |
| Pods stuck in Pending | Insufficient cluster resources | Check node resources with `kubectl describe nodes` and verify resource quotas |
| HPA shows "FailedGetResourceMetric" | Metrics API not accessible | Verify the metrics server service: `kubectl get svc -n kube-system metrics-server` |
## Next steps
- Monitor Kubernetes clusters with Prometheus and Grafana for container orchestration insights
- Implement Kubernetes resource quotas and limits for namespace isolation and workload management
- Configure Kubernetes vertical pod autoscaler for resource optimization
- Implement Kubernetes cluster autoscaler for automatic node scaling
- Configure Kubernetes pod disruption budgets for high availability
## Automated install script

Run this to automate the entire setup:
```bash
#!/usr/bin/env bash
set -euo pipefail

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Global variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TEMP_DIR=""
PKG_MGR=""
PKG_INSTALL=""
LOCAL_CLUSTER="${1:-false}"

# Usage function
usage() {
    echo "Usage: $0 [local]"
    echo "  local: Configure for local cluster (minikube/k3s) - adds --kubelet-insecure-tls"
    echo "Example: $0 local"
    exit 1
}

# Cleanup function
cleanup() {
    local exit_code=$?
    if [ -n "$TEMP_DIR" ] && [ -d "$TEMP_DIR" ]; then
        echo -e "${YELLOW}[CLEANUP] Removing temporary files...${NC}"
        rm -rf "$TEMP_DIR"
    fi
    if [ $exit_code -ne 0 ]; then
        echo -e "${RED}[ERROR] Script failed. Check the output above for details.${NC}"
    fi
    exit $exit_code
}

# Set up error handling
trap cleanup ERR EXIT

# Logging functions
log_info() {
    echo -e "${BLUE}[INFO]${NC} $1"
}

log_success() {
    echo -e "${GREEN}[SUCCESS]${NC} $1"
}

log_warning() {
    echo -e "${YELLOW}[WARNING]${NC} $1"
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

# Check if running as root or with sudo
check_privileges() {
    if [ "$EUID" -ne 0 ] && ! sudo -n true 2>/dev/null; then
        log_error "This script requires root privileges or sudo access"
        exit 1
    fi
}

# Detect distribution and package manager
detect_distro() {
    if [ ! -f /etc/os-release ]; then
        log_error "/etc/os-release not found. Cannot detect distribution."
        exit 1
    fi
    . /etc/os-release
    case "$ID" in
        ubuntu|debian)
            PKG_MGR="apt"
            PKG_INSTALL="apt install -y"
            ;;
        almalinux|rocky|centos|rhel|ol|fedora)
            PKG_MGR="dnf"
            PKG_INSTALL="dnf install -y"
            if ! command -v dnf &> /dev/null; then
                PKG_MGR="yum"
                PKG_INSTALL="yum install -y"
            fi
            ;;
        amzn)
            PKG_MGR="yum"
            PKG_INSTALL="yum install -y"
            ;;
        *)
            log_error "Unsupported distribution: $ID"
            exit 1
            ;;
    esac
    log_info "Detected distribution: $PRETTY_NAME"
    log_info "Using package manager: $PKG_MGR"
}

# Update system packages
update_system() {
    echo -e "${BLUE}[1/8] Updating system packages...${NC}"
    case "$PKG_MGR" in
        apt)
            sudo apt update && sudo apt upgrade -y
            ;;
        dnf)
            sudo dnf update -y
            ;;
        yum)
            sudo yum update -y
            ;;
    esac
    log_success "System packages updated"
}

# Install prerequisites
install_prerequisites() {
    echo -e "${BLUE}[2/8] Installing prerequisites...${NC}"
    case "$PKG_MGR" in
        apt)
            sudo $PKG_INSTALL curl wget ca-certificates
            ;;
        dnf|yum)
            sudo $PKG_INSTALL curl wget ca-certificates
            ;;
    esac
    log_success "Prerequisites installed"
}

# Install kubectl
install_kubectl() {
    echo -e "${BLUE}[3/8] Installing kubectl...${NC}"
    if command -v kubectl &> /dev/null; then
        log_info "kubectl already installed, checking version..."
        kubectl version --client
    else
        # Create temporary directory
        TEMP_DIR=$(mktemp -d)
        cd "$TEMP_DIR"
        # Download kubectl
        log_info "Downloading kubectl..."
        curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
        # Install kubectl
        sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
        log_success "kubectl installed successfully"
    fi
    # Verify cluster connectivity
    echo "Verifying cluster connectivity..."
    if ! kubectl cluster-info &> /dev/null; then
        log_error "Cannot connect to Kubernetes cluster. Please ensure your kubeconfig is properly configured."
        exit 1
    fi
    kubectl cluster-info
    kubectl get nodes
    log_success "Cluster connectivity verified"
}

# Deploy metrics server
deploy_metrics_server() {
    echo -e "${BLUE}[4/8] Deploying metrics server...${NC}"
    # Check if metrics server is already deployed
    if kubectl get deployment metrics-server -n kube-system &> /dev/null; then
        log_info "Metrics server already exists, checking status..."
    else
        log_info "Deploying metrics server..."
        kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
    fi
    # Wait for metrics server to be ready
    log_info "Waiting for metrics server to be ready (timeout: 300s)..."
    kubectl wait --for=condition=ready pod -l k8s-app=metrics-server -n kube-system --timeout=300s
    log_success "Metrics server deployed successfully"
}

# Configure metrics server for local clusters
configure_local_metrics_server() {
    if [ "$LOCAL_CLUSTER" = "local" ]; then
        echo -e "${BLUE}[5/8] Configuring metrics server for local cluster...${NC}"
        log_warning "Configuring metrics server with --kubelet-insecure-tls for local development"
        log_warning "This should NOT be used in production environments"
        kubectl patch deployment metrics-server -n kube-system --type='json' -p='[
          {
            "op": "add",
            "path": "/spec/template/spec/containers/0/args/-",
            "value": "--kubelet-insecure-tls"
          }
        ]'
        # Wait for rollout to complete
        kubectl rollout status deployment/metrics-server -n kube-system --timeout=300s
        log_success "Metrics server configured for local cluster"
    else
        echo -e "${BLUE}[5/8] Skipping local cluster configuration...${NC}"
        log_info "Using production TLS configuration"
    fi
}

# Create test deployment
create_test_deployment() {
    echo -e "${BLUE}[6/8] Creating test deployment...${NC}"
    # Create temporary directory if not exists
    if [ -z "$TEMP_DIR" ]; then
        TEMP_DIR=$(mktemp -d)
    fi
    # Create test application manifest
    cat > "$TEMP_DIR/hpa-test-app.yaml" << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-test-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hpa-test-app
  template:
    metadata:
      labels:
        app: hpa-test-app
    spec:
      containers:
        - name: app
          image: nginx:1.25
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-test-service
  namespace: default
spec:
  selector:
    app: hpa-test-app
  ports:
    - port: 80
      targetPort: 80
  type: ClusterIP
EOF
    kubectl apply -f "$TEMP_DIR/hpa-test-app.yaml"
    # Wait for deployment to be ready
    kubectl rollout status deployment/hpa-test-app --timeout=300s
    log_success "Test deployment created successfully"
}

# Configure HPA
configure_hpa() {
    echo -e "${BLUE}[7/8] Configuring Horizontal Pod Autoscaler...${NC}"
    # Create HPA with CPU and memory metrics
    cat > "$TEMP_DIR/hpa-config.yaml" << 'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-test-cpu-memory
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-test-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
EOF
    kubectl apply -f "$TEMP_DIR/hpa-config.yaml"
    log_success "HPA configured successfully"
}

# Verify installation
verify_installation() {
    echo -e "${BLUE}[8/8] Verifying installation...${NC}"
    # Check metrics server
    log_info "Checking metrics server status..."
    kubectl get pods -n kube-system -l k8s-app=metrics-server
    # Check test deployment
    log_info "Checking test deployment..."
    kubectl get deployment hpa-test-app
    kubectl get service hpa-test-service
    # Check HPA
    log_info "Checking HPA status..."
    kubectl get hpa
    kubectl describe hpa hpa-test-cpu-memory
    # Wait a moment and check if metrics are available
    log_info "Waiting for metrics to be available..."
    sleep 30
    if kubectl top nodes &> /dev/null; then
        log_success "Node metrics are available"
        kubectl top nodes
    else
        log_warning "Node metrics not yet available, may take a few more minutes"
    fi
    if kubectl top pods &> /dev/null; then
        log_success "Pod metrics are available"
        kubectl top pods
    else
        log_warning "Pod metrics not yet available, may take a few more minutes"
    fi
    log_success "Installation completed successfully!"
    echo -e "${GREEN}════════════════════════════════════════════════════════════════════════════════${NC}"
    echo -e "${GREEN}HPA Setup Complete!${NC}"
    echo -e "${GREEN}════════════════════════════════════════════════════════════════════════════════${NC}"
    echo
    echo "Next steps:"
    echo "1. Monitor HPA: kubectl get hpa -w"
    echo "2. Generate load to test scaling: kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh"
    echo "   Then run: while true; do wget -q -O- http://hpa-test-service; done"
    echo "3. Watch pod scaling: kubectl get pods -w"
    echo
}

# Main execution
main() {
    # Parse arguments
    if [ $# -gt 1 ]; then
        usage
    fi
    if [ $# -eq 1 ]; then
        if [ "$1" != "local" ]; then
            usage
        fi
        LOCAL_CLUSTER="local"
    fi
    echo -e "${GREEN}Kubernetes HPA Installation Script${NC}"
    echo "=================================="
    check_privileges
    detect_distro
    update_system
    install_prerequisites
    install_kubectl
    deploy_metrics_server
    configure_local_metrics_server
    create_test_deployment
    configure_hpa
    verify_installation
}

# Run main function
main "$@"
```
Review the script before running. Save it as `install.sh` and execute with `bash install.sh`, appending `local` for a minikube or k3s cluster: `bash install.sh local`.