Configure comprehensive Kubernetes autoscaling with cluster autoscaler for node management, KEDA for event-driven pod scaling, and vertical pod autoscaler for resource optimization. This tutorial covers production-grade deployment using Helm charts with monitoring and optimization strategies.
Prerequisites
- Existing Kubernetes cluster with kubectl access
- Helm 3 installed
- Cloud provider IAM permissions for cluster autoscaler
- Prometheus monitoring stack for metrics-based scaling
- Basic understanding of Kubernetes resource management
What this solves
Kubernetes autoscaling ensures your applications automatically scale based on demand, optimizing resource usage and costs. This tutorial implements three complementary autoscaling strategies: cluster autoscaler manages node scaling, KEDA provides event-driven horizontal pod autoscaling beyond CPU/memory metrics, and vertical pod autoscaler optimizes resource requests and limits.
Step-by-step installation
Update system packages and install prerequisites
Start by updating your system and installing the required tools for Kubernetes management.
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget git
Install kubectl and verify Kubernetes cluster access
Install kubectl to manage your Kubernetes cluster and verify connectivity. This assumes you have an existing cluster as covered in our Kubernetes cluster installation guide.
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client
Install Helm 3 for package management
Install Helm to deploy autoscaling components using charts. If you already have Helm installed, you can reference our comprehensive Helm guide.
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version
Add required Helm repositories
Add the official repositories for cluster autoscaler, KEDA, and vertical pod autoscaler components.
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo add kedacore https://kedacore.github.io/charts
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm repo update
Create dedicated namespaces for autoscaling components
Organize autoscaling components in separate namespaces for better management and security isolation.
kubectl create namespace kube-system-autoscaler
kubectl create namespace keda
kubectl create namespace vpa-system
Configure cluster autoscaler values
Create a cluster-autoscaler-values.yaml file with cloud provider-specific settings for the cluster autoscaler. This example uses AWS; adapt it for your cloud provider.
autoDiscovery:
  clusterName: "your-cluster-name"
  enabled: true
cloudProvider: aws
cloudConfigPath: "/etc/kubernetes/cloud-config"
image:
  tag: "v1.28.2"
rbac:
  create: true
  serviceAccount:
    create: true
    name: "cluster-autoscaler"
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT:role/cluster-autoscaler"
extraArgs:
  scale-down-enabled: true
  scale-down-delay-after-add: "10m"
  scale-down-unneeded-time: "10m"
  skip-nodes-with-local-storage: false
  skip-nodes-with-system-pods: false
  max-node-provision-time: "15m"
  node-group-auto-discovery: "asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/your-cluster-name"
resources:
  limits:
    cpu: 100m
    memory: 300Mi
  requests:
    cpu: 100m
    memory: 300Mi
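Auto-discovery only finds node groups that carry the tag keys named in node-group-auto-discovery. As a hedged sketch (the ASG name my-asg is a placeholder, and the aws CLI call is shown commented out for reference), the required keys can be built and applied like this:

```shell
# Build the two tag keys cluster autoscaler looks for during ASG auto-discovery.
# CLUSTER_NAME and ASG_NAME are placeholders -- substitute your own values.
CLUSTER_NAME="your-cluster-name"
ASG_NAME="my-asg"

ENABLED_TAG="k8s.io/cluster-autoscaler/enabled"
CLUSTER_TAG="k8s.io/cluster-autoscaler/${CLUSTER_NAME}"

echo "$ENABLED_TAG"
echo "$CLUSTER_TAG"

# Applying the tags requires AWS CLI credentials; uncomment to run for real:
# aws autoscaling create-or-update-tags --tags \
#   "ResourceId=${ASG_NAME},ResourceType=auto-scaling-group,Key=${ENABLED_TAG},Value=true,PropagateAtLaunch=true" \
#   "ResourceId=${ASG_NAME},ResourceType=auto-scaling-group,Key=${CLUSTER_TAG},Value=owned,PropagateAtLaunch=true"
```

Both tags must exist on every ASG the autoscaler is allowed to manage; node groups without them are ignored.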
Deploy cluster autoscaler with Helm
Install the cluster autoscaler using Helm with your custom configuration values.
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system-autoscaler \
  --values cluster-autoscaler-values.yaml \
  --wait
kubectl get pods -n kube-system-autoscaler
Configure KEDA for event-driven autoscaling
Create a keda-values.yaml file to enable advanced autoscaling based on metrics like queue length, database connections, or custom metrics.
image:
  keda:
    tag: "2.12.1"
  metricsApiServer:
    tag: "2.12.1"
  webhooks:
    tag: "2.12.1"
resources:
  operator:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
  metricServer:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
  webhooks:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
service:
  type: ClusterIP
  portHttp: 8080
  portHttpTarget: 8080
  portHttps: 6443
  portHttpsTarget: 6443
securityContext:
  operator:
    runAsNonRoot: true
    runAsUser: 1001
  metricServer:
    runAsNonRoot: true
    runAsUser: 1001
  webhooks:
    runAsNonRoot: true
    runAsUser: 1001
Deploy KEDA with Helm
Install KEDA components including the operator, metrics server, and admission webhooks for comprehensive event-driven scaling.
helm install keda kedacore/keda \
  --namespace keda \
  --values keda-values.yaml \
  --wait
kubectl get pods -n keda
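A healthy install runs three components: the operator, the metrics API server, and the admission webhooks. As a sketch, the pod list can be checked for all three name prefixes (the pods variable below is sample output standing in for a live cluster; on a real cluster capture it with kubectl get pods -n keda):

```shell
# Verify that all three KEDA components appear in the pod list.
# Sample output; replace with: pods=$(kubectl get pods -n keda -o name)
pods="keda-operator-7b9f
keda-operator-metrics-apiserver-5c4d
keda-admission-webhooks-6f2a"

missing=0
for component in keda-operator keda-operator-metrics-apiserver keda-admission-webhooks; do
  if ! printf '%s\n' "$pods" | grep -q "^${component}-"; then
    echo "missing: $component"
    missing=1
  fi
done
echo "missing=$missing"
```

If any component is missing, inspect the Helm release with helm status keda -n keda before proceeding.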
Configure vertical pod autoscaler
Create a vpa-values.yaml file to automatically adjust pod resource requests and limits based on actual usage patterns.
recommender:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 1000Mi
    requests:
      cpu: 50m
      memory: 500Mi
  extraArgs:
    storage: prometheus
    prometheus-address: http://prometheus-server.monitoring.svc.cluster.local:80
    prometheus-cadvisor-job-name: kubernetes-cadvisor
updater:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 1000Mi
    requests:
      cpu: 50m
      memory: 500Mi
admissionController:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 500Mi
    requests:
      cpu: 50m
      memory: 200Mi
  generateCertificate: true
rbac:
  create: true
Deploy vertical pod autoscaler
Install VPA components to automatically optimize resource allocation for your workloads.
helm install vpa fairwinds-stable/vpa \
  --namespace vpa-system \
  --values vpa-values.yaml \
  --wait
kubectl get pods -n vpa-system
Create example KEDA ScaledObject
Configure a sample application with KEDA scaling based on Prometheus metrics to demonstrate event-driven autoscaling. Save the following as keda-scaledobject-example.yaml.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: sample-app
  pollingInterval: 30
  cooldownPeriod: 300
  idleReplicaCount: 0
  minReplicaCount: 1
  maxReplicaCount: 50
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc.cluster.local:80
        threshold: '100'
        query: sum(rate(http_requests_total{job="sample-app"}[1m]))
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: sample-app
          image: nginx:1.25
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
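KEDA hands the Prometheus value to the HPA, which computes the desired replica count as ceil(currentMetric / threshold), clamped between minReplicaCount and maxReplicaCount. The arithmetic can be sketched like this (the metric value 250 is illustrative):

```shell
# HPA scaling math behind KEDA's external metric: ceil(metric / threshold),
# clamped between minReplicaCount and maxReplicaCount.
metric=250        # e.g. current http_requests_per_second from the query
threshold=100     # 'threshold' from the ScaledObject trigger
min_replicas=1
max_replicas=50

desired=$(( (metric + threshold - 1) / threshold ))   # integer ceiling
if (( desired < min_replicas )); then desired=$min_replicas; fi
if (( desired > max_replicas )); then desired=$max_replicas; fi

echo "desired replicas: $desired"
```

With 250 requests per second against a threshold of 100, the HPA asks for 3 replicas; idleReplicaCount: 0 additionally lets KEDA scale to zero when the metric is idle.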
Apply KEDA scaling configuration
Deploy the sample application with KEDA scaling configuration to test event-driven autoscaling.
kubectl apply -f keda-scaledobject-example.yaml
kubectl get scaledobject -n default
Create VPA policy for sample application
Configure a vertical pod autoscaler policy to automatically adjust resource requests based on actual usage. Save the following as vpa-policy-example.yaml.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample-app-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: sample-app
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: 1000m
          memory: 1Gi
        controlledResources: ["cpu", "memory"]
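The updater only applies recommendations inside the minAllowed/maxAllowed window. This sketch shows how a CPU quantity relates to the bounds above; parsing is deliberately simplified to the m suffix and whole cores (real Kubernetes quantities have more forms), and the 2-core recommendation is a hypothetical value:

```shell
# Convert a simplified CPU quantity to millicores: "500m" -> 500, "2" -> 2000.
to_millicores() {
  local q="$1"
  if [ "${q%m}" != "$q" ]; then
    echo "${q%m}"
  else
    echo "$(( q * 1000 ))"
  fi
}

min_cpu=$(to_millicores "50m")     # minAllowed.cpu from the VPA policy
max_cpu=$(to_millicores "1000m")   # maxAllowed.cpu
rec_cpu=$(to_millicores "2")       # hypothetical recommendation of 2 cores

# Clamp the recommendation into the allowed window, as the VPA updater does.
applied=$rec_cpu
if (( applied < min_cpu )); then applied=$min_cpu; fi
if (( applied > max_cpu )); then applied=$max_cpu; fi
echo "applied CPU: ${applied}m"
```

Here a 2-core recommendation is capped at the 1000m ceiling, which is why maxAllowed should reflect a genuine upper bound rather than a guess.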
Apply VPA policy
Deploy the VPA configuration to enable automatic resource optimization for your sample application.
kubectl apply -f vpa-policy-example.yaml
kubectl get vpa -n default
Configure monitoring and alerting
Set up monitoring rules to track autoscaling events and performance. This integrates with your Prometheus monitoring setup. Save the following as autoscaling-monitoring.yaml.
apiVersion: v1
kind: ConfigMap
metadata:
  name: autoscaling-alerts
  namespace: monitoring
data:
  autoscaling-rules.yaml: |
    groups:
      - name: autoscaling
        rules:
          - alert: ClusterAutoscalerError
            expr: increase(cluster_autoscaler_errors_total[5m]) > 0
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Cluster autoscaler errors detected"
              description: "Cluster autoscaler has {{ $value }} errors in the last 5 minutes"
          - alert: KEDAScalerError
            expr: increase(keda_scaler_errors_total[5m]) > 0
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "KEDA scaler errors detected"
              description: "KEDA has {{ $value }} scaler errors in the last 5 minutes"
          - alert: VPARecommendationMissing
            expr: (time() - vpa_status_recommendation_last_updated) > 3600
            for: 10m
            labels:
              severity: warning
            annotations:
              summary: "VPA recommendations outdated"
              description: "VPA recommendations haven't been updated for {{ $value }} seconds"
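Before loading rules into Prometheus it is worth a quick sanity pass; promtool check rules is the proper validator, but even a simple scan confirms the expected alerts are present. This sketch writes a stripped-down copy of the rule skeleton to a temp file and lists the alert names (the inline heredoc stands in for the real file):

```shell
# List the alert names defined in the rules file.
# For full validation use: promtool check rules /tmp/autoscaling-rules.yaml
cat > /tmp/autoscaling-rules.yaml <<'EOF'
groups:
  - name: autoscaling
    rules:
      - alert: ClusterAutoscalerError
      - alert: KEDAScalerError
      - alert: VPARecommendationMissing
EOF

grep -oE 'alert: [A-Za-z]+' /tmp/autoscaling-rules.yaml | sed 's/alert: //'
```

Three names should come back; a missing name usually means an indentation slip under the rules: key.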
Apply monitoring configuration
Deploy the monitoring rules to track autoscaling performance and identify issues early.
kubectl apply -f autoscaling-monitoring.yaml
kubectl get configmap -n monitoring
Verify your setup
Check that all autoscaling components are running correctly and verify their functionality.
# Check cluster autoscaler status
kubectl get pods -n kube-system-autoscaler
kubectl logs -n kube-system-autoscaler -l app.kubernetes.io/name=cluster-autoscaler
# Verify KEDA components
kubectl get pods -n keda
kubectl get scaledobject --all-namespaces
# Check VPA status
kubectl get pods -n vpa-system
kubectl get vpa --all-namespaces
kubectl describe vpa sample-app-vpa -n default
# Test scaling behavior
kubectl get hpa --all-namespaces
kubectl top pods -n default
Configure optimization policies
Fine-tune cluster autoscaler settings
Optimize cluster autoscaler behavior for your specific workload patterns and cost requirements. These settings are command-line flags on the autoscaler binary, so pass them through the Helm chart's extraArgs rather than a ConfigMap. Save the following as cluster-autoscaler-optimization.yaml.
extraArgs:
  max-nodes-total: 100
  scale-down-delay-after-add: "10m"
  scale-down-delay-after-delete: "10s"
  scale-down-delay-after-failure: "3m"
  scale-down-unneeded-time: "10m"
  scale-down-utilization-threshold: "0.5"
  skip-nodes-with-local-storage: false
  skip-nodes-with-system-pods: false
  new-pod-scale-up-delay: "10s"
  max-node-provision-time: "15m"
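The scale-down-utilization-threshold compares the sum of pod requests on a node against the node's allocatable capacity; a node that stays below the threshold (here 0.5) for scale-down-unneeded-time becomes a removal candidate. A rough sketch of that check, with illustrative numbers:

```shell
# Node scale-down candidacy: requested/allocatable below the threshold.
allocatable_cpu=4000      # millicores allocatable on the node
requested_cpu=1500        # sum of CPU requests of pods on the node
threshold_pct=50          # scale-down-utilization-threshold of 0.5, as a percent

utilization_pct=$(( requested_cpu * 100 / allocatable_cpu ))
if (( utilization_pct < threshold_pct )); then
  candidate="yes"
else
  candidate="no"
fi
echo "utilization: ${utilization_pct}%  scale-down candidate: ${candidate}"
```

Note the comparison uses requests, not actual usage, so over-requested workloads keep nodes alive even when they sit idle.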
Configure KEDA scaling policies
Set up advanced KEDA scaling behaviors to prevent flapping and optimize scaling decisions. Only one ScaledObject may target a given workload, so delete the earlier prometheus-scaledobject first or point scaleTargetRef at a different Deployment. Save the following as keda-scaling-policy.yaml.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: advanced-scaling-policy
  namespace: default
spec:
  scaleTargetRef:
    name: sample-app
  pollingInterval: 30
  cooldownPeriod: 300
  idleReplicaCount: 0
  minReplicaCount: 2
  maxReplicaCount: 100
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
            - type: Percent
              value: 50
              periodSeconds: 60
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
            - type: Percent
              value: 100
              periodSeconds: 15
            - type: Pods
              value: 4
              periodSeconds: 15
          selectPolicy: Max
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc.cluster.local:80
        threshold: '10'
        query: sum(rate(custom_metric_total{service="sample-app"}[2m]))
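The scaleDown behavior above caps each step: a Percent policy of 50 per 60 seconds means at most half of the current replicas can be removed per period, so large deployments shrink gradually instead of collapsing. One step of that calculation, with an illustrative starting count:

```shell
# One scale-down step under a Percent policy: remove at most value% per period.
current_replicas=20
percent_value=50          # 'value' from the Percent policy
min_replicas=2            # minReplicaCount from the ScaledObject

max_removable=$(( current_replicas * percent_value / 100 ))
next=$(( current_replicas - max_removable ))
if (( next < min_replicas )); then next=$min_replicas; fi
echo "next replica count: $next"
```

From 20 replicas the first period can drop to 10, the next to 5, and so on, while the 300-second stabilization window holds scale-down until the metric has been low for the full window.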
Apply optimization configurations
Roll out the updated cluster autoscaler flags with Helm and apply the KEDA scaling policy to improve performance and reduce costs.
helm upgrade cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system-autoscaler \
  --reuse-values \
  --values cluster-autoscaler-optimization.yaml
kubectl apply -f keda-scaling-policy.yaml
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Cluster autoscaler not scaling nodes | Missing IAM permissions or incorrect node group tags | Verify IAM role has autoscaling permissions and node groups have correct tags |
| KEDA ScaledObject shows "Unknown" status | Cannot connect to metrics source | Check Prometheus connectivity and query syntax: kubectl describe scaledobject |
| VPA not updating pod resources | Insufficient metrics data or admission webhook issues | Wait 24 hours for data collection or check webhook certificates: kubectl get validatingwebhookconfiguration |
| Pods stuck in pending state | Resource limits or node selection constraints | Check pod events and node capacity: kubectl describe pod and kubectl describe nodes |
| Excessive scaling up and down | Aggressive scaling policies or insufficient stabilization | Increase cooldown periods and adjust stabilization windows in HPA behavior |
| KEDA metrics server connection refused | Network policies or service mesh interference | Verify network connectivity and check service mesh proxy configuration |
Next steps
- Configure Kubernetes resource quotas and limit ranges for namespace-level resource management
- Implement Kubernetes pod disruption budgets for high availability during scaling events
- Set up custom metrics autoscaling with Prometheus adapter for application-specific scaling
- Configure cluster autoscaler with mixed instance types for cost optimization
- Implement Kubernetes workload rightsizing with VPA recommendations and cost analysis
Automated install script
Run this script to automate the entire setup.
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
readonly RED='\033[0;31m'
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly NC='\033[0m' # No Color
# Configuration
readonly SCRIPT_NAME="$(basename "$0")"
readonly CLUSTER_NAME="${1:-my-k8s-cluster}"
readonly AWS_ACCOUNT_ID="${2:-}"
# Usage message
usage() {
  echo "Usage: $SCRIPT_NAME <cluster-name> [aws-account-id]"
  echo "  cluster-name: Name of your Kubernetes cluster"
  echo "  aws-account-id: AWS account ID for EKS role ARN (optional)"
  exit 1
}
# Error handling
cleanup() {
  echo -e "${RED}[ERROR]${NC} Installation failed. Cleaning up..."
  helm uninstall cluster-autoscaler -n kube-system-autoscaler 2>/dev/null || true
  helm uninstall keda -n keda 2>/dev/null || true
  helm uninstall vpa -n vpa-system 2>/dev/null || true
  kubectl delete namespace kube-system-autoscaler keda vpa-system 2>/dev/null || true
  rm -f /tmp/cluster-autoscaler-values.yaml /tmp/keda-values.yaml /tmp/vpa-values.yaml
}
trap cleanup ERR
# Logging functions
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }
# Validate arguments
if [[ -z "$CLUSTER_NAME" ]]; then
  usage
fi
# Check prerequisites
if [[ $EUID -eq 0 ]]; then
  log_error "This script should not be run as root"
  exit 1
fi
if ! command -v sudo >/dev/null 2>&1; then
  log_error "sudo is required but not installed"
  exit 1
fi
# Auto-detect distribution
if [[ -f /etc/os-release ]]; then
  . /etc/os-release
  case "$ID" in
    ubuntu|debian)
      PKG_MGR="apt"
      PKG_UPDATE="apt update && apt upgrade -y"
      PKG_INSTALL="apt install -y"
      ;;
    almalinux|rocky|centos|rhel|ol|fedora)
      PKG_MGR="dnf"
      PKG_UPDATE="dnf update -y"
      PKG_INSTALL="dnf install -y"
      ;;
    amzn)
      PKG_MGR="yum"
      PKG_UPDATE="yum update -y"
      PKG_INSTALL="yum install -y"
      ;;
    *)
      log_error "Unsupported distribution: $ID"
      exit 1
      ;;
  esac
else
  log_error "Cannot detect distribution - /etc/os-release not found"
  exit 1
fi
log_info "Detected distribution: $ID using $PKG_MGR"
# Step 1: Update system and install prerequisites
echo "[1/8] Updating system packages and installing prerequisites..."
# Run through a shell so compound commands like "apt update && apt upgrade -y" work
sudo sh -c "$PKG_UPDATE"
sudo $PKG_INSTALL curl wget git
# Step 2: Install kubectl
echo "[2/8] Installing kubectl..."
if ! command -v kubectl >/dev/null 2>&1; then
  curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
  sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
  rm -f kubectl
fi
# Verify kubectl access
if ! kubectl version --client >/dev/null 2>&1; then
  log_error "kubectl installation failed"
  exit 1
fi
if ! kubectl cluster-info >/dev/null 2>&1; then
  log_warn "Cannot connect to Kubernetes cluster. Ensure kubeconfig is properly configured."
fi
# Step 3: Install Helm
echo "[3/8] Installing Helm 3..."
if ! command -v helm >/dev/null 2>&1; then
  curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
fi
# Verify Helm installation
if ! helm version >/dev/null 2>&1; then
  log_error "Helm installation failed"
  exit 1
fi
# Step 4: Add Helm repositories
echo "[4/8] Adding required Helm repositories..."
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo add kedacore https://kedacore.github.io/charts
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm repo update
# Step 5: Create namespaces
echo "[5/8] Creating dedicated namespaces..."
kubectl create namespace kube-system-autoscaler --dry-run=client -o yaml | kubectl apply -f -
kubectl create namespace keda --dry-run=client -o yaml | kubectl apply -f -
kubectl create namespace vpa-system --dry-run=client -o yaml | kubectl apply -f -
# Step 6: Configure cluster autoscaler
echo "[6/8] Configuring cluster autoscaler..."
cat > /tmp/cluster-autoscaler-values.yaml << EOF
autoDiscovery:
  clusterName: "$CLUSTER_NAME"
  enabled: true
cloudProvider: aws
image:
  tag: "v1.28.2"
rbac:
  create: true
  serviceAccount:
    create: true
    name: "cluster-autoscaler"
    annotations:
      ${AWS_ACCOUNT_ID:+eks.amazonaws.com/role-arn: "arn:aws:iam::${AWS_ACCOUNT_ID}:role/cluster-autoscaler"}
extraArgs:
  scale-down-enabled: true
  scale-down-delay-after-add: "10m"
  scale-down-unneeded-time: "10m"
  skip-nodes-with-local-storage: false
  skip-nodes-with-system-pods: false
  max-node-provision-time: "15m"
  node-group-auto-discovery: "asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/$CLUSTER_NAME"
resources:
  limits:
    cpu: 100m
    memory: 300Mi
  requests:
    cpu: 100m
    memory: 300Mi
EOF
# Deploy cluster autoscaler
helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system-autoscaler \
  --values /tmp/cluster-autoscaler-values.yaml \
  --wait --timeout=300s
# Step 7: Configure and deploy KEDA
echo "[7/8] Configuring and deploying KEDA..."
cat > /tmp/keda-values.yaml << EOF
image:
  keda:
    tag: "2.12.1"
  metricsApiServer:
    tag: "2.12.1"
  webhooks:
    tag: "2.12.1"
resources:
  operator:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
  metricServer:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
  webhooks:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
prometheus:
  metricServer:
    enabled: true
  operator:
    enabled: true
  webhooks:
    enabled: true
EOF
helm upgrade --install keda kedacore/keda \
  --namespace keda \
  --values /tmp/keda-values.yaml \
  --wait --timeout=300s
# Step 8: Configure and deploy VPA
echo "[8/8] Configuring and deploying Vertical Pod Autoscaler..."
cat > /tmp/vpa-values.yaml << EOF
recommender:
  enabled: true
  resources:
    limits:
      cpu: 200m
      memory: 1000Mi
    requests:
      cpu: 50m
      memory: 500Mi
updater:
  enabled: true
  resources:
    limits:
      cpu: 200m
      memory: 1000Mi
    requests:
      cpu: 50m
      memory: 500Mi
admissionController:
  enabled: true
  resources:
    limits:
      cpu: 200m
      memory: 500Mi
    requests:
      cpu: 50m
      memory: 200Mi
EOF
helm upgrade --install vpa fairwinds-stable/vpa \
  --namespace vpa-system \
  --values /tmp/vpa-values.yaml \
  --wait --timeout=300s
# Verification
echo "Verifying installation..."
sleep 10
# Check cluster autoscaler
if kubectl get pods -n kube-system-autoscaler | grep -q "Running"; then
  log_info "Cluster Autoscaler: Running"
else
  log_warn "Cluster Autoscaler: Not running properly"
fi
# Check KEDA
if kubectl get pods -n keda | grep -q "Running"; then
  log_info "KEDA: Running"
else
  log_warn "KEDA: Not running properly"
fi
# Check VPA
if kubectl get pods -n vpa-system | grep -q "Running"; then
  log_info "VPA: Running"
else
  log_warn "VPA: Not running properly"
fi
# Cleanup temporary files
rm -f /tmp/cluster-autoscaler-values.yaml /tmp/keda-values.yaml /tmp/vpa-values.yaml
log_info "Kubernetes cluster autoscaling installation completed successfully!"
log_info "Components installed:"
log_info "- Cluster Autoscaler in namespace: kube-system-autoscaler"
log_info "- KEDA in namespace: keda"
log_info "- Vertical Pod Autoscaler in namespace: vpa-system"
Review the script before running. Execute with: bash install.sh <cluster-name> [aws-account-id]