Implement Kubernetes cluster autoscaling with Helm charts and KEDA for dynamic workload scaling

Advanced · 45 min · Apr 03, 2026
Applies to: Ubuntu 24.04, Debian 12, AlmaLinux 9, Rocky Linux 9

Configure comprehensive Kubernetes autoscaling with cluster autoscaler for node management, KEDA for event-driven pod scaling, and vertical pod autoscaler for resource optimization. This tutorial covers production-grade deployment using Helm charts with monitoring and optimization strategies.

Prerequisites

  • Existing Kubernetes cluster with kubectl access
  • Helm 3 installed
  • Cloud provider IAM permissions for cluster autoscaler
  • Prometheus monitoring stack for metrics-based scaling
  • Basic understanding of Kubernetes resource management

What this solves

Kubernetes autoscaling ensures your applications automatically scale based on demand, optimizing resource usage and costs. This tutorial implements three complementary autoscaling strategies: cluster autoscaler manages node scaling, KEDA provides event-driven horizontal pod autoscaling beyond CPU/memory metrics, and vertical pod autoscaler optimizes resource requests and limits.

Step-by-step installation

Update system packages and install prerequisites

Start by updating your system and installing the required tools for Kubernetes management.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget git

# AlmaLinux / Rocky Linux
sudo dnf update -y
sudo dnf install -y curl wget git

Install kubectl and verify Kubernetes cluster access

Install kubectl to manage your Kubernetes cluster and verify connectivity. This assumes you have an existing cluster as covered in our Kubernetes cluster installation guide.

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client

Install Helm 3 for package management

Install Helm to deploy autoscaling components using charts. If you already have Helm installed, you can reference our comprehensive Helm guide.

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version

Add required Helm repositories

Add the official repositories for cluster autoscaler, KEDA, and vertical pod autoscaler components.

helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo add kedacore https://kedacore.github.io/charts
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm repo update

Create dedicated namespaces for autoscaling components

Organize autoscaling components in separate namespaces for better management and security isolation.

kubectl create namespace kube-system-autoscaler
kubectl create namespace keda
kubectl create namespace vpa-system

Configure cluster autoscaler values

Create a cluster-autoscaler-values.yaml file with cloud provider-specific settings for the cluster autoscaler. This example uses AWS; adapt it for your cloud provider.

autoDiscovery:
  clusterName: "your-cluster-name"
  enabled: true

cloudProvider: aws

cloudConfigPath: "/etc/kubernetes/cloud-config"

image:
  tag: "v1.28.2"

rbac:
  create: true
  serviceAccount:
    create: true
    name: "cluster-autoscaler"
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT:role/cluster-autoscaler"

extraArgs:
  scale-down-enabled: true
  scale-down-delay-after-add: "10m"
  scale-down-unneeded-time: "10m"
  skip-nodes-with-local-storage: false
  skip-nodes-with-system-pods: false
  max-node-provision-time: "15m"
  node-group-auto-discovery: "asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/your-cluster-name"

resources:
  limits:
    cpu: 100m
    memory: 300Mi
  requests:
    cpu: 100m
    memory: 300Mi
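Helm expects these values in the file passed to --values. The sketch below writes a deliberately minimal version of the file and sanity-checks it; the full content is the YAML above, and "your-cluster-name" is a placeholder you must replace before deploying.

```shell
# Write a minimal cluster-autoscaler values file. This is a sketch only:
# the complete values are shown above, and "your-cluster-name" is a placeholder.
cat > cluster-autoscaler-values.yaml <<'EOF'
autoDiscovery:
  clusterName: "your-cluster-name"
  enabled: true
cloudProvider: aws
EOF

# Sanity check: warn if the placeholder cluster name was left in place
if grep -q 'your-cluster-name' cluster-autoscaler-values.yaml; then
  echo "WARNING: replace your-cluster-name with your real cluster name"
fi
```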

Deploy cluster autoscaler with Helm

Install the cluster autoscaler using Helm with your custom configuration values.

helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system-autoscaler \
  --values cluster-autoscaler-values.yaml \
  --wait

kubectl get pods -n kube-system-autoscaler

Configure KEDA for event-driven autoscaling

Create a keda-values.yaml file to enable advanced autoscaling based on metrics such as queue length, database connections, or custom metrics.

image:
  keda:
    tag: "2.12.1"
  metricsApiServer:
    tag: "2.12.1"
  webhooks:
    tag: "2.12.1"

resources:
  operator:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
  metricServer:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
  webhooks:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi

service:
  type: ClusterIP
  portHttp: 8080
  portHttpTarget: 8080
  portHttps: 6443
  portHttpsTarget: 6443

securityContext:
  operator:
    runAsNonRoot: true
    runAsUser: 1001
  metricServer:
    runAsNonRoot: true
    runAsUser: 1001
  webhooks:
    runAsNonRoot: true
    runAsUser: 1001

Deploy KEDA with Helm

Install KEDA components including the operator, metrics server, and admission webhooks for comprehensive event-driven scaling.

helm install keda kedacore/keda \
  --namespace keda \
  --values keda-values.yaml \
  --wait

kubectl get pods -n keda

Configure vertical pod autoscaler

Create a vpa-values.yaml file so the VPA can automatically adjust pod resource requests and limits based on actual usage patterns.

recommender:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 1000Mi
    requests:
      cpu: 50m
      memory: 500Mi
  extraArgs:
    storage: prometheus
    prometheus-address: http://prometheus-server.monitoring.svc.cluster.local:80
    prometheus-cadvisor-job-name: kubernetes-cadvisor

updater:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 1000Mi
    requests:
      cpu: 50m
      memory: 500Mi

admissionController:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 500Mi
    requests:
      cpu: 50m
      memory: 200Mi
  generateCertificate: true

rbac:
  create: true

Deploy vertical pod autoscaler

Install VPA components to automatically optimize resource allocation for your workloads.

helm install vpa fairwinds-stable/vpa \
  --namespace vpa-system \
  --values vpa-values.yaml \
  --wait

kubectl get pods -n vpa-system

Create example KEDA ScaledObject

Save the following as keda-scaledobject-example.yaml: a sample application scaled by KEDA on a Prometheus metric, to demonstrate event-driven autoscaling.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: sample-app
  pollingInterval: 30
  cooldownPeriod: 300
  idleReplicaCount: 0
  minReplicaCount: 1
  maxReplicaCount: 50
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.monitoring.svc.cluster.local:80
      metricName: http_requests_per_second
      threshold: '100'
      query: sum(rate(http_requests_total{job="sample-app"}[1m]))
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - name: sample-app
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi

Apply KEDA scaling configuration

Deploy the sample application with KEDA scaling configuration to test event-driven autoscaling.

kubectl apply -f keda-scaledobject-example.yaml
kubectl get scaledobject -n default
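Prometheus is only one of the many scalers KEDA ships with. As another sketch, a cron trigger can pre-scale a workload on a schedule; the names below are illustrative, and KEDA permits only one ScaledObject per workload, so this would have to replace (not coexist with) a Prometheus ScaledObject targeting the same Deployment.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject        # illustrative name
  namespace: default
spec:
  scaleTargetRef:
    name: sample-app             # illustrative target; use your own Deployment
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: cron
    metadata:
      timezone: Europe/London    # IANA timezone name
      start: 0 8 * * 1-5         # scale up at 08:00 on weekdays
      end: 0 18 * * 1-5          # scale back down at 18:00
      desiredReplicas: "10"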

Create VPA policy for sample application

Save the following as vpa-policy-example.yaml to configure a vertical pod autoscaler policy that automatically adjusts resource requests based on actual usage.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample-app-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: sample-app
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 1000m
        memory: 1Gi
      controlledResources: ["cpu", "memory"]

Apply VPA policy

Deploy the VPA configuration to enable automatic resource optimization for your sample application.

kubectl apply -f vpa-policy-example.yaml
kubectl get vpa -n default
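The "Auto" update mode above lets VPA evict and resize pods on its own. If that is too intrusive at first, a common pattern (sketched here as a variant of the same policy, with an illustrative name) is to start in recommendation-only mode and review the suggestions before switching to Auto.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample-app-vpa-preview   # illustrative name
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  updatePolicy:
    updateMode: "Off"            # compute recommendations only; never evict pods
```

Recommendations then appear under status.recommendation in kubectl describe vpa output.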

Configure monitoring and alerting

Save the following as autoscaling-monitoring.yaml to set up monitoring rules that track autoscaling events and performance. This integrates with your Prometheus monitoring setup.

apiVersion: v1
kind: ConfigMap
metadata:
  name: autoscaling-alerts
  namespace: monitoring
data:
  autoscaling-rules.yaml: |
    groups:
    - name: autoscaling
      rules:
      - alert: ClusterAutoscalerError
        expr: increase(cluster_autoscaler_errors_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Cluster autoscaler errors detected"
          description: "Cluster autoscaler has {{ $value }} errors in the last 5 minutes"
      
      - alert: KEDAScalerError
        expr: increase(keda_scaler_errors_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "KEDA scaler errors detected"
          description: "KEDA has {{ $value }} scaler errors in the last 5 minutes"
      
      - alert: VPARecommendationMissing
        expr: (time() - vpa_status_recommendation_last_updated) > 3600
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "VPA recommendations outdated"
          description: "VPA recommendations haven't been updated for {{ $value }} seconds"

Apply monitoring configuration

Deploy the monitoring rules to track autoscaling performance and identify issues early.

kubectl apply -f autoscaling-monitoring.yaml
kubectl get configmap -n monitoring
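If your monitoring stack runs the Prometheus Operator (for example kube-prometheus-stack), the same alerts can be delivered as a PrometheusRule resource instead of a ConfigMap. A sketch, assuming the operator's common release: prometheus rule-selector label; match the label to your own operator configuration.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: autoscaling-alerts
  namespace: monitoring
  labels:
    release: prometheus   # assumed rule-selector label; verify against your setup
spec:
  groups:
  - name: autoscaling
    rules:
    - alert: ClusterAutoscalerError
      expr: increase(cluster_autoscaler_errors_total[5m]) > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Cluster autoscaler errors detected"
```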

Verify your setup

Check that all autoscaling components are running correctly and verify their functionality.

# Check cluster autoscaler status
kubectl get pods -n kube-system-autoscaler
kubectl logs -n kube-system-autoscaler -l app.kubernetes.io/name=cluster-autoscaler

Verify KEDA components

kubectl get pods -n keda
kubectl get scaledobject --all-namespaces

Check VPA status

kubectl get pods -n vpa-system
kubectl get vpa --all-namespaces
kubectl describe vpa sample-app-vpa -n default

Test scaling behavior

kubectl get hpa --all-namespaces
kubectl top pods -n default
Note: It may take 5-10 minutes for VPA to generate initial recommendations and for autoscaling policies to take effect.

Configure optimization policies

Fine-tune cluster autoscaler settings

Optimize cluster autoscaler behavior for your specific workload patterns and cost requirements. Note that the cluster autoscaler reads these settings as command-line flags, not from a ConfigMap; apply them through extraArgs in your Helm values followed by helm upgrade. The ConfigMap below (save as cluster-autoscaler-optimization.yaml) keeps the chosen values documented in one place.

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-tuning  # avoid clashing with the autoscaler's own cluster-autoscaler-status ConfigMap
  namespace: kube-system-autoscaler
data:
  nodes.max: "100"
  scale-down-delay-after-add: "10m"
  scale-down-delay-after-delete: "10s"
  scale-down-delay-after-failure: "3m"
  scale-down-unneeded-time: "10m"
  scale-down-utilization-threshold: "0.5"
  skip-nodes-with-local-storage: "false"
  skip-nodes-with-system-pods: "false"
  new-pod-scale-up-delay: "10s"
  max-node-provision-time: "15m"

Configure KEDA scaling policies

Set up advanced KEDA scaling behaviors to prevent flapping and optimize scaling decisions. Save the following as keda-scaling-policy.yaml. Because KEDA allows only one ScaledObject per workload, delete the earlier prometheus-scaledobject first (or point this one at a different Deployment).

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: advanced-scaling-policy
  namespace: default
spec:
  scaleTargetRef:
    name: sample-app
  pollingInterval: 30
  cooldownPeriod: 300
  idleReplicaCount: 0
  minReplicaCount: 2
  maxReplicaCount: 100
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 50
            periodSeconds: 60
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
          - type: Pods
            value: 4
            periodSeconds: 15
          selectPolicy: Max
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.monitoring.svc.cluster.local:80
      metricName: custom_metric_rate
      threshold: '10'
      query: sum(rate(custom_metric_total{service="sample-app"}[2m]))
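KEDA can also hold a workload at a safe replica count when a scaler fails to fetch metrics. A sketch of the spec-level fallback block (failureThreshold and replicas are the actual field names; the values here are illustrative):

```yaml
# Add under spec: in the ScaledObject above. If the Prometheus scaler fails
# three consecutive polls, KEDA holds the workload at 10 replicas.
fallback:
  failureThreshold: 3
  replicas: 10
```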

Apply optimization configurations

Deploy the optimized autoscaling policies to improve performance and reduce costs.

kubectl apply -f cluster-autoscaler-optimization.yaml
kubectl apply -f keda-scaling-policy.yaml

Common issues

  • Cluster autoscaler not scaling nodes
    Cause: Missing IAM permissions or incorrect node group tags
    Fix: Verify the IAM role has autoscaling permissions and that node groups carry the correct tags

  • KEDA ScaledObject shows "Unknown" status
    Cause: Cannot connect to the metrics source
    Fix: Check Prometheus connectivity and query syntax: kubectl describe scaledobject

  • VPA not updating pod resources
    Cause: Insufficient metrics data or admission webhook issues
    Fix: Wait up to 24 hours for data collection, or check webhook certificates: kubectl get validatingwebhookconfiguration

  • Pods stuck in Pending state
    Cause: Resource limits or node selection constraints
    Fix: Check pod events and node capacity: kubectl describe pod and kubectl describe nodes

  • Excessive scaling up and down (flapping)
    Cause: Aggressive scaling policies or insufficient stabilization
    Fix: Increase cooldown periods and widen the stabilization windows in the HPA behavior

  • KEDA metrics server connection refused
    Cause: Network policies or service mesh interference
    Fix: Verify network connectivity and check the service mesh proxy configuration
Warning: Never disable resource limits to "fix" scaling issues. Instead, properly configure resource requests and limits based on actual application requirements and VPA recommendations.


Need help?

Don't want to manage this yourself?

We provide managed DevOps services for businesses that depend on uptime, from initial setup to ongoing operations.