Implement Kubernetes cluster autoscaling with Helm charts and KEDA for dynamic workload scaling

Advanced · 45 min · Apr 03, 2026
Applies to: Ubuntu 24.04, Debian 12, AlmaLinux 9, Rocky Linux 9

Configure comprehensive Kubernetes autoscaling with cluster autoscaler for node management, KEDA for event-driven pod scaling, and vertical pod autoscaler for resource optimization. This tutorial covers production-grade deployment using Helm charts with monitoring and optimization strategies.

Prerequisites

  • Existing Kubernetes cluster with kubectl access
  • Helm 3 installed
  • Cloud provider IAM permissions for cluster autoscaler
  • Prometheus monitoring stack for metrics-based scaling
  • Basic understanding of Kubernetes resource management

What this solves

Kubernetes autoscaling ensures your applications automatically scale based on demand, optimizing resource usage and costs. This tutorial implements three complementary autoscaling strategies: cluster autoscaler manages node scaling, KEDA provides event-driven horizontal pod autoscaling beyond CPU/memory metrics, and vertical pod autoscaler optimizes resource requests and limits.

Step-by-step installation

Update system packages and install prerequisites

Start by updating your system and installing the required tools for Kubernetes management.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget git

# AlmaLinux / Rocky Linux
sudo dnf update -y
sudo dnf install -y curl wget git

Install kubectl and verify Kubernetes cluster access

Install kubectl to manage your Kubernetes cluster and verify connectivity. This assumes you have an existing cluster as covered in our Kubernetes cluster installation guide.

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client

Install Helm 3 for package management

Install Helm to deploy autoscaling components using charts. If you already have Helm installed, you can reference our comprehensive Helm guide.

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version

Add required Helm repositories

Add the official repositories for cluster autoscaler, KEDA, and vertical pod autoscaler components.

helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo add kedacore https://kedacore.github.io/charts
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm repo update

Create dedicated namespaces for autoscaling components

Organize autoscaling components in separate namespaces for better management and security isolation.

kubectl create namespace kube-system-autoscaler
kubectl create namespace keda
kubectl create namespace vpa-system

Configure cluster autoscaler values

Create a cluster-autoscaler-values.yaml file with cloud provider-specific settings for the cluster autoscaler. This example uses AWS; adapt it for your cloud provider.

autoDiscovery:
  clusterName: "your-cluster-name"
  enabled: true

cloudProvider: aws

cloudConfigPath: "/etc/kubernetes/cloud-config"

image:
  tag: "v1.28.2"

rbac:
  create: true
  serviceAccount:
    create: true
    name: "cluster-autoscaler"
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT:role/cluster-autoscaler"

extraArgs:
  scale-down-enabled: true
  scale-down-delay-after-add: "10m"
  scale-down-unneeded-time: "10m"
  skip-nodes-with-local-storage: false
  skip-nodes-with-system-pods: false
  max-node-provision-time: "15m"
  node-group-auto-discovery: "asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/your-cluster-name"

resources:
  limits:
    cpu: 100m
    memory: 300Mi
  requests:
    cpu: 100m
    memory: 300Mi
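Helm expects these values in the file passed to --values. The sketch below writes a deliberately minimal version of the file and sanity-checks it; the full content is the YAML above, and "your-cluster-name" is a placeholder you must replace before deploying.

```shell
# Write a minimal cluster-autoscaler values file. This is a sketch only:
# the complete values are shown above, and "your-cluster-name" is a placeholder.
cat > cluster-autoscaler-values.yaml <<'EOF'
autoDiscovery:
  clusterName: "your-cluster-name"
  enabled: true
cloudProvider: aws
EOF

# Sanity check: warn if the placeholder cluster name was left in place
if grep -q 'your-cluster-name' cluster-autoscaler-values.yaml; then
  echo "WARNING: replace your-cluster-name with your real cluster name"
fi
```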

Deploy cluster autoscaler with Helm

Install the cluster autoscaler using Helm with your custom configuration values.

helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system-autoscaler \
  --values cluster-autoscaler-values.yaml \
  --wait

kubectl get pods -n kube-system-autoscaler

Configure KEDA for event-driven autoscaling

Create a keda-values.yaml file to enable advanced autoscaling based on metrics such as queue length, database connections, or custom metrics.

image:
  keda:
    tag: "2.12.1"
  metricsApiServer:
    tag: "2.12.1"
  webhooks:
    tag: "2.12.1"

resources:
  operator:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
  metricServer:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi
  webhooks:
    limits:
      cpu: 1000m
      memory: 1000Mi
    requests:
      cpu: 100m
      memory: 100Mi

service:
  type: ClusterIP
  portHttp: 8080
  portHttpTarget: 8080
  portHttps: 6443
  portHttpsTarget: 6443

securityContext:
  operator:
    runAsNonRoot: true
    runAsUser: 1001
  metricServer:
    runAsNonRoot: true
    runAsUser: 1001
  webhooks:
    runAsNonRoot: true
    runAsUser: 1001

Deploy KEDA with Helm

Install KEDA components including the operator, metrics server, and admission webhooks for comprehensive event-driven scaling.

helm install keda kedacore/keda \
  --namespace keda \
  --values keda-values.yaml \
  --wait

kubectl get pods -n keda

Configure vertical pod autoscaler

Create a vpa-values.yaml file so the VPA can automatically adjust pod resource requests and limits based on actual usage patterns.

recommender:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 1000Mi
    requests:
      cpu: 50m
      memory: 500Mi
  extraArgs:
    storage: prometheus
    prometheus-address: http://prometheus-server.monitoring.svc.cluster.local:80
    prometheus-cadvisor-job-name: kubernetes-cadvisor

updater:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 1000Mi
    requests:
      cpu: 50m
      memory: 500Mi

admissionController:
  enabled: true
  image:
    tag: "1.0.0"
  resources:
    limits:
      cpu: 200m
      memory: 500Mi
    requests:
      cpu: 50m
      memory: 200Mi
  generateCertificate: true

rbac:
  create: true

Deploy vertical pod autoscaler

Install VPA components to automatically optimize resource allocation for your workloads.

helm install vpa fairwinds-stable/vpa \
  --namespace vpa-system \
  --values vpa-values.yaml \
  --wait

kubectl get pods -n vpa-system

Create example KEDA ScaledObject

Save the following as keda-scaledobject-example.yaml: a sample application scaled by KEDA on a Prometheus metric, to demonstrate event-driven autoscaling.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: sample-app
  pollingInterval: 30
  cooldownPeriod: 300
  idleReplicaCount: 0
  minReplicaCount: 1
  maxReplicaCount: 50
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.monitoring.svc.cluster.local:80
      metricName: http_requests_per_second
      threshold: '100'
      query: sum(rate(http_requests_total{job="sample-app"}[1m]))
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - name: sample-app
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi

Apply KEDA scaling configuration

Deploy the sample application with KEDA scaling configuration to test event-driven autoscaling.

kubectl apply -f keda-scaledobject-example.yaml
kubectl get scaledobject -n default
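Prometheus is only one of the many scalers KEDA ships with. As another sketch, a cron trigger can pre-scale a workload on a schedule; the names below are illustrative, and KEDA permits only one ScaledObject per workload, so this would have to replace (not coexist with) a Prometheus ScaledObject targeting the same Deployment.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject        # illustrative name
  namespace: default
spec:
  scaleTargetRef:
    name: sample-app             # illustrative target; use your own Deployment
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: cron
    metadata:
      timezone: Europe/London    # IANA timezone name
      start: 0 8 * * 1-5         # scale up at 08:00 on weekdays
      end: 0 18 * * 1-5          # scale back down at 18:00
      desiredReplicas: "10"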

Create VPA policy for sample application

Save the following as vpa-policy-example.yaml to configure a vertical pod autoscaler policy that automatically adjusts resource requests based on actual usage.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample-app-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: sample-app
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 1000m
        memory: 1Gi
      controlledResources: ["cpu", "memory"]

Apply VPA policy

Deploy the VPA configuration to enable automatic resource optimization for your sample application.

kubectl apply -f vpa-policy-example.yaml
kubectl get vpa -n default
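The "Auto" update mode above lets VPA evict and resize pods on its own. If that is too intrusive at first, a common pattern (sketched here as a variant of the same policy, with an illustrative name) is to start in recommendation-only mode and review the suggestions before switching to Auto.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample-app-vpa-preview   # illustrative name
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  updatePolicy:
    updateMode: "Off"            # compute recommendations only; never evict pods
```

Recommendations then appear under status.recommendation in kubectl describe vpa output.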

Configure monitoring and alerting

Save the following as autoscaling-monitoring.yaml to set up monitoring rules that track autoscaling events and performance. This integrates with your Prometheus monitoring setup.

apiVersion: v1
kind: ConfigMap
metadata:
  name: autoscaling-alerts
  namespace: monitoring
data:
  autoscaling-rules.yaml: |
    groups:
    - name: autoscaling
      rules:
      - alert: ClusterAutoscalerError
        expr: increase(cluster_autoscaler_errors_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Cluster autoscaler errors detected"
          description: "Cluster autoscaler has {{ $value }} errors in the last 5 minutes"
      
      - alert: KEDAScalerError
        expr: increase(keda_scaler_errors_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "KEDA scaler errors detected"
          description: "KEDA has {{ $value }} scaler errors in the last 5 minutes"
      
      - alert: VPARecommendationMissing
        expr: (time() - vpa_status_recommendation_last_updated) > 3600
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "VPA recommendations outdated"
          description: "VPA recommendations haven't been updated for {{ $value }} seconds"

Apply monitoring configuration

Deploy the monitoring rules to track autoscaling performance and identify issues early.

kubectl apply -f autoscaling-monitoring.yaml
kubectl get configmap -n monitoring
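If your monitoring stack runs the Prometheus Operator (for example kube-prometheus-stack), the same alerts can be delivered as a PrometheusRule resource instead of a ConfigMap. A sketch, assuming the operator's common release: prometheus rule-selector label; match the label to your own operator configuration.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: autoscaling-alerts
  namespace: monitoring
  labels:
    release: prometheus   # assumed rule-selector label; verify against your setup
spec:
  groups:
  - name: autoscaling
    rules:
    - alert: ClusterAutoscalerError
      expr: increase(cluster_autoscaler_errors_total[5m]) > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Cluster autoscaler errors detected"
```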

Verify your setup

Check that all autoscaling components are running correctly and verify their functionality.

# Check cluster autoscaler status
kubectl get pods -n kube-system-autoscaler
kubectl logs -n kube-system-autoscaler -l app.kubernetes.io/name=cluster-autoscaler

Verify KEDA components

kubectl get pods -n keda
kubectl get scaledobject --all-namespaces

Check VPA status

kubectl get pods -n vpa-system
kubectl get vpa --all-namespaces
kubectl describe vpa sample-app-vpa -n default

Test scaling behavior

kubectl get hpa --all-namespaces
kubectl top pods -n default
Note: It may take 5-10 minutes for VPA to generate initial recommendations and for autoscaling policies to take effect.

Configure optimization policies

Fine-tune cluster autoscaler settings

Optimize cluster autoscaler behavior for your specific workload patterns and cost requirements. Note that the cluster autoscaler reads these settings as command-line flags, not from a ConfigMap; apply them through extraArgs in your Helm values followed by helm upgrade. The ConfigMap below (save as cluster-autoscaler-optimization.yaml) keeps the chosen values documented in one place.

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-tuning  # avoid clashing with the autoscaler's own cluster-autoscaler-status ConfigMap
  namespace: kube-system-autoscaler
data:
  nodes.max: "100"
  scale-down-delay-after-add: "10m"
  scale-down-delay-after-delete: "10s"
  scale-down-delay-after-failure: "3m"
  scale-down-unneeded-time: "10m"
  scale-down-utilization-threshold: "0.5"
  skip-nodes-with-local-storage: "false"
  skip-nodes-with-system-pods: "false"
  new-pod-scale-up-delay: "10s"
  max-node-provision-time: "15m"

Configure KEDA scaling policies

Set up advanced KEDA scaling behaviors to prevent flapping and optimize scaling decisions. Save the following as keda-scaling-policy.yaml. Because KEDA allows only one ScaledObject per workload, delete the earlier prometheus-scaledobject first (or point this one at a different Deployment).

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: advanced-scaling-policy
  namespace: default
spec:
  scaleTargetRef:
    name: sample-app
  pollingInterval: 30
  cooldownPeriod: 300
  idleReplicaCount: 0
  minReplicaCount: 2
  maxReplicaCount: 100
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 50
            periodSeconds: 60
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
          - type: Pods
            value: 4
            periodSeconds: 15
          selectPolicy: Max
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.monitoring.svc.cluster.local:80
      metricName: custom_metric_rate
      threshold: '10'
      query: sum(rate(custom_metric_total{service="sample-app"}[2m]))
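KEDA can also hold a workload at a safe replica count when a scaler fails to fetch metrics. A sketch of the spec-level fallback block (failureThreshold and replicas are the actual field names; the values here are illustrative):

```yaml
# Add under spec: in the ScaledObject above. If the Prometheus scaler fails
# three consecutive polls, KEDA holds the workload at 10 replicas.
fallback:
  failureThreshold: 3
  replicas: 10
```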

Apply optimization configurations

Deploy the optimized autoscaling policies to improve performance and reduce costs.

kubectl apply -f cluster-autoscaler-optimization.yaml
kubectl apply -f keda-scaling-policy.yaml

Common issues

  • Cluster autoscaler not scaling nodes
    Cause: Missing IAM permissions or incorrect node group tags
    Fix: Verify the IAM role has autoscaling permissions and that node groups carry the correct tags

  • KEDA ScaledObject shows "Unknown" status
    Cause: Cannot connect to the metrics source
    Fix: Check Prometheus connectivity and query syntax: kubectl describe scaledobject

  • VPA not updating pod resources
    Cause: Insufficient metrics data or admission webhook issues
    Fix: Wait up to 24 hours for data collection, or check webhook certificates: kubectl get validatingwebhookconfiguration

  • Pods stuck in Pending state
    Cause: Resource limits or node selection constraints
    Fix: Check pod events and node capacity: kubectl describe pod and kubectl describe nodes

  • Excessive scaling up and down (flapping)
    Cause: Aggressive scaling policies or insufficient stabilization
    Fix: Increase cooldown periods and widen the stabilization windows in the HPA behavior

  • KEDA metrics server connection refused
    Cause: Network policies or service mesh interference
    Fix: Verify network connectivity and check the service mesh proxy configuration
Warning: Never disable resource limits to "fix" scaling issues. Instead, properly configure resource requests and limits based on actual application requirements and VPA recommendations.


Need help?

Don't want to manage this yourself?

We provide managed DevOps services for businesses that depend on uptime, from initial setup to ongoing operations.