Set up Kubernetes cluster autoscaler 1.30 with mixed instance types and spot instances so the cluster adds and removes nodes automatically with demand, while keeping infrastructure costs down through cost-aware instance selection.
Prerequisites
- Kubernetes cluster 1.28+
- kubectl with admin access
- Cloud provider with autoscaling groups
- Prometheus for monitoring (optional)
What this solves
The Kubernetes cluster autoscaler automatically adjusts your cluster size by adding or removing nodes based on pod scheduling demands. By mixing instance types and spot instances you can cut node costs substantially (spot capacity is commonly discounted on the order of 70% versus on-demand) while maintaining application availability. This tutorial shows you how to deploy cluster autoscaler 1.30 with cost optimization policies for production workloads.
Prerequisites and preparation
Verify cluster requirements
Ensure your Kubernetes cluster is running version 1.28 or later and you have admin access.
kubectl version
kubectl auth can-i create nodes
Install required tools
Install Helm 3 for managing the cluster autoscaler deployment.
curl -fsSL https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt update
sudo apt install -y helm
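Confirm the client installed correctly before continuing:

helm version --short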
Step-by-step configuration
Create autoscaler namespace
Create a dedicated namespace for the cluster autoscaler components.
kubectl create namespace cluster-autoscaler
Configure RBAC permissions
Create the service account and RBAC rules required for the cluster autoscaler to manage nodes. Save the following as cluster-autoscaler-rbac.yaml.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources: ["namespaces", "pods", "services", "replicationcontrollers", "persistentvolumeclaims", "persistentvolumes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: cluster-autoscaler
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: cluster-autoscaler
kubectl apply -f cluster-autoscaler-rbac.yaml
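A quick sanity check that the objects landed; the label selector matches the k8s-app label set in the manifest above:

kubectl get serviceaccount,role,rolebinding -n cluster-autoscaler
kubectl get clusterrole,clusterrolebinding -l k8s-app=cluster-autoscaler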
Configure mixed instance node groups
Define node groups with different instance types and spot capacity for cost optimization. The cluster autoscaler does not create node groups itself; on AWS they are Auto Scaling groups that it discovers via tags, so treat this ConfigMap as a reference description of the groups your infrastructure tooling (eksctl, Terraform, or the AWS CLI) should create. The fields mirror the ASG MixedInstancesPolicy; note that spotInstancePools is honored only with the lowest-price allocation strategy. This example describes three groups: general purpose, compute optimized, and memory optimized. Save it as mixed-nodegroups.yaml.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-nodegroups
  namespace: cluster-autoscaler
data:
  nodegroups.yaml: |
    nodeGroups:
      - name: general-purpose
        minSize: 1
        maxSize: 10
        desiredCapacity: 2
        instanceTypes:
          - t3.medium
          - t3.large
          - t3.xlarge
        spotAllocationStrategy: lowest-price
        spotInstancePools: 3
        onDemandPercentageAboveBaseCapacity: 20
        labels:
          workload-type: general
          node.kubernetes.io/capacity-type: mixed
        taints: []
        tags:
          k8s.io/cluster-autoscaler/enabled: "true"
          k8s.io/cluster-autoscaler/node-template/label/workload-type: general
      - name: compute-optimized
        minSize: 0
        maxSize: 5
        desiredCapacity: 0
        instanceTypes:
          - c5.large
          - c5.xlarge
          - c5.2xlarge
        spotAllocationStrategy: capacity-optimized
        onDemandPercentageAboveBaseCapacity: 0
        labels:
          workload-type: compute
          node.kubernetes.io/capacity-type: spot
        taints:
          - key: workload-type
            value: compute
            effect: NoSchedule
        tags:
          k8s.io/cluster-autoscaler/enabled: "true"
          k8s.io/cluster-autoscaler/node-template/label/workload-type: compute
      - name: memory-optimized
        minSize: 0
        maxSize: 3
        desiredCapacity: 0
        instanceTypes:
          - r5.large
          - r5.xlarge
          - r5.2xlarge
        spotAllocationStrategy: capacity-optimized
        onDemandPercentageAboveBaseCapacity: 30
        labels:
          workload-type: memory
          node.kubernetes.io/capacity-type: mixed
        taints:
          - key: workload-type
            value: memory
            effect: NoSchedule
        tags:
          k8s.io/cluster-autoscaler/enabled: "true"
          k8s.io/cluster-autoscaler/node-template/label/workload-type: memory
kubectl apply -f mixed-nodegroups.yaml
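Applying the ConfigMap records the intent, but the groups themselves must exist at the cloud provider. As a rough sketch, this is what the general-purpose group could look like as an AWS Auto Scaling group created with the AWS CLI; the launch template name k8s-worker, the subnet IDs, and the ASG name are placeholders to replace with your own values:

# Sketch: an ASG mirroring the "general-purpose" group above.
# Assumes an existing launch template named "k8s-worker".
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name general-purpose \
  --min-size 1 --max-size 10 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-aaaa,subnet-bbbb" \
  --mixed-instances-policy '{
    "LaunchTemplate": {
      "LaunchTemplateSpecification": {"LaunchTemplateName": "k8s-worker", "Version": "$Latest"},
      "Overrides": [{"InstanceType": "t3.medium"}, {"InstanceType": "t3.large"}, {"InstanceType": "t3.xlarge"}]
    },
    "InstancesDistribution": {
      "OnDemandPercentageAboveBaseCapacity": 20,
      "SpotAllocationStrategy": "lowest-price",
      "SpotInstancePools": 3
    }
  }' \
  --tags "Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true"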
Deploy cluster autoscaler with cost optimization
Deploy cluster autoscaler 1.30 tuned for mixed instance types and cost savings. The priority expander (backed by the ConfigMap created in the next step) is consulted first, with least-waste as the fallback. Save the manifest as cluster-autoscaler-deployment.yaml and replace YOUR_CLUSTER_NAME and the region before applying.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: cluster-autoscaler
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.30.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 600Mi
            requests:
              cpu: 100m
              memory: 600Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=priority,least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/$(CLUSTER_NAME)
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false
            - --scale-down-enabled=true
            - --scale-down-delay-after-add=10m
            - --scale-down-delay-after-delete=10s
            - --scale-down-delay-after-failure=3m
            - --scale-down-unneeded-time=10m
            - --scale-down-utilization-threshold=0.5
            - --max-node-provision-time=15m
            - --max-empty-bulk-delete=10
            - --max-nodes-total=100
            - --cores-total=0:1000
            - --memory-total=0:1000
            - --new-pod-scale-up-delay=0s
            - --max-bulk-soft-taint-count=10
            - --max-bulk-soft-taint-time=3s
          env:
            - name: CLUSTER_NAME
              value: YOUR_CLUSTER_NAME
            - name: AWS_REGION
              value: us-west-2
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: Always
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs/ca-certificates.crt
      nodeSelector:
        kubernetes.io/os: linux
kubectl apply -f cluster-autoscaler-deployment.yaml
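Once the pod is up, the log should list the auto-discovered node groups; the exact phrasing varies by version, so a loose grep works well enough:

kubectl -n cluster-autoscaler logs deployment/cluster-autoscaler --tail=200 | grep -iE "node group|asg|discover"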
Configure cost optimization policies
Create a priority-based expander policy that prefers cost-effective node groups, plus a pod disruption budget for the autoscaler itself. Higher priority numbers win, and the regular expressions match node group (ASG) names, so name your groups accordingly. Save the following as cost-optimization-config.yaml.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: cluster-autoscaler
data:
  priorities: |
    10:
      - .*spot.*
    5:
      - .*t3\..*
    1:
      - .*c5\..*
      - .*r5\..*
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cluster-autoscaler-pdb
  namespace: cluster-autoscaler
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
Note that you do not need to create the cluster-autoscaler-status ConfigMap yourself: the autoscaler creates and updates it at runtime to publish its health and scaling state, which is why the Role above grants update and delete on that name. The remaining tunables, such as scan interval, utilization thresholds, and provisioning timeouts, are controlled by the command-line flags already set in the Deployment.
kubectl apply -f cost-optimization-config.yaml
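The autoscaler re-reads the expander ConfigMap at runtime, so no restart should be needed; verify it is in place with:

kubectl -n cluster-autoscaler get configmap cluster-autoscaler-priority-expander -o yaml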
Set up monitoring and alerting
Expose the cluster autoscaler's Prometheus metrics (served on port 8085) through a Service and, if you run the Prometheus Operator, a ServiceMonitor. Drop the ServiceMonitor if the Operator CRDs are not installed and rely on the scrape annotations instead. Save as autoscaler-monitoring.yaml.
apiVersion: v1
kind: Service
metadata:
  name: cluster-autoscaler-metrics
  namespace: cluster-autoscaler
  labels:
    app: cluster-autoscaler
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8085'
    prometheus.io/path: '/metrics'
spec:
  ports:
    - port: 8085
      targetPort: 8085
      protocol: TCP
      name: http-metrics
  selector:
    app: cluster-autoscaler
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cluster-autoscaler
  namespace: cluster-autoscaler
  labels:
    app: cluster-autoscaler
spec:
  selector:
    matchLabels:
      app: cluster-autoscaler
  endpoints:
    - port: http-metrics
      interval: 30s
      path: /metrics
kubectl apply -f autoscaler-monitoring.yaml
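If you don't run the Prometheus Operator, you can still spot-check the endpoint through a short-lived port-forward:

kubectl -n cluster-autoscaler port-forward svc/cluster-autoscaler-metrics 8085:8085 &
PF_PID=$!
sleep 2
curl -s http://localhost:8085/metrics | grep -m 5 '^cluster_autoscaler'
kill "$PF_PID"  # stop the port-forward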
Configure workload-specific scheduling
Create node selectors and tolerations for the different workload types to steer pods onto the right node groups. Here the Deployment manifests are stored as templates inside a ConfigMap; applying the file only stores them (a command to render and apply one follows the apply step). Save as workload-scheduling.yaml.
apiVersion: v1
kind: ConfigMap
metadata:
  name: workload-scheduling-config
  namespace: default
data:
  general-workload.yaml: |
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: general-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: general-app
      template:
        metadata:
          labels:
            app: general-app
        spec:
          nodeSelector:
            workload-type: general
            node.kubernetes.io/capacity-type: mixed
          tolerations:
            - key: "node.kubernetes.io/capacity-type"
              operator: "Equal"
              value: "spot"
              effect: "NoSchedule"
          containers:
            - name: app
              image: nginx:latest
              resources:
                requests:
                  cpu: 100m
                  memory: 128Mi
                limits:
                  cpu: 200m
                  memory: 256Mi
  compute-workload.yaml: |
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: compute-intensive-app
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: compute-app
      template:
        metadata:
          labels:
            app: compute-app
        spec:
          nodeSelector:
            workload-type: compute
          tolerations:
            - key: "workload-type"
              operator: "Equal"
              value: "compute"
              effect: "NoSchedule"
            - key: "node.kubernetes.io/capacity-type"
              operator: "Equal"
              value: "spot"
              effect: "NoSchedule"
          containers:
            - name: app
              image: nginx:latest
              resources:
                requests:
                  cpu: 500m
                  memory: 512Mi
                limits:
                  cpu: 1000m
                  memory: 1Gi
kubectl apply -f workload-scheduling.yaml
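Applying that file only stores the manifests. To actually deploy one of the embedded workloads, render it back out of the ConfigMap; note the escaped dot in the jsonpath key:

kubectl get configmap workload-scheduling-config -n default \
  -o jsonpath='{.data.general-workload\.yaml}' | kubectl apply -f -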
Verify your setup
Check that the cluster autoscaler is running and monitoring your node groups correctly.
# Check autoscaler pod status
kubectl get pods -n cluster-autoscaler
kubectl logs -n cluster-autoscaler deployment/cluster-autoscaler

# Verify node groups are discovered
kubectl get nodes --show-labels | grep capacity-type

# Check autoscaler events
kubectl get events -n cluster-autoscaler --sort-by=.metadata.creationTimestamp

# Monitor scaling metrics (kubectl top requires metrics-server)
kubectl top nodes
kubectl describe configmap cluster-autoscaler-status -n cluster-autoscaler
Configure advanced cost optimization
Implement predictive scaling
The cluster autoscaler has no native predictive or calendar-based scaling, so treat the following as a reference schema for external tooling or a custom controller that adjusts node group bounds; fields such as predictiveScaling and scheduleBasedScaling are illustrative and are not consumed by the autoscaler itself. The schedules use standard five-field cron syntax.
apiVersion: v1
kind: ConfigMap
metadata:
  name: predictive-scaling-config
  namespace: cluster-autoscaler
data:
  config.yaml: |
    predictiveScaling:
      enabled: true
      lookbackWindow: 7d
      predictionWindow: 1h
      scalingBuffer: 0.1
    costOptimization:
      spotInstancePreference: 0.7
      instanceTypeMixing: true
      preemptionTolerance: 0.3
    scheduleBasedScaling:
      - name: business-hours
        schedule: "0 8 * * 1-5"
        minNodes: 5
        maxNodes: 20
      - name: off-hours
        schedule: "0 18 * * 1-5"
        minNodes: 2
        maxNodes: 10
      - name: weekend
        schedule: "0 0 * * 0,6"
        minNodes: 1
        maxNodes: 5
kubectl apply -f predictive-scaling.yaml
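Since nothing in the stock autoscaler consumes scheduleBasedScaling, one minimal substitute on AWS is a cron-driven script that moves the ASG bounds directly and lets the autoscaler operate within them; the group name here is a placeholder:

#!/usr/bin/env bash
# Sketch: shift a node group to business-hours capacity.
# Run from cron at 08:00 on weekdays (0 8 * * 1-5).
set -euo pipefail
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name general-purpose \
  --min-size 5 --max-size 20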
Set up cost alerts and budgets
Configure alerts for scaling and cost anomalies to prevent budget overruns. Check the metric names against what your autoscaler version actually exports on its /metrics endpoint: cluster_autoscaler_nodes_count and cluster_autoscaler_failed_scale_ups_total are standard, while a node-utilization metric may need to be derived from node exporter data. Save as cost-alerts.yaml.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cost-monitoring-alerts
  namespace: cluster-autoscaler
data:
  alerts.yaml: |
    groups:
      - name: cluster-autoscaler-cost
        rules:
          - alert: HighClusterCost
            expr: delta(cluster_autoscaler_nodes_count[1h]) > 10
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Cluster scaling rapidly - check for cost impact"
              description: "Node count increased by {{ $value }} in the last hour"
          - alert: SpotInstanceFailureRate
            expr: rate(cluster_autoscaler_failed_scale_ups_total[5m]) > 0.1
            for: 2m
            labels:
              severity: critical
            annotations:
              summary: "High scale-up failure rate"
              description: "Scale-ups are failing at {{ $value }} per second"
          - alert: NodeUtilizationLow
            expr: avg(cluster_autoscaler_node_utilization) < 0.3
            for: 15m
            labels:
              severity: info
            annotations:
              summary: "Low cluster utilization - consider scaling down"
              description: "Average node utilization is {{ $value }}"
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cost-optimization-report
  namespace: cluster-autoscaler
spec:
  schedule: "0 9 * * 1"  # weekly report, Mondays at 09:00
  jobTemplate:
    spec:
      template:
        spec:
          # Reuses the autoscaler ServiceAccount, which can list nodes;
          # kubectl top additionally requires the metrics.k8s.io API.
          serviceAccountName: cluster-autoscaler
          containers:
            - name: cost-reporter
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - |
                  echo "Weekly Cost Optimization Report"
                  kubectl top nodes
                  kubectl get nodes -l node.kubernetes.io/capacity-type=spot --no-headers | wc -l
                  kubectl get nodes -l node.kubernetes.io/capacity-type=on-demand --no-headers | wc -l
          restartPolicy: OnFailure
kubectl apply -f cost-alerts.yaml
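Before wiring the rules into Prometheus, lint them. Assuming promtool is installed locally, extract the embedded rules and check them:

kubectl -n cluster-autoscaler get configmap cost-monitoring-alerts \
  -o jsonpath='{.data.alerts\.yaml}' > /tmp/alerts.yaml
promtool check rules /tmp/alerts.yaml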
Test scaling behavior
Deploy a test workload to verify the autoscaler correctly provisions mixed instance types based on demand.
# Create a test deployment that will trigger scaling
kubectl create deployment test-scale --image=nginx:latest
kubectl scale deployment test-scale --replicas=20
# Add resource requests to trigger node scaling
kubectl patch deployment test-scale -p '{"spec":{"template":{"spec":{"containers":[{"name":"nginx","resources":{"requests":{"cpu":"500m","memory":"1Gi"}}}]}}}}'

# Watch the scaling events
watch kubectl get nodes
watch kubectl get pods -o wide

# Check autoscaler decisions
kubectl logs -n cluster-autoscaler deployment/cluster-autoscaler --tail=50

# Clean up test deployment
kubectl delete deployment test-scale
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Pods stuck in Pending state | Node group max size reached or instance type unavailable | Check kubectl describe pod and increase max nodes or add more instance types |
| Autoscaler not scaling down | Utilization threshold too low or system pods preventing scale down | Adjust --scale-down-utilization-threshold or use --skip-nodes-with-system-pods=false |
| High costs despite spot instances | Falling back to on-demand due to spot unavailability | Increase spotInstancePools and diversify instance types across AZs |
| Frequent node replacements | Spot instance interruptions with poor handling | Configure pod disruption budgets and increase onDemandPercentageAboveBaseCapacity |
| Slow scaling response | Conservative delay settings (the default scan interval is already 10s) | Lower --scale-down-delay-after-add (e.g. 5m) and --scale-down-unneeded-time |
| Mixed instance selection not working | Priority expander not configured correctly | Verify the cluster-autoscaler-priority-expander ConfigMap and that --expander includes priority |
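For most rows above, the autoscaler's own log is the fastest diagnostic; scale-up rejections and scale-down blockers are logged explicitly, though the phrasing varies across versions:

kubectl -n cluster-autoscaler logs deployment/cluster-autoscaler --tail=500 \
  | grep -iE "scale.?up|scale.?down|failed"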
Monitor and optimize costs
Set up comprehensive monitoring to track cost savings and scaling efficiency.
# Monitor cost metrics
kubectl get nodes -l node.kubernetes.io/capacity-type=spot -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.node\.kubernetes\.io/instance-type}{"\n"}{end}'
# Check node utilization (requires metrics-server)
kubectl top nodes --sort-by=cpu

# Monitor autoscaler metrics (assumes an active port-forward to the metrics Service, as shown earlier)
curl -s http://localhost:8085/metrics | grep cluster_autoscaler

# Generate cost report
kubectl get nodes -o custom-columns=NAME:.metadata.name,INSTANCE-TYPE:.metadata.labels.node\.kubernetes\.io/instance-type,CAPACITY-TYPE:.metadata.labels.node\.kubernetes\.io/capacity-type,ZONE:.metadata.labels.topology\.kubernetes\.io/zone
This setup pairs well with custom-metrics-driven pod autoscaling for application-specific scaling and with resource quotas for tighter resource governance.
Next steps
- Set up horizontal pod autoscaler with custom metrics for application-level scaling
- Configure vertical pod autoscaler to optimize resource requests automatically
- Deploy Kubecost for detailed cost analysis and optimization recommendations
- Extend autoscaling across multiple cloud providers for better cost optimization
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Global variables
CLUSTER_NAME=""
AWS_REGION=""
NODE_GROUP_NAME=""
# Usage message
usage() {
  echo "Usage: $0 --cluster-name CLUSTER_NAME --region AWS_REGION --node-group NODE_GROUP_NAME"
  echo "Example: $0 --cluster-name my-cluster --region us-west-2 --node-group my-nodegroup"
  exit 1
}
# Error handling and cleanup
cleanup() {
  echo -e "${RED}[ERROR] Installation failed. Cleaning up...${NC}"
  kubectl delete namespace cluster-autoscaler --ignore-not-found=true 2>/dev/null || true
  rm -f /tmp/cluster-autoscaler-rbac.yaml /tmp/cluster-autoscaler-values.yaml
}
trap cleanup ERR
# Logging functions
log_info() { echo -e "${GREEN}$1${NC}"; }
log_warn() { echo -e "${YELLOW}$1${NC}"; }
log_error() { echo -e "${RED}$1${NC}"; }
# Parse command line arguments
while [[ $# -gt 0 ]]; do
  case $1 in
    --cluster-name)
      CLUSTER_NAME="$2"
      shift 2
      ;;
    --region)
      AWS_REGION="$2"
      shift 2
      ;;
    --node-group)
      NODE_GROUP_NAME="$2"
      shift 2
      ;;
    -h|--help)
      usage
      ;;
    *)
      log_error "Unknown option: $1"
      usage
      ;;
  esac
done

# Validate required arguments
if [[ -z "$CLUSTER_NAME" || -z "$AWS_REGION" || -z "$NODE_GROUP_NAME" ]]; then
  log_error "Missing required arguments"
  usage
fi
# Auto-detect distribution
if [ -f /etc/os-release ]; then
  . /etc/os-release
  case "$ID" in
    ubuntu|debian)
      PKG_MGR="apt"
      PKG_UPDATE="apt update"
      PKG_INSTALL="apt install -y"
      ;;
    almalinux|rocky|centos|rhel|ol|fedora)
      PKG_MGR="dnf"
      PKG_UPDATE="dnf check-update || true"
      PKG_INSTALL="dnf install -y"
      ;;
    amzn)
      PKG_MGR="yum"
      PKG_UPDATE="yum check-update || true"
      PKG_INSTALL="yum install -y"
      ;;
    *)
      log_error "Unsupported distribution: $ID"
      exit 1
      ;;
  esac
else
  log_error "Cannot detect distribution. /etc/os-release not found."
  exit 1
fi
# Check prerequisites
echo "[1/8] Checking prerequisites..."
# Check if running as root or with sudo
if [[ $EUID -ne 0 ]]; then
  log_error "This script must be run as root or with sudo"
  exit 1
fi

# Check if kubectl is installed and working
if ! command -v kubectl >/dev/null 2>&1; then
  log_error "kubectl is not installed or not in PATH"
  exit 1
fi

# Verify cluster connectivity and requirements
echo "[2/8] Verifying cluster requirements..."
if ! kubectl version >/dev/null 2>&1; then
  log_error "Cannot connect to Kubernetes cluster"
  exit 1
fi

if ! kubectl auth can-i create nodes >/dev/null 2>&1; then
  log_error "Insufficient permissions to create nodes"
  exit 1
fi
# Update package manager
echo "[3/8] Updating package manager..."
eval "$PKG_UPDATE"
# Install required tools
echo "[4/8] Installing required tools..."
# Install curl and gnupg if not present
if [[ "$PKG_MGR" == "apt" ]]; then
$PKG_INSTALL curl gnupg2 apt-transport-https
else
$PKG_INSTALL curl gnupg2
fi
# Install Helm
if ! command -v helm >/dev/null 2>&1; then
  if [[ "$PKG_MGR" == "apt" ]]; then
    curl -fsSL https://baltocdn.com/helm/signing.asc | gpg --dearmor | tee /usr/share/keyrings/helm.gpg > /dev/null
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | tee /etc/apt/sources.list.d/helm-stable-debian.list
    eval "$PKG_UPDATE"
    $PKG_INSTALL helm
  else
    curl -fsSL -o /tmp/get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
    chmod 755 /tmp/get_helm.sh
    /tmp/get_helm.sh
    rm -f /tmp/get_helm.sh
  fi
  log_info "Helm installed successfully"
else
  log_info "Helm is already installed"
fi
# Create autoscaler namespace
echo "[5/8] Creating cluster autoscaler namespace..."
kubectl create namespace cluster-autoscaler --dry-run=client -o yaml | kubectl apply -f -
# Configure RBAC permissions
echo "[6/8] Configuring RBAC permissions..."
cat > /tmp/cluster-autoscaler-rbac.yaml << 'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources: ["namespaces", "pods", "services", "replicationcontrollers", "persistentvolumeclaims", "persistentvolumes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: cluster-autoscaler
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: cluster-autoscaler
EOF
chmod 644 /tmp/cluster-autoscaler-rbac.yaml
kubectl apply -f /tmp/cluster-autoscaler-rbac.yaml
# Deploy cluster autoscaler with Helm
echo "[7/8] Deploying cluster autoscaler..."
# Add autoscaler Helm repository
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
# Create Helm values file for cost optimization
cat > /tmp/cluster-autoscaler-values.yaml << EOF
autoDiscovery:
  clusterName: ${CLUSTER_NAME}
  enabled: true
awsRegion: ${AWS_REGION}
extraArgs:
  v: 4
  stderrthreshold: info
  logtostderr: true
  skip-nodes-with-local-storage: false
  expander: priority,least-waste
  node-group-auto-discovery: asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/${CLUSTER_NAME}
  balance-similar-node-groups: true
  skip-nodes-with-system-pods: false
  scale-down-enabled: true
  scale-down-delay-after-add: 10m
  scale-down-unneeded-time: 10m
  scale-down-utilization-threshold: 0.5
  max-node-provision-time: 15m
image:
  tag: v1.30.0
nodeSelector:
  kubernetes.io/os: linux
podAnnotations:
  cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
resources:
  limits:
    cpu: 100m
    memory: 600Mi
  requests:
    cpu: 100m
    memory: 600Mi
serviceAccount:
  create: false
  name: cluster-autoscaler
securityContext:
  runAsNonRoot: true
  runAsUser: 65534
  fsGroup: 65534
priorityClassName: system-cluster-critical
tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/os
              operator: In
              values:
                - linux
EOF
chmod 644 /tmp/cluster-autoscaler-values.yaml
# Install/upgrade cluster autoscaler
helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
--namespace cluster-autoscaler \
--values /tmp/cluster-autoscaler-values.yaml \
--wait --timeout=300s
# Verification
echo "[8/8] Verifying installation..."
# Wait for deployment to be ready
kubectl wait deployment -n cluster-autoscaler -l app.kubernetes.io/instance=cluster-autoscaler --for=condition=Available --timeout=300s

# Verify pods are running (the Helm chart labels resources with the release name)
if kubectl get pods -n cluster-autoscaler -l app.kubernetes.io/instance=cluster-autoscaler | grep -q "Running"; then
  log_info "Cluster autoscaler pod is running"
else
  log_error "Cluster autoscaler pod is not running"
  exit 1
fi

# Check logs for any obvious errors
if kubectl logs -n cluster-autoscaler -l app.kubernetes.io/instance=cluster-autoscaler --tail=20 | grep -qi error; then
  log_warn "Found errors in cluster autoscaler logs. Please check with: kubectl logs -n cluster-autoscaler -l app.kubernetes.io/instance=cluster-autoscaler"
fi
# Cleanup temporary files
rm -f /tmp/cluster-autoscaler-rbac.yaml /tmp/cluster-autoscaler-values.yaml
# Final success message
log_info "✅ Kubernetes cluster autoscaler installation completed successfully!"
log_info "📊 Configuration includes cost optimization settings with mixed instance types support"
log_info "🔍 Monitor with: kubectl logs -f -n cluster-autoscaler -l app.kubernetes.io/name=cluster-autoscaler"
log_info "📈 View status: kubectl describe configmap cluster-autoscaler-status -n cluster-autoscaler"
echo
echo "Next steps:"
echo "1. Ensure your node groups have proper tags for auto-discovery"
echo "2. Configure spot instances in your node groups for maximum cost savings"
echo "3. Monitor scaling events in the autoscaler logs"
Review the script before running. Execute with: sudo bash install.sh --cluster-name my-cluster --region us-west-2 --node-group my-nodegroup