Set up Vertical Pod Autoscaler to automatically optimize resource requests and limits for your Kubernetes workloads. Create cost analysis dashboards to track resource utilization and identify opportunities for rightsizing containers in production clusters.
## Prerequisites

- Kubernetes cluster with cluster-admin access
- Helm 3 installed
- At least 4 GB of available cluster memory
- metrics-server running (installed below if missing)
## What this solves
Kubernetes workloads often run with poorly configured resource requests and limits, leading to wasted CPU and memory or application performance issues. The Vertical Pod Autoscaler (VPA) analyzes historical resource usage and provides recommendations for optimal resource allocation. This tutorial shows you how to deploy VPA, configure it for your workloads, and build cost analysis dashboards to track resource efficiency across your cluster.
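To make the recommendations concrete: once VPA has observed a workload, it publishes its analysis in the VPA object's status. The fragment below is an illustrative sketch of that status (field names come from the `autoscaling.k8s.io/v1` API; the numbers are hypothetical):

```yaml
# Illustrative VPA status fragment -- the values here are made up
status:
  recommendation:
    containerRecommendations:
      - containerName: webapp
        lowerBound:        # minimum the container is likely to need
          cpu: 25m
          memory: 100Mi
        target:            # recommended request values
          cpu: 50m
          memory: 128Mi
        upperBound:        # requests above this are likely wasted
          cpu: 200m
          memory: 300Mi
```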
## Step-by-step installation

### Install metrics-server for resource collection
The VPA requires metrics-server to collect resource usage data from your pods. Install it using the official manifest if not already present.
```bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
Verify metrics-server is running and collecting data:
```bash
kubectl get pods -n kube-system | grep metrics-server
kubectl top nodes
kubectl top pods --all-namespaces
```
### Deploy Vertical Pod Autoscaler
Clone the VPA repository and deploy the components using the provided installation script:

```bash
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
```
Verify all VPA components are running:
```bash
kubectl get pods -n kube-system | grep vpa
kubectl get crd | grep verticalpodautoscaler
```
### Install Prometheus for metrics collection
Deploy Prometheus using Helm to collect detailed resource metrics for cost analysis. Add the Prometheus community Helm repository:
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```
Create a values file named `prometheus-values.yaml` for the Prometheus configuration:

```yaml
server:
  retention: "15d"
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi
  persistentVolume:
    size: 20Gi
nodeExporter:
  enabled: true
kubeStateMetrics:
  enabled: true
alertmanager:
  enabled: false
pushgateway:
  enabled: false
```
Install Prometheus in the monitoring namespace:
```bash
kubectl create namespace monitoring
helm install prometheus prometheus-community/prometheus \
  --namespace monitoring \
  --values prometheus-values.yaml
```
### Deploy Grafana for visualization
Install Grafana to create cost analysis dashboards using the official Helm chart:
```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
```
Create Grafana configuration values in `grafana-values.yaml` (replace the example admin password with your own):

```yaml
adminPassword: "AdminPassword123!"
persistence:
  enabled: true
  size: 10Gi
service:
  type: ClusterIP
  port: 80
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-server.monitoring.svc.cluster.local
        access: proxy
        isDefault: true
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
```
Install Grafana:
```bash
helm install grafana grafana/grafana \
  --namespace monitoring \
  --values grafana-values.yaml
```
### Configure VPA for workload analysis
Create a VPA resource to analyze an existing deployment. This example targets a web application deployment; save it as `vpa-recommendation.yaml`:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Off"  # Only provide recommendations
  resourcePolicy:
    containerPolicies:
      - containerName: webapp
        minAllowed:
          cpu: 10m
          memory: 64Mi
        maxAllowed:
          cpu: 1000m
          memory: 2Gi
        controlledResources: ["cpu", "memory"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: webapp
          image: nginx:1.24
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
          ports:
            - containerPort: 80
```
Apply the configuration:
```bash
kubectl apply -f vpa-recommendation.yaml
```
### Create cost analysis dashboard
Create a Grafana dashboard configuration for resource utilization and cost tracking:
```json
{
  "dashboard": {
    "id": null,
    "title": "Kubernetes Cost Analysis",
    "tags": ["kubernetes", "cost", "vpa"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "CPU Utilization by Namespace",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace)",
            "legendFormat": "{{namespace}}"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "unit": "percent"
          }
        },
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
      },
      {
        "id": 2,
        "title": "Memory Utilization by Namespace",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(container_memory_usage_bytes) by (namespace) / 1024^3",
            "legendFormat": "{{namespace}}"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "unit": "GB"
          }
        },
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
      },
      {
        "id": 3,
        "title": "Resource Requests vs Usage",
        "type": "timeseries",
        "targets": [
          {
            "expr": "sum(kube_pod_container_resource_requests{resource=\"cpu\"}) by (namespace)",
            "legendFormat": "CPU Requests - {{namespace}}"
          },
          {
            "expr": "sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace)",
            "legendFormat": "CPU Usage - {{namespace}}"
          }
        ],
        "gridPos": {"h": 8, "w": 24, "x": 0, "y": 8}
      }
    ],
    "time": {
      "from": "now-1h",
      "to": "now"
    },
    "refresh": "30s"
  }
}
```
Import this dashboard through the Grafana UI or using the API after setting up port forwarding.
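Importing over the HTTP API can be scripted. The sketch below posts a saved dashboard file to Grafana's `/api/dashboards/db` endpoint with basic auth; the URL assumes the port-forward from this tutorial, and `cost-dashboard.json` is a hypothetical filename for the JSON above (which already contains the `"dashboard"` wrapper the endpoint expects -- add `"overwrite": true` to update in place):

```shell
# Import a Grafana dashboard through the HTTP API.
# Assumes: kubectl port-forward -n monitoring service/grafana 3000:80
import_dashboard() {
  local url="${GRAFANA_URL:-http://localhost:3000}"
  local file="${1:-cost-dashboard.json}"
  if curl -s --max-time 2 "$url/api/health" >/dev/null 2>&1; then
    curl -s -X POST "$url/api/dashboards/db" \
      -H "Content-Type: application/json" \
      -u "admin:${GRAFANA_PASSWORD:-admin}" \
      -d @"$file"
  else
    echo "Grafana not reachable at $url; start the port-forward first"
  fi
}

import_dashboard cost-dashboard.json
```

With the port-forward running, a successful import returns a JSON response containing the dashboard's `uid` and `url`.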
### Configure VPA recommendation-only mode
Set up VPA to provide recommendations without automatically updating pods. This is safer for production workloads:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: production-app-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: production-app
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      - containerName: app
        minAllowed:
          cpu: 50m
          memory: 128Mi
        maxAllowed:
          cpu: 2000m
          memory: 4Gi
        controlledResources: ["cpu", "memory"]
        controlledValues: "RequestsAndLimits"
```
Apply the VPA configuration:
```bash
kubectl apply -f vpa-recommendonly.yaml
```
### Set up cost monitoring alerts
Create Prometheus alerting rules for resource waste detection. Note that the `PrometheusRule` CRD requires the Prometheus Operator (for example the kube-prometheus-stack chart); with the standalone chart installed above, add equivalent rules under `serverFiles` in your Prometheus values instead:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cost-optimization-alerts
  namespace: monitoring
spec:
  groups:
    - name: resource-waste
      rules:
        - alert: HighCPURequest
          expr: |
            (
              sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace, pod, container) -
              sum(rate(container_cpu_usage_seconds_total[1h])) by (namespace, pod, container)
            ) > 0.5
          for: 30m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} has excessive CPU requests"
            description: "Container {{ $labels.container }} is requesting {{ $value }} more CPU cores than it uses"
        - alert: HighMemoryRequest
          expr: |
            (
              sum(kube_pod_container_resource_requests{resource="memory"}) by (namespace, pod, container) -
              sum(container_memory_usage_bytes) by (namespace, pod, container)
            ) / 1024^3 > 0.5
          for: 30m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} has excessive memory requests"
            description: "Container {{ $labels.container }} is requesting {{ $value }} GB more memory than it uses"
```
Apply the alerting rules:
```bash
kubectl apply -f cost-alerts.yaml
```
### Create automated rightsizing script
Build a script (`rightsize-workloads.sh`) to extract VPA recommendations and compare them with the current resource requests:

```bash
#!/bin/bash
# Compare current resource requests with VPA recommendations for every VPA object.
echo "Fetching VPA recommendations..."
kubectl get vpa --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' |
while read -r namespace vpa_name; do
  echo -e "\n=== VPA recommendations for $namespace/$vpa_name ==="
  # Extract the target deployment
  target_ref=$(kubectl get vpa "$vpa_name" -n "$namespace" -o jsonpath='{.spec.targetRef.name}')
  # Current resource requests
  echo "Current resource requests:"
  kubectl get deployment "$target_ref" -n "$namespace" -o jsonpath='{range .spec.template.spec.containers[*]}{.name}: CPU={.resources.requests.cpu}, Memory={.resources.requests.memory}{"\n"}{end}'
  # VPA recommendations
  echo -e "\nVPA recommendations:"
  kubectl get vpa "$vpa_name" -n "$namespace" -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}: CPU={.target.cpu}, Memory={.target.memory}{"\n"}{end}'
  # Recommendation status
  echo -e "\nRecommendation status:"
  kubectl get vpa "$vpa_name" -n "$namespace" -o jsonpath='{.status.conditions[*].message}'
done
```
Make the script executable and run it:
```bash
chmod +x rightsize-workloads.sh
./rightsize-workloads.sh
```
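To turn the printed recommendations into concrete savings numbers, CPU quantities first need normalizing: VPA reports millicores like `250m`, while deployments may use whole or fractional cores like `1` or `0.5`. A small helper, sketched here with hypothetical function names, converts both forms and computes the per-replica CPU difference in millicores:

```shell
# Convert a Kubernetes CPU quantity ("250m", "1", "0.5") to integer millicores.
to_millicores() {
  local q="$1"
  if [[ "$q" == *m ]]; then
    echo "${q%m}"
  else
    # cores (possibly fractional) -> millicores
    awk -v c="$q" 'BEGIN { printf "%d", c * 1000 }'
  fi
}

# Difference between the current request and the VPA target, per replica.
# A positive result means the container is over-provisioned.
cpu_savings_millicores() {
  local current target
  current=$(to_millicores "$1")
  target=$(to_millicores "$2")
  echo "$(( current - target ))"
}

cpu_savings_millicores 100m 48m
```

Multiply the result by the replica count to estimate cluster-wide savings for a deployment.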
## Access your monitoring setup
Set up port forwarding to access Grafana and view your cost analysis dashboards:
```bash
kubectl port-forward -n monitoring service/grafana 3000:80
```
Access Grafana at http://localhost:3000 with username admin and the password you configured. Import the cost dashboard JSON to start analyzing resource utilization patterns.
Check VPA recommendations for your workloads:
```bash
kubectl describe vpa webapp-vpa
kubectl get vpa --all-namespaces -o wide
```
## Configure workload optimization
For production workloads, consider these optimization strategies based on VPA recommendations:
| Scenario | VPA Mode | Update Strategy |
|---|---|---|
| Development/Testing | Auto | Immediate pod restart |
| Staging | Auto | Rolling update during maintenance |
| Production | Off | Manual review and deployment |
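For the `Auto` rows in the table, the same VPA object works with a different `updateMode`. The sketch below targets a hypothetical staging deployment; `minReplicas` in the update policy keeps the updater from evicting pods when fewer than that many replicas are alive:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: staging-app-vpa
  namespace: staging
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: staging-app
  updatePolicy:
    updateMode: "Auto"  # evict and recreate pods with updated requests
    minReplicas: 2      # only evict while at least this many replicas are running
```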
For more advanced cluster management, you might want to explore cluster autoscaling with mixed instance types to complement your workload rightsizing efforts.
## Verify your setup
Confirm all components are functioning correctly:
```bash
# Check VPA components
kubectl get pods -n kube-system | grep vpa

# Verify metrics collection
kubectl top pods --all-namespaces

# Check VPA recommendations
kubectl get vpa --all-namespaces

# Test Prometheus metrics
kubectl port-forward -n monitoring service/prometheus-server 9090:80 &
curl -s "http://localhost:9090/api/v1/query?query=up" | jq '.data.result[] | select(.metric.__name__ == "up") | .value[1]'

# Verify Grafana admin password
kubectl get secret -n monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode
echo
```
## Common issues
| Symptom | Cause | Fix |
|---|---|---|
| VPA pods not starting | Insufficient cluster resources | Ensure cluster has at least 4GB available memory and check node capacity |
| No VPA recommendations | Insufficient metrics data | Wait 24-48 hours for VPA to collect usage patterns, verify metrics-server is working |
| Grafana dashboard empty | Prometheus data source misconfigured | Verify Prometheus service URL and test data source connection in Grafana |
| VPA recommendations too aggressive | Default safety margins too low | Adjust minAllowed and maxAllowed in VPA resource policy |
| Metrics-server certificate errors | Self-signed certificates | Add --kubelet-insecure-tls flag to metrics-server deployment |
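For the certificate-error row, the flag can be added without editing the manifest by hand. The guarded sketch below appends `--kubelet-insecure-tls` to the metrics-server container args with a JSON patch; this is only appropriate for dev clusters whose kubelets use self-signed certificates:

```shell
# Append --kubelet-insecure-tls to the metrics-server args (dev clusters only).
patch_metrics_server() {
  if command -v kubectl >/dev/null 2>&1 && kubectl cluster-info >/dev/null 2>&1; then
    kubectl patch deployment metrics-server -n kube-system --type=json \
      -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'
  else
    echo "kubectl not configured; skipping patch"
  fi
}

patch_metrics_server
```

After the patch rolls out, `kubectl top nodes` should start returning data within a minute or so.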
## Next steps
- Set up custom metrics autoscaling with Prometheus adapter for more advanced scaling strategies
- Configure resource quotas and limit ranges to enforce resource governance across namespaces
- Implement cluster autoscaler for automatic node scaling to complement your workload optimization
- Set up horizontal pod autoscaling for traffic-based scaling alongside VPA recommendations
## Automated install script

Run this script to automate the entire setup:
```bash
#!/usr/bin/env bash
set -euo pipefail

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Default values
CLUSTER_NAME="${1:-k8s-cluster}"
PROMETHEUS_RETENTION="${2:-15d}"
GRAFANA_PASSWORD="${3:-AdminPassword123!}"

# Usage function
usage() {
  echo "Usage: $0 [cluster-name] [prometheus-retention] [grafana-password]"
  echo "Example: $0 my-cluster 30d MySecurePassword123"
  exit 1
}

# Logging functions
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }

# Cleanup function
cleanup() {
  log_error "Script failed. Check logs above for details."
  # Remove temp files if they exist
  rm -f /tmp/prometheus-values.yaml /tmp/grafana-values.yaml /tmp/webapp-vpa.yaml 2>/dev/null || true
}
trap cleanup ERR

# Check if running as root or with sudo
check_privileges() {
  if [[ $EUID -eq 0 ]]; then
    log_warn "Running as root. Consider using a non-root user with sudo for kubectl operations."
  elif ! sudo -n true 2>/dev/null; then
    log_error "This script requires sudo privileges for package installation"
    exit 1
  fi
}

# Detect Linux distribution
detect_distro() {
  if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
      ubuntu|debian)
        PKG_INSTALL="apt install -y"
        PKG_UPDATE="apt update"
        ;;
      almalinux|rocky|centos|rhel|ol|fedora)
        PKG_INSTALL="dnf install -y"
        PKG_UPDATE="dnf check-update"
        ;;
      amzn)
        PKG_INSTALL="yum install -y"
        PKG_UPDATE="yum check-update"
        ;;
      *)
        log_error "Unsupported distribution: $ID"
        exit 1
        ;;
    esac
    log_info "Detected distribution: $ID ($VERSION_ID)"
  else
    log_error "Cannot detect Linux distribution"
    exit 1
  fi
}

# Install prerequisites
install_prerequisites() {
  echo "[1/8] Installing prerequisites..."
  local packages="curl wget git"
  # dnf/yum check-update exits non-zero when updates are available, so tolerate failure
  if [[ $EUID -eq 0 ]]; then
    $PKG_UPDATE || true
    $PKG_INSTALL $packages
  else
    sudo $PKG_UPDATE || true
    sudo $PKG_INSTALL $packages
  fi
  log_info "Prerequisites installed successfully"
}

# Install kubectl
install_kubectl() {
  echo "[2/8] Installing kubectl..."
  if command -v kubectl &> /dev/null; then
    log_info "kubectl already installed: $(kubectl version --client --short 2>/dev/null || kubectl version --client)"
    return
  fi
  curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
  chmod 755 kubectl
  if [[ $EUID -eq 0 ]]; then
    mv kubectl /usr/local/bin/
  else
    sudo mv kubectl /usr/local/bin/
  fi
  log_info "kubectl installed successfully"
}

# Install Helm
install_helm() {
  echo "[3/8] Installing Helm..."
  if command -v helm &> /dev/null; then
    log_info "Helm already installed: $(helm version --short)"
    return
  fi
  curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
  log_info "Helm installed successfully"
}

# Install metrics-server
install_metrics_server() {
  echo "[4/8] Installing metrics-server..."
  if kubectl get pods -n kube-system | grep metrics-server | grep Running &>/dev/null; then
    log_info "metrics-server already running"
    return
  fi
  kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
  log_info "Waiting for metrics-server to be ready..."
  kubectl wait --for=condition=ready pod -l k8s-app=metrics-server -n kube-system --timeout=300s
  log_info "metrics-server installed and ready"
}

# Install VPA
install_vpa() {
  echo "[5/8] Installing Vertical Pod Autoscaler..."
  if kubectl get crd | grep verticalpodautoscaler &>/dev/null; then
    log_info "VPA already installed"
    return
  fi
  # Clone the VPA repository into a unique temp directory
  local temp_dir="/tmp/autoscaler-$$"
  git clone https://github.com/kubernetes/autoscaler.git "$temp_dir"
  cd "$temp_dir/vertical-pod-autoscaler"
  ./hack/vpa-up.sh
  log_info "Waiting for VPA components to be ready..."
  kubectl wait --for=condition=ready pod -l app=vpa-admission-controller -n kube-system --timeout=300s
  kubectl wait --for=condition=ready pod -l app=vpa-recommender -n kube-system --timeout=300s
  kubectl wait --for=condition=ready pod -l app=vpa-updater -n kube-system --timeout=300s
  cd /
  rm -rf "$temp_dir"
  log_info "VPA installed successfully"
}

# Install Prometheus
install_prometheus() {
  echo "[6/8] Installing Prometheus..."
  kubectl create namespace monitoring --dry-run=client -o yaml | kubectl apply -f -
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  helm repo update
  cat > /tmp/prometheus-values.yaml << EOF
server:
  retention: "$PROMETHEUS_RETENTION"
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi
  persistentVolume:
    size: 20Gi
nodeExporter:
  enabled: true
kubeStateMetrics:
  enabled: true
alertmanager:
  enabled: false
pushgateway:
  enabled: false
EOF
  if helm list -n monitoring | grep prometheus &>/dev/null; then
    log_info "Prometheus already installed, upgrading..."
    helm upgrade prometheus prometheus-community/prometheus \
      --namespace monitoring \
      --values /tmp/prometheus-values.yaml
  else
    helm install prometheus prometheus-community/prometheus \
      --namespace monitoring \
      --values /tmp/prometheus-values.yaml
  fi
  log_info "Waiting for Prometheus to be ready..."
  kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=prometheus,app.kubernetes.io/component=server -n monitoring --timeout=300s
  rm -f /tmp/prometheus-values.yaml
  log_info "Prometheus installed successfully"
}

# Install Grafana
install_grafana() {
  echo "[7/8] Installing Grafana..."
  helm repo add grafana https://grafana.github.io/helm-charts
  helm repo update
  cat > /tmp/grafana-values.yaml << EOF
adminPassword: "$GRAFANA_PASSWORD"
persistence:
  enabled: true
  size: 10Gi
service:
  type: ClusterIP
  port: 80
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-server.monitoring.svc.cluster.local
        access: proxy
        isDefault: true
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
EOF
  if helm list -n monitoring | grep grafana &>/dev/null; then
    log_info "Grafana already installed, upgrading..."
    helm upgrade grafana grafana/grafana \
      --namespace monitoring \
      --values /tmp/grafana-values.yaml
  else
    helm install grafana grafana/grafana \
      --namespace monitoring \
      --values /tmp/grafana-values.yaml
  fi
  log_info "Waiting for Grafana to be ready..."
  kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=grafana -n monitoring --timeout=300s
  rm -f /tmp/grafana-values.yaml
  log_info "Grafana installed successfully"
}

# Configure sample VPA
configure_sample_vpa() {
  echo "[8/8] Configuring sample VPA..."
  cat > /tmp/webapp-vpa.yaml << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: webapp
          image: nginx:1.21
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              cpu: 200m
              memory: 256Mi
          ports:
            - containerPort: 80
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      - containerName: webapp
        minAllowed:
          cpu: 10m
          memory: 64Mi
        maxAllowed:
          cpu: 1000m
          memory: 2Gi
        controlledResources: ["cpu", "memory"]
EOF
  kubectl apply -f /tmp/webapp-vpa.yaml
  rm -f /tmp/webapp-vpa.yaml
  log_info "Sample webapp with VPA configured"
}

# Verify installation
verify_installation() {
  echo
  log_info "=== Installation Verification ==="
  if kubectl get pods -n kube-system | grep metrics-server | grep Running &>/dev/null; then
    log_info "✓ metrics-server is running"
  else
    log_error "✗ metrics-server is not running"
  fi
  if kubectl get crd | grep verticalpodautoscaler &>/dev/null; then
    log_info "✓ VPA CRDs are installed"
  else
    log_error "✗ VPA CRDs are missing"
  fi
  if kubectl get pods -n monitoring | grep prometheus-server | grep Running &>/dev/null; then
    log_info "✓ Prometheus is running"
  else
    log_error "✗ Prometheus is not running"
  fi
  if kubectl get pods -n monitoring | grep grafana | grep Running &>/dev/null; then
    log_info "✓ Grafana is running"
    echo
    log_info "Grafana admin password: $GRAFANA_PASSWORD"
    log_info "Access Grafana with: kubectl port-forward -n monitoring svc/grafana 3000:80"
  else
    log_error "✗ Grafana is not running"
  fi
  if kubectl get vpa webapp-vpa &>/dev/null; then
    log_info "✓ Sample VPA is configured"
    echo
    log_info "View VPA recommendations with: kubectl describe vpa webapp-vpa"
  else
    log_error "✗ Sample VPA is not configured"
  fi
  echo
  log_info "Installation completed! Wait a few minutes for VPA to generate recommendations."
}

# Main execution
main() {
  log_info "Starting Kubernetes VPA and Cost Analysis installation..."
  check_privileges
  detect_distro
  install_prerequisites
  install_kubectl
  install_helm
  # Validate cluster connectivity only after kubectl is guaranteed to exist
  if ! kubectl cluster-info &>/dev/null; then
    log_error "Cannot connect to Kubernetes cluster. Please ensure kubectl is configured."
    exit 1
  fi
  install_metrics_server
  install_vpa
  install_prometheus
  install_grafana
  configure_sample_vpa
  verify_installation
  log_info "All components installed successfully!"
}

# Validate arguments
if [[ $# -gt 3 ]]; then
  usage
fi

main "$@"
```

Review the script before running. Execute with `bash install.sh`.