Set up automated Varnish cache warming using Kubernetes CronJobs to preload frequently accessed content and improve website performance. This tutorial covers creating cache warming scripts, Docker containers, and automated scheduling for production environments.
Prerequisites
- Kubernetes cluster with kubectl access
- Varnish Cache backend service
- Container registry access
- Basic knowledge of Kubernetes manifests
What this solves
Cache warming prevents cache misses by preloading frequently accessed content into Varnish before users request it. This avoids the latency penalty of serving requests from a cold cache and keeps response times consistent. Kubernetes CronJobs provide a reliable, scalable way to automate cache warming across multiple environments.
Step-by-step configuration
Install required tools
Install curl and other tools needed for cache warming scripts and Kubernetes management.
```bash
sudo apt update
sudo apt install -y curl jq wget git
```
Create cache warming script
Create a Python script that reads URLs from a file and makes HTTP requests to warm the cache. This script will run inside a container.
```python
#!/usr/bin/env python3
import sys
import time
import requests
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed
import os

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

class CacheWarmer:
    def __init__(self, base_url, max_workers=10, request_timeout=30):
        self.base_url = base_url.rstrip('/')
        self.max_workers = max_workers
        self.request_timeout = request_timeout
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'VarnishCacheWarmer/1.0',
            'Cache-Control': 'no-cache',
            'Pragma': 'no-cache'
        })

    def warm_url(self, path):
        """Warm a single URL and return the result."""
        url = f"{self.base_url}{path}"
        try:
            start_time = time.time()
            response = self.session.get(url, timeout=self.request_timeout)
            duration = time.time() - start_time
            cache_status = response.headers.get('X-Varnish-Cache', 'UNKNOWN')
            logger.info(f"Warmed {url} - Status: {response.status_code} - Duration: {duration:.2f}s - Cache: {cache_status}")
            return {
                'url': url,
                'status_code': response.status_code,
                'duration': duration,
                'cache_status': cache_status,
                'success': 200 <= response.status_code < 400
            }
        except Exception as e:
            logger.error(f"Failed to warm {url}: {e}")
            return {
                'url': url,
                'success': False,
                'error': str(e)
            }

    def warm_urls_from_file(self, urls_file):
        """Read URLs from a file and warm them concurrently."""
        try:
            with open(urls_file, 'r') as f:
                urls = [line.strip() for line in f
                        if line.strip() and not line.strip().startswith('#')]
        except FileNotFoundError:
            logger.error(f"URLs file {urls_file} not found")
            return False
        if not urls:
            logger.warning("No URLs found to warm")
            return True
        logger.info(f"Starting cache warming for {len(urls)} URLs with {self.max_workers} workers")
        results = []
        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
            future_to_url = {executor.submit(self.warm_url, url): url for url in urls}
            for future in as_completed(future_to_url):
                results.append(future.result())
        # Summary statistics
        successful = sum(1 for r in results if r.get('success', False))
        failed = len(results) - successful
        logger.info(f"Cache warming completed. Successful: {successful}, Failed: {failed}")
        if failed > 0:
            logger.warning(f"{failed} URLs failed to warm")
        return failed == 0

def main():
    base_url = os.getenv('VARNISH_BASE_URL', 'http://varnish-service:80')
    urls_file = os.getenv('URLS_FILE', '/app/urls.txt')
    max_workers = int(os.getenv('MAX_WORKERS', '10'))
    request_timeout = int(os.getenv('REQUEST_TIMEOUT', '30'))
    warmer = CacheWarmer(base_url, max_workers, request_timeout)
    success = warmer.warm_urls_from_file(urls_file)
    sys.exit(0 if success else 1)

if __name__ == '__main__':
    main()
```
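The URL-file filtering in warm_urls_from_file (skip blank lines and `#` comments) is easy to verify in isolation before building the container. A minimal standard-library sketch, with a hypothetical sample listing:

```python
import io

def parse_url_list(text):
    """Return URL paths from a urls.txt-style listing,
    skipping blank lines and # comment lines."""
    return [line.strip() for line in io.StringIO(text)
            if line.strip() and not line.strip().startswith('#')]

sample = """\
# Homepage
/
/products

# API endpoints
/api/v1/products
"""
print(parse_url_list(sample))  # → ['/', '/products', '/api/v1/products']
```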
Create URL list file
Create a file containing the URLs you want to warm. Include your most frequently accessed pages and API endpoints.
```text
# Homepage and main pages
/
/products
/services
/about
/contact

# API endpoints
/api/v1/products
/api/v1/categories
/api/v1/featured

# Popular product pages
/products/bestseller-1
/products/featured-item-2
/products/category/electronics

# Static assets that benefit from warming
/css/main.css
/js/app.js
/images/hero-banner.jpg
```
Create Dockerfile for cache warmer
Build a lightweight container image that includes the cache warming script and dependencies.
```dockerfile
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Install Python dependencies
RUN pip install --no-cache-dir requests

# Copy application files
COPY cache-warmer.py /app/
COPY urls.txt /app/

# Make script executable
RUN chmod +x /app/cache-warmer.py

# Create non-root user
RUN useradd -m -u 1001 warmer
USER warmer

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV VARNISH_BASE_URL=http://varnish-service:80
ENV URLS_FILE=/app/urls.txt
ENV MAX_WORKERS=10

CMD ["python3", "/app/cache-warmer.py"]
```

A Docker HEALTHCHECK is deliberately omitted: the warmer is a short-lived batch process with no HTTP endpoint to probe, and Kubernetes ignores Docker HEALTHCHECK instructions in favor of its own probes.
Build and push Docker image
Build the cache warmer container image and push it to your container registry.
```bash
# Build the image
docker build -t varnish-cache-warmer:latest .

# Tag for your registry (replace with your registry URL)
docker tag varnish-cache-warmer:latest your-registry.example.com/varnish-cache-warmer:latest

# Push to registry
docker push your-registry.example.com/varnish-cache-warmer:latest
```
Create Kubernetes namespace
Create a dedicated namespace for cache warming resources to maintain organization and apply specific policies.
```bash
kubectl create namespace cache-warming
```
Create ConfigMap for URLs
Store the URLs list in a Kubernetes ConfigMap so you can update it without rebuilding the container image.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cache-warmer-urls
  namespace: cache-warming
data:
  urls.txt: |
    # Homepage and main pages
    /
    /products
    /services
    /about
    /contact

    # API endpoints
    /api/v1/products
    /api/v1/categories
    /api/v1/featured

    # Popular product pages
    /products/bestseller-1
    /products/featured-item-2
    /products/category/electronics

    # Static assets
    /css/main.css
    /js/app.js
    /images/hero-banner.jpg
```
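To keep the ConfigMap and your local urls.txt from drifting apart, you can generate the manifest from the file instead of maintaining both by hand. A minimal sketch (kubectl accepts JSON manifests, so the standard library suffices; the names mirror the ConfigMap above):

```python
import json

def urls_configmap(urls_text, name="cache-warmer-urls", namespace="cache-warming"):
    """Build a ConfigMap manifest embedding the URL list as urls.txt."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {"urls.txt": urls_text},
    }

manifest = urls_configmap("/\n/products\n")
# Pipe this output to: kubectl apply -f -
print(json.dumps(manifest, indent=2))
```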
Apply the ConfigMap
Create the ConfigMap in your Kubernetes cluster.
```bash
kubectl apply -f cache-warmer-config.yaml
```
Create ServiceAccount and RBAC
Create proper RBAC permissions for the cache warming jobs to access necessary cluster resources.
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cache-warmer
  namespace: cache-warming
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: cache-warming
  name: cache-warmer-role
rules:
- apiGroups: [""]
  resources: ["pods", "configmaps"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["events"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cache-warmer-binding
  namespace: cache-warming
subjects:
- kind: ServiceAccount
  name: cache-warmer
  namespace: cache-warming
roleRef:
  kind: Role
  name: cache-warmer-role
  apiGroup: rbac.authorization.k8s.io
```
Apply RBAC configuration
Create the ServiceAccount and RBAC permissions in the cluster.
```bash
kubectl apply -f cache-warmer-rbac.yaml
```
Create CronJob manifest
Define a CronJob that runs the cache warming process on a schedule. This example runs every 15 minutes during business hours on weekdays.
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: varnish-cache-warmer
  namespace: cache-warming
  labels:
    app: cache-warmer
    version: v1
spec:
  # Run every 15 minutes during business hours (9 AM to 6 PM), Monday-Friday
  schedule: "*/15 9-18 * * 1-5"
  # Time zone support (on by default from Kubernetes 1.25, stable in 1.27)
  timeZone: "UTC"
  # Do not start a new run while the previous one is still active
  concurrencyPolicy: Forbid
  # Skip a run if it cannot start within 5 minutes of its scheduled time
  startingDeadlineSeconds: 300
  # Keep the last 3 successful and 1 failed job for debugging
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      labels:
        app: cache-warmer
    spec:
      # Job timeout (10 minutes)
      activeDeadlineSeconds: 600
      template:
        metadata:
          labels:
            app: cache-warmer
        spec:
          serviceAccountName: cache-warmer
          restartPolicy: OnFailure
          # Pod security context
          securityContext:
            runAsNonRoot: true
            runAsUser: 1001
            runAsGroup: 1001
            fsGroup: 1001
          containers:
          - name: cache-warmer
            image: your-registry.example.com/varnish-cache-warmer:latest
            imagePullPolicy: Always
            env:
            - name: VARNISH_BASE_URL
              value: "http://varnish-service.default.svc.cluster.local:80"
            - name: URLS_FILE
              value: "/app/urls.txt"
            - name: MAX_WORKERS
              value: "10"
            - name: REQUEST_TIMEOUT
              value: "30"
            # Resource limits
            resources:
              requests:
                memory: "64Mi"
                cpu: "100m"
              limits:
                memory: "256Mi"
                cpu: "500m"
            # Mount the ConfigMap over the baked-in URL list
            volumeMounts:
            - name: urls-config
              mountPath: /app/urls.txt
              subPath: urls.txt
              readOnly: true
            # Container security context
            securityContext:
              allowPrivilegeEscalation: false
              readOnlyRootFilesystem: true
              capabilities:
                drop:
                - ALL
          volumes:
          - name: urls-config
            configMap:
              name: cache-warmer-urls
```
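Cron schedules are a common source of silent failures: a CronJob schedule needs exactly five fields (minute, hour, day of month, month, day of week). A small structural check you can run before applying the manifest; it validates shape only, not value ranges:

```python
import re

# A field is a number, range, or *, optionally with a /step, optionally comma-listed.
FIELD = r"(\*|\d+(-\d+)?)(/\d+)?(,(\*|\d+(-\d+)?)(/\d+)?)*"

def looks_like_cron(expr):
    """True if expr has five whitespace-separated, structurally valid fields."""
    fields = expr.split()
    return len(fields) == 5 and all(re.fullmatch(FIELD, f) for f in fields)

print(looks_like_cron("*/15 9-18 * * 1-5"))  # → True
print(looks_like_cron("/15 9-18 * 1-5"))     # → False (four fields, malformed step)
```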
Deploy the CronJob
Apply the CronJob manifest to start automated cache warming.
```bash
kubectl apply -f cache-warmer-cronjob.yaml
```
Create monitoring ServiceMonitor
Set up Prometheus monitoring for the cache warming jobs if your cluster runs the Prometheus Operator. Note that the warming script as written does not expose a metrics endpoint: the ServiceMonitor and Service below only take effect if you extend the warmer to serve Prometheus metrics on port 8080.
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cache-warmer-metrics
  namespace: cache-warming
  labels:
    app: cache-warmer
spec:
  selector:
    matchLabels:
      app: cache-warmer
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
---
apiVersion: v1
kind: Service
metadata:
  name: cache-warmer-metrics
  namespace: cache-warming
  labels:
    app: cache-warmer
spec:
  selector:
    app: cache-warmer
  ports:
  - name: metrics
    port: 8080
    targetPort: 8080
    protocol: TCP
```
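Because CronJob pods are short-lived, scraping them is unreliable; the common alternative for batch jobs is to push metrics to a Prometheus Pushgateway. A sketch of building a payload in the Prometheus text exposition format — the metric names and the Pushgateway address are illustrative assumptions, not part of the setup above:

```python
def exposition(metrics):
    """Render {name: (type, value)} pairs in Prometheus text exposition format."""
    lines = []
    for name, (mtype, value) in metrics.items():
        lines.append(f"# TYPE {name} {mtype}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

payload = exposition({
    "cache_warm_urls_total": ("gauge", 16),
    "cache_warm_failures_total": ("gauge", 1),
    "cache_warm_duration_seconds": ("gauge", 42.7),
})
print(payload)
# POST this body to a Pushgateway, e.g. (hypothetical address):
# http://pushgateway.monitoring:9091/metrics/job/varnish-cache-warmer
```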
Create manual cache warming Job
Create a one-time Job for immediate cache warming or testing purposes.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: varnish-cache-warmer-manual
  namespace: cache-warming
  labels:
    app: cache-warmer
    type: manual
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 600
  template:
    metadata:
      labels:
        app: cache-warmer
        type: manual
    spec:
      serviceAccountName: cache-warmer
      restartPolicy: Never
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        runAsGroup: 1001
        fsGroup: 1001
      containers:
      - name: cache-warmer
        image: your-registry.example.com/varnish-cache-warmer:latest
        imagePullPolicy: Always
        env:
        - name: VARNISH_BASE_URL
          value: "http://varnish-service.default.svc.cluster.local:80"
        - name: URLS_FILE
          value: "/app/urls.txt"
        - name: MAX_WORKERS
          value: "20"
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
        volumeMounts:
        - name: urls-config
          mountPath: /app/urls.txt
          subPath: urls.txt
          readOnly: true
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
      volumes:
      - name: urls-config
        configMap:
          name: cache-warmer-urls
```
Verify your setup
Check that the CronJob is scheduled and monitor the cache warming execution.
```bash
# Check CronJob status
kubectl get cronjobs -n cache-warming

# View recent jobs
kubectl get jobs -n cache-warming

# Check logs from the latest job
kubectl logs -l app=cache-warmer -n cache-warming --tail=50

# Run manual cache warming for testing
kubectl apply -f cache-warmer-job.yaml

# Monitor the manual job's execution
kubectl logs -f job/varnish-cache-warmer-manual -n cache-warming

# Spot-check cache headers (run from a pod inside the cluster;
# the service DNS name is not resolvable from outside)
curl -I -H "Host: example.com" http://varnish-service.default.svc.cluster.local:80/

# Verify cache warming metrics if monitoring is enabled
kubectl port-forward -n cache-warming service/cache-warmer-metrics 8080:8080
curl http://localhost:8080/metrics
```
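A simple way to judge whether warming is effective is to collect the cache header from repeat requests and compute a hit ratio. The classification logic can be kept pure and tested offline; the `X-Varnish-Cache` header and its HIT/MISS values are set by your VCL, so adjust to match your configuration. The sample values below are illustrative:

```python
def hit_ratio(cache_headers):
    """Fraction of responses whose Varnish cache header reports a HIT.
    Header values (HIT/MISS) depend on your VCL."""
    if not cache_headers:
        return 0.0
    hits = sum(1 for h in cache_headers if h.upper().startswith("HIT"))
    return hits / len(cache_headers)

# Header values collected across warmed URLs (sample data):
print(hit_ratio(["HIT", "HIT", "MISS", "HIT"]))  # → 0.75
```

After a warming run, a second pass over the same URL list should report a ratio close to 1.0; a low ratio usually means the objects are not cacheable or the TTL expired before the check.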
Performance optimization
Tune worker concurrency
Adjust the number of concurrent workers based on your Varnish backend capacity and response times.
```bash
# Update the CronJob with an optimized worker count
kubectl patch cronjob varnish-cache-warmer -n cache-warming -p '{
  "spec": {
    "jobTemplate": {
      "spec": {
        "template": {
          "spec": {
            "containers": [{
              "name": "cache-warmer",
              "env": [{
                "name": "MAX_WORKERS",
                "value": "20"
              }]
            }]
          }
        }
      }
    }
  }
}'
```
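A rough starting point for the worker count comes from Little's law: concurrency is approximately the request rate you want to sustain times the average response time. A sketch (the cap of 50 is an arbitrary safety limit, not a Varnish constraint):

```python
import math

def suggested_workers(target_rps, avg_latency_s, cap=50):
    """Little's law: concurrency ~= throughput x latency,
    capped so the warmer cannot overwhelm the backend."""
    return max(1, min(cap, math.ceil(target_rps * avg_latency_s)))

# Warm at 100 req/s against a backend averaging ~150 ms per response:
print(suggested_workers(100, 0.15))  # → 15
```

Measure the per-URL durations the warmer already logs, and lower the count if backend latency climbs during warming runs.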
Configure cache warming frequency
Adjust the cron schedule based on your cache TTL and traffic patterns.
```bash
# Update the schedule to run every 5 minutes during peak hours on weekdays
kubectl patch cronjob varnish-cache-warmer -n cache-warming -p '{
  "spec": {
    "schedule": "*/5 8-20 * * 1-5"
  }
}'
```
Set up cache warming alerts
Create Prometheus alerts to monitor cache warming job failures and performance.
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cache-warming-alerts
  namespace: cache-warming
  labels:
    prometheus: kube-prometheus
    role: alert-rules
spec:
  groups:
  - name: cache-warming
    rules:
    - alert: CacheWarmingJobFailed
      expr: kube_job_status_failed{namespace="cache-warming",job_name=~"varnish-cache-warmer.*"} > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Cache warming job failed"
        description: "Cache warming job {{ $labels.job_name }} has failed"
    - alert: CacheWarmingJobTookTooLong
      expr: time() - kube_job_status_start_time{namespace="cache-warming",job_name=~"varnish-cache-warmer.*"} > 600
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: "Cache warming job running too long"
        description: "Cache warming job {{ $labels.job_name }} has been running for more than 10 minutes"
```
Update cache warming URLs
To update the list of URLs without rebuilding the container, modify the ConfigMap.
```bash
# Edit the ConfigMap directly
kubectl edit configmap cache-warmer-urls -n cache-warming

# Or update it from the local file
kubectl create configmap cache-warmer-urls --from-file=urls.txt \
  --dry-run=client -o yaml | kubectl apply -n cache-warming -f -

# Verify the update
kubectl get configmap cache-warmer-urls -n cache-warming -o yaml
```
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| CronJob not running | Invalid cron schedule syntax (it needs five fields) | Validate the schedule at crontab.guru and update the manifest |
| Cache warming timeouts | Too many concurrent workers or a slow backend | Reduce `MAX_WORKERS` and increase `REQUEST_TIMEOUT` |
| "Connection refused" errors | Incorrect Varnish service URL | Verify the service name with `kubectl get svc -A` |
| Permission denied errors | Insufficient RBAC permissions | Check the ServiceAccount and Role bindings |
| Cache warming job stuck | Resource limits too low | Increase memory and CPU limits in the job template |
| URLs not updating | ConfigMap not mounted correctly | Verify the `volumeMounts` and ConfigMap name match |