Configure Varnish cache invalidation with automated purging strategies for high-performance web acceleration

Advanced | 45 min | Apr 09, 2026
Ubuntu 24.04 | Debian 12 | AlmaLinux 9 | Rocky Linux 9

Learn to configure advanced Varnish VCL for cache invalidation, implement PURGE and BAN strategies, and set up automated cache tagging for optimal performance. Master selective invalidation techniques and monitoring for production-grade web acceleration.

Prerequisites

  • Working Varnish installation
  • Backend web servers configured
  • Administrative access
  • Basic VCL knowledge
  • Python 3 (for webhook functionality)

What this solves

Varnish cache invalidation is critical for high-traffic websites where cached content must be updated immediately when backend data changes. Without proper invalidation strategies, users see stale content, leading to data inconsistency and poor user experience. This tutorial configures advanced VCL rules for selective cache purging, automated BAN operations, and cache tagging systems that maintain optimal cache hit rates while ensuring content freshness.
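The three invalidation modes differ in scope: PURGE removes one exact URL, BAN removes everything matching an expression, and tag purging removes every object carrying a tag. As a rough mental model only (a toy in-memory cache, not Varnish's actual lazy-ban implementation):

```python
import re

# Toy cache: URL -> (body, tags). Illustrates invalidation scope only;
# real Varnish stores objects in its own storage and evaluates bans lazily.
cache = {
    "/api/users/123": ("user page", ["user:123"]),
    "/api/users/456": ("user page", ["user:456"]),
    "/blog/post-1":   ("article",   ["content:article"]),
}

def purge(url):
    """PURGE: drop one exact URL."""
    cache.pop(url, None)

def ban(pattern):
    """BAN: drop every URL matching a regex."""
    for url in [u for u in cache if re.search(pattern, u)]:
        del cache[url]

def purge_tag(tag):
    """Tag purge: drop every object carrying a tag."""
    for url in [u for u, (_, tags) in cache.items() if tag in tags]:
        del cache[url]

purge("/api/users/123")       # removes exactly one object
ban("^/api/")                 # removes everything under /api/
purge_tag("content:article")  # removes objects tagged content:article
print(sorted(cache))          # -> []
```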


Step-by-step configuration

Install Varnish and required tools

Start by installing Varnish and tools needed for cache management and monitoring.

On Ubuntu/Debian:

sudo apt update
sudo apt install -y varnish varnish-dev curl jq htop
sudo systemctl enable varnish

On AlmaLinux/Rocky Linux:

sudo dnf update -y
sudo dnf install -y varnish varnish-devel curl jq htop
sudo systemctl enable varnish

Configure Varnish service parameters

Set up Varnish with appropriate memory allocation and listening ports for cache invalidation operations, using a systemd override at /etc/systemd/system/varnish.service.d/override.conf.

[Service]
ExecStart=
ExecStart=/usr/sbin/varnishd \
  -a :80 \
  -a :6081,PROXY \
  -T localhost:6082 \
  -f /etc/varnish/default.vcl \
  -S /etc/varnish/secret \
  -s malloc,2g \
  -p feature=+http2

Create the override directory if it doesn't exist, save the snippet above as override.conf inside it, then reload systemd:

sudo mkdir -p /etc/systemd/system/varnish.service.d
sudo systemctl daemon-reload

Create advanced VCL configuration with cache invalidation

Configure comprehensive VCL rules in /etc/varnish/default.vcl supporting PURGE, BAN, and cache-tag operations for selective invalidation.

vcl 4.1;

import std;
import directors;

# Define backend servers
backend web1 {
    .host = "192.168.1.10";
    .port = "8080";
    .probe = {
        .url = "/health";
        .timeout = 2s;
        .interval = 5s;
        .window = 3;
        .threshold = 2;
    };
}

backend web2 {
    .host = "192.168.1.11";
    .port = "8080";
    .probe = {
        .url = "/health";
        .timeout = 2s;
        .interval = 5s;
        .window = 3;
        .threshold = 2;
    };
}

# Access control list for cache management
acl purge_acl {
    "localhost";
    "127.0.0.1";
    "192.168.1.0"/24;
    "10.0.0.0"/8;
}

sub vcl_init {
    # Round-robin director for load balancing
    new web_director = directors.round_robin();
    web_director.add_backend(web1);
    web_director.add_backend(web2);
}

sub vcl_recv {
    # Set backend director
    set req.backend_hint = web_director.backend();

    # Handle PURGE requests
    if (req.method == "PURGE") {
        if (client.ip !~ purge_acl) {
            return (synth(405, "PURGE not allowed from " + client.ip));
        }
        # Invalidate the specific URL via a ban
        ban("req.url ~ " + req.url);
        return (synth(200, "Purged: " + req.url));
    }

    # Handle BAN requests for bulk invalidation
    if (req.method == "BAN") {
        if (client.ip !~ purge_acl) {
            return (synth(405, "BAN not allowed from " + client.ip));
        }
        if (req.http.X-Ban-Expression) {
            ban(req.http.X-Ban-Expression);
            return (synth(200, "Banned: " + req.http.X-Ban-Expression));
        } else {
            return (synth(400, "X-Ban-Expression header required"));
        }
    }

    # Handle tagged cache invalidation
    if (req.method == "PURGETAG") {
        if (client.ip !~ purge_acl) {
            return (synth(405, "PURGETAG not allowed from " + client.ip));
        }
        if (req.http.X-Cache-Tags) {
            ban("obj.http.X-Cache-Tags ~ " + req.http.X-Cache-Tags);
            return (synth(200, "Purged tags: " + req.http.X-Cache-Tags));
        } else {
            return (synth(400, "X-Cache-Tags header required"));
        }
    }

    # Normalize requests for better caching
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }

    # Remove tracking parameters
    if (req.url ~ "(\?|&)(utm_[a-z]+|gclid|fbclid|_ga|_gid)=") {
        set req.url = regsuball(req.url, "(\?|&)(utm_[a-z]+|gclid|fbclid|_ga|_gid)=[^&]*", "");
        set req.url = regsuball(req.url, "&+", "&");
        set req.url = regsuball(req.url, "\?&", "?");
        set req.url = regsuball(req.url, "\?$", "");
    }

    # Allow trusted clients to force a cache refresh
    if (req.http.Cache-Control ~ "no-cache" || req.http.Pragma ~ "no-cache") {
        if (client.ip ~ purge_acl) {
            set req.hash_always_miss = true;
        }
    }

    set req.http.X-Forwarded-For = client.ip;

    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    if (req.http.Authorization) {
        return (pass);
    }
    return (hash);
}

sub vcl_backend_response {
    # X-Cache-Tags from the backend is stored with the cached object,
    # where PURGETAG bans can match it.

    # Configure TTL based on content type
    if (beresp.http.Content-Type ~ "text/html") {
        set beresp.ttl = 300s;   # 5 minutes for HTML
        set beresp.grace = 1h;
    } elsif (beresp.http.Content-Type ~ "(css|js)") {
        set beresp.ttl = 1h;     # 1 hour for CSS/JS
        set beresp.grace = 6h;
    } elsif (beresp.http.Content-Type ~ "image/") {
        set beresp.ttl = 24h;    # 24 hours for images
        set beresp.grace = 48h;
    }

    # Handle backend errors gracefully
    if (beresp.status >= 500 && beresp.status <= 599) {
        set beresp.ttl = 0s;
        set beresp.grace = 10s;
        return (abandon);
    }

    # Enable streaming for large objects (> 10 MB)
    if (std.integer(beresp.http.Content-Length, 0) > 10485760) {
        set beresp.do_stream = true;
        set beresp.ttl = 3600s;
    }

    return (deliver);
}

sub vcl_deliver {
    # Add cache status headers
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
        set resp.http.X-Cache-Hits = obj.hits;
    } else {
        set resp.http.X-Cache = "MISS";
    }

    # Add cache age information
    set resp.http.X-Cache-Age = obj.age;

    # Remove internal headers from the client response
    unset resp.http.X-Cache-Tags;
    unset resp.http.Via;
    unset resp.http.X-Varnish;

    return (deliver);
}

sub vcl_hit {
    # Only reached if a PURGE request falls through to lookup
    if (req.method == "PURGE") {
        return (synth(200, "Purged"));
    }
    return (deliver);
}

sub vcl_miss {
    if (req.method == "PURGE") {
        return (synth(404, "Not in cache"));
    }
    return (fetch);
}
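The tracking-parameter cleanup in vcl_recv is easiest to sanity-check outside Varnish. A Python sketch of the same regsuball chain (identical patterns; note that a tracking parameter appearing before a non-tracking one leaves a stray "&", a known limitation of this pattern):

```python
import re

def normalize(url: str) -> str:
    """Mirror the vcl_recv normalization: strip tracking params, tidy separators."""
    url = re.sub(r"(\?|&)(utm_[a-z]+|gclid|fbclid|_ga|_gid)=[^&]*", "", url)
    url = re.sub(r"&+", "&", url)
    url = re.sub(r"\?&", "?", url)
    url = re.sub(r"\?$", "", url)
    return url

print(normalize("/page?id=7&utm_source=mail"))  # -> /page?id=7
print(normalize("/page?gclid=abc"))             # -> /page
print(normalize("/page?utm_a=1&utm_b=2"))       # -> /page
```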

Create cache invalidation automation scripts

Set up scripts for automated cache purging based on content changes and scheduled maintenance. Save the following as /usr/local/bin/varnish-purge.sh.

#!/bin/bash
# Varnish Cache Invalidation Script
# Usage: varnish-purge.sh [purge|ban|tag] [target]

VARNISH_HOST="localhost:80"
LOG_FILE="/var/log/varnish/purge.log"

# Ensure log directory exists
sudo mkdir -p /var/log/varnish

# Logging function
log_action() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | sudo tee -a "$LOG_FILE" > /dev/null
}

# PURGE a specific URL
purge_url() {
    local url="$1"
    if [[ -z "$url" ]]; then
        echo "Usage: $0 purge /path/to/resource"
        exit 1
    fi
    response=$(curl -s -X PURGE -w "%{http_code}" "http://$VARNISH_HOST$url")
    http_code=${response: -3}
    if [[ "$http_code" == "200" ]]; then
        log_action "PURGE SUCCESS: $url (HTTP $http_code)"
        echo "Successfully purged: $url"
    else
        log_action "PURGE FAILED: $url (HTTP $http_code)"
        echo "Failed to purge: $url (HTTP $http_code)"
        exit 1
    fi
}

# BAN with an expression
ban_expression() {
    local expression="$1"
    if [[ -z "$expression" ]]; then
        echo "Usage: $0 ban 'req.url ~ /api/'"
        exit 1
    fi
    response=$(curl -s -X BAN -H "X-Ban-Expression: $expression" -w "%{http_code}" "http://$VARNISH_HOST/")
    http_code=${response: -3}
    if [[ "$http_code" == "200" ]]; then
        log_action "BAN SUCCESS: $expression (HTTP $http_code)"
        echo "Successfully banned: $expression"
    else
        log_action "BAN FAILED: $expression (HTTP $http_code)"
        echo "Failed to ban: $expression (HTTP $http_code)"
        exit 1
    fi
}

# PURGE by cache tags
purge_tags() {
    local tags="$1"
    if [[ -z "$tags" ]]; then
        echo "Usage: $0 tag 'user:123,content:article'"
        exit 1
    fi
    response=$(curl -s -X PURGETAG -H "X-Cache-Tags: $tags" -w "%{http_code}" "http://$VARNISH_HOST/")
    http_code=${response: -3}
    if [[ "$http_code" == "200" ]]; then
        log_action "PURGETAG SUCCESS: $tags (HTTP $http_code)"
        echo "Successfully purged tags: $tags"
    else
        log_action "PURGETAG FAILED: $tags (HTTP $http_code)"
        echo "Failed to purge tags: $tags (HTTP $http_code)"
        exit 1
    fi
}

# Main script logic
case "$1" in
    purge)
        purge_url "$2"
        ;;
    ban)
        ban_expression "$2"
        ;;
    tag)
        purge_tags "$2"
        ;;
    *)
        echo "Usage: $0 {purge|ban|tag} [target]"
        echo "Examples:"
        echo "  $0 purge /api/users/123"
        echo "  $0 ban 'req.url ~ /old-content/'"
        echo "  $0 tag 'user:123,content:article'"
        exit 1
        ;;
esac

Make the script executable:

sudo chmod 755 /usr/local/bin/varnish-purge.sh

Create cache monitoring and metrics script

Set up automated monitoring to track cache hit rates, invalidation frequency, and performance metrics. Save the following as /usr/local/bin/varnish-monitor.sh.

#!/bin/bash
# Varnish Cache Monitoring Script
# Collects and reports cache performance metrics

VARNISH_MGMT="localhost:6082"
METRICS_LOG="/var/log/varnish/metrics.log"
ALERT_THRESHOLD=80  # Alert if cache hit rate drops below 80%

# Get cache statistics
get_cache_stats() {
    local stats
    stats=$(varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss -f MAIN.n_object \
            -f MAIN.n_expired -f MAIN.n_lru_nuked)
    cache_hits=$(echo "$stats" | grep 'MAIN.cache_hit' | awk '{print $2}')
    cache_misses=$(echo "$stats" | grep 'MAIN.cache_miss' | awk '{print $2}')
    objects_cached=$(echo "$stats" | grep 'MAIN.n_object' | awk '{print $2}')
    objects_expired=$(echo "$stats" | grep 'MAIN.n_expired' | awk '{print $2}')
    objects_lru_nuked=$(echo "$stats" | grep 'MAIN.n_lru_nuked' | awk '{print $2}')

    # Calculate hit rate
    total_requests=$((cache_hits + cache_misses))
    if [[ $total_requests -gt 0 ]]; then
        hit_rate=$((cache_hits * 100 / total_requests))
    else
        hit_rate=0
    fi

    # Log metrics
    timestamp=$(date '+%Y-%m-%d %H:%M:%S')
    metrics_line="$timestamp,hits:$cache_hits,misses:$cache_misses,hit_rate:$hit_rate%,objects:$objects_cached,expired:$objects_expired,lru_nuked:$objects_lru_nuked"
    echo "$metrics_line" | sudo tee -a "$METRICS_LOG" > /dev/null

    # Alert on low hit rate
    if [[ $hit_rate -lt $ALERT_THRESHOLD ]]; then
        echo "WARNING: Cache hit rate dropped to $hit_rate% (threshold: $ALERT_THRESHOLD%)"
        logger -t varnish-monitor "Cache hit rate alert: $hit_rate%"
    fi

    # Display current stats
    echo "=== Varnish Cache Statistics ==="
    echo "Cache Hits: $cache_hits"
    echo "Cache Misses: $cache_misses"
    echo "Hit Rate: $hit_rate%"
    echo "Objects Cached: $objects_cached"
    echo "Objects Expired: $objects_expired"
    echo "Objects LRU Nuked: $objects_lru_nuked"
}

# Get ban list information
get_ban_stats() {
    echo -e "\n=== Active Bans ==="
    varnishadm -T "$VARNISH_MGMT" -S /etc/varnish/secret ban.list | head -10
}

# Display the most requested URLs still in the shared-memory log
get_top_urls() {
    echo -e "\n=== Top Requested URLs ==="
    varnishtop -1 -i ReqURL | head -20
}

# Main execution
echo "Varnish Cache Monitor - $(date)"
get_cache_stats
get_ban_stats
get_top_urls

Make the monitoring script executable:

sudo chmod 755 /usr/local/bin/varnish-monitor.sh
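The hit-rate arithmetic in the monitoring script is plain integer math over two counters. A standalone sketch of the same calculation (the sample counter values are made up):

```python
def hit_rate(cache_hits: int, cache_misses: int) -> int:
    """Integer hit-rate percentage, as computed in varnish-monitor.sh."""
    total = cache_hits + cache_misses
    if total == 0:
        return 0
    return cache_hits * 100 // total

ALERT_THRESHOLD = 80

rate = hit_rate(9200, 800)        # sample counters
print(f"hit rate: {rate}%")       # -> hit rate: 92%
if rate < ALERT_THRESHOLD:
    print("WARNING: below threshold")
```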

Set up automated cache warming

Create a cache warming system that preloads frequently accessed content after cache invalidations. Save the following as /usr/local/bin/varnish-warm.sh.

#!/bin/bash
# Varnish Cache Warming Script
# Preloads critical URLs after cache invalidation

WARM_URLS_FILE="/etc/varnish/warm-urls.txt"
VARNISH_HOST="localhost:80"
CONCURRENCY=10
TIMEOUT=30

# Create URLs file if it doesn't exist
if [[ ! -f "$WARM_URLS_FILE" ]]; then
    sudo tee "$WARM_URLS_FILE" > /dev/null << 'EOF'
# Critical URLs to warm after cache invalidation
# One URL per line
/
/api/status
/popular-content
/category/technology
/category/news
EOF
fi

# Warm a single URL and report timing
warm_cache() {
    local url="$1"
    local response http_code response_time
    # Discard the body so the -w output is all that remains to parse
    response=$(curl -s -o /dev/null -w "%{http_code}:%{time_total}" -m "$TIMEOUT" "http://$VARNISH_HOST$url")
    http_code=$(echo "$response" | cut -d':' -f1)
    response_time=$(echo "$response" | cut -d':' -f2)
    if [[ "$http_code" == "200" ]]; then
        echo "WARM SUCCESS: $url (${response_time}s)"
    else
        echo "WARM FAILED: $url (HTTP $http_code)"
    fi
}

# Export for parallel execution via xargs
export -f warm_cache
export VARNISH_HOST TIMEOUT

# Main warming process
echo "Starting cache warming at $(date)"
echo "Using $CONCURRENCY concurrent connections"
echo "Timeout: ${TIMEOUT}s per request"
echo ""

# Process URLs in parallel, skipping comments and blank lines
if [[ -f "$WARM_URLS_FILE" ]]; then
    grep -v '^#' "$WARM_URLS_FILE" | grep -v '^$' | \
        xargs -n 1 -P "$CONCURRENCY" -I {} bash -c 'warm_cache "$@"' _ {}
else
    echo "Warning: URLs file not found: $WARM_URLS_FILE"
    exit 1
fi

echo ""
echo "Cache warming completed at $(date)"

Make the warming script executable:

sudo chmod 755 /usr/local/bin/varnish-warm.sh
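The same fan-out pattern as the bash script can be sketched in Python with concurrent.futures. The demo below runs against a throwaway local server standing in for Varnish; in practice `base` would point at your Varnish listener:

```python
import concurrent.futures
import http.server
import threading
import urllib.request

def warm(base_url: str, path: str, timeout: float = 30.0) -> tuple[str, int]:
    """Fetch one URL so the cache in front of it gets populated."""
    with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
        return path, resp.status

def warm_all(base_url: str, paths: list[str], concurrency: int = 10) -> dict[str, int]:
    """Warm paths in parallel, mirroring xargs -P in varnish-warm.sh."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        return dict(pool.map(lambda p: warm(base_url, p), paths))

# Demo against a throwaway local server (stands in for Varnish on :80)
server = http.server.HTTPServer(("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_address[1]}"
results = warm_all(base, ["/", "/"])
server.shutdown()
print(results)   # e.g. {'/': 200}
```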

Configure systemd timers for automated monitoring

Set up systemd services and timers for regular cache monitoring and maintenance.

Create /etc/systemd/system/varnish-monitor.service:

[Unit]
Description=Varnish Cache Monitoring
After=varnish.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/varnish-monitor.sh
User=varnish
Group=varnish

Create /etc/systemd/system/varnish-monitor.timer:

[Unit]
Description=Run Varnish Cache Monitoring every 5 minutes
Requires=varnish-monitor.service

[Timer]
OnCalendar=*:0/5
Persistent=true

[Install]
WantedBy=timers.target

Enable and start the monitoring timer:

sudo systemctl daemon-reload
sudo systemctl enable varnish-monitor.timer
sudo systemctl start varnish-monitor.timer

Start and verify Varnish configuration

Check that the VCL compiles, then restart Varnish and confirm the service is healthy.

varnishd -C -f /etc/varnish/default.vcl > /dev/null
sudo systemctl restart varnish
sudo systemctl status varnish

Verify your cache invalidation setup

Test the various cache invalidation methods to ensure they work correctly.

Test PURGE functionality

# Fetch a page to cache it
curl -I http://localhost/

PURGE the specific URL

curl -X PURGE http://localhost/

Use the purge script

/usr/local/bin/varnish-purge.sh purge /api/users

Test BAN operations

# BAN all URLs matching a pattern
curl -X BAN -H "X-Ban-Expression: req.url ~ /api/" http://localhost/

Use the purge script for BAN

/usr/local/bin/varnish-purge.sh ban 'req.url ~ /old-content/'

Test cache tag invalidation

# PURGE by cache tags
curl -X PURGETAG -H "X-Cache-Tags: user:123" http://localhost/

Use the purge script for tags

/usr/local/bin/varnish-purge.sh tag 'content:article,category:tech'

Monitor cache performance

# Run monitoring script
/usr/local/bin/varnish-monitor.sh

Check Varnish statistics

varnishstat

View active bans

sudo varnishadm ban.list

Check cache logs

sudo tail -f /var/log/varnish/purge.log

Advanced cache invalidation strategies

Implement webhook-based invalidation

Create a webhook endpoint that triggers cache invalidation when content management systems update content. Save it as /usr/local/bin/varnish-webhook.py.

#!/usr/bin/env python3

import json
import logging
import subprocess
from http.server import HTTPServer, BaseHTTPRequestHandler

# Configure logging
logging.basicConfig(
    filename='/var/log/varnish/webhook.log',
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)


class VarnishWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        try:
            content_length = int(self.headers['Content-Length'])
            post_data = self.rfile.read(content_length).decode('utf-8')

            # Parse JSON payload
            data = json.loads(post_data)

            # Extract invalidation parameters
            invalidation_type = data.get('type', 'purge')
            target = data.get('target', '/')
            tags = data.get('tags', [])

            # Execute the appropriate invalidation
            if invalidation_type == 'purge':
                result = subprocess.run(
                    ['/usr/local/bin/varnish-purge.sh', 'purge', target],
                    capture_output=True, text=True)
            elif invalidation_type == 'ban':
                result = subprocess.run(
                    ['/usr/local/bin/varnish-purge.sh', 'ban', target],
                    capture_output=True, text=True)
            elif invalidation_type == 'tag' and tags:
                tag_string = ','.join(tags)
                result = subprocess.run(
                    ['/usr/local/bin/varnish-purge.sh', 'tag', tag_string],
                    capture_output=True, text=True)
            else:
                raise ValueError(f"Invalid invalidation type: {invalidation_type}")

            if result.returncode == 0:
                logging.info(f"Webhook invalidation success: {invalidation_type} - {target}")
                self._respond(200, {'status': 'success', 'message': result.stdout})
            else:
                logging.error(f"Webhook invalidation failed: {result.stderr}")
                self._respond(500, {'status': 'error', 'message': result.stderr})

        except Exception as e:
            logging.error(f"Webhook error: {e}")
            self._respond(400, {'status': 'error', 'message': str(e)})

    def _respond(self, status, payload):
        self.send_response(status)
        self.send_header('Content-type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps(payload).encode('utf-8'))


if __name__ == '__main__':
    server = HTTPServer(('127.0.0.1', 8090), VarnishWebhookHandler)
    print("Varnish webhook server starting on port 8090...")
    logging.info("Varnish webhook server started on port 8090")
    server.serve_forever()

Make the webhook script executable:

sudo chmod 755 /usr/local/bin/varnish-webhook.py
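A CMS hook would POST a JSON payload shaped like the one below. This sketch exercises that payload format against an in-process stub handler — the stub stands in for the real webhook on port 8090 and does not invoke varnish-purge.sh:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class StubHandler(BaseHTTPRequestHandler):
    """Accepts the same payload shape as VarnishWebhookHandler."""
    def do_POST(self):
        length = int(self.headers['Content-Length'])
        data = json.loads(self.rfile.read(length))
        ok = data.get('type') in ('purge', 'ban', 'tag')
        self.send_response(200 if ok else 400)
        self.send_header('Content-type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps({'status': 'success' if ok else 'error'}).encode())
    def log_message(self, *args):   # keep output quiet
        pass

server = HTTPServer(('127.0.0.1', 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The payload shape the webhook expects
payload = {'type': 'tag', 'tags': ['user:123', 'content:article']}
req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_address[1]}/",
    data=json.dumps(payload).encode(),
    headers={'Content-Type': 'application/json'})
with urllib.request.urlopen(req) as resp:
    body = json.loads(resp.read())
server.shutdown()
print(body)   # -> {'status': 'success'}
```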

Create the webhook systemd service at /etc/systemd/system/varnish-webhook.service

[Unit]
Description=Varnish Cache Invalidation Webhook
After=network.target varnish.service

[Service]
Type=simple
ExecStart=/usr/local/bin/varnish-webhook.py
Restart=always
RestartSec=5
User=varnish
Group=varnish

[Install]
WantedBy=multi-user.target

Enable and start the webhook service:

sudo systemctl daemon-reload
sudo systemctl enable varnish-webhook.service
sudo systemctl start varnish-webhook.service

Integration with backend applications

Configure your backend applications to use Varnish cache tagging and trigger invalidations. For applications using NGINX as a reverse proxy, add cache tag headers to responses.

Example NGINX backend configuration

server {
    listen 8080;
    server_name example.com;
    
    location / {
        # Add cache tags based on content
        add_header X-Cache-Tags "content:page,user:$remote_user";
        
        # Handle cache invalidation requests
        if ($request_method = PURGE) {
            return 200 "Purged";
        }
        
        # Your application logic here
        proxy_pass http://app_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    
    location /api/ {
        add_header X-Cache-Tags "api:general";
        proxy_pass http://api_backend;
    }
}
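On the application side, any framework can emit the tag header. A minimal stdlib sketch of a backend response carrying X-Cache-Tags (the tag values are illustrative), verified with an in-process request:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class TaggedBackend(BaseHTTPRequestHandler):
    """Backend that labels each response with cache tags for Varnish."""
    def do_GET(self):
        body = b"article body"
        self.send_response(200)
        # Varnish stores this header with the object; PURGETAG bans match it
        self.send_header('X-Cache-Tags', 'content:article,category:tech')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):   # keep output quiet
        pass

server = HTTPServer(('127.0.0.1', 0), TaggedBackend)
threading.Thread(target=server.serve_forever, daemon=True).start()
with urllib.request.urlopen(f"http://127.0.0.1:{server.server_address[1]}/post") as resp:
    tags = resp.headers['X-Cache-Tags']
server.shutdown()
print(tags)   # -> content:article,category:tech
```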

Performance optimization techniques

For high-traffic scenarios, consider implementing advanced optimization strategies similar to those used in NGINX Lua scripting for caching.

Optimization | Configuration | Expected impact
Memory allocation | -s malloc,4g | Reduces disk I/O, increases hit rate
Worker threads | -p thread_pool_min=200 | Better concurrency handling
Grace time | set beresp.grace = 24h | Serves stale content during backend issues
ESI processing | -p feature=+esi_disable_xml_check | Faster ESI fragment processing

Common issues

Symptom | Cause | Fix
PURGE returns 405 error | Client IP not in purge ACL | Add client IP to purge_acl in VCL
BAN operations not working | Missing X-Ban-Expression header | Include proper ban expression in request
Cache tags not invalidating | Backend not setting X-Cache-Tags | Configure backend to send cache tag headers
High memory usage | Too many objects in cache | Reduce TTL values or increase memory allocation
Low hit rate after purging | No cache warming after invalidation | Run varnish-warm.sh after bulk invalidations
Monitoring scripts fail | Insufficient permissions | Ensure scripts run as varnish user or with sudo

