Set up keepalived with VRRP to create highly available network services with automatic failover. Configure virtual IP addresses that move between servers when one fails, so services stay reachable with only a few seconds of interruption during failover.
Prerequisites
- Two or more Linux servers with network connectivity
- Root or sudo access on all nodes
- Basic understanding of IP networking
- Network interface names (ens3, eth0, etc.)
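If you are unsure of the interface name on a node, it can be read from the output of `ip -o link show`. The snippet below is a minimal sketch; `iface_names` is a hypothetical helper name, not part of keepalived or iproute2.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print interface names from `ip -o link show` output on stdin.
iface_names() {
    # Each input line looks like "2: ens3: <BROADCAST,...> mtu 1500 ...".
    # Split on ": ", take the second field, and strip any "@parent" suffix
    # that VLAN or veth interfaces carry.
    awk -F': ' '{ sub(/@.*/, "", $2); print $2 }'
}

# Usage on a real host:
#   ip -o link show | iface_names
```

Use the name printed for the interface that carries your LAN traffic wherever this tutorial shows ens3.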
What this solves
Keepalived uses VRRP (Virtual Router Redundancy Protocol) to create high-availability network services by sharing virtual IP addresses between multiple servers. When the primary server fails, the backup automatically takes over the virtual IP, so the service stays available without manual intervention. This tutorial sets up a two-node keepalived cluster with health checks and automatic failover for production load balancing.
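As a rough mental model of the takeover described above, the node advertising the highest priority holds the virtual IP. The sketch below is a simplification, not keepalived's actual election code: `elect_master` is a hypothetical helper, the node names are made up, and tie-breaking and preemption settings are ignored.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name priority" lines on stdin and print the
# name with the highest priority. Ties and preemption are not modelled.
elect_master() {
    sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}

# Both nodes healthy: the primary (priority 110) holds the VIP.
printf 'lb-primary 110\nlb-backup 100\n' | elect_master
# Primary gone: the backup is the only candidate left.
printf 'lb-backup 100\n' | elect_master
```

The priorities 110 and 100 match the values used for the primary and backup nodes in this tutorial.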
Step-by-step installation
Update system packages
Start by updating your package manager to ensure you get the latest versions of keepalived and dependencies.
sudo apt update && sudo apt upgrade -y
Install keepalived
Install the keepalived package, which includes VRRP support and health checking, along with ipvsadm for inspecting the IPVS load-balancing table.
sudo apt install -y keepalived ipvsadm
Enable IP forwarding
Enable IP forwarding so the node can route traffic between interfaces, and allow non-local binds so services can bind to the virtual IP even while this node does not hold it.
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.conf
echo 'net.ipv4.ip_nonlocal_bind = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
Configure primary server keepalived
Create the keepalived configuration in /etc/keepalived/keepalived.conf on the primary server. This server has the higher priority and owns the virtual IP by default. The check weight of -20 exceeds the 10-point priority gap between the nodes, so a failed health check drops the primary below the backup and triggers failover.
global_defs {
    router_id LB_PRIMARY
    enable_script_security
    script_user root
}
vrrp_script chk_nginx {
    # curl -f exits non-zero on HTTP errors, which marks the check as failed
    script "/usr/bin/curl -fs -o /dev/null http://localhost:80"
    interval 2
    weight -20
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ens3
    virtual_router_id 51
    priority 110
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme123
    }
    virtual_ipaddress {
        203.0.113.10/24
    }
    track_script {
        chk_nginx
    }
    notify_master "/etc/keepalived/master.sh"
    notify_backup "/etc/keepalived/backup.sh"
    notify_fault "/etc/keepalived/fault.sh"
}
Configure backup server keepalived
Create /etc/keepalived/keepalived.conf on the backup server with a lower priority. Replace the interface name and adjust the IP addresses for your network.
global_defs {
    router_id LB_BACKUP
    enable_script_security
    script_user root
}
vrrp_script chk_nginx {
    # curl -f exits non-zero on HTTP errors, which marks the check as failed
    script "/usr/bin/curl -fs -o /dev/null http://localhost:80"
    interval 2
    weight -20
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens3
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme123
    }
    virtual_ipaddress {
        203.0.113.10/24
    }
    track_script {
        chk_nginx
    }
    notify_master "/etc/keepalived/master.sh"
    notify_backup "/etc/keepalived/backup.sh"
    notify_fault "/etc/keepalived/fault.sh"
}
Create notification scripts
Create scripts that run when the VRRP state changes. These scripts can update DNS, send alerts, or perform custom actions during failover. Save the following as /etc/keepalived/master.sh:
#!/bin/bash
echo "$(date): Becoming MASTER" >> /var/log/keepalived-state.log
# Add custom actions when becoming master
# Example: systemctl start nginx
# Example: curl -X POST https://monitoring.example.com/webhook
Make the script executable:
sudo chmod 755 /etc/keepalived/master.sh
Create backup and fault scripts
Create additional notification scripts for the backup and fault states to provide complete state tracking. Save the first script as /etc/keepalived/backup.sh:
#!/bin/bash
echo "$(date): Becoming BACKUP" >> /var/log/keepalived-state.log
# Add custom actions when becoming backup
# Example: systemctl stop nginx
Save the second script as /etc/keepalived/fault.sh:
#!/bin/bash
echo "$(date): Entering FAULT state" >> /var/log/keepalived-state.log
# Add custom actions when entering fault state
# Example: send alert email
Make both scripts executable:
sudo chmod 755 /etc/keepalived/backup.sh
sudo chmod 755 /etc/keepalived/fault.sh
Configure firewall rules
Allow VRRP traffic between the nodes. VRRP is IP protocol 112 rather than a TCP or UDP port, and its advertisements are sent to the multicast address 224.0.0.18.
sudo ufw allow in on ens3 to 224.0.0.18
sudo ufw allow in on ens3 proto vrrp
sudo ufw reload
Install and configure NGINX for testing
Install a web server on both nodes to test the health checks and failover. The test page shows each node's hostname, so you can tell which node answers on the virtual IP.
sudo apt install -y nginx
echo "Served by $(hostname)" | sudo tee /var/www/html/index.html
sudo systemctl enable --now nginx
Start keepalived services
Enable and start keepalived on both servers. The primary server should claim the virtual IP address within a few seconds.
sudo systemctl enable --now keepalived
sudo systemctl status keepalived
Configure advanced health checks
Create a more comprehensive health check script that monitors the HTTP service, system load, and memory usage. Save it as /etc/keepalived/check_services.sh:
#!/bin/bash
# Check HTTP response (give up after 2 seconds)
if ! curl -f -s --max-time 2 http://localhost:80 > /dev/null; then
    echo "HTTP check failed"
    exit 1
fi
# Check the 1-minute system load average
load=$(cut -d ' ' -f 1 /proc/loadavg)
if awk -v l="$load" 'BEGIN { exit !(l > 10.0) }'; then
    echo "System load too high: $load"
    exit 1
fi
# Check memory usage (used / total, in percent)
mem_usage=$(free | awk '/^Mem:/ { printf("%.1f", $3 / $2 * 100.0) }')
if awk -v m="$mem_usage" 'BEGIN { exit !(m > 90.0) }'; then
    echo "Memory usage too high: $mem_usage%"
    exit 1
fi
echo "All health checks passed"
exit 0
Make the script executable:
sudo chmod 755 /etc/keepalived/check_services.sh
Update configuration with advanced checks
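Modify the keepalived configuration to use the comprehensive health check script instead of the simple curl command. Replace the chk_nginx block with the definition below and reference chk_services in the track_script block.
vrrp_script chk_services {
    script "/etc/keepalived/check_services.sh"
    interval 5
    weight -20
    fall 2
    rise 2
    timeout 3
}
Restart keepalived to apply the change:
sudo systemctl restart keepalived
Configure multiple virtual IPs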
Add secondary VRRP instance
Configure a second VRRP instance with a different virtual IP and virtual_router_id for additional services or load distribution.
vrrp_instance VI_2 {
    state MASTER
    interface ens3
    virtual_router_id 52
    priority 110
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme456
    }
    virtual_ipaddress {
        203.0.113.11/24
    }
    track_script {
        chk_services
    }
}
Configure VRRP with load balancer backend
Add virtual server configuration
Configure keepalived to act as a load balancer with real servers for distributing traffic beyond simple IP failover.
virtual_server 203.0.113.10 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP
    real_server 192.168.1.10 80 {
        weight 1
        HTTP_GET {
            url {
                path /health
                status_code 200
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.1.11 80 {
        weight 1
        HTTP_GET {
            url {
                path /health
                status_code 200
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
Verify your setup
# Check keepalived status
sudo systemctl status keepalived
# View virtual IP assignments
ip addr show
# Check VRRP state
sudo journalctl -u keepalived -f
# Test virtual IP connectivity
ping -c 3 203.0.113.10
curl http://203.0.113.10
# View the IPVS virtual server table
sudo ipvsadm -ln
# Check notification script logs
sudo tail -f /var/log/keepalived-state.log
Run ip addr show on both servers to verify which node currently owns the VIP.
Test automatic failover
Simulate primary server failure
Test the failover mechanism by stopping services or disconnecting the primary server to verify automatic VIP migration.
# Stop keepalived on primary to trigger failover
sudo systemctl stop keepalived
# Or stop the monitored service
sudo systemctl stop nginx
# Watch failover in logs on backup server
sudo journalctl -u keepalived -f
Monitor failover timing
Use these commands to monitor how quickly failover occurs and verify the backup server assumes the virtual IP.
# Monitor VIP changes
watch -n 1 'ip addr show | grep 203.0.113.10'
# Test connectivity during failover (sleep even when a request fails)
while true; do curl -m 2 http://203.0.113.10; sleep 1; done
# Check VRRP advertisements
sudo tcpdump -i ens3 vrrp
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Both nodes claim master | VRRP traffic blocked by firewall | Open protocol 112 and multicast 224.0.0.18 |
| Virtual IP not responding | Network interface mismatch | Verify interface name matches in config |
| Constant failover flapping | Health check too aggressive | Increase interval and fall/rise thresholds |
| Authentication errors in logs | Password mismatch between nodes | Ensure auth_pass identical on both servers |
| Permission denied on scripts | Notification scripts not executable | sudo chmod 755 /etc/keepalived/*.sh |
| VRRP process exits | Invalid router_id conflict | Use unique virtual_router_id for each VRRP instance |
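Two pieces of arithmetic help when tuning the health check values behind the flapping and failover rows above. A check is declared down after roughly interval × fall seconds of consecutive failures, and keepalived adds the script's negative weight to the node's priority while the check is failing, so the backup takes over only if the result falls below its own priority. The sketch below assumes the priorities 110 and 100 used in this tutorial; the function names are hypothetical, the weights are example values, and a tie is treated as no failover.

```shell
#!/usr/bin/env bash
# Approximate seconds of consecutive failures before a check is marked down.
# Args: interval (seconds between runs), fall (failed runs required)
check_detect_seconds() {
    echo $(( $1 * $2 ))
}

# Succeeds (exit 0) if a failed check pushes the master below the backup.
# Args: master priority, script weight (negative), backup priority
will_fail_over() {
    local effective=$(( $1 + $2 ))
    [ "$effective" -lt "$3" ]
}

check_detect_seconds 2 3   # interval 2, fall 3
if will_fail_over 110 -20 100; then echo "weight -20: backup takes over"; fi
if ! will_fail_over 110 -2 100; then echo "weight -2: master keeps the VIP"; fi
```

Raising interval or fall trades slower detection for fewer spurious failovers; the weight only needs to exceed the priority gap between the nodes.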
Next steps
- Optimize HAProxy performance with connection pooling and advanced load balancing algorithms
- Configure Linux firewall rules with fail2ban for SSH brute force protection and intrusion prevention
- Configure keepalived with HAProxy backend health monitoring
- Set up keepalived cluster monitoring with Prometheus alerts
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Configuration variables
VIRTUAL_IP=""
INTERFACE=""
NODE_TYPE=""
AUTH_PASS="changeme123"
PRIORITY=""
usage() {
    echo "Usage: $0 -i VIRTUAL_IP -n INTERFACE -t NODE_TYPE [-p AUTH_PASS]"
    echo " -i: Virtual IP address (e.g., 203.0.113.10)"
    echo " -n: Network interface (e.g., eth0, ens3)"
    echo " -t: Node type (master|backup)"
    echo " -p: Optional auth password (default: changeme123)"
    exit 1
}
log() {
    echo -e "${GREEN}[INFO]${NC} $1"
}
warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}
error() {
    echo -e "${RED}[ERROR]${NC} $1" >&2
    exit 1
}
# Stop and disable keepalived if any later step fails
cleanup() {
    warn "Installation failed. Cleaning up..."
    systemctl stop keepalived 2>/dev/null || true
    systemctl disable keepalived 2>/dev/null || true
}
trap cleanup ERR
# Parse arguments
while getopts "i:n:t:p:h" opt; do
    case $opt in
        i) VIRTUAL_IP="$OPTARG" ;;
        n) INTERFACE="$OPTARG" ;;
        t) NODE_TYPE="$OPTARG" ;;
        p) AUTH_PASS="$OPTARG" ;;
        h) usage ;;
        *) usage ;;
    esac
done
# Validate arguments
[[ -z "$VIRTUAL_IP" || -z "$INTERFACE" || -z "$NODE_TYPE" ]] && usage
[[ "$NODE_TYPE" != "master" && "$NODE_TYPE" != "backup" ]] && error "Node type must be 'master' or 'backup'"
# Check if running as root
[[ $EUID -ne 0 ]] && error "This script must be run as root"
# Detect distribution
if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
        ubuntu|debian)
            PKG_MGR="apt"
            PKG_UPDATE="apt update && apt upgrade -y"
            PKG_INSTALL="apt install -y"
            FIREWALL_CMD="ufw"
            ;;
        almalinux|rocky|centos|rhel|ol|fedora)
            PKG_MGR="dnf"
            PKG_UPDATE="dnf update -y"
            PKG_INSTALL="dnf install -y"
            FIREWALL_CMD="firewall-cmd"
            ;;
        amzn)
            PKG_MGR="yum"
            PKG_UPDATE="yum update -y"
            PKG_INSTALL="yum install -y"
            FIREWALL_CMD="firewall-cmd"
            ;;
        *) error "Unsupported distribution: $ID" ;;
    esac
else
    error "Cannot detect distribution. /etc/os-release not found."
fi
# Set priority based on node type
if [[ "$NODE_TYPE" == "master" ]]; then
    PRIORITY=110
    STATE="MASTER"
    ROUTER_ID="LB_PRIMARY"
else
    PRIORITY=100
    STATE="BACKUP"
    ROUTER_ID="LB_BACKUP"
fi
log "[1/9] Updating system packages..."
eval "$PKG_UPDATE"
log "[2/9] Installing keepalived and dependencies..."
$PKG_INSTALL keepalived ipvsadm curl
log "[3/9] Enabling IP forwarding..."
# Write a dedicated drop-in file so re-running the script does not append duplicates
cat > /etc/sysctl.d/99-keepalived.conf << 'EOF'
net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
EOF
sysctl -p /etc/sysctl.d/99-keepalived.conf
log "[4/9] Creating keepalived configuration..."
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    router_id $ROUTER_ID
    enable_script_security
    script_user root
}
vrrp_script chk_nginx {
    # curl -f exits non-zero on HTTP errors, which marks the check as failed
    script "/usr/bin/curl -fs -o /dev/null http://localhost:80"
    interval 2
    # The weight must exceed the 10-point priority gap between master and backup
    weight -20
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state $STATE
    interface $INTERFACE
    virtual_router_id 51
    priority $PRIORITY
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass $AUTH_PASS
    }
    virtual_ipaddress {
        $VIRTUAL_IP/24
    }
    track_script {
        chk_nginx
    }
    notify_master "/etc/keepalived/master.sh"
    notify_backup "/etc/keepalived/backup.sh"
    notify_fault "/etc/keepalived/fault.sh"
}
EOF
log "[5/9] Creating notification scripts..."
cat > /etc/keepalived/master.sh << 'EOF'
#!/bin/bash
echo "$(date): Becoming MASTER" >> /var/log/keepalived-state.log
# Add custom actions when becoming master
# Example: systemctl start nginx
EOF
cat > /etc/keepalived/backup.sh << 'EOF'
#!/bin/bash
echo "$(date): Becoming BACKUP" >> /var/log/keepalived-state.log
# Add custom actions when becoming backup
EOF
cat > /etc/keepalived/fault.sh << 'EOF'
#!/bin/bash
echo "$(date): Entering FAULT state" >> /var/log/keepalived-state.log
# Add custom actions when entering fault state
EOF
chmod 755 /etc/keepalived/master.sh
chmod 755 /etc/keepalived/backup.sh
chmod 755 /etc/keepalived/fault.sh
chmod 644 /etc/keepalived/keepalived.conf
log "[6/9] Creating log file..."
touch /var/log/keepalived-state.log
chmod 644 /var/log/keepalived-state.log
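log "[7/9] Configuring firewall..."
if [[ "$FIREWALL_CMD" == "ufw" ]]; then
    # Add the rule only when ufw is already active. Force-enabling ufw here
    # could cut off SSH on hosts that have no allow rule for it yet.
    if ufw status 2>/dev/null | grep -q "Status: active"; then
        # VRRP is IP protocol 112, not a port; allow its multicast address
        ufw allow in on "$INTERFACE" to 224.0.0.18 || warn "Could not add ufw rule for VRRP"
    else
        warn "ufw is not active; skipping firewall rule"
    fi
elif [[ "$FIREWALL_CMD" == "firewall-cmd" ]]; then
    systemctl enable firewalld --now 2>/dev/null || true
    firewall-cmd --permanent --add-protocol=vrrp 2>/dev/null || true
    firewall-cmd --permanent --add-rich-rule="rule protocol value='112' accept" 2>/dev/null || true
    firewall-cmd --reload 2>/dev/null || true
fi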
log "[8/9] Starting and enabling keepalived..."
systemctl enable keepalived
systemctl start keepalived
log "[9/9] Verifying installation..."
sleep 3
if systemctl is-active --quiet keepalived; then
    log "Keepalived is running successfully"
else
    error "Keepalived failed to start"
fi
if [[ "$NODE_TYPE" == "backup" ]]; then
    log "Backup node configured successfully"
elif ip addr show | grep -qF "$VIRTUAL_IP"; then
    log "Virtual IP $VIRTUAL_IP is assigned to this node"
else
    warn "Virtual IP $VIRTUAL_IP is not assigned to this node yet; check: journalctl -u keepalived"
fi
log "Installation completed successfully!"
log "Configuration file: /etc/keepalived/keepalived.conf"
log "Log file: /var/log/keepalived-state.log"
warn "Remember to:"
warn "1. Install and configure your application (nginx, apache, etc.)"
warn "2. Update the health check script in keepalived.conf if needed"
warn "3. Configure the same setup on the other node with opposite type"
warn "4. Change the default auth_pass in production"
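Review the script before running. Save it as install.sh and run it as root with the required arguments, for example: sudo bash install.sh -i 203.0.113.10 -n ens3 -t master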
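The script passes the value of -i straight into keepalived.conf without checking it. If you want to reject malformed addresses before any changes are made, a check along these lines could be added after argument parsing. This is a sketch only; `is_ipv4` is a hypothetical helper that is not part of the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) only for a dotted-quad IPv4 address
# whose four octets are each in the range 0-255.
is_ipv4() {
    local ip=$1 octet
    # Shape check: four groups of 1-3 digits separated by dots
    [[ $ip =~ ^[0-9]{1,3}(\.[0-9]{1,3}){3}$ ]] || return 1
    # Range check: split on dots and compare each octet as a base-10 number
    local IFS=.
    for octet in $ip; do
        (( 10#$octet <= 255 )) || return 1
    done
}

# Example: validate a value before using it as the virtual IP
if is_ipv4 "203.0.113.10"; then
    echo "valid virtual IP"
fi
```

In the install script, the same test could guard the argument, for example by calling error when is_ipv4 "$VIRTUAL_IP" fails.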