Set up HAProxy high availability with keepalived clustering for automatic failover

Advanced 45 min Apr 28, 2026 77 views
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure HAProxy load balancer with keepalived VRRP clustering for automatic failover. Set up virtual IP failover, health checks, and monitor the cluster for production high availability.

Prerequisites

  • Two Linux servers with network connectivity
  • Root or sudo access on both servers
  • Backend servers to load balance
  • Basic understanding of networking concepts

What this solves

HAProxy provides load balancing, but a single HAProxy instance creates a single point of failure. This tutorial sets up two HAProxy nodes with keepalived for VRRP (Virtual Router Redundancy Protocol) clustering. When the primary HAProxy fails, keepalived automatically moves the virtual IP to the backup node, ensuring continuous service availability without manual intervention.

Architecture overview

You'll configure two HAProxy servers (primary and backup) with keepalived managing a shared virtual IP address. The primary node owns the virtual IP and handles all traffic. If the primary fails health checks, keepalived promotes the backup node to primary and moves the virtual IP automatically.

ComponentPrimary NodeBackup NodeVirtual IP
HAProxy203.0.113.10203.0.113.11203.0.113.100
keepalivedPriority 110Priority 100Floats between nodes
Backend servers203.0.113.20203.0.113.21Load balanced targets

Step-by-step configuration

Update system packages on both nodes

Start by updating both HAProxy nodes to ensure you get the latest packages and security updates.

sudo apt update && sudo apt upgrade -y
sudo dnf update -y

Install HAProxy and keepalived on both nodes

Install both HAProxy for load balancing and keepalived for VRRP clustering on each node.

sudo apt install -y haproxy keepalived
sudo dnf install -y haproxy keepalived

Enable IP forwarding

Enable kernel IP forwarding on both nodes to allow keepalived to manage the virtual IP address routing.

echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.conf
echo 'net.ipv4.ip_nonlocal_bind = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

Configure HAProxy on both nodes

Create an identical HAProxy configuration on both nodes. This config includes health checks for backend servers and statistics monitoring.

global
    log stdout local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option redispatch
    retries 3
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend web_frontend
    bind *:80
    bind *:443
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server web1 203.0.113.20:80 check inter 3000 rise 2 fall 3
    server web2 203.0.113.21:80 check inter 3000 rise 2 fall 3

listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s
    stats admin if TRUE

Configure keepalived on the primary node

Configure keepalived on the primary node (203.0.113.10) with higher priority and health check script for HAProxy monitoring.

global_defs {
    router_id LB_DEVEL_PRIMARY
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
    weight 2
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 110
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme123
    }
    virtual_ipaddress {
        203.0.113.100
    }
    track_script {
        chk_haproxy
    }
    notify_master "/etc/keepalived/notify_master.sh"
    notify_backup "/etc/keepalived/notify_backup.sh"
    notify_fault "/etc/keepalived/notify_fault.sh"
}

Configure keepalived on the backup node

Configure keepalived on the backup node (203.0.113.11) with lower priority. The configuration is identical except for state, priority, and router_id.

global_defs {
    router_id LB_DEVEL_BACKUP
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
    weight 2
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme123
    }
    virtual_ipaddress {
        203.0.113.100
    }
    track_script {
        chk_haproxy
    }
    notify_master "/etc/keepalived/notify_master.sh"
    notify_backup "/etc/keepalived/notify_backup.sh"
    notify_fault "/etc/keepalived/notify_fault.sh"
}

Create keepalived notification scripts

Create notification scripts on both nodes to log state changes and send alerts when failover occurs.

#!/bin/bash
echo "$(date): Became MASTER" | logger -t keepalived

Optional: Send email or webhook notification

curl -X POST https://your-webhook-url.com/alerts -d "HAProxy cluster: $(hostname) became MASTER"

#!/bin/bash
echo "$(date): Became BACKUP" | logger -t keepalived

Optional: Send email or webhook notification

curl -X POST https://your-webhook-url.com/alerts -d "HAProxy cluster: $(hostname) became BACKUP"

#!/bin/bash
echo "$(date): Entered FAULT state" | logger -t keepalived

Optional: Send critical alert

curl -X POST https://your-webhook-url.com/alerts -d "CRITICAL: HAProxy cluster: $(hostname) FAULT"

Make the scripts executable on both nodes:

sudo chmod +x /etc/keepalived/notify_*.sh

Configure firewall rules for VRRP

Open the required ports for VRRP communication, HAProxy, and monitoring. VRRP uses protocol 112 for communication between nodes.

sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow 8404/tcp
sudo ufw allow from 203.0.113.10 to any port 112
sudo ufw allow from 203.0.113.11 to any port 112
sudo ufw --force enable
sudo firewall-cmd --permanent --add-port=80/tcp
sudo firewall-cmd --permanent --add-port=443/tcp
sudo firewall-cmd --permanent --add-port=8404/tcp
sudo firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='203.0.113.10' protocol value='vrrp' accept"
sudo firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='203.0.113.11' protocol value='vrrp' accept"
sudo firewall-cmd --reload

Start and enable services on both nodes

Enable and start HAProxy and keepalived services. Start HAProxy first, then keepalived to ensure proper health check initialization.

sudo systemctl enable --now haproxy
sudo systemctl enable --now keepalived

Configure health checks with monitoring script

Create a more sophisticated health check script that monitors both HAProxy process and port availability.

#!/bin/bash

Check if HAProxy process is running

if ! pgrep haproxy > /dev/null; then logger -t haproxy-check "HAProxy process not found" exit 1 fi

Check if HAProxy is listening on port 80

if ! netstat -tlnp | grep :80 | grep haproxy > /dev/null; then logger -t haproxy-check "HAProxy not listening on port 80" exit 1 fi

Check HAProxy stats page

if ! curl -s http://localhost:8404/stats > /dev/null; then logger -t haproxy-check "HAProxy stats page unreachable" exit 1 fi logger -t haproxy-check "HAProxy health check passed" exit 0

Make the script executable and update keepalived configuration:

sudo chmod +x /usr/local/bin/check_haproxy.sh

Update the vrrp_script section in /etc/keepalived/keepalived.conf on both nodes:

vrrp_script chk_haproxy {
    script "/usr/local/bin/check_haproxy.sh"
    interval 3
    weight 2
    fall 3
    rise 2
    timeout 2
}

Test automatic failover

Verify initial cluster state

Check which node currently owns the virtual IP and confirm both nodes are running properly.

# Check virtual IP assignment
ip addr show | grep 203.0.113.100

Check keepalived status

sudo systemctl status keepalived

Check HAProxy status

sudo systemctl status haproxy

View keepalived logs

sudo journalctl -u keepalived -f

Test HAProxy failover

Stop HAProxy on the primary node to trigger automatic failover to the backup node.

# Stop HAProxy to trigger failover
sudo systemctl stop haproxy

Watch keepalived logs for state change

sudo journalctl -u keepalived -f

On the backup node, verify it became the new master:

# Check if this node now has the virtual IP
ip addr show | grep 203.0.113.100

Check keepalived logs

sudo journalctl -u keepalived -f

Test service recovery

Restart HAProxy on the original primary node and verify it becomes backup (or master if configured with higher priority).

# Start HAProxy again
sudo systemctl start haproxy

Check current state

sudo journalctl -u keepalived --since "1 minute ago"

Monitor cluster health

Set up monitoring commands to track cluster status and performance. For comprehensive monitoring, consider integrating with Prometheus and Grafana.

#!/bin/bash

Save as /usr/local/bin/cluster-status.sh

echo "=== HAProxy Cluster Status ===" echo "Date: $(date)" echo

Check which node has VIP

echo "Virtual IP Status:" ip addr show | grep -A 1 -B 1 203.0.113.100 || echo "Virtual IP not found on this node" echo

Check keepalived state

echo "Keepalived State:" sudo systemctl is-active keepalived echo

Check HAProxy state

echo "HAProxy State:" sudo systemctl is-active haproxy echo

Show recent keepalived events

echo "Recent Keepalived Events:" sudo journalctl -u keepalived --since "10 minutes ago" --no-pager echo

Show HAProxy stats

echo "HAProxy Backend Status:" curl -s http://localhost:8404/stats | grep -E "web1|web2" | awk -F, '{print $1 ": " $18}' 2>/dev/null || echo "Stats unavailable"

Verify your setup

# Test virtual IP responds
curl -I http://203.0.113.100

Check HAProxy stats

curl http://203.0.113.100:8404/stats

Verify both nodes are in cluster

sudo ip addr show | grep 203.0.113.100

Check service status on both nodes

sudo systemctl status haproxy keepalived

View cluster communication

sudo tcpdump -i any vrrp

Common issues

SymptomCauseFix
Split-brain (both nodes master)Firewall blocks VRRP trafficOpen protocol 112 between nodes: sudo ufw allow from peer_ip
Virtual IP not accessibleInterface binding issuesCheck interface name in config matches: ip link show
Failover not happeningHealth check script failingTest script manually: /usr/local/bin/check_haproxy.sh
Authentication errors in logsMismatched VRRP passwordsEnsure auth_pass identical on both nodes
VRRP instance flappingNetwork latency or packet lossIncrease advert_int to 3 seconds in config
Backend servers show as downHealth check URL not respondingVerify /health endpoint exists on backends
Security note: Change the default VRRP authentication password from 'changeme123' to a strong password in production. Use the same password on both nodes for proper cluster communication.

Next steps

Running this in production?

Want this handled for you? Running this at scale adds a second layer of work: capacity planning, failover drills, cost control, and on-call. See how we run infrastructure like this for European teams.

Automated install script

Run this to automate the entire setup

Need help?

Don't want to manage this yourself?

We handle private cloud infrastructure for businesses that depend on uptime. From initial setup to ongoing operations.