Set up iptables high availability clustering with keepalived for automatic failover

Advanced · 45 min · Apr 14, 2026
Ubuntu 24.04 · Debian 12 · AlmaLinux 9 · Rocky Linux 9

Configure a highly available firewall cluster using iptables and keepalived with VRRP for automatic failover. Set up rule synchronization between nodes and implement monitoring for production-grade firewall redundancy.

Prerequisites

  • Two Linux servers on the same subnet (VRRP advertisements are link-local multicast)
  • Root or sudo access on both nodes
  • Basic understanding of iptables and networking concepts
  • SSH access between nodes

What this solves

This tutorial sets up a high availability iptables firewall cluster using keepalived and VRRP (Virtual Router Redundancy Protocol). When your primary firewall fails, the secondary node automatically takes over with synchronized rules, ensuring continuous network security without manual intervention.

Step-by-step configuration

Update system packages

Start by updating your package manager and installing the required dependencies on both firewall nodes.

On Debian/Ubuntu:

sudo apt update && sudo apt upgrade -y
sudo apt install -y keepalived iptables-persistent rsync

On AlmaLinux/Rocky:

sudo dnf update -y
sudo dnf install -y keepalived iptables-services rsync

Enable IP forwarding and configure kernel parameters

Enable packet forwarding and the kernel parameters required for VRRP operation on both nodes. Add the following to /etc/sysctl.conf (or a drop-in file under /etc/sysctl.d/):

net.ipv4.ip_forward = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0

Then apply the settings:

sudo sysctl -p
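To confirm the settings took effect, the values can be read back from /proc without any extra tooling:

```shell
# Read the forwarding flag straight from the kernel; 1 means the
# sysctl above is active on this boot
cat /proc/sys/net/ipv4/ip_forward
```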

Create base iptables rules

Set up identical base firewall rules on both nodes; these will be synchronized later. The rules protect the cluster itself and allow keepalived communication. Save them in iptables-save format to /etc/iptables/rules.v4 — the *filter header, chain policy lines, and closing COMMIT are required or iptables-restore will reject the file. Note that the iptables-services package on AlmaLinux/Rocky loads /etc/sysconfig/iptables at boot, while the scripts in this tutorial read and write /etc/iptables/rules.v4, so keep both paths in sync there (a symlink works).

*filter
# Default policies
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]

# Allow loopback traffic
-A INPUT -i lo -j ACCEPT
-A OUTPUT -o lo -j ACCEPT

# Allow established connections
-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow SSH (adjust port as needed)
-A INPUT -p tcp --dport 22 -j ACCEPT

# Allow VRRP traffic between keepalived nodes (IP protocol 112)
-A INPUT -p vrrp -j ACCEPT
-A OUTPUT -p vrrp -j ACCEPT

# Allow keepalived multicast (224.0.0.18 is the VRRP group address)
-A INPUT -d 224.0.0.18/32 -j ACCEPT
-A OUTPUT -d 224.0.0.18/32 -j ACCEPT

# Allow HTTP/HTTPS traffic to pass through
-A FORWARD -p tcp -m multiport --dports 80,443 -j ACCEPT

# Allow DNS
-A FORWARD -p udp --dport 53 -j ACCEPT
-A FORWARD -p tcp --dport 53 -j ACCEPT

COMMIT
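Before pointing iptables-restore at a saved file, it is worth checking it is structurally complete. This is a minimal sketch using a hypothetical helper, check_rules_file, which is not part of any tool in this setup — it only verifies the two framing lines iptables-restore insists on:

```shell
# A saved rules file must be a complete iptables-save unit: a *filter
# table header and a closing COMMIT line, or iptables-restore rejects it.
# check_rules_file is a hypothetical helper for this structural check.
check_rules_file() {
    grep -q '^\*filter' "$1" && grep -q '^COMMIT' "$1"
}

# Demonstrate on a throwaway example file
cat > /tmp/rules.v4.example <<'EOF'
*filter
-A INPUT -i lo -j ACCEPT
COMMIT
EOF

check_rules_file /tmp/rules.v4.example && echo "rules file looks complete"
```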

Configure keepalived on the primary node

Set up the master keepalived configuration in /etc/keepalived/keepalived.conf, giving this node the higher priority. Replace the interface and IP addresses with your network settings. Be aware that keepalived only uses the first 8 characters of auth_pass, so keep it short and identical on both nodes.

global_defs {
    router_id FIREWALL_MASTER
    enable_script_security
    script_user keepalived_script
}

vrrp_script chk_iptables {
    script "/usr/local/bin/check_iptables.sh"
    interval 5
    weight -50
    fall 2
    rise 1
    timeout 3
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 110
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass firewall_cluster_2024
    }
    virtual_ipaddress {
        203.0.113.100/24
    }
    track_script {
        chk_iptables
    }
    notify_master "/usr/local/bin/master_notify.sh"
    notify_backup "/usr/local/bin/backup_notify.sh"
    notify_fault "/usr/local/bin/fault_notify.sh"
}
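The priority and weight values determine when failover happens. When chk_iptables fails, keepalived adds the script's weight to the node's priority, and the node with the highest effective priority wins the next VRRP election. A minimal arithmetic sketch using the values from the configs above:

```shell
# Effective-priority arithmetic behind the failover. When chk_iptables
# fails on the master, its priority drops by the script weight and falls
# below the healthy backup's, so the backup takes over the VIP.
MASTER_PRIORITY=110
BACKUP_PRIORITY=100
SCRIPT_WEIGHT=-50

EFFECTIVE=$((MASTER_PRIORITY + SCRIPT_WEIGHT))
echo "degraded master priority: $EFFECTIVE"
if [ "$EFFECTIVE" -lt "$BACKUP_PRIORITY" ]; then
    echo "backup (priority $BACKUP_PRIORITY) wins the next election"
fi
```

This is also why the gap between the two priorities (10) must be smaller than the absolute script weight (50): otherwise a failed health check would not demote the master far enough to trigger failover.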

Configure keepalived on the backup node

Set up the backup node's /etc/keepalived/keepalived.conf with a lower priority. The virtual IP, virtual_router_id, and authentication settings must match the master exactly.

global_defs {
    router_id FIREWALL_BACKUP
    enable_script_security
    script_user keepalived_script
}

vrrp_script chk_iptables {
    script "/usr/local/bin/check_iptables.sh"
    interval 5
    weight -50
    fall 2
    rise 1
    timeout 3
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass firewall_cluster_2024
    }
    virtual_ipaddress {
        203.0.113.100/24
    }
    track_script {
        chk_iptables
    }
    notify_master "/usr/local/bin/master_notify.sh"
    notify_backup "/usr/local/bin/backup_notify.sh"
    notify_fault "/usr/local/bin/fault_notify.sh"
}

Create the iptables health check script

This script verifies that iptables is running and responsive. Create it on both nodes as /usr/local/bin/check_iptables.sh with identical content.

#!/bin/bash

# Check that a firewall persistence service is running: iptables on
# AlmaLinux/Rocky, netfilter-persistent on Debian/Ubuntu
if ! systemctl is-active --quiet iptables 2>/dev/null; then
    if ! systemctl is-active --quiet netfilter-persistent 2>/dev/null; then
        exit 1
    fi
fi

# Test basic iptables functionality
if ! iptables -L INPUT -n >/dev/null 2>&1; then
    exit 1
fi

# Read packet counters to confirm the rules are active
if ! iptables -L INPUT -n -v >/dev/null 2>&1; then
    exit 1
fi

exit 0

Make it executable:

sudo chmod 755 /usr/local/bin/check_iptables.sh

Create notification scripts

Set up the scripts that handle state transitions and rule synchronization. These keep rules synchronized when roles change.

/usr/local/bin/master_notify.sh:

#!/bin/bash

LOGFILE="/var/log/keepalived-transitions.log"
BACKUP_NODE="203.0.113.101"
RULES_FILE="/etc/iptables/rules.v4"

echo "$(date): Transitioning to MASTER state" >> $LOGFILE

# Ensure our iptables rules are loaded
if command -v iptables-restore >/dev/null 2>&1; then
    iptables-restore < $RULES_FILE
fi

# Synchronize rules to the backup node
if ping -c 1 $BACKUP_NODE >/dev/null 2>&1; then
    rsync -avz $RULES_FILE root@$BACKUP_NODE:$RULES_FILE
    ssh root@$BACKUP_NODE "iptables-restore < $RULES_FILE" 2>/dev/null
fi

echo "$(date): MASTER transition completed" >> $LOGFILE

/usr/local/bin/backup_notify.sh:

#!/bin/bash

LOGFILE="/var/log/keepalived-transitions.log"

echo "$(date): Transitioning to BACKUP state" >> $LOGFILE

# Ensure iptables rules are still loaded
if [ -f /etc/iptables/rules.v4 ]; then
    iptables-restore < /etc/iptables/rules.v4
fi

echo "$(date): BACKUP transition completed" >> $LOGFILE

/usr/local/bin/fault_notify.sh:

#!/bin/bash

LOGFILE="/var/log/keepalived-transitions.log"

echo "$(date): Node entered FAULT state" >> $LOGFILE

# Log system status for troubleshooting
echo "$(date): System status during fault:" >> $LOGFILE
systemctl status iptables >> $LOGFILE 2>&1
systemctl status netfilter-persistent >> $LOGFILE 2>&1
echo "$(date): FAULT logging completed" >> $LOGFILE

Make all three executable:

sudo chmod 755 /usr/local/bin/master_notify.sh
sudo chmod 755 /usr/local/bin/backup_notify.sh
sudo chmod 755 /usr/local/bin/fault_notify.sh
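keepalived invokes notify scripts with positional arguments: $1 is "INSTANCE" or "GROUP", $2 is the instance name, $3 is the new state, and $4 is the priority. The scripts above ignore them, but they can be folded into the log line; a minimal sketch, with log_transition as a hypothetical helper:

```shell
# Sketch of capturing keepalived's notify arguments in a log line.
# keepalived passes: type ("INSTANCE"/"GROUP"), name, state, priority.
log_transition() {
    echo "$(date): type=$1 name=$2 state=$3 priority=$4"
}

# Example invocation, as keepalived would make it on failover
log_transition INSTANCE VI_1 MASTER 110
```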

Create keepalived script user

Create a dedicated user for running keepalived's scripts with minimal privileges. Note that with enable_script_security and script_user set, keepalived runs the check and notify scripts as this user, and the scripts above call iptables, rsync, and ssh, which require root. If the health check always fails for permission reasons, grant the user the needed commands via a sudoers entry or drop the script_user line so the scripts run as root.

sudo useradd -r -s /bin/false keepalived_script
sudo touch /var/log/keepalived-transitions.log
sudo chown keepalived_script:keepalived_script /var/log/keepalived-transitions.log

Set up SSH key authentication between nodes

Configure passwordless SSH authentication for rule synchronization. Run this on the primary node first.

sudo ssh-keygen -t ed25519 -f /root/.ssh/firewall_cluster -N ""
sudo ssh-copy-id -i /root/.ssh/firewall_cluster.pub root@203.0.113.101

Then configure SSH to use the key automatically by adding this to /root/.ssh/config:

Host 203.0.113.101
    IdentityFile /root/.ssh/firewall_cluster
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null

Disabling host key checking is convenient on a closed cluster network, but pinning the backup node's host key in known_hosts is safer in production.

Enable and start services

Start the firewall and keepalived services on both nodes and enable them to start automatically on boot.

On Debian/Ubuntu:

sudo iptables-restore < /etc/iptables/rules.v4
sudo systemctl enable netfilter-persistent
sudo systemctl enable --now keepalived

On AlmaLinux/Rocky:

sudo iptables-restore < /etc/iptables/rules.v4
sudo systemctl enable --now iptables
sudo systemctl enable --now keepalived

Create rule synchronization script

Set up automated rule synchronization that can be triggered manually or from cron for ongoing rule management. Create /usr/local/bin/sync_iptables.sh:

#!/bin/bash

VIP="203.0.113.100"
BACKUP_NODE="203.0.113.101"
RULES_FILE="/etc/iptables/rules.v4"
LOGFILE="/var/log/iptables-sync.log"

# Only sync when this node holds the virtual IP (i.e. is the master)
if ip addr show | grep -q $VIP; then
    echo "$(date): Syncing rules from master to backup" >> $LOGFILE
    # Save current rules
    iptables-save > $RULES_FILE
    # Sync to backup node
    if rsync -avz $RULES_FILE root@$BACKUP_NODE:$RULES_FILE; then
        ssh root@$BACKUP_NODE "iptables-restore < $RULES_FILE"
        echo "$(date): Rules synchronized successfully" >> $LOGFILE
    else
        echo "$(date): Failed to synchronize rules" >> $LOGFILE
        exit 1
    fi
else
    echo "$(date): Not master, skipping sync" >> $LOGFILE
fi

Make it executable:

sudo chmod 755 /usr/local/bin/sync_iptables.sh
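To run the sync on a schedule rather than manually, add it to root's crontab on the primary node. The hourly interval here is an assumption, not part of the original setup — pick whatever matches how often your rules change:

```
# Assumed schedule: sync rules from the master every hour, on the hour
0 * * * * /usr/local/bin/sync_iptables.sh
```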

Set up automated monitoring

Create a monitoring script at /usr/local/bin/monitor_cluster.sh that checks cluster health and logs status information for troubleshooting.

#!/bin/bash

VIP="203.0.113.100"
BACKUP_NODE="203.0.113.101"
LOGFILE="/var/log/cluster-monitor.log"

# Determine our role from VIP ownership
if ip addr show | grep -q $VIP; then
    ROLE="MASTER"
else
    ROLE="BACKUP"
fi

# Check keepalived status
KEEPALIVED_STATUS=$(systemctl is-active keepalived)

# Count loaded rules and policies
RULE_COUNT=$(iptables -S | wc -l)

# Check if the backup node is reachable
if ping -c 1 -W 2 $BACKUP_NODE >/dev/null 2>&1; then
    BACKUP_REACHABLE="YES"
else
    BACKUP_REACHABLE="NO"
fi

echo "$(date): Role=$ROLE, Keepalived=$KEEPALIVED_STATUS, Rules=$RULE_COUNT, Backup_Reachable=$BACKUP_REACHABLE" >> $LOGFILE

# Alert if issues are detected
if [ "$KEEPALIVED_STATUS" != "active" ] || [ "$RULE_COUNT" -lt 5 ]; then
    echo "$(date): WARNING - Cluster health issues detected" >> $LOGFILE
fi

Make it executable:

sudo chmod 755 /usr/local/bin/monitor_cluster.sh
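When reviewing the monitor log later, a quick way to gauge cluster health is counting WARNING entries. A sketch, demonstrated on a throwaway sample log rather than the live /var/log/cluster-monitor.log:

```shell
# Triage one-liner: count WARNING entries in the monitor log to see how
# often health degraded. Shown here against a generated sample log.
LOG=/tmp/cluster-monitor.sample.log
printf '%s\n' \
  'Role=MASTER, Keepalived=active, Rules=12, Backup_Reachable=YES' \
  'WARNING - Cluster health issues detected' \
  'Role=MASTER, Keepalived=active, Rules=12, Backup_Reachable=YES' > "$LOG"

grep -c WARNING "$LOG"
```

Against the real log, substitute /var/log/cluster-monitor.log for the sample path.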

Configure monitoring cron job

Set up automated monitoring that runs every 5 minutes to track cluster health and log important events.

sudo crontab -e

Add this line to the root crontab on both nodes:

*/5 * * * * /usr/local/bin/monitor_cluster.sh

Verify your setup

Test the cluster configuration and verify that failover works correctly:

# Check keepalived status on both nodes
sudo systemctl status keepalived

# Verify which node has the VIP
ip addr show | grep 203.0.113.100

# Check that the iptables rules are loaded
sudo iptables -L -n

# Watch VRRP advertisements on the wire
sudo tcpdump -i eth0 vrrp

# Monitor cluster transition logs
sudo tail -f /var/log/keepalived-transitions.log

Test failover by stopping keepalived on the master node:

# On the master node - simulate a failure
sudo systemctl stop keepalived

# On the backup node - the VIP should appear
ip addr show | grep 203.0.113.100

# Restore the master node
sudo systemctl start keepalived

Note: The backup node should assume the master role within 3-5 seconds of the primary failing. Monitor the logs on both nodes during testing to verify proper operation.
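The "3-5 seconds" figure follows from VRRP's master-down timer: 3 * advert_int plus a skew of (256 - priority) / 256 seconds (RFC 3768). Plugging in the backup node's values from the config above:

```shell
# VRRP master-down interval: how long the backup waits without hearing
# advertisements before promoting itself. Values from the backup config.
ADVERT_INT=1
PRIORITY=100
awk -v a="$ADVERT_INT" -v p="$PRIORITY" \
    'BEGIN { printf "master_down_interval = %.2f s\n", 3*a + (256-p)/256 }'
```

With advert_int at 1 second this works out to roughly 3.6 seconds; raising advert_int slows failover proportionally.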

Common issues

Symptom | Cause | Fix
VIP not assigned to any node | VRRP authentication mismatch | Verify identical auth_pass in both configs
Both nodes claim the master role | Network split-brain condition | Check network connectivity and VRRP multicast
Keepalived fails to start | Script user permissions | sudo chown keepalived_script:keepalived_script /usr/local/bin/check_iptables.sh
Rules not synchronizing | SSH key authentication failure | Test SSH: ssh root@backup_node whoami
Health check script failing | iptables service not running | sudo systemctl start iptables (or netfilter-persistent)
Failover too slow | Default detection timers | Reduce advert_int to 1 second in keepalived.conf
Warning: Never disable the firewall to troubleshoot connectivity issues. Instead, add specific rules to allow required traffic. Always test configuration changes in a non-production environment first.
