Configure NGINX load balancing with health checks and automatic failover

Intermediate 25 min Apr 05, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

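Set up NGINX as a load balancer with upstream servers, health checks, and automatic failover. This tutorial covers the round-robin, least-connections, and IP-hash load balancing methods. Health checking relies on the passive max_fails and fail_timeout parameters available in open source NGINX, supplemented by a cron-driven script that probes each backend.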

Prerequisites

  • Multiple backend servers or containers
  • Root or sudo access
  • Basic understanding of HTTP and networking concepts

What this solves

NGINX load balancing distributes incoming requests across multiple backend servers to improve performance, availability, and scalability. This tutorial shows you how to configure NGINX with upstream modules, implement health checks to monitor backend server status, and set up automatic failover when servers become unavailable.

Step-by-step configuration

Update system packages

Start by updating your package manager to ensure you get the latest versions of NGINX and its modules.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y
# AlmaLinux / Rocky Linux
sudo dnf update -y

Install NGINX with required modules

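Install NGINX. The upstream module that provides load balancing and passive health checks is built into the standard package, so no add-on is strictly required for this tutorial. On Debian-based systems the nginx-extras package adds further optional modules; on AlmaLinux and Rocky Linux, nginx-mod-stream adds TCP/UDP load balancing.

# Ubuntu / Debian
sudo apt install -y nginx nginx-extras
# AlmaLinux / Rocky Linux
sudo dnf install -y nginx nginx-mod-stream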

Create backup of default configuration

Always back up the original configuration before making changes. This allows you to restore the default settings if needed.

sudo cp /etc/nginx/nginx.conf /etc/nginx/nginx.conf.backup
# Debian-based systems only; RHEL-family packages have no sites-available directory
sudo cp /etc/nginx/sites-available/default /etc/nginx/sites-available/default.backup

Configure upstream backend servers

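Create an upstream block that defines your backend servers, for example in /etc/nginx/sites-available/loadbalancer, the file enabled in a later step. This example configures three weighted web servers plus a backup, using the least-connections method and passive health check parameters.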

upstream backend_servers {
    # Load balancing method (default is round-robin)
    least_conn;
    
    # Backend servers with health check parameters
    server 203.0.113.10:80 max_fails=3 fail_timeout=30s weight=2;
    server 203.0.113.11:80 max_fails=3 fail_timeout=30s weight=2;
    server 203.0.113.12:80 max_fails=3 fail_timeout=30s weight=1;
    
    # Backup server (only used when all primary servers are down)
    server 203.0.113.20:80 backup;
    
    # Idle keepalive connections to the backends (per worker process)
    keepalive 32;
    keepalive_requests 100;
    keepalive_timeout 60s;
}

# Upstream for SSL backend servers
upstream ssl_backend {
    ip_hash;  # Session persistence based on client IP

    server 203.0.113.10:443 max_fails=2 fail_timeout=15s;
    server 203.0.113.11:443 max_fails=2 fail_timeout=15s;
    server 203.0.113.12:443 max_fails=2 fail_timeout=15s;
}
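To make the weight parameters in the backend_servers upstream concrete, the sketch below computes the share of requests each server would receive under plain weighted round-robin, assuming every server is healthy. It is arithmetic only and does not query NGINX; the function name weight_shares and the address=weight argument form are hypothetical conveniences. With least_conn, weights still bias the selection, but the real split also depends on how many connections each server has open.

```shell
# Illustrative only: expected request share per server under weighted
# round-robin. Each argument has the (assumed) form address=weight.
weight_shares() {
    total=0
    # First pass: sum all weights.
    for pair in "$@"; do
        total=$((total + ${pair##*=}))
    done
    # Second pass: print each server's share as an integer percentage.
    for pair in "$@"; do
        echo "${pair%%=*} $((100 * ${pair##*=} / total))%"
    done
}

# Weights taken from the backend_servers upstream in this tutorial.
weight_shares 203.0.113.10=2 203.0.113.11=2 203.0.113.12=1
```

For the 2:2:1 weights used here, the first two servers each take about 40% of requests and the third about 20%.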

Configure load balancer virtual host

Create a virtual host configuration that uses the upstream backend servers. This configuration includes proxy settings, health monitoring, and failover behavior.

server {
    listen 80;
    server_name example.com www.example.com;
    
    # Logging for monitoring
    access_log /var/log/nginx/loadbalancer_access.log;
    error_log /var/log/nginx/loadbalancer_error.log;
    
    # Health check endpoint
    location /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        allow 203.0.113.0/24;
        deny all;
    }
    
    # Main application proxy
    location / {
        proxy_pass http://backend_servers;
        
        # Proxy headers for backend servers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Connection and timeout settings
        proxy_connect_timeout 5s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 10s;
        
        # Enable keepalive connections
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
    
    # Health check for backend servers
    location /health {
        proxy_pass http://backend_servers/health;
        proxy_set_header Host $host;
        access_log off;
    }
}

# SSL load balancer configuration
server {
    listen 443 ssl http2;
    server_name example.com www.example.com;

    # SSL configuration
    ssl_certificate /etc/ssl/certs/example.com.crt;
    ssl_certificate_key /etc/ssl/private/example.com.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass https://ssl_backend;
        proxy_ssl_verify off;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;

        proxy_connect_timeout 5s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    }
}
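When a connection to a backend fails and proxy_next_upstream moves the request to another server, NGINX records the failed attempt in the error log. One way to see which backends are accumulating failures is to count those entries per upstream address. The sketch below assumes the stock error log format, in which such lines contain upstream: "http://ADDRESS/..."; the function name upstream_error_counts is hypothetical.

```shell
# Count logged upstream errors per backend address.
# Reads NGINX error log lines on stdin and prints "count address",
# most frequent first.
upstream_error_counts() {
    grep -o 'upstream: "[a-z]*://[^/"]*' \
        | sed 's|.*://||' \
        | sort | uniq -c | sort -rn
}

# Example, using the error_log path from the server block in this tutorial:
# sudo cat /var/log/nginx/loadbalancer_error.log | upstream_error_counts
```

A backend that dominates this list is a likely candidate for being marked unavailable by max_fails.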

Enable the load balancer site

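Enable the new load balancer configuration and disable the default NGINX site to avoid conflicts. These commands apply to Debian-based systems; on AlmaLinux and Rocky Linux, files ending in .conf under /etc/nginx/conf.d/ are loaded automatically.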

sudo ln -s /etc/nginx/sites-available/loadbalancer /etc/nginx/sites-enabled/
sudo rm /etc/nginx/sites-enabled/default

Configure advanced health checks

Extend the upstream with a shared memory zone and per-server failure thresholds. The slow_start server parameter and the queue directive are available only in NGINX Plus; open source NGINX rejects them when you run nginx -t, so they appear here only as comments.

# Advanced upstream configuration with detailed health monitoring
upstream app_backend {
    zone backend 64k;
    
    # Primary servers with passive health check parameters
    # (NGINX Plus only: append slow_start=30s to ramp up recovered servers gradually)
    server 203.0.113.10:80 max_fails=3 fail_timeout=30s weight=3;
    server 203.0.113.11:80 max_fails=3 fail_timeout=30s weight=3;
    server 203.0.113.12:80 max_fails=3 fail_timeout=30s weight=2;
    
    # Queue configuration for handling failures (NGINX Plus only)
    # queue 100 timeout=70s;
}

# Map to track backend server status
map $upstream_addr $backend_pool {
    ~^203\.0\.113\.10 "server1";
    ~^203\.0\.113\.11 "server2";
    ~^203\.0\.113\.12 "server3";
    default           "unknown";
}

# Log format for tracking backend performance
# To use it, name the format in a server block:
#   access_log /var/log/nginx/loadbalancer_access.log upstream_tracking;
log_format upstream_tracking '$remote_addr - $remote_user [$time_local] '
                             '"$request" $status $body_bytes_sent '
                             '"$http_referer" "$http_user_agent" '
                             'rt=$request_time uct="$upstream_connect_time" '
                             'uht="$upstream_header_time" urt="$upstream_response_time" '
                             'upstream=$upstream_addr backend=$backend_pool';
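Once an access log is written in the upstream_tracking format, every line ends with a backend=NAME field, which makes a per-backend request count straightforward. The following is a minimal sketch under that assumption; the function name backend_request_counts is hypothetical, and it only works on logs whose access_log directive actually names the upstream_tracking format.

```shell
# Count requests per backend label from upstream_tracking log lines on stdin.
# Prints "label count", sorted by label.
backend_request_counts() {
    awk '{
        # The last field is expected to look like backend=server1
        n = split($NF, kv, "=")
        if (n == 2 && kv[1] == "backend") count[kv[2]]++
    }
    END { for (b in count) print b, count[b] }' | sort
}

# Example:
# sudo cat /var/log/nginx/loadbalancer_access.log | backend_request_counts
```

Comparing these counts against the configured weights gives a quick check that traffic is being distributed roughly as intended.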

Create monitoring script for backend health

Create a script that probes each backend server over TCP, logs the result, and sends an email alert when a server is down. Save it as /usr/local/bin/nginx-health-monitor.sh.

#!/bin/bash
# NGINX Backend Health Monitor Script
# Monitors upstream server health and logs status

LOG_FILE="/var/log/nginx/backend-health.log"
STATUS_URL="http://localhost/nginx_status"
ALERT_EMAIL="admin@example.com"

# Function to check backend server directly (TCP connect, 5-second limit)
check_backend() {
    local server=$1
    local port=$2
    if timeout 5 bash -c "</dev/tcp/$server/$port" 2>/dev/null; then
        echo "$(date): Backend $server:$port is healthy" >> "$LOG_FILE"
        return 0
    else
        echo "$(date): Backend $server:$port is down" >> "$LOG_FILE"
        return 1
    fi
}

# Check all backend servers
SERVERS=("203.0.113.10" "203.0.113.11" "203.0.113.12")
PORT=80
FAILED_SERVERS=()
for server in "${SERVERS[@]}"; do
    if ! check_backend "$server" "$PORT"; then
        FAILED_SERVERS+=("$server:$PORT")
    fi
done

# Send alert if servers are down
if [ ${#FAILED_SERVERS[@]} -gt 0 ]; then
    ALERT_MSG="NGINX Load Balancer Alert: The following backend servers are down: ${FAILED_SERVERS[*]}"
    echo "$ALERT_MSG" | mail -s "Backend Server Alert" "$ALERT_EMAIL" 2>/dev/null
    echo "$(date): ALERT - Servers down: ${FAILED_SERVERS[*]}" >> "$LOG_FILE"
fi

# Append the NGINX connection counters from stub_status
curl -s "$STATUS_URL" >> "$LOG_FILE"
echo "$(date): Health check completed" >> "$LOG_FILE"

Make health monitor script executable

Make the health monitoring script executable and create its log file. On AlmaLinux and Rocky Linux, substitute nginx:nginx for www-data:www-data.

sudo chmod 755 /usr/local/bin/nginx-health-monitor.sh
sudo mkdir -p /var/log/nginx
sudo touch /var/log/nginx/backend-health.log
sudo chown www-data:www-data /var/log/nginx/backend-health.log

Configure automated health monitoring

Set up a cron job in root's crontab to run the health monitoring script every 5 minutes.

sudo crontab -e

Add the following line to run health checks every 5 minutes:

*/5 * * * * /usr/local/bin/nginx-health-monitor.sh

Test NGINX configuration

Verify that your NGINX configuration is syntactically correct before applying the changes.

sudo nginx -t

Apply configuration and start services

Reload NGINX with the new configuration and ensure it starts automatically on boot.

sudo systemctl reload nginx
sudo systemctl enable nginx
sudo systemctl status nginx

Configure log rotation

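Set up log rotation for the load balancer logs to prevent disk space issues. Save the following as /etc/logrotate.d/nginx-loadbalancer. If the distribution's own /etc/logrotate.d/nginx already matches every .log file in /var/log/nginx, logrotate reports these files as duplicate entries; in that case, adjust one of the two patterns so that each log is matched only once.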

/var/log/nginx/loadbalancer_*.log {
    daily
    missingok
    rotate 52
    compress
    delaycompress
    notifempty
    create 644 www-data www-data
    postrotate
        if [ -f /var/run/nginx.pid ]; then
            kill -USR1 "$(cat /var/run/nginx.pid)"
        fi
    endscript
}

/var/log/nginx/backend-health.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 644 www-data www-data
}

Verify your setup

Test the load balancer configuration and verify that health checks and failover are working correctly.

# Check NGINX status and configuration
sudo systemctl status nginx
sudo nginx -t

# Test load balancer endpoint
curl -H "Host: example.com" http://localhost/

# Check NGINX status page
curl http://localhost/nginx_status

# Monitor backend server connections
sudo tail -f /var/log/nginx/loadbalancer_access.log

# Check health monitoring logs
sudo tail -f /var/log/nginx/backend-health.log

# Test failover by stopping a backend server (if available)
# The load balancer should automatically route traffic to remaining servers
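To watch failover happen while a backend is stopped, it can help to send a batch of requests and tally the status codes that come back. The sketch below is one way to do that, assuming the load balancer answers on localhost for the example.com host name used in this tutorial; the function names probe and tally_statuses are hypothetical.

```shell
# Tally HTTP status codes read from stdin, one per line.
# Prints "code count" for each distinct code.
tally_statuses() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Send N requests through the load balancer and print each status code.
# curl reports 000 when no HTTP response was received at all.
probe() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        curl -s -o /dev/null -w '%{http_code}\n' \
            -H 'Host: example.com' http://localhost/
        i=$((i + 1))
    done
}

# Usage, run while stopping one backend:
# probe 50 | tally_statuses
```

If failover works as configured, nearly all responses should remain 200 while the backend is down, because proxy_next_upstream retries failed attempts on another server.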

Note: Replace the IP addresses in the configuration with your actual backend server IPs. Ensure your backend servers have health check endpoints at /health or adjust the configuration accordingly.

Load balancing methods comparison

Method       | Use case                        | Configuration                   | Session persistence
round-robin  | Equal distribution              | Default method, no directive    | No
least_conn   | Uneven request processing       | Add least_conn directive        | No
ip_hash      | Session-based applications      | Add ip_hash directive           | Yes
weighted     | Servers with different capacity | Add weight parameter to servers | Depends on method
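One detail of ip_hash worth knowing: for IPv4 clients, NGINX uses only the first three octets of the address as the hashing key (for IPv6, the entire address). All clients in the same /24 network therefore reach the same backend, which can skew the load when many users share a network. The sketch below only extracts that key to illustrate the grouping; it does not reproduce NGINX's hash function, and the name ip_hash_key is hypothetical.

```shell
# Print the part of an IPv4 address that ip_hash uses as its hashing key:
# the first three octets. The hash computation itself is not reproduced.
ip_hash_key() {
    echo "${1%.*}"
}

ip_hash_key 198.51.100.7     # prints 198.51.100
ip_hash_key 198.51.100.200   # prints 198.51.100 (same key, same backend)
```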

Common issues

Symptom                    | Cause                                | Fix
502 Bad Gateway errors     | Backend servers unavailable          | Check backend server status and firewall rules
Uneven load distribution   | Wrong load balancing method          | Use least_conn for dynamic content, the default round-robin for static
Session persistence issues | Using round-robin with stateful apps | Switch to ip_hash or implement session storage
Health checks not working  | Incorrect endpoint or permissions    | Verify /health endpoint exists on backend servers
High response times        | Keepalive connections not enabled    | Add keepalive directive to upstream block
Logs not rotating          | Logrotate configuration missing      | Verify /etc/logrotate.d/nginx-loadbalancer exists
