Set up TimescaleDB high availability with streaming replication and automatic failover

Advanced 45 min Apr 22, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure TimescaleDB with PostgreSQL streaming replication for high availability. Set up primary and standby servers with hot standby mode, implement automatic failover with pg_auto_failover, and monitor replication status for production-ready time-series database clustering.

Prerequisites

  • At least 2 servers with 4GB RAM
  • PostgreSQL 16 compatible system
  • Network connectivity between servers
  • Basic knowledge of PostgreSQL administration

What this solves

TimescaleDB high availability ensures your time-series database remains operational during server failures, hardware issues, or maintenance windows. This tutorial sets up PostgreSQL streaming replication with TimescaleDB extensions, configures automatic failover using pg_auto_failover, and establishes monitoring for replication lag and cluster health.

Step-by-step installation

Update system packages

Start by updating your package manager on both primary and standby servers to ensure you get the latest versions.

# Ubuntu/Debian
sudo apt update && sudo apt upgrade -y

# AlmaLinux/Rocky
sudo dnf update -y

Install PostgreSQL and TimescaleDB

Install PostgreSQL 16 and TimescaleDB on both primary and standby servers. TimescaleDB provides time-series optimizations on top of PostgreSQL.

# Ubuntu/Debian: add the TimescaleDB repository and install
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/timescaledb.gpg
echo "deb [signed-by=/usr/share/keyrings/timescaledb.gpg] https://packagecloud.io/timescale/timescaledb/ubuntu/ $(lsb_release -c -s) main" | sudo tee /etc/apt/sources.list.d/timescaledb.list
sudo apt update
sudo apt install -y postgresql-16 postgresql-client-16 timescaledb-2-postgresql-16

# AlmaLinux/Rocky: add the PGDG and TimescaleDB repositories and install
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
echo "[timescale_timescaledb]
name=timescale_timescaledb
baseurl=https://packagecloud.io/timescale/timescaledb/el/\$releasever/\$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/timescale/timescaledb/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300" | sudo tee /etc/yum.repos.d/timescale_timescaledb.repo
sudo dnf install -y postgresql16-server postgresql16 timescaledb-2-postgresql-16

Initialize PostgreSQL database (AlmaLinux/Rocky only)

On AlmaLinux and Rocky Linux systems, manually initialize the PostgreSQL database cluster.

# Ubuntu/Debian: the cluster is initialized automatically during installation
# AlmaLinux/Rocky: initialize, enable, and start it manually
sudo /usr/pgsql-16/bin/postgresql-16-setup initdb
sudo systemctl enable postgresql-16
sudo systemctl start postgresql-16

Configure TimescaleDB extension

Add TimescaleDB to PostgreSQL's shared libraries and tune the configuration for time-series workloads.

sudo timescaledb-tune --quiet --yes
sudo systemctl restart postgresql  # postgresql-16 on AlmaLinux/Rocky

Configure primary server for replication

On the primary server, configure PostgreSQL for streaming replication by enabling WAL archiving and setting the replication parameters. Edit postgresql.conf (/etc/postgresql/16/main/postgresql.conf on Ubuntu/Debian, /var/lib/pgsql/16/data/postgresql.conf on AlmaLinux/Rocky).

# Replication settings
listen_addresses = '*'
wal_level = replica
max_wal_senders = 3
max_replication_slots = 3
wal_keep_size = 1GB
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/16/main/archive/%f'
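
The settings above have to end up in postgresql.conf exactly once. A minimal sketch of an idempotent helper (set_pg_option is a hypothetical function, not a PostgreSQL tool) that replaces an existing, possibly commented-out, assignment or appends a new one:

```shell
#!/bin/sh
# Hypothetical helper: idempotently set "key = value" in a
# postgresql.conf-style file.
set_pg_option() {
    conf="$1"; key="$2"; value="$3"
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]*=" "$conf"; then
        # Replace an existing (possibly commented-out) assignment in place
        sed -i -E "s|^[#[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$conf"
    else
        # Key not present yet: append it
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}

# Demo against a scratch file; point at your real postgresql.conf instead.
conf=$(mktemp)
set_pg_option "$conf" wal_level replica
set_pg_option "$conf" max_wal_senders 3
```

Running the helper twice with the same key leaves a single assignment, so it is safe to bake into provisioning scripts.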

TimescaleDB settings

shared_preload_libraries = 'timescaledb' max_connections = 200 shared_buffers = 256MB effective_cache_size = 1GB work_mem = 4MB maintenance_work_mem = 64MB

Create archive directory

Create the WAL archive directory with proper permissions for the postgres user.

sudo mkdir -p /var/lib/postgresql/16/main/archive
sudo chown postgres:postgres /var/lib/postgresql/16/main/archive
sudo chmod 750 /var/lib/postgresql/16/main/archive

Configure authentication for replication

Set up host-based authentication so the standby server can connect for replication. Edit pg_hba.conf (it sits in the same directory as postgresql.conf) on the primary.

# Add these lines (replace 203.0.113.20 with your standby server IP)
host    replication     replicator      203.0.113.20/32         scram-sha-256
host    all             postgres        203.0.113.20/32         scram-sha-256
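
If you provision servers with scripts, appending these entries blindly duplicates them on every run. A small sketch that adds each pg_hba.conf line only when it is missing (HBA_FILE defaults to a local copy here for illustration; point it at your real pg_hba.conf):

```shell
#!/bin/sh
# Sketch: append replication HBA entries only if not already present.
HBA_FILE=${HBA_FILE:-./pg_hba.conf}   # set to your real pg_hba.conf path
STANDBY_IP=203.0.113.20
touch "$HBA_FILE"

add_hba_line() {
    line="$1"
    # -x matches the whole line, -F disables regex interpretation
    grep -qxF "$line" "$HBA_FILE" || printf '%s\n' "$line" >> "$HBA_FILE"
}

add_hba_line "host    replication     replicator      ${STANDBY_IP}/32         scram-sha-256"
add_hba_line "host    all             postgres        ${STANDBY_IP}/32         scram-sha-256"
```

After changing pg_hba.conf, reload PostgreSQL so the new rules take effect.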

Create replication user

Create a dedicated user for replication with appropriate privileges.

sudo systemctl restart postgresql
sudo -u postgres psql -c "CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'SecureReplicationPass123';"
sudo -u postgres psql -c "CREATE EXTENSION IF NOT EXISTS timescaledb;"
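
The base backup step below uses -W, which prompts for the replicator password interactively. For unattended runs you can store the credentials in a .pgpass file instead; this sketch writes a demo copy (PGPASSFILE defaults to a local path here), using the primary IP and password from this tutorial:

```shell
#!/bin/sh
# Sketch: store replicator credentials so pg_basebackup and the WAL
# receiver can authenticate non-interactively.
PGPASSFILE=${PGPASSFILE:-./pgpass}   # normally ~postgres/.pgpass on the standby

# Format: hostname:port:database:username:password
printf '%s\n' '203.0.113.10:5432:replication:replicator:SecureReplicationPass123' > "$PGPASSFILE"
chmod 600 "$PGPASSFILE"   # PostgreSQL ignores the file unless it is 0600
```

On the standby, place this at /var/lib/postgresql/.pgpass owned by postgres, then drop -W from the pg_basebackup command.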

Test primary server setup

Verify that TimescaleDB is properly installed and the primary server is ready for replication.

sudo -u postgres psql -c "SELECT version();"
sudo -u postgres psql -c "\dx timescaledb"
sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"

Create base backup for standby

On the standby server, create a base backup from the primary server to initialize the replica.

sudo systemctl stop postgresql
sudo -u postgres rm -rf /var/lib/postgresql/16/main/*
sudo -u postgres pg_basebackup -h 203.0.113.10 -D /var/lib/postgresql/16/main -U replicator -W -v -P -R
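
With -R, pg_basebackup writes a standby.signal file and appends primary_conninfo to postgresql.auto.conf in the target directory. A quick sanity check before starting the standby (check_standby_dir is a hypothetical helper, not part of PostgreSQL):

```shell
#!/bin/sh
# Sketch: verify a data directory produced by pg_basebackup -R is ready
# to start as a standby.
check_standby_dir() {
    dir="$1"
    # standby.signal tells PostgreSQL to start in recovery mode
    [ -f "$dir/standby.signal" ] || { echo "missing standby.signal"; return 1; }
    # primary_conninfo tells the WAL receiver where to stream from
    grep -q 'primary_conninfo' "$dir/postgresql.auto.conf" 2>/dev/null \
        || { echo "missing primary_conninfo"; return 1; }
    echo "standby data directory looks ok"
}
```

Usage on the standby: check_standby_dir /var/lib/postgresql/16/main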

Configure standby server

Configure the standby server for hot standby mode and streaming replication.

# Hot standby settings
hot_standby = on
hot_standby_feedback = on
wal_receiver_timeout = 60s
max_standby_streaming_delay = 30s
max_standby_archive_delay = 30s

TimescaleDB settings (same as primary)

shared_preload_libraries = 'timescaledb'
max_connections = 200
shared_buffers = 256MB
effective_cache_size = 1GB

Start standby server

Start the standby server and verify it connects to the primary for streaming replication.

sudo systemctl start postgresql
sudo systemctl enable postgresql
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
sudo -u postgres psql -c "SELECT * FROM pg_stat_wal_receiver;"

Install pg_auto_failover

Install pg_auto_failover for automatic failover management on both servers.

# Ubuntu/Debian
sudo apt install -y postgresql-16-auto-failover

# AlmaLinux/Rocky
sudo dnf install -y pg_auto_failover_16

Set up monitor node

Initialize the pg_auto_failover monitor on a separate server or the primary server.

export PGDATA=/var/lib/postgresql/monitor
sudo -u postgres mkdir -p /var/lib/postgresql/monitor
sudo -u postgres pg_autoctl create monitor --hostname 203.0.113.10 --pgdata /var/lib/postgresql/monitor

Register primary server with monitor

Register the primary TimescaleDB server with the pg_auto_failover monitor.

export PGDATA=/var/lib/postgresql/16/main
sudo -u postgres pg_autoctl create postgres --hostname 203.0.113.10 --pgdata /var/lib/postgresql/16/main --monitor 'postgres://autoctl_node@203.0.113.10:5432/pg_auto_failover'

Register standby server with monitor

Register the standby TimescaleDB server with the monitor for automatic failover.

export PGDATA=/var/lib/postgresql/16/main
sudo -u postgres pg_autoctl create postgres --hostname 203.0.113.20 --pgdata /var/lib/postgresql/16/main --monitor 'postgres://autoctl_node@203.0.113.10:5432/pg_auto_failover'

Create systemd services for pg_auto_failover

Create systemd services to manage pg_auto_failover automatically.

# Save as /etc/systemd/system/pgautofailover.service on both servers
[Unit]
Description=pg_auto_failover
Requires=postgresql.service
After=postgresql.service
Requisite=postgresql.service

[Service]
Type=exec
User=postgres
Group=postgres
# On AlmaLinux/Rocky, use ExecStart=/usr/pgsql-16/bin/pg_autoctl run
ExecStart=/usr/lib/postgresql/16/bin/pg_autoctl run
Environment=PGDATA=/var/lib/postgresql/16/main
Restart=always
RestartSec=5
TimeoutStartSec=0

[Install]
WantedBy=multi-user.target

sudo systemctl daemon-reload
sudo systemctl enable pgautofailover
sudo systemctl start pgautofailover

Configure connection pooling

Set up connection pooling to handle failover transparently for applications.

# Ubuntu/Debian
sudo apt install -y pgbouncer
# AlmaLinux/Rocky
sudo dnf install -y pgbouncer

Then edit /etc/pgbouncer/pgbouncer.ini:

[databases]
timescaledb = host=203.0.113.10 port=5432 dbname=postgres

[pgbouncer]
listen_port = 6432
listen_addr = *
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
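
PgBouncer does not follow a failover by itself: after promotion, the [databases] entry still points at the old primary. A minimal sketch of repointing it (repoint_pgbouncer is a hypothetical helper; in production you would drive this from a pg_auto_failover state-change hook rather than by hand):

```shell
#!/bin/sh
# Sketch: rewrite the host= value of the timescaledb entry after failover.
INI=${INI:-./pgbouncer.ini}   # normally /etc/pgbouncer/pgbouncer.ini

repoint_pgbouncer() {
    new_primary="$1"
    # Replace only the host= value, keeping port and dbname intact
    sed -i -E "s|^(timescaledb = host=)[^ ]+|\1${new_primary}|" "$INI"
}
```

Usage after the standby is promoted: repoint_pgbouncer 203.0.113.20, then sudo systemctl reload pgbouncer.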

Step-by-step configuration

Create monitoring database

Create a dedicated database for monitoring TimescaleDB replication metrics.

sudo -u postgres createdb tsdb_monitor
sudo -u postgres psql -d tsdb_monitor -c "CREATE EXTENSION IF NOT EXISTS timescaledb;"
sudo -u postgres psql -d tsdb_monitor -c "CREATE TABLE replication_lag (time TIMESTAMPTZ NOT NULL, lag_seconds REAL, lag_bytes BIGINT);"
sudo -u postgres psql -d tsdb_monitor -c "SELECT create_hypertable('replication_lag', 'time');"

Set up replication monitoring

Create a script to monitor replication lag and store metrics in TimescaleDB.

#!/bin/bash
# Save as /usr/local/bin/monitor-replication.sh
set -e

# Get replication lag (pg_last_xact_replay_timestamp() is only meaningful
# on the standby; pg_stat_replication is populated on the primary)
LAG_SECONDS=$(sudo -u postgres psql -t -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::int;" 2>/dev/null || echo "0")
LAG_BYTES=$(sudo -u postgres psql -t -c "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) FROM pg_stat_replication WHERE application_name = 'walreceiver';" 2>/dev/null || echo "0")

# Insert metrics into the monitoring table
sudo -u postgres psql -d tsdb_monitor -c "INSERT INTO replication_lag (time, lag_seconds, lag_bytes) VALUES (NOW(), $LAG_SECONDS, $LAG_BYTES);"
echo "Replication lag: ${LAG_SECONDS}s, ${LAG_BYTES} bytes"

sudo chmod +x /usr/local/bin/monitor-replication.sh

Configure automated monitoring

Set up a cron job to collect replication metrics every minute.

echo "* * * * * /usr/local/bin/monitor-replication.sh >> /var/log/replication-monitor.log 2>&1" | sudo -u postgres crontab -

Create sample time-series data

Create a sample hypertable to test TimescaleDB replication functionality.

sudo -u postgres psql -d postgres -c "CREATE TABLE sensor_data (time TIMESTAMPTZ NOT NULL, sensor_id INT, temperature REAL, humidity REAL);"
sudo -u postgres psql -d postgres -c "SELECT create_hypertable('sensor_data', 'time');"
sudo -u postgres psql -d postgres -c "INSERT INTO sensor_data VALUES (NOW(), 1, 22.5, 45.2), (NOW(), 2, 23.1, 44.8);"
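
Two rows will not produce visible replication traffic. A small sketch that generates a batch of INSERT statements for the sensor_data hypertable (gen_sensor_rows is a hypothetical helper; row count and value ranges are arbitrary):

```shell
#!/bin/sh
# Sketch: emit N INSERT statements for sensor_data to generate write load.
gen_sensor_rows() {
    n="$1"
    i=1
    while [ "$i" -le "$n" ]; do
        # sensor_id cycles 0-9; temperature/humidity vary deterministically
        echo "INSERT INTO sensor_data VALUES (NOW(), $((i % 10)), $((15 + i % 10)).5, $((40 + i % 20)).0);"
        i=$((i + 1))
    done
}
```

Pipe the output into psql on the primary, then count rows on the standby to watch replication catch up: gen_sensor_rows 1000 | sudo -u postgres psql -d postgres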

Verify your setup

Test the TimescaleDB high availability setup with these verification commands.

# Check PostgreSQL and TimescaleDB status
sudo -u postgres psql -c "SELECT version();"
sudo -u postgres psql -c "\dx timescaledb"

Check replication status

sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
sudo -u postgres psql -c "SELECT * FROM pg_replication_slots;"

Verify pg_auto_failover status

sudo -u postgres pg_autoctl show state
sudo -u postgres pg_autoctl show events
# Verify standby server is in recovery mode
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"

Check WAL receiver status

sudo -u postgres psql -c "SELECT * FROM pg_stat_wal_receiver;"

Test read-only access to replicated data

sudo -u postgres psql -c "SELECT COUNT(*) FROM sensor_data;"
# Simulate primary server failure
sudo systemctl stop postgresql

Verify automatic failover (run on former standby)

sudo -u postgres pg_autoctl show state

Test write access on new primary

sudo -u postgres psql -c "INSERT INTO sensor_data VALUES (NOW(), 3, 21.8, 46.5);"

Monitor replication status and performance

Set up alerting for replication lag

Create a script to alert when replication lag exceeds thresholds.

#!/bin/bash
# Save as /usr/local/bin/check-replication-health.sh
set -e

MAX_LAG_SECONDS=30
MAX_LAG_BYTES=16777216  # 16MB

# Get current replication lag
LAG_SECONDS=$(sudo -u postgres psql -t -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::int;" 2>/dev/null || echo "999")
LAG_BYTES=$(sudo -u postgres psql -t -c "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) FROM pg_stat_replication WHERE application_name = 'walreceiver';" 2>/dev/null || echo "999999999")

# Check thresholds and alert
if [[ $LAG_SECONDS -gt $MAX_LAG_SECONDS ]] || [[ $LAG_BYTES -gt $MAX_LAG_BYTES ]]; then
    echo "ALERT: Replication lag exceeded thresholds - ${LAG_SECONDS}s, ${LAG_BYTES} bytes" | logger -t timescaledb-replication
    # Add email/Slack notification here
else
    echo "Replication health check: OK (lag: ${LAG_SECONDS}s, ${LAG_BYTES} bytes)"
fi

sudo chmod +x /usr/local/bin/check-replication-health.sh
echo "*/5 * * * * /usr/local/bin/check-replication-health.sh" | sudo crontab -

Monitor cluster with pg_auto_failover

Use pg_auto_failover commands to monitor cluster health and perform maintenance.

# Show current cluster state
sudo -u postgres pg_autoctl show state

Show cluster events and history

sudo -u postgres pg_autoctl show events --count 10

Show cluster configuration

sudo -u postgres pg_autoctl config show

Enable maintenance mode for updates

sudo -u postgres pg_autoctl enable maintenance

Test failover scenarios and recovery procedures

Test automatic failover

Simulate primary server failure and verify automatic promotion of standby.

# Stop primary PostgreSQL to trigger failover
sudo systemctl stop postgresql
sudo systemctl stop pgautofailover
# Watch failover process
sudo -u postgres pg_autoctl watch

Verify new primary status (run on former standby)

sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
sudo -u postgres psql -c "INSERT INTO sensor_data VALUES (NOW(), 4, 20.5, 47.2);"

Test manual failover

Perform planned failover for maintenance scenarios.

# Enable maintenance mode on primary
sudo -u postgres pg_autoctl enable maintenance

Perform manual failover

sudo -u postgres pg_autoctl perform failover

Verify cluster state after failover

sudo -u postgres pg_autoctl show state

Test recovery procedures

Restore the original primary server and add it back to the cluster.

# Restart services after fixing issues
sudo systemctl start postgresql
sudo systemctl start pgautofailover

Verify node rejoins as standby

sudo -u postgres pg_autoctl show state
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"

Test data consistency

sudo -u postgres psql -c "SELECT COUNT(*) FROM sensor_data;"

Create backup and recovery strategy

Set up automated backups that work with your high availability setup.

#!/bin/bash
# Save as /usr/local/bin/timescaledb-backup.sh
set -e

BACKUP_DIR="/var/backups/timescaledb"
DATE=$(date +%Y%m%d_%H%M%S)

# Create backup directory
sudo mkdir -p $BACKUP_DIR

# Create logical backup
sudo -u postgres pg_dumpall --clean --if-exists > $BACKUP_DIR/full_backup_$DATE.sql

# Create WAL archive backup
sudo tar -czf $BACKUP_DIR/wal_archive_$DATE.tar.gz /var/lib/postgresql/16/main/archive/

# Clean up old backups (keep 7 days)
find $BACKUP_DIR -name "*.sql" -mtime +7 -delete
find $BACKUP_DIR -name "*.tar.gz" -mtime +7 -delete
echo "Backup completed: $BACKUP_DIR/full_backup_$DATE.sql"

sudo chmod +x /usr/local/bin/timescaledb-backup.sh
echo "0 2 * * * /usr/local/bin/timescaledb-backup.sh >> /var/log/timescaledb-backup.log 2>&1" | sudo crontab -
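
A backup you have not verified is a hope, not a strategy. A quick sketch that checks a pg_dumpall file is non-empty and ends with the completion marker pg_dumpall writes (verify_backup is a hypothetical helper):

```shell
#!/bin/sh
# Sketch: quick integrity check on a logical pg_dumpall backup file.
verify_backup() {
    f="$1"
    [ -s "$f" ] || { echo "backup empty or missing"; return 1; }
    # pg_dumpall ends its output with a completion comment; a truncated
    # dump (full disk, killed process) will lack it
    tail -n 5 "$f" | grep -q 'PostgreSQL database cluster dump complete' \
        || { echo "backup appears truncated"; return 1; }
    echo "backup looks complete"
}
```

Usage: verify_backup /var/backups/timescaledb/full_backup_20260422_020000.sql, ideally as the last step of the backup script so failures reach the log.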

Common issues

Symptom | Cause | Fix
Standby server not connecting | Firewall blocking port 5432 | sudo ufw allow 5432 or configure iptables
Authentication failure for replicator | Incorrect pg_hba.conf configuration | Check the host entry and restart PostgreSQL
High replication lag | Network latency or disk I/O bottleneck | Check network connectivity and disk performance
pg_auto_failover not promoting standby | Monitor node cannot reach servers | Check monitor node connectivity and logs
TimescaleDB extension missing after failover | Extension not enabled on standby | Run CREATE EXTENSION timescaledb; after promotion
WAL archive directory full | Archive cleanup not configured | Configure archive_cleanup_command or implement a cleanup script
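
For the archive-full case, a minimal retention sketch (prune_wal_archive is a hypothetical helper; directory and retention are assumptions, and the window must cover your oldest base backup or standby, or recovery will break):

```shell
#!/bin/sh
# Sketch: prune archived WAL files older than a retention window.
ARCHIVE_DIR=${ARCHIVE_DIR:-/var/lib/postgresql/16/main/archive}
RETENTION_DAYS=${RETENTION_DAYS:-7}

prune_wal_archive() {
    # -mtime +N matches files older than N*24 hours; -print logs deletions
    find "$ARCHIVE_DIR" -type f -mtime +"$RETENTION_DAYS" -print -delete
}
```

Run it from the same cron schedule as the backup script, after the WAL archive has been tarred up.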

Next steps

Running this in production?

Want this handled for you? Running this at scale adds a second layer of work: capacity planning, failover drills, cost control, and on-call. See how we run infrastructure like this for European teams.


Need help?

Don't want to manage this yourself?

We handle high availability infrastructure for businesses that depend on uptime. From initial setup to ongoing operations.